Introduction

Solid state fermentation (SSF) is a biotechnological process in which microorganisms are grown on solid substrates in the absence of free water [1,2,3,4], or with only enough water for the development of microorganisms that secrete metabolites such as enzymes [5,6,7]. In SSF, bioreactors resemble the natural habitat of several microorganisms, including fungi that grow under low-humidity conditions [8, 9]. The process takes less time while maintaining high productivity [10, 11] and allows solid waste to be used as a substrate.

The most important factors to consider when developing an SSF process are the choice of microorganisms and substrates [12]. The specific surface area of the substrate is also critical, since particle size determines whether the effective surface area is sufficient for the adsorption and penetration of hyphae and for adequate diffusion of nutrients and gases for microbial development [13]. Under these conditions, the microorganisms are able to synthesize and secrete different enzyme complexes in addition to other metabolites [14, 15]. The enzymes produced, in turn, are less susceptible to substrate inhibition and remain stable under changes in temperature and pH [16, 17].

Microbial enzymes obtained in SSF [18] are highly important in the food and pharmaceutical industries [19, 20], owing to their applications in detergent formulation [21], food and feed processing [22], waste treatment [23], biobleaching [24], beverage preparation [25], cosmetics [26], biodiesel synthesis [27], ethanol production [28], bioremediation [29], and fertilizer formulation [30].

The production of these enzymes by microorganisms in SSF depends on fermentative parameters such as incubation temperature, fermentation time, pH, moisture content, aeration, and spore concentration [31]. The incubation period influences the proliferation and accumulation of biomass, since the amount of cellulase increases as the amount of mycelium increases. However, there is a maximum period beyond which the substrate may be consumed for growth, decreasing enzyme synthesis [32].

pH is a parameter that is complex to monitor in SSF; it is therefore important to choose microorganisms that can grow over a wide pH range [33]. Another factor that affects the performance of microorganisms in SSF is moisture content: a high content reduces the porosity of the substrate, which prevents oxygen penetration, whereas a low content can limit the accessibility of nutrients, resulting in slow microbial growth [34].

Temperature is considered one of the main parameters and requires attention in the fermentation process, since most of the microorganisms used in SSF are mesophilic, with an ideal growth temperature between 20 and 40 °C and a maximum growth temperature below 50 °C. Temperature gradients can delay microbial activity, dehydrate the medium, and cause undesirable metabolic deviations [33].

The ideal relationship between these variables (SSF parameters) can be found using statistical methods [12, 35,36,37,38,39], which are increasingly used to locate optimal parameter values. They fall into two groups of methodologies: univariate and multivariate.

The univariate methodologies are based on one-factor-at-a-time (OFAT) analysis, in which the optimization is performed by changing a single parameter while the others are held constant and observing its effect on the experimental response. This is a limited technique, since it cannot capture interactions between the variables and requires a high number of experiments [40, 41]. Despite these disadvantages, work is still being carried out with this technique [42,43,44,45,46,47].

In contrast to OFAT, multivariate methodologies overcome these limitations, since they use a data matrix that combines the levels and variables studied, making it possible to evaluate the interactions between all the factors and their influence on the desired response [48, 49].

Published studies confirm that, compared with univariate methodology, multivariate methodologies can help to increase enzyme production by finding more favorable process and reaction conditions. Al-Saman et al., in their study of lovastatin production by Aspergillus terreus ATCC 10020 by SSF, observed a 600% increase in production when using a multivariate technique, the central composite design [50]. In another SSF study, Das et al. optimized the production of Penicillium amphipolaria inulinase using the central composite design and obtained a 310% increase in enzyme production [51]. Other works in the literature report significant increases in enzyme production when using multivariate optimization techniques [5, 52]. Despite its advantages over the univariate methodology, the multivariate methodology is restricted to the experimental domain (lower and upper levels) and cannot extrapolate beyond that domain. One way to avoid this limitation is to use artificial neural networks (ANN) [53].

ANNs are a class of bioinspired computational algorithms within the field of artificial intelligence and are applied to the modeling, prediction, and classification of data from different areas of knowledge [53]. Their fundamental elements are artificial neurons, which are organized in a network with the ability to learn and generalize the input–output relationship of the available data set [54].

Thus, this review demonstrates the importance of bioprocesses such as SSF in the production of enzymes and of ANN as an optimization tool for these processes.

Application of Chemometric Techniques in the Production of Enzymes by SSF

The conventional multivariate statistical techniques used for enzyme production by SSF (central composite design (CCD), Box-Behnken design (BBD), Doehlert design (DD), and simplex-centroid mixture design (SC)) differ according to the experimental objective. All of them share one requirement: the maximum and minimum values of each variable included in the experimental domain must be determined [55]. In addition, the user must distinguish two types of variables, the independent ones (factors) and the dependent ones (responses). The independent variables influence the response and can be divided into two distinct groups, the process variables (fermentation time, pH, initial humidity, incubation temperature, and inoculum concentration, among others) and the mixing variables [56] (Table 1).
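As a rough illustration of how such a design matrix is laid out, the sketch below builds a central composite design in coded units and maps the coded levels onto real process levels. The function names, the choice of a rotatable axial distance, and the temperature and moisture ranges are assumptions made for this example; they are not taken from the cited studies.

```python
from itertools import product


def central_composite_design(n_factors, n_center=1):
    """Return CCD runs in coded units: factorial, axial, and center points."""
    # Assumption: rotatable design, alpha = (number of factorial runs) ** 0.25
    alpha = (2 ** n_factors) ** 0.25
    factorial = [list(p) for p in product((-1.0, 1.0), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for level in (-alpha, alpha):
            point = [0.0] * n_factors
            point[i] = level  # only one factor leaves the center at a time
            axial.append(point)
    center = [[0.0] * n_factors for _ in range(n_center)]
    return factorial + axial + center


def decode(coded, low, high):
    """Map a coded level to real units, with -1 -> low and +1 -> high."""
    midpoint = (low + high) / 2
    half_range = (high - low) / 2
    return midpoint + coded * half_range


design = central_composite_design(n_factors=2, n_center=3)

# Hypothetical ranges: temperature 25-35 °C and moisture 50-70 %
for t_coded, m_coded in design:
    print(round(decode(t_coded, 25, 35), 2), round(decode(m_coded, 50, 70), 2))
```

With a rotatable axial distance, the axial runs fall slightly outside the -1 to +1 range, so the real minimum and maximum of each variable should be chosen with those points in mind.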

Table 1 Independent variables (process and mixing) applied in the optimization of fermentation processes

When a matrix is built from process variables, their levels can vary independently of each other; however, when mixing variables are used, such as the proportion of residues in fermentation processes, the response depends on the proportion of each component, and the level of each component must vary in relation to the others [55, 56].
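The constraint on mixing variables can be made concrete with a small sketch that enumerates the points of a simplex-centroid design, in which every run is a blend whose proportions sum to one. The function name and the three-residue example are assumptions for illustration only.

```python
from itertools import combinations


def simplex_centroid(n_components):
    """Return the simplex-centroid design points for a mixture.

    Each point blends a non-empty subset of components in equal parts,
    so the proportions in every run add up to 1.
    """
    points = []
    for size in range(1, n_components + 1):
        for subset in combinations(range(n_components), size):
            point = [0.0] * n_components
            for i in subset:
                point[i] = 1.0 / size
            points.append(point)
    return points


# Hypothetical case: three agro-industrial residues blended as substrate
mixtures = simplex_centroid(3)
for proportions in mixtures:
    print([round(p, 3) for p in proportions])
```

For three components this gives the three pure residues, the three binary blends, and the ternary centroid, so raising one proportion always lowers at least one other.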

The statistical techniques mentioned up to this point have disadvantages: the univariate methodology cannot evaluate interactions between variables, and in the multivariate designs the maximum point that can be found is restricted to the experimental domain. These limitations can be overcome using artificial neural networks hybridized with optimization techniques.

Artificial Neural Network

Artificial neural networks are data processing systems based on a mathematical model inspired by the neural structure of living organisms. The most widely used artificial neuron computes a multiple linear regression of its inputs, and the result is propagated to the output through a possibly non-linear function called the activation function (Table 2). Each artificial neuron has a number of linear regression weights equal to its number of inputs, plus a bias, a weight that allows a translation of the output to fit functions with non-zero mean [67]. The non-linear function gives the neuron the ability to model non-linear relationships between its inputs and output [68, 69]. An ANN is formed by a set of interconnected neurons, that is, a set of interconnected non-linear regressors, which gives it the property of a universal approximator [70].
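The neuron described above can be sketched in a few lines: a weighted sum of the inputs plus a bias, passed through an activation function. The function names, the three activation functions shown, and the input values are illustrative assumptions; they are not necessarily the entries of Table 2.

```python
import math


def identity(z):
    """Linear activation: the neuron reduces to a linear regression."""
    return z


def sigmoid(z):
    """Logistic activation, squashing the output into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation."""
    return max(0.0, z)


def neuron(inputs, weights, bias, activation):
    """One weight per input plus a bias, followed by the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)


# Hypothetical scaled inputs, e.g. temperature and moisture
inputs = [1.0, 2.0]
weights = [0.5, -0.25]
bias = 0.1

for activation in (identity, sigmoid, relu):
    print(activation.__name__, neuron(inputs, weights, bias, activation))
```

The same weights give a different output under each activation, which is how the choice of activation function changes the kind of relationship a neuron can represent.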

Table 2 Main activation functions in artificial neural networks

A determining factor for the capacity of effective generalization of the network is the way in which the neurons are arranged and interconnected, that is, their structure [71, 72].

ANN Structures

There are different models for organizing these neurons in the literature, each generating a network with specific functionality and application. Feedforward neural networks (FFNN) (Fig. 1a), also called multilayer perceptrons (MLP), have in their topology an input layer, an output layer, and at least one hidden layer; the term feedforward indicates that the signal travels in a single direction, from the input nodes through the hidden layers to the output nodes, without connections back to previous nodes [70]. In MLP, artificial neurons are organized in parallel, forming hidden layers between the input layer (input data) and the output layer (output data). If the outputs of a layer are all connected to the inputs of the next layer, the layers are called fully connected [71].
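A minimal sketch of the forward pass through a fully connected MLP follows, assuming one hidden layer with a tanh activation and a linear output neuron. The layer sizes, weights, and input values are arbitrary placeholders, not a trained model.

```python
import math


def identity(z):
    return z


def dense_layer(inputs, weights, biases, activation):
    """Fully connected layer: every neuron sees every input of the layer."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def mlp_forward(inputs, layers):
    """Propagate the signal in one direction, layer by layer."""
    signal = inputs
    for weights, biases, activation in layers:
        signal = dense_layer(signal, weights, biases, activation)
    return signal


# Hypothetical network: 3 inputs -> 2 hidden neurons (tanh) -> 1 output
layers = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1], math.tanh),
    ([[1.5, -0.8]], [0.05], identity),
]

prediction = mlp_forward([0.5, 1.0, -1.0], layers)
print(prediction)
```

Each layer only consumes the output of the layer before it, which is what makes the network feedforward.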

Fig. 1 Structural representation of feedforward neural networks (FFNN), or multilayer perceptrons (MLP) (a), recurrent neural network (RNN) (b), modular neural network (MNN) (c), and their weights (w)

Recurrent neural networks (RNN) (Fig. 1b) are dynamic systems with memory that incorporate feedback loops and, consequently, have powerful representation capacity [73]. In RNN, the connections between nodes form a closed cycle, which is what differentiates them from FFNN. They are more suitable for processing sequential input and learning long-term dependencies within the data, since each sample is considered dependent on previous data [74]. Recurrent neural networks are used to predict oil production [75], in computational process prediction [76, 77], and in meteorological forecasts [78].
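The feedback loop can be illustrated with a simple recurrent step in which the hidden state from the previous sample is fed back in with the current input. This is a sketch of a basic (Elman-style) recurrence chosen for illustration; the function names and weights are assumptions, and the cited works may use other recurrent architectures.

```python
import math


def rnn_step(x, h_prev, w_in, w_rec, bias):
    """Compute the new hidden state from the input and the previous state."""
    h_new = []
    for i in range(len(h_prev)):
        z = bias[i]
        z += sum(w_in[i][j] * x[j] for j in range(len(x)))
        # Feedback connection: the previous state re-enters the network
        z += sum(w_rec[i][k] * h_prev[k] for k in range(len(h_prev)))
        h_new.append(math.tanh(z))
    return h_new


def run_rnn(sequence, h0, w_in, w_rec, bias):
    """Process a sequence sample by sample, carrying the state forward."""
    h = h0
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states


# Hypothetical one-input, one-state network fed the same value twice
states = run_rnn([[1.0], [1.0]], h0=[0.0], w_in=[[0.5]], w_rec=[[0.8]], bias=[0.0])
print(states)
```

Although both inputs are identical, the two states differ because the second step also receives the state produced by the first; setting the recurrent weight to zero removes that memory.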

Modular neural networks (MNN) (Fig. 1c) combine small neural networks that are coordinated by an intermediary to solve a problem; this type of network is indicated for avoiding the local minima encountered in larger networks, such as multilayer perceptrons [79]. After the separate modules are solved, their results are combined by an integration unit, which generates the overall output of the complex system [80]. Modular neural networks can be used in computing, for example in the creation of patterns [80]; they can assist neural networks trained on unbalanced training sets [81]; and they can be applied in medicine, through the classification of lung diseases [82].

Neural Network Learning

The ability of an ANN to learn the input–output relationship of a data set comes from optimizing the weights and biases of its neurons. This optimization aims to minimize a function of the error between the expected output for a given input and the output obtained by the network for the same input. Among the most-used error functions are the mean squared error (MSE) and the root mean squared error (RMSE) [83].
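Both error functions are short enough to write out directly. The sketch below uses their standard definitions; the enzyme activity values are invented for illustration.

```python
import math


def mse(expected, predicted):
    """Mean squared error between expected and predicted outputs."""
    squared_errors = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return sum(squared_errors) / len(squared_errors)


def rmse(expected, predicted):
    """Root mean squared error, in the same units as the output."""
    return math.sqrt(mse(expected, predicted))


# Hypothetical enzyme activities (U/g): measured vs. predicted by a network
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [1.0, 6.0, 6.0, 7.0]

print("MSE:", mse(measured, predicted))
print("RMSE:", rmse(measured, predicted))
```

Because RMSE is the square root of MSE, it is expressed in the units of the response, which often makes it easier to compare with experimental error.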

The optimization of the weights is called network training and is usually carried out by methods based on the gradient of the network error with respect to the weights and biases. The gradient, calculated by the chain rule, relates the network error to all weights and biases from the output layer back to the input layer; this direction of error propagation gives the process its name, backpropagation. The most commonly used gradient-based method is the Levenberg–Marquardt algorithm, whose main advantages are that it also uses an approximation of the Hessian of the error with respect to the weights and biases and that it adapts the learning rate during training [70].
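To show how the chain rule carries the error from the output back to the first weights, the sketch below trains a network with a single tanh hidden neuron on one sample. It uses plain gradient descent with a fixed learning rate, which is a simplification chosen for brevity and not the Levenberg–Marquardt method; all names and numbers are assumptions.

```python
import math


def forward(params, x):
    """Network with one tanh hidden neuron and a linear output neuron."""
    w1, b1, w2, b2 = params
    h = math.tanh(w1 * x + b1)
    y = w2 * h + b2
    return h, y


def loss(params, x, target):
    """Squared error for a single sample."""
    _, y = forward(params, x)
    return (y - target) ** 2


def backprop(params, x, target):
    """Gradient of the loss, propagated from the output to the input side."""
    w1, b1, w2, b2 = params
    h, y = forward(params, x)
    d_y = 2.0 * (y - target)     # dL/dy at the output
    d_w2 = d_y * h               # output-layer weight
    d_b2 = d_y                   # output-layer bias
    d_h = d_y * w2               # error passed back to the hidden neuron
    d_z = d_h * (1.0 - h * h)    # through tanh, whose derivative is 1 - h^2
    d_w1 = d_z * x               # hidden-layer weight
    d_b1 = d_z                   # hidden-layer bias
    return [d_w1, d_b1, d_w2, d_b2]


params = [0.5, 0.0, -0.3, 0.1]   # arbitrary initial weights and biases
x, target = 1.5, 0.8             # hypothetical training sample
learning_rate = 0.1

initial_loss = loss(params, x, target)
for _ in range(200):
    grads = backprop(params, x, target)
    params = [p - learning_rate * g for p, g in zip(params, grads)]
final_loss = loss(params, x, target)

print(initial_loss, final_loss)
```

The gradients for the hidden neuron reuse the quantities already computed for the output neuron, which is the practical benefit of propagating the error backwards.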

The optimization process can be carried out in batch (all data at once), in parts (minibatch), or with the data applied one by one, in sequence. Usually, a minibatch with a user-defined size is used, to avoid the high computational cost of optimizing over all data at once and to reduce the randomness of optimizing on each data point separately.
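The three modes differ only in how many samples are used per weight update, as the following sketch suggests. The helper name, the shuffling step, and the fixed seed are assumptions made so that the example is reproducible.

```python
import random


def minibatches(data, batch_size, seed=0):
    """Yield the data in shuffled chunks of at most batch_size samples."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    for start in range(0, len(indices), batch_size):
        yield [data[i] for i in indices[start:start + batch_size]]


samples = list(range(10))  # stand-ins for 10 experimental runs

# batch_size=len(samples) is full-batch training, batch_size=1 is
# sample-by-sample training, and anything in between is a minibatch.
batches = list(minibatches(samples, batch_size=4))
for batch in batches:
    print(batch)
```

In a training loop, one weight update would be computed from each yielded chunk, and one pass over all chunks uses every sample exactly once.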

The network training process depends on the number of neurons in the network (or number of hidden layers and number of neurons per layer in the case of MLP) and depends on the initial conditions of weights and bias used in the optimization process. This makes the training process of a network experimental, where different numbers of neurons and different initializations of the weights and bias must be tested until a satisfactory result of the error function is reached [71].

Another problem associated with training is overfitting, where the excess of neurons leads to models without generalization capacity, especially if the training data contains outliers or noise [71]. This is evidenced by a model with a high hit rate for training data, but a low hit rate for other data. A measure commonly used for the quantitative assessment of the generalization capacity of a network is the coefficient of determination, which represents an error measure with greater weighting for errors in data that are more distant from the average of the outputs [83].
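A quick way to look for this pattern is to compute the coefficient of determination separately for the training data and for data the network has not seen. The sketch below uses the usual definition, one minus the ratio of the residual sum of squares to the total sum of squares; the numbers are invented to mimic an overfitted model.

```python
def r_squared(expected, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(expected) / len(expected)
    ss_res = sum((e - p) ** 2 for e, p in zip(expected, predicted))
    ss_tot = sum((e - mean) ** 2 for e in expected)
    return 1.0 - ss_res / ss_tot


# Hypothetical outputs: close fit on training data, poor fit on unseen data
train_expected = [10.0, 12.0, 15.0, 20.0]
train_predicted = [10.1, 11.9, 15.2, 19.8]
other_expected = [11.0, 14.0, 18.0]
other_predicted = [14.0, 11.0, 13.0]

r2_train = r_squared(train_expected, train_predicted)
r2_other = r_squared(other_expected, other_predicted)
print("R2 on training data:", r2_train)
print("R2 on unseen data:", r2_other)
```

A value close to 1 on the training data combined with a much lower value on unseen data is the signature of overfitting described above.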

As a methodological tool to obtain networks that generalize well, the data available for learning are divided into training, validation, and test sets, generally comprising 70%, 15%, and 15% of the data, respectively [71].
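A split of this kind can be sketched as follows, assuming the samples are shuffled before being divided. The function name, the rounding rule, and the seed are illustrative choices.

```python
import random


def split_data(data, train_frac=0.70, val_frac=0.15, seed=0):
    """Shuffle the data and split it into training, validation, and test sets."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    n_train = round(train_frac * len(shuffled))
    n_val = round(val_frac * len(shuffled))
    train = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains is held out
    return train, validation, test


runs = list(range(20))  # stand-ins for 20 experimental runs
train, validation, test = split_data(runs)
print(len(train), len(validation), len(test))
```

Assigning the remainder to the test set guarantees that every sample lands in exactly one subset, even when the fractions do not divide the data evenly.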

The training data are used to minimize the error function; in this step, the number of neurons is also defined [70]. The validation data are used to measure the generalization capacity of a trained network, and if the network does not reach an adequate value, it is trained again. Finally, the test data, which are not used in the previous steps, measure the performance of the network on new data, simulating the application of the network obtained. It is usual to compute the error and generalization measures on the test data and to extrapolate these statistics to future applications on data statistically compatible with the test data, for which the expected output is usually unknown [70].

Use of ANN in SSF

The application of ANN hybridized with optimization techniques to enzyme production by SSF can be considered a new line of research, since only eight studies were found in the last 10 years (Table 3). In these studies, factors such as fermentation time, incubation temperature, humidity, pH, and supplementation with various salts were used as input data. The investigated enzymes were lipase by Penicillium roqueforti ATCC 10110 [83] and Candida rugosa NCIM 3462 [88], exoglucanase by P. roqueforti ATCC 10110 [84], laccase by Pleurotus ostreatus PVCRSP-7 [85], xylanase by Thermomyces lanuginosus VAPS-24 [86], tannase by Bacillus gottheilii M2S2 [87], protease by Rhizopus oryzae (SN5) / NCIM-1447 [89], and cellulase by Trichoderma stromaticum AM7 [90]. All of the reported studies obtained satisfactory precision and prediction (indicators of satisfactory modeling performance), as well as high values of R².

Table 3 Applications of hybridized artificial neural networks with optimization techniques for enzymatic production by solid state fermentation

Conclusion

Multivariate statistical techniques are successfully applied in solid state fermentation (SSF) to optimize parameters such as pH, incubation temperature, fermentation time, initial humidity, and substrate proportions. Artificial neural networks hybridized with optimization techniques can overcome the limitations of univariate methodologies (inability to evaluate interactions between variables and a high number of experiments) and of multivariate methodologies (inability to extrapolate beyond the experimental domain), enabling higher enzyme yields. Published reports already demonstrate the efficiency of this tool based on artificial intelligence for the production of enzymes by SSF.