Introduction

Due to the high complexity of biological materials, the design of processing operations and their control systems is difficult. In such cases, generic regression models often fail to provide the desired predictive results, and advanced modelling techniques are therefore required for better control and process optimization. Recently, the application of machine learning tools in bioprocessing operations has gained popularity due to their robustness, efficient pattern recognition capacity and adaptability to new and imprecise data (Bhagya Raj and Dash 2022). Among such tools, artificial neural networks (ANNs) have been widely applied to various food and food waste processing operations such as drying (Tarafdar et al. 2019a, b; Kumar et al. 2019), extraction (Muthusamy et al. 2019), hydrolysis (Sirohi et al. 2021), pressure cooking (Torrecilla et al. 2007), extrusion (Dalbhagat and Mishra 2019), filtration, food quality detection, fermentation (León-Roque et al. 2016), and food traceability (Wang et al. 2017). Microfluidization is an emerging liquid/semi-solid food processing technique that relies on high-velocity impact, intense shear and occasional cavitation to alter food quality characteristics as desired. Although microfluidization has been extensively used for several liquid food commodities such as carrot juice (Koley et al. 2020), peach juice (Wang et al. 2019), sea buckthorn juice (Abliz et al. 2021), sapodilla juice (Singh et al. 2022), yam juice (Liu et al. 2021) and tomato juice (Dai et al. 2022), among many others, the application of ANN to the microfluidization process is extremely limited (Tarafdar et al. 2020; Santos et al. 2022). Considering that a substantial share of the work on food applications of microfluidization has been done in just the past 5 years, this technique is likely to gain immense popularity in the juice processing industry in the near future.
It is therefore essential that intelligent models be applied to this mechanical processing technique, so that a process model can be developed to predict the quality parameters of beverages as the microfluidization processing conditions vary. To this end, ANNs could be exploited to generate robust process models with adaptive prediction capabilities.

To provide a brief overview, ANNs are data-driven machine learning models that process input signals in a way similar to the human brain. The basic architecture of an ANN consists of an input layer (consisting of independent variables), one or more hidden layers (based on the complexity of the problem), and an output layer (consisting of dependent variables). Each layer consists of one or more nodes (or neurons) which can be fired (or activated) using a transfer function. ANNs can recognize complicated patterns in data and can be trained through examples using various training algorithms (Abiodun et al. 2019). For supervised training, the ANN is provided with the inputs and outputs and allowed to learn the data pattern at a designated learning rate. The developed model can then be tested and validated using inputs that were not seen during training.
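
The layered structure described above can be illustrated with a minimal Python sketch of a single forward pass through a network with one hidden layer. The function name `forward`, the choice of a hyperbolic tangent transfer function in the hidden layer, the linear output layer and all numerical weights are assumptions made for illustration only; they are not the model developed in this work.

```python
import math


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """One forward pass through a single-hidden-layer feed-forward network."""
    # Hidden layer: weighted sum of the inputs plus bias, passed through tanh.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output layer: weighted sum of the hidden activations plus bias (linear).
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(output_weights, output_biases)
    ]


# Hypothetical toy network: 2 inputs, 3 hidden nodes, 1 output.
hidden_weights = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.8]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[1.0, -1.0, 0.5]]
output_biases = [0.2]

prediction = forward([0.25, 0.75], hidden_weights, hidden_biases,
                     output_weights, output_biases)
print(prediction)
```

Training consists of adjusting the weights and biases so that such predictions approach the known outputs.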

Although ANNs are generally expected to outperform traditional multiple regression and semi-empirical models, this may not be the case for all bioprocessing operations. It is therefore important to evaluate the efficacy of ANN for the specific operation being dealt with. In this work, ANNs were applied to the microfluidization process with sugarcane juice (a plant-based tropical thirst-quenching drink) as the test beverage. To the best of our knowledge, no studies to date have modelled the microfluidization processing of sugarcane juice using ANN. The current study also evaluates the predictive capabilities of ANN under varying modelling conditions and provides the optimum model setup for the unit operation under consideration.

Materials and methods

Material procurement and processing

Sugarcane stems (var. Co0238) were procured from a farm in Kundeshwari, Kashipur, Uttarakhand (India). The stems were brought to the lab, cleaned, cut into 0.45 m long pieces and blanched in hot water at 90 °C for 5 min. The blanched stems were cooled under running potable water at 25 °C and crushed using a twin-roller mechanical cane crusher to obtain the juice. The crushing operation was repeated thrice on each stem to recover the maximum amount of sugarcane juice. The juice was then filtered through a muslin cloth and subjected to microfluidization.

Sugarcane juice was microfluidized in a lab-scale microfluidizer with a Y-type diamond interaction chamber (M-110P, Microfluidics Corp., USA) under different pressure-cycle combinations. Four microfluidization pressures (50, 100, 150, 200 MPa) and four numbers of cycles (1, 3, 5, 7) were used. Passing the juice through the microfluidizer once constituted a single cycle. The outlet temperature of the juice was controlled at 25 °C through circulation of cold water. The microfluidized juice was collected in sterile glass bottles, capped, kept at 4 °C and analyzed for quality characteristics within 24 h.
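
For illustration, the pressure-cycle combinations can be enumerated with a short Python sketch. It assumes a full factorial arrangement of the four pressures and four cycle counts, which is consistent with the 16 experiments reported in this work; the variable names are hypothetical.

```python
from itertools import product

PRESSURES_MPA = (50, 100, 150, 200)  # microfluidization pressures
CYCLES = (1, 3, 5, 7)                # number of passes through the chamber

# Assumed full factorial design: every pressure paired with every cycle count.
design = list(product(PRESSURES_MPA, CYCLES))

for run, (pressure, cycles) in enumerate(design, start=1):
    print(f"Run {run:2d}: {pressure} MPa, {cycles} cycle(s)")
```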

Determination of sugarcane juice quality parameters

The microfluidization process was intended to preserve the quality of sugarcane juice in terms of maintaining the physico-chemical and functional properties, inactivating deteriorative enzymes, reducing microbial load and lowering the sedimentation rate. In this regard, a total of 18 responses including the total soluble solids (TSS), antioxidant activity by four assays (2,2-diphenyl-1-picrylhydrazyl radical scavenging, ferric chloride antioxidant power, hydroxyl radical scavenging activity and metal ion chelating activity assay), total phenolic content (TPC), total flavonoid content (TFC), aerobic plate count (APC), yeast and mold count (YMC), polyphenol oxidase activity (PPO), peroxidase activity (POD), sucrose neutral invertase activity (SNI), reducing sugars (RS) and total sugars (TS), chlorophyll content (CHL), color change (ΔE), sensory acceptability and sedimentation rate (SR) were evaluated using standard procedures. TSS was measured using a digital refractometer (RX-7000, Atago, Japan). The detailed methodology of the antioxidant assays can be found in the work of Tarafdar et al. (2021a, b, c). TPC and TFC were analysed using the Folin-Ciocalteu reagent assay (Singleton et al. 1999) and the AlCl3 colorimetric assay (Chang et al. 2002), respectively. APC and YMC were determined using the protocols specified by APHA. PPO, POD and SNI were determined using the protocols given by Mao et al. (2007) and Etzbach et al. (2019). RS and TS were evaluated using the methods detailed by Miller (1959) and Dubois et al. (1956), respectively. CHL was analysed using the method of Qudsieh et al. (2002), and ΔE was determined using a handheld colorimeter (Konica Minolta Chroma-400, Japan). SR was determined using the method described by Tarafdar et al. (2021a, b, c), and sensory acceptability was evaluated on a 9-point hedonic scale by 29 semi-trained panellists who gave their prior consent.
The major focus of this investigation was to elucidate the applicability of ANN to the microfluidization process.

Data modelling through artificial neural network

The data collected over 16 experiments were subjected to artificial neural network (ANN) modelling in MATLAB R2012b (MathWorks Inc., USA). The ANN architecture was built with one input, one hidden and one output layer as a feed-forward backpropagation (FFBP) network. This network allows the training error (the difference between predicted output and actual output) to be backpropagated through the network, thereby allowing it to re-evaluate its weights and biases. The backpropagation process continues until a desired minimum level of training error is obtained. The microfluidization pressure (50, 100, 150, 200 MPa) and number of cycles (1, 3, 5, 7) were fed to the model as inputs, while the quality characteristics described earlier were provided as targets.
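
As a rough illustration of the backpropagation loop described above, the following Python sketch trains a small single-hidden-layer network by plain full-batch gradient descent. This is a deliberately simplified stand-in and not the training algorithms used in this work; the function name `train`, the learning rate, the number of epochs, the input scaling and the linear target are all assumptions, and the data are synthetic placeholders rather than measurements from this study.

```python
import math
import random


def train(samples, n_hidden=3, lr=0.05, epochs=200, seed=0):
    """Plain gradient descent for a 2-input, tanh-hidden, linear-output network.

    Returns the training MSE recorded at the start of every epoch.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    n = len(samples)
    history = []
    for _ in range(epochs):
        gw1 = [[0.0, 0.0] for _ in range(n_hidden)]
        gb1 = [0.0] * n_hidden
        gw2 = [0.0] * n_hidden
        gb2 = 0.0
        sq_err = 0.0
        for x, target in samples:
            # Forward pass.
            h = [math.tanh(w1[j][0] * x[0] + w1[j][1] * x[1] + b1[j])
                 for j in range(n_hidden)]
            y = sum(w2[j] * h[j] for j in range(n_hidden)) + b2
            err = y - target
            sq_err += err * err
            # Backward pass: propagate the error to every weight and bias.
            gb2 += err
            for j in range(n_hidden):
                gw2[j] += err * h[j]
                delta = err * w2[j] * (1.0 - h[j] ** 2)  # tanh derivative
                gb1[j] += delta
                gw1[j][0] += delta * x[0]
                gw1[j][1] += delta * x[1]
        history.append(sq_err / n)
        # Update step (gradient of half the mean squared error).
        b2 -= lr * gb2 / n
        for j in range(n_hidden):
            w2[j] -= lr * gw2[j] / n
            b1[j] -= lr * gb1[j] / n
            w1[j][0] -= lr * gw1[j][0] / n
            w1[j][1] -= lr * gw1[j][1] / n
    return history


# Synthetic placeholder data (NOT the study's measurements): inputs are the
# pressure and cycle levels scaled to (0, 1]; the target is an arbitrary
# linear function chosen only to give the network something to learn.
samples = [
    ((p / 200.0, c / 7.0), 0.6 * (p / 200.0) + 0.3 * (c / 7.0))
    for p in (50, 100, 150, 200)
    for c in (1, 3, 5, 7)
]
history = train(samples)
print(f"MSE at first epoch: {history[0]:.4f}, at last epoch: {history[-1]:.4f}")
```

The recorded error history is the quantity that a stopping rule would monitor to decide when the desired minimum training error has been reached.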

Different activation functions were used for constructing the ANN architecture. In the output layer, a linear approximation activation function (purelin) was used while in the hidden layer, both tansigmoidal and logsigmoidal activation functions were used individually, and their relative contribution to model improvement was evaluated based on the correlation coefficient (R) and the mean square error (MSE). The purpose of the activation functions was to determine if a particular input signal is important to the output prediction process. Additionally, the effect of two separate training algorithms, namely Levenberg–Marquardt (LM) and gradient descent with momentum and adaptive learning rate backpropagation (GDX), on the model prediction capability was also tested. The training algorithms were chosen based on their speed of convergence (attainment of minimum error). LM is a fast-converging algorithm while GDX is a slow-converging algorithm. Although fast-converging algorithms can lead to more accurate training, they can sometimes overshoot the minimum error point; hence, the latter was also considered. The number of hidden layer neurons (HLNs) was varied from 5 to 30 in steps of 5 neurons. The range of the neurons was decided based on preliminary model runs which showed network degeneration beyond 30 neurons. The training condition with the highest R (training) at the lowest possible HLN and MSE (validation) was considered desirable. The data was randomly divided into 70%, 15% and 15% for training, validation and testing, respectively, using the dividerand function of MATLAB. The models were trained thrice and the best condition set based on the highest training R was selected for further evaluation.
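
The three transfer functions named above have simple closed forms, sketched here in Python for illustration. The function names mirror the MATLAB names used in the text, but the implementations below are a plain re-expression of the standard definitions (hyperbolic tangent sigmoid, logistic sigmoid and identity) and not MATLAB code; the sample input values are arbitrary.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid: output lies between -1 and +1."""
    return math.tanh(x)


def logsig(x):
    """Logistic sigmoid: output lies between 0 and 1.

    Written in two branches so that math.exp never overflows.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def purelin(x):
    """Linear transfer function: the output equals the input."""
    return x


# Arbitrary sample arguments showing the different output ranges.
for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(f"x = {x:+.1f}  tansig = {tansig(x):+.4f}  "
          f"logsig = {logsig(x):.4f}  purelin = {purelin(x):+.1f}")
```

For strongly negative arguments, tansig approaches −1 whereas logsig approaches 0, which is the difference in output range between the two hidden-layer options.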

Results and discussion

Separate ANN models were constructed that differed in training algorithm (LM, GDX), hidden layer transfer function (tansigmoid or T, logsigmoid or L) and number of hidden layer neurons (5–30). Initially, the combinations of these conditions were screened based on a cut-off criterion of R = 0.95 (Fig. 1A). Based on this preliminary screening, it was observed that LM + T models exhibited R in the range of 0.69–0.90 and were therefore not considered further. Tansigmoidal functions vary in the range of −1 to +1. Hence, the tansigmoidal function will generate a value closer to −1 if the argument of the function is significantly negative, thus continuing the model learning process. In contrast, a logsigmoidal function, which varies in the range of 0 to 1, will generate a value close to 0 for strongly negative arguments, which will almost stop the model from learning from the dataset (Javed et al. 2022). This indicates that a faster convergence algorithm with a tansigmoidal activation function may converge too fast and could lead to network degeneration, which was observed in the current investigation (LM + T). Of the remaining three conditions, two models (LM + L and GDX + L) met the desired criterion at HLNs > 15 (R = 0.97–0.99), while the GDX + T model met the criterion at 10 HLNs (R = 0.958). It can also be inferred from the results that models with a tansigmoidal transfer function were able to reach R > 0.90 at lower HLNs than models with a logsigmoidal transfer function. In the case of LM + L, the algorithm may reach convergence sooner (due to LM) where it can reduce the learning rate (due to L) and thus achieve higher accuracy. However, with a slower convergence algorithm such as GDX, which includes a variable learning rate component, both activation functions will lead to good prediction accuracy. Overall, the GDX + T model was preferred as it could reach the desired accuracy (R² = 0.917) at lower HLNs, which was one of the criteria for model selection.
The tansigmoidal activation function in the hidden layer has also been preferred by other researchers for predicting the hydration rate of chickpea (Kumar et al. 2021), quantifying secondary metabolites in grapes (Boido et al. 2022), modelling the volume and surface area of apple (Ziaratban et al. 2017) and modelling infrared, cabinet, vibro-fluidized bed and vacuum drying of food commodities (Sadeghi et al. 2019; Subramanyam and Narayanan 2023; Dhurve et al. 2021; Tarafdar et al. 2021c), among other works.

Fig. 1 A Variation in ANN model training R with number of hidden layer neurons (HLNs); B Validation mean square error with different training algorithm, transfer function and HLNs; C Testing and validation R of the selected ANN model (GDX + T, 10 HLNs); D Distribution of model error at all processing conditions

To further confirm the model selection, the validation MSE was considered based on a cut-off criterion of 0.1 (Fig. 1B). It was observed that LM + L and GDX + T models showed MSE < 0.1 at 5 HLNs. Although both models had a lower R at 5 HLNs, the GDX + T model met the MSE criterion even at 10 HLNs and was therefore selected for the prediction of sugarcane juice quality parameters based on microfluidization pressure-cycle combinations. The developed model was then tested and validated with the remaining 30% of the data (divided equally), which had not been used for training. The model yielded an R of 0.98 (R² = 0.96) for testing and 0.99 (R² = 0.98) for validation (Fig. 1C). It is noteworthy that the test dataset, which was used to test the prediction capability of the trained model, showed 96% accuracy. Ciğeroğlu et al. (2018) also suggested that closeness in the training and testing R² is a sign of an efficient model. Further, to avoid model overfitting, the number of iterations was limited to the point where the error on the validation dataset exceeded the training error. The weights and biases of the selected model at constrained iterations are shown in Table 1. To check whether the model predictions were unbiased, the error associated with each quality parameter at all microfluidization pressure-cycle combinations was assessed. Figure 1D shows that the distribution of error was relatively uniform, indicating unbiased predictions.
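
The two selection metrics used above can be computed as in the following Python sketch. It assumes that R is the Pearson correlation coefficient between targets and model outputs and that R² is obtained by squaring R; the function names and the example target and prediction values are hypothetical and are not data from this study.

```python
import math


def mse(targets, predictions):
    """Mean square error between targets and model outputs."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


def pearson_r(targets, predictions):
    """Pearson correlation coefficient between targets and model outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_p = sum(predictions) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(targets, predictions))
    var_t = sum((t - mean_t) ** 2 for t in targets)
    var_p = sum((p - mean_p) ** 2 for p in predictions)
    return cov / math.sqrt(var_t * var_p)


# Hypothetical targets and predictions, for illustration only.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.1, 1.9, 3.2, 3.9]
r = pearson_r(targets, predictions)
print(f"R = {r:.3f}, R^2 = {r * r:.3f}, MSE = {mse(targets, predictions):.4f}")

# Squaring the R values reported in the text gives the corresponding R^2.
for reported_r in (0.958, 0.98, 0.99):
    print(f"R = {reported_r} gives R^2 = {reported_r ** 2:.4f}")
```

Small differences between these squared values and the R² values quoted in the text arise from rounding of the reported R.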

Table 1 ANN model weights and biases for the prediction of sugarcane juice quality parameters at different microfluidization pressures and cycles

Conclusion

This work successfully developed a neural network-based intelligent model for predicting the quality characteristics of microfluidized sugarcane juice based on microfluidization pressures and cycles. The results revealed that the selected ANN architecture (GDX, tansigmoid, 10 HLNs) can be used to predict the juice characteristics with reasonably high accuracy (> 90%). Testing and validating the model with data not used for training also confirmed the prediction accuracy (96%) of the model. The findings of this work are meant to help other researchers develop ANN models for the microfluidization of other food commodities. Such studies may lead to the development of a generalized neural network architecture for the microfluidization technology.