Introduction

Pork is a significant component of the human diet and provides multiple nutrients. With the continuous improvement of living standards, consumer demand for high-quality pork is also increasing (McCarthy et al. 2004, Lind 2007). The quality of pork can usually be judged by its color, which is mainly determined by the content and state of its myoglobin (Mancini and Hunt 2005, Calnan et al. 2016, Nair et al. 2014). Frozen storage is the most common approach for preserving pork quality. However, prolonged frozen storage can cause nutrient loss and spoilage, and in particular induces changes in the myoglobin content of pork (Li and Sun 2002, Zhuang et al. 2022). Myoglobin (Mb) is a binding protein composed of a polypeptide chain and a heme prosthetic group, and it has three redox states: deoxymyoglobin (purplish red), oxymyoglobin (bright red), and metmyoglobin (reddish brown) (Rong et al. 2023, Suman and Joseph 2013). During storage, changes in the ratio of the three forms of myoglobin significantly affect the color of pork. As storage time increases, deoxymyoglobin (DeoMb) and oxymyoglobin (MbO2) are gradually oxidized to metmyoglobin (MetMb); meanwhile, the pork turns reddish brown and then spoils (Chen et al. 2016, Stewart et al. 1965). MetMb content is therefore considered an essential indicator for evaluating the quality of frozen pork, and high-accuracy detection of MetMb content in pork is of great significance for public health and food safety. Traditional detection techniques such as ultraviolet–visible spectrophotometry and liquid chromatography have been applied to the determination of MetMb content (Mancini and Ramanathan 2014, Lindsay et al. 2015).
Of these, ultraviolet–visible spectrophotometry has the advantages that it does not consume samples and that low concentrations of salt and most buffer solutions do not interfere with the determination, so it can be used as a quantitative standard; liquid chromatography offers high separation efficiency, good selectivity, high detection sensitivity, automated operation, and a wide range of applications. However, both methods have obvious disadvantages, such as high analysis cost, complicated and expensive equipment, long analysis time, cumbersome operation, and the need to destroy samples, which make them poorly suited to routine detection of MetMb content. Thus, it is of theoretical and practical significance to provide a highly accurate, robust, portable, and economical non-destructive detection method for monitoring frozen pork quality (Rong et al. 2023).

VIS–NIR spectroscopy is a non-destructive detection technique with numerous advantages, such as rapid analysis, high accuracy, easy operation, no need for sample pre-treatment, and no pollution, which make it well suited to analyzing the intrinsic quality of food products. It relies on the different absorption and scattering of light by different biomolecules and can therefore accurately capture the information carried by the substance. Recently, it has been recognized as an attractive solution for detecting various chemical substances in meat. Nguyen et al. (2019) verified that VIS–NIR diffuse reflectance spectroscopy can quantitatively detect the content and oxygenation process of MetMb in pork and beef. Cheng et al. (2020) proposed a method using a VIS–NIR hyperspectral imaging system combined with the iVISSA-CARS-LSSVM model to predict the variation of Mb (DeoMb, MbO2, MetMb) content of Tan sheep during cold storage. In the same year, Yuan et al. (2020) demonstrated that VIS–NIR spectroscopy can rapidly predict MetMb content in cooked Tan sheep. Previous studies have revealed the capabilities of VIS–NIR spectroscopy for meat quality detection, but most have concentrated on fresh meat. With increasing demand in the frozen food market, regular quality testing of frozen meat is necessary. Rong et al. (2023) determined Mb (DeoMb, MbO2, MetMb) content in frozen hind leg pork using a VIS–NIR spectrometer combined with Si-CARS-PLS modeling. Compared with fresh pork, there are few reports on VIS–NIR detection of frozen pork quality.
The evaluation results of frozen pork quality are limited by the differences in MetMb content of different pork parts, and the robustness of the algorithm also seriously affects the accuracy of the prediction model. Therefore, extensive systematic studies are needed for the prediction of MetMb content in frozen pork by VIS–NIR spectroscopy.

This paper demonstrates a rapid, non-destructive detection method that can effectively improve the detection accuracy of metmyoglobin content in frozen pork. It aims to reveal the relationship between the spectral characteristics of pork and metmyoglobin content as frozen storage time changes. Prediction models are established with the partial least squares (PLS) and random forest (RF) algorithms, and the successive projections algorithm (SPA) is used to improve model accuracy. The approach addresses the cumbersome and time-consuming operation of traditional metmyoglobin detection methods and provides a reference for the quality and safety detection of frozen food.

Materials and Methods

Figure 1 illustrates the complete process for determining MetMb content in frozen pork. It consisted of five main parts (distinguished by dashed boxes of different colors): (a) preparing frozen pork samples; (b) establishing a VIS–NIR spectral detection system to collect raw spectral data from samples with different freezing times; (c) determining the MetMb content by conventional spectrophotometry; (d) selecting the optimal scheme from six spectral pre-processing methods; and (e) developing generic models for detecting the relative MetMb content in frozen pork using PLS and RF, respectively, in conjunction with the SPA characteristic wavelength selection algorithm, and ultimately selecting the optimal modeling combination.

Fig. 1 Schematic diagram of the operational process for the determination of MetMb in frozen pork by VIS–NIR spectrometry

Frozen Pork Sample Preparation

Fresh pork tenderloins purchased from a large fresh-food supermarket were selected as experimental samples. The samples were transported rapidly under low-temperature preservation to ensure freshness and avoid contamination. To avoid the influence of other biological tissues on the sampling results, tissues such as tendons and fat were removed, and only the lean meat portion was retained. During preparation, utensils such as knives, chopping boards, and polypropylene trays were disinfected with 75% alcohol. The pork was cut into multiple samples (5 cm × 5 cm × 2 cm in size), which were then sealed in sterile bags and numbered. They were stored in the freezer compartment (−18 °C) of a refrigerator for subsequent experimental tests. To determine the change in MetMb content with frozen storage time, a detailed experimental plan was designed. In the present work, a total of 144 frozen pork samples were tested for spectral data and myoglobin content using VIS–NIR spectroscopy and spectrophotometry, respectively. The test procedure was repeated at eight storage times (1st, 14th, 28th, 35th, 42nd, 56th, 63rd, and 70th days), with 18 samples used for each test.

VIS–NIR Spectral Acquisition System

Spectral data of frozen pork samples were collected with the VIS–NIR spectroscopy system shown in Fig. 2. The system mainly comprised the frozen pork test samples, a food-grade black tray, an LED light source (LED-7019), a passive optical fiber (P600-2-SR, Ocean Optics), a spectrometer (QE65 Pro, Ocean Optics), and a high-speed computer. The frozen pork sample was placed in the food-grade black tray on an optical support frame and illuminated by the LED light source. The resulting diffusely reflected light was received by an optical fiber probe and transmitted to the spectrometer via the passive optical fiber. The spectrometer converted the detected optical signals into electrical signals, and the spectral data were then processed and analyzed using the corresponding software (SpectraSuite) installed on the computer. The spectrometer can collect spectral data in the wavelength range of 200–1100 nm with a spectral resolution of 1.2 nm. However, only data within 400–700 nm were selected for modeling, because the data at both ends of the spectra (200–400 nm and 700–1100 nm) were susceptible to background noise. To minimize the impact of noise on the results, the number of averaged scans and the smoothing level in the software were set to 3 and 1, respectively. The integration time was set to 500 ms.

Fig. 2 VIS–NIR spectroscopy data acquisition system for frozen pork quality detection

The experiments were performed at room temperature and normal atmospheric pressure with suitable humidity. During the measurements, the dark and reference spectra had to be recorded accurately. The dark spectrum was measured under darkroom conditions with the light source switched off, and the spectral reflectance of a calibrated standard whiteboard was taken as the reference spectrum. To reduce random errors, the raw spectrum of each frozen pork sample was obtained by averaging three measurements. The raw spectral data were recorded quickly to prevent the frozen pork from thawing, and after testing, the pork was returned to the freezer at −18 °C for storage. The above process was repeated for each subsequent test.
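The role of the dark and reference spectra can be illustrated with a short sketch. The conversion formula is not given in the text, so the usual relative-reflectance expression R = (S − D)/(W − D) is assumed here, with S the sample signal, D the dark signal, and W the whiteboard signal. The function names and the raw counts are hypothetical.

```python
def average_scans(scans):
    """Average repeated scans of one sample, wavelength by wavelength."""
    n = len(scans)
    return [sum(values) / n for values in zip(*scans)]


def relative_reflectance(sample, dark, reference):
    """Assumed correction: R = (S - D) / (W - D) at every wavelength."""
    if not (len(sample) == len(dark) == len(reference)):
        raise ValueError("all spectra must share the same wavelength grid")
    reflectance = []
    for s, d, w in zip(sample, dark, reference):
        span = w - d
        if span <= 0:
            raise ValueError("reference signal must exceed the dark signal")
        reflectance.append((s - d) / span)
    return reflectance


# Hypothetical raw counts at three wavelengths, three repeated scans.
scans = [
    [520.0, 610.0, 700.0],
    [500.0, 590.0, 700.0],
    [480.0, 600.0, 700.0],
]
dark = [100.0, 100.0, 100.0]
white = [900.0, 1100.0, 1300.0]

mean_scan = average_scans(scans)
print(relative_reflectance(mean_scan, dark, white))
```

Averaging before the correction mirrors the three-measurement averaging described above; averaging the corrected spectra instead would give the same result, because the correction is linear in S.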

Spectrophotometric Determination of MetMb Content

Traditional methods are used not only to verify the accuracy of spectral non-destructive testing but also to provide the reference values needed to establish a prediction model. Therefore, the reference MetMb content was determined by spectrophotometry according to the method of Krzywicki (1982). The experimental steps are briefly described below. First, a pork homogenate was prepared: a 5 g sample was added to phosphate buffer (25 mL, 0.04 mol/L, pH 6.8) and homogenized for 30 s. Then, centrifugal separation was carried out: after 1 h of storage in a refrigerator at 4 °C, the homogenate was centrifuged in a high-speed centrifuge (4500 r/min) for 20 min to separate the supernatant from the precipitate. Next, the absorbance was determined: the supernatant was filtered through filter paper, phosphate buffer was used as a blank control, and a spectrophotometer was used to measure the absorbance of the filtrate at 525 nm, 545 nm, 565 nm, and 572 nm. Finally, the MetMb content was calculated according to the equation reported in Ref. 17, which is expressed as follows:

$${P}_{\text{Metmb}}=-2.514{A}_{1}+0.777{A}_{2}+0.800{A}_{3}+1.098$$
(1)

where \({P}_{\text{Metmb}}\) represents the relative content of MetMb, and \({A}_{1}={A}_{572}/{A}_{525}\), \({A}_{2}={A}_{565}/{A}_{525}\), and \({A}_{3}={A}_{545}/{A}_{525}\) are the ratios of the absorbances at 572 nm, 565 nm, and 545 nm, respectively, to the absorbance at 525 nm.
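As a worked illustration of Eq. 1, the sketch below computes the relative MetMb content from four absorbance readings. It takes the coefficient of A1 as negative, as in the equation published by Krzywicki (1982); with a positive coefficient the result would exceed 1 for any positive absorbances and could not be a relative content. The function name and the absorbance values are hypothetical.

```python
def metmb_fraction(a525, a545, a565, a572):
    """Relative MetMb content from absorbances at 525, 545, 565 and 572 nm.

    Assumes the Krzywicki (1982) form of Eq. 1, with a negative A1 term.
    """
    if a525 <= 0:
        raise ValueError("absorbance at 525 nm must be positive")
    a1 = a572 / a525
    a2 = a565 / a525
    a3 = a545 / a525
    return -2.514 * a1 + 0.777 * a2 + 0.800 * a3 + 1.098


# Hypothetical absorbance readings of a filtered supernatant.
fraction = metmb_fraction(a525=0.50, a545=0.48, a565=0.47, a572=0.45)
print(f"Relative MetMb content: {fraction:.3f}")
```

Because only ratios to the 525 nm absorbance enter the equation, the result does not change if all four readings are scaled by the same dilution factor.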

Chemometric Methods

Chemometric methods were applied to the raw spectral data for spectral pre-processing, wavelength selection, and calibration modeling. The raw spectra may be affected by environmental factors and background noise. Under freezing conditions in particular, the muscle fiber structures are twisted and warped (Cheng et al. 2020), so the raw spectral curves are disturbed by baseline drift and light-scattering noise. It is therefore essential to apply spectral pre-processing to remove these irrelevant factors from the spectra (Arslan et al. 2021). In this work, six spectral pre-processing methods were compared: 1st derivative, 2nd derivative, Savitzky-Golay convolutional smoothing (S-G smoothing), standard normal variate (SNV), multiplicative scatter correction (MSC), and vector normalization (VN). Afterwards, calibration models for quantitative analysis were established with a classical linear calibration method, partial least squares (PLS) (Haaland and Thomas 1988), and a common machine learning method, random forest (RF) (Pavlov 1997). PLS is one of the most widely used spectral modeling methods in food quality detection. It explores the relationship between response variables and predictor variables and combines the advantages of principal component analysis, multiple linear regression, and correlation analysis. RF is an ensemble learning algorithm that uses multiple decision trees for classification and regression; it can handle large data sets and reduces the risk of overfitting. The two methods share some advantages: both can quickly handle high-dimensional data and multicollinearity and offer good stability, accuracy, and robustness.
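Two of the six pre-processing methods, SNV and MSC, are simple enough to sketch in a few lines. The sketch below is illustrative only and is not the implementation used in this work: it is written in plain Python, it assumes the mean spectrum as the MSC reference, and the function names and synthetic spectra are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale a single spectrum."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation of this spectrum
    return [(x - m) / s for x in spectrum]


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (x = a + b * ref) and
    corrected as (x - a) / b. The mean spectrum is the assumed default.
    """
    if reference is None:
        reference = [mean(column) for column in zip(*spectra)]
    ref_mean = mean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for spectrum in spectra:
        spec_mean = mean(spectrum)
        sxy = sum((r - ref_mean) * (x - spec_mean)
                  for r, x in zip(reference, spectrum))
        slope = sxy / sxx
        intercept = spec_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in spectrum])
    return corrected


# Synthetic spectra: one base shape with different offsets and gains,
# mimicking additive and multiplicative scatter effects.
base = [0.20, 0.40, 0.30, 0.50]
raw = [[a + b * v for v in base]
       for a, b in [(0.00, 1.0), (0.10, 1.2), (-0.05, 0.8)]]

for row in msc(raw):
    print([round(v, 4) for v in row])
print([round(v, 4) for v in snv(raw[1])])
```

In this synthetic case the three spectra differ only by offset and gain, so MSC maps all of them onto the mean spectrum; real spectra also carry chemical differences, which MSC leaves in place.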

The original spectral data usually contain redundant and uninformative variables. The successive projections algorithm (SPA) is a forward variable selection algorithm that minimizes collinearity in the vector space. Each candidate wavelength vector is projected onto the subspace orthogonal to the wavelengths already selected, and the sizes of the resulting projection vectors are compared. The wavelength with the largest projection vector is added to the candidate set, and the final characteristic wavelengths are then chosen on the basis of the calibration model. Compared with competitive adaptive reweighted sampling (CARS), the performance improvement offered by SPA is limited. The main reason is that its projection step is unsupervised: the selected variables maximize the explanation of the independent-variable space rather than of the response, so their explanatory ability is limited. However, its advantage is that it extracts a small number of characteristic wavelengths from the full band and thus quickly removes most of the redundant information in the original spectral matrix (Ye et al. 2008), which makes it well suited to screening characteristic wavelengths. To improve the prediction accuracy and enhance the robustness of the model, SPA was adopted to extract the characteristic wavelengths related to MetMb. In general, the best model is chosen according to the reliability of its predictions. The evaluation indicators are the correlation coefficient (Rc2) and root mean square error (RMSEC) of the calibration set, and the correlation coefficient (Rp2) and root mean square error (RMSEP) of the prediction set. The RMSE values are calculated as follows:

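Calibration set root mean square error:

$$RMSEC=\sqrt{\frac{\sum_{i=1}^{n}{\left({y}_{i,\text{actual}}-{y}_{i,\text{predicted}}\right)}^{2}}{n-1}}$$
(2)

Prediction set root mean square error:

$$RMSEP=\sqrt{\frac{\sum_{i=1}^{m}{\left({y}_{i,\text{actual}}-{y}_{i,\text{predicted}}\right)}^{2}}{m-1}}$$
(3)

where \({y}_{i,\text{actual}}\) and \({y}_{i,\text{predicted}}\) are the measured and predicted MetMb contents of sample i, and n and m are the numbers of samples in the calibration and prediction sets, respectively.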

A higher correlation coefficient and a lower root mean square error indicate better predictive ability and accuracy (Zhou et al. 2020). All data analysis was performed in MATLAB R2021a.
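The evaluation indicators can be computed as in the following sketch. The analysis in this work was run in MATLAB, so the Python code is only illustrative. The RMSE follows Eqs. 2 and 3 with the sample count minus one in the denominator; the formula for R2 is not given in the text, so the usual coefficient of determination is assumed. Function names and data are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean square error with an (n - 1) denominator, as in Eqs. 2-3."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized sets with at least 2 samples")
    sse = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(sse / (n - 1))


def r_squared(actual, predicted):
    """Assumed definition: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted MetMb contents for five samples.
reference = [0.10, 0.20, 0.30, 0.40, 0.50]
predicted = [0.12, 0.18, 0.30, 0.42, 0.48]

print(f"RMSE = {rmse(reference, predicted):.4f}")
print(f"R2   = {r_squared(reference, predicted):.4f}")
```

The same two functions apply to both subsets: calling them on the calibration samples gives Rc2 and RMSEC, and calling them on the prediction samples gives Rp2 and RMSEP.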

Results and Discussion

Analysis of Physiochemical Values

Sample set partitioning based on joint x–y distances (SPXY) measures the similarity between samples using distances computed jointly from the spectral and physiochemical values of each pair of samples (Rong et al. 2023). The advantage of SPXY lies in its ability to cover the multidimensional vector space effectively, which improves the predictive ability of the model, and it has therefore been widely used in building spectral quantitative models. The sample set was divided into a calibration set and a prediction set in a ratio of 3:1 by the SPXY method. The calibration set is used to train the model, that is, to determine its learned parameters (for example, weights and biases). The prediction set is used only once, to evaluate the final model after training is complete; it takes part in neither parameter learning nor hyperparameter selection. Descriptive statistics for MetMb, such as content range, mean value, and standard deviation (SD), are shown in Table 1. The range of the calibration set covered that of the prediction set, so the distribution of the frozen pork samples in the two subsets was suitable for subsequent modeling. Because the distance between samples was calculated from both variables, the split preserved as much of the sample distribution as possible, which further favored subsequent modeling.
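The SPXY split can be sketched as a Kennard-Stone style selection on a joint distance. The sketch below assumes the common formulation in which the spectral distance and the reference-value distance are each divided by their maximum and then summed; it is a simplified illustration with hypothetical names and data, not the code used in this work.

```python
import math


def spxy_split(X, y, n_calibration):
    """Return (calibration_indices, prediction_indices) chosen by SPXY."""
    n = len(X)
    if not 2 <= n_calibration <= n:
        raise ValueError("n_calibration must be between 2 and the sample count")
    # Pairwise distances in spectral space (x) and in reference values (y).
    dx = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    dy = [[abs(y[i] - y[j]) for j in range(n)] for i in range(n)]
    max_dx = max(map(max, dx)) or 1.0  # guard against identical spectra
    max_dy = max(map(max, dy)) or 1.0  # guard against identical y values
    # Joint distance: each part is scaled by its maximum so both weigh equally.
    d = [[dx[i][j] / max_dx + dy[i][j] / max_dy for j in range(n)]
         for i in range(n)]
    # Start with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: d[p[0]][p[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    # Repeatedly add the sample farthest from its nearest selected neighbour.
    while len(selected) < n_calibration:
        nxt = max(remaining, key=lambda k: min(d[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return sorted(selected), remaining


# Hypothetical two-wavelength spectra and MetMb contents for eight samples.
X = [[0.30, 0.50], [0.32, 0.52], [0.35, 0.55], [0.38, 0.57],
     [0.40, 0.60], [0.43, 0.62], [0.45, 0.65], [0.50, 0.70]]
y = [0.12, 0.15, 0.20, 0.24, 0.28, 0.31, 0.35, 0.40]

calibration, prediction = spxy_split(X, y, n_calibration=6)  # 3:1 split
print("calibration:", calibration)
print("prediction: ", prediction)
```

Because the two most distant samples are always selected first, the extremes of the data set fall in the calibration set, which is consistent with the calibration range covering the prediction range.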

Table 1 Physiochemical values for MetMb of frozen pork in the calibration and prediction sets

Figure 3 depicts the variation of MetMb content in pork tenderloin during freezing. Freezing time had a significant effect on MetMb content, which increased overall with increasing frozen storage time. The MetMb content was relatively low (less than 15%) after only 24 h of freezing (1st day), whereas it increased sharply with frozen storage time from the 7th to the 42nd day. This was because both DeoMb and MbO2 in the frozen pork were oxidized to MetMb. During this oxidation, Fe2+ in the heme group was oxidized to Fe3+, resulting in browning and deterioration of the pork during freezing. From the 56th to the 70th day, the MetMb content increased more slowly, mainly because the conversion of DeoMb and MbO2 to MetMb slowed in this period. Pork contains a MetMb reductase (MRE) system (Alonso et al. 2016, Thiansilakul et al. 2010) that can reduce the MetMb formed back to ferrous myoglobin, thereby restoring the bright red color of the meat. However, the MRE activity of pork in the frozen state was greatly reduced, which made the redox reaction of myoglobin in frozen pork difficult to reverse, while the rate of the oxidation reaction was also reduced. For this reason, the observed MetMb content increased slowly and consistently.

Fig. 3 Changes in MetMb proportions of pork during freezing

Analysis of Average Reflectance Spectral Curves

The raw spectral curves of the pork samples are shown in Fig. 4a. The spectra of pork samples with different myoglobin contents followed similar trends but differed in reflectance across the entire spectral region. The main absorption bands of frozen pork were at 430 nm, 545 nm, and 565 nm, where chemical bonds absorb energy at specific wavelengths. The average reflectance spectral curves of the pork samples are shown in Fig. 4b. The average reflectance spectra of pork samples stored for different numbers of days had similar profiles, with pronounced spectral features at 545 nm and 565 nm.

Fig. 4 Reflectance spectral curves of pork at different freezing times. a Raw reflectance spectra. b Average reflectance spectra

Throughout frozen storage, changes in the reflectance spectra, pork color, and the contents of the three forms of myoglobin were correlated with each other (Suman and Joseph 2013). The frozen storage time was divided into three stages. In the first stage (the first month of freezing), the DeoMb content gradually decreased as DeoMb was converted to MbO2 and MetMb at approximately the same rate. The pork was dark red, and its spectral reflectance gradually decreased. In the second stage (the second month of freezing), the DeoMb content continued to decrease through conversion to MbO2 and MetMb. However, MbO2 increased faster than MetMb at this stage, so the pork appeared bright red and its reflectance began to increase. In the third stage (more than two months), MetMb increased faster than MbO2. At this point, the pork appeared reddish brown and began to spoil and deteriorate, and its reflectance decreased. During freezing, myoglobin had sufficient time for oxidative degradation or alteration of its secondary structure (e.g., α-helices and β-sheets) (Liu and Chen 2001). Moreover, the differences in reflectance values could be attributed to changes in the physical properties and major chemical components (e.g., water, fat, and protein) of pork caused by microbial spoilage and enzyme activity during storage (Cheng et al. 2020).

Results of PLS Model

When the original spectral data are collected, the signals are often affected by factors such as the state of the instrument, temperature, and stray light in the experimental environment, which degrade the accuracy of the data and of the resulting model. To eliminate the influence of noise and other factors, six methods (1st derivative, 2nd derivative, S-G smoothing, SNV, MSC, and VN) were used to pre-process the original spectral data. Among them, MSC can eliminate the physical effects caused by uneven particle distribution and particle size and reduce the interference of redundant spectral information with subsequent modeling.

PLS models for MetMb in frozen pork were established to evaluate the influence of different pre-processing methods on prediction accuracy. Table 2 lists the evaluation results of the PLS models built with the different spectral pre-processing methods. The results indicated that model performance was significantly enhanced when an appropriate pre-processing method was selected. The MSC pre-processing method performed best, with Rc2 and Rp2 of 0.912 and 0.891 for the calibration and prediction sets, respectively. The corresponding RMSECV and RMSEP were 0.0195 and 0.0216, respectively. The RMSE difference between the two sets was the smallest, indicating that the PLS model built on MSC pre-processed spectra was the most stable and generalized best. Therefore, MSC was considered the most suitable pre-processing method for diminishing the influence of scattering. Figure 5 shows the scatter plot of the predicted MetMb content of pork using the MSC-PLS model. The predicted values were concentrated around the target line, indicating good predictive performance.
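To make the PLS calibration step more concrete, the following sketch fits a single-response PLS model with the NIPALS procedure and predicts a new sample by sequential deflation. It is a minimal illustration under stated assumptions, not the MATLAB routine used in this work: the variant (PLS1 with NIPALS), the function names, and the synthetic two-variable data are all assumptions, and the number of latent variables is fixed by hand rather than chosen by cross-validation.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, [(w, p, q), ...])."""
    n, k = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(k)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(k)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the residual y.
        w = [dot([row[j] for row in E], f) for j in range(k)]
        norm = dot(w, w) ** 0.5
        if norm == 0:
            break  # nothing left in y for X to explain
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in E], t) / tt for j in range(k)]  # X loadings
        q = dot(f, t) / tt                                     # y loading
        # Deflate X and y before extracting the next latent variable.
        E = [[row[j] - t[i] * p[j] for j in range(k)]
             for i, row in enumerate(E)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by applying the same deflation steps to it."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [ev - t * pv for ev, pv in zip(e, p)]
    return y_hat


# Synthetic data: the response is an exact linear function of two variables.
X_cal = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y_cal = [0.1 * a + 0.05 * b for a, b in X_cal]

model = pls1_fit(X_cal, y_cal, n_components=2)
print(pls1_predict(model, [5.0, 5.0]))
```

With as many latent variables as there are independent columns, PLS reproduces the ordinary least squares fit, which is why the synthetic linear relationship is recovered here; with real spectra far fewer latent variables than wavelengths are retained.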

Table 2 Evaluation indicators of PLS model for MetMb in frozen pork

Fig. 5 Relationship between reference and predicted values using MSC + PLS prediction model in a wavelength range of 400–700 nm

Results of RF Model

RF is a machine learning algorithm based on decision trees for classification and regression. It uses mutually independent decision trees, grown with random sampling, as its basic units to avoid overfitting (Pavlov 1997). For this reason, the RF algorithm was selected to develop a generic model for detecting MetMb content in frozen pork, using the same MSC-processed spectral data and the same sample sets as the PLS model. Table 3 summarizes the evaluation results of the RF algorithm with MSC pre-processed spectral data. Compared with the MSC-PLS model, the correlation coefficients (Rc2 and Rp2) of the MSC-RF model were higher, reaching 0.950 and 0.892, respectively. Moreover, the RMSE difference between the calibration and prediction sets remained small (RMSECV = 0.0140, RMSEP = 0.0247), indicating that the MSC-RF model also had good stability and generalization ability. The MetMb values predicted by the MSC-RF model were concentrated near the target line with minimal deviation, as shown in Fig. 6. Based on these results, MSC was retained as the pre-processing method for the subsequent modeling analysis.

Table 3 Evaluation indicators of RF model for MetMb in frozen pork

Fig. 6 Relationship between reference and predicted values using MSC + RF prediction model in a wavelength range of 400–700 nm

Selection of Characteristic Wavelengths

In practice, the collected spectral data are high-dimensional and complex. To reduce the dimensionality of the data and extract optimized spectral features, SPA was applied for characteristic wavelength selection. SPA can effectively eliminate redundant information in the raw spectral matrix, thus addressing the overlap and collinearity of spectral information (Pang et al. 2021, Araújo et al. 2001). The average spectrum of the samples frozen for one day is taken as an example. In SPA characteristic wavelength selection, the final number of variables was determined from the change in RMSE (the number of variables ranged from 1 to 28 with a step size of 1). The RMSE was optimal with 19 variables, as shown in Fig. 7a. The characteristic wavelengths selected from the raw spectra are presented in Fig. 7b.
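The projection step at the core of SPA can be sketched as follows. The sketch builds one chain of wavelengths from a given starting wavelength by repeatedly projecting the remaining wavelength vectors onto the subspace orthogonal to the last selected one and keeping the vector with the largest projection. It is a simplified illustration with hypothetical names and data: the full algorithm also mean-centres the columns, tries every starting wavelength, and picks the final subset from the RMSE of a calibration model, which is omitted here.

```python
def dot(u, v):
    """Inner product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def spa_chain(columns, start, n_select):
    """Select n_select column indices, beginning with column `start`.

    `columns` holds one vector per wavelength (its values across samples).
    n_select should not exceed the number of samples, because no
    independent direction remains after that many projections.
    """
    if not 1 <= n_select <= len(columns):
        raise ValueError("n_select must be between 1 and the column count")
    work = [list(c) for c in columns]  # working copies that get projected
    selected = [start]
    while len(selected) < n_select:
        ref = work[selected[-1]]
        ref_norm2 = dot(ref, ref)
        best, best_norm2 = None, -1.0
        for j, col in enumerate(work):
            if j in selected:
                continue
            # Remove the component of this column along the last selection.
            coef = dot(col, ref) / ref_norm2
            proj = [c - coef * r for c, r in zip(col, ref)]
            work[j] = proj
            norm2 = dot(proj, proj)
            if norm2 > best_norm2:
                best, best_norm2 = j, norm2
        selected.append(best)  # least collinear with what is already chosen
    return selected


# Hypothetical data: five wavelengths measured on three samples.
columns = [
    [1.0, 0.0, 0.0],  # wavelength 0 (starting point)
    [2.0, 0.0, 0.0],  # wavelength 1: collinear with wavelength 0
    [0.0, 1.0, 0.0],  # wavelength 2
    [1.0, 0.5, 0.0],  # wavelength 3: mostly redundant with wavelength 0
    [0.0, 0.0, 3.0],  # wavelength 4
]
print(spa_chain(columns, start=0, n_select=3))
```

In this toy case the wavelength that is collinear with the starting one is never chosen, which is the behavior that lets SPA discard redundant spectral variables.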

Fig. 7 Characteristic wavelength selection using SPA algorithm. a RMSE variation plot. b SPA selection variable

Table 4 presents the evaluation indicators of the prediction models for MetMb content in frozen pork based on SPA characteristic wavelength selection. Compared with the raw spectra, the number of wavelengths required for modeling was significantly reduced by SPA. After characteristic wavelength extraction, the correlation coefficients and root mean square errors of the models improved markedly, showing that SPA effectively increased the accuracy and reliability of the prediction models. The MSC-RF-SPA model performed best in predicting the MetMb content of frozen pork. It used 19 characteristic wavelengths, which accounted for 4.87% of the raw spectral data, and the Rp2 and RMSEP of the prediction set were 0.901 and 0.0145, respectively. Overall, MSC-RF-SPA showed good accuracy and predictive ability for the detection of MetMb content in frozen pork.

Table 4 Evaluation indicators of the prediction model for MetMb content in frozen pork based on SPA characteristic wavelength selection

Conclusion

Based on the combination of visible–near-infrared spectroscopy and chemometrics, this study used partial least squares and random forest algorithms to predict the proportion of metmyoglobin in frozen pork stored for 10 weeks. The model built with MSC pre-processing and the random forest algorithm had the higher prediction accuracy for MetMb content in pork at different frozen storage times, with a correlation coefficient and root mean square error of 0.892 and 0.0247, respectively, for the prediction set. After the characteristic wavelengths were extracted with the successive projections algorithm, the random forest model improved further: the correlation coefficient and root mean square error of the prediction set were 0.901 and 0.0145, respectively. In summary, VIS–NIR spectroscopy combined with SPA-RF can replace time-consuming and destructive conventional methods for measuring metmyoglobin content in frozen pork and has excellent predictive ability. In addition, changes in myoglobin content are affected by many factors, such as pork variety, pork part, storage temperature, and test conditions. These factors should therefore be studied further, and different varieties and materials should be used to verify the changes in metmyoglobin and the performance of the model.