Introduction

As derived from the fish tissues, fish oil contains a large amount of long-chain ω-3 (n-3) polyunsaturated fatty acids (PUFA). It was reported that about 20 % of the total fatty acids in fish oil are PUFA (Cozzolino et al. 2005). PUFA have been proved to have protective properties against cancer (Berquin et al. 2007), cardiovascular disease (Wang et al. 2006), Parkinson’s disease (Bousquet et al. 2008), cognitive decline (Morris et al. 2005), rheumatoid arthritis (Rennie et al. 2003), lupus (Wright et al. 2008), depression (Nemets et al. 2006), and psoriasis (Wolters 2005). PUFA also have beneficial effects on the development of infant’s nervous system (Jensen et al. 2010), visual system (Institute of Medicine 2005), and immune system (Calder et al. 2010). Due to low dietary intakes of PUFA in daily diet, fish oil has become one of the popular dietary supplements of PUFA for consumers. The market of fish oil for providing rich PUFA has rapidly expanded in recent years.

The quality of PUFA in fish oil should be strictly controlled. Currently, inaccurate listing of PUFA levels is a core quality problem for fish oil supplements. For some fish oil products, the amounts of PUFA shown on the packages do not match the actual contents in the fish oil. This mislabeling seriously infringe on the rights and interests of consumers. However, it is very difficult for consumers to tell the factual contents of PUFA in fish oil barely by using their naked eyes to observe the fish oil. Traditionally, fatty acids in foods and biological samples are customarily analyzed by gas chromatography (GC) of their methyl ester derivatives or directly as the ethyl esters (Curtis 2007). However, GC methods always take a very long time to analyze and need a lot of sample preparation. There are considerable managing steps involved in GC processes, which may cause oxidation of lipids, give variable results depending on the columns used, and pose difficulty in correctly identifying each fatty acid in the chromatograms (Igarashi et al. 2000). In addition, besides the inspection of PUFA in final products, it is needed for the determination of fatty acids in fish oil during the process in an industrial operation, preferably in real time to monitor the progress of the enrichment and to determine the optimal end point (Azizian et al. 2010); however, GC methods are not suitable for rapid determination of fish oil’s quality. The lack of rapid tools for determining PUFA content in fish oil has been a main obstacle for the development of quality control for fish oil and promoting the development of better sensing techniques for such objectives.

Nowadays, some spectroscopic techniques, such as visible and near infrared (Vis–NIR) spectroscopy, mid-infrared (MIR) spectroscopy, and nuclear magnetic resonance (NMR) spectroscopy, have been successfully proved to be efficient and advanced tools for rapid and nondestructive determination of food quality. Visible spectra mainly contain the pigment information, and near infrared spectra arise from overtones and combination bands of C–H, O–H, N–H, and S–H stretching and bending vibrations. Mid-infrared spectra provide more information of frequencies and intensities, which are richer and stronger than near infrared spectra as bands in the mid-infrared spectra are fundamental bands of near infrared spectral bands (Wu et al. 2008). NMR spectroscopy measures the resonant frequencies of certain atomic nuclei present in different chemical surroundings, which provide physical and chemical properties of molecules where the atoms are contained. Different from the time-consuming GC measurement, spectroscopic techniques have the advantages of simplicity, rapid, and minimal sample preparation, so that they agree with the green chemistry principle. PUFA have been rapidly determined using near infrared (Azizian et al. 2010), mid-infrared (Flåtten et al. 2005), Raman (Olsen et al. 2007), or NMR (Sacchi et al. 1993; Miyake et al. 1998; Tyl et al. 2008) spectroscopic techniques. Besides, endeavors of using near infrared spectroscopy (NIR) have been reported for the quality evaluation of fish oil, such as iodine value and saponification value (Endo et al. 2005), and moisture (Cozzolino et al. 2005). NMR spectroscopy also has many successful applications in the quality assessment of oil products (Guillén and Ruiz 2003; Masoum et al. 2007).

The main components of PUFA in fish oil include eicosapentaenoic acid (EPA), docosahexaenoic acid (DHA), and docosapentaenoic acid (DPA). All of these acids are omega-3 fatty acids, in which the first double bond is located at the third carbon from the methyl end. Clinical investigations show that EPA can treat several psychiatric and neurodegenerative diseases due to its anti-inflammatory and neuroprotective effects (Song and Zhao 2007). Studies have also suggested that EPA may be of benefit in depression (Huan et al. 2004; Martins 2009). DHA is the predominant fatty acid of membrane phospholipids in the brain grey matter and in the retina of mammals (Guesnet and Alessandri 2011). Early DHA intervention has beneficial effects on cognition in age-related cognitive decline (Yurko-Mauro et al. 2010). In addition, DHA was found to have the ability of inhibiting growth of human colon cancer cells (Schønberg et al. 2006) and was shown to enhance the efficacy of chemotherapy in prostate cancer cells (Shaikh et al. 2008). On the other hand, available data suggest that DPA has beneficial health effects (Kaur et al. 2011). Therefore, it is our interest to investigate the feasibility of using spectroscopic techniques for determining specific contents of EPA, DHA, and DPA in fish oil. To the best of our knowledge, this is the first study on measuring these important PUFA contents in fish oil using visible–short-wave near infrared spectroscopy (Vis–SNIR), long-wave near infrared spectroscopy (LNIR), MIR, and NMR spectroscopy in tandem with multivariate selection and calibration and comparing the performances of these spectroscopic techniques.

The overall objective of this study was to compare Vis–SNIR, LNIR, MIR, and NMR spectroscopy in tandem with multivariate selection for the rapid and reliable assessment of EPA, DHA, and DPA in fish oil. The specific objectives of the current work were to (1) acquire spectral data of tested fish oil samples in Vis–SNIR (346–1,050 nm), LNIR (870–2,534 nm), MIR (4,000–400 cm−1), and NMR (0–7.4 ppm) regions; (2) select important variables that were most correlated to the assessments of EPA, DHA, and DPA in fish oil by using successive projections algorithm (SPA) and uninformative variable elimination (UVE) algorithm; (3) establish multivariate calibration models using partial least squares regression (PLSR) based on the whole range spectra or spectra at the selected variables; (4) compare the prediction abilities of four spectroscopic techniques Vis–SNIR, LNIR, MIR, and NMR spectra; and (5) evaluate the improvement ability of considering UVE before the SPA calculation for the variable selection.

Materials and Methods

Sample Preparation and Spectral Measurements

Fish oil from Amway Nutrilite®, By-Health®, and General Nutrition Centers (GNC)® were purchased from the local markets, resulting in 28 sets of fish oil samples obtained. Four samples were obtained from four batches with different production time for each set in the NMR analysis, and five samples were obtained from five batches with different production time for each set in other spectroscopic analysis. Therefore, there were 140 (5 samples per set × 28 sample sets), 140 (5 samples per set × 28 sample sets), 140 (5 samples per set × 28 sample sets), and 112 (4 samples per set × 28 sample sets) samples obtained for Vis–SNIR, LNIR, MIR, and NMR spectral measurements, respectively. A USB4000 miniature fiber optic spectrometer (Ocean Optics, Inc., USA) was used for the spectral scanning in the wavelength range of 346–1050 nm that covers the Vis–SNIR spectral wavelengths. The measurement of LNIR, which covers the wavelength range of 870–2,534 nm, was achieved by using an NIR 256–2.5 spectrometer (Ocean Optics, Inc., USA). Both spectrometers were equipped with a light source, SMA-terminated optical fibers, and a cuvette holder (DH2000, P400-2-VIS/NIR, and CUV-UV Holder for 1-cm cuvettes, respectively, Ocean Optics, Inc., USA). Optical fibers were used for connecting both source and detector to the cuvette holder. The quartz cell with a 1-cm path length was filled with oil sample and was inserted into the cuvette holder for the transmittance spectral measurement. An FTIR spectrometer Nicolet iS10 (Thermo Fisher Scientific Inc., Waltham, MA, USA) attached with a ZnSe attenuated total reflection accessory was used for the MIR acquisition in the wave number range of 4,000–400 cm−1. The NMR spectra of samples in the chemical shift from 0 to 7.4 ppm were collected by a Bruker Avance 500 MHz NMR spectrometer (Bruker, Karlsruhe, Germany) operating at 500.17 MHz for the proton nucleus at 298 K. A full spectrum of each sample was recorded with reference to tetramethylsilane. An exponential window function with a line-broadening factor of 1 Hz was applied to the free induction decay before the Fourier transformation. The 1H-NMR spectra were phased and baseline corrected using TopSpin 2.1 (Bruker, Karlsruhe, Germany) and were automatically reduced by using the AMIX (version 2.5, Bruker GmbH, Karlsruhe, Germany).

GC Analysis

GC was used to detect the reference contents of EPA, DHA, and DPA in fish oil samples. The fish oils were saponified and the fatty acids were esterified (with a 30 % solution of BF3 in methanol) in accordance with the AOAC method (AOAC Method 963.22 2000). The methyl esters were analyzed in a gas chromatography (Agilent 6890 N; Agilent Technologies Inc., USA) with a flame ionization detector and an Agilent DB-1701 capillary column (30 m × 0.32 mm × 0.25 μm; Agilent Technologies Inc., USA). A 1-μL aliquot of treated sample was injected with nitrogen (99.995 % purity) as a carrier gas at a flow rate of 3.0 mL/min. The column temperature was maintained at 180 °C for 2 min after injection then programmed at a heating rate of 10 °C/min to 240 °C (hold 10 min). The injector port and the detector temperatures were set at 250 °C. The flow rates of air and hydrogen were 450 and 45 mL/min, respectively. Identification of the fatty acids was carried out using the retention times relative to commercial methyl esters standards (EPA, DHA, DPA) (Sigma, USA). Calculations were based on previous analysis of standard mixtures and individual correction coefficients.

Spectral Preprocessing

Spectral preprocessing is an integral part of chemometric analysis, which is usually performed prior to model calibration. The goal of the preprocessing is to reduce the effects of length variation of light path and light scattering and to enlarge the hidden information in the original spectral data. It should be noticed that spectral preprocessing is not obligatory and should be considered only when it can really improve the model’s predicting ability. In this work, seven widely used preprocessing algorithms were considered, which were Savitzky–Golay smoothing (SGS), standard normal variate (SNV), multiplicative scatter correction (MSC), Savitzky–Golay first derivative (1st Der), Savitzky–Golay second derivative (2nd Der), SGS combined with 1st Der, and SGS combined with 2nd Der. Savitzky–Golay smoothing is an averaging algorithm that fits a polynomial equation to the data points to reduce noise from the spectral data without reducing the number of spectral variables. As a row-oriented transformation, SNV centers and scales each individual spectrum to remove the spectral scatter. MSC is designed to correct scatter effects in spectral data and also has functions for treating other similar effects, such as path length, offset shifts, and interference. Derivatives are accomplished to correct baseline effects in spectra and resolve nearby peaks. The prediction abilities of these preprocessing algorithms were compared based on the PLSR calibration with the full range of spectra.

Multivariate Calibration

One of the advantages in using spectroscopic techniques is the wealth of information that reside in the spectra, which requests the use of intriguing chemometric tools. Such massive amounts of data contain many uninformative signals and need a strategy of multivariate chemometrics to calibrate a quantitative/qualitative model by extracting meaningful information efficiently. In this study, such strategy was implemented by using PLSR algorithm to extract meaningful information from the massive amounts of spectral data, which were then used to establish a quantitative relationship with the concentrations of EPA, DHA, and DPA in corresponding fish oil. Before the calculation of PLSR, the spectrum of each sample was arranged together to form a matrix consisting of a spectral matrix (X) as response variables and a column vector (Y) as the dependent variable. PLSR was then aimed to establish a quantitative relationship between the spectral matrix (X) and the column vector (Y). The column vector in this work was the reference values of one of the PUFAs (EPA, DHA, or DPA) obtained from the traditional GC measurement.

PLSR has been widely used as a classical multivariate calibration technique for spectral analysis with predictive purpose. Known as a bilinear factor method, PLSR attempts to find multidimensional direction in the spectral matrix (X) that explains the maximum multidimensional variance direction in the column vector (Y). In PLSR calculation, both the spectra (response variables) and concentration (dependent variables) information are decomposed simultaneously, ending in a set of orthogonal factors (latent variables, LVs) projected. Different from principal component regression, dependent variables are actively considered in the decomposition process of PLSR, which ensures that the first few LVs are most related to predict dependent variables. Typically, the optimal LV set should be used in the models to obtain efficient and robust models and avoid overfitting. Predicted residual error sum of squares (PRESS) is commonly used as a standard to determine the optimal LV set, in which the PRESS values of the first few LVs are calculated and the LVs with the lowest value of PRESS are defined as the optimal LV set. The PRESS is determined using the following equation in full cross-validation:

$$ \mathrm{PRESS}={{\displaystyle \sum \left({Y}_{\mathrm{Val}}-{Y}_{\mathrm{Ref}}\right)}}^2 $$
(1)

where Y Val is the calculated values of EPA, DHA, or DPA in fish oil samples estimated by full cross-validation, and Y Ref are the reference values of samples.

Variable Elimination

Variable selection is a critical step in the spectra analysis (Balabin and Safieva 2011; Balabin and Smirnov 2011). Because of possessing hundreds or thousands of spectral variables, spectral data are characterized as high dimensionality, which can lead to convergence instability. On the other hand, congruent spectral variables often exhibit similar spectral information, which is a well-known problem of multicollinearity. The correlated variables are considered as redundancy. In addition, as useless or irrelevant information for the model calibration might be contained in some spectral variables, these variables should be eliminated to avoid worsening the predictive ability of the model. Therefore, the selection of informative variables is preferable for reducing the spectral high dimensionality and multicollinearity, thus simplifying the structure of the calibration model and improving the model’s predictive capability. In the variable reduction scheme, only those variables that are informative for the model calibration and have less multicollinearity are selected, and redundant, uninformative, and multicollinear variables are not considered in the model calibration and thus eliminated.

In this paper, two novel variable selection algorithms were investigated, namely SPA and UVE. SPA is a variable selection algorithm proposed by Araujo et al. (2001). It was designed to solve the collinearity problems of spectral variables by selecting variables with minimal redundancy. There are two steps in SPA. The first step consists of a sequence of projection operations carried out in the columns of the spectral matrix, which constructs candidate subsets of variables. The second step consists of evaluating candidate subsets of variables according to the prediction performance of the calibrated model. Details of SPA description are shown in the literature (Wu et al. 2012). Although SPA could select a subset of variables with a minimum of collinearity, some of these variables might be uninformative and could worsen the model’s performance in some cases. On the other hand, UVE is a PLSR-based variable selection algorithm proposed by Centner et al. (1996), which tries to eliminate the variables that have no more information for modeling than noise. In UVE process, the stability of each variable is calculated by dividing the mean of the PLSR regression coefficients by the standard deviation of the regression coefficients of the variable. With the help of using an artificial random variable matrix as a reference, these spectral variables that are less important in the model than the random variables are eliminated. However, there would be still dozens or hundreds of variables selected by UVE and these variables might have a problem of multicollinearity. In order to complement advantages of both methods and overcome their shortcomings, it is suggested to conduct SPA on informative variables selected by UVE, which is called UVE-SPA variable selection strategy. It is dedicated that the variables selected in this work were wavelength (Vis–SNIR and LNR spectroscopy), wave number (MIR spectroscopy), or chemical shift (NMR spectroscopy) variables, rather than latent variables in PLSR models.

Model Evaluation

In the spectral analysis for quantitative regression, the calibrated spectral model must be evaluated for its validity by the validation process. According to some other works (Marini et al. 2003; Cozzolino et al. 2007), cross-validation is preferred as the validation method when the number of samples is limited (Martens and Dardenne 1998). Many works working on spectral analysis used cross-validation instead of splitting samples into a calibration set and a prediction set (ElMasry et al. 2011; Papadopoulou et al. 2011; Rivero et al. 2012; Kimiya et al. 2013), which show that the cross-validation is a commonly used and accepted validation method. Since the samples from the same sample set that had similar properties should not be validated in the leave-one-out way, segmented cross-validation was applied instead of leave-one-out cross-validation for the validation purpose. In segmented cross-validation, one segment is kept out of the calibration at a time. By choosing different segments, the predictions can be made on all segments for the validation procedure. Although leave-one-out cross-validation could be too optimistic sometimes, the cross-validation with several samples kept out for validation at a time was proved to be able to give a more realistic idea of the predictive power of the model than leave-one-out cross-validation (Sileoni et al. 2013). In this study, there were 28 segments generated to represent the 28 sample sets of fish oil for the segment cross-validation in the PLSR process. After the PLSR models were calibrated, their accuracy and predictive ability were evaluated in terms of determination coefficient of calibration (r 2C ) and root mean square error of calibration (RMSEC) for the calibration procedure and determination coefficient of cross-validation (r 2CV ), root mean square error of cross-validation (RMSECV), and residual predictive deviation (RPD) for the validation procedure. Generally, a good model should have high determination coefficients (r 2C  ) and (r 2CV ) and RPD as well as low root mean square error (RMSEC and RMSECV). Besides, a robust model with less overfitting should have a small absolute value between RMSEC and RMSECV (AV_RMSE). The achievements of preprocessing algorithms and multivariate calibration were conducted using Unscrambler V9.7 (CAMO Process AS, Oslo, Norway), while the calculation of SPA and UVE was performed based on MATLAB 7.12.0 (MathWorks, Natick, USA).

Results and Discussion

Spectra of Fish oil

Typical spectral profiles of the tested fish oil samples in Vis–SNIR, LNIR, MIR, and NMR ranges are shown in Fig. 1. Generally, the shapes of the original spectra were quite homogeneous for different samples in all four sub-images. Due to spectra’s overtone and combination absorptions of molecules, the spectral transmittance of fish oil shows quite even profiles in visible and near infrared spectral regions (Fig. 1a, b) with only some broadband peaks. In the visible spectral region (429–700 nm, Fig. 1a), the spectral pattern had high transmittance in red regions (about 550–650 nm) and low transmittance in blue region (about 430–500 nm), which explained why fish oil gets its golden color. There were some transmittance peaks over the entire near infrared spectral region (700–2337 nm, Fig. 1a, b). According to the literatures of Osborne and Fearn (1986), two weak peaks found at around 900 and 1020 nm in Fig. 1a were assigned, respectively, to C–H stretching third overtone of CH3 and a combination of C–H stretching first overtone and C–H deformation second overtone of CH3. Medium peak between 1,100 and 1,300 nm was due to C–H stretching second overtone of CH, CH2, CH3, and HC=CH. Another medium peak between 1,300 and 1,550 nm was caused by a combination of C–H stretching first overtone and C–H deformation first overtone in CH, CH2, and CH3. The high transmittance peak at 1,760 nm was assigned to C–H stretching first overtone of CH2 and CH3. There were two high transmittance peaks after 2,000 nm. In specific, the one at around 2,200 nm was caused by a combination of CH2 stretching and C= stretching of HC=CH, and the one at around 2,300 nm was due to a combination of C–H stretching and C–H deformation in CH3 and CH2.

Fig. 1
figure 1

Typical spectral profiles of the tested fish oil samples. a Visible and short-wave near infrared spectra. b Long-wave near infrared spectra. c Mid-infrared spectra. d NMR spectra

In the MIR region as shown in Fig. 1c, the transmittance peak at around 700 cm−1 was assigned to the rocking and bending out vibration of –(CH2) n – and –HC=CH– (cis). Two medium peaks were found at around 1,100 and 1,030 cm−1, which were due to –C–O stretching. The one at around 1,150 cm−1 was assigned to stretching and bending vibration of –C–O and –CH2, and –C–H (CH3) bending (sym) vibration was found at around 1,370 cm−1. The transmittance peaks at around 1,460, 1,650, and 1,740 cm−1 were assigned to –C–H (CH2, CH3) bending (scissoring), –C=C– (cis) stretching, and –C=O (ester) stretching vibrations, respectively. There were also some peaks between 2,800 and 3,100 cm−1, in which the one at around 3,010 cm−1 was assigned to =C–H (cis) stretching, the one at around 2,920 cm−1 was assigned to –C–H (CH2) asym stretching, and the one at around 2,855 cm−1 was assigned to –C–H (CH2) sym stretching. In the NMR spectra as shown in Fig. 1d, the peak between 5.20 and 5.50 ppm was assigned to –CHOCOR of glyceryl group and –HC=CH– of acyl group; the peak between 4.10 and 4.30 ppm was assigned to –CH2OCOR of glyceryl group; the peak between 2.75 and 2.90 ppm was assigned to =HC–CH2–CH= of acyl group; the peak between 2.25 and 2.40 ppm was assigned to –OCO–CH2– of acyl group; the peak between 1.95 and 2.15 ppm was assigned to –CH2–CH=CH– of acyl group (except for –CH2– of DHA acyl group in β position); the peak between 1.67 and 1.74 ppm was assigned to –OCO–CH2–CH2– of EPA acyl group; the peak between 1.55 and 1.65 ppm was assigned to –OCO–CH2–CH2– of acyl group (except for DHA and EPA acyl groups); the peak between 1.20 and 1.40 ppm was assigned to –(CH2) n – of acyl group; the peak between 0.94 and 1.00 ppm was assigned to –CH3 of polyunsaturated ω-3 acyl group; and the peak between 0.86 and 0.93 ppm was assigned to –CH3 of saturated, monounsaturated ω-9 and ω-7, and diunsaturated ω-6 acyl groups. The assignments of wave number for MIR and chemical shift for NMR were carried out according to the literature of Guillen et al. (2008).

As a conclusion from analyzing Fig. 1, many chemical molecules and functional groups of constituents in fish oil had their spectral information that was important to determine PUFA contents of fish oil in a quantitative manner. However, when the spectra of more samples were considered, their profiles had a similar pattern. It is difficult to directly observe spectra for the PUFA determination when more samples were considered, as their spectral curves would be overlapped and there were no feature peak-related PUFA contents directly. Thus, chemometrics were employed for the data mining and analysis purpose, resulting in calibrated quantitative models. In addition, because the noise-to-signal ratios of the spectral detectors in the spectroscopic systems were rather low in some parts of spectral data, the spectral matrix with only the spectral range of 429–1,048 nm for Vis–SNIR spectral matrix (3,250 variables × 140 samples), the range of 1,031–2,337 nm for LNIR spectral matrix (201 variables × 140 samples), and the ranges of 650–1,494, 1,687–1,782, and 2,820–3,036 cm−1 for MIR spectral matrix (2,400 variables × 140 samples) were used for further developing the calibration models. For NMR spectra, the chemical shifts with values less than 40 were eliminated as there was no useful information but only noise, resulting in the NMR spectral matrix with the size of 3,636 variables × 112 samples. In addition, the reference values of PUFAs obtained using GC analysis are shown in Table 1. The correlation coefficients of EPA and DHA, EPA and DPA, and DHA and DPA were 0.66, 0.71, and 0.83.

Table 1 Statistics of eicosapentaenoic acid, docosahexaenoic acid, and docosapentaenoic acid concentration (in milligrams per gram) of the fish oil samples

EPA Analysis

Calibration of multivariate models was first executed using the full range spectra of samples. A classic linear calibration method named PLSR was applied for the purpose of model calibration based on the spectral information of samples and their corresponding reference PUFA contents determined by GC method. After the model was established, the segment cross-validation was carried out in order to estimate the predictive capability of the established model and to evaluate the model’s overfitting level.

Calibration Based on Full Range Spectra and Analysis of Spectral Preprocessing

At the calibration stage of using full range spectra, besides the use of original spectral data, the preprocessed spectra of different preprocessing algorithms (SGS, SNV, MSC, 1st Der, 2nd Der, SGS-1st Der, and SGS-2nd Der) were also considered in order to determine the best input spectral data for the PLSR calibration and variable selection. When Vis–SNIR spectra were considered, the PLSR model established using original spectral data had a good result with r 2CV of 0.852 and RMSECV of 25.69 mg/g. When the seven spectral preprocessing algorithms (SGS, SNV, MSC, 1st Der, and 2nd Der) were considered respectively, the best performance was obtained by the MSC-PLSR model with r 2CV of 0.884 and RMSECV of 22.79 mg/g. Since there was no much improvement by considering the preprocessing, the original Vis–SNIR spectra were used in further calculation. When LNIR spectra were considered, although the original spectra only obtained the PLSR model with r 2CV of 0.773 and RMSECV of 31.83 mg/g, the performances of seven PLSR models with spectral preprocessing were all better than that of the PLSR model with the original spectra. The best prediction result with r 2CV of 0.920 and RMSECV of 18.89 mg/g was obtained when 2nd Der was applied. For both the MIR and NMR spectra, the best results (shown in Table 2) were achieved when the original spectra were used as the inputs of PLSR model. The prediction became worse when the spectral preprocessing was considered for both MIR and NMR spectra.

Table 2 Predictive results of eicosapentaenoic acid (EPA) content in fish oil using infrared spectroscopy and nuclear magnetic resonance with chemometrics (available range of EPA is between 104.87 and 353.22 mg/g)

SPA Calculation Based on the Full Range Spectra

The selection of the most important variables is critical to analyze the spectral data, which leads to reducing the high dimensionality and multicollinearity of spectra and improving the model’s predictive ability in terms of accuracy and robustness. In the “Calibration Based on Full Range Spectra and Analysis of Spectral Preprocessing” section, the full range spectra were delivered to the model calibration and no consideration was given to select the important variables contributed for the EPA prediction. In the current section and Sections 3.2.3 and 3.2.4, three variable selection strategies of SPA, UVE, and UVE-SPA were considered to select the important variables.

In the SPA calculation, after comparing the root mean square error (RMSE) values of different candidate subsets of variables that were obtained by a sequence of projection operations, eight variables which had the lowest RMSE were selected from the full Vis–SNIR spectra for the EPA analysis. The selected variables then formed new reduced spectral matrix by selecting the spectral data only at the selected variables, which were then set as the inputs of PLSR model instead of the full range spectra (Table 2). For the Vis–SNIR spectral analysis, the SPA-PLSR model with only eight input variables (0.25 % of variables in the full range) had a similar result with r 2CV of 0.865 and RMSECV of 24.53 mg/g compared with the result of the PLSR model with the full range spectra (F-PLSR). For the LNIR, MIR, and NMR spectral analysis, 16, 8, and 10 variables were selected by SPA. The input variables of corresponding SPA-PLSR models decreased by 92.04, 99.67, and 99.72 %, respectively, and the corresponding RMSECV decreased by 7.73 %, increased by 13.63 %, and increased by 5.38 %. In addition, the AV_RMSE of SPA-PLSR models was only 29.59, 68.36, 22.86, and 50.88 % of those of F-PLSR models for Vis–SNIR, LNIR, MIR, and NMR spectral analysis, respectively, showing that the consideration of SPA could reduce the multicollinearity of inputs of the PLSR models and improve the models’ robustness. This is agreed with the conclusion of Wu et al. (2013).

UVE Calculation Based on the Full Range Spectra

In the process of UVE calculation, the importance of both spectral variables and random variables was quantitatively measured according to their stability, which is shown in Fig. 2 (take Vis–SNIR spectra as an example). Spectra variables were at the left part of the sub-image, while random variables were at the right part. Two horizontal lines were the lower and upper cutoffs used for the variable elimination. Variables that had stability within the cutoff lines were treated as uninformative ones and were eliminated. The remaining variables were then used as the inputs of PLSR models, whose results are shown in Table 2. In general, consideration of UVE significantly improved the predictive ability of PLSR models for the Vis–SNIR and MIR spectral analysis. Their RMSECV decreased by 37.45 and 43.39 % compared with the corresponding F-PLSR models. Meanwhile, the RMSECV decreased by 15.51 and 10.10 % for LNIR and NMR spectral analysis. The results indicated that variables with no more information for EPA prediction than noise were eliminated by the UVE analysis. Compared with the results of SPA-PLSR models, UVE-PLSR models had better prediction capabilities. However, it was noticed that although uninformative variables were eliminated by UVE analysis, there were still hundreds of variables remained because UVE can only select informative variables that might have a problem of multicollinearity. In order to obtain the most important variables that were informative and had no multicollinearity shortcoming, SPA was further applied to select the important variables based on the informative variables selected by UVE.

Fig. 2
figure 2

Stability of each variable calculated in UVE process (take Vis–SNIR spectra as an example). Two horizontal lines indicate the lower and upper cutoff

SPA Calculation Based on the UVE-Selected Spectra

Similar to the SPA calculation based on the full range spectra, the best candidate subset of variables was determined by RMSE screen plot in the SPA calculation based on the UVE-selected spectra, resulting in 18, 14, 13, and 13 variables selected from Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. Table 2 presents the main statistics achieved in the calibration and validation of the PLSR models developed by using the important variables selected by UVE-SPA. The variables selected by UVE-SPA decreased by 99.20, 91.87, 98.73, and 99.56 % compared with the variables selected by UVE for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively, showing that the calculation of SPA after UVE could significantly reduce the input variables of UVE-PLSR models. Meanwhile, the UVE-SPA-PLSR models could maintain the accuracy of the UVE-PLSR models for LNIR and NMR spectra, but the RMSECV of UVE-SPA-PLSR models for Vis–SNIR and MIR spectra increased by 37.52 and 92.07 % compared with those of the UVE-PLSR models. On the other hand, the consideration of UVE before SPA improved the efficiency of the SPA calculation for all the spectroscopic techniques analyzed in this study except for Vis–SNIR spectra. The improvement of SPA calculation by conducting SPA after UVE analysis is agreed with the results of other works (Chen et al. 2010; Wu et al. 2010; Wu et al. 2011). The numbers of the variables selected by UVE-SPA decreased by six, one, and two than those selected by only SPA for LNIR, MIR, and NMR spectra, respectively, meanwhile their RMSECV decreased by 9.52, 4.32, and 19.66 %. The AV_RMSE also decreased by 33.18, 10.69, and 43.75 % for these three kinds of spectroscopy techniques, showing that their UVE-SPA-PLSR models were more robust than their SPA-PLSR models. For the Vis–SNIR spectra, although its UVE-SPA-PLSR model had a lower RMSECV and higher r 2CV , there were 12 variables used, while its SPA model only had eight variables. Considering the UVE-SPA model had a larger AV_RMSE, the important variables for Vis–SNIR spectra in EPA analysis were determined as those selected by SPA. In addition, by comparing the results of the F-PLSR models and the corresponding best PLSR models with variable selection (BV-PLSR model), namely the SPA-PLSR model for Vis–SNIR spectra and the UVE-SPA-PLSR models for LNIR, MIR, and NMR spectra, it was found that the RMSECV values of the BV-PLSR models decreased by 4.52, 16.52, and 15.34 % for Vis–SNIR, LNIR, and NMR spectra, respectively, while that of the MIR model increased by 8.72 %. On the other hand, the AV_RMSE values of the BV-PLSR model decreased by 70.41, 54.33, 79.59, and 71.38 %. In general, the BV-PLSR models had similar performances of the F-PLSR models, but had lower AV_RMSE than the F-PLSR models in the EPA analysis. The selected important variables of the BV-PLSR models for Vis–SNIR, LNIR, MIR, and NMR spectroscopy in EPA prediction are shown in Table 3. In Vis–SNIR analysis, 935 nm was assigned to C–H stretching third overtone of CH2, 1,028 nm was assigned to a combination of C–H stretching first overtone and C–H deformation second overtone of CH3, and 1,045 nm was assigned to a combination of C–H stretching first overtone and C–H deformation first overtone of CH2. In LNIR analysis, 1,178 nm was assigned to C–H stretching second overtone of HC=CH, 1,204 nm was assigned to C–H stretching second overtone of CH2, 1,376 nm was assigned to a combination of C–H stretching first overtone and C–H deformation of CH3, 1,880 nm was assigned to C=O stretching second overtone of –COOH, 1,932 nm was assigned to a combination of O–H stretching and O–H of H2O, and 2,100 nm was assigned to a combination of O–H stretching and O–H of –OH. In MIR analysis, 658 cm–1 was assigned to the rocking and bending out vibration of –(CH2) n – and –HC=CH– (cis), 1,737 and 1,750 cm–1 were assigned to –C=O (ester) stretching vibrations, 2,929 cm−1 was assigned to –C–H (CH2) asym stretching, and 2,997 cm−1 was assigned to =C–H (cis) stretching. In NMR analysis, the variables were mainly assigned to –CH3, –(CH2) n –, –CH2–CH=CH–, and –OCO–CH2 of acyl group.

Table 3 Selected important variables of the BV-PLSR models for Vis–SNIR, LNIR, MIR, and NMR spectroscopy in EPA, DHA, and DPA prediction

DHA Analysis

Similar to the EPA analysis, the investigation of four spectroscopic techniques for the DHA prediction started at the calibration based on the full range spectra with the analysis of different preprocessing algorithms. The original spectra were the most suitable data for the analysis of Vis–SNIR, MIR, and NMR spectra, and the spectra preprocessed by SNV had the best prediction result for LNIR spectra. It was noticed that when full range spectra were considered, the DHA content in fish oil could not be well determined by using Vis–SNIR or MIR spectra (Table 4). Variable selection was then conducted to see if the performances of four kinds of spectra could be further improved. At the first step, SPA was carried out for selecting important variables based on the full range spectra. There were 7, 13, 15, and 16 variables selected by SPA for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. New reduced spectral matrixes were then formed based on these variables to establish PLSR models (Table 4). Compared with the F-PLSR models, the application of SPA improved the corresponding PLSR models with the decrements of RMSECV by 16.18, 16.32, 10.59, and 18.92 % for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. The AV_RMSE was also decreased by about 60 % when SPA was considered. All of these were achieved when over 90 % variables were deleted.

Table 4 Predictive results of docosahexaenoic acid (DHA) of fish oil using infrared spectroscopy and nuclear magnetic resonance with chemometrics (available range of DHA is between 100.25 and 223.33 mg)

When UVE was used to eliminate uninformative variables, the RMSECV of Vis–SNIR, MIR, and NMR models decreased by 37.93, 49.08, and 21.69 % and the AV_RMSE decreased by 51.51, 63.29, and 49.49 % of those of the corresponding F-PLSR models. Although the RMSECV of LNIR spectral model only decreased by 1.03 %, its AV_RMSE decreased by 30.30 %, and 37.81 % variables were eliminated at the same time. In general, UVE selected informative variables from the full range spectra for DHA prediction, which were then used for further SPA calculation. As a result, there were 7, 13, 14, and 10 variables selected by UVE-SPA for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. The prediction results of the UVE-SPA-PLSR models for the DHA prediction are shown in Table 4. Compared with the results of UVE-PLSR models, the RMSECV of UVE-SPA-PLSR models decreased by 14.34 and 16.58 % for LNIR and NMR spectra, respectively, and increased by 22.60 and 31.07 % for Vis–SNIR and MIR spectra, respectively. Therefore, further calculation of SPA on UVE could maintain the models’ prediction ability while dramatically reducing the numbers of input variables. On the other hand, the calculation of UVE before SPA was proved to be an effective way to improve the selection of SPA for all the spectroscopic techniques except for LNIR spectra. Especially for the NMR spectra, when only ten variables were selected by UVE-SPA compared with 16 variables selected by SPA, the RMSECV of UVE-SPA-PLSR model decreased by 19.42 % of that of the SPA-PLSR model. The UVE-SPA model for MIR spectra also had a decrement of 25.35 % of that of the corresponding SPA-PLSR model. In addition, the AV_RMSE of the UVE-SPA models for NMR and MIR spectra decreased respectively by 28.05 and 54.05 % of those of SPA models. In the comparison between the BV-PLSR models (SPA-PLSR model for LNIR spectra and UVE-SPA-PLSR models for Vis–SNIR, MIR, and NMR spectra) and the F-PLSR models, it was found that the BV-PLSR models had better performances with the RMSECV decrements of 23.91, 16.32, 33.25, and 34.67 % for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. On the other hand, the AV_RMSE values of the BV-PLSR models decreased by 72.42, 42.26, 73.52, and 78.54 %. The above results demonstrated that the variable selection improved the prediction accuracy and robustness of the PLSR model in the DHA analysis. The selected important variables of the BV-PLSR models for Vis–SNIR, LNIR, MIR, and NMR spectroscopy in DHA prediction are shown in Table 3. In Vis–SNIR analysis, 893 and 935 nm were assigned to C–H stretching third overtone of CH3 and CH2, respectively, and 1,025 nm was assigned to a combination of C–H stretching first overtone and C–H deformation second overtone of CH3. In LNIR analysis, 1,031 nm was assigned to a combination of C–H stretching first overtone and C–H deformation second overtone of CH3, 1,051 nm was assigned to a combination of C–H stretching first overtone and C–H deformation first overtone of CH2, 1,396 and 1,416 nm were assigned to a combination of C–H stretching first overtone and C–H deformation of CH2, 1,495 nm was assigned to O–H stretching first overtone of –OH, 1,757 nm was assigned to C–H stretching first overtone of CH2, 1,899 nm was assigned to C=O stretching second overtone of –COOH, 2,093 nm was assigned to a combination of O–H stretching and O–H deformation of –OH, and 2,266 and 2,317 nm were assigned to a combination of C–H stretching and C–H deformation of CH3 and CH2, respectively. In MIR analysis, 662, 697, 721, and 761 cm−1 were assigned to the rocking and bending out vibration of –(CH2) n – and –HC=CH– (cis), 1,026 and 1,057 cm−1 were assigned to –C–O stretching, 1,147 cm−1 was assigned to stretching and bending vibration of –C–O and –CH2, 1,370 cm−1 was assigned to –C–H (CH3) bending (sym) vibration, and 1,717 and 1,741 cm−1 were assigned to –C=O (ester) stretching vibration. In NMR analysis, the variables were mainly assigned to –CH3, –OCO–CH2–CH2–, –CH2–CH=CH–, –OCO–CH2–, and =HC–CH2–CH= of acyl group.

DPA Analysis

Similar to the analysis of the above two PUFA, the original full range spectra and those preprocessed by different preprocessing algorithms of four spectroscopic techniques were used separately to establish PLSR models for the prediction of DPA content in fish oil. It was found that the spectra preprocessed by 2nd Der had the best prediction result for LNIR spectra, while the original spectra were the most suitable data for other three kinds of spectra. When SPA was applied to select the important variables from the full range spectra for establishing PLSR model, the predictive accuracy of Vis–SNIR, MIR, and NMR spectra had been improved with the RMSECV decrements of 22.36, 28.525, and 28.30 %, respectively, while the RMSECV of the LNIR spectral model increased by 4.93 %. On the other hand, the AV_RMSE values decreased from 34.62 to 61.48 % for four spectroscopic techniques with the involvement of SPA. In addition, there were less than 10 % variables selected by SPA for the establishment of PLSR models.

The other strategy of SPA process for variable selection was to execute it based on the informative variables selected by UVE instead of the full range spectra. After the UVE calculation, there were 1,334, 102, 487, and 1,647 variables selected for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. The PLSR models established based on these variables selected by UVE had the RMSECV decrements by 21.94, 28.17, 61.41, and 44.65 % for four spectroscopic techniques, respectively, compared with those of F-PLSR models. The above results indicated that the variables selected by UVE were informative and could be used for the further SPA calculation. After the calculation, there were 7, 12, 8, and 10 variables selected by UVE-SPA for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively. The prediction results of the UVE-SPA-PLSR models are shown in Table 5. For the Vis–SNIR spectra, the UVE-SPA-PLSR model had similar results of the SPA-PLSR and UVE-PLSR models. For MIR spectra, the UVE-SPA-PLSR model had a poorer performance than the SPA-PLSR model, in which the RMSECV decreased by 5.63 %. Therefore, the consideration of UVE before SPA had no improvement for the variable selection, and the SPA-PLSR model was the best choice for both Vis–SNIR and MIR spectra in the DPA analysis. On the other hand, the UVE-SPA-PLSR models had better prediction results than the SPA-PLSR models for LNIR and NMR spectra. The RMSECV decreased by 42.95 and 32.46 % and the AV_RMSE decreased by 41.03 and 18.18 % for LNIR and NMR spectra, respectively. Compared with the results of the F-PLSR models, it was found that the BV-PLSR models had better performances with most variables eliminated. The RMSECV values decreased by 22.36, 40.14, 28.52, and 51.57 % and the AV_RMSE values decreased by 61.48, 73.56, 34.62, and 66.67 % for Vis–SNIR, LNIR, MIR, and NMR spectra, respectively, showing that the prediction accuracy and robustness of the PLSR model could also be improved by the variable selection in the DPA analysis. The selected important variables of the BV-PLSR models for Vis–SNIR, LNIR, MIR, and NMR spectroscopy in DPA prediction are shown in Table 3. In Vis–SNIR analysis, 892 and 935 nm were assigned to C–H stretching third overtone of CH3 and CH2, respectively, 892 nm was assigned to C–H stretching third overtone of CH3, 1,005 nm was assigned to a combination of C–H stretching first overtone and C–H deformation second overtone of CH3, and 1,037 nm was assigned to a combination of C–H stretching first overtone and C–H deformation first overtone of CH2. In LNIR analysis, 1,078 and 1,184 nm were assigned to C–H stretching second overtone of CH, CH2, CH3, and HC=CH, 1,310, 1,370, 1,403, 1,482, 1,514, and 1,534 nm were assigned to a combination of C–H stretching first overtone and C–H deformation first overtone in CH, CH2, and CH3, 1,880 nm was assigned to C=O stretching second overtone of COOH, 2,228 and 2,254 nm were assigned to a combination of CH2 stretching and C= stretching of HC=CH and CH3, respectively, and 2,279 nm was assigned to a combination of C–H stretching and C–H deformation in CH3. In MIR analysis, 658, 661, 703, and 763 cm−1 were assigned to the rocking and bending out vibration of –(CH2) n – and –HC=CH– (cis), 1,735 and 1,740 cm−1 were assigned to –C=O (ester) stretching vibration, and 3,035 cm−1 was assigned to =C–H (cis) stretching. In NMR analysis, the variables were mainly assigned to –CH3, –(CH2) n –, –CH2–CH=CH–, –OCO–CH2–, and =HC–CH2–CH= of acyl group.

Table 5 Predictive results of docosapentaenoic acid (DPA) of fish oil using infrared spectroscopy and nuclear magnetic resonance with chemometrics (available range of DPA is between 10.95 and 32.60 mg/g)

Discussion

The potential of four spectroscopic techniques of Vis–SNIR, LNIR, MIR, and NMR was investigated for the rapid and reliable assessment of EPA, DHA, and DPA in fish oil. Specifically, in the EPA analysis, the F-PLSR model for NMR spectra had a high r 2CV of 0.959, which was further improved to 0.970 based on the BV-PLSR model with eight important variables selected by UVE-SPA. The results of LNIR spectra were also acceptable, in which r 2CV of 0.920 was obtained by the F-PLSR model and r 2CV of 0.944 by the BV-PLSR model with ten important variables. Although the UVE calculation helped the MIR spectra to obtain a good prediction of EPA with r 2CV of 0.974, the SPA calculation either based on full range spectra or variables selected by UVE could only cause the PLSR models having the r 2CV of around 0.9. Similar situation happened for Vis–SNIR spectra. Although the UVE-PLSR model had r 2CV of 0.942, either the SPA-PLSR model or the UVE-SPA-PLSR model had r 2CV lower than 0.9. In the aspect of the robustness of the established BV-PLSR models, NMR spectra had the smallest AV_RMSE value of 0.162, followed by MIR spectra that had the value of 0.376. The AV_RMSE values of other two spectra were larger than 0.5.

In the DHA analysis, the BV-PLSR model of Vis–SNIR spectra had the poorest prediction with the RMSECV of 13.40 mg/g. The RMSECV decreased to around 10 mg/g for both the BV-PLSR models of LNIR and MIR spectra. The best prediction with r 2CV of 0.982 and the RMSECV of 4.73 mg/g was obtained by the BV-PLSR model for NMR spectra. The prediction of DHA using NMR spectra in tandem with multivariable selection and calibration had better prediction than internal standard method and DHA calibration curve method used by Igarashi et al. (2000), in which RMSE values were 14.92 and 17.26 mg/g for above two methods, respectively (calculated from data in Table 3 in the work of Igarashi et al. (2000). Therefore, the use of multivariable analysis was an efficient way to process NMR data instead of internal standard method and calibration curve method for DHA determination in fish oil. In addition, the BV-PLSR model for NMR spectra also had the smallest AV_RMSE value of 0.085 showing its good robustness, while the BV-PLSR models for other three spectra had the AV_RMSE values all higher than 0.25. In the DPA analysis, the BV-PLSR models of both LNIR and NMR spectra achieved good prediction with r 2CV of 0.983 and 0.979 and small AV_RMSE values of 0.018 and 0.022, respectively. On the other hand, the prediction of the BV-PLSR model using either Vis–SNIR or MIR spectra was not very good (the r 2CV values were 0.905 and 0.872).

The above results indicated that the important PUFA (EPA, DHA, and DPA) contents in fish oil could be determined by spectroscopic techniques in a rapid and reliable way. The NMR spectroscopy was the most suitable one among the four spectroscopic techniques investigated for the determination of EPA, DHA, and DPA contents in fish oil. The RPD values of the BV-PLSR models were 5.613, 7.144, and 7.416, showing that the BV-PLSR models were adequate for quality control and even process control (Williams 2001). The LNIR spectroscopy was also a good choice (RPD = 4.086 for EPA, 3.185 for DHA, and 6.718 for DPA). Especially considering that its measurement of using a small spectrometer (dimensions of 153.4 × 105.2 × 76.2 mm) was nondestructive, inexpensive, and very convenient to implement. It should be noticed that spectroscopy techniques analyzed with chemometrics are indirect methods that need reference values of target attribute for model calibration. Therefore, the accuracy of spectral models depends on and could not be better than the accuracy of the reference method. The reference method should be conducted carefully to minimize its test error, so that the corresponding established spectral models would be meaningful. In this work, the reference EPA, DHA, and DPA values were measured carefully using GC method. As GC is a standard method to determine fatty acids in foods and biological samples (AOAC Method 963.22 2000), its measured PUFA values could be used as the reference values for establishing spectral models.

Moreover, it was found that there were 11 twelfth cases (four spectroscopic techniques × three PUFA), in which the BV-PLSR models had better predictions (RMSECV decreased by an average of 23.20 % in all 12 cases) than the corresponding F-PLSR models. Meanwhile, the AV_RMSE of the BV-PLSR models decreased by an average of 64.90 % compared with the corresponding F-PLSR models in all 12 cases. Such improvement was achieved after over 90 % variables of the full range spectra were featured as redundant and uninformative and the remaining variables were used to establish the PLSR model. The above statistical data showed that the variable selection with the help of UVE and SPA was important for improving the prediction accuracy and robustness of the PLSR models for the determination of EPA, DHA, and DPA contents in fish oil. Compared with the work of Azizian et al. (2010) in which the PLSR models were established using near infrared spectra without variable selection (RMSECV of 3.23 % for EPA and 2.57 % for DHA), the PLSR models established in this work using near infrared spectra with variable selection by UVE-SPA had lower RMSECV of 15.77 mg/g for EPA and 10.75 mg/g for DHA, showing that the variable selection was efficient to increase the prediction accuracy of PUFA in fish oil using spectroscopy technology.

In addition, among the 12 cases researched in this study, there were eight cases in which UVE-SPA was the best strategy of variable selection. The SPA calculation based on the full range spectra was considered as the best for the rest four cases. Although it was found in other cases that the calculation of UVE could absolutely improve the results of SPA for spectral variable selection (Chen et al. 2010; Wu et al. 2010; Wu et al. 2011), the results of this work indicated that the elimination of uninformative variables was helpful for improving the efficiency of SPA for only some spectral data, while for other spectra, SPA could directly select the important variables from the full range spectra. Therefore, it is suggested that for different applications, both variable selection strategies of SPA and UVE-SPA should be analyzed and compared to choose the best one.

Conclusions

This study was carried out to evaluate the feasibility of using Vis–SNIR, LNIR, MIR, and NMR spectroscopic techniques for rapid prediction of three PUFA (EPA, DHA, and DPA) in fish oil. The results discussed in this paper indicated that the four techniques were reasonably efficient as rapid and convenient tools instead of the traditional GC method for assessing the EPA, DHA, and DPA contents in fish oil. By means of PLSR algorithm, quantitative relationships were correlated between the spectral data and the PUFA contents of fish oil samples. All results showed good performances of the established PLSR models for predicting PUFA by four spectroscopic techniques. Among them, NMR spectra had the best performances for all three PUFA predictions, followed by LNIR, MIR, and Vis–SNIR spectra. Since the full range spectra were characterized as high dimensionality, multicollinearity, and redundancy, variable selection was conducted to locate the most important variables for the prediction. The results indicated that the BV-PLSR models with over 90 % variables eliminated had lower RMSECV with an average decrement of 23.20 % and had lower AV_RMSE with an average decrement of 64.90 % in all 12 cases compared with the corresponding F-PLSR models. Two variable selection algorithms of UVE and SPA were investigated in this work to see if the efficiency of SPA for variable selection could be improved based on the informative variables selected by UVE. In 67 % cases, UVE-SPA had better prediction results than the direct calculation of SPA based on the full range spectra, while the UVE calculation had no improvement for the SPA selection in other cases. In view of the quality evaluation of fish oil, the results of this study verified the substantial propensity of the spectroscopic technologies to be excellent alternatives to the time-consuming and conventional GC methods. By implementing spectroscopic techniques as a key component of a rapid inspection system, it could enable the fish oil industry to sort, label, and price fish oil products truthfully according to their PUFA contents.