Introduction

The importance of oilseed Brassicas in human and animal nutrition, and as crops with industrial applications, is well recognized. Over the past three decades, production of oilseed Brassicas has increased to the extent that they are now among the important sources of vegetable oil in the world. Modifying the fatty acid composition of seed oil to develop new genotypes with alternative oil characteristics has been an important objective of quality breeding in rapeseed-mustard [1, 2]. The functional and nutritional value of a vegetable oil depends on the constituent fatty acids of its triacylglycerols. High oleic/low linolenic acid cultivars are needed for frying applications, and selection for low and very low (zero) saturated fatty acid content is a prerequisite for certain markets [3]. On the other hand, high erucic acid oilseed rape cultivars have received renewed interest for industrial purposes, e.g., the lubricant industry [4]. Brassica oil has also been commercialized as a substitute for fossil diesel fuel to reduce greenhouse gas emissions [5]. Developing cultivars for different target uses demands extensive evaluation of germplasm for the traits of interest using standard analytical methods, which are labour intensive, time consuming, sometimes expensive, and hazardous.

Near-infrared reflectance spectroscopy (NIRS) is a powerful alternative tool for analysing chemical and physical properties with minimal sample preparation and has been applied to the analysis of quality characteristics in food and agricultural commodities [6–8]. The technology is non-destructive, non-polluting, rapid, accurate, and less labour intensive, and it uses simple sample preparation methods [9]. To predict the concentration of various parameters, prediction equations are required; these are based on calibration data sets that encompass a wide range of sample characteristics previously documented by wet chemistry techniques. Multivariate calibration techniques are often employed to extract the important information from NIRS analysis. Standardization of NIRS is mandatory for every laboratory that installs it and requires a one-time, accurate wet analysis to develop a calibration equation for simultaneous estimation of various biochemical quality traits. Standardization requires initial labour input, but once standardized, the NIRS system is efficient, fast, and safe. The aim of this study was to develop a calibration equation for estimation of different fatty acids in Brassica juncea germplasm for use in various Brassica spp. breeding programmes.

Materials and Methods

The plant material (seeds of 200 B. juncea genotypes) was procured from Oilseed Section, Department of Plant Breeding and Genetics, PAU, Ludhiana.

Fatty Acid Analysis Using Gas Liquid Chromatography

The selected samples were analysed for fatty acid composition by gas liquid chromatography (GLC) [10] on a M/s Nucon Engineers AIMIL gas chromatograph (solid state), model 5700 series, equipped with a flame ionization detector and fitted with a column of 6 % butane diol succinate (BDS) on Chromosorb WAW/DWCS, 6 feet in length × 1/4′′ outer diameter. About 50 mg of ground seed was thoroughly mixed with 1 ml petroleum ether (60–80 °C). To this was added 1.5 ml sodium ethoxide (0.02 M NaOH in 99.5 % ethanol). The tubes were vortexed and kept at room temperature for 30–40 min. Then 1.5 ml of 8 % aqueous sodium chloride solution was added and the tubes were vortexed again. After half an hour the two layers had separated. The upper petroleum ether layer was transferred to another tube and allowed to evaporate; the residue was dissolved in 10 µl of petroleum ether, and 2 µl was injected into the GLC using a microsyringe (Hamilton).

Scanning Samples for NIRS Analysis

A NIR Systems Model 6500 spectrophotometer (Foss-NIR Systems, Inc., Silver Spring, MD, USA) equipped with a transport module was used to perform the NIRS analyses [11]. Intact seeds from the same Brassica juncea samples used for wet analysis were scanned in a small ring cup, and reflectance spectra (log 1/R) were recorded from 400 to 2500 nm.

Selecting Samples for NIRS Analysis

Scanning, mathematical processing, and statistical analysis of the NIRS data were performed with the WinISI II software (Windows version 4.0, Foss and Infrasoft International LLC, USA). The distance between a sample and its neighbour was measured as a Mahalanobis distance (H distance), called the neighbourhood H. The score algorithm ranks spectra according to their Mahalanobis distance from the average spectrum (global H) and provides spectral boundaries to eliminate outliers with H > 3.0 (global H) and similar samples with H < 0.6 (neighbourhood H). The final number of samples for calibration and validation therefore varied with the H cutoff, depending on the spectral and chemical variability of the samples in the population used for NIRS estimation.
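The two H cutoffs described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the WinISI score algorithm: the function names `pca_scores` and `select_samples` are hypothetical, H is taken as the plain (unscaled) Mahalanobis distance computed on principal component scores, and how WinISI scales H internally is not described here, so the 3.0 and 0.6 cutoffs are simply carried over from the text.

```python
import numpy as np

def pca_scores(spectra, n_components):
    """Project mean-centred spectra (rows = samples) onto the leading principal components."""
    centred = spectra - spectra.mean(axis=0)
    # Rows of vt are the principal component loadings.
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T

def select_samples(scores, gh_max=3.0, nh_min=0.6):
    """Return (global H per sample, indices kept after both H filters).

    Assumption: H is the plain Mahalanobis distance in score space.
    """
    centred = scores - scores.mean(axis=0)
    inv_cov = np.linalg.inv(np.atleast_2d(np.cov(centred, rowvar=False)))
    # Whiten the scores so that Euclidean distance equals Mahalanobis distance.
    whitened = centred @ np.linalg.cholesky(inv_cov)
    gh = np.linalg.norm(whitened, axis=1)  # distance from the population centre
    kept = []
    for i in range(len(gh)):
        if gh[i] > gh_max:
            continue  # global H outlier
        # Neighbourhood H: skip samples too close to one that is already kept.
        if all(np.linalg.norm(whitened[i] - whitened[j]) >= nh_min for j in kept):
            kept.append(i)
    return gh, kept
```

In this sketch the number of retained samples depends directly on the two cutoffs, which mirrors why the final calibration and validation set sizes varied in the study.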

Developing the NIRS Equation

After outliers were removed by scoring, the calibration set was used to calibrate and cross-validate the derived equation; internal cross-validation (using the same set of samples) was used to avoid overfitting of the equation. External validation was then done to test the goodness of fit of the developed equation, using random samples not included in the calibration sample set.
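The internal cross-validation step can be illustrated with a minimal sketch. It assumes a plain PLS1 regression (NIPALS form) as a stand-in for the MPLS routine in WinISI, whose internals are not described here, and an interleaved split into cross-validation groups; the function names `pls1_fit` and `cross_validate`, the number of groups, and the exact divisors in SECV are all assumptions made for illustration.

```python
import numpy as np

def pls1_fit(X, y, n_terms):
    """Fit a PLS1 model; return (regression coefficients, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_terms):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)        # weight vector
        t = Xk @ w                    # scores
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def cross_validate(X, y, n_terms, n_groups=4):
    """Return (SECV, 1-VR) from leave-one-group-out cross-validation."""
    groups = np.arange(len(y)) % n_groups
    pred = np.empty(len(y), dtype=float)
    for g in range(n_groups):
        held_out = groups == g
        coef, intercept = pls1_fit(X[~held_out], y[~held_out], n_terms)
        pred[held_out] = X[held_out] @ coef + intercept
    resid = y - pred
    secv = np.sqrt(np.mean(resid ** 2))
    # 1-VR: one minus unexplained variance divided by total variance.
    one_minus_vr = 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)
    return secv, one_minus_vr
```

Running `cross_validate` for an increasing number of terms and keeping the smallest number beyond which SECV stops improving is one common way to guard against overfitting.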

Results and Discussion

The range, mean, and standard deviation of the parameters analysed by reference methods are summarised in Table 1 for B. juncea. Across all seed samples, oleic acid varied from 6.4 to 49.9 %, linoleic acid from 10.9 to 40.8 %, linolenic acid from 3.9 to 36.3 %, and erucic acid from 0.4 to 57.2 %. Erucic acid thus exhibited a very wide range. The mean value for oleic acid was 24.8 %, linoleic acid 24.8 %, linolenic acid 13.0 %, and erucic acid 32.2 %. Font [12] obtained similar results for B. juncea: oleic acid 8.3–50.0 %, linoleic acid 11.2–43.5 %, linolenic acid 5.1–14.0 %, and erucic acid 0.02–56.2 %, with mean values of 19.7, 22.8, 9.4, and 38.3 %, respectively. Other studies also reported a good range of concentrations for the parameters analysed [13–16].

Table 1 Fatty acid composition of Brassica juncea seeds (wet chemistry data)

Reflectance Spectra of B. juncea Genotypes and Their Scoring

All 200 seed samples of B. juncea were scanned with ISI SCAN on the NIR spectrophotometer system. The instrument was operated at a constant temperature of 23 ± 2 °C. The spectra were collected over the range 400–2498 nm, covering the visible and NIR regions. The NIR reflectance spectra of intact seed samples are shown in Fig. 1. The main absorption bands were observed at 1208 nm, related to the C–H stretching 2nd overtone (–CH2); 1496 nm, related to the C–H stretching 1st overtone; 1724 nm, related to C–O (oil) and the C–H stretching 1st overtone (–CH2); 1936 nm, related to the O–H bending 2nd overtone (water); and 2308 nm, related to the C–H bending 2nd overtone (oil). The functional group assignments for the spectrum were looked up in the Win ISI software. The overall spectrum showed strong absorption bands related to oil and water and was similar to those of other oil crops such as perilla, peanut, soybean, and sesame, especially in the near-infrared region [17].

Fig. 1 Near-infrared reflectance (NIR) spectra of intact seeds of Brassica juncea

The reflectance spectra, along with the laboratory values, were then transferred from ISI SCAN to the Win ISI 4.0 software as a .cal file, where the spectra were subjected to various manipulations to build the calibration equation. The spectra contained outliers, which were eliminated by scoring. The score file ranks each sample by its distance from the spectral mean, called global H. The global H (GH) is the Mahalanobis H distance, which describes the distance of an individual sample from the centre of the population of calibration samples. In essence, it identifies the spectra that differ most from the other spectra.

Calibration Models for Fatty Acid Composition

The statistics of calibration and cross-validation for the NIRS model used to estimate the fatty acids of interest in Brassica samples are shown in Table 2, including the standard error of calibration (SEC) and the coefficient of determination in calibration (R²). One minus the ratio of unexplained variance to total variance (1-VR) and the standard error of cross-validation (SECV) are also shown in Table 2. The robustness of the equation obtained after calibration was strengthened first by internal cross-validation and then by external validation. Internal validation was used to avoid overfitting of the equation by selecting the minimum number of PLS terms after the data had been scanned through various cross-validation passes in the Win ISI software [18]. The MPLS regression model for the whole NIR spectral range (400–2500 nm), using the second derivative transformation with scatter correction (SNVD) of raw reflectance spectra, yielded the equation for fatty acid analysis in B. juncea. For oleic acid, SEC was 3.91, R² was 0.91, SECV was 4.52, and 1-VR was 0.88. For linoleic acid, SEC was 3.02, R² was 0.83, SECV was 3.20, and 1-VR was 0.81; for erucic acid, SEC was 6.01, R² was 0.88, SECV was 1.77, and 1-VR was 0.84. For linolenic acid, however, the calibration was not good: MPLS on 184 samples gave SEC 1.50, R² 0.53, SECV 1.77, and 1-VR 0.35 (Table 2). These results suggest that the variability in fatty acid composition within the calibration set is one of the main factors determining the development of a successful calibration equation, because 1-VR is 0.35 for linolenic acid whereas it is more than 0.8 for the others. R² and 1-VR should agree closely, with values near 0.9, and SEC and SECV should be as low as possible. High SEC and SECV values can be corrected by adding more samples.
Kim [11] reported R² values of 0.98 for oleic acid, 0.90 for linoleic acid, 0.84 for linolenic acid, and 0.98 for erucic acid, with SEC of 2.67, 1.00, 0.62, and 2.60, respectively, in 249 seed samples of rapeseed. Mailer [19] reported R² values of 0.99 and 1.0 for oleic acid and linoleic acid, with 1-VR of 0.98 and 0.99, respectively, in olive oil. While building a calibration set, the following three parameters should be taken into consideration: the number of samples, the concentration range to be covered by the samples, and the distribution within this range. A large number of samples should be collected to ensure that the model covers the whole range of variation and that the samples are uniformly distributed within this range (in terms of all possible constituents or other parameters) [20].
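The spectral pre-treatment named above (standard normal variate with detrending, followed by a second derivative) can be sketched as follows. The exact WinISI settings, such as the derivative gap and smoothing segments, are not given in the text, so the quadratic detrend, the simple gap-difference derivative, the default `gap=4`, and the function names are assumptions for illustration only.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def detrend(spectra, wavelengths, degree=2):
    """Subtract a least-squares polynomial baseline from each spectrum."""
    # Rescale wavelengths so the polynomial fit stays well conditioned.
    x = (wavelengths - wavelengths.mean()) / wavelengths.std()
    design = np.vander(x, degree + 1)
    coef, *_ = np.linalg.lstsq(design, spectra.T, rcond=None)
    return spectra - (design @ coef).T

def second_derivative(spectra, gap=4):
    """Gap second difference: x[i+gap] - 2*x[i] + x[i-gap] along the wavelength axis."""
    return spectra[:, 2 * gap:] - 2 * spectra[:, gap:-gap] + spectra[:, :-2 * gap]

def snvd_second_derivative(spectra, wavelengths, gap=4):
    """Scatter correction (SNV + detrend) followed by the second derivative."""
    return second_derivative(detrend(snv(spectra), wavelengths), gap)
```

The derivative shortens each spectrum by `2 * gap` points, so the wavelength axis used for regression would need to be trimmed accordingly.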

Table 2 Calibration equation statistics for near-infrared reflectance spectroscopy (NIRS) for fatty acid analysis in intact seed samples of Brassica juncea

Internal and External Validation for Fatty Acid Composition

Although internal cross-validation provides a useful way to estimate the predictive ability of the calibration equation, an external validation was carried out to analyse the performance of the calibration equations in more detail. The results obtained in the development of calibration equations for predicting the concentration of individual fatty acids were very heterogeneous (Table 3). The calibration equations for oleic acid, linoleic acid, and erucic acid showed a good R² in validation; Fig. 2 indicates that a reliable estimate of the concentration of these fatty acids can be obtained in samples of B. juncea by applying these equations. These values are similar to those reported for calibration equations integrating samples from different species of Brassica [4, 21]. The calibration equation for linolenic acid was characterized by poor validation statistics, with an R² of 0.25 (Table 3; Fig. 2), suggesting that it could be useful only for making preliminary selections or obtaining an approximate classification of the entries; the SEP (standard error of performance) was too high for it to be considered adequate for reliable analyses. Previous studies developing either single- [22, 23] or multispecies calibration equations [4] for these fatty acids in the genus Brassica reported considerably lower SEP values. For example, Velasco [4] obtained a SEP of 2.25 (mean 16.6 %) for C18:2 in a multi-species calibration equation, as compared with a SEP of 3.7 (mean 17.5 %) obtained in a single-species calibration.

Table 3 Internal and external validation statistics for calibration equation for fatty acid analysis in intact seeds of Brassica juncea

Fig. 2 Correlation plots between predicted values and lab (wet) values for oleic (a), linoleic (b), linolenic (c) and erucic (d) acid (%) for the internal validation of near-infrared reflectance spectroscopy (NIRS) calibration equation for Brassica juncea

For oleic acid in B. juncea, the bias was 0.01, SEP 6.57, slope 0.94, and R² 0.77. For linoleic acid, the bias was 0.21, SEP 4.29, slope 0.92, and R² 0.68, and for erucic acid the bias was 0.75, SEP 7.49, slope 0.95, and R² 0.81. For linolenic acid, the bias was 0.36, SEP 2.96, slope 1.05, and R² 0.25. The calibration and validation R² values were close to each other for oleic acid (0.91 and 0.77), linoleic acid (0.83 and 0.68), and erucic acid (0.88 and 0.81) and indicated good correlation, so the calibration equations for these fatty acids are considered suitable for their estimation. For linolenic acid, the calibration and validation models showed low R² values of 0.53 and 0.25, respectively.
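The validation statistics quoted above can be computed from paired laboratory and NIRS-predicted values, as in the sketch below. Conventions differ between software packages, so the choices here are assumptions rather than the WinISI definitions: bias is the mean of lab minus predicted, SEP is corrected for bias, the slope is from regressing lab values on predicted values, and R² is the squared Pearson correlation. The function name `validation_stats` is hypothetical.

```python
import numpy as np

def validation_stats(lab, predicted):
    """Return bias, bias-corrected SEP, slope, and R² for a validation set."""
    lab = np.asarray(lab, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    resid = lab - predicted
    bias = resid.mean()
    # Standard error of performance, corrected for bias.
    sep = np.sqrt(np.sum((resid - bias) ** 2) / (len(lab) - 1))
    # Slope of the regression of lab values on predicted values.
    slope = np.cov(predicted, lab)[0, 1] / np.var(predicted, ddof=1)
    r2 = np.corrcoef(lab, predicted)[0, 1] ** 2
    return {"bias": bias, "sep": sep, "slope": slope, "r2": r2}
```

A slope near 1 with a small bias and a low SEP relative to the trait's range would point to an equation that predicts without systematic error, which is the pattern reported for oleic, linoleic, and erucic acid.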

The external validation results showed that for B. juncea (Table 3; Fig. 3) the equation was valid for oleic acid (R² 0.89), linoleic acid (R² 0.69), and erucic acid (R² 0.90). Both the external and internal validation results showed good R² values for oleic acid, linoleic acid, and erucic acid. Therefore, this NIRS calibration equation for intact seed samples of B. juncea can be used to estimate the oleic acid, linoleic acid, and erucic acid profile, but not linolenic acid.

Fig. 3 Correlation plots between predicted values and lab (wet) values for oleic (a), linoleic (b), and erucic (c) acid (%) for the external validation of the near-infrared reflectance spectroscopy (NIRS) calibration equation for Brassica juncea

Conclusion

The study concludes that developing calibration equations for individual species within a genus increases the accuracy of estimation of the various fatty acids. For good calibration equation development, the values of each quality trait should be uniformly distributed throughout the range of that trait in the population. Calibration equations with high R² values in internal and external validation are good equations, whereas those with low R² values need to be updated with a new set of samples covering a wide range in intact seeds of B. juncea.