Introduction

Meat and meat products represent an important component of the human diet. In addition to proteins, red meat also offers minerals and trace elements, particularly zinc and iron, to the diet. Game meat, also high in proteins (20.0–23.8%), offers a healthy alternative to other red meat as it is known to be much lower in fat (0.8–2.45%) compared to beef (14.2% fat; 19.2% protein) (Hoffman 2007). South African (SA) game meat is considered an organic food product since the animals are wild and free-roaming, in contrast with many game species in other parts of the world that are semi-domesticated (Hoffman and Wiklund 2006; Mostert and Hoffman 2007). For this reason, SA game meat is a highly priced commodity making it an attractive target for species substitution (Ballin 2010; Kamruzzaman et al. 2013).

Meat species substitution is a current problem involving economic and safety issues since one cannot easily detect the source of origin or differentiate between species when evaluating meat visually (Kamruzzaman et al. 2013). Beef burgers (produced in Ireland in 2013) were found to contain horse meat, exposing consumers to undeclared animal species in meat products (O’Mahony 2013; Walker et al. 2013). In South Africa, Cawthorn et al. (2013) found species (such as chicken, goat, water buffalo and donkey) that were not declared on the product labelling in beef sausages. Such reports subsequently raised consumers’ concern regarding traceability and origin of the food they eat (Verbeke and Ward 2006). Correct and reliable labelling of meat products is important to allow consumers to make informed choices.

Due to the cost of analytical methods (chromatography, electrophoresis, enzyme-linked immunosorbent assay (ELISA) and DNA-based techniques) required for accurate identification of meat species (Cawthorn et al. 2013; Fajardo et al. 2010; Jonker et al. 2008), raw meat products are not tested on a regular basis. To address this shortcoming, near-infrared (NIR) spectroscopy can be used as a rapid screening method (Manley 2014) for detection of potential substitution of meat species (Ding and Xu 1999; Cozzolino and Murray 2004). NIR spectroscopy can be used to quantify and qualify physical, chemical and biological attributes of food samples based on their spectral signature (Manley 2014).

Visible (400–780 nm) and NIR (780–1100 nm and 1100–2500 nm) spectroscopy has been indicated as an effective test method for meat species identification. NIR spectroscopy works well in combination with chemometrics for more decisive classification of food samples (Reid et al. 2006). Information contained in NIR spectra can be extracted using various multivariate techniques that relate several variables to chemical properties. The most frequently used techniques allow samples with similar characteristics to be grouped, in order to establish classification methods for unknown samples (qualitative analysis) or to perform methods determining some property of unknown samples (quantitative analysis). Ding and Xu (1999), Cozzolino and Murray (2004), and Mamani-Linares et al. (2012) coupled NIR with chemometrics to achieve good classification models of meat species. Visible-near-infrared (vis-NIR) spectroscopy was used in initial studies on meat species classification. Ding and Xu (1999) differentiated beef from kangaroo meat samples with a classification accuracy of 83%. No kangaroo meat samples were misclassified (100% accuracy). Similarly, Cozzolino and Murray (2004) identified muscles from beef, pork, chicken and lamb with accuracies of more than 85%. Discrimination of cattle, llama and horse meat species was possible with accuracies of 100, 95 and 89%, respectively (Mamani-Linares et al. 2012). Prieto et al. (2008) used only the NIR region (1100–2500 nm) to discriminate ground meat samples of adult steers (oxen) from that of young cattle. Using partial least squares discriminant analysis (PLS-DA), an overall classification accuracy of 100% was obtained. Intramuscular fat and water content were shown to be the main sources of variation between these sample groups. Since 2013, the availability of handheld instruments (O’Brien et al. 2013) opened up the opportunity to take the instrument to the sample, in contrast to desktop NIR instruments which require samples to be transported to the laboratory for analysis.

The consumption of game meat is becoming popular all over the world. For example, Spain has approximately 1.2 million hunters, and nine million animals of the main hunted species are hunted each year. In Andalusia, over five million big and small game animals were hunted during the 2015 and 2016 season (Moreno-Ortega et al. 2018). In Sweden, 70% of the population, including non-hunters, are reported to have consumed game meat (Ljung et al. 2012). In Africa, among the most hunted species are springbok (Antidorcas marsupialis), gemsbok (Oryx gazella), impala (Aepyceros melampus), blesbok (Damaliscus pygargus phillipsi), kudu (Tragelaphus strepsiceros), blue wildebeest (Connochaetes taurinus) and red hartebeest (Alcelaphus buselaphus caama) (Van Schalkwyk and Hoffman 2016). The South African consumer demand for game meat within the formal market has been considerably lower than for more conventional livestock species such as beef, mutton and pork. The lower demand can potentially be attributed to limited availability, higher retail prices and the naturally darker colour of game meat (Hoffman and Wiklund 2006; Wassenaar et al. 2019). Nonetheless, Saayman et al. (2011) investigated the effect of local hunting on the South African economy and found a largely positive economic impact; hunting contributed over 6 billion ZAR to the gross domestic product (GDP) of the country, along with job creation. Van der Waal and Dekker (2000) found that approximately 13,700 permanent jobs were created and that additional people were hired temporarily during the hunting season. In a 2018 report, Saayman et al. found that trophy hunting contributed significantly to the national economy and supplied over 17,000 jobs, which could result in areas of lower income becoming more economically stable (Saayman et al. 2018). However, the growth of hunted wild game meat markets worldwide is hampered by the lack of a well-structured food chain (Marescotti et al. 2019).

Marketing game meat at species level rather than as a collective ‘game meat’ has been considered. However, some game species are more popular and sought after by consumers than others. Springbok and eland, for example, are favoured over zebra, which is deemed less desirable. Game meat is sold in the form of steaks, sausages, biltong and droewors. There is therefore a possibility that these meat portions can be mislabelled, and in such cases, meat species classification will be required.

Recently, we reported that a handheld NIR device could be used to discriminate muscles of SA game species (Dumalisile et al. 2020). Overall accuracies ranging from 85 to 100% were achieved when distinguishing between impala, eland and ostrich muscles. In this study, we aim to use NIR spectroscopy coupled with various discriminant and classification methods to differentiate between Longissimus thoracis et lumborum (LTL) muscle steaks of selected game species.

Materials and Methods

Meat Samples

A total of 118 animals of the following game species was obtained: 33 impala (Aepyceros melampus), 26 blesbok (Damaliscus pygargus phillipsi), 13 springbok (Antidorcas marsupialis), 15 eland (Taurotragus oryx), nine black wildebeest (Connochaetes gnou) and 22 zebra (Equus quagga). The game species originated from different areas and were hunted during different seasons as shown in Table 1. The animals were free-roaming and grazed on natural vegetation. All animals were hunted according to the standard operating procedure with ethical clearance (approval number: SU-ACUM14-001SOP; Stellenbosch University (SU) Animal Care and Use Committee). The animals were eviscerated at abattoirs according to the South African red meat regulations (DAFF (Department of Agriculture, Forestry and Fisheries) 2004; Van Schalkwyk and Hoffman 2016), and transported chilled to the meat research laboratory at the Department of Animal Sciences, SU. After 24 to 48 h post-mortem, the Longissimus thoracis et lumborum (LTL) muscle was removed at the sixth rib of each carcass.

Table 1 Description of game species (blesbok, impala, springbok, black wildebeest, eland and zebra), number of samples, provenance and harvest season

Near-Infrared Spectral Acquisition

Each LTL muscle was cut into a 2.0 to 2.5-cm-thick steak and allowed to bloom for 30 min at ambient temperature. Near-infrared (NIR) spectra were collected from each muscle with a MicroNIR™ OnSite spectrophotometer and spectral acquisition software (Viavi Solutions®, San Jose, CA, USA). The illumination source comprised two integrated vacuum tungsten lamps coupled to a linear variable filter and a 128-pixel indium gallium arsenide (InGaAs) photodiode array detector. The reflectance spectra were recorded from 908 to 1680 nm at 6.2 nm intervals, resulting in 125 data points. The InGaAs detector was used to achieve a resolution of 30 μm × 250 μm/50 μm (< 12.5 nm resolution). A 2-mm-thick Steriplan glass Petri dish was placed on top of the meat samples to prevent direct contact of the spectrophotometer with surface moisture. Triplicate spectra were collected through the glass surface, at three different positions for each sample. A sample spectrum was recorded in about 0.25 to 0.5 s. Each spectrum was the average of 100 scans. The external white and dark references were scanned every 10 min during sample collection.

Moisture, Protein and Fat Analysis

Moisture, protein and fat content of the game meat steaks were determined as described by Neethling et al. (2014a, b). The moisture content (g/100 g) of each species was determined by drying the homogenised muscles at 100 °C for 24 h, according to the Association of Official Analytical Chemists’ Standard Techniques (AOAC method 934.01.30). For protein content determination, dried and defatted meat samples were ground to a fine powder. The crude protein was analysed using the LECO combustion method, also known as the Dumas combustion method (AOAC method 992.15). Approximately 0.15 g of sample was weighed and inserted into a foil wrap designed for a Leco protein analyser (LECO FP-528 Nitrogen Analyzer, Leco Corporation). An ethylene diamine tetra-acetic acid (EDTA) calibration sample (Part number 502-092) was analysed with each batch of samples to ensure accuracy and recovery rate. The protein content was determined as nitrogen (% N) content multiplied by a factor of 6.25. The fat content was determined by homogenising the samples in a blender, followed by chloroform/methanol (2:1) extraction (Lee et al. 1996).

Multivariate Data Analysis

The spectral data were imported into and analysed with The Unscrambler® X version 10.5 (CAMO Software, Oslo, Norway) and PLS_Toolbox (Version 8.6.2, Eigenvector Research, Inc., Manson, WA USA) data analysis software packages. Triplicate spectra were averaged to obtain one spectrum per sample. The data were converted to absorbance with the following formula:

$$ A=-\log (R) $$

where A is the absorbance, log is the logarithm to base 10 and R is the reflectance.
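As an illustration only, the averaging of triplicate spectra and the conversion to absorbance can be sketched in Python. The function names and the reflectance values are hypothetical; the sketch does not reproduce the software workflow used in this study.

```python
import math


def mean_spectrum(replicates):
    """Average replicate spectra (lists of equal length) point by point."""
    n = len(replicates)
    return [sum(values) / n for values in zip(*replicates)]


def to_absorbance(reflectance):
    """Apply A = -log10(R) to every reflectance value (each R must be > 0)."""
    return [-math.log10(r) for r in reflectance]


# Hypothetical triplicate reflectance readings at three wavelengths.
triplicate = [
    [0.50, 0.10, 0.25],
    [0.52, 0.10, 0.25],
    [0.48, 0.10, 0.25],
]
absorbance = to_absorbance(mean_spectrum(triplicate))
```

In this sketch the replicates are averaged in reflectance before conversion, following the order in which the two steps are described above.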

Spectral Pre-processing

Two combinations of pre-processing (mathematical transformation) techniques were applied to reduce potential scattering effects, baseline shifts and noise in the data (Rinnan et al. 2009; Engel et al. 2013). Firstly (combination one), spectra were smoothed with a seven-point moving average to remove noise followed by standard normal variate (SNV) and de-trending (Barnes et al. 1989). SNV was applied to remove the scattering effects by centring and scaling each spectrum and de-trending was applied to reduce the baseline shift and curvature. Secondly (combination two), spectra were treated with SNV and de-trending (SNV-Detrend), followed by Savitzky-Golay 2nd derivative, 2nd order polynomial and seven-point smoothing. Savitzky-Golay 2nd derivative was applied to smooth noise fluctuations without introducing distortions and to enhance peaks not clearly visible in the original spectra (Savitzky and Golay 1964).
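A minimal sketch of combination one (seven-point moving average, SNV, then de-trending) is given below for illustration. It rests on several assumptions that are not taken from this study: the moving average shrinks its window at the spectrum edges, de-trending subtracts a least-squares quadratic fitted against the point index, and all function names are hypothetical. The Savitzky-Golay second derivative of combination two is not implemented here; a routine such as `scipy.signal.savgol_filter` could be used for that step.

```python
import math


def moving_average(spectrum, window=7):
    """Centred moving average; edge points use a shrunken window (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def _solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def detrend(spectrum, degree=2):
    """Subtract a least-squares polynomial baseline fitted against the point index."""
    n = len(spectrum)
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]  # index rescaled to [-1, 1]
    powers = [[x ** p for p in range(degree + 1)] for x in xs]
    ata = [[sum(row[i] * row[j] for row in powers) for j in range(degree + 1)]
           for i in range(degree + 1)]
    aty = [sum(row[i] * y for row, y in zip(powers, spectrum))
           for i in range(degree + 1)]
    coeffs = _solve(ata, aty)
    return [y - sum(c * p for c, p in zip(coeffs, row))
            for y, row in zip(spectrum, powers)]


def combination_one(spectrum):
    """Smoothing, then SNV, then de-trending, in the order described in the text."""
    return detrend(snv(moving_average(spectrum, window=7)))


# Hypothetical 20-point spectrum: curved baseline plus one absorption band.
raw = [0.5 + 0.01 * i + 0.002 * i * i + (0.3 if 8 <= i <= 10 else 0.0)
       for i in range(20)]
treated = combination_one(raw)
```

After treatment the curved baseline is removed, so the remaining variation is dominated by the band rather than by offset or slope differences between spectra.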

Principal Component Analysis

Principal component analysis (PCA) was computed in The Unscrambler®. PCA (Cowe and McNicol 1985) decomposes the raw data matrix (X) into scores and loadings, according to the following equation:

$$ X={T}_1{P}_1^{\prime }+{T}_2{P}_2^{\prime }+\dots +{T}_k{P}_k^{\prime }+E $$

where X is the raw data matrix, T the scores matrix, P the loadings matrix, E the residuals and k the number of components, which must be less than or equal to the smaller dimension of X.

In this equation, E is the part of the original data (X) not explained by the model. The explained part, (\( {T}_1{P}_1^{\prime }+{T}_2{P}_2^{\prime }+\dots +{T}_k{P}_k^{\prime } \)), captures the essential patterns in the data, and its terms are known as the principal components (PCs). The first PC accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variance as possible. Thus, in a 3-component model, PC1 will have the largest explained variance, PC2 the second largest and PC3 the smallest. The explained variance, like the eigenvalues, indicates the portion of variability captured by a PC (Wold 1987). The larger the eigenvalue, the greater the amount of variance the PC explains.
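The decomposition can be illustrated with a short sketch of the NIPALS algorithm, which extracts one score/loading pair at a time and deflates the residual matrix. This is not the implementation in The Unscrambler®; mean-centring of X, the convergence tolerance, the function name and the data are all assumptions made for the example.

```python
import math


def nipals_pca(x, n_components, tol=1e-12, max_iter=500):
    """Return scores T, loadings P, residuals E and % explained variance per PC."""
    n_rows, n_cols = len(x), len(x[0])
    col_means = [sum(row[j] for row in x) / n_rows for j in range(n_cols)]
    # Residual matrix, initialised with the mean-centred data (assumption).
    e = [[row[j] - col_means[j] for j in range(n_cols)] for row in x]
    total_ss = sum(v * v for row in e for v in row)
    scores, loadings, explained = [], [], []
    for _component in range(n_components):
        # Start from the column that currently carries the most variance.
        j0 = max(range(n_cols), key=lambda j: sum(row[j] ** 2 for row in e))
        t = [row[j0] for row in e]
        for _iteration in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(e[i][j] * t[i] for i in range(n_rows)) / tt
                 for j in range(n_cols)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]  # loadings have unit length
            t_new = [sum(e[i][j] * p[j] for j in range(n_cols))
                     for i in range(n_rows)]
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change < tol * sum(v * v for v in t):
                break
        # Deflate: remove the component t p' from the residuals.
        e = [[e[i][j] - t[i] * p[j] for j in range(n_cols)]
             for i in range(n_rows)]
        scores.append(t)
        loadings.append(p)
        explained.append(100.0 * sum(v * v for v in t) / total_ss)
    return scores, loadings, e, explained


# Hypothetical data: five samples, three variables.
data = [
    [2.0, 4.1, 1.0],
    [3.0, 6.0, 1.2],
    [4.0, 7.9, 0.9],
    [5.0, 10.2, 1.1],
    [6.0, 11.8, 1.0],
]
scores, loadings, residuals, explained = nipals_pca(data, n_components=2)
```

Because the first two variables in the example are almost perfectly correlated, nearly all of the variance falls on PC1, and the explained variance decreases from PC1 to PC2 as described above.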

Calibration and Validation Sets

Calibration and validation (test) sets were obtained from the original data using the Kennard-Stone (KS) algorithm (Kennard and Stone 1969). This algorithm designs the model set so that it covers the data uniformly, i.e. samples are selected into the model set so that they represent the most different sources of variability. To do so, it employs distance calculations and selects samples based on their spectral features (Pasquini 2018). The algorithm was applied to the full data set to split it into a calibration set comprising 83 samples (70% of the original data set) and a validation set comprising the remaining 35 samples (30%). Table 2 shows the number of samples used for calibration and validation.
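The selection rule can be sketched as follows. This is an illustrative implementation that assumes Euclidean distances; the function name and the example data are hypothetical.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone max-min rule."""
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Add the candidate whose nearest already-selected sample is farthest away.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical samples described by a single feature, for readability.
points = [[0.0], [1.0], [2.0], [9.0], [10.0], [5.0]]
calibration_idx = kennard_stone(points, n_select=3)

# Calibration size for a 70/30 split of the 118 samples in this study.
n_calibration = round(0.70 * 118)
n_validation = 118 - n_calibration
```

In the example, the two extreme points are chosen first and the point midway between them third, which shows how the algorithm spreads the calibration set over the range of the data.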

Table 2 Calibration and validation (test) sets obtained by the Kennard-Stone algorithm

Classification Methods

Classification models were developed using hard and soft modelling methods. Here, we used popular techniques such as linear discriminant analysis (LDA) (Fisher 1936), partial least squares discriminant analysis (PLS-DA) (Barker and Rayens 2003) and soft independent modelling of class analogy (SIMCA) (Wold 1976; Brereton 2011). The species were grouped according to size into medium-sized (impala, blesbok and springbok) and large-sized (eland, black wildebeest and zebra) species, and models were developed within each of these groups. For the PLS-DA approach, groups of classes were modelled simultaneously using one PLS-2 model. Venetian blinds cross-validation was applied during calibration to determine the optimum number of latent variables (LVs), and all models were independently validated. For all algorithms, class modelling was set to “Class Predict Strict” in PLS_Toolbox. In this approach, a sample is assigned to a given class if its probability of belonging to that class is greater than a threshold value. If no class has a probability greater than the threshold, or if more than one class has a probability exceeding it, the sample is assigned to class zero (0), indicating that no class could be assigned. Confusion matrices were used to evaluate the performance of the individual models. To interpret the confusion matrix results, classification accuracy was calculated using the following equation (Oliveri and Downey 2012):

$$ \% Accuracy=\frac{TP+ TN}{TP+ TN+ FP+ FN}\times 100 $$

where:

TP: true positive (samples belonging to the class being modelled that are correctly predicted to be inside the boundary of that class), e.g. for a blesbok class model, true positive samples are blesbok samples predicted as such

FN: false negative (samples belonging to the class being modelled that are incorrectly predicted to be outside the boundary of the class), e.g. in a blesbok class model, false negative samples are blesbok samples that are misclassified

FP: false positive (samples not belonging to the class being modelled that are incorrectly predicted to be inside the boundary of the class), e.g. in a blesbok class model, false positive samples are samples that are not blesbok but are predicted as blesbok

TN: true negative (samples not belonging to the class being modelled that are correctly predicted to be outside the boundary of the class), e.g. in a blesbok class model, true negative samples are samples that are not blesbok and are predicted as such
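The strict assignment rule and the accuracy calculation can be sketched as follows. The threshold of 0.5, the function names and the example labels are assumptions for illustration and are not taken from PLS_Toolbox or from the results of this study.

```python
def assign_strict(probabilities, threshold=0.5):
    """Return the 1-based class whose probability alone exceeds the threshold.

    Class 0 is returned when no class, or more than one class, exceeds it.
    The threshold value is an assumption made for this example.
    """
    above = [k for k, p in enumerate(probabilities, start=1) if p > threshold]
    return above[0] if len(above) == 1 else 0


def class_accuracy(true_labels, predicted_labels, target):
    """% accuracy of one class model: (TP + TN) / (TP + TN + FP + FN) x 100."""
    tp = tn = fp = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == target and predicted == target:
            tp += 1  # target sample predicted as target
        elif truth == target:
            fn += 1  # target sample predicted as something else
        elif predicted == target:
            fp += 1  # other sample predicted as target
        else:
            tn += 1  # other sample predicted as not target
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


# Hypothetical true and predicted labels for six samples.
truth = ["blesbok", "blesbok", "impala", "impala", "springbok", "springbok"]
predicted = ["blesbok", "springbok", "impala", "impala", "blesbok", "springbok"]

blesbok_accuracy = class_accuracy(truth, predicted, "blesbok")  # TP=1, FN=1, FP=1, TN=3
impala_accuracy = class_accuracy(truth, predicted, "impala")    # TP=2, TN=4
```

Under this definition, a sample assigned to class zero counts as a false negative for its own class model and as a true negative for every other class model.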

Results and Discussion

Proximate Analysis

Proximate analysis was performed to support the spectral interpretation of the species, and the results are presented in Table 3. The moisture content of game meat usually varies between 70 and 77% (Hoffman 2007). In this study, the moisture content was between 75.30 and 75.60%, with no major differences between the species. As expected, the protein content was within the range (20.0–23.8%) reported by Hoffman (2007), with blesbok the lowest (21.53%) and eland the highest (22.96%). The fat content was within the reported limits (0.8–2.45%), except for blesbok, which had a fat content of 2.48%. This was higher than the 1.7% reported by Von la Chevallerie (1972). The composition of the zebra meat was similar to that reported by Hoffman et al. (2016).

Table 3 Average proximate chemical composition (moisture, fat and protein) (%) of the LTL muscles of blesbok, impala, eland and zebra

Characterisation of NIR Spectra

Mean spectra (raw and pre-processed) of the medium-sized antelopes and the large-sized game species are shown in Figs. 1 and 2, respectively. The raw spectra (Figs. 1a and 2a) show three broad absorption bands typical of red meat samples. The bands at 976 and 1434 nm are related to third and second overtone stretching of the O-H bond (Barbin et al. 2012; Elmasry et al. 2011) associated with the moisture content of the samples. Water is the main component of meat (ca. 75%) (Table 3). In addition to these, the wavelength band at 1186 nm corresponds to the second overtone of a C-H stretching bond, associated with intramuscular fat (Cozzolino and Murray 2004; Ding and Xu 2000; Osborne et al. 1993).

Fig. 1

Mean spectra of samples of the medium-sized antelopes (impala, blesbok and springbok species): (a) raw spectra, (b) combination 1 pre-processed spectra and (c) combination 2 pre-processed spectra. Wavebands 976–988 nm and 1410–1434 nm are associated with moisture and wavebands 1168–1186 nm with fat

Fig. 2

Mean spectra of the samples of the large-sized game species (black wildebeest, eland and zebra): (a) raw spectra, (b) combination 1 pre-processed spectra and (c) combination 2 pre-processed spectra. Wavebands 976–988 nm and 1410–1434 nm are associated with moisture and wavebands 1168–1186 nm with fat

Pre-processing enhanced the differences between the respective species at these wavelengths (Figs. 1b, c and 2b, c). The average spectra of blesbok and springbok seemed to be more similar to each other than to that of impala. The average spectra of the large-sized species showed differences in absorbance values between all three species, especially at the bands associated with moisture. The large-sized species seemed to be more similar in terms of fat. Due to the broad bands observed in NIR spectra, it was not always possible to distinguish between the different species based on visual inspection of the raw or pre-processed spectra. Further analysis, such as exploratory data analysis and classification model development, is required to effectively determine the potential of NIR spectroscopy to distinguish between game meat muscles.

Principal Component Analysis

The principal component analysis (PCA) scores plot (PC1 vs. PC3) of all six species, pre-processed with the combination 1 transformation and accounting for 94% of the total explained variance, is shown in Fig. 3a. Two clear clusters separated the medium-sized antelopes from the large-sized game species. The loadings line plot of PC3 (Fig. 3b) indicates a waveband at ca. 1372 nm, associated with fat, accounting for the separation between the medium- and large-sized game species. This is evident from the difference in average fat content of blesbok and impala (2.05%) compared to that of eland and zebra (1.49%) (Table 3).

Fig. 3

(a) PCA scores plot (combination 1 pre-processed spectra; smoothing and SNV-Detrend) of PC1 vs. PC3 (94% explained variance) illustrating separation of the medium-sized antelopes (impala, blesbok and springbok) from the large-sized game species (black wildebeest, eland and zebra) samples in the direction of PC3. (b) PC3 loadings line plot, with the waveband at ca. 1372 nm (associated with fat) contributing to the separation of the meat samples from the medium-sized antelopes and large-sized game species

Fig. 4 and Fig. 1 of the supplementary material depict the PCA scores plots of the medium-sized antelopes, pre-treated with combination 1 (smoothing and SNV-Detrend) and combination 2 (SNV-Detrend and 2nd derivative) pre-processing, respectively. The first two principal components (PCs) explain 98% of the total variance when the spectra were pre-processed with combination 1 and 95% when pre-processed with combination 2. The PCA scores plots show separation in both cases, in the direction of PC1, between the impala muscles and those of springbok and blesbok. The PC1 loadings line (Fig. 4b), for the data pre-processed with combination 1, shows prominent wavebands at 982 and 1416 nm (O-H bonds) and 1093 and 1570 nm (N-H bonds), associated with moisture and protein, respectively (Osborne et al. 1993). When the data was pre-processed with combination 2, wavebands at 976 nm (moisture), and 1155 and 1366 nm (fat) were contributing to the separation.

Fig. 4

(a) PCA scores plot (smoothing and SNV-Detrend pre-processed spectra) of PC1 vs. PC2 (98% explained variance) illustrating separation of the impala meat muscles from those of blesbok and springbok in the direction of PC1. (b) PC1 loadings line plot, depicting wavebands associated with protein (1093 and 1570 nm) and moisture (982 and 1416 nm) mainly contributing to the separation of impala from blesbok and springbok

The similar spectral characteristics observed between the springbok and blesbok samples are probably due to the animals having been harvested from the same farm, during the same season, and having grazed on the same pasture/fodder. Van Zyl and Ferreira (2004) reported a distinct chemical difference between springbok, blesbok and impala harvested from different regions. In addition, Neethling et al. (2018) noted that springbok from three farm locations differed significantly in their proximate composition and sensory attributes. Based on these findings, it appears that the geographical origin of the species has a meaningful impact on their chemical composition. The lack of geographic variation in our study is evident.

The PCA scores plots of the large-sized species pre-processed with combination 1 (smoothing and SNV-Detrend) and combination 2 (SNV-Detrend and 2nd derivative) are shown in Fig. 2 of the supplementary material and Fig. 5, respectively. Fig. 2a of the supplementary material shows the scores of PC1 vs. PC2 (95% explained variance) illustrating separation of zebra muscles from eland and black wildebeest in the direction of PC1. The wavelength bands (982 and 1422 nm (O-H) and 1087 and 1570 nm (N-H)) responsible for this separation are shown in the PC1 loadings line plot (Fig. 2b of the supplementary material). The O-H bands are related to third and second overtone stretching of the O-H bond (Barbin et al. 2012), associated with the moisture content of the samples, while the N-H bands are associated with the second overtone stretching related to NH2 compounds (proteins) (Osborne et al. 1993). In the direction of PC2, eland muscles are separated from black wildebeest and the accompanying loadings line plot (Fig. 2c of the supplementary material) indicates 1174 nm as the responsible waveband. This C-H, second overtone stretching bond is associated with fat (Cozzolino and Murray 2004). Hoffman et al. (2009) reported that black wildebeest harvested in spring (regardless of sex) have a low fat content. Thus, it seems possible that the difference in fat content between the eland and black wildebeest muscles is due to the black wildebeest having been harvested in spring (Table 1).

Figure 5a displays the scores plot of PC1 vs. PC3 (78% explained variance) showing clustering of the three groups of large-sized species. The PC1 loadings line plot (Fig. 5b) reveals the main wavelength bands responsible for the grouping as those located at 970 nm, which corresponds to moisture, and at 1155 and 1366 nm, which correspond to fat. The loadings line plot for PC3 (Fig. 5c) shows prominent bands at 1112 and 1366 nm (both C-H bands), responsible for the separation of eland and black wildebeest.

Fig. 5

(a) PCA scores plot (SNV-Detrend and 2nd derivative pre-processed spectra) of PC1 vs. PC3 (78% explained variance) showing the grouping of the zebra, black wildebeest and eland muscles. (b) PC1 loadings line plot, showing the wavebands associated with the separation of most of the zebra samples from those of eland and black wildebeest (970 nm = moisture; 1155 and 1366 nm = fat). (c) PC3 loadings line plot, depicting the separation of the samples of eland and black wildebeest due to difference in fat (1112 and 1366 nm)

Classification Methods

Because of the separation observed in the PCA scores plot (Fig. 3a), the two groups (medium-sized antelopes and the large-sized game species) were classified with PLS-DA and a 96% classification accuracy was obtained (Fig. 3 of the supplementary material). The model showed one medium-sized antelope (impala) sample misclassified as a large-sized game species, while two large-sized game species (zebra) were misclassified as medium-sized antelopes. Based on these results, subsequent classification models were developed within these two groups.

Table 4 shows the classification accuracies of models developed with LDA, PLS-DA and SIMCA for both pre-processing combinations. When combination 1 was used, LDA delivered overall prediction results above 68% with 100% accuracy for the impala meat samples. The PLS-DA models gave the lowest classification accuracies ranging from 47 to 91%. Regardless of the low accuracy (57%) obtained for zebra species, the overall classification accuracy for large-sized species was 77%. SIMCA models yielded classification accuracies ranging from 67%, up to 100% for impala and eland meat samples. However, when pre-processing combination 2 was used for spectral treatment of SIMCA models, the lowest accuracies were obtained (50 to 84%); this highlights the importance of the pre-processing method used and concurs with Rinnan et al. (2009). The LDA model generated the best prediction results across all categories with classification accuracies ranging from 72 to 95%. The PLS-DA model also gave good accuracies, ranging from 70 to 96%. With respect to the medium-sized antelopes category, impala samples gave outstanding results across the models. This was already evident in the spectral features (Fig. 1). In the case of the large-sized species, the calibration and prediction accuracies of eland samples were outstanding for PLS-DA models compared to the others (LDA & SIMCA). In contrast, when SIMCA was used for classification, the lowest accuracies were achieved for the eland samples. Classification accuracies of up to 82% were obtained for the zebra samples, despite the difference between the two batches (Table 1). This indicates that as much variation as possible is needed from each species to build a robust model.

Table 4 Calibration (Cal) and validation (Val) accuracy (%) results of LDA, PLS-DA and SIMCA models, for classification of meat from medium-sized antelopes and large-sized species using pre-processed spectral data (combination 1: smoothing and SNV-Detrend; combination 2: SNV-Detrend and 2nd derivative)

Table 5 shows the confusion matrix of the calibration sets. A notable result was obtained with PLS-DA pre-treated with combination 2 for springbok samples, where 50% were misclassified (false negatives). Springbok meat samples had the highest misclassification rate, and were misclassified as blesbok (Fig. 4 of the supplementary material). This is likely because both species were harvested from the same farm in the same season, feeding on the same pasture (Table 1) (Neethling et al. 2014a, b). A noteworthy model for the large-sized species was obtained with SIMCA pre-treated with combination 1. Eland had a 50% misclassification rate, while the other classes (true negatives) were correctly classified. Thus, this model was capable of identifying all the other classes as a group. Even though the eland had the lowest classification accuracy (75%), in contrast to black wildebeest (90%) and zebra (78%), the model had a validation accuracy of 100% (Table 4). This result should be interpreted with caution, however, as only five eland samples were in the validation set. When pre-processed with combination 2, none of the eland samples were correctly classified. This further stresses the importance of the choice of pre-processing method.

Table 5 Confusion matrix obtained for LDA, PLS-DA and SIMCA classification models for medium-sized antelopes and large-sized species

The confusion matrix results are best visualised graphically, as illustrated in Fig. 4 of the supplementary material where the PLS-DA (pre-processed with combination 2) predictions are shown for the medium-sized antelopes. In the blesbok model, two samples were misclassified as springbok while four springbok samples were misclassified as blesbok, which also concurs with the overlapping spectral features discussed in the “Characterisation of NIR Spectra” section (Fig. 1). For the impala model, one sample was misclassified and one springbok sample was classified as impala. Finally, for the springbok model, three springbok samples were misclassified as blesbok, and three blesbok and four impala samples were misclassified as springbok.

Conclusions

To date, this is the first reported study to discriminate different South African game meat species using NIR spectroscopy in combination with multivariate data analysis. This study showed that it is possible to differentiate game meat with classification accuracies of 67 up to 100%. Moreover, the three discrimination methods applied were able to discriminate meat samples from the two groups (medium-sized antelopes and large-sized species) of game species. In general, impala, black wildebeest and eland gave the best classification results, while blesbok and springbok were classified less well owing to their spectral similarities. Furthermore, it was observed, especially with the PLS-DA and the SIMCA models, that the classification accuracy of a model is influenced by the pre-processing method. In this study, SIMCA models performed better when treated with smoothing and SNV-Detrend, while PLS-DA models gave better accuracies with SNV-Detrend and Savitzky-Golay 2nd derivative.