1 Introduction

Every day, humans need to eat food to obtain the nutrients, proteins, and vitamins that allow them to lead a healthy life and supply the energy required for their activities [1]. Poor nutrition increases vulnerability to disease, alters normal physical and mental development, and diminishes productivity. Therefore, people need to eat nutrient-rich foods to cover their dietary requirements [2].

Today, obtaining clean and healthy food is becoming difficult. A healthy diet includes a variety of nutrient-rich foods such as fruits, vegetables, and cereals, preferably grown naturally and in accordance with the environment [3]. Because of pollution and bad agricultural practices, more crops are contaminated by pathogens [4] and by organic and inorganic residues [5, 6]. Farms are often cultivated with contaminated soil and streams, which carry wastewater from the population and livestock [7,8,9] as well as residues of chemical fertilizers [10] and pesticides [11]. Consuming such foods can cause health problems, usually infectious diseases of the stomach, which are commonly observed in children and elderly people [12, 13].

To mitigate the action of pathogens, good hygiene and washing practices are recommended for fruits and vegetables consumed raw, since recent studies have concluded that pathogen immigration onto the leaf surface or pathogen internalization within the plant increases their survival in the environment [14]. Thus, many studies have been conducted to identify the type of contamination in soils [9], water [6,7,8], and fruits and leafy vegetables [11, 12, 14]. Bacterial contamination is identified and measured by microbiological analysis, in which bacteria are identified, cultured, and counted in a laboratory [15]. However, analyzing a single product for pathogenic microorganisms is complex, time-consuming, and expensive. It is therefore desirable to obtain a general contamination profile, since some vegetables are more contaminated than others, depending on their origin.

Recently, driven by new sensing technologies, novel methods for the rapid detection of bacteriological contamination in vegetables have been presented [16]. A rapid detection method has to be simple, fast, and sensitive. Recent rapid non-destructive detection methods are based on biosensors and spectroscopy. The work in [17] reported the use of biosensors to effectively detect Pseudomonas aeruginosa and Salmonella typhimurium, [18] used a laser beam to detect microorganisms, and [19] used the near-infrared (NIR) spectrum as an indicator of contamination in livestock products.

This paper presents a method to detect the presence of bacteriological contaminants in three vegetables that are consumed raw (lettuce, cilantro, and peppermint). To learn a discriminative model, the vegetables come from two different places: the first is a field irrigated with river water that exceeds the permissible limits of biological organisms for irrigation [20]; the second is a home garden irrigated with drinking water. No fertilizers or pesticides were applied. At harvest time, the vegetables were collected and their reflectance spectra were measured using a spectrophotometer [21]. Using both the visible and NIR measured wavelengths as feature descriptors, we trained a linear Support Vector Machine (SVM) and achieved encouraging results as a fast bacteriological contamination detector.

The remainder of this paper is organized as follows. Section 2 presents related work. Section 3 explains the proposed method. Section 4 details the experimental results and discussion. Finally, Sect. 5 presents conclusions and future work.

2 Related Work

Humans need to eat food to satisfy the nutritional requirements that allow them to carry out their activities. Among foods, vegetables and fruits are the most recommended because they are rich in carbohydrates, vitamins, and minerals [3]. However, vegetables and fruits that are usually consumed raw can cause infectious diseases if they were contaminated or handled unhygienically [14]. The presence and degree of contaminants usually depend on the environmental conditions in which the food is grown [22]. Contamination can come from the water, the soil, and the type of fertilization used during farming. Therefore, vegetables cultivated with wastewater streams are more contaminated than vegetables irrigated with water that complies with irrigation water quality standards.

Because eating contaminated food can cause disease, detecting and discarding contaminated food can prevent it. Bacteriological contaminants such as Escherichia coli are among the most common causes of stomach and intestinal infections, and recently even antibiotic-resistant strains have been reported [23]. Therefore, people require methods to detect contaminated food.

In contrast to traditional detection methods based on biochemical analysis, which are slow and costly, recent methods detect bacteriological contaminants faster, reliably, and at low cost [16,17,18]. In this sense, spectrometry seems a promising method to detect microorganisms [19] and other properties of the object of interest. Spectroscopy is concerned with the interaction between light and the object of interest, while spectrometry measures the electromagnetic radiation to obtain information about the object [24]. Normally, a spectrometer measures the reflection of electromagnetic radiation, called reflectance [21], usually over the visible and near-infrared (NIR) spectrum (between 350 and 2500 nm).

Spectrometry has many applications. In agriculture, for instance, instead of a chemical soil test, spectrometer measurements can provide data to determine the contamination level and the type of particles in the soil [25]; microbial contamination can be detected in water [26] and pesticide residues in lettuce [27]; and plant diseases can be detected, such as the yellowing of leaves caused by rhizospheric bacteria in chrysanthemum [28], leaf miner damage on tomato leaves [29], and fungal pathogens on sugar beet leaves [30]. Since several of these conditions are produced by microbiological pathogens, it is feasible to use a spectrometer to detect them.

Taking advantage of spectrometry, in this work we present an automatic recognizer to detect bacteriological contaminants in three vegetables consumed raw in the town of Ayacucho, Peru. Lettuce, cilantro, and peppermint were cultivated using water highly contaminated with microbiological pathogens such as Escherichia coli and thermotolerant coliforms. According to the official report [20], the irrigation water of the field contains 2.2E+5 CFU/100 mL of Escherichia coli and 1.7E+6 MPN/100 mL of thermotolerant coliforms, which is above the permissible 1000 CFUs for irrigation water.

3 Our Proposal

Ayacucho is a town located in the central part of Peru. There, it is common to eat raw vegetables as an accompaniment to typical dishes. However, eating that food is becoming a health problem for citizens: most fruits and vegetables are irrigated with wastewater from the city of Ayacucho. The wastewater treatment plant discharges its effluent into the Alameda river without the treatment needed to make it suitable for irrigation, so the river and its basin are contaminated.

Among the contaminants, we highlight Escherichia coli and thermotolerant coliforms, reported in [20], because they cause digestive intoxication and, in the long term, cancer (recently, the Ministry of Health of Peru reported a high incidence of stomach cancer in Ayacucho [31]). In the market, it is hard to tell whether a vegetable comes from a contaminated or a non-contaminated field, since microorganisms are imperceptible to the human eye and sellers usually do not state the origin of the product. Therefore, people are at risk of getting sick. For instance, Fig. 1 shows two lettuce samples: one from a field irrigated with contaminated water (Fig. 1a) and another from a field irrigated with potable water (Fig. 1b); both look healthy.

Fig. 1

Lettuce samples. a From a contaminated field, b From a non-contaminated field

The proposed framework for building an automatic detector that discriminates between contaminated and non-contaminated vegetables is shown in Fig. 2. It works as follows:

Fig. 2

The proposed detection framework

Vegetable dataset. The collected dataset consists of reflectance values across the wavelengths of the electromagnetic spectrum, measured with a spectrometer [21] for each vegetable sample.

The sensor measures the reflectance of sunlight between 350 and 2500 nm, i.e., the visible and near-infrared spectrum. Therefore, measurements were taken outdoors between 12 p.m. and 1 p.m. on different sunny days.

The vegetables are lettuce, cilantro, and peppermint, and they form two groups:

  • The first group was cultivated in a field at latitude −13.136103 and longitude −74.193728, irrigated with wastewater from the treatment plant of the city of Ayacucho, located at latitude −13.140945 and longitude −74.206654. These samples were labeled as contaminated.

  • The second group was cultivated in a field at latitude −13.164441 and longitude −74.248020, irrigated with potable water. These samples were labeled as non-contaminated.

Data pre-processing

At this stage, each spectral measurement can be thought of as a row vector. Because the sun intensity, the moisture, and the wind at the time of measurement can produce noise, we applied two pre-processing steps:

  • Remove the noisy water band, approximately 1800–2000 nm [21], so that the vector is 1753-dimensional.

  • Normalize the measurement according to Eq. 1, since the spectra can vary in intensity but not in shape.

    $$ x_{n} = \frac{x - x_{\min}}{x_{\max} - x_{\min}} $$
    (1)
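As an illustration of Eq. 1, the following minimal Python sketch rescales a single spectrum to the range [0, 1]. It assumes that the minimum and maximum are taken over each spectrum individually, which the text does not state explicitly; the function name and the sample values are hypothetical and are not data from this study.

```python
def min_max_normalize(spectrum):
    """Rescale one reflectance spectrum to [0, 1] following Eq. 1.

    Assumption: x_min and x_max are computed per spectrum.
    """
    x_min = min(spectrum)
    x_max = max(spectrum)
    if x_max == x_min:
        # A flat spectrum has no shape to preserve; returning zeros is an
        # assumed convention, not something specified in the paper.
        return [0.0 for _ in spectrum]
    span = x_max - x_min
    return [(x - x_min) / span for x in spectrum]


# Hypothetical reflectance readings (illustrative only).
dim = [0.10, 0.20, 0.40, 0.30]
bright = [0.20, 0.40, 0.80, 0.60]  # same shape, twice the intensity

normalized_dim = min_max_normalize(dim)
normalized_bright = min_max_normalize(bright)
```

Under this assumption, two spectra that differ only by an intensity factor map to the same normalized vector, which is the motivation given for the normalization step.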

Feature extraction

Since the sensor provides a single row of data for each spectral measurement, each measurement is treated as a row feature vector.

Classification

To perform detection, we trained an SVM classifier in a one-against-one fashion [32]. For classification, we have six classes: lettuce-co, lettuce-nc, cilantro-co, cilantro-nc, peppermint-co, and peppermint-nc, where co means contaminated and nc means non-contaminated.
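To make the one-against-one scheme concrete, the sketch below enumerates the pairwise binary problems for the six classes and combines their decisions by majority vote. It is only an illustration of the voting logic: the binary deciders here are hypothetical toy functions standing in for trained SVMs, and the tie-breaking rule (first class in list order) is an assumption.

```python
from collections import Counter
from itertools import combinations

CLASSES = [
    "lettuce-co", "lettuce-nc",
    "cilantro-co", "cilantro-nc",
    "peppermint-co", "peppermint-nc",
]


def pairwise_problems(classes):
    """Return every unordered pair of classes (one binary problem per pair)."""
    return list(combinations(classes, 2))


def one_vs_one_predict(sample, binary_deciders):
    """Majority vote over pairwise deciders.

    binary_deciders maps (class_a, class_b) to a function that takes a
    sample and returns either class_a or class_b.
    """
    votes = Counter(decide(sample) for decide in binary_deciders.values())
    best = max(votes.values())
    # Assumed tie-break: the first class in CLASSES with the top vote count.
    for label in CLASSES:
        if votes[label] == best:
            return label


def make_toy_decider(pair, favoured):
    """Hypothetical stand-in for a trained binary SVM."""
    def decide(sample):
        return favoured if favoured in pair else pair[0]
    return decide


pairs = pairwise_problems(CLASSES)
deciders = {pair: make_toy_decider(pair, "cilantro-co") for pair in pairs}
prediction = one_vs_one_predict(sample=None, binary_deciders=deciders)
```

With six classes, this scheme requires one binary classifier per pair of classes, and the class collecting the most pairwise wins is returned.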

Detection

This stage recognizes the given unknown vegetable and labels it as contaminated or non-contaminated.
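The text does not spell out how the six-class output is turned into the final contaminated or non-contaminated decision. One plausible reading, sketched below under that assumption, is to split the predicted class name into the vegetable and its `co`/`nc` suffix; the function name is hypothetical.

```python
STATUS = {"co": "contaminated", "nc": "non-contaminated"}


def to_detection(label):
    """Split a class name such as 'lettuce-co' into (vegetable, status)."""
    vegetable, _, code = label.rpartition("-")
    if not vegetable or code not in STATUS:
        raise ValueError(f"unexpected class label: {label!r}")
    return vegetable, STATUS[code]


result = to_detection("peppermint-nc")
```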

4 Results and Discussions

4.1 Spectral Wavelengths Dataset

We collected a dataset containing reflectance spectra from 20 samples of lettuce, 10 of cilantro, and 10 of peppermint, from both contaminated and non-contaminated fields and from both the leaves and the stems. The measurement process used to collect the data is illustrated in Fig. 3.

Fig. 3

Electromagnetic spectrum measurement: a Lettuce, b Cilantro, and c Peppermint

Next, to remove noisy data, we discarded three segments from the collected raw data: 1340–1350 nm, 1800–2000 nm, and 2400–2500 nm. The noise in these segments is produced by uncontrolled changes in moisture and wind during the measurements. The data are then normalized. Figure 4 illustrates the pre-processing step: the raw data (Fig. 4a), the cropped data (Fig. 4b), and the normalized data (Fig. 4c).
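The band removal described above can be sketched as follows. This is an illustration under two assumptions that the text does not confirm: the spectrum is sampled once per nanometre from 350 to 2500 nm, and the band limits are inclusive. The reflectance values are placeholders, so the resulting vector length need not match the dimensionality of the actual data.

```python
# Segments discarded in this section, in nanometres (limits assumed inclusive).
NOISY_BANDS_NM = [(1340, 1350), (1800, 2000), (2400, 2500)]


def remove_bands(wavelengths, reflectance, bands=NOISY_BANDS_NM):
    """Drop every reading whose wavelength falls inside one of the bands."""
    kept = [
        (w, r)
        for w, r in zip(wavelengths, reflectance)
        if not any(low <= w <= high for low, high in bands)
    ]
    kept_wavelengths = [w for w, _ in kept]
    kept_reflectance = [r for _, r in kept]
    return kept_wavelengths, kept_reflectance


# Assumed 1 nm sampling grid and placeholder reflectance values.
wavelengths = list(range(350, 2501))
reflectance = [0.5] * len(wavelengths)

cropped_wavelengths, cropped_reflectance = remove_bands(wavelengths, reflectance)
```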

Fig. 4

Pre-processing of the data: a Raw data signal, b Raw data after removing some useless bands, and c Normalized data

We collected data from both the leaves and the stems of the vegetables. Figure 5 shows the plotted data as 1D signals for both the leaves and stem of lettuce, peppermint, and cilantro.

Fig. 5

Illustration of the spectral data from both the leaf and the stem of a sample: a Lettuce leaf, b Lettuce stem, c Peppermint leaf, d Peppermint stem, e Cilantro leaf, and f Cilantro stem. The blue signal represents the contaminated data, while the red signal shows the non-contaminated one

4.2 Experiments

To assess the discriminative power of the measured spectral data (both the visible and the near-infrared data), classification was performed with three feature configurations: (I) only measurements from the leaves, (II) only measurements from the stem, and (III) the concatenation of I and II. To reduce the effect of the small dataset, experiments were conducted using the Leave-One-Out Cross-Validation (LOOCV) scheme [32].
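The evaluation protocol can be sketched as below: the three feature configurations are built from leaf and stem vectors, and each sample is held out once while a model is fitted on the rest. To keep the example dependency-free, a nearest-centroid classifier is used as a stand-in; it is not the linear SVM used in this work, and the toy feature values are hypothetical.

```python
def fit_centroids(X, y):
    """Stand-in classifier: mean feature vector per class (not an SVM)."""
    sums, counts = {}, {}
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Return the label of the closest centroid (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def loocv_accuracy(X, y):
    """Leave-one-out: hold out each sample once and train on the others."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit_centroids(train_X, train_y)
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)


# Hypothetical toy features (one value per sample), not data from the paper.
labels = ["nc", "nc", "nc", "co", "co", "co"]
leaf = [[0.5], [0.4], [0.6], [0.5], [0.6], [0.4]]
stem = [[0.0], [0.1], [0.2], [0.9], [1.0], [1.1]]

feature_sets = {
    "I (leaf)": leaf,
    "II (stem)": stem,
    "III (leaf + stem)": [l + s for l, s in zip(leaf, stem)],  # concatenation
}
scores = {name: loocv_accuracy(X, labels) for name, X in feature_sets.items()}
```

With N samples, LOOCV fits N models, each evaluated on exactly one held-out sample, which makes it a common choice when the dataset is small.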

4.3 Results

Table 1 shows the obtained results using the Linear SVM (Cost = 1) under the LOOCV experimental setup.

Table 1 Experimental results according to their accuracy and Kappa metrics

According to Table 1, experimental setup II outperforms the other approaches, achieving a detection accuracy of 91.89% and a Kappa statistic of 89.87%. Using only leaf features leads to lower performance and, contrary to expectations, the combined features give the worst performance.
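For reference, the two reported metrics can be computed from true and predicted labels as in the sketch below. It assumes that the Kappa statistic is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance; the label lists are hypothetical and do not reproduce the values in Table 1.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single label everywhere); assumed convention.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical labels for illustration only.
y_true = ["co", "co", "co", "nc", "nc", "nc"]
y_pred = ["co", "co", "nc", "nc", "nc", "nc"]

acc = accuracy(y_true, y_pred)
kappa = cohen_kappa(y_true, y_pred)
```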

These results suggest that the classifier has detected microorganisms on the stems of these vegetables but not on their leaves. This is reasonable since, during irrigation, the wastewater reaches up to the stem of the plant, so microorganisms remain on the stem and on other parts that got wet. Few bacteria live on the leaves because the water does not reach them and because of the bactericidal effect of light [33].

5 Conclusions and Future Work

This paper presented an approach to detect bacteriological contaminants in three vegetables consumed raw, namely lettuce, peppermint, and cilantro, in the town of Ayacucho, Peru. The proposal relies on spectroscopy to measure the light spectrum reflected by the object, since spectrophotometry is a low-cost, fast, and non-invasive analysis tool. By measuring only the reflectance of the stem, the detector achieved a recognition rate of 91.89% on a dataset collected from a basin highly contaminated with bacteriological pathogens. This demonstrates its feasibility in support of food safety and people’s health in Peru.

As future work, we intend to extend this study to other vegetables and to establish the degree of contamination.