Introduction

Agricultural workers on Indian farms remove weeds by spraying herbicides manually. Direct exposure to weedicide can cause a range of health complications in farmers, such as cancer, reproductive disorders, dermatitis, wheezing, coughing and other respiratory problems. Moreover, the presence of chemicals in food has become a major concern today; ingestion of even traces of these chemicals can cause a plethora of diseases among consumers. Of the pesticides in use, 56 are reported to have carcinogenic potential, 81 to have endocrine/hormone-disrupting effects and 38 to have immunotoxic effects (Kumar and Reddy 2017).

It is distressing to note that in India, even breast milk and blood samples are polluted with pesticides from the environment and the biological system. Among samples of tea, milk, fish, egg, spices, meat, cereals, pulses and vegetables collected during 2014–15, 12.5% contained unapproved pesticides, 18.7% contained pesticide residues, and 2.6% had residues above the maximum residue level (MRL) recommended by FSSAI. Some samples contained multiple pesticides. This has driven the worldwide trend towards organic food. However, the high cost of pure organic weedicide and the low yield of organic farming have led to a drastic increase in the prices of organic food.

Moreover, weedicide can contaminate soil, water and other organisms (Aktar et al. 2009). In addition to killing weeds, weedicide can be toxic to a host of other organisms, including birds, fish, beneficial insects and non-target plants. Heavy treatment of soil with synthetic weedicide can cause populations of beneficial soil microorganisms to decline. According to soil scientists, the loss of both bacteria and fungi degrades the soil. Indiscriminate use of chemicals might work for a few years, but after a while the beneficial soil organisms that hold onto the nutrients become non-existent, leaving the soil infertile.

Thus, an efficient way of cultivating food that benefits both producers and consumers while preserving the environment would be a method that strikes a middle ground on the path to sustainable agricultural growth. A robotic system that can efficiently differentiate between a weed and a crop plant can be deployed in the field, eliminating human involvement in the delivery of chemicals. As precision agriculture is adopted, the amount of chemicals used on the crop is heavily reduced, making the produce healthier and cheaper for consumers.

This work proposes an intelligent solution to identify non-crop vegetation using a combination of image feature extraction methods and machine learning algorithms, applied to images captured by a camera mounted on a semi-automated robot. The solution includes an analysis of how two feature extraction techniques, namely speeded-up robust features (SURF) and histogram of gradients (HOG), and two classification algorithms, namely logistic regression and support vector machine (SVM), affect the accuracy and efficiency of the weed detection system (WDS).

Existing systems

Nejati et al. (2008) used the shape of leaves and their veins for classification, applying the fast Fourier transform to distinguish between crop and weed leaves in a corn field in real time, and achieved an accuracy of 92% in detecting weed plants. The method is semi-automatic and gives good results only for big leaves. Astrand and Baerveldt (2005) developed a machine with computer-vision guidance and a selective rotary hoe to remove weeds from the field. This robotically controlled weed detection machine works well when weeds are few and crops are plentiful in the field. Another system, proposed by Tannouche et al. (2015), is an automated robot that uses an artificial neural network (ANN) to detect and classify weed leaves among onion plants. However, the model has multiple disadvantages, and it classifies based on the type of weed instead of the crop plant.

Guyer et al. (1993) used the curvature, area and perimeter of the leaf to identify leaf edge patterns. They combined and structured basic, explicit information into subjective shape knowledge, thereby merging rule-based data with low-level quantitative features that were transformed into high-level quantitative features. Small, low-cost interactive robots have been used for precision farming (Arif and Butt 2014); the robot navigates using a row-following technique, and the digital image processing techniques in its computer vision system distinguish crop from weed, using Hough transformation and greyscale conversion for identification. Other approaches in the literature include shape analysis, colour analysis and texture analysis (Chaisattapagon 1995), where various colour filters can be applied to black-and-white images during the pre-processing stages.

Various works have considered different features such as morphology, spectral properties, visual texture and spatial context. Herrera et al. (2014) discussed the number of boundary pixels in the segmented part of the image. A region-based shape index factor method was developed by Bakhshipour et al. (2018). The average of the third component in the YIQ colour space spectral features is discussed by Sabzi et al. (2018). Campos et al. (2017) used statistical visual texture features such as the autocorrelation of the degree of similarity of the elements in an image. Dynamic programming techniques that use prior knowledge of the geometric structure for crop row detection were developed by García-Santillán et al. (2017, 2018) and Vidović et al. (2016).

Slaughter et al. (2008) reviewed the detection, identification and guidance techniques used in general-purpose robotic systems for weed control. As a next evolution, more robust identification techniques that handle the occlusion problem in differentiating between weed and crop were developed (Tian et al. 2000; Lee et al. 1999). Hague et al. (2000) used different types of sensors, machine vision, accelerometers, odometers and a compass for the navigation of the robot in the farm field. A shape analysis algorithm for differentiating crop and weed was developed by Perez et al. (2000).

Liu and Chal (2018) developed a system to detect common invertebrate pests in farmland using a multi-spectral mechanism. Though the precision score of this system is good, there is still scope for optimizing the robot system. Sabzi et al. (2018) identified three different types of weeds in a potato field using heuristic algorithms: feature selection is first performed with a cultural algorithm, selecting the five most important features, and then the optimal configuration of the network is found using the harmony search algorithm. The limitation is that when the field density is high, the system cannot identify the weed. Zhai et al. (2018) used particle swarm optimization techniques for precise spraying of pesticide.

Chang et al. (2018) developed a small-scale robotic machine that uses multitasking and computer vision to identify weeds and also performs variable-rate irrigation to increase productivity. Image processing techniques such as morphology operators, binary segmentation and HSV (hue, saturation, value) conversion are used. It can classify weed and plant and also irrigate the land using a fuzzy logic controller, although the weed detection part needs improvement. Jiang et al. (2020) used a convolutional neural network to identify weeds; it has yet to be extended to a wider range of soils, locations and image acquisition heights.

The existing methods considered only biological morphology, spectral features, visual textures and spatial context, and did not consider generic image features. Also, few of these systems considered blurred and rotated images. The solution proposed in this paper adopts feature extraction techniques such as HOG and SURF, which use generic features for a more generalized approach. The HOG technique also captures the information-rich edge and corner features. This work aims to solve the problems discussed in the literature survey by proposing a solution that is robust to leaves of different sizes, shapes and colours, using machine learning algorithms such as logistic regression and SVM for classification. Moreover, the usage of two different datasets, including a custom captured dataset, is an attempt to arrive at the best methodology that can be adopted for any exiguous crop–weed combination dataset.

Materials and methods

The main idea behind the project is to achieve real-time weed detection in fields using a robot, through binary classification of the images captured by its camera, i.e. the WDS. Initially, the captured image is subjected to image pre-processing. This is necessary because images should not contain unwanted information that could degrade the performance of the model. Feature extraction techniques are used to acquire the important features from the image, which are then fed to the trained model. Two datasets are used: CWFID, the benchmarking dataset used for comparison with existing works, and a custom dataset generated by the WDS field robot for testing our model. Weedicide is sprayed if the vegetation is not a crop plant; otherwise, the robot moves forward. Figure 1 describes the architecture of the WDS.

Fig. 1 Architecture diagram of WDS

After training, the model is tested. The performance metrics used for evaluation are precision, recall, F1 score and accuracy. The model was tested over several runs, the results recorded, and the performance score of each metric evaluated.

Precision

Precision is the ratio of the number of true positives (TP), i.e. relevant instances, to the sum of TP and false positives (FP), i.e. retrieved instances. It indicates what fraction of the samples predicted as positive are actually positive, and hence how many were wrongly identified as positive instead of negative. Its formula is given by:

$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(1)

Recall

Recall is the ratio of true positives to the sum of true positives and false negatives (FN). Recall is intuitively the ability of the classifier to find all the positive samples, which in turn tells how many of the true positives were actually recalled. Its formula is given by:

$$\mathrm{Recall}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(2)

F1-score

The F1 score can be interpreted as a weighted harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0.

$$\mathrm{F}1\mathrm{ Score}=\frac{2\times \mathrm{Recall}\times \mathrm{Precision}}{\mathrm{Precision}+\mathrm{Recall}}$$
(3)

Accuracy

Accuracy is an important metric for evaluating classification models. Informally, accuracy is the fraction of predictions the model got right. For binary classification, accuracy can also be calculated in terms of positives and negatives as follows:

$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(4)
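
These four metrics follow directly from the confusion-matrix counts. Below is a minimal sketch in Python (the implementation language used later in this work); the example counts are hypothetical:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Eqs. (1)-(4) from confusion-matrix counts."""
    precision = tp / (tp + fp)                          # Eq. (1)
    recall = tp / (tp + fn)                             # Eq. (2)
    f1 = 2 * recall * precision / (precision + recall)  # Eq. (3)
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (4)
    return precision, recall, f1, accuracy

# Hypothetical counts: 50 true positives, 45 true negatives, 9 false
# positives, 10 false negatives -> precision 0.85, recall 0.83, F1 0.84.
print(classification_metrics(tp=50, tn=45, fp=9, fn=10))
```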

Dataset acquisition

Crop/weed field image dataset (CWFID)

The CWFID, a dataset widely used in computer-vision-based precision agriculture applications, especially weed control (Swapnali and Vijay 2014), was used for the project. Sample images containing both crop and weed are shown in Fig. 2. The dataset consists of 60 annotated images. For every image, a ground truth segmentation mask separating the vegetation from the background and a manual annotation of the type of plant (crop vs. weed) are available.

Fig. 2 Sample images from CWFID dataset

Custom dataset

A custom labelled dataset was created for this study, with images captured by a Raspberry Pi 3 night-vision camera module from a height of 45 cm. Amaranthus dubius was taken as the crop, and dwarf copperleaf was taken as the weed specimen. The dataset consisted of 300 images: 150 of crops and 150 of weeds. Sample images of crop and weed are depicted in Fig. 3.

Fig. 3 Sample images from custom dataset: (a) plant, (b) weed

Dataset enhancement

Given below are the pre-processing techniques adopted for the two datasets, i.e. the CWFID and custom-generated datasets, respectively.

CWFID dataset

Every image in the dataset contained a combination of both the crop and the weed, and separate masks were provided for the crops and weeds, as shown in Fig. 4. As part of the pre-processing step, the masks were applied to each image in the dataset to obtain two images, one of the crops and the other of the weeds. Thus, the 60-image dataset described in the "Crop/weed field image dataset (CWFID)" section was used to produce 120 images: 60 crop images and 60 weed images.

Fig. 4 Applying masks on CWFID: (a) annotation of plant (green) and weed (red), (b) masked image of plant, (c) masked image of weed

Custom dataset

The custom-generated 300-image dataset described in the "Custom dataset" section consists of 150 colour (RGB) images each of crop and weed. To facilitate effective separation of the plant parts from the background, a technique called excess green is adopted, which enhances the green scale in the image; background segmentation is then performed using thresholding.

1. Excess green thresholding was applied on the image to introduce effective contrast between the foreground (plant) and the background (soil) by enhancing the greenness of the image:

   • R (red) = image[:, :, 0], G (green) = image[:, :, 1], B (blue) = image[:, :, 2]

   • 2*G − R − B is the excess green function used.

2. Background segmentation was executed using thresholding.

Thresholding involves comparing each pixel value of the image (pixel intensity) to a specified calibrated threshold. This splits all the pixels of the input image into two segments:

i. Pixels with intensity values lower than the calibrated threshold.

ii. Pixels with intensity values greater than the calibrated threshold.

These two segments are then given different values to separate the background from the foreground: here, the background was set to 0 (black) and the foreground to 1 (white), as shown in Fig. 5. The threshold value used to segment this dataset was 80.
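
A minimal sketch of this pre-processing in Python is given below, assuming an RGB-ordered image array as in the channel indexing above (the function name and structure are illustrative, not the authors' exact code):

```python
import numpy as np

def segment_plant(image, threshold=80):
    # Excess green: 2*G - R - B enhances the plant pixels (RGB channel order)
    rgb = image.astype(np.int16)
    r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
    excess_green = np.clip(2 * g - r - b, 0, 255).astype(np.uint8)
    # Thresholding: foreground (plant) = 1 (white), background (soil) = 0 (black)
    return (excess_green > threshold).astype(np.uint8)
```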

Fig. 5 Images at different stages of dataset pre-processing: (a) image before enhancement, (b) after enhancement using excess green thresholding, (c) background-segmented image

Feature extraction

Speeded-up robust features

After pre-processing, the SURF method (Bay and Tuytelaars 2006) was used to extract features from the image. SURF extracts interest points of an image and creates descriptors. The method is widely used for its powerful attributes, which include invariance to lighting and contrast, scale and translation, and rotation. It detects objects in images captured under a variety of intrinsic and extrinsic settings.

The algorithm comprises the following four steps:

i. Integral image generation

ii. Interest point detection (Fast-Hessian detector)

iii. Descriptor orientation assignment (optional)

iv. Descriptor generation

All the ensuing parts of the algorithm use the integral image, resulting in a significant acceleration. Equation (5) defines the integral image. Using an integral image, only four pixel values need to be read from the original image to calculate the pixel sum over a rectangle of any size.

$$ I_{\Sigma}\left( {x,y} \right) = \mathop \sum \limits_{i = 0}^{x} \mathop \sum \limits_{j = 0}^{y} I\left( {i,j} \right) $$
(5)

This fact is exploited when calculating the responses of the Haar wavelet and Gaussian filters.

$$ H\left( {x,y} \right) = \det \begin{pmatrix} \frac{{\partial^{2} f}}{{\partial x^{2} }} & \frac{{\partial^{2} f}}{{\partial x\partial y}} \\ \frac{{\partial^{2} f}}{{\partial x\partial y}} & \frac{{\partial^{2} f}}{{\partial y^{2} }} \end{pmatrix} $$
(6)
$$ H\left( {\overline{x}} \right) = D_{xx} \left( {\overline{x}} \right)D_{yy} \left( {\overline{x}} \right) - \left( {0.9D_{xy} \left( {\overline{x}} \right)} \right)^{2} $$
(7)
$$ \overline{x} = \left( {x,y,s} \right) $$
(8)


The SURF method locates the significant points in the image using the determinant of the Hessian matrix. Equation (6) depicts the original definition of this determinant for a general two-dimensional function. The Fast-Hessian detector modifies this equation in two significant ways:

1. The second-order partial derivatives are replaced by convolutions of the image with approximations of the second-order derivatives of the Gaussian kernel. The coefficient 0.9 in Eq. (7) compensates for this approximation.

2. The Gaussian kernels are parameterized by both position in the image and size.

$$H(x)=H+\frac{{\partial H}^{T}}{\partial x}x+\frac{1}{2}{x}^{T}\frac{{\partial }^{2}H}{{\partial x}^{2}}x$$
(9)
$$\widehat{x}=-{\left(\frac{{\partial }^{2}H}{{\partial x}^{2}}\right)}^{-1}\frac{\partial H}{\partial x}$$
(10)

The parameter s in Eq. (8) is the scale, which represents the scale space. In the SURF algorithm, the representative points are assigned dynamic weights. The weight of each representative point can be calculated using the following equation:

$${\mathrm{W}}_{\mathrm{P}}=\frac{\text{No. of detected images w.r.t. point } p}{\text{No. of training images of the object}}$$
(11)

Clustering SURF features

The number of SURF features obtained from each image is vast. Instead of using all the SURF features for training, similar SURF features were grouped using K-means clustering (Sammut and Webb 2017); samples are depicted in Fig. 6. The number of clusters was decided using the Elbow method (Syakur et al. 2018), resulting in 300 clusters. The whole dataset was split into train and test data in an 80:20 ratio. For each image, a training instance was computed as a collection of 300 attributes, each corresponding to the frequency of occurrence of the respective cluster. Hence, the training and test datasets were prepared.
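
A hedged sketch of this bag-of-visual-words pipeline is shown below. It assumes opencv-contrib-python (SURF is patented and lives in cv2.xfeatures2d; builds without the non-free modules do not expose it), and the function name and parameters are illustrative:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def surf_histograms(images, n_clusters=300):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    per_image = []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, desc = surf.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.empty((0, 64)))
    # Build the visual vocabulary from all descriptors pooled together
    kmeans = KMeans(n_clusters=n_clusters, random_state=0)
    kmeans.fit(np.vstack([d for d in per_image if len(d)]))
    # One n_clusters-bin cluster-frequency vector per image
    hists = np.zeros((len(images), n_clusters))
    for i, desc in enumerate(per_image):
        if len(desc):
            for label in kmeans.predict(desc):
                hists[i, label] += 1
    return hists, kmeans
```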

Fig. 6 SURF features extracted from each class of the two datasets: (a) CWFID—weed, (b) CWFID—plant, (c) custom dataset—weed, (d) custom dataset—plant

Histogram of gradients

The essential idea behind HOG descriptors is that the distribution of intensity gradients or edge directions describes the local object appearance and shape within an image. HOG descriptors are formed by combining local histograms computed over the cells (localized portions) of an image. These histograms capture the gradient directions or edge orientations of the pixels in each cell. To make the histograms invariant to lighting, every cell histogram is contrast-normalized.

To summarize, histogram calculation involves:

1. Calculation of gradients

2. Generation of histograms

3. Normalization of histograms

Calculation of gradients

To begin feature extraction through HOG, the first step is to compute the first-order differential coefficients, Gx(i, j) and Gy(i, j), using the following equations:

$${G}_{x}(i,j)=f(i+1,j)-f(i-1,j)$$
(12)
$${G}_{y}(i,j)=f(i,j+1)-f(i,j-1)$$
(13)

where f(i, j) denotes the intensity at (i, j). Then, the magnitude, m, and orientation, θ, of the gradients are computed with the following formulae, respectively.

$$m(i,j)=\sqrt{{G}_{x}(i,j{)}^{2}+{G}_{y}(i,j{)}^{2}}$$
(14)
$$\uptheta (i,j)=\mathrm{arctan}\left(\frac{{G}_{y}(i,j)}{{G}_{x}(i,j)}\right)$$
(15)
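
A small sketch of Eqs. (12)–(15) in Python (here i indexes rows and j columns; np.arctan2 is used to handle the quadrant and the Gx = 0 case that a plain ratio would not):

```python
import numpy as np

def gradients(f):
    f = np.asarray(f, dtype=float)       # avoid uint8 wraparound
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[1:-1, :] = f[2:, :] - f[:-2, :]   # Gx(i, j) = f(i+1, j) - f(i-1, j)
    gy[:, 1:-1] = f[:, 2:] - f[:, :-2]   # Gy(i, j) = f(i, j+1) - f(i, j-1)
    m = np.sqrt(gx ** 2 + gy ** 2)       # Eq. (14): magnitude
    theta = np.arctan2(gy, gx)           # Eq. (15): orientation
    return m, theta
```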

Generation of histograms

After computing values of magnitude (m) and orientation (θ), histograms are generated as follows:

1. Identify the bin to which θ(i, j) belongs

2. Increment the value of the bin determined in step 1

3. Repeat the above steps for all gradients in the cell

To decrease the effect of aliasing, the magnitude is distributed between the two neighbouring bins. Here, n indicates the bin to which the orientation value θ(i, j) belongs, and n + 1 is the neighbouring bin. The values of \(m_{n}\) and \(m_{n+1}\) are calculated as below:

$$n=\left\lfloor \frac{b\,\uptheta (i,j)}{\pi }\right\rfloor$$
(16)
$${m}_{n}=(1-\alpha )m(i,j)$$
(17)
$${m}_{n+1}=\alpha m(i,j)$$
(18)

where b represents the total number of bins, and α is the parameter for proportional distribution of the magnitude m(i, j), defined by the distance from θ(i, j) to bins n and n + 1:

$$\alpha =\frac{b}{\pi }\left(\uptheta (i,j)\bmod \frac{\pi }{b}\right)$$
(19)
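
For concreteness, a minimal sketch of this bin interpolation (assuming unsigned gradients in [0, π) and an illustrative bin count b = 9):

```python
import numpy as np

def vote_bins(m, theta, b=9):
    # Distribute the magnitude m(i, j) between two neighbouring orientation bins
    n = int(np.floor(b * theta / np.pi)) % b     # Eq. (16)
    alpha = (b / np.pi) * (theta % (np.pi / b))  # Eq. (19)
    votes = np.zeros(b)
    votes[n] += (1 - alpha) * m                  # Eq. (17): m_n
    votes[(n + 1) % b] += alpha * m              # Eq. (18): m_(n+1)
    return votes
```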

Histogram normalization

The final step combines all the local histograms of the cells in a block to form the final histogram. To make the features invariant to illumination and contrast, the L1-norm is adopted. The large combined histogram is then normalized:

$$ v = \frac{{V_{k} }}{{\left| {\left| {V_{k} } \right|} \right| + \varepsilon }} $$
(20)

Here, Vk represents the combined histogram for the block, ε is a small constant, and v is the final HOG feature vector. Figure 7 shows the HOG features extracted from the crop and the weed classes in both of the considered datasets.
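
As a hedged illustration, the whole HOG pipeline is available off the shelf in scikit-image; the cell size, block size and bin count below are illustrative defaults, not the exact settings of this work, and the file name is hypothetical:

```python
from skimage import io
from skimage.color import rgb2gray
from skimage.feature import hog

image = rgb2gray(io.imread("sample_leaf.png"))  # hypothetical input image
features, hog_image = hog(
    image,
    orientations=9,           # number of bins b
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm="L1",          # L1 normalization as in Eq. (20)
    visualize=True,           # also return a visualization like Fig. 7
)
```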

Fig. 7 HOG features extracted from each class of the two datasets: (a) custom dataset—plant, (b) custom dataset—weed, (c) CWFID—plant, (d) CWFID—weed

Training a binary classification model

Logistic regression

Logistic regression (Kleinbaum et al. 2002) was used as the model to classify the images into crop and weed. Logistic regression models the probability that the response variable belongs to the positive class. If this probability is equal to or greater than a discrimination threshold, the positive class is predicted; otherwise, the negative class is predicted. The logistic function given below always returns a value between zero and one:

$$F(t)=\frac{1}{1+{e}^{-t}}$$
(21)

The training set, consisting of 300-length vectors representing the cluster frequencies, was used to train the logistic regression model with an inverse regularization parameter of 0.001 and an l2 penalty.
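
In scikit-learn terms this corresponds roughly to the sketch below, reusing the cluster-frequency histograms from the SURF sketch above (hists and labels are those assumed outputs; 0 = weed, 1 = plant):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# 80:20 train-test split as described earlier
X_train, X_test, y_train, y_test = train_test_split(
    hists, labels, test_size=0.2, random_state=0)

# C is the inverse regularization strength (0.001 per the text), penalty l2
clf = LogisticRegression(C=0.001, penalty="l2", max_iter=1000)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```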

Support vector machine

A support vector machine (Pisner et al. 2020) can be used for regression, classification or other tasks.

In a high- or infinite-dimensional space, it constructs a hyper-plane or a set of hyper-planes. The hyper-planes separate the classes based on the largest distance from the hyper-plane to the closest training data points of the classes (the functional margin). Thus, the larger the margin, the lower the generalization error of the classifier.

Suppose there are training vectors \(x_{i} \in R^{p}\), i = 1,…,n, in two classes, and a vector \(y \in \left\{ {1, - 1} \right\}^{n}\); SVC solves the following primal problem:

$$\min_{\omega ,b,\zeta } \frac{1}{2}\omega^{T} \omega + C\sum\limits_{i = 1}^{n} {\zeta_{i} } ,\;\text{subject to}\;y_{i} \left( {\omega^{T} \varphi \left( {x_{i} } \right) + b} \right) \ge 1 - \zeta_{i} ,\;\zeta_{i} \ge 0,\;i = 1, \ldots ,n$$

Its dual is:

$$\min_{\alpha } \frac{1}{2}\alpha^{T} Q\alpha - e^{T} \alpha ,\;\text{subject to}\;y^{T} \alpha = 0,\;0 \le \alpha_{i} \le C,\;i = 1, \ldots ,n$$

where e is the vector of all ones, C > 0 is the upper bound, and Q is an n × n positive semi-definite matrix with \(Q_{ij} \equiv y_{i} y_{j} K\left( {x_{i} ,x_{j} } \right)\), where \(K\left( {x_{i} ,x_{j} } \right) = \varphi \left( {x_{i} } \right)^{T} \varphi \left( {x_{j} } \right)\) is the kernel.

The function φ implicitly maps the training vectors into a higher-dimensional space.

The decision function is:

$$ {\text{sgn}} \left( {\sum\limits_{i = 1}^{n} {y_{i} \alpha_{i} K\left( {x_{i} ,x} \right) + \rho } } \right) $$
(22)

For the given training set, xi represents the feature vector, which can be the SURF or HOG features extracted in the previous step, and yi ∈ {0, 1}, where 0 represents weed and 1 represents the plant. These feature vectors are used to calculate the hyper-plane. A squared l2 penalty with a regularization parameter is used.
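
A hedged scikit-learn sketch of this classifier follows; the text fixes the penalty but not the kernel, so the RBF kernel and C value here are assumptions (X_train, y_train and the test split are as in the logistic regression sketch above):

```python
from sklearn.svm import SVC

# RBF kernel and C=1.0 are illustrative choices, not the paper's stated settings
svm_clf = SVC(kernel="rbf", C=1.0)
svm_clf.fit(X_train, y_train)   # X_train: SURF histograms or HOG vectors
print(svm_clf.score(X_test, y_test))
```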

Experimental setup

The weed detection model is deployed on a four-wheeled field robot. The robot is 45 cm tall and about 40 cm across. The camera is mounted on top of the robot and is connected to a Raspberry Pi module. The Raspberry Pi module controls the physical movements and classifies the image captured by the camera, as shown in Fig. 8. Based on the prediction of the model, the robot sprays weedicide if weeds are detected in the captured image.

Fig. 8 Block diagram of WDS

A Raspberry Pi 3 night-vision camera module is used to capture the field image in natural lighting. The camera is fixed at a height of 45 cm from the ground. The captured image is then processed by the Raspberry Pi 3 module mounted on the chassis of the robot. The chassis is a 45 cm × 45 cm × 45 cm structure with four wheels (two motors on the back wheels) attached to four legs, and a platform that accommodates the Raspberry Pi module, battery, circuitry and the weedicide tank attached to the spraying mechanism, as shown in Fig. 9 (Table 1).

Fig. 9 Weed detection system field robot

Table 1 Specifications of the components used in the WDS robot

Results and discussion

The WDS, which uses machine learning algorithms and the field robot design, is implemented in Python and trained on both the CWFID and custom datasets. Once the model is trained, it is saved as a .h5 model file and deployed on the Raspbian OS of the hardware setup. The robot was then programmed to move and capture pictures every square foot. The captured image is processed by the Raspberry Pi board, which uses the .h5 model file to decide whether to spray weedicide or not.

The results of the previous work of Haug et al. (2014), which used the Jaccard index for segmentation along with spatial smoothing and interpolation techniques, are taken from their paper and shown in Table 2. The trained model was tested using the testing set, and a classification report with precision, accuracy and recall as the metrics was generated for both the CWFID and custom datasets.

Table 2 Performance results of the previous works

From Table 3, the results show that the SVM model with HOG feature extraction on the custom dataset achieves the best accuracy, 83%. It also has a recall of 0.83, meaning 83% of the actual positives were correctly identified, and a precision of 0.85, meaning 85% of the predicted positives were actually positive. On the other hand, the logistic regression model with SURF feature extraction on the CWFID dataset has the lowest accuracy, 56%. Thus, it is evident that SVM outperforms logistic regression with both feature extraction techniques and on both datasets.

Table 3 Performance results of the proposed methods: SVM with both feature extraction methods (SURF and HOG) outperforms the logistic regression model, as evident from the underlined values

The difficulty of a weed detection task lies in discriminating between weed and crop plant leaves, which often have similar properties. Generally, four categories of features are used: biological morphology, spectral features, visual textures and spatial context. This work tests the more generic feature extraction methods, namely HOG and SURF, for this purpose. In the HOG feature descriptor, the distributions (histograms) of gradient directions (oriented gradients) are used as features. Gradients (x and y derivatives) of an image are useful because the gradient magnitude is large around edges and corners (regions of abrupt intensity change), and edges and corners pack in far more information about object shape than flat regions.

It was observed that, compared to the existing work on the CWFID dataset, the usage of HOG gave a significantly higher accuracy with both SVM and logistic regression. SURF is good at handling blurred and rotated images, but not at handling viewpoint and illumination changes, and its reduced accuracy here could be a consequence of illumination change in the dataset and the size of the leaves in question. In the case of the custom dataset, SURF outperforms HOG with both classification algorithms. The accuracy of SURF and HOG along with SVM improved by 20% compared to the other classification models. This makes these two techniques better alternatives when the exact features to be considered are not known in advance.

Conclusion and future work

In this paper, an efficient automated weed detection system, using feature extraction algorithms and machine learning techniques hosted on a field robot that can work in smart agricultural fields, is developed. The study uses the CWFID dataset and a custom dataset generated by the field robot for the removal of weeds. The system showed different results for the various methods, and combining the feature extraction methods with the machine learning methods helped identify the best method for weed and crop detection. From the experimental results, it is concluded that HOG followed by SVM proves to be the best combination, and the same has been deployed to automate the weed detection system. In future, the field robot can be customized according to the type of crops grown and the size of the cultivable land. Moreover, an app could be developed to manage the entire weed removal process, from recording the locations where herbicide was sprayed to detecting weeds in the field and suggesting suitable counter-measures, thus contributing towards sustainable agricultural growth.