Introduction

Agricultural workers on Indian farms remove weeds by spraying herbicides manually. Direct exposure to weedicide can cause a range of health complications in farmers, such as cancer, reproductive disorders, dermatitis, wheezing, coughing and other respiratory problems. Moreover, the presence of chemicals in food has become a major concern today; ingestion of even traces of these chemicals can cause a plethora of diseases among consumers. Of the pesticides in use, 56 are reported to have carcinogenic potential, 81 to have endocrine/hormone-disrupting effects and 38 to have immunotoxic effects (Kumar and Reddy 2017).

It is distressing to note that in India, even breast milk and blood samples are polluted with pesticides from the environment and the biological system. Among samples of tea, milk, fish, egg, spices, meat, cereals, pulses and vegetables collected during 2014–15, 12.5% contained unapproved pesticides, 18.7% contained pesticide residues, and 2.6% had residues above the maximum residue level (MRL) recommended by FSSAI. Some samples contained multiple pesticides. This has driven the worldwide trend towards organic food. However, the high cost of pure organic weedicide and the low yield of organic farming have led to a drastic increase in the prices of organic food.

Moreover, weedicide can contaminate soil, water and other organisms (Aktar et al. 2009). In addition to killing weeds, weedicide can be toxic to a host of other organisms, including birds, fish, beneficial insects and non-target plants. Heavy treatment of soil with synthetic weedicide can cause populations of beneficial soil microorganisms to decline. According to soil scientists, the loss of both bacteria and fungi degrades the soil. Indiscriminate use of chemicals might work for a few years, but after a while the beneficial soil organisms that hold onto the nutrients become non-existent, leaving the soil infertile.

Thus, an efficient way of cultivating food that benefits both producers and consumers while preserving the environment would be a method that strikes a middle ground on the path to sustainable agricultural growth. A robotic system that can efficiently differentiate between a weed and a crop plant can be deployed in the field, eliminating human involvement in the delivery of chemicals. As precision agriculture is adopted, the amount of chemicals used on the crop is heavily reduced, making the produce healthier and cheaper for consumers.

This work proposes an intelligent solution to identify non-crop vegetation using a combination of image feature extraction methods and machine learning algorithms, applied to images captured by a camera mounted on a semi-automated robot. The solution includes an analysis of how two feature extraction techniques, namely speeded-up robust features (SURF) and histogram of gradients (HOG), and two classification algorithms, namely logistic regression and support vector machine (SVM), affect the accuracy and efficiency of the weed detection system (WDS).

Existing systems

Nejati et al. (2008) used the shape of leaves and their veins for classification, applying the fast Fourier transform to distinguish between crop and weed leaves in a corn field in real time, and achieved an accuracy of 92% in detecting weed plants. The method is semi-automatic and gives good results only for big leaves. Astrand and Baerveldt (2005) developed a machine with computer-vision guidance and a selective rotary hoe to remove weeds from the field. This robotically controlled weed detection machine works well when weeds are few and crops are plentiful in the field. Another system, proposed by Tannouche et al. (2015), is an automated robot that uses an artificial neural network (ANN) to detect and classify weed leaves among onion plants. However, the model has multiple disadvantages, and it classifies based on the type of weed instead of the crop plant.

Guyer et al. (1993) used the curvature, area and perimeter of the leaf to identify leaf edge patterns. They combined and structured basic, explicit information into subjective shape knowledge, thereby merging rule-based data with low-level quantitative features that were transformed into high-level quantitative features. Small, low-cost interactive robots have been used for precision farming (Arif and Butt 2014); the robot navigates using a row-following technique, and the digital image processing techniques in its computer vision system distinguish crop from weed, using Hough transformation and greyscale conversion for identification. Other approaches in the literature include shape analysis, colour analysis and texture analysis (Chaisattapagon 1995), where various colour filters can be applied to black-and-white images during the pre-processing stages.

Various works have considered different features such as morphology, spectral properties, visual texture and spatial context. Herrera et al. (2014) discussed the number of boundary pixels in the segmented part of the image. A region-based shape index factor method was developed by Bakhshipour et al. (2018). The average of the third component in the YIQ colour space spectral features is discussed by Sabzi et al. (2018). Campos et al. (2017) used statistical visual texture features such as the autocorrelation of the degree of similarity of the elements in an image. Dynamic programming techniques that use prior knowledge of the geometric structure for crop row detection were developed by García-Santillán et al. (2017, 2018) and Vidović et al. (2016).

Slaughter et al. (2008) reviewed the detection, identification and guidance techniques used in general-purpose robotic systems for weed control. As a next evolution, more robust identification techniques that handle the occlusion problem in differentiating between weed and crop were developed (Tian et al. 2000; Lee et al. 1999). Hague et al. (2000) used different types of sensors, machine vision, accelerometers, odometers and a compass for the navigation of the robot in the farm field. A shape analysis algorithm for differentiating crop and weed was developed by Perez et al. (2000).

Liu and Chal (2018) developed a system to detect common invertebrate pests in farmland using a multi-spectral mechanism. Though the precision score of this system is good, there is still scope for optimizing the robot system. Sabzi et al. (2018) identified three different types of weeds in a potato field using heuristic algorithms: feature selection is first performed with a cultural algorithm, selecting the five most important features, and then the optimal configuration of the network is found using the harmony search algorithm. The limitation is that when the field density is high, the system cannot identify the weed. Zhai et al. (2018) used particle swarm optimization techniques for precise spraying of pesticide.

Chang et al. (2018) developed a small-scale robotic machine that uses multitasking and computer vision to identify weeds and also performs variable-rate irrigation to increase productivity. Image processing techniques such as morphology operators, binary segmentation and HSV (hue, saturation, value) conversion are used. It can classify weed and plant and also irrigate the land using a fuzzy logic controller, although the weed detection part needs improvement. Jiang et al. (2020) used a convolutional neural network to identify weeds; it has yet to be extended to a wider range of soils, locations and image acquisition heights.

The existing methods considered only biological morphology, spectral features, visual textures and spatial context, and did not consider generic image features. Also, few of these systems considered blurred and rotated images. The solution proposed in this paper adopts feature extraction techniques such as HOG and SURF, which use generic features for a more generalized approach. The HOG technique also captures the information-rich edge and corner features. This work aims to solve the problems discussed in the literature survey by proposing a solution that is robust to leaves of different sizes, shapes and colours, using machine learning algorithms such as logistic regression and SVM for classification. Moreover, the usage of two different datasets, including a custom captured dataset, is an attempt to arrive at the best methodology that can be adopted for any exiguous crop–weed combination dataset.

Materials and methods

The main idea behind the project is to achieve real-time weed detection in fields using a robot, through binary classification of the images captured by its camera, i.e. the WDS. Initially, the captured image is subjected to image pre-processing. This is necessary because images should not contain unwanted information that could degrade the performance of the model. Feature extraction techniques are used to acquire the important features from the image, which are then fed to the trained model. Two datasets are used: CWFID, the benchmarking dataset used for comparison with existing works, and a custom dataset generated by the WDS field robot for testing our model. Weedicide is sprayed if the vegetation is not a crop plant; otherwise, the robot moves forward. Figure 1 describes the architecture of the WDS.

Fig. 1 Architecture diagram of WDS

After training, the model is tested. The performance metrics used for evaluation are precision, recall, F1 score and accuracy. The model was tested over several runs, the results recorded, and the performance score of each metric evaluated.

Precision

Precision is the ratio of the number of true positives (TP), i.e. relevant instances, to the sum of TP and false positives (FP), i.e. retrieved instances. It indicates what fraction of the samples predicted as positive are actually positive, and hence how many were wrongly identified as positive instead of negative. Its formula is given by:

$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}$$
(1)

Recall

Recall is the ratio of true positives to the sum of true positives and false negatives (FN). Recall is intuitively the ability of the classifier to find all the positive samples, which in turn tells how many of the true positives were actually recalled. Its formula is given by:

$$\mathrm{Recall}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}$$
(2)

F1-score

The F1 score can be interpreted as a weighted harmonic mean of precision and recall; it reaches its best value at 1 and its worst at 0.

$$\mathrm{F}1\mathrm{ Score}=\frac{2\times \mathrm{Recall}\times \mathrm{Precision}}{\mathrm{Precision}+\mathrm{Recall}}$$
(3)

Accuracy

Accuracy is an important metric for evaluating classification models. Informally, accuracy is the fraction of predictions the model got right. For binary classification, accuracy can also be calculated in terms of positives and negatives as follows:

$$\mathrm{Accuracy}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}}$$
(4)
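
These four metrics follow directly from the confusion-matrix counts. Below is a minimal sketch in Python (the implementation language used later in this work); the example counts are hypothetical:

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the metrics of Eqs. (1)-(4) from confusion-matrix counts."""
    precision = tp / (tp + fp)                          # Eq. (1)
    recall = tp / (tp + fn)                             # Eq. (2)
    f1 = 2 * recall * precision / (precision + recall)  # Eq. (3)
    accuracy = (tp + tn) / (tp + tn + fp + fn)          # Eq. (4)
    return precision, recall, f1, accuracy

# Hypothetical counts: 50 true positives, 45 true negatives, 9 false
# positives, 10 false negatives -> precision 0.85, recall 0.83, F1 0.84.
print(classification_metrics(tp=50, tn=45, fp=9, fn=10))
```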

Dataset acquisition

Crop/weed field image dataset (CWFID)

The CWFID, a dataset widely used in computer-vision-based precision agriculture applications, especially weed control (Swapnali and Vijay 2014), was used for the project. Sample images containing both crop and weed are shown in Fig. 2. The dataset consists of 60 annotated images. For every image, a ground truth segmentation mask separating the vegetation from the background and a manual annotation of the type of plant (crop vs. weed) are available.

Fig. 2 Sample images from CWFID dataset

Custom dataset

A custom labelled dataset was created for this study, with images captured by a Raspberry Pi 3 night-vision camera module from a height of 45 cm. Amaranthus dubius was taken as the crop, and dwarf copperleaf was taken as the weed specimen. The dataset consisted of 300 images: 150 of crops and 150 of weeds. Sample images of crop and weed are depicted in Fig. 3.

Fig. 3 Sample images from custom dataset: (a) plant, (b) weed

Dataset enhancement

Given below are the pre-processing techniques adopted for the two datasets, i.e. the CWFID and custom-generated datasets, respectively.

CWFID dataset

Every image in the dataset contained a combination of both the crop and the weed, and separate masks were provided for the crops and weeds, as shown in Fig. 4. As part of the pre-processing step, the masks were applied to each image in the dataset to obtain two images, one of the crops and the other of the weeds. Thus, the 60-image dataset described in the "Crop/weed field image dataset (CWFID)" section was used to produce 120 images: 60 crop images and 60 weed images.

Fig. 4 Applying masks on CWFID: (a) annotation of plant (green) and weed (red), (b) masked image of plant, (c) masked image of weed

Custom dataset

The custom-generated 300-image dataset described in the "Custom dataset" section consists of 150 colour (RGB) images each of crop and weed. To facilitate effective separation of the plant parts from the background, a technique called excess green is adopted, which enhances the green scale in the image; background segmentation is then performed using thresholding.

1. Excess green thresholding was applied on the image to introduce effective contrast between the foreground (plant) and the background (soil) by enhancing the greenness of the image:

   • R (red) = image[:, :, 0], G (green) = image[:, :, 1], B (blue) = image[:, :, 2]

   • 2*G − R − B is the excess green function used.

2. Background segmentation was executed using thresholding.

Thresholding involves comparing each pixel value of the image (pixel intensity) to a specified calibrated threshold. This splits all the pixels of the input image into two segments:

i. Pixels with intensity values lower than the calibrated threshold.

ii. Pixels with intensity values greater than the calibrated threshold.

These two segments are then given different values to separate the background from the foreground: here, the background was set to 0 (black) and the foreground to 1 (white), as shown in Fig. 5. The threshold value used to segment this dataset was 80.
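
A minimal sketch of this pre-processing in Python is given below, assuming an RGB-ordered image array as in the channel indexing above (the function name and structure are illustrative, not the authors' exact code):

```python
import numpy as np

def segment_plant(image, threshold=80):
    # Excess green: 2*G - R - B enhances the plant pixels (RGB channel order)
    rgb = image.astype(np.int16)
    r, g, b = rgb[:, :, 0], rgb[:, :, 1], rgb[:, :, 2]
    excess_green = np.clip(2 * g - r - b, 0, 255).astype(np.uint8)
    # Thresholding: foreground (plant) = 1 (white), background (soil) = 0 (black)
    return (excess_green > threshold).astype(np.uint8)
```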

Fig. 5 Images at different stages of dataset pre-processing: (a) image before enhancement, (b) after enhancement using excess green thresholding, (c) background-segmented image

Feature extraction

Speeded-up robust features

After pre-processing, the SURF method (Bay and Tuytelaars 2006) was used to extract features from the image. SURF extracts interest points of an image and creates descriptors. The method is widely used for its powerful attributes, which include invariance to lighting and contrast, scale and translation, and rotation. It detects objects in images captured under a variety of intrinsic and extrinsic settings.

The algorithm comprises the following four steps:

i. Integral image generation

ii. Interest point detection (Fast-Hessian detector)

iii. Descriptor orientation assignment (optional)

iv. Descriptor generation

All the ensuing parts of the algorithm use the integral image, resulting in a significant acceleration. Equation (5) defines the integral image. Using an integral image, only four pixel values need to be read from the original image to calculate the pixel sum over a rectangle of any size.

$$ I_{\Sigma}\left( {x,y} \right) = \mathop \sum \limits_{i = 0}^{x} \mathop \sum \limits_{j = 0}^{y} I\left( {i,j} \right) $$
(5)

This fact is exploited when calculating the responses of the Haar wavelet and Gaussian filters.

$$ H\left( {x,y} \right) = \det \begin{pmatrix} \frac{{\partial^{2} f}}{{\partial x^{2} }} & \frac{{\partial^{2} f}}{{\partial x\partial y}} \\ \frac{{\partial^{2} f}}{{\partial x\partial y}} & \frac{{\partial^{2} f}}{{\partial y^{2} }} \end{pmatrix} $$
(6)
$$ H\left( {\overline{x}} \right) = D_{xx} \left( {\overline{x}} \right)D_{yy} \left( {\overline{x}} \right) - \left( {0.9D_{xy} \left( {\overline{x}} \right)} \right)^{2} $$
(7)
$$ \overline{x} = \left( {x,y,s} \right) $$
(8)


The SURF method locates the significant points in the image using the determinant of the Hessian matrix. Equation (6) depicts the original definition of this determinant for a general two-dimensional function. The Fast-Hessian detector modifies this equation in two significant ways:

1. The second-order partial derivatives are replaced by convolutions of the image with approximations of the second-order derivatives of the Gaussian kernel. The coefficient 0.9 in Eq. (7) compensates for this approximation.

2. The Gaussian kernels are parameterized by both position in the image and size.

$$H(x)=H+\frac{{\partial H}^{T}}{\partial x}x+\frac{1}{2}{x}^{T}\frac{{\partial }^{2}H}{{\partial x}^{2}}x$$
(9)
$$\widehat{x}=-{\left(\frac{{\partial }^{2}H}{{\partial x}^{2}}\right)}^{-1}\frac{\partial H}{\partial x}$$
(10)

The parameter s in Eq. (8) is the scale, which represents the scale space. In the SURF algorithm, the representative points are assigned dynamic weights. The weight of each representative point can be calculated using the following equation:

$${\mathrm{W}}_{\mathrm{P}}=\frac{\text{No. of detected images w.r.t. point } p}{\text{No. of training images of the object}}$$
(11)

Clustering SURF features

The number of SURF features obtained from each image is vast. Instead of using all the SURF features for training, similar SURF features were grouped using K-means clustering (Sammut and Webb 2017); samples are depicted in Fig. 6. The number of clusters was decided using the Elbow method (Syakur et al. 2018), resulting in 300 clusters. The whole dataset was split into train and test data in an 80:20 ratio. For each image, a training instance was computed as a collection of 300 attributes, each corresponding to the frequency of occurrence of the respective cluster. Hence, the training and test datasets were prepared.
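
A hedged sketch of this bag-of-visual-words pipeline is shown below. It assumes opencv-contrib-python (SURF is patented and lives in cv2.xfeatures2d; builds without the non-free modules do not expose it), and the function name and parameters are illustrative:

```python
import cv2
import numpy as np
from sklearn.cluster import KMeans

def surf_histograms(images, n_clusters=300):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
    per_image = []
    for img in images:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, desc = surf.detectAndCompute(gray, None)
        per_image.append(desc if desc is not None else np.empty((0, 64)))
    # Build the visual vocabulary from all descriptors pooled together
    kmeans = KMeans(n_clusters=n_clusters, random_state=0)
    kmeans.fit(np.vstack([d for d in per_image if len(d)]))
    # One n_clusters-bin cluster-frequency vector per image
    hists = np.zeros((len(images), n_clusters))
    for i, desc in enumerate(per_image):
        if len(desc):
            for label in kmeans.predict(desc):
                hists[i, label] += 1
    return hists, kmeans
```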

Fig. 6 SURF features extracted from each class of the two datasets: (a) CWFID—weed, (b) CWFID—plant, (c) custom dataset—weed, (d) custom dataset—plant

Histogram of gradients

The essential idea behind HOG descriptors is that the distribution of intensity gradients or edge directions describes the local object appearance and shape within an image. HOG descriptors are formed by combining local histograms computed over the cells (localized portions) of an image. These histograms capture the gradient directions or edge orientations of the pixels in each cell. To make the histograms invariant to lighting, every cell histogram is contrast-normalized.

To summarize, histogram calculation involves:

1. Calculation of gradients

2. Generation of histograms

3. Normalization of histograms

Calculation of gradients

To begin feature extraction through HOG, the first step is to compute the first-order differential coefficients, Gx(i, j) and Gy(i, j), using the following equations:

$${G}_{x}(i,j)=f(i+1,j)-f(i-1,j)$$
(12)
$${G}_{y}(i,j)=f(i,j+1)-f(i,j-1)$$
(13)

where f(i, j) denotes the intensity at (i, j). Then, the magnitude, m, and orientation, θ, of the gradients are computed with the following formulae, respectively.

$$m(i,j)=\sqrt{{G}_{x}(i,j{)}^{2}+{G}_{y}(i,j{)}^{2}}$$
(14)
$$\uptheta (i,j)=\mathrm{arctan}\left(\frac{{G}_{y}(i,j)}{{G}_{x}(i,j)}\right)$$
(15)
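
A small sketch of Eqs. (12)–(15) in Python (here i indexes rows and j columns; np.arctan2 is used to handle the quadrant and the Gx = 0 case that a plain ratio would not):

```python
import numpy as np

def gradients(f):
    f = np.asarray(f, dtype=float)       # avoid uint8 wraparound
    gx = np.zeros_like(f)
    gy = np.zeros_like(f)
    gx[1:-1, :] = f[2:, :] - f[:-2, :]   # Gx(i, j) = f(i+1, j) - f(i-1, j)
    gy[:, 1:-1] = f[:, 2:] - f[:, :-2]   # Gy(i, j) = f(i, j+1) - f(i, j-1)
    m = np.sqrt(gx ** 2 + gy ** 2)       # Eq. (14): magnitude
    theta = np.arctan2(gy, gx)           # Eq. (15): orientation
    return m, theta
```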

Generation of histograms

After computing values of magnitude (m) and orientation (θ), histograms are generated as follows:

1. Identify the bin to which θ(i, j) belongs

2. Increment the value of the bin determined in step 1

3. Repeat the above steps for all gradients in the cell

To decrease the effect of aliasing, the magnitude is distributed between the two neighbouring bins. Here, n indicates the bin to which the orientation value θ(i, j) belongs, and n + 1 is the neighbouring bin. The values of \(m_{n}\) and \(m_{n+1}\) are calculated as below:

$$n=\left\lfloor \frac{b\,\uptheta (i,j)}{\pi }\right\rfloor$$
(16)
$${m}_{n}=(1-\alpha )m(i,j)$$
(17)
$${m}_{n+1}=\alpha m(i,j)$$
(18)

where b represents the total number of bins, and α is the parameter for proportional distribution of the magnitude m(i, j), defined by the distance from θ(i, j) to bins n and n + 1:

$$\alpha =\frac{b}{\pi }\left(\uptheta (i,j)\bmod \frac{\pi }{b}\right)$$
(19)
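
For concreteness, a minimal sketch of this bin interpolation (assuming unsigned gradients in [0, π) and an illustrative bin count b = 9):

```python
import numpy as np

def vote_bins(m, theta, b=9):
    # Distribute the magnitude m(i, j) between two neighbouring orientation bins
    n = int(np.floor(b * theta / np.pi)) % b     # Eq. (16)
    alpha = (b / np.pi) * (theta % (np.pi / b))  # Eq. (19)
    votes = np.zeros(b)
    votes[n] += (1 - alpha) * m                  # Eq. (17): m_n
    votes[(n + 1) % b] += alpha * m              # Eq. (18): m_(n+1)
    return votes
```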

Histogram normalization

The final step combines all the local histograms of the cells in a block to form the final histogram. To make the features invariant to illumination and contrast, the L1-norm is adopted. The large combined histogram is then normalized:

$$ v = \frac{{V_{k} }}{{\left| {\left| {V_{k} } \right|} \right| + \varepsilon }} $$
(20)

Here, Vk represents the combined histogram for the block, ε is a small constant, and v is the final HOG feature vector. Figure 7 shows the HOG features extracted from the crop and the weed classes in both of the considered datasets.
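
As a hedged illustration, the whole HOG pipeline is available off the shelf in scikit-image; the cell size, block size and bin count below are illustrative defaults, not the exact settings of this work, and the file name is hypothetical:

```python
from skimage import io
from skimage.color import rgb2gray
from skimage.feature import hog

image = rgb2gray(io.imread("sample_leaf.png"))  # hypothetical input image
features, hog_image = hog(
    image,
    orientations=9,           # number of bins b
    pixels_per_cell=(8, 8),
    cells_per_block=(2, 2),
    block_norm="L1",          # L1 normalization as in Eq. (20)
    visualize=True,           # also return a visualization like Fig. 7
)
```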

Fig. 7 HOG features extracted from each class of the two datasets: (a) custom dataset—plant, (b) custom dataset—weed, (c) CWFID—plant, (d) CWFID—weed

Training a binary classification model

Logistic regression

Logistic regression (Kleinbaum et al. 2002) was used as the model to classify the images into crop and weed. Logistic regression models the probability that the response variable belongs to the positive class. If this probability is equal to or greater than a discrimination threshold, the positive class is predicted; otherwise, the negative class is predicted. The logistic function given below always returns a value between zero and one:

$$F(t)=\frac{1}{1+{e}^{-t}}$$
(21)

The training set, consisting of 300-length vectors representing the cluster frequencies, was used to train the logistic regression model with an inverse regularization parameter of 0.001 and an l2 penalty.
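
In scikit-learn terms this corresponds roughly to the sketch below, reusing the cluster-frequency histograms from the SURF sketch above (hists and labels are those assumed outputs; 0 = weed, 1 = plant):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# 80:20 train-test split as described earlier
X_train, X_test, y_train, y_test = train_test_split(
    hists, labels, test_size=0.2, random_state=0)

# C is the inverse regularization strength (0.001 per the text), penalty l2
clf = LogisticRegression(C=0.001, penalty="l2", max_iter=1000)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```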

Support vector machine

A support vector machine (Pisner et al. 2020) can be used for regression, classification or other tasks.

In a high- or infinite-dimensional space, it constructs a hyper-plane or a set of hyper-planes. The hyper-planes separate the classes based on the largest distance from the hyper-plane to the closest training data points of the classes (the functional margin). Thus, the larger the margin, the lower the generalization error of the classifier.

Suppose there are training vectors \(x_{i} \in R^{p}\), i = 1,…,n, in two classes, and a vector \(y \in \left\{ {1, - 1} \right\}^{n}\); SVC solves the following primal problem:

$$\min_{\omega ,b,\zeta } \frac{1}{2}\omega^{T} \omega + C\sum\limits_{i = 1}^{n} {\zeta_{i} } ,\;\text{subject to}\;y_{i} \left( {\omega^{T} \varphi \left( {x_{i} } \right) + b} \right) \ge 1 - \zeta_{i} ,\;\zeta_{i} \ge 0,\;i = 1, \ldots ,n$$

Its dual is:

$$\min_{\alpha } \frac{1}{2}\alpha^{T} Q\alpha - e^{T} \alpha ,\;\text{subject to}\;y^{T} \alpha = 0,\;0 \le \alpha_{i} \le C,\;i = 1, \ldots ,n$$

where e is the vector of all ones, C > 0 is the upper bound, and Q is an n × n positive semi-definite matrix with \(Q_{ij} \equiv y_{i} y_{j} K\left( {x_{i} ,x_{j} } \right)\), where \(K\left( {x_{i} ,x_{j} } \right) = \varphi \left( {x_{i} } \right)^{T} \varphi \left( {x_{j} } \right)\) is the kernel.

The function φ implicitly maps the training vectors into a higher-dimensional space.

The decision function is:

$$ {\text{sgn}} \left( {\sum\limits_{i = 1}^{n} {y_{i} \alpha_{i} K\left( {x_{i} ,x} \right) + \rho } } \right) $$
(22)

For the given training set, xi represents the feature vector, which can be the SURF or HOG features extracted in the previous step, and yi ∈ {0, 1}, where 0 represents weed and 1 represents the plant. These feature vectors are used to calculate the hyper-plane. A squared l2 penalty with a regularization parameter is used.
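
A hedged scikit-learn sketch of this classifier follows; the text fixes the penalty but not the kernel, so the RBF kernel and C value here are assumptions (X_train, y_train and the test split are as in the logistic regression sketch above):

```python
from sklearn.svm import SVC

# RBF kernel and C=1.0 are illustrative choices, not the paper's stated settings
svm_clf = SVC(kernel="rbf", C=1.0)
svm_clf.fit(X_train, y_train)   # X_train: SURF histograms or HOG vectors
print(svm_clf.score(X_test, y_test))
```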

Experimental setup

The weed detection model is deployed on a four-wheeled field robot. The robot is 45 cm tall and about 40 cm across. The camera is mounted on top of the robot and is connected to a Raspberry Pi module. The Raspberry Pi module controls the physical movements and classifies the image captured by the camera, as shown in Fig. 8. Based on the prediction of the model, the robot sprays weedicide if weeds are detected in the captured image.

Fig. 8 Block diagram of WDS

A Raspberry Pi 3 night-vision camera module is used to capture the field image in natural lighting. The camera is fixed at a height of 45 cm from the ground. The captured image is then processed by the Raspberry Pi 3 module mounted on the chassis of the robot. The chassis is a 45 cm × 45 cm × 45 cm structure with four wheels (two motors on the back wheels) attached to four legs, and a platform that accommodates the Raspberry Pi module, battery, circuitry and the weedicide tank attached to the spraying mechanism, as shown in Fig. 9 (Table 1).

Fig. 9 Weed detection system field robot

Table 1 Specifications of the components used in the WDS robot

Results and discussion

The WDS, which uses machine learning algorithms and the field robot design, is implemented in Python and trained on both the CWFID and custom datasets. Once the model is trained, it is saved as a .h5 model file and deployed on the Raspbian OS of the hardware setup. The robot was then programmed to move and capture pictures every square foot. The captured image is processed by the Raspberry Pi board, which uses the .h5 model file to decide whether to spray weedicide or not.

The results of the previous work of Haug et al. (2014), which used the Jaccard index for segmentation along with spatial smoothing and interpolation techniques, are taken from their paper and shown in Table 2. The trained model was tested using the testing set, and a classification report with precision, accuracy and recall as the metrics was generated for both the CWFID and custom datasets.

Table 2 Performance results of the previous works

From Table 3, the results show that the SVM model with HOG feature extraction on the custom dataset achieves the best accuracy, 83%. It also has a recall of 0.83, meaning 83% of the actual positives were correctly identified, and a precision of 0.85, meaning 85% of the predicted positives were actually positive. On the other hand, the logistic regression model with SURF feature extraction on the CWFID dataset has the lowest accuracy, 56%. Thus, it is evident that SVM outperforms logistic regression with both feature extraction techniques and on both datasets.

Table 3 Performance results of the proposed methods: SVM with both feature extraction methods (SURF and HOG) outperforms the logistic regression model, as evident from the underlined values

The difficulty of a weed detection task lies in discriminating between weed and crop plant leaves, which often have similar properties. Generally, four categories of features are used: biological morphology, spectral features, visual textures and spatial context. This work tests the more generic feature extraction methods, namely HOG and SURF, for this purpose. In the HOG feature descriptor, the distributions (histograms) of gradient directions (oriented gradients) are used as features. Gradients (x and y derivatives) of an image are useful because the gradient magnitude is large around edges and corners (regions of abrupt intensity change), and edges and corners pack in far more information about object shape than flat regions.

It was observed that, compared to the existing work on the CWFID dataset, the usage of HOG gave a significantly higher accuracy with both SVM and logistic regression. SURF is good at handling blurred and rotated images, but not at handling viewpoint and illumination changes, and its reduced accuracy here could be a consequence of illumination change in the dataset and the size of the leaves in question. In the case of the custom dataset, SURF outperforms HOG with both classification algorithms. The accuracy of SURF and HOG along with SVM improved by 20% compared to the other classification models. This makes these two techniques better alternatives when the exact features to be considered are not known in advance.

Conclusion and future work

In this paper, an efficient automated weed detection system, using feature extraction algorithms and machine learning techniques hosted on a field robot that can work in smart agricultural fields, is developed. The study uses the CWFID dataset and a custom dataset generated by the field robot for the removal of weeds. The system showed different results for the various methods, and combining the feature extraction methods with the machine learning methods helped identify the best method for weed and crop detection. From the experimental results, it is concluded that HOG followed by SVM proves to be the best combination, and the same has been deployed to automate the weed detection system. In future, the field robot can be customized according to the type of crops grown and the size of the cultivable land. Moreover, an app could be developed to manage the entire weed removal process, from recording the locations where herbicide was sprayed to detecting weeds in the field and suggesting suitable counter-measures, thus contributing towards sustainable agricultural growth.