1 Introduction

Agriculture has been the basis of human existence since its inception. Agriculture is India’s main occupation, and India ranks second in agricultural production, growing a wide variety of crops. Modern organic farming has brought more attention to quality and yield. As the number of crops grown increases year on year, so does the number of diseases. Plant diseases can ruin agricultural yields and are a serious problem for food security. Climatic conditions are beyond human control, which is a major setback for farmers and a source of heavy losses. Because of uncontrolled changes in climate, the agriculture sector is attacked by millions of pests. Infections should be detected at an early stage; otherwise, there is a chance of complete crop failure. Symptoms can be seen on different parts of the plant, such as the leaves, stems, and fruits, often as lesions. A leaf shows symptoms by changing color or developing spots. Detecting these symptoms helps in plant disease classification and detection, which leads to better quality and higher plant productivity.

Traditional methods for diagnosing disease require extensive knowledge and experience in the field. Manual observation and pathogen detection are the most reliable methods for diagnosing disease, but they can be costly and time-consuming. Farmers traditionally monitored their crops at regular intervals. If they could not identify the disease symptoms, they would apply some amount of pesticide or fertilizer, which may lead to a reduction in yield.

Failure to identify the disease can lead to incorrect fertilizer applications, which ultimately harm both the plant and the soil. Farmers often resort to pesticides and other expensive methods to avoid these diseases. This approach increases production costs and causes major monetary losses to farmers. Effective disease management begins with early detection.

To improve on the accuracy of traditional leaf disease detection, and to take leaf position into account, we use image processing with a neural network. The proposed approach improves the detection of tomato diseases and can even suggest treatments.

The aim is to develop a leaf recognition algorithm based on specific features extracted from photographs. In this approach, the plant is identified from the properties of its leaves, such as area, using histogram equalization, edge detection, and classification. The algorithm is implemented mainly with OpenCV resources.
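
The two simplest leaf properties mentioned above, histogram equalization and area, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `equalize_histogram` and `leaf_area`, the threshold, and the assumption that leaf pixels are darker than the background are all hypothetical. In practice, OpenCV provides `cv2.equalizeHist` for the equalization step.

```python
def equalize_histogram(gray):
    """Equalize an 8-bit grayscale image given as a list of rows of ints (0-255)."""
    pixels = [p for row in gray for p in row]
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the intensity histogram.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    n = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)
    if n == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in gray]

    # Map each intensity so that the output histogram spans 0..255.
    lut = [round((c - cdf_min) * 255 / (n - cdf_min)) if c > 0 else 0 for c in cdf]
    return [[lut[p] for p in row] for row in gray]


def leaf_area(gray, threshold):
    """Count pixels darker than `threshold` (assumes the leaf is darker than the background)."""
    return sum(1 for row in gray for p in row if p < threshold)


toy = [[10, 20], [30, 40]]          # toy 2 x 2 image, not real leaf data
equalized = equalize_histogram(toy)
area = leaf_area(toy, threshold=25)
```

The area here is a raw pixel count; converting it to a physical area would require a known image scale, which the text does not specify.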

The tomato plant is considered for the experimental study. Compared to other plants, the tomato plant is quite sensitive and requires particular weather conditions to grow. As the price of tomatoes fluctuates from day to day, it is very important to detect diseases early to reduce losses.

Table 1 shows the life span of the tomato plant stage by stage. It is very important to monitor the plant during the period of 30 to 40 days, because there is a high chance of the plants getting infected during this stage. If we monitor the plants correctly during this period, we can reduce the loss of production due to infection.

Table 1 List of leaf images for training

To help farmers, a new method to identify tomato diseases is suggested. Our approach is able to detect tomato diseases more accurately. For experimentation, we use a total of 2511 images from 5 disease classes to train the model. Finally, by comparing the outputs of the ResNet-50 and CNN models, the system gives better accuracy than existing methods. The number of images for each disease is shown in Table 2.

Table 2 Number of leaf images after classification

2 Related Work

The rapid advancement of computers in the last few years has made computer vision and deep learning practical. This has significantly increased the flexibility and accuracy of image recognition. Deep learning can often classify images better than other techniques. With deep learning, features can be extracted directly from the images without manual feature engineering. Deep learning is an effective method of classification in many situations. It generalizes well, especially when complex and specialized features must be extracted.

Aravinth et al. [1] introduced a method to carefully detect and identify brinjal leaf diseases such as Bacterial Wilt, Cercospora Leaf Spot, and Collar Rot. An artificial neural network was used for classification, the k-means clustering algorithm for segmentation, and texture features for feature identification. Kamlapurkar [2] proposed a system that can give more precise results in the classification and identification of disease from an image of a leaf. The system uses steps such as pre-processing, training, and identification, and relies on feature extraction to classify images and diagnose the disease. Zhou et al. [3] restructured the residual dense network to identify tomato leaf diseases. This hybrid deep learning model combines the advantages of dense networks and deep residual networks, which can improve calculation accuracy as well as the flow of information. The authors report the model's performance as a top-1 average identification accuracy.

Ding and colleagues [4] used tomato leaves for their experiments. They used deep learning to extract disease features from the leaf surface, with ResNet-50 as the base network model. Subhajit Maity and colleagues [5] proposed a simple method to detect leaf diseases from images of leaves, based on image processing and segmentation. For identification, they used Otsu’s method and k-means clustering. Ding et al. [4] also used a pixel-wise instance segmentation technique, an improved version of the mask region-based convolutional neural network, to detect cucumber fruits. Other research [6, 7] identifies the disease in stages that include image acquisition, image segmentation, and feature extraction. The extracted features include contrast, energy, homogeneity, mean, standard deviation, and variance.

Saxen and colleagues [8, 9] proposed an easy and quick face alignment method for pre-processing. They also address the problem of estimating facial attributes from RGB images on mobile devices, using MobileNetV2 and NASNet-Mobile, two lightweight CNN architectures.

3 Proposed System Flow

The flow of the proposed system is shown as a block diagram in Fig. 1. First, we take raw images of five different diseases, which come in different sizes. Pre-processing is then applied to bring all images to the same size.

Fig. 1
A diagram depicts the system flow where 5 raw images of different diseases of different sizes undergo the same pre-processing methods to compare the feature and find healthy defects.

System flow diagram

3.1 Data Acquisition

Tomatoes are one of the most widely grown agricultural crops. They are grown extensively in both north and south India. The experiment used 2511 images showing the five most prevalent tomato leaf diseases, including Septoria, bacterial spot, and leaf mold fungus. The data was obtained from Kaggle. Some examples can be seen in Fig. 2.

Fig. 2
A set of twenty four images depicts leaves of different sizes and shapes, arranged in 6 columns and 4 rows.

Dataset

3.2 Pre-processing

The images in the database are pre-processed by reshaping and resizing. The test image undergoes the same processing. Pre-processing improves the image data by suppressing unwanted distortions or enhancing image features that are important for further processing. The resizing of an image is shown in Fig. 3.

Fig. 3
A set of two photographs. A, a leaf with a dimension of 256 by 256. B, its resized version, where the dimension is only 50 by 50.

Image resizing
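
The resizing step in Fig. 3 (256 by 256 down to 50 by 50) can be illustrated with a small nearest-neighbour resampler. This is only a sketch: the text does not say which interpolation method was used, so nearest-neighbour is an assumption, and the helper name `resize_nearest` and the synthetic input are hypothetical. With OpenCV, the same step is normally done with `cv2.resize`.

```python
def resize_nearest(image, new_h, new_w):
    """Resize a 2-D image (list of rows) using nearest-neighbour sampling."""
    old_h, old_w = len(image), len(image[0])
    return [
        # Each output pixel copies the source pixel at the scaled coordinate.
        [image[r * old_h // new_h][c * old_w // new_w] for c in range(new_w)]
        for r in range(new_h)
    ]


# Synthetic 256 x 256 grid standing in for a leaf photograph.
source = [[(r + c) % 256 for c in range(256)] for r in range(256)]
resized = resize_nearest(source, 50, 50)
```

Applying the same function to every training and test image gives the network inputs of one fixed size, which is the purpose of the pre-processing stage.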

3.3 Feature Extraction

Here, we use a convolutional neural network, which acts as a combination of two components: a feature extraction part and a classification part. The feature extraction part uses convolutional layers to extract image features, and classification is then done using a softmax classifier. Initially, the image is converted into pixel format, with values based on RGB. The average of these three values is taken and used as a feature. The kernels then learn the features that identify the disease.
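
Two operations described in this section, averaging the RGB values of each pixel and turning class scores into probabilities with softmax, can be sketched as follows. The function names and the sample scores are illustrative assumptions, not values from the experiments.

```python
import math


def rgb_average(image):
    """Average the R, G and B values of each pixel; `image` is rows of (r, g, b) tuples."""
    return [[sum(pixel) / 3 for pixel in row] for row in image]


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


features = rgb_average([[(30, 60, 90), (0, 0, 0)]])   # toy 1 x 2 image
probabilities = softmax([1.0, 2.0, 3.0])               # hypothetical scores for three classes
predicted_class = probabilities.index(max(probabilities))
```

The predicted class is the index of the largest probability; in a five-class setting, the score list would have five entries.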

3.4 Training and Testing the Model

The dataset has five different plant diseases. Any image can be used as a test image. The CNN is trained on the training dataset so that it can identify test images and determine the disease. The CNN has many layers: dense, dropout, and activation, as well as Convolution2D and MaxPooling2D. These layers are used to extract features or classify the disease. Once trained, the algorithm is able to detect the disease in a plant species. This is done by comparing features from the test image with those learned from the training set. Given a test image, the trained model can detect the disease in the leaf.
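
The convolution and max-pooling layers named above can be illustrated with a minimal forward pass in plain Python. This is a sketch under stated assumptions, not the authors' network: the function names, the hand-written edge kernel, and the toy image are hypothetical, and a real model would use a framework's layers with kernels learned during training.

```python
def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the operation performed by a convolutional layer."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(feature_map):
    """Activation: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows or columns that do not fill a window are dropped."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5 x 5 image with a vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
edge_kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright transitions
pooled = max_pool2d(relu(conv2d(image, edge_kernel)))
```

The pooled map keeps the strong response at the edge while halving the resolution, which is how stacked convolution and pooling layers build compact features for the dense layers that follow.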

The experiment involved classifying and sorting the training photos, which were then placed in the folder corresponding to the disease category name. Compared with the original ResNet-50 network, the ResNet-50 network used here has different activation functions and convolution kernel sizes. For the classification of diseases in an image, we use a convolutional neural network. We implement the CNN using the pre-trained ResNet-50 [10] neural network architecture with TensorFlow and OpenCV in Python.
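
The folder layout described above, one folder per disease category, can be produced with the Python standard library. The sketch below is an assumption about how such sorting might be done: the function name `sort_into_class_folders`, the file names, and the folder names are hypothetical, and the demonstration uses empty placeholder files in a temporary directory.

```python
import shutil
import tempfile
from pathlib import Path


def sort_into_class_folders(labelled_files, dest_root):
    """Copy each file into dest_root/<label>/ and return the number of files per label."""
    counts = {}
    for path, label in labelled_files:
        class_dir = Path(dest_root) / label
        class_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(path, class_dir / Path(path).name)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Demonstration with placeholder files; real use would pass image paths and labels.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw"
    raw.mkdir()
    labelled = []
    for name, label in [("a.jpg", "septoria"), ("b.jpg", "leaf_mold"), ("c.jpg", "septoria")]:
        (raw / name).touch()
        labelled.append((raw / name, label))

    counts = sort_into_class_folders(labelled, Path(tmp) / "train")
    septoria_files = sorted(p.name for p in (Path(tmp) / "train" / "septoria").iterdir())
```

A one-folder-per-class layout is convenient because directory-based data loaders can infer the label of each image from its folder name.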

4 Result Analysis

The experiments are done using the available dataset, which contains images of all five diseases considered. The values obtained for the pre-processed images are shown in Table 3, for all five diseases with five models. The values are recorded against the number of iterations; as the number of iterations increases, the diseases are identified more specifically. The values are plotted as a bar chart in Fig. 4, which shows that the infection is identified only after more iterations.

Table 3 Weights of each disease obtained
Fig. 4
A bar graph of weights versus iteration depicts the difference in weights of each disease. Septoria at iteration 4 has the highest value, 9.9991322, and YLCV at iteration 5 has the lowest, 1.1140285.

Bar graph represents weights of each disease obtained

The weights obtained by both techniques are compared to calculate the efficiency of the proposed model. The efficiency for each infection, for each model, is shown in Table 4. These values are plotted in Fig. 5. It is observed that between the fifth and sixth iterations, the infection is identified specifically, and hence the healthy leaf value comes down.

Table 4 Accuracy values of detected diseases
Fig. 5
A line graph of accuracy versus the number of iterations, for healthy, YLCV, leaf mold, bacterial spot, and septoria virus. Data for healthy increase, peak, and decrease, while the rest exhibit increasing trends.

Accuracy of detected diseases
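
The text does not state how the accuracy values in Table 4 are computed. One common definition, given here only as an assumption, is the fraction of images of each true class that the model labels correctly. The function name `per_class_accuracy` and the labels below are made up for illustration and are not the reported results.

```python
def per_class_accuracy(true_labels, predicted_labels):
    """Fraction of correctly classified samples for each true class."""
    totals, correct = {}, {}
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[truth] = totals.get(truth, 0) + 1
        if truth == prediction:
            correct[truth] = correct.get(truth, 0) + 1
    return {label: correct.get(label, 0) / totals[label] for label in totals}


# Illustrative labels only.
truth = ["healthy", "healthy", "septoria", "septoria", "leaf_mold"]
predicted = ["healthy", "septoria", "septoria", "septoria", "leaf_mold"]
accuracy = per_class_accuracy(truth, predicted)
```

Computing this after every iteration for each class would yield one curve per class, similar in form to the plot in Fig. 5.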

5 Conclusion

The proposed model is evaluated using two methods to extract features and to identify and classify diseases. The experimentation has been done using the existing dataset and a sample dataset created from our own images. The proposed model performs better than existing models because it applies two models. The results obtained are shown in Table 4, where the accuracy is calculated for the different diseases.