Introduction

The Indian economy relies heavily on agricultural productivity. Plant disease management plays an important role in agriculture in ensuring high productivity and yield quality. To limit the spread of disease, plants in the field must be monitored constantly. In conventional farm management practice, this task is performed by human experts, which is error-prone, time-consuming, and labour-intensive. To overcome these challenges, modern smart farm management practice (Chung et al., 2016; Xiao et al., 2018; Zhang et al., 2018; Rahman et al., 2020) processes images of the infected portions of plants and analyses the disease pattern with vision technology (Chen et al., 2002). In these farm management systems, decisions to control an epidemic are based on patterns of visual symptoms analysed with machine learning technology (Alpaydin, 2020). Visual symptoms generally appear several days after infection; by that time the disease has already spread and the quality of the yield has deteriorated, causing a major loss in productivity. Vision technology therefore cannot predict the disease before a lesion appears, and so fails to achieve early diagnosis while the plants are still in the incubation period before the onset of the disease. This limitation motivates researchers to explore the thermal features of infected plants (Vadivambal & Jayas, 2011) for disease prediction. In general, every disease causes two types of changes in infected plants: internal and physical. Physical changes appear only after a certain period, when the disease has already spread over the plant, whereas internal chemical changes appear just after inoculation. These internal changes cause temperature variations over the infected plants that are invisible to the naked eye (Chen & Shakhnovich, 2010). Thermal imaging technology is highly sensitive to this type of temperature variation because it captures the infrared radiation from the surface of an object in the 7,500 to 14,000 nm wavelength range (Battalwar et al., 2015), whereas visual imaging technology cannot capture this variation because it works in the 380 to 700 nm range. Thus, a thermal image provides finer detail about the object. If the pattern of these thermal changes can be analysed, early prediction of the diseases becomes possible.

Thermal imaging, or thermography, is a rapidly growing field of technology that improves object visibility by detecting the infrared radiation emitted by an object and forming an image from it (Vollmer & Möllmann, 2017). In recent years, thermal imaging has been applied successfully in various research fields such as avian science, optics, microsystems, medical science, wildlife studies, forestry, the food and agriculture industry, plant physiology, ecophysiology, plant water stress measurement (Gull et al., 2019; Siddiqui et al., 2019), and canopy temperature management (Vadivambal & Jayas, 2011; Vollmer & Möllmann, 2017; Manickavasagan et al., 2005), owing to its high sensitivity, extensive range and dynamic detection capability. Bhakta et al. (2018) analysed the temperature variations over healthy and diseased leaves by manually extracting features from thermal images, and showed that leaf temperature decreased gradually as the severity of the disease increased.

The research findings from the literature on plant disease identification and classification using machine learning with thermal imaging are summarized in Table 1 according to the type of problem, crop type, output class, input image type, features for classification, learning algorithm, and achieved accuracy (Oerke et al., 2011; Calderón et al., 2015; Prince et al., 2015; Sanchez et al., 2015; Bayoumi & Abdullah, 2016; Omran et al., 2017; Banerjee et al., 2018; Zhu et al., 2018; Hornero et al., 2021; Poblete et al., 2021).

Table 1 Comparative analysis of the reviewed papers

The following observations emerge from the reviewed literature:

  1. Very few studies have used near-field thermal image processing to detect plant diseases.

  2. Thermal images can capture the temperature variation caused by internal changes in a diseased plant. This information helps in predicting diseases before any visual symptoms appear.

  3. Thermal imagery combined with deep learning techniques opens a new research direction in plant disease prediction. The wide and dynamic range of features in thermal imagery, together with the strength of deep learning techniques, encourages researchers to use them together in plant disease prediction models.

  4. The literature also shows that the CNN is well suited to image-based plant disease analysis.

Motivated by this wide range of successful applications of thermography, this paper focuses on the prediction of plant diseases using thermal image-based data analytics and machine learning technology.

Rice is a staple food in India. Many diseases, such as Bacterial Leaf Blight (Xanthomonas oryzae), Blast (Pyricularia grisea), Brown Spot (Helminthosporium oryzae), Sheath Rot (Sarocladium oryzae), and Sheath Blight (Rhizoctonia solani) (Elazegui, 2003), can reduce productivity in susceptible rice varieties. These diseases are particularly damaging to rice because they spread very fast. Hence, to control the rapid spread of rice diseases, a prediction model is proposed in this paper. Among the diseases of the rice plant, Bacterial Leaf Blight (BLB) is taken as a case study for the model because it is very common in the Indian environment. Because data collection is complex and costly, the collected dataset is small compared to what standard deep learning models require. To address the limited dataset, a novel Convolutional Neural Network (CNN) with three convolutional layers is proposed, which achieves better accuracy in disease prediction.

The objective of the proposed model is not only to predict the disease before any visual symptoms appear but also to analyse how early the disease can be predicted from the thermal symptoms. This study thus contributes to precision agriculture research by developing a model that predicts plant disease at a very early stage based on thermal symptoms.

Materials and methods

A novel prediction model for BLB disease in rice plants is proposed in this study using thermal imaging and deep learning techniques. The proposed model is developed through the following pipeline of stages: study area design, data collection, hypothesis and data analysis plan, data pre-processing, and prediction model design, training and evaluation. The general architecture of this workflow is depicted in Fig. 1.

Fig. 1 General architecture of the proposed system

Study area design

The study was carried out at the Rice Research Institute, West Bengal, India, under the supervision of agricultural scientists during February and March 2018. A rectangular field of size 6 m × 3 m (Fig. 2) was prepared to cultivate IR24, a rice variety susceptible to BLB. The field was divided into 5 subplots (S1, S2, S3, S4, and S5), with 120 seedlings planted on 14/02/2018 in a 10 × 12 matrix in each subplot (600 seedlings in total). The rows and columns of each subplot were marked R1 to R10 and C1 to C12, respectively, to identify each plant uniquely within a subplot. Sufficient water and nutrients were then applied from time to time, on the advice of the agricultural scientists, to ensure proper growth of the plants.

Fig. 2 Original rice field considered as the study area at the Rice Research Institute, Chinsurah, Hooghly, West Bengal

Once the plants reached the tillering stage, sub-plots S1 to S4 were used for inoculation and sub-plot S5 served as the control. The plants in rows R3 to R8 of sub-plots S1 to S4 were inoculated with bacterial suspension on 08/03/2018. In total, 72 × 4 = 288 plants were inoculated once, and no further intervention was made on these plants. The remaining 312 plants were not inoculated. Images of the selected leaves of the 288 inoculated plants were captured at three stages after inoculation, with a one-day gap between successive capture days, to follow the progression of the disease.

Data collection

The thermal and visual images of the leaves selected for inoculation (288) were captured using a FLIR C2 camera with a thermal sensitivity of less than 0.10 ℃, a field of view of 41° × 31°, an instantaneous field of view of 8.94483 mrad, and an infrared image resolution of 80 × 60 (4,800 measurement pixels). Other measurement parameters were set with the FLIR Tools software to make the acquired images uniform: emissivity (0.95), reflectance temperature (30 ℃) and relative humidity (50%).

Images of the selected leaves of the 288 plants were captured before inoculation on 07/03/2018 and labelled “Normal”. The bacterial suspension was inoculated on 08/03/2018. After inoculation, images of the inoculated leaves were captured on 10/03/2018 and labelled “Stage1” infection. The next images were captured on 12/03/2018 and labelled “Stage2” infection. By 14/03/2018 visual symptoms had appeared in most of the inoculated leaves, so the images captured on that day were labelled “Stage3” infection. Four sets of images were collected in this way. Figure 3 shows samples of the labelled images.

Fig. 3 Collected thermal images of the rice leaves before and after pre-processing: (a) RGB normal leaf, (b) RGB infected leaf at Stage1, (c) RGB infected leaf at Stage2, (d) RGB infected leaf at Stage3, (e) thermal normal leaf, (f) thermal infected leaf at Stage1, (g) thermal infected leaf at Stage2, (h) thermal infected leaf at Stage3, (i) pre-processed normal leaf, (j) pre-processed infected leaf at Stage1, (k) pre-processed infected leaf at Stage2, (l) pre-processed infected leaf at Stage3

All images were collected manually with the thermal camera from an average distance of 20 cm, with the selected leaf placed on a black cardboard to eliminate the complex background. The images were captured between 7 a.m. and 10 a.m. The atmospheric temperature in the data collection area at that time ranged from 25 ℃ to 35 ℃. After the images were collected, it was observed that 261 of the 288 inoculated plants had developed visual symptoms of BLB. From the images of these 261 infected plants, a final dataset of four classes (Normal, Stage1, Stage2, Stage3) with a total of 261 × 4 = 1044 images was prepared for the study.
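As a rough consistency check on the camera figures quoted above, the instantaneous field of view (IFOV) can be approximated by dividing the horizontal field of view by the number of horizontal pixels, and the area covered by one thermal pixel on the leaf follows from the capture distance. The sketch below is illustrative only: the function names are hypothetical, and the use of the horizontal axis and the small-angle approximation are assumptions, not the manufacturer's definition.

```python
import math


def ifov_mrad(fov_degrees, pixels):
    """Approximate per-pixel angular resolution in milliradians."""
    return math.radians(fov_degrees / pixels) * 1000.0


def pixel_footprint_mm(ifov_in_mrad, distance_cm):
    """Approximate side length (mm) of the area seen by one pixel.

    Small-angle approximation: footprint = distance * angle (in radians).
    """
    distance_mm = distance_cm * 10.0
    return distance_mm * ifov_in_mrad / 1000.0


# 41 degree horizontal field of view spread over 80 horizontal pixels.
horizontal_ifov = ifov_mrad(41.0, 80)
# 20 cm is the average capture distance reported for the data collection.
footprint = pixel_footprint_mm(horizontal_ifov, 20.0)
print(f"IFOV ~ {horizontal_ifov:.3f} mrad, footprint ~ {footprint:.2f} mm per pixel")
```

Under these assumptions the horizontal IFOV comes out close to the quoted 8.94483 mrad, and each thermal pixel covers a little under 2 mm of leaf surface at 20 cm.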

Hypothesis and data analysis plan

Bacterial suspension was inoculated into a total of 288 plants. Of these, 261 plants developed visual symptoms of BLB one week after inoculation, as confirmed by the agricultural scientists. The images of the 288 plant leaves taken before inoculation are labelled “Normal”. After inoculation, images are collected at three regular intervals and marked as Stage1, Stage2, and Stage3. The Stage1 images have no visual symptoms, the Stage2 images have very mild symptoms, and the Stage3 images show the visual symptoms of the fully developed disease. The study treats the Stage3 images as the gold standard because visual symptoms are present at this stage. Based on this stage, the 261 leaves infected by the disease are identified and confirmed by the agricultural scientists. The corresponding images of these 261 leaves at Stage1 and Stage2 are considered for further processing. Stage1, Stage2 and Stage3 images are each compared with the Normal images to show that the prediction accuracy and precision of the proposed model improve as the disease progresses. The main objective of the study is to predict the disease with minimum error at Stage1, when there are no visual symptoms. The hypotheses for the study are therefore:

  1. Null Hypothesis: Thermal imaging method is not effective to identify diseased leaves at Stage1.

     Alternate Hypothesis: Thermal imaging method is effective to identify diseased leaves at Stage1.

  2. Null Hypothesis: Thermal imaging method is not effective to identify diseased leaves at Stage2.

     Alternate Hypothesis: Thermal imaging method is effective to identify diseased leaves at Stage2.

  3. Null Hypothesis: Thermal imaging method is not effective to identify diseased leaves at Stage3.

     Alternate Hypothesis: Thermal imaging method is effective to identify diseased leaves at Stage3.

The study used McNemar’s test (Lachenbruch, 2014) for hypothesis testing. This test is used for paired nominal data. It compares the sensitivity and specificity of two analytical tests on the same group of data.
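To make the test concrete, the following minimal sketch computes McNemar's chi-square statistic from the two discordant counts of a paired 2 × 2 table and converts it to a p-value using the chi-square distribution with one degree of freedom. The counts `b` and `c` are hypothetical, and the use of the continuity correction is an assumption; the text does not state which variant of the test was applied.

```python
import math


def mcnemar_statistic(b, c, continuity_correction=True):
    """McNemar chi-square statistic from the discordant counts b and c."""
    if b + c == 0:
        raise ValueError("McNemar's test is undefined without discordant pairs")
    diff = abs(b - c)
    if continuity_correction:
        # Edwards' correction; an assumption, not confirmed by the study.
        diff = max(diff - 1, 0)
    return diff ** 2 / (b + c)


def chi2_sf_1dof(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical discordant counts, for illustration only.
b, c = 20, 6
stat = mcnemar_statistic(b, c)
p_value = chi2_sf_1dof(stat)
print(f"statistic = {stat:.2f}, p-value = {p_value:.3f}")
```

With these illustrative counts the statistic is 6.5 and the p-value is about 0.011, so the null hypothesis would be rejected at the 0.05 level.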

Data pre-processing

The main aim of pre-processing is to improve the quality of the input data for further analysis. Because the data for this study were acquired in an open environment, they may be affected by various sources of noise, such as the background, atmospheric or environmental temperature, direct solar radiation, luminance and wind.

In this study, to minimize background interference, the images of the rice leaves were collected by placing the leaves on a black cardboard. The varying atmospheric temperature range was standardized with the FLIR software to 25 ℃ to 30 ℃ for the whole dataset. Environmental temperature and direct solar radiation influence the thermal readings, and the same plant would show a different temperature profile under a different environmental temperature and sun direction (Kim et al., 2018). To control these effects, the data were collected at a particular time of day and over a short interval, so that the change in climatic conditions was minimal. In addition, the emissivity and reflectance temperature of the thermal camera were set to 0.95 and 30 ℃, respectively. Luminance is less important for this study because thermal images are used. The effect of wind is also not considered, as the data were acquired with the leaf fixed on the black cardboard.

Previous studies (Phadikar et al., 2008; Phadikar et al., 2012; Tete et al., 2017) show that thresholding is a simple and common way to extract the region of interest from acquired images. For this reason, only the infected leaf area was extracted from the acquired images, using thresholding applied to the Hue channel of the HSV representation of the images with a threshold value of 179. The images after extraction of the infected leaf area are shown in Fig. 3. These images were then resized to 256 × 256 with the standard function in the Keras image pre-processing library to make them suitable for the proposed prediction model.

Proposed model

An artificial neural network (White, 1992) is a part of artificial intelligence (Russell & Norvig, 2016) that mimics the functionality of a biological neural network to identify an object. It can build a model by observing existing patterns in data through supervised learning. Deep learning is an evolution of the traditional neural network that uses a large number of hidden layers during learning for better accuracy (Shanmugamani, 2018). A CNN (LeCun et al., 2010) is a type of deep neural network that learns filters with different weights and biases to capture patterns in the input. Hence, it can recognize various patterns from the raw input image without any pre-processing, through its self-learning ability. It has advantages over a classical neural network because it captures the spatial dependencies among image pixels with fewer parameters and reused weights, and with better accuracy. In visual imagery, CNN architectures such as AlexNet (Krizhevsky et al., 2017), AlexNetOWTBn (Krizhevsky et al., 2017), GoogLeNet (Szegedy et al., 2015), Overfeat (Sermanet et al., 2013) and VGG (Simonyan & Zisserman, 2014) have been used successfully for complex image recognition in many cases. Hence, a CNN is used in this paper to build the prediction model for plant diseases.

Architecture of the proposed model

The standard deep learning architectures available in the literature are mainly built for large datasets and have a large number of hidden layers. In this study, the dataset is small by comparison. The literature (Shanmugamani, 2018; Barz & Denzler, 2020; Zhang & Ling, 2018) suggests two possible ways to deal with a small dataset:

a) Training a user defined new convolutional neural network from scratch.

b) Using a standard model with transfer learning strategy.

Due to the high computational requirement of the second option, the first option is considered in this study.

Standard deep learning models generally consist of a large number of hidden layers. If transfer learning is applied to these models, a high-end computer is required for training, which is a disadvantage for researchers with limited resources. With these considerations in mind, this paper proposes a CNN architecture for small datasets, determined empirically, that comprises three convolutional layers (Conv1, Conv2 and Conv3) and two fully connected layers (FC1, FC2). The details of this architecture are shown in Table 2.

Table 2 Proposed Deep Learning Model Parameter Details

A convolutional layer captures the different patterns present in the input image by applying filters through the convolution operation. Each filter detects a feature in the input image and generates a feature map for that specific pattern, so each feature map represents a particular characteristic of the input image. The output of the convolutional layer is fed forward to rectified linear units (ReLUs) to generate the rectified activation map and introduce non-linearity into the CNN model (LeCun et al., 2010). The output of this unit is fed forward to the max pooling layer to reduce the feature space. The outputs of the last convolutional layer (Conv3) are fed to the first fully connected layer (FC1). The output of FC1 is fed to the final output layer (FC2), which uses the softmax activation function (LeCun et al., 2010) to produce a distribution over two class labels. Moreover, the small number of training samples may cause overfitting. To overcome this issue, a small dropout regularisation (Srivastava et al., 2014) is introduced in the proposed network after the first fully connected layer. This technique randomly deactivates some neurons in the network to prevent overfitting (Sermanet et al., 2013).
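The building blocks described above can be illustrated with a small, dependency-free sketch of ReLU, 2 × 2 max pooling, softmax and dropout. It is not the Keras implementation used in the study: the toy feature map, the two logits, the dropout rate of 0.25 and all function names are hypothetical values chosen for illustration.

```python
import math
import random


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0.0, v) for v in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    rows = len(feature_map) // 2 * 2
    cols = len(feature_map[0]) // 2 * 2
    return [
        [max(feature_map[r][c], feature_map[r][c + 1],
             feature_map[r + 1][c], feature_map[r + 1][c + 1])
         for c in range(0, cols, 2)]
        for r in range(0, rows, 2)
    ]


def softmax(logits):
    """Turn raw scores into a probability distribution over the classes."""
    shift = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropout(activations, rate, rng):
    """Training-time inverted dropout: zero some units, rescale the rest."""
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


# Toy 4x4 feature map standing in for the output of a convolutional layer.
feature_map = [[-1.0, 2.0, 0.5, -3.0],
               [4.0, -2.0, 1.5, 0.0],
               [0.0, 1.0, -1.0, -2.0],
               [3.0, -4.0, 2.5, 6.0]]
pooled = max_pool_2x2(relu(feature_map))
probabilities = softmax([2.0, 0.5])  # two classes, e.g. Normal vs. Stage1
dropped = dropout([1.0, 2.0, 3.0, 4.0], rate=0.25, rng=random.Random(0))
print(pooled, probabilities, dropped)
```

Pooling halves each spatial dimension of the feature map, and the softmax output always sums to one, which is why it can be read as a distribution over the two class labels.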

There are more than twenty million parameters and 7,928,740 neurons in the architecture. The architecture was implemented in Python with Keras, using TensorFlow as the backend. The machine has an NVIDIA® GTX 1660 Ti GPU and the CUDA® parallel programming environment. The proposed architecture is depicted graphically in Fig. 4.
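To show where the parameters of such a network come from, the sketch below counts trainable parameters for a three-convolution, two-dense-layer CNN on a 256 × 256 × 3 input. The filter counts (32, 64, 128), the 3 × 3 kernels, the unpadded convolutions, the 2 × 2 pooling and the 128-unit FC1 are all hypothetical and are not the values of Table 2, so the total is not expected to match the figure quoted above.

```python
def conv_output_size(size, kernel, stride=1):
    """Spatial size after an unpadded ('valid') convolution."""
    return (size - kernel) // stride + 1


def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


# Hypothetical configuration, NOT taken from the study.
size, channels = 256, 3
total = 0
for filters in (32, 64, 128):
    total += conv_params(3, channels, filters)
    size = conv_output_size(size, 3) // 2  # 3x3 convolution, then 2x2 max pooling
    channels = filters

flattened = size * size * channels
fc1 = dense_params(flattened, 128)
fc2 = dense_params(128, 2)
total += fc1 + fc2
print(f"flattened inputs = {flattened}, FC1 params = {fc1}, total = {total}")
```

In this illustrative configuration the first fully connected layer holds almost all of the parameters, which is typical when a large flattened feature map feeds a dense layer.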

Fig. 4 The proposed deep learning architecture

Parameter initialization and CNN training

An important part of deep neural network training is parameter setting. Two kinds of hyperparameters directly affect the training of a deep neural model: optimizer hyperparameters and model-specific hyperparameters. The proposed model uses the Adam optimizer with the batch size, decay rate, and number of epochs set to 32, 0.9, and 100, respectively. All layers use the same learning rate of 0.001. The parameter details are given in Table 3.
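For readers unfamiliar with Adam, the following sketch applies the standard Adam update rule to a single scalar parameter with the learning rate of 0.001 quoted above. Interpreting the reported decay rate of 0.9 as the first-moment coefficient `beta1` is an assumption, as are the values `beta2 = 0.999` and `eps = 1e-8`, which are commonly used defaults and are not stated in the text. The toy objective is hypothetical.

```python
import math


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter at step t (t starts at 1)."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy problem: minimise f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = 1.0, 0.0, 0.0
theta_after_first, m, v = adam_step(theta, 2.0 * theta, m, v, t=1)
theta = theta_after_first
for t in range(2, 101):
    theta, m, v = adam_step(theta, 2.0 * theta, m, v, t)
print(theta_after_first, theta)
```

Because of the bias correction, the first step moves the parameter by almost exactly the learning rate regardless of the gradient magnitude, which is one reason Adam is relatively insensitive to gradient scale.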

Table 3 Parameter Initialization

The model was trained in three phases with images belonging to four classes (Normal, Stage1, Stage2 and Stage3). There are 261 images in each of the four classes. In the first phase, Normal vs. Stage1 images are used for training, followed by Normal vs. Stage2 and finally Normal vs. Stage3. Of the three stages, only Stage1 contains no visual symptoms of the disease; Stage2 contains mild symptoms, and Stage3 contains the visual symptoms of the fully developed disease. Because Stage1, Stage2 and Stage3 contain images of the same leaves at three different stages of the disease, the performance measured in the three phases provides further evidence of the efficiency, robustness, consistency and validity of the model in predicting the disease at the earliest stage. A 5-fold cross-validation method was used to split the dataset before training.
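The 5-fold split for one training phase can be sketched as follows, using image indices in place of the actual images. Whether the study stratified the folds by class is not stated, so the stratification, the random seed and the function names below are assumptions made for illustration.

```python
import random


def stratified_k_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds, keeping classes balanced per fold."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def train_test_splits(folds):
    """Yield (train_indices, test_indices), holding out each fold once."""
    for i, test_fold in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, list(test_fold)


# Phase 1 of training: 261 Normal and 261 Stage1 images.
labels = ["Normal"] * 261 + ["Stage1"] * 261
folds = stratified_k_folds(labels, k=5, seed=42)
for train, test in train_test_splits(folds):
    print(len(train), len(test))
```

Each image appears in exactly one held-out fold, so the accuracy averaged over the five folds is computed on every image once.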

Results and discussion

A dataset of 1044 images in four classes (Normal, Stage1, Stage2 and Stage3) was prepared for the experiment following the procedure described in the data collection section. The ‘Stage1’, ‘Stage2’ and ‘Stage3’ images were then each compared with the ‘Normal’ images to predict the disease as early as possible.

Fig. 5 Comparison among the proposed method, standard Resnet50, Resnet101, VGG16, VGG19, SVM and LR

The accuracies achieved at the three stages using 5-fold cross-validation are 95.31% ≈ 95%, 96.77% ≈ 97% and 98.39% ≈ 98%, respectively, as shown in Fig. 5. The highest accuracy (98%) at Stage3 is expected, as thermal changes have already become prominent by this stage, resulting in the appearance of visual symptoms on the infected leaves. The gradual increase in accuracy with disease progression supports the validity of the proposed model for plant disease prediction. The experimental results show that 95% accuracy is achieved on the Stage1 thermal images, which is notable given that visual symptoms had not yet appeared at this phase. This demonstrates the efficacy of the system in predicting the disease well before any visual symptoms appear. It also shows that the proposed model can predict the disease within 48 h of inoculation. This is a significant improvement, considering that it normally takes at least 7 days for a human to detect the disease from visual symptoms. The disease-induced changes in the thermal images increase gradually with time. The three stages are separated by a one-day gap, during which the infection gradually spreads over the leaves and the disease patterns become more prominent in the thermal images. For this reason, the accuracies increase gradually. The hypotheses of this study are supported by the McNemar test results in Table 4.

Table 4 Hypothesis testing

The table shows the stage-wise accuracy, precision, Type-I error rate, Type-II error rate, McNemar statistic and p-value for the proposed model. In the first stage, Normal and Stage1 images are compared, where Normal contains non-infected images and Stage1 contains infected thermal images with no visual symptoms. The achieved accuracy, precision, Type-I error, Type-II error, McNemar statistic and p-value at this stage are 95%, 97.5%, 2.3%, 7.7%, 6.5 and 0.011, respectively. A p-value below 0.05 provides strong evidence to reject the null hypothesis, so the thermal imaging method is effective in identifying diseased leaves at Stage1. In the second stage, Normal and Stage2 images are compared, where Stage2 contains infected thermal images with mild visual symptoms. The achieved accuracy, precision, Type-I error, Type-II error, McNemar statistic and p-value at this stage are 97%, 98.8%, 1.1%, 5.4%, 5.9 and 0.015, respectively. Again, a p-value below 0.05 provides strong evidence to reject the null hypothesis, so the thermal imaging method is effective in identifying diseased leaves at Stage2. In the third stage, Normal and Stage3 images are compared, where Stage3 contains infected thermal images with visual symptoms. The achieved accuracy, precision, Type-I error, Type-II error, McNemar statistic and p-value at this stage are 98%, 99%, 0.07%, 3.8%, 4.08 and 0.04, respectively. Here too, a p-value below 0.05 provides strong evidence to reject the null hypothesis, so the thermal imaging method is effective in identifying diseased leaves at Stage3. This stage-wise increase in accuracy and precision with the appearance of visual symptoms supports the reliability of the proposed model. Since only Stage1 contains thermal images with no visual symptoms, this study considers the Stage1 result the effective one, and it fulfils the objective: the proposed model effectively predicts the disease from thermal images without any visual symptoms within two days of infection.
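The reported metrics all derive from the four cells of a binary confusion matrix. The sketch below shows the usual definitions, treating the diseased class as positive, so that the Type-I error rate is the false positive rate and the Type-II error rate is the false negative rate; this mapping is an assumption, since the text does not define the terms. The counts are hypothetical, chosen only so that the resulting rates fall close to the Stage1 values quoted above; they are not the study's actual confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Accuracy, precision and error rates from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": tp / (tp + fp),
        "type_1_error": fp / (fp + tn),  # healthy leaves flagged as diseased
        "type_2_error": fn / (fn + tp),  # diseased leaves missed
    }


# Hypothetical counts for 261 Normal and 261 diseased images (illustration only).
metrics = binary_metrics(tp=241, fp=6, tn=255, fn=20)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f}%")
```

For early disease prediction the Type-II error rate is the more costly of the two, since a missed infection delays intervention.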

The performance of the proposed model has been compared with four commonly used deep learning models: Resnet50, Resnet101 (He et al., 2016), VGG-16 and VGG-19 (Simonyan & Zisserman, 2014). The prediction accuracies of the Resnet50 model when comparing the Normal images with the Stage1, Stage2 and Stage3 infection are 93%, 95%, and 96%, respectively. For the Resnet101 model, the achieved accuracies are 94%, 96%, and 97%, respectively.

Similarly, for VGG-16 the accuracies are 91%, 92%, and 94%, respectively, and for VGG-19 they are 90%, 92%, and 94%, respectively. This comparison is shown graphically in Fig. 5. It clearly shows that the proposed method outperforms the four standard deep learning models at each stage.

The performance of the proposed model has also been compared with classical machine learning algorithms to show the advantage of deep learning. For this purpose, two widely used machine learning algorithms, Support Vector Machine (SVM) and Logistic Regression (LR), are considered. When comparing the Normal images with the Stage1, Stage2 and Stage3 infection, the accuracies achieved by the SVM model are 67%, 71%, and 80%, respectively. Similarly, for the LR model, the accuracies are 66%, 72%, and 78%, respectively. This comparison is shown graphically in Fig. 5. The proposed deep learning model outperforms the SVM and LR models because it learns features from the data dynamically.

All of the above algorithms were run on a system with an Intel(R) Core(TM) i7-9750H CPU @ 2.60 GHz (six cores, 12 logical processors), 16.0 GB RAM, Intel(R) UHD Graphics 630, and an NVIDIA GeForce GTX 1660 Ti GPU. Table 5 shows the computation time for all of the above algorithms on this system.

Table 5 Computation time for all algorithms

The values in the table show that the proposed algorithm performs better than all of the other compared algorithms except Resnet50. Although Resnet50 takes less computation time than the proposed algorithm, the difference is very small.

Conclusions

Smart plant disease management has gained growing interest over the last decade, with the development of enabling technologies in precision agriculture to ensure high productivity and yield quality. An important aspect of disease management is predicting the disease before any visual symptoms appear. Every disease causes some external temperature variation over the infected area of the plant due to internal chemical changes. Thermal imaging can capture these temperature changes, whereas visual imaging cannot. The analysis of these thermal changes is therefore a promising direction for early disease prediction. Conventional machine learning can be used to learn the patterns in pre-symptomatic thermal images of infected plants, but the overhead of explicit feature engineering puts it at a disadvantage compared to deep learning. A deep learning algorithm can inspect the input data implicitly for correlated features and thus learn faster by combining them. For pre-symptomatic thermal images of infected plants, defining features explicitly is a difficult task. Deep learning makes this task easier through automated feature engineering. The combination of thermal imaging with deep learning can open a new and challenging research area in smart plant disease management.

The intent of this study is to predict plant diseases at the earliest phase using thermal imaging and deep learning technology. This paper proposes a CNN-based architecture to fulfil this objective. BLB in rice plants is taken as a case study. Data collection was challenging because of the complex process of field preparation in an open environment, and as a result the collected dataset was small. A novel deep learning model based on a CNN architecture has been implemented to overcome this challenge. The model consists of three convolutional layers and two fully connected layers to reduce computational overhead.

The model is trained in three phases to obtain the desired result and demonstrate its robustness. In the first phase, Normal vs. Stage1 images are used for training, followed by Normal vs. Stage2 and finally Normal vs. Stage3. The model achieved 95%, 97% and 98% accuracy in the first, second and third phases, respectively. The results show the efficacy of the proposed CNN architecture in automatically exploring the relevant thermal features for BLB prediction at the pre-symptomatic stage (Stage1), with 95% accuracy and 97.5% precision. The reliability of the proposed model for small datasets has also been shown through comparison with other standard deep learning models (Resnet50, Resnet101, VGG-16 and VGG-19) and machine learning algorithms (SVM and LR).

The model can be improved with the following research directions.

  • To predict the disease even earlier, the data collection interval may be further reduced and tested.

  • Although the study provides a good starting point for rice disease prediction with a deep learning framework, more experiments with datasets of diverse diseases are needed to improve the robustness of the model.