
1 Introduction

Cassava is currently one of Thailand's main economic crops. The industrial sector requires high-quality cassava for processing into various products, so the quality of the crop must be maintained. Quality control is difficult, however, because most cassava plantation areas are large. Moreover, farmers are facing Cassava Mosaic Disease (CMD), which is spreading widely and can quickly move into neighbouring areas. When damage occurs, the grower is often unable to identify the problem or assess its severity, so it is not known whether the actual production quantity will be sufficient. This paper therefore proposes the development of a drone to detect anomalies, by creating a system that detects, locates and analyses abnormalities in cassava fields and records positional data so that diseased areas can be shown on an overview map of the plantation. The cassava plantation business continues to grow to serve both domestic and international markets, with more than 9,000,000 rai (1 rai is equal to 1,600 m2) currently under cultivation. In the past, CMD outbreaks have damaged 80–100% of the crop, wasting both time and productivity, so the desired profit is not obtained. The problem arises from the use of diseased cassava cuttings for further planting, and whiteflies act as the vector, causing outbreaks to spread quickly over wide areas. Surveying a farmer's plot on foot is slow and not thorough, and it takes a great deal of time. However, there are no existing studies on artificial intelligence (AI) based pest inspection of cassava plants using drones.

This study aims to help cassava farmers detect abnormalities in their plots by using the developed drone for surveying and preliminary analysis, so that workers only need to walk in to confirm what the problem is. The main emphasis is on the speed of the survey. This is useful for farmers and general users, who can go directly to the location of the abnormal area and shorten the time spent exploring. In this research, the structure of the drone is designed, developed and equipped with a camera for surveying. An intelligent detection system for finding abnormal cassava leaves is studied and implemented using machine learning models.

2 Developed Drone Structure

The drone used for cassava farm surveying is developed in this research. The survey drone has to carry a multispectral camera weighing 500 g and fly for a minimum of 15 min. The six-rotor drone is designed in a Hex-V configuration to provide stability during flight. The coordinates are determined with GPS, using two position readings for accuracy, and are sent back to the base station by telemetry. The structure is designed to be lightweight using composite materials. Figure 1 shows the structure of the developed drone. The drone structure is tested by lifting a load of 5 kg, and it is found that it withstands the resulting forces without damage, with a safety factor of 1.6 (see Fig. 2).
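For reference, structural simulation tools commonly report the safety factor as the ratio of the material's yield stress to the peak stress predicted under the applied load; whether this exact criterion was used in the simulation here is an assumption:

```latex
n = \frac{\sigma_{\mathrm{yield}}}{\sigma_{\mathrm{max}}} \approx 1.6
```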

Fig. 1

Structure of developed drone

Fig. 2

Simulation of drone: a simulation of drone while carrying the weight and b load simulation to determine safety factor

3 Cassava Abnormality Detection

The detection of cassava abnormalities uses artificial intelligence techniques combined with colour image analysis and enhancement through image processing, a field in which human vision plays an important role. Vision is one of the most complex mechanisms of acquiring information: it provides the input needed for simple tasks such as object recognition as well as for complex tasks such as planning, decision making and research. In this context, image processing is the handling of digital image data with the aim of improving and developing the visual information so that it can be interpreted and the image better described. The image processing workflow can be divided into three steps. The initial step is image acquisition and enhancement, namely image sharpness correction, image conversion and image signal recovery. The intermediate step is the segmentation of the elements of an image and the extraction of important features or identities, and the advanced step is interpretation and image recognition (Chowalit 2016).
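As an illustration only, the three steps could be prototyped with the OpenCV library as sketched below; the filter choices and the "non-green region" heuristic are illustrative assumptions, not the processing actually used in this study.

```python
import cv2

# --- Step 1: acquisition and enhancement ---
image = cv2.imread("DJI_0001_01.jpg")            # load one cropped field image
blurred = cv2.GaussianBlur(image, (5, 5), 0)     # suppress sensor noise
hsv = cv2.cvtColor(blurred, cv2.COLOR_BGR2HSV)   # convert to HSV colour space

# --- Step 2: segmentation and feature extraction ---
# Illustrative threshold: keep pixels whose hue lies outside the healthy-green band.
healthy = cv2.inRange(hsv, (35, 40, 40), (85, 255, 255))
abnormal = cv2.bitwise_not(healthy)
contours, _ = cv2.findContours(abnormal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# --- Step 3: interpretation ---
# Flag the image if large connected non-green regions are found.
suspicious = [c for c in contours if cv2.contourArea(c) > 500]
print(f"{len(suspicious)} suspicious regions found")
```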

For the artificial intelligence (AI) techniques, deep learning is applied to detect abnormalities of cassava leaves in the drone photographs; the models are adaptations of existing architectures with advanced object detection capabilities. In this study, training the AI is divided into two steps: preparing the data for building an AI model to detect cassava plants, and building the AI model to detect the locations of abnormal cassava plants.

3.1 Data Set Preparation

Data sets play a key role in teaching the AI to learn the locations of abnormal cassava plants. However, the drone photographs are too large, and the details of the cassava plants are not clearly visible in them, so each image must be cut into 36 sub-images before the data set is created. Each split image has a filename that indicates its position within the original image. The file name format is “DJI_XXXX_YY”, where XXXX is the numbered sequence of the image taken by the drone and YY is the part number within the original image (see Fig. 3). Each cropped image is 912 × 608 pixels.
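The splitting step could be implemented along the following lines; the 6 × 6 grid and the 5472 × 3648 px source resolution are assumptions inferred from the reported 912 × 608 px tile size rather than values stated in the text.

```python
from pathlib import Path
from PIL import Image

TILE_W, TILE_H = 912, 608          # reported sub-image size
COLS, ROWS = 6, 6                  # assumed 6 x 6 grid -> 36 sub-images

def split_drone_image(path: str, out_dir: str = "tiles") -> None:
    """Cut one drone photo into 36 tiles named DJI_XXXX_YY.jpg."""
    img = Image.open(path)
    stem = Path(path).stem          # e.g. "DJI_0007"
    Path(out_dir).mkdir(exist_ok=True)
    part = 1
    for row in range(ROWS):
        for col in range(COLS):
            box = (col * TILE_W, row * TILE_H,
                   (col + 1) * TILE_W, (row + 1) * TILE_H)
            img.crop(box).save(f"{out_dir}/{stem}_{part:02d}.jpg")
            part += 1

split_drone_image("DJI_0007.jpg")
```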

Fig. 3

Example of segmenting an image into 36 sub-images

3.2 Image Identification and Amount of Data

To pinpoint the abnormal cassava leaves for training the AI, the Visual Geometry Group (VGG) Image Annotator (VIA) is utilized; it is a web-based tool for generating location data within the images for the AI. The programme creates a location file from the drawn bounding boxes, which is used as the sample data for developing the AI object detection. The amount of data used to train the AI is adjusted in each version of the data set in order to study the trend of the AI's ability to detect abnormal locations of cassava. As a result, a total of four versions of the data set are created and used to train the AI. Agronomists who specialize in CMD assist in validating the data set (see Fig. 4). The cassava abnormality data set contains 2,500 selected images.
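As a sketch of how such annotations might be consumed downstream, the snippet below reads a VIA JSON export and collects the rectangular bounding boxes; the file name annotation.json and the VIA 2.x export layout are assumptions made for illustration.

```python
import json

def load_via_boxes(json_path: str = "annotation.json"):
    """Collect (filename, [x, y, w, h]) pairs from a VIA 2.x JSON export."""
    with open(json_path) as f:
        project = json.load(f)
    boxes = []
    for entry in project.values():           # one entry per annotated image
        for region in entry.get("regions", []):
            shape = region["shape_attributes"]
            if shape.get("name") == "rect":  # keep rectangular boxes only
                boxes.append((entry["filename"],
                              [shape["x"], shape["y"],
                               shape["width"], shape["height"]]))
    return boxes

print(len(load_via_boxes()), "bounding boxes loaded")
```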

Fig. 4

Example of creating the data set

3.3 Creating AI to Detect Cassava (Object Detection)

To create an AI model that detects cassava plants, the research project selected four AI algorithms, i.e. Mask R-CNN, EfficientDet, RetinaNet and YOLOv5, for the object detection and localization of cassava plants. These techniques share a common advantage: their developers have pre-trained the models on large object detection data sets, so they can be fine-tuned on the specific data to be detected. Figure 5 shows examples of the prediction results of the AI algorithms. Figures 6, 7, 8 and 9 illustrate the structures of the AI algorithms.
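A minimal sketch of this transfer-learning setup is shown below, using the torchvision implementation of RetinaNet with an ImageNet-pre-trained ResNet-50 backbone and a single "abnormal cassava" class; the class count, optimizer settings and dummy batch are illustrative assumptions, not the authors' training configuration.

```python
import torch
from torchvision.models import ResNet50_Weights
from torchvision.models.detection import retinanet_resnet50_fpn

# RetinaNet with an ImageNet-pre-trained ResNet-50 backbone and a new head for
# 2 classes (background + "abnormal cassava"); the head is trained from scratch.
model = retinanet_resnet50_fpn(
    weights=None,
    weights_backbone=ResNet50_Weights.IMAGENET1K_V1,
    num_classes=2,
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

# One dummy 608 x 912 image and one bounding box, standing in for a real batch
# built from the VIA annotations.
images = [torch.rand(3, 608, 912)]
targets = [{"boxes": torch.tensor([[100., 120., 260., 300.]]),
            "labels": torch.tensor([1])}]

model.train()
losses = model(images, targets)     # dict of classification and regression losses
loss = sum(losses.values())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```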

Fig. 5

Example of prediction results: a Mask R-CNN, b RetinaNet, c EfficientDet and d YOLOv5

Fig. 6
Two model diagrams depict the Mask R-CNN structure, including Faster R-CNN with ResNet and Faster R-CNN with FPN, each with class and box outputs.

Sample of Mask R-CNN structure (Kaiming et al. 2017)

Fig. 7
A model diagram depicts the RetinaNet structure, including the class and box subnets with W × H × 256 feature maps.

Sample of RetinaNet structure (Toan et al. 2019)

Fig. 8
A model diagram depicts the EfficientDet structure, including the input, EfficientNet backbone, BiFPN layer, class net and box net.

Sample of EfficientDet structure (Mingxing and Adams 2020)

Fig. 9
A schematic diagram depicts the YOLOv5 structure: backbone CSPDarknet, neck PANet and head YOLO layer, with cross-stage partial networks, spatial pyramid pooling, convolutional layers and concatenation.

Sample of YOLOv5 structure (Renjie et al. 2021)

Mask R-CNN is an instance segmentation model that locates objects at the pixel level. “Instance segmentation” means separating individual objects within a scene, even when they are of the same type. In addition to predicting the class and bounding box location, it can also display coloured pixel masks over the corresponding regions (Kaiming et al. 2017).
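For illustration, the torchvision implementation of Mask R-CNN returns per-instance boxes, labels, scores and pixel masks, which is the extra mask output referred to above; the confidence threshold below is an arbitrary choice.

```python
import torch
from torchvision.models.detection import (
    maskrcnn_resnet50_fpn, MaskRCNN_ResNet50_FPN_Weights)

# COCO-pre-trained Mask R-CNN, used purely to show the shape of its outputs.
weights = MaskRCNN_ResNet50_FPN_Weights.DEFAULT
model = maskrcnn_resnet50_fpn(weights=weights).eval()

image = torch.rand(3, 608, 912)              # stand-in for one cropped field image
with torch.no_grad():
    out = model([image])[0]                  # one dict per input image

keep = out["scores"] > 0.5                   # arbitrary confidence threshold
print(out["boxes"][keep].shape)              # (N, 4) bounding boxes
print(out["masks"][keep].shape)              # (N, 1, H, W) per-instance masks
```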

RetinaNet is able to deal with class imbalance problems. For example, suppose there are two classes, Disease A and Disease B, and far more labelled examples of Disease A than of Disease B. RetinaNet handles this situation well and is able to use all of the information effectively during learning. Even when the amount of training data is small, it still performs well (Tsung-Yi et al. 2018).
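The mechanism behind this robustness is the focal loss, which down-weights easy, abundant examples so that rare classes still influence training; a minimal binary version is sketched below with the default values γ = 2 and α = 0.25 from the original paper.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, alpha=0.25, gamma=2.0):
    """Binary focal loss: scales cross-entropy by (1 - p_t)^gamma so that
    well-classified (easy) examples contribute little to the total loss."""
    p = torch.sigmoid(logits)
    ce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
    p_t = p * targets + (1 - p) * (1 - targets)
    alpha_t = alpha * targets + (1 - alpha) * (1 - targets)
    return (alpha_t * (1 - p_t) ** gamma * ce).mean()

# Toy usage: 4 predictions against 4 binary labels.
logits = torch.tensor([2.0, -1.0, 0.5, -3.0])
labels = torch.tensor([1.0, 0.0, 1.0, 0.0])
print(focal_loss(logits, labels))
```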

EfficientDet is different from other models. Other models are normally developed on the idea that greater depth alone allows the model to remember information well, whereas EfficientDet is developed on the idea that the model should have both the depth to remember information well and the capacity to extract as many features from the data as possible, making EfficientDet more efficient than other models. It also uses fewer computational resources than other models (Mingxing and Quoc 2019).
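This balance is achieved through the compound-scaling rule of the underlying EfficientNet backbone, in which a single coefficient scales depth, width and input resolution together; the sketch below only computes these scaling factors, using the constants α = 1.2, β = 1.1 and γ = 1.15 reported in the EfficientNet paper, and is not part of the authors' pipeline.

```python
# Compound scaling from EfficientNet: one coefficient phi scales network depth,
# width and input resolution together instead of increasing depth alone.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15   # grid-searched constants from the paper

def compound_scale(phi: int):
    depth_mult = ALPHA ** phi          # more layers
    width_mult = BETA ** phi           # more channels per layer
    resolution_mult = GAMMA ** phi     # larger input images
    return depth_mult, width_mult, resolution_mult

for phi in range(0, 4):                # B0 .. B3-style scaling levels
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, resolution x{r:.2f}")
```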

YOLOv5, like every version of YOLO, is developed with a focus on real-time object detection, and each newer version is more accurate. This makes YOLO popular for applications that require real-time detection, such as detecting objects from a video camera or detecting fast-moving objects (Raphaël et al. 2021).
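For reference, the publicly released YOLOv5 models can be loaded through torch.hub as sketched below; the small yolov5s variant and the sample image path are arbitrary choices for illustration, not the configuration used in this study.

```python
import torch

# Load the small pre-trained YOLOv5 model published by Ultralytics via torch.hub.
model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)

# Run inference on one cropped field image (path is illustrative).
results = model("tiles/DJI_0007_01.jpg")
results.print()                        # summary: counts per detected class
boxes = results.xyxy[0]                # tensor of [x1, y1, x2, y2, conf, class]
print(boxes[:5])
```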

Comparing the best results of all models, RetinaNet-50 is the most efficient model, with the highest mAP of 0.675 and precision and recall of 0.75 and 0.77, respectively. This means that the model can detect irregular cassava leaves with high accuracy and can cover a large proportion of the labelled spots. Compared to Mask R-CNN, which is used as the benchmark model, the research project increased the mAP from 0.034 to 0.675, an improvement of 0.641 over the original, so RetinaNet-50 is chosen as the model to be put into practice.
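For clarity, precision and recall here follow their standard object detection definitions, in which a prediction counts as a true positive only if it overlaps a labelled box with sufficient IoU; the small helper below illustrates this with an assumed IoU threshold of 0.5.

```python
def iou(a, b):
    """Intersection over union of two [x1, y1, x2, y2] boxes."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, x2 - x1) * max(0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def precision_recall(pred_boxes, gt_boxes, thr=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes."""
    matched, tp = set(), 0
    for p in pred_boxes:
        for i, g in enumerate(gt_boxes):
            if i not in matched and iou(p, g) >= thr:
                matched.add(i)
                tp += 1
                break
    precision = tp / len(pred_boxes) if pred_boxes else 0.0
    recall = tp / len(gt_boxes) if gt_boxes else 0.0
    return precision, recall

preds = [[10, 10, 50, 50], [70, 70, 120, 120]]
gts = [[12, 8, 48, 52], [200, 200, 240, 240]]
print(precision_recall(preds, gts))   # (0.5, 0.5) for this toy case
```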

4 Conclusion

This research develops a drone to help cassava farmers detect areas in their plots that may be affected by CMD. The drone used for cassava farm surveying is developed: the six-rotor drone is designed in a Hex-V configuration to provide stability during flight. It can be concluded that the developed drone can be used to inspect the fields and find abnormal areas that may be caused by CMD. The intelligent detection system for finding abnormal cassava leaves is investigated and implemented using machine learning models. The results show that RetinaNet-50 gives the best performance, with 0.675 mAP, 0.75 precision and 0.774 recall, and takes around 0.06 s to process each image. Compared with the Mask R-CNN benchmark, the mAP is improved by up to 0.641 with RetinaNet-50 trained on the round-4 data set. A total of 2,500 labelled training images are collected successfully.