Medicinal Plant Classification Using Neural Network

Khate, Avilie; Sharma, Bobby

doi:10.1007/978-981-99-4362-3_28

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1061)

Included in the following conference series:

International Conference on Emerging Global Trends in Engineering and Technology

Abstract

The earth is home to many different kinds of medicinal plants. These plants are used in drug formulation, in herbal products, and in medicines that treat common ailments and diseases. Many of them grow in the wild, and recognising them by human sight is slow, tiresome, and inaccurate. As many of them are threatened with extinction according to IUCN records, image processing can help by identifying endangered plants and supporting their preservation. The Mendeley dataset contains images of healthy medicinal herbs from 30 species, such as Alpinia Galanga (Rasna), Citrus Limon (Lemon), and Moringa Oleifera (Drumstick), with 1500–2000 images in total. Each species folder holds 50–100 high-quality images and is named after the botanical (scientific) name of the species, which serves as the label for training the model. This paper proposes a system that adopts deep learning to obtain high accuracy in the classification and recognition of medicinal plants. A Convolutional Neural Network (CNN) is used to classify the medicinal plant images.

1 Introduction

Thousands of plants can be found around us, many of which are used as medicines. Long-established methods of preparing medications from medicinal plants, handed down by our forefathers, are used extensively in the pharmaceutical industry and in research, and have saved many lives. To this day, many indigenous people and populations rely on medicinal plants. Plant identification is still largely done manually, based on morphological characteristics (Begue et al. 2017). A professional botanist’s knowledge is essential for recognising known plants and discovering new ones. People use their eyes, noses, hands, or other senses to assess the shape, colour, taste, and texture of the whole plant or of individual parts (leaf, flower, fruit, or bark), and then decide the species of a medicinal plant based on either reference material or experience (Kan et al. 2017). However, manual identification is time-consuming and difficult, and it depends heavily on the person’s knowledge, experience, and skill. Many researchers have therefore worked on the automatic classification of plants based on their characteristics, so that non-experts can also recognise medicinal plants easily. With the advancement of image processing and pattern recognition technologies, computer-based automatic image identification is now widely used in practice, and most such systems rely on machine learning techniques.

Machine learning is a data analysis technique. It takes data as input and learns to identify patterns before making a decision with little human intervention, and it improves with training. Machine learning is employed in a variety of fields, including medicine and finance. By automating such tasks, it reduces people’s workload.

Although the methods are very similar, the systems produced so far differ in the number of steps they use to automate classification. These steps involve preparing the collected leaves, pre-processing the images to identify their distinguishing characteristics, classifying the leaves, building the database, training for identification, and finally evaluating the results. The factor that matters most for recognition accuracy is the ability to recognise similar objects as belonging together while distinguishing between distinct types of objects.

2 Brief Literature Review

In Begue et al. (2017), a computer vision algorithm was employed to extract many shape-based features from the leaves of medicinal plants. Machine learning methods were then used to classify 24 distinct plant species. The most accurate classifier was Random Forest, with an accuracy of 90.1%.

Dudi and Rajesh (2019) proposed a medicinal plant recognition approach based on CNN and machine learning, using 32 species and 1800 images from the Flavia dataset for both training and testing. They employed ANN, SVM, KNN, and Naïve Bayes (NB), as well as other machine learning techniques, and achieved an accuracy of 98%.

Kan et al. (2017) propose a strategy for classifying leaf images of 12 different types of medicinal plants. First, the leaf images are pre-processed and their shape and texture features are extracted. Second, the classification performance of several models, namely the BP neural network (BPNN), the probabilistic neural network (PNN), the K-nearest neighbour (KNN) classifier, and the support vector machine (SVM), is compared experimentally. For the image classification of medicinal plant leaves, the proposed SVM approach based on both shape and texture features proved effective and practicable.

Venkataraman and Mangayarkarasi (2016) propose a vision-based technique for constructing an automated system that recognises plants. The system is designed so that even a person unfamiliar with the field can learn which plants they are looking at and what their therapeutic properties are. The work discusses the creation of the feature set, which is a key stage in recognising any plant species.

Shah et al. (2017) propose a leaf classification system based on a dual-path deep CNN. The dual-path CNN carries out the two main operations:

i. Both shape and texture properties are examined and investigated.
ii. The resulting features are optimised for categorisation.

The authors claim that their method outperforms other CNN approaches, reaching an accuracy of around 99.28% on the Flavia dataset.

Reference	Classifier	Accuracy (%)	Dataset	Training	Testing	Species
Begue et al. (2017)	Random Forest	90.1	NM	NM	NM	24
Dudi and Rajesh (2019)	ANN, KNN	98	1800	1400	320	32
Kan et al. (2017)	SVM	93.3	1800	240	120	12
Venkataraman and Mangayarkarasi (2016)	Non-classifier	*	5	*	*	1
Shah et al. (2017)	Dual-Path CNN	99.28	6630	5500	800	29

3 Medicinal Plant Recognition

3.1 Classification Method

(A)
VGG16

VGG16 is a CNN model that has been pre-trained on images from the ImageNet dataset. It gets its name from its 16 layers with trainable weights. It is a very large network, with approximately 138 million parameters. It uses convolutional layers with 3 × 3 filters, a stride of 1, and same padding, together with max-pooling layers with 2 × 2 filters and a stride of 2. The convolution and max-pooling layers are arranged in the same pattern throughout the architecture. As illustrated in Fig. 1, VGG16 ends with three fully connected (FC) layers followed by a softmax. Each convolutional layer applies a fixed number of kernels to its input to perform a simple convolution operation, and the number of kernels varies from layer to layer. The first convolutional layers extract edge and colour features from the input images, and the outputs of these layers are the feature maps of the input images.

**Fig. 1** A schematic representation of the VGG16 architecture. A 224 × 224 photograph of a road with a car is processed by convolution + ReLU layers of different sizes, max-pooling layers, fully connected + ReLU layers, and a softmax layer.

The number and size of the filters used in a convolutional layer determine the size of its feature maps. The output of the last convolutional block is the input to the first fully connected layer. The first two fully connected layers each have 4096 channels, whereas the third performs 1000-way ILSVRC classification and hence has 1000 channels. The softmax layer receives the output of the last FC layer and performs the classification by calculating the probability of each species. All hidden layers use the rectified linear unit (ReLU) nonlinearity.
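
The parameter count quoted for VGG16 can be checked with a short calculation. The following is a minimal sketch in Python, assuming the layer widths of the published VGG configurations (these widths, and the names `VGG16_CFG`, `VGG19_CFG`, and `count_parameters`, are illustrative assumptions and are not taken from this paper). It counts the weights and biases of the 3 × 3 convolutions and of the three fully connected layers.

```python
# Layer widths follow the published VGG configurations (an assumption here);
# "M" marks a 2 x 2 max-pooling layer with stride 2.
VGG16_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
             512, 512, 512, "M", 512, 512, 512, "M"]
VGG19_CFG = [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
             512, 512, 512, 512, "M", 512, 512, 512, 512, "M"]


def count_parameters(cfg, in_channels=3, image_size=224,
                     fc_sizes=(4096, 4096, 1000)):
    """Count weights and biases of a VGG-style network."""
    total = 0
    channels, size = in_channels, image_size
    for item in cfg:
        if item == "M":
            size //= 2  # pooling halves the spatial size, adds no parameters
        else:
            # 3 x 3 kernel per (input channel, output channel) pair, plus biases
            total += 3 * 3 * channels * item + item
            channels = item
    features = channels * size * size  # flattened input to the first FC layer
    for width in fc_sizes:
        total += features * width + width
        features = width
    return total


vgg16_params = count_parameters(VGG16_CFG)
vgg19_params = count_parameters(VGG19_CFG)
print(vgg16_params, vgg19_params)
```

Under these assumptions the count for VGG16 comes to 138,357,544, in line with the figure of roughly 138 million parameters. Most of these parameters sit in the first fully connected layer.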

(B)
VGG19

VGG19 takes a fixed-size 224 × 224 RGB image as input, so the input matrix has the shape (224, 224, 3). The only pre-processing is subtracting the mean RGB value from each pixel. The network uses 3 × 3 kernels with a stride of 1 pixel, which allows them to cover the whole image. Spatial padding is used to retain the image’s spatial resolution. Max pooling is performed over a 2 × 2 pixel window with a stride of 2. Each convolution is followed by a rectified linear unit (ReLU), which injects nonlinearity into the model to improve classification and reduce computation time; ReLU proved considerably better than the sigmoid functions that earlier models depended on. Finally, VGG19 has three fully connected layers, the first two of size 4096 and the third with 1000 channels for 1000-way ILSVRC classification, followed by a softmax function, as seen in Fig. 2.
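
How the padding and pooling settings affect the feature-map size can be traced with a little arithmetic. The sketch below assumes a padding of 1 pixel for the 3 × 3 convolutions and a grouping of 2, 2, 4, 4, and 4 convolutions per block for VGG19; the grouping comes from the published VGG19 configuration rather than from this paper, and the function names are illustrative.

```python
def conv_output_size(size, kernel=3, stride=1, padding=1):
    """Spatial size after a convolution (standard output-size formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_output_size(size, window=2, stride=2):
    """Spatial size after max pooling without padding."""
    return (size - window) // stride + 1


def trace_spatial_size(size=224, convs_per_block=(2, 2, 4, 4, 4)):
    """Return the feature-map size after each convolution block and its pooling."""
    sizes = []
    for n_convs in convs_per_block:
        for _ in range(n_convs):
            size = conv_output_size(size)  # padding keeps the size unchanged
        size = pool_output_size(size)      # pooling halves it
        sizes.append(size)
    return sizes


sizes = trace_spatial_size()
print(sizes)  # [112, 56, 28, 14, 7]
```

With a 224 × 224 input, the convolutions leave the resolution unchanged and each of the five pooling steps halves it, so the final feature maps are 7 × 7.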

**Fig. 2** A schematic representation of the VGG19 architecture. It starts with convolution layers of depth 64 leading to a max-pool layer, followed by four groups of convolution layers of depths 128, 256, 512, and 512 with a max-pool layer between consecutive groups. It ends with a softmax layer after a max-pool layer.
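
The ReLU nonlinearity and the softmax stage that closes both VGG networks are simple to write down. The following is a minimal sketch in plain Python; the input scores are hypothetical values chosen for illustration, not outputs of the trained models.

```python
import math


def relu(values):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Turn raw class scores into probabilities that sum to one."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical scores for three species
probs = softmax(logits)
predicted = probs.index(max(probs))  # index of the most probable class
print(probs, predicted)
```

The predicted species is the class with the largest probability; in the networks described here the softmax would run over one score per class.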

(C)
RESNET

The Residual Network (ResNet) is a convolutional neural network that allows hundreds or even thousands of convolutional layers to be used. As seen in Fig. 3, ResNet stacks identity mappings, that is, layers that initially do nothing, and skips over them, reusing the activations from preceding layers. Batch normalisation is also central to ResNet: it normalises the input to each layer, which improves the network’s performance and mitigates the problem of covariate shift. ResNet uses the identity connection to protect the network from the vanishing gradient problem.

**Fig. 3** A schematic representation of the residual network. Multiple stacked convolutional layers are shown in two groups labelled VGG19 and ResNet152. On the right, a feed-forward neural network combines the ResNet152 output and the VGG19 output.
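
The identity connection can be illustrated with a toy residual block. The sketch below uses small dense layers on plain Python lists instead of convolutions, and it omits batch normalisation and the final activation, so it is a simplified illustration of the idea y = x + F(x) rather than the ResNet implementation used in the experiments; all names are illustrative.

```python
def relu(values):
    return [max(0.0, v) for v in values]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def residual_block(x, w1, w2):
    """Compute y = x + F(x), where F is two small layers with a ReLU between."""
    hidden = relu(matvec(w1, x))
    fx = matvec(w2, hidden)
    return [a + b for a, b in zip(x, fx)]  # identity shortcut adds the input back


x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
y = residual_block(x, zeros, zeros)
print(y)  # with all-zero weights the block passes its input through unchanged
```

Because the shortcut carries the input forward unchanged, a block whose weights contribute nothing still behaves as an identity mapping, and gradients can flow through the addition directly to earlier layers.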

4 Experimental Result

(A)
Data collection

A medicinal plant dataset (the Mendeley dataset) is employed. It includes Alpinia Galanga (Rasna), Citrus Limon (Lemon), Moringa Oleifera (Drumstick), and many other plant and herb species. The dataset contains 1800 photos of 30 species. Each folder holds 50–100 high-quality photos and is named after the species it contains. The data are then used to train and test the model. Figure 4 shows sample images from the dataset.

**Fig. 4** A group of four photographs of four different leaves: Brassica Juncea, Basella Alba, Moringa Oleifera, and Citrus Limon.
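
Since each folder name is the species label, the dataset can be indexed by walking the directory tree. The following is a minimal sketch using only the Python standard library; the function name `index_dataset`, the accepted file suffixes, and the directory layout (one sub-folder per species directly under a root folder) are assumptions made for illustration.

```python
from pathlib import Path

# File types treated as images; this set is an assumption.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def index_dataset(root):
    """Return (class_names, samples), with samples as (image_path, class_index)."""
    root = Path(root)
    # Each sub-folder is one species; its name is used as the class label.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    class_to_index = {name: i for i, name in enumerate(class_names)}
    samples = []
    for name in class_names:
        for path in sorted((root / name).iterdir()):
            if path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((path, class_to_index[name]))
    return class_names, samples
```

Calling `index_dataset` on the dataset root (the path depends on where the data were downloaded) would give 30 class names and one entry per image.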

(B)
Experiment

Seventy per cent of the medicinal plant dataset is used for training and 30% for testing. The training set contains approximately 1284 images and the held-out set, used for validation, approximately 551 images, covering 30 classes: Alpinia Galanga (Rasna), Amaranthus Viridis (Arive-Dantu), Artocarpus Heterophyllus (Jackfruit), Azadirachta Indica (Neem), Basella Alba (Basale), Brassica Juncea (Indian Mustard), Carissa Carandas (Karanda), Citrus Limon (Lemon), Ficus Auriculata (Roxburgh fig), Ficus Religiosa (Peepal Tree), Hibiscus Rosa-sinensis, Jasminum (Jasmine), Mangifera Indica (Mango), Mentha (Mint), Moringa Oleifera (Drumstick), Muntingia Calabura (Jamaica Cherry-Gasagase), Murraya Koenigii (Curry), Nerium Oleander (Oleander), Nyctanthes Arbor-tristis (Parijata), Ocimum Tenuiflorum (Tulsi), Piper Betle (Betel), Plectranthus Amboinicus (Mexican Mint), Pongamia Pinnata (Indian Beech), Psidium Guajava (Guava), Punica Granatum (Pomegranate), Santalum Album (Sandalwood), Syzygium Cumini (Jamun), Syzygium Jambos (Rose Apple), Tabernaemontana Divaricata (Crape Jasmine), and Trigonella Foenum-graecum (Fenugreek). The dataset is then fed into the model, which is trained and assessed under different hyperparameter settings, namely the number of epochs, the batch size, and the optimiser.
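
A 70/30 split of this kind can be sketched as follows. The total of 1835 images is inferred from the approximate training and validation counts given above, and the shuffling seed and function name are arbitrary choices; the paper does not state whether the split was stratified by species, and this sketch is not.

```python
import random


def train_test_split(samples, train_fraction=0.7, seed=0):
    """Shuffle the samples and cut them into a training and a held-out part."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 1835 stands in for the image list; the figure is inferred, not reported.
train, test = train_test_split(range(1835))
print(len(train), len(test))  # 1284 551
```

With 1835 items the split gives 1284 training and 551 held-out samples, which matches the approximate counts reported for the experiment.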

The experiment is carried out using Google Colab on a machine with an Intel Core i3 processor, 8 GB of RAM, and 2 GB of AMD M3 graphics.

We aimed to evaluate the classification with different convolutional neural networks. VGG19 reaches an accuracy of 0.9964; however, its loss is higher than that of VGG16. RESNET50 also performed well, with an accuracy of 0.9951; however, because of its large loss, it is not a suitable classifier for this experiment. Finally, we found that VGG16 performs best with the Adam optimiser, 20 epochs, and a batch size of 32, giving an accuracy of 0.9952. The results for the different CNNs are given in Table 1.

Table 1

RESNET

Out of all the experiments in RESNET50, we find that opt = adam, epoch = 6, batch_size = 64 has the highest accuracy, i.e. 0.9951.

VGG16

Optimiser	Epoch	Batch_size	Loss	Accuracy	Val_loss	Val_acc
SGD	6	32	1.9052	0.4877	1.8079	0.3891
	6	64	0.2303	0.9426	0.2501	0.9339
	10	32	0.1398	0.9776	0.1686	0.9572
	10	64	0.2728	0.9309	0.2265	0.9416
	20	32	0.3474	0.9124	0.3055	0.9339
	20	64	0.2019	0.9611	0.5116	0.8911
Adam	6	32	0.0828	0.9776	0.1253	0.9689
	6	64	0.0549	0.9864	0.1198	0.9611
	10	32	0.0776	0.9747	0.1851	0.9533
	10	64	0.0093	0.9971	0.1147	0.9650
	20	32	0.0290	0.9952	0.0749	0.9728
	20	64	0.0129	0.9971	0.1524	0.9533

Of all the VGG16 experiments, opt = adam, epoch = 20, batch_size = 32 gives the best result, with an accuracy of 0.9952 and the highest validation accuracy (0.9728).
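
Choosing a configuration from the VGG16 table can be expressed as a small search over its rows. The sketch below copies the twelve rows from the table and ranks them by validation accuracy; using validation accuracy as the selection criterion is an assumption, since the selection rule is not spelled out in the text, and the names `VGG16_RUNS` and `best_run` are illustrative.

```python
# Rows copied from the VGG16 table:
# (optimiser, epochs, batch_size, loss, accuracy, val_loss, val_acc)
VGG16_RUNS = [
    ("SGD", 6, 32, 1.9052, 0.4877, 1.8079, 0.3891),
    ("SGD", 6, 64, 0.2303, 0.9426, 0.2501, 0.9339),
    ("SGD", 10, 32, 0.1398, 0.9776, 0.1686, 0.9572),
    ("SGD", 10, 64, 0.2728, 0.9309, 0.2265, 0.9416),
    ("SGD", 20, 32, 0.3474, 0.9124, 0.3055, 0.9339),
    ("SGD", 20, 64, 0.2019, 0.9611, 0.5116, 0.8911),
    ("Adam", 6, 32, 0.0828, 0.9776, 0.1253, 0.9689),
    ("Adam", 6, 64, 0.0549, 0.9864, 0.1198, 0.9611),
    ("Adam", 10, 32, 0.0776, 0.9747, 0.1851, 0.9533),
    ("Adam", 10, 64, 0.0093, 0.9971, 0.1147, 0.9650),
    ("Adam", 20, 32, 0.0290, 0.9952, 0.0749, 0.9728),
    ("Adam", 20, 64, 0.0129, 0.9971, 0.1524, 0.9533),
]

VAL_ACC = 6  # column index of the validation accuracy


def best_run(runs, column=VAL_ACC):
    """Return the row with the largest value in the chosen column."""
    return max(runs, key=lambda run: run[column])


best = best_run(VGG16_RUNS)
print(best[:3], best[VAL_ACC])
```

Ranked this way, the Adam run with 20 epochs and a batch size of 32 comes out on top, with a validation accuracy of 0.9728 and the lowest validation loss in the table.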

VGG19

Optimiser	Epoch	Batch_size	Loss	Accuracy	Val_loss	Val_acc
SGD	6	32	0.0546	0.9883	0.1148	0.9650
	6	64	0.0273	0.9871	0.1128	0.9767
	10	32	0.0062	0.8948	0.0635	0.9767
	10	64	0.0100	0.9971	0.0774	0.9728
	20	32	0.0147	0.9987	0.1808	0.9300
	20	64	0.0363	0.9964	0.1805	0.9705
Adam	6	32	0.1512	0.9591	0.7713	0.8599
	6	64	0.2317	0.9292	0.6767	0.8282
	10	32	0.0202	0.9971	0.5598	0.9572
	10	64	0.0192	0.9971	0.4418	0.9494
	20	32	0.1231	0.9640	0.3721	0.9339
	20	64	0.0772	0.9734	0.4292	0.8949

Out of all the experiments in VGG19, we find that opt = SGD, epoch = 10, batch_size = 32 has the highest accuracy, i.e. 0.9964.

Figs. 5 and 6 show that VGG16 is the best model, since its accuracy and validation accuracy are higher than those of the other models at 20 epochs, a batch size of 32, and with Adam as the optimiser.

**Fig. 5** A line graph of the accuracy of the models for different epochs and batch sizes. It plots the fluctuating trends of ResNet, VGG16, and VGG19.

**Fig. 6** A grouped bar graph of accuracy, validation accuracy, and loss. Accuracy has the highest value, 0.9971, for VGG19.

5 Conclusion

In this work, we evaluated different CNN models for classifying medicinal plants and found that, of all the models used, VGG16 performed best in terms of accuracy, loss, validation loss, and validation accuracy. In addition, because a CNN extracts features on its own, it helps make medicinal plant recognition applicable in real life.

In the future, other state-of-the-art models could be used to improve the accuracy. Furthermore, alternative machine learning algorithms could be applied and their performance compared with that of the CNNs evaluated here.

References

Azla MAF, Chua LS, Rahmad FR, Abdullah FI, Alwi SRW (2019). https://doi.org/10.3390/computers8040077
Begue A, Kowlessur V, Singh U, Mahomoodally F, Pudaruth S (2017) Automatic recognition of medicinal plants using machine learning techniques. Int J Adv Comput Sci Appl (IJACSA) 8(4)
Dileep MR, Pournami PN (2019) AyurLeaf: a deep learning approach for classification of medicinal plants. 978-1-7281-1895-6/19/$31.00 ©2019 IEEE
Dudi B, Rajesh V (2019) Medicinal plant based on CNN and machine learning. Int J Adv Trends Comput Sci Eng 8(4):999–1003
Gopal A, Gayatri V, Prudhveeswar Reddy S (2012) Classification of selected medicinal plants leaf using image processing. 978-1-4673-2322-2112/$31.00 ©2012 IEEE
Kan HX, Jin L, Zhou FL (2017) Classification of medicinal plant leaf image based on multi-feature extraction. Pattern Recogn Image Anal 27(3):581–587. ISSN 1054-6618
Pushpanathan K, Hanaf M, Mashohor S, Ilahi WFF (2020) Machine learning in medicinal plants recognition: a review. https://doi.org/10.1007/s10462-020-09847-0
Rajani S, Veena MN (2018) Study on identification and classification of medicinal plants. Int J Adv Sci Eng Technol 6(2)(Spl.Issue-2). ISSN(p) 2321-8991, ISSN(e) 2321-9009, http://iraj.in
Shah MP, Singha S, Awate SP (2017) Leaf classification using marginalized shape context and shape+texture dual-path deep convolutional neural network. 978-1-5090-2175-8/17/2017 IEEE
Venkataraman D, Mangayarkarasi N (2016) Computer vision based feature extraction of leaves for identification of medicinal values of plants. In: 2016 IEEE international conference on computational intelligence and computing research
Vo AH, Dang HT, Nguyen BT, Pham V-H (2019) Vietnamese herbal plant recognition using deep convolutional features. Int J Mach Learn Comput 9(3)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, School of Technology, Assam Don Bosco University, Guwahati, India
Avilie Khate & Bobby Sharma

Corresponding author

Correspondence to Avilie Khate.

Editor information

Editors and Affiliations

Department of CSE, Indian Institute of Technology Guwahati, Guwahati, Assam, India
Jatindra Kumar Deka
Department of Mechanical Engineering, Indian Institute of Technology Guwahati, Guwahati, India
P. S. Robi
Department of Computer Science and Engineering, Assam Don Bosco University, Guwahati, Assam, India
Bobby Sharma

About this paper

Cite this paper

Khate, A., Sharma, B. (2024). Medicinal Plant Classification Using Neural Network. In: Deka, J.K., Robi, P.S., Sharma, B. (eds) Emerging Technology for Sustainable Development. EGTET 2022. Lecture Notes in Electrical Engineering, vol 1061. Springer, Singapore. https://doi.org/10.1007/978-981-99-4362-3_28

DOI: https://doi.org/10.1007/978-981-99-4362-3_28
Published: 01 November 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4361-6
Online ISBN: 978-981-99-4362-3
eBook Packages: Computer Science, Computer Science (R0)
