Abstract
Texture, a crucial aspect of an image, is made up of components that are related to one another. Reliable feature extraction from images therefore benefits from a texture-based classification method. This study proposes an effective method for classifying textures using machine learning (ML) approaches. ML classifiers, a form of artificial intelligence (AI), learn to predict results from data without being explicitly programmed to do so. The proposed study focuses on creating its own dataset in the form of a CSV file by extracting Haralick features (contrast, dissimilarity, homogeneity, energy, and correlation) from the Brodatz texture dataset. Several ML algorithms, namely K-Nearest Neighbor, Decision Tree Classifier, Random Forest Classifier, Gradient Boosting Classifier, and AdaBoost Classifier, are evaluated on the created dataset to classify the Brodatz textures. The proposed approach achieves up to 100% accuracy with less computation time than previous work in the literature.
1 Introduction
Image texture is characterized as a recurring pattern that appears on an object's surface or structure. In computer vision and computer graphics, texture remains a crucial issue with a variety of applications, including synthesis, image comprehension, and image content querying [1, 2]. Texture is a property used to divide images into regions of interest and to categorize those regions. It describes the spatial arrangement of shades or intensities in an image: the spatial patterns of intensity levels within a neighborhood define texture. Because of differences in visual appearance, orientation, or scale, real-world textures are not uniform, which presents a significant challenge for texture analysis [3].
Machine learning, a subclass of artificial intelligence [2, 4], focuses on applying statistical methods to build systems that learn from available data. Machine learning algorithms use computational techniques to “learn” information directly from data, without relying on a predetermined equation as a model. Their performance adapts as more samples become available for learning. Many classifiers are used for classification in machine learning. The experiments in this paper consider five ML classifiers: K-Nearest Neighbors (KNN), Decision Tree (DT), Random Forest (RF), Gradient Boosting, and AdaBoost.
2 Literature Review
Barnat-Hunek et al. [1] presented a texture analysis and ML approach to identify image features of lightweight cementitious composites (LCC). The coatings were modified with nanocellulose, and the extracted features were used to assess the durability of the materials. Sethi et al. [2] compared ML algorithms, choosing the K-Nearest Neighbor (k-NN) and SVM classifiers, and obtained better accuracy with the SVM classifier than with k-NN. Otchere et al. [4] studied the machine learning algorithms most widely used in the petroleum industry for predicting reservoir properties, namely ANN and SVM. ANN yielded the better result; the authors also considered the hybridization of multiple algorithms. Bari Antor et al. [5] used machine learning algorithms to predict Alzheimer’s disease. The algorithms were used to identify dementia among patients in the OASIS dataset, and SVM gave the best result for detecting the disease. Budhi et al. [6] predicted the ratings of online reviews using machine learning algorithms. The authors applied text preprocessing and feature extraction, first evaluated single and ensemble models, and then used the best identified classifier for prediction. A linear support vector classifier gave good results.
Garpebring et al. [7] worked on Haralick texture features for image analysis, examining the effectiveness of density estimation techniques for approximating the GLCM and subsequently computing the related invariant features. Hiremath and Bhusnurmath [3] proposed a structured approach to texture classification using local directional binary patterns and the non-subsampled contourlet transform. Hiremath and Bhusnurmath [8] worked on color texture image classification using the local directional binary pattern and anisotropic diffusion in the RGB color space.
3 Methodology
The proposed approach for image classification consists of the following steps:
- Step 1: Read the Brodatz texture images from the dataset.
- Step 2: Divide each texture image into sub-images of size 64 × 64.
- Step 3: Extract the Haralick features (contrast, dissimilarity, homogeneity, energy, and correlation) from each sub-image.
- Step 4: Create the CSV file from the features extracted in Step 3.
- Step 5: Divide the dataset into training and testing sets in the ratio 80:20.
- Step 6: Train the ML classifier using the training set.
- Step 7: Test the ML classifier using the testing set.
- Step 8: Repeat Steps 6 and 7 for each of the ML classifiers.
- Step 9: Select the best ML classifier in terms of accuracy.
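Step 4 can be sketched with Python's standard csv module. The paper does not state its implementation language or the layout of its CSV file, so the column order, the header names, the label values (D1, D2), and the numeric feature values below are assumptions chosen only for illustration.

```python
import csv
import io

# Assumed column order; the paper names the five features but not the CSV layout.
FEATURE_NAMES = ["contrast", "dissimilarity", "homogeneity", "energy", "correlation"]


def write_feature_csv(rows, handle):
    """Write (feature_vector, label) pairs as CSV rows under a header line."""
    writer = csv.writer(handle)
    writer.writerow(FEATURE_NAMES + ["label"])
    for features, label in rows:
        writer.writerow(list(features) + [label])


def read_feature_csv(handle):
    """Read the CSV back into a list of feature vectors and a list of labels."""
    X, y = [], []
    for record in csv.DictReader(handle):
        X.append([float(record[name]) for name in FEATURE_NAMES])
        y.append(record["label"])
    return X, y


# Hypothetical feature values and labels, for illustration only.
rows = [
    ([0.5, 0.4, 0.8, 0.3, 0.9], "D1"),
    ([2.1, 1.2, 0.4, 0.1, 0.2], "D2"),
]
buffer = io.StringIO(newline="")  # stands in for a file opened with newline=""
write_feature_csv(rows, buffer)
buffer.seek(0)
X, y = read_feature_csv(buffer)
```

Keeping one row per sub-image, with the class label in the last column, lets the same file feed every classifier in Steps 5 to 8.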
3.1 Feature Extraction
Features are used to identify the characteristics of the image textures [9]. Proposed work is carried out using the following features.
3.1.1 Haralick Features
Haralick features are derived from the Gray-Level Co-occurrence Matrix (GLCM). The GLCM records how often each pair of gray levels occurs in two pixels separated by a given offset (for example, adjacent pixels) in an image [9]. The five Haralick features extracted for the proposed work are explained below.
- Contrast: Measures the local intensity variation, that is, the difference in gray level between a pixel and its neighbor, accumulated over the image.
- Dissimilarity: Measures how different the gray levels of neighboring pixels are, weighting each pair by its absolute gray-level difference.
- Homogeneity: Measures how close the GLCM entries lie to the diagonal; it is high for regions in which the intensity changes little.
- Energy: Measures the uniformity of the gray-level distribution; it is high when a few gray-level pairs dominate.
- Correlation: Measures the linear dependency between the gray levels of neighboring pixels.
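The five features can be computed from a normalized GLCM as in the following minimal sketch. It uses the commonly used GLCM property definitions (energy taken as the square root of the angular second moment); the paper does not state the offset, the number of gray levels, or whether a symmetric matrix was used, so the defaults `dx=1`, `dy=0`, `symmetric=True` and the tiny test image are assumptions.

```python
import math


def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Normalized co-occurrence matrix of gray-level pairs at offset (dx, dy)."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(map(sum, counts))
    if total == 0:
        raise ValueError("image has no pixel pair at this offset")
    return [[v / total for v in row] for row in counts]


def haralick_features(p):
    """Contrast, dissimilarity, homogeneity, energy, correlation of a GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "dissimilarity": sum(v * abs(i - j) for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "energy": math.sqrt(sum(v * v for _, _, v in cells)),
        # Convention assumed here: a constant region gets correlation 1.
        "correlation": cov / denom if denom > 0 else 1.0,
    }


# Tiny two-level striped image, used only to exercise the functions.
stripes = [[0, 1, 0, 1], [0, 1, 0, 1]]
features = haralick_features(glcm(stripes, levels=2))
```

For the striped image every horizontal neighbor pair differs by one gray level, so contrast and dissimilarity are maximal for two levels and the correlation is strongly negative.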
3.2 Classification Algorithms
For the proposed approach, the following machine learning algorithms are used:
- AdaBoost Classifier.
- Gradient Boosting Classifier.
- Random Forest Classifier.
- K-Neighbors Classifier.
- Decision Tree Classifier.
3.2.1 AdaBoost Classifier
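The AdaBoost Classifier [10] is a supervised machine learning algorithm used primarily for classification, and also for regression problems. It trains a sequence of weak learners, re-weighting the training samples after each round so that later learners concentrate on the samples that were misclassified earlier.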
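The idea behind AdaBoost can be illustrated with a minimal binary sketch that boosts decision stumps on labels +1 and -1. This is a simplified stand-in and not the configuration used in the paper, whose problem is multi-class and whose settings are not stated; the function names, the number of rounds, and the toy data are assumptions.

```python
import math


def stump_predict(stump, row):
    """A stump predicts `polarity` above the threshold and `-polarity` below."""
    feature, threshold, polarity = stump
    return polarity if row[feature] > threshold else -polarity


def fit_weighted_stump(X, y, w):
    """Return (error, feature, threshold, polarity) with the lowest weighted error."""
    best = None
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        # A threshold below the minimum gives a constant prediction.
        thresholds = [values[0] - 1.0] + [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict((f, t, polarity), row) != yi)
                if best is None or err < best[0]:
                    best = (err, f, t, polarity)
    return best


def fit_adaboost(X, y, n_rounds=40):
    """Train a list of (alpha, stump) pairs on labels in {+1, -1}."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, f, t, polarity = fit_weighted_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        stump = (f, t, polarity)
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the others.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict_adaboost(ensemble, row):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1


# Toy data: the -1 class sits in the middle, so no single stump separates it.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1, 1, -1, -1, 1, 1]
ensemble = fit_adaboost(X, y)
predictions = [predict_adaboost(ensemble, row) for row in X]
```

No single threshold fits the toy labels, but the weighted combination of several stumps does, which is the point of the re-weighting step.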
3.2.2 Gradient Boosting Classifier
The Gradient Boosting Classifier [4, 10] is also a proven, strong method used for both classification and regression problems. It is an ensemble technique that combines weak models to produce a strong model.
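How weak models are combined can be sketched with least-squares gradient boosting of regression stumps: each round fits a stump to the current residuals (the negative gradient of the squared error) and adds a damped copy of it to the model. This is a simplified binary illustration on targets +1 and -1, classified by the sign of the score; the loss, learning rate, number of rounds, and toy data are assumptions and are not taken from the paper.

```python
def fit_regression_stump(X, residuals):
    """Least-squares stump: (feature, threshold, left_value, right_value)."""
    mean = sum(residuals) / len(residuals)
    best = (0, float("inf"), mean, mean)  # fallback: constant prediction
    best_sse = sum((r - mean) ** 2 for r in residuals)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[f] <= t]
            right = [r for row, r in zip(X, residuals) if row[f] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best_sse:
                best, best_sse = (f, t, lm, rm), sse
    return best


def stump_value(stump, row):
    feature, threshold, left_value, right_value = stump
    return left_value if row[feature] <= threshold else right_value


def fit_gradient_boosting(X, y, n_rounds=50, learning_rate=0.5):
    """Return (base, learning_rate, stumps) fitted to numeric targets y."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # Residuals are the negative gradient of 0.5 * (y - pred) ** 2.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_regression_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_value(stump, row) for row, pi in zip(X, pred)]
    return base, learning_rate, stumps


def predict_score(model, row):
    """Real-valued score; its sign gives the predicted class."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_value(s, row) for s in stumps)


# Toy data with the -1 class in the middle of the feature range.
X = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
y = [1.0, 1.0, -1.0, -1.0, 1.0, 1.0]
model = fit_gradient_boosting(X, y)
training_sse = sum((yi - predict_score(model, row)) ** 2 for row, yi in zip(X, y))
```

Each round can only lower the training squared error, because the stump is the least-squares fit to the residuals and the step is damped by the learning rate.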
3.2.3 Random Forest Classifier
Random Forest [3, 10] is a machine learning algorithm used for regression and classification problems. It is a meta-estimator consisting of a collection of many decision trees, each built from a randomly chosen subset of the training set.
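The two sources of randomness, a bootstrap sample of the training set and a random choice of feature, can be illustrated with the following sketch. To keep it short it grows depth-one trees (stumps) on a single random feature each, which is a deliberate simplification of a full random forest; the number of trees, the seed, and the toy data are assumptions.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def fit_stump(X, y, feature):
    """Best threshold on one feature: (feature, threshold, left_label, right_label)."""
    fallback = majority(y)
    best = (feature, float("inf"), fallback, fallback)  # constant prediction
    best_errors = sum(label != fallback for label in y)
    values = sorted(set(row[feature] for row in X))
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [label for row, label in zip(X, y) if row[feature] <= t]
        right = [label for row, label in zip(X, y) if row[feature] > t]
        left_label, right_label = majority(left), majority(right)
        errors = (sum(label != left_label for label in left)
                  + sum(label != right_label for label in right))
        if errors < best_errors:
            best, best_errors = (feature, t, left_label, right_label), errors
    return best


def fit_forest(X, y, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample and one randomly chosen feature."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.randrange(len(X[0]))
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    votes = Counter(left if row[f] <= t else right for f, t, left, right in forest)
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters with hypothetical labels "a" and "b".
X = [[0, 0], [1, 1], [0, 1], [1, 0], [10, 10], [11, 11], [10, 11], [11, 10]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
forest = fit_forest(X, y)
```

Individual trees may be poor when their bootstrap sample is unlucky; the majority vote is what makes the ensemble stable.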
3.2.4 K-Neighbors Classifier
The K-Neighbors Classifier [8, 9] is a non-parametric supervised learning classifier used for classification or prediction. It works by computing the distance between a query sample and the training examples and assigning the class that is most common among the nearest neighbors. Here, K defines the number of nearest neighbors considered.
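The neighbor vote can be written in a few lines. The sketch below assumes Euclidean distance and K = 3; the paper states neither the distance measure nor the value of K it used, and the toy points and labels are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Return the most common label among the k training points nearest to query."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy two-feature training set with hypothetical labels.
train_X = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
train_y = ["a", "a", "b", "b"]
near_a = knn_predict(train_X, train_y, [0.2, 0.1])
near_b = knn_predict(train_X, train_y, [5.5, 5.0])
```

Because there is no training phase beyond storing the samples, the cost of the method falls on prediction, where every query is compared with every stored sample.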
3.2.5 Decision Tree Classifier
A Decision Tree Classifier [15] builds its classification model in the form of a tree. Each internal node tests one attribute, and each leaf holds one predicted class. The data are split repeatedly according to some parameters. The tree has two main parts: decision nodes and leaf nodes.
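The repeated splitting can be sketched as follows, using Gini impurity as the split criterion. The criterion, the depth limit, and the toy data are assumptions made for illustration; the paper does not state the settings of its decision tree.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(X, y):
    """Return (feature, threshold, weighted_gini) of the best binary split, or None."""
    best = None
    n = len(y)
    for f in range(len(X[0])):
        values = sorted(set(row[f] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [label for row, label in zip(X, y) if row[f] <= t]
            right = [label for row, label in zip(X, y) if row[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Return a leaf label, or a decision node (feature, threshold, left, right)."""
    split = None if len(set(y)) == 1 or depth == max_depth else best_split(X, y)
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority class
    f, t, _ = split
    left = [(row, label) for row, label in zip(X, y) if row[f] <= t]
    right = [(row, label) for row, label in zip(X, y) if row[f] > t]
    return (
        f,
        t,
        build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
        build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth),
    )


def predict_tree(tree, row):
    """Walk from the root to a leaf, following the threshold tests."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


# Toy data with hypothetical labels; feature 0 separates the two classes.
X = [[1.0, 7.0], [2.0, 3.0], [8.0, 6.0], [9.0, 2.0]]
y = ["a", "a", "b", "b"]
tree = build_tree(X, y)
```

On the toy data a single test on feature 0 yields two pure leaves, so the tree stops after one decision node.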
4 Data Collection
4.1 Dataset Preparation
The proposed approach is evaluated on two Brodatz datasets of texture images. The first dataset, termed Brodatz-1, consists of 1600 sub-images from 16 images of the Brodatz texture dataset. The second dataset, termed Brodatz-2, consists of 11,100 sub-images derived from the 111 images of the Brodatz texture album. The images are grayscale, in .gif format, and without rotation. The Brodatz texture dataset contains 111 texture images, each of size 640 × 640 pixels. For both datasets, each image is divided into 100 non-overlapping sub-images of size 64 × 64 pixels. The 16 images chosen for the Brodatz-1 dataset are shown in Fig. 1. A detailed description of the datasets is given in Table 1.
Table 1 shows the detailed description of two Brodatz texture datasets used for the proposed work.
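The arithmetic of the dataset preparation (a 640 × 640 image yields 10 × 10 = 100 non-overlapping 64 × 64 sub-images) can be checked with a short sketch. The image is represented here as a plain list of pixel rows, and the synthetic pixel values are a stand-in for a real Brodatz image; both are assumptions for illustration.

```python
def split_into_patches(image, patch_size=64):
    """Cut an image (list of pixel rows) into non-overlapping square patches."""
    rows, cols = len(image), len(image[0])
    patches = []
    for top in range(0, rows - patch_size + 1, patch_size):
        for left in range(0, cols - patch_size + 1, patch_size):
            patches.append(
                [row[left:left + patch_size] for row in image[top:top + patch_size]]
            )
    return patches


# Synthetic 640 x 640 grayscale image standing in for one Brodatz texture.
image = [[(r * 7 + c * 13) % 256 for c in range(640)] for r in range(640)]
patches = split_into_patches(image)

# 16 images give 16 * 100 = 1600 sub-images; 111 images give 11,100.
patches_per_image = len(patches)
```

Patches are produced row by row, so the patch at index 11 starts at pixel row 64 and pixel column 64 of the source image.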
4.2 Randomizations and Splitting the Data
The dataset described in Sect. 4.1 is randomized and split into training and testing sets in the ratio 80:20, respectively.
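A minimal sketch of the randomization and the 80:20 split is given below. It shuffles the sample indices with a fixed seed and takes the first 20% as the testing set; whether the paper stratified the split by class is not stated, so the plain shuffle, the seed value, and the placeholder data are assumptions.

```python
import random


def train_test_split(X, y, test_fraction=0.2, seed=42):
    """Shuffle the samples and return X_train, X_test, y_train, y_test."""
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split repeatable
    n_test = round(len(X) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [X[i] for i in train_idx],
        [X[i] for i in test_idx],
        [y[i] for i in train_idx],
        [y[i] for i in test_idx],
    )


# Placeholder data with the size of Brodatz-1: 1600 samples in 16 classes.
X = [[float(i)] for i in range(1600)]
y = [i // 100 for i in range(1600)]
X_train, X_test, y_train, y_test = train_test_split(X, y)
```

For Brodatz-1 this gives 1280 training and 320 testing samples; for Brodatz-2 the same ratio gives 8880 and 2220.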
5 Results and Discussion
The proposed work is carried out on an Intel Core i3 processor running at 2.40 GHz with 4 GB RAM and the Windows 10 operating system.
The ML classifiers discussed in Sect. 3.2 are evaluated, and the results are as follows. Figures 2, 3, 4, 5, and 6 show the results on the Brodatz-1 dataset (1600 texture images) as confusion matrices for the AdaBoost Classifier, Gradient Boosting Classifier, Random Forest Classifier (RFC), K-Nearest Neighbor Classifier (KNN), and Decision Tree Classifier (DTC), respectively. Figure 7 shows a bar chart comparing all five classifiers.
Table 2 shows the results of all created classifiers during this experiment.
The figures above show the predictions of all the classifiers. From the results, we conclude that the best result is obtained by the K-Nearest Neighbor Classifier, with a classification accuracy of 100%, followed by the Gradient Boosting Classifier with 99%; these two are the best fit for the features extracted from the Brodatz texture dataset.
Table 2 lists the classification accuracy and the classification metrics precision, recall, and F1-score of each classifier on the Brodatz-1 dataset (1600 texture images). The training and testing times of all the classifiers are also tabulated.
From Table 2, it is observed that the K-Nearest Neighbor Classifier performs best, with 100% accuracy and less computation time, whereas the Gradient Boosting Classifier achieves an accuracy of 99%, the Random Forest Classifier 64%, and the Decision Tree Classifier 65%.
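For reference, the metrics named above follow directly from a confusion matrix, as in this short sketch. The label names and the predictions are hypothetical and are not results from the paper.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(map(sum, matrix))
    return sum(matrix[k][k] for k in range(len(matrix))) / total


def class_metrics(matrix, k):
    """Precision, recall, and F1-score of class index k."""
    tp = matrix[k][k]
    fp = sum(row[k] for row in matrix) - tp  # predicted k, but another class
    fn = sum(matrix[k]) - tp                 # class k, but predicted otherwise
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels and predictions, for illustration only.
labels = ["a", "b", "c"]
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "b", "c"]
matrix = confusion_matrix(y_true, y_pred, labels)
overall_accuracy = accuracy(matrix)
precision_a, recall_a, f1_a = class_metrics(matrix, 0)
```

Per-class values such as these are usually averaged over the classes to give the single precision, recall, and F1 figures reported per classifier.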
Table 3 shows the results of Brodatz-2 (11,100 texture images) texture dataset.
It is observed from Table 3 that the K-Neighbors Classifier and the Gradient Boosting Classifier are the best fit for the proposed work, with 100% accuracy.
Performance on the Brodatz dataset is evaluated in terms of accuracy. The proposed approach gives a better classification rate than some state-of-the-art approaches (Table 4).
6 Conclusion and Future Work
The proposed study focuses on creating a new dataset, in the form of a CSV file, from the Brodatz texture dataset through Haralick feature extraction. Experiments are carried out on two datasets: one built from 16 Brodatz texture images and the other from 111 Brodatz texture images. Different machine learning classifiers (AdaBoost Classifier (ABC), Gradient Boosting Classifier (GBC), Random Forest (RF) Classifier, K-Nearest-Neighbors (KNN) Classifier, and Decision Tree (DT) Classifier) are evaluated on both datasets to classify the Brodatz textures. On the created dataset, the K-Nearest Neighbor and Gradient Boosting classifiers performed best. Future work can apply deep learning techniques to obtain better accuracy.
References
Barnat-Hunek D, Omiotek Z, Szafraniec M, Dzierżak R (2021) An integrated texture analysis and machine learning approach for durability assessment of lightweight cement composites with hydrophobic coatings modified by nanocellulose. Measurement 179:109538
Sethi K, Gupta A, Gupta G, Jaiswal V (2019) Comparative analysis of machine learning algorithms on different datasets. In: Circulation in computer science international conference on innovations in computing (ICIC 2017), vol 87
Hiremath PS, Bhusnurmath RA (2014) A novel approach to texture classification using NSCT and LDBP. Int J Comput Appl 0975-8887
Otchere DA, Ganat TOA, Gholami R, Ridha S (2021) Application of supervised machine learning paradigms in the prediction of petroleum reservoir properties: comparative analysis of ANN and SVM models. J Petrol Sci Eng 200:108182
Bari Antor M, Jamil AHM, Mamtaz M, Monirujjaman Khan M, Aljahdali S, Kaur M, ... Masud M (2021) A comparative analysis of machine learning algorithms to predict Alzheimer’s disease. J Healthc Eng 2021
Budhi GS, Chiong R, Pranata I, Hu Z (2021) Using machine learning to predict the sentiment of online reviews: a new framework for comparative analysis. Arch Comput Methods Eng 28(4):2543–2566
Garpebring A, Brynolfsson P, Kuess P, Georg D, Helbich TH, Nyholm T, Löfstedt T (2018) Density estimation of grey-level co-occurrence matrices for image texture analysis. Phys Med Biol 63(19):195017
Hiremath PS, Bhusnurmath RA (2014) Texture classification using anisotropic diffusion and local directional binary pattern co-occurrence matrix. In: Proceedings of 2nd International conference on emerging research in computing, information, communication and applications (ERCICA 2014), vol 2, pp 763–769
Alharan AF, Fatlawi HK, Ali NS (2019) A cluster-based feature selection method for image texture classification. Indonesian J Electr Eng Comput Sci 14(3):1433–1442
Patel DR, Vakharia V, Kiran MB (2019) Texture classification of machined surfaces using image processing and machine learning techniques. FME Trans 47(4):865–872
Armi L, Fekri-Ershad S (2019) Texture image analysis and texture classification methods—a review. arXiv preprint arXiv:1904.06554
Rao MS, Reddy BE, Kadiyala R, Prasanna K, Singh S (2021) Texture classification using Minkowski distance measure-based clustering for feature selection. J Electron Imaging 31(4):041204
Brodatz P (1966) Textures: a photographic album for artists and designers. Dover Publications, New York
Acknowledgements
Authors are thankful to the reviewers for all the suggestions and constructive criticism that helped to enhance the quality of the manuscript.
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Bhusnurmath, R.A., Doddamani, S. (2024). Texture Feature Extraction and Classification Using Machine Learning Techniques. In: Shetty, N.R., Prasad, N.H., Nalini, N. (eds) Advances in Computing and Information. ERCICA 2023. Lecture Notes in Electrical Engineering, vol 1104. Springer, Singapore. https://doi.org/10.1007/978-981-99-7622-5_35
DOI: https://doi.org/10.1007/978-981-99-7622-5_35
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7621-8
Online ISBN: 978-981-99-7622-5
eBook Packages: Engineering, Engineering (R0)