Introduction

Wood species are typically identified by the physical appearance of the tree, such as the shape of its trunk and leaves and the smell of its bark and flowers. However, once the trees are cut down and processed into timber logs, these distinguishing physical characteristics are no longer available. One characteristic that remains unique to each tree species, even after chemical treatment, is the surface texture. Each wood species exhibits distinctive texture features that may be used to discriminate between species, such as the arrangement of vessels or pores, wood parenchyma or soft tissue, ray parenchyma, fibres, phloem, latex traces and intercellular canals (Wheeler 2011). For instance, different wood species show variations in the size, arrangement, quantity and density of pores on their surfaces. The presence of single, pair and chain pores on a wood surface is shown in Fig. 1, while Fig. 2 shows the wood textures of two species with different pore densities (Menon et al. 1993). The IAWA lists define the internationally standardized features commonly used for wood identification (Ruffinatto et al. 2015).

Fig. 1 Types of pore arrangement on the wood surface

Fig. 2 Examples of wood species with (a) high-density pores and (b) low-density pores

The analysis of wood texture under varying illumination, rotation, scale and view is extremely complex and nonlinear. The wood texture varies even within the same species owing to factors such as location of growth, weather and age, as shown in Fig. 3. In some cases, the textures of different woods may look similar even though they belong to different species, as shown in Fig. 4.

Fig. 3 Two wood samples of the species Calophyllum flavo-ramulum with different quantities of pores: (a) numerous pores, (b) fewer pores

Fig. 4 (a) Shorea laevis, (b) Shorea maxwelliana, (c) Shorea exelliptica, (d) Dryobalanops oblongifolia

Nonetheless, wood species classification is crucial for timber-exporting countries to curb fraudulent labelling of timber species at customs checkpoints. Thus, an automatic wood recognition system is highly desirable for stricter inspections. Over the years, several wood recognition systems have been developed, mainly based on three approaches: spectrum-based, chemometric-based and image-based systems. Spectrum-based methods provide a large amount of information which can be used to identify the wood species, as proposed by Fuentealba et al. (2004), Piuri and Scotti (2010), Baas and Wheeler (2000), Rojas et al. (2011) and Dawson-Andoh and Adedipe (2012). Chemometric-based approaches such as the one proposed by Miranda et al. (2014) have been shown to be useful in analysing and discriminating wood species. However, both spectrum-based and chemometric-based approaches require specialized equipment, highly trained experts and a laboratory set-up, so they are not suitable for real-life use by non-experts. Image-based methods are generally cheaper, simpler and more easily understood by non-experts. Examples of tropical wood recognition systems are those proposed by Khalid et al. (2008), Bremanath et al. (2009), Denzler et al. (2013), Khairuddin et al. (2011) and Yusof et al. (2013). However, these works do not consider the possibility of database expansion. Over time, the database is expected to grow as more wood samples are added. Thus, a smart and reliable approach is needed to manage the growing wood database efficiently.

In this study, a fuzzy-based database management system which emulates human interpretation of wood texture is introduced. First, the wood database is divided into 4 broad categories (or sub-databases) using fuzzy rules based on the pore features of the wood texture. Given a test image of an unknown species, a set of pore features is extracted from the image and used as input to the fuzzy rules to assign the image to one of the four categories. Then another set of texture features is extracted from the image and passed to the support vector machines (SVMs) dedicated to the assigned category, which classify the test image into a particular wood class. When a new wood species is added to the database, the system performs a pre-classification step to assign the new member to one of the four sub-databases. Therefore, only the SVMs of the assigned sub-database need to be retrained. This shortens the training time and emulates the experts’ reasoning when expanding the wood database. Moreover, the use of sub-databases reduces the processing complexity and the classification time. The experimental results show that the proposed system is more robust and easier to handle when more wood species are introduced into the system.

Proposed model for wood classification

As shown in Fig. 5, the fuzzy-based database management splits the large wood database into four sub-databases based on the size of the pores, namely small (S), small-to-medium (SM), medium-to-medium (MM) and large (L). The feature extraction (FE) process extracts wood features from each wood image for the classifier. For each sub-database, a group of dedicated SVMs is trained to classify the wood species of its members based on the texture features extracted from the wood images. Given a test sample, the fuzzy pre-classifier first assigns it to one of the four groups depending on its pore features. Then, the SVM classifier of that group determines the wood species of the test sample based on other texture features extracted from the image.

Fig. 5 Block diagram of the proposed model to classify wood species
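The two-stage structure in Fig. 5 can be illustrated with a minimal sketch. The class names `GroupedWoodDatabase` and `NearestCentroid`, the `pre_classify` callable and the toy feature values are all hypothetical and are not part of the original system. In particular, a nearest-centroid classifier stands in for the per-group SVMs purely to keep the example self-contained. The point of the sketch is only that adding samples triggers retraining of the affected group's classifier and leaves the other groups untouched.

```python
from collections import defaultdict


class NearestCentroid:
    """Stand-in for a per-group SVM (an assumption, not the paper's classifier)."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for features, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(features))
            for i, value in enumerate(features):
                acc[i] += value
            counts[label] += 1
        self.centroids = {
            label: [v / counts[label] for v in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, features):
        def dist2(centroid):
            return sum((a - b) ** 2 for a, b in zip(features, centroid))

        return min(self.centroids, key=lambda label: dist2(self.centroids[label]))


class GroupedWoodDatabase:
    GROUPS = ("S", "SM", "MM", "L")

    def __init__(self, pre_classify):
        self.pre_classify = pre_classify  # callable: pore features -> group name
        self.samples = {group: [] for group in self.GROUPS}
        self.models = {}
        self.retrain_log = []  # records which group classifiers were retrained

    def add_samples(self, pore_features, texture_features, species):
        """Route each sample to a group, then retrain only the touched groups."""
        touched = []
        for pores, texture in zip(pore_features, texture_features):
            group = self.pre_classify(pores)
            self.samples[group].append((texture, species))
            if group not in touched:
                touched.append(group)
        for group in touched:
            X = [texture for texture, _ in self.samples[group]]
            y = [label for _, label in self.samples[group]]
            self.models[group] = NearestCentroid().fit(X, y)
            self.retrain_log.append(group)

    def classify(self, pores, texture):
        """Stage 1: pick the sub-database. Stage 2: classify within it."""
        group = self.pre_classify(pores)
        return group, self.models[group].predict_one(texture)
```

In this sketch each sample is routed individually, so a species whose samples fall into two groups would trigger retraining of both classifiers. Assigning a whole species to a single group is an equally plausible reading of the text.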

Wood image acquisition

The wood samples, in the form of cubes with a side of 1 inch, were prepared by the Forest Research Institute of Malaysia (FRIM). For each species, 100 samples were obtained from 10 different trees at random positions in the cross section of the timber. The sample surfaces were then sanded with sandpapers of increasing grit and finally polished with 600-grit sandpaper. A specially designed portable camera was used to capture the surfaces of the samples at 10 times magnification. The distance between the camera and the wood sample was approximately 10 cm. Each image is 768 × 576 pixels with 256 grey levels at 300 dpi resolution. All wood images are pre-processed using homomorphic filters to sharpen the wood features and flatten lighting variation in the images. A sample of an enhanced image is shown in Fig. 6a. The corresponding binary images showing only the black pores and only the white pores are shown in Fig. 6b and c, respectively.

Fig. 6 (a) Homomorphic image, (b) binary image showing the black pores only and (c) binary image showing the white pores only

Fuzzy-based database management

Fuzzy logic mimics human reasoning in categorizing data sets and is thus suitable for database management of wood samples in this work. Examples of other applications of fuzzy logic include islanding detection in distributed networks by Mokhlis et al. (2012), level detection of epilepsy in EEG signals by Sukanesh and Harikumar (2007), classification of carotid ultrasound images by Molinari et al. (2010), recognition of wood colour by Bombardier and Schmitt (2010), motion information-based video retrieval systems and biometric identification systems. The purpose of managing the wood database is to allow easy addition of wood species to the database and to minimize the associated retraining of the SVM classifiers used. The proposed fuzzy-based database management serves as the clustering mechanism which divides the wood samples into several groups based on pore sizes and quantities in a pre-classification step.

For a set of input images with n feature variables and m image samples, the input data (wood sample features) \(x_{p}\) in the pattern space are represented by the following pattern matrix:

$$x_{p} = \left[ { \, x_{p1} , \, x_{p2} , \ldots , \, x_{pn} } \right],\quad \, p = 1,2, \ldots ,m$$
(1)

These training patterns can be formed into M (M ≤ m) classes. The algorithm is presented below.

For every wood image, the pore sizes are divided into three categories: small, medium and large pores. The measurement of pore size is defined in Table 1 by Menon et al. (1993).

Table 1 Measurement of pore size (Menon et al. 1993)

The pore counting process calculates the number of pores in each pore-size category. The results of the pore counting process (Table 2) are defined as \(x_{p} = [f_{1}, f_{2}, f_{3}]\), where \(f_{1}\) represents the quantity of small pores, \(f_{2}\) the quantity of medium pores and \(f_{3}\) the quantity of large pores. Since a wood sample may contain small, medium and large pores, only the most numerous category (maxPores) is used as the fuzzy input. The maxPores category is the pore size (small, medium or large) with the highest quantity. The size and quantity of maxPores are the inputs to the fuzzy logic pre-classifier, and the group of the wood sample is the output.

Table 2 Example of results of pores counting process for 4 wood species
  • Input size maxPores, \(x_{p1}\) = {small, medium, large}

  • Input quantity maxPores, \(x_{p2}\) = {less, medium, many}

  • Output, M = {group S, group SM, group MM, group L}
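The maxPores selection described above amounts to taking the most numerous of the three pore-size counts. A minimal sketch follows. The function name `max_pores`, the example counts and the tie-breaking order are assumptions, since the text does not say how ties are resolved.

```python
def max_pores(counts):
    """Return (size_label, quantity) for the most numerous pore-size category.

    counts is (f1, f2, f3): the quantities of small, medium and large pores.
    Ties go to the first category in the order small, medium, large
    (an assumption; the source does not specify tie handling).
    """
    labels = ("small", "medium", "large")
    if len(counts) != len(labels):
        raise ValueError("expected three counts: small, medium, large")
    index = max(range(len(labels)), key=lambda i: counts[i])
    return labels[index], counts[index]


# Hypothetical counts for one wood image.
size_label, quantity = max_pores((12, 85, 3))
```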

In this work, the fuzzy inputs are the size and quantity of maxPores of each wood sample (Fig. 7). The membership function parameters are a = 95, b = 105, c = 195, d = 205 for the input size and a = 8, b = 12, c = 95, d = 115 for the input quantity. The parameters for input size are based on the measurement defined by Menon et al. (1993). The parameters for input quantity (the quantity of pores in an image) were determined by adjusting the width of the membership functions to obtain the best result. There are 4 fuzzy outputs, which represent 4 different groups. One group is first created for each pore size: small, medium and large. The medium pores group is then divided into 2 groups based on quantity, since the majority of the wood samples contain medium-sized pores. Hence, the outputs of the fuzzy system are group S, group SM, group MM and group L. Group SM represents wood samples with medium maxPores and fewer than 100 such pores, while group MM represents wood samples with medium maxPores and more than 100 such pores. Four output groups were chosen so that wood experts can also pre-classify the wood samples manually just by considering the pore size and pore quantity. For example, when a wood expert identifies a wood sample in which the majority of pores are small, the sample belongs to group S. With too many groups, it might be difficult for the wood expert to distribute the wood samples manually into their respective groups.

Fig. 7 Membership functions for input (a) size and (b) quantity
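The pre-classification step can be sketched as follows, using the breakpoints a, b, c, d quoted in the text. Several details are assumptions rather than the original design: the shapes of the membership functions (two shoulders and a trapezoid), the use of a numeric pore size as input, the min/max rule combination, and picking the group with the strongest rule. All function names are hypothetical.

```python
SIZE_PARAMS = (95, 105, 195, 205)     # a, b, c, d for maxPores size
QUANTITY_PARAMS = (8, 12, 95, 115)    # a, b, c, d for maxPores quantity


def falling(x, lo, hi):
    """Left shoulder: 1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Right shoulder: 0 below lo, 1 above hi, linear in between."""
    return 1.0 - falling(x, lo, hi)


def fuzzify(x, params):
    """Return memberships in the (low, middle, high) terms of one input."""
    a, b, c, d = params
    low = falling(x, a, b)
    middle = min(rising(x, a, b), falling(x, c, d))  # trapezoid
    high = rising(x, c, d)
    return low, middle, high


def pre_classify(size, quantity):
    """Return (group, rule_strengths) for one wood sample."""
    small, medium, large = fuzzify(size, SIZE_PARAMS)
    less, moderate, many = fuzzify(quantity, QUANTITY_PARAMS)
    strengths = {
        "S": small,                              # small maxPores
        "SM": min(medium, max(less, moderate)),  # medium size, fewer pores
        "MM": min(medium, many),                 # medium size, many pores
        "L": large,                              # large maxPores
    }
    return max(strengths, key=strengths.get), strengths
```

With these assumed shapes, the SM/MM crossover falls at 105 pores (the midpoint of c = 95 and d = 115), which is close to, but not exactly, the threshold of about 100 pores mentioned in the text.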

Feature extraction

A feature fusion technique is implemented in this research, where two feature data sets obtained from two different feature extractors, namely basic grey-level aura matrix (BGLAM) and statistical properties of pores distribution (SPPD), were concatenated into one database. A total of 157 wood features were extracted using the fusion of feature extractors.

In essence, the BGLAMs of an image characterize the co-occurrence probability distributions of grey levels at all possible displacement configurations. The framework of the BGLAM feature extractor is explained in detail in Qin and Yang (2004, 2005, 2007). In the BGLAM framework, an image X is modelled as a finite rectangular lattice S of m × n grids (m is the number of nearest neighbours from the central pixel, while n is the number of grey levels) with a neighbourhood system \(\varvec{N} = \{ N_{s} , \, s \in S\}\), where \(N_{s}\) at site s is a translation of a basic neighbourhood, denoted E, which is called the structuring element of the neighbourhood system N. The basis of all grey-level aura matrices (GLAMs) can be identified from a set of particular GLAMs computed from a single-site neighbourhood system, which are called basic GLAMs (BGLAMs). Qin and Yang (2007) proved that two images of the same size are identical if and only if their corresponding BGLAMs are the same. In this way, an image can be uniquely represented by its BGLAMs. The experimental results in Qin and Yang (2005) show that BGLAMs outperform both symmetric GLAMs and the grey-level co-occurrence matrix (GLCM). The GLCM feature extractor has been applied in previous work and is explained in more detail in Khalid et al. (2008). In this paper, the image is quantized into 16 grey levels with a window size of 7 × 7. The total number of BGLAM features is calculated as \(N_{\text{f}} = \frac{{{\text{GL }}\times \left( {{\text{GL}} + 1} \right)}}{2}\). Hence, for 16 grey levels (GL), the total number of features \(N_{\text{f}}\) is 136.
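The feature count above can be checked directly, since GL × (GL + 1) / 2 is the number of unordered grey-level pairs, including pairs of a level with itself. The short sketch below also shows one plausible way of reducing 256 grey levels to 16. Uniform binning is an assumption here, as the text does not state which quantization scheme was used, and both function names are hypothetical.

```python
def bglam_feature_count(grey_levels):
    """Number of BGLAM features for a given number of grey levels."""
    return grey_levels * (grey_levels + 1) // 2


def quantize(pixel, levels=16, depth=256):
    """Map a pixel in [0, depth) to one of `levels` equal-width bins (assumed scheme)."""
    if not 0 <= pixel < depth:
        raise ValueError("pixel value outside the expected range")
    return pixel * levels // depth


n_features = bglam_feature_count(16)
```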

SPPD extracts statistical properties of the pore distribution in the wood image. This feature extraction method imitates human visual inspection of wood images. Some of the SPPD features, such as the number of pores of different sizes, the distribution of pores and the density of pores, are the same cues that humans use to classify wood species. The features extracted from each wood sample using SPPD are as follows (Yusof et al. 2013):

  1. Mean size of the pores and the corresponding standard deviation (2 features).

  2. Mean distance between pores and the corresponding standard deviation (2 features).

  3. Number of small, medium and large pores (3 features).

  4. Number of pores per square millimetre (1 feature).

  5. Number of pair and solitary pores (2 features).

Using the SPPD feature extractor, 10 features are obtained from the black pores image and 10 features from the white pores image. One further feature, the mean grey level, is obtained from the original image. Hence, a total of 21 features are extracted from each image using the SPPD technique.
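To make the feature bookkeeping concrete, the sketch below computes the 10 SPPD-style features from a list of detected pores and then assembles the fused 157-dimensional vector (136 + 10 + 10 + 1). It is an illustration under several assumptions that do not come from the source: pores are given as (x, y, diameter) tuples, "distance between pores" is taken as nearest-neighbour distance, the size thresholds of 100 and 200 are placeholders for the values in Table 1, and a pore is counted as paired when its nearest neighbour lies within a hypothetical `pair_gap`.

```python
import math
import statistics


def sppd_features(pores, area_mm2, small_max=100.0, large_min=200.0, pair_gap=0.1):
    """Return 10 pore statistics for one binary pore image.

    pores: list of (x, y, diameter) tuples; area_mm2: imaged area.
    Thresholds and the pairing rule are illustrative assumptions.
    """
    if len(pores) < 2:
        raise ValueError("need at least two pores to measure spacing")
    sizes = [diameter for _, _, diameter in pores]
    centres = [(x, y) for x, y, _ in pores]
    # Distance from each pore to its nearest neighbour.
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(centres) if j != i)
        for i, p in enumerate(centres)
    ]
    n_small = sum(d < small_max for d in sizes)
    n_large = sum(d > large_min for d in sizes)
    n_medium = len(sizes) - n_small - n_large
    n_paired = sum(gap <= pair_gap for gap in nearest)  # pores that belong to a pair
    return [
        statistics.mean(sizes), statistics.pstdev(sizes),
        statistics.mean(nearest), statistics.pstdev(nearest),
        n_small, n_medium, n_large,
        len(pores) / area_mm2,
        n_paired, len(pores) - n_paired,
    ]


def fused_vector(bglam, black_pores, white_pores, mean_grey, area_mm2):
    """Concatenate BGLAM features, both SPPD sets and the mean grey level."""
    return (
        list(bglam)
        + sppd_features(black_pores, area_mm2)
        + sppd_features(white_pores, area_mm2)
        + [mean_grey]
    )
```

Passing a 136-element BGLAM vector to `fused_vector` yields the 157 features mentioned earlier, which is a quick consistency check on the counts reported in this section.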

Multi-class support vector machine (SVM) for final classification

Multi-class SVM is an efficient and robust supervised classifier and has been shown to be effective in solving pattern classification problems (Gu et al. 2010). SVM has an advantage over other existing classification techniques in that it provides a global solution for the data classification (Vapnik and Cortes 1995). A radial basis function (RBF) kernel SVM is used in this study to perform multi-class classification of wood species. When the training data are inseparable in the original space, the original input data (wood features) \(x_{i}\) are mapped into a high-dimensional space \(\Phi(x_{i})\), in which the mapped data are linearly separable. The decision rule is as follows (Chang and Lin 2011):

$$f\left( x \right) = \text{sgn} \left( {\sum\limits_{i = 1}^{l} \alpha_{i} y_{i} k\left( {x_{i} ,x} \right) + b} \right),$$
(2)

where \(k(x_{i} ,x)\) is a kernel function, \(y_{i}\) is the class label of training sample \(x_{i}\), l is the number of training samples, b is the bias term and \(\alpha_{i}\) denotes the Lagrange multipliers which describe the optimal separating hyperplane. The RBF is given by \(k(x_{i} ,x_{j} ) = e^{{ - \gamma \left\| {x_{i} - x_{j} } \right\|^{2} }}\), \(\gamma > 0\), where \(\gamma\) is the kernel parameter.
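The decision rule and the RBF kernel can be written out in a few lines for the two-class case. This is a sketch of the formula only, not of the training procedure: the support vectors, multipliers, bias and γ below are hypothetical values chosen by hand, whereas in practice they come from solving the SVM optimization problem. A multi-class classifier would combine several such binary decisions.

```python
import math


def rbf_kernel(xi, xj, gamma):
    """k(xi, xj) = exp(-gamma * ||xi - xj||^2), with gamma > 0."""
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * squared_distance)


def decide(x, support_vectors, labels, alphas, bias, gamma):
    """Binary decision: sign of sum_i alpha_i * y_i * k(x_i, x) + b."""
    score = bias + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hand-picked toy values (hypothetical, not trained).
toy_svs = [[0.0, 0.0], [2.0, 2.0]]
toy_labels = [1, -1]
toy_alphas = [1.0, 1.0]
```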

Experimental results and discussion

To address illegal logging, several automated wood identification systems based on machine vision have been presented to achieve quantifiable, repeatable and reliable pattern recognition results (Hermanson and Wiedenhoeft 2011). Hardwood identification can be based on microscopic features as listed in Wheeler et al. (1989) or on macroscopic features as listed in Ruffinatto et al. (2015). Macroscopic wood identification can be much more limited than microscopic wood identification. For closely related trade timbers, classification based on macroscopic features will probably end with a choice of several likely matches, and safe separation must be left to microscopic study performed by scientific institutions with the necessary laboratory equipment and experienced staff. However, this research focuses on automated identification based on macroscopic features, which emulates the way wood experts inspect wood texture. In addition, the use of a portable camera allows identification to be performed anywhere, for example at customs checkpoints, even by non-experts. The proposed wood recognition system focuses on a tropical wood database and is benchmarked against previous systems which used a similar wood database based on macroscopic features.

Three experiments were conducted using 48 wood species. For each wood species, 100 wood images were captured. Of the 100 images, 70 are used as training samples and 30 as testing samples. The 70 training samples are taken from 7 individuals (trees) randomly chosen from the 10 individuals, while the 30 testing samples are taken from the remaining 3 individuals.
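Splitting by tree rather than by image keeps every image of a given tree on one side of the split, so the test set contains only trees the classifier has never seen. A minimal sketch of such a split for one species is given below. The function name, the `(tree_id, image)` sample layout, the fixed seed and the assumption of 10 images per tree are illustrative and are not taken from the source.

```python
import random


def split_by_tree(samples, n_train_trees=7, seed=0):
    """Split (tree_id, image) pairs so that no tree appears in both sets."""
    trees = sorted({tree for tree, _ in samples})
    if n_train_trees >= len(trees):
        raise ValueError("at least one tree must be left for testing")
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    train_trees = set(rng.sample(trees, n_train_trees))
    train = [sample for sample in samples if sample[0] in train_trees]
    test = [sample for sample in samples if sample[0] not in train_trees]
    return train, test


# Hypothetical species with 10 trees and 10 images per tree.
species_samples = [(tree, f"img_{tree}_{k}") for tree in range(10) for k in range(10)]
train_set, test_set = split_by_tree(species_samples)
```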

The experiments were done in two phases: a training phase and a testing phase. In the training phase, all training wood samples are stored in a training database and labelled with the wood species name. In the testing phase, the test samples of the 48 wood species are also labelled with their species names, and classification is performed using the trained wood database. The original labels of the testing database are kept only to compute the classification accuracy of the wood recognition system.

The proposed wood database management

The training wood samples are distributed into 4 groups based on their pore distribution using the proposed fuzzy logic technique. The resulting wood databases are named group S, group SM, group MM and group L, each corresponding to the pore-size group that wood experts would assign through manual inspection.

The pore quantities of 20 wood samples are plotted in Figs. 8 and 9. Each wood sample may contain 3 pore sizes (small, medium and large). The y-axis represents the quantity of pores for each of the 3 pore sizes. The maxPores category is the pore size that has the highest quantity. For example, group S consists of wood samples that have small pores as maxPores (Fig. 8), whereas group L contains wood samples that have large pores as maxPores (Fig. 9). When working manually, wood experts classify a wood sample based on the pore sizes and quantities visible in the wood texture. Hence, the results show that the proposed technique can mimic the human interpretation of wood texture based on pore size and pore quantity (Fig. 10).

Fig. 8 Graph of quantity of wood pores for 20 wood samples in group S

Fig. 9 Graph of quantity of wood pores for 20 wood samples in group L

Fig. 10 Wood samples from group S: (a) Hopea ferrea, (b) Hopea apiculata

As can be observed visually in Fig. 10, the wood samples in group S contain small pores, whereas the wood samples in group L contain large pores, as shown in Fig. 11.

Fig. 11 Wood samples from group L: (a) Durio spp., (b) Endospermum diadenum

The performance of the proposed database management is computed for each fuzzy group using 1440 test samples (30 test samples for each wood species × 48 species), as shown in Table 3. The total classification error caused by database management is \(\frac{16}{1440} \times 100\;\% = 1.11 \;\%\), which means that out of 1440 test samples, only 16 were assigned to the wrong fuzzy group owing to nonlinear features in the wood texture. The pre-classification error for group S and group L is computed from 100 and 120 test samples, respectively, whereas group SM and group MM have 560 and 660 test samples, respectively. The test samples are thus unevenly distributed across the 4 groups based on their pore features, and the majority of the test samples belong to groups SM and MM.

Table 3 Pre-classification error for each fuzzy group
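The figures quoted for the pre-classification step can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (the group sizes, the 16 misassigned samples and the 48 × 30 test set). The variable names are hypothetical, and the per-group error counts of Table 3 are not reproduced here.

```python
# Test samples per fuzzy group, as stated in the text.
group_test_counts = {"S": 100, "SM": 560, "MM": 660, "L": 120}

total_test_samples = sum(group_test_counts.values())
expected_total = 48 * 30          # 48 species, 30 test images each
misassigned = 16                  # samples routed to the wrong group

pre_classification_error = 100 * misassigned / total_test_samples
share_in_medium_groups = (
    group_test_counts["SM"] + group_test_counts["MM"]
) / total_test_samples
```

The group sizes sum to the full test set, and groups SM and MM together hold roughly 85 % of the test samples, which is consistent with the statement that most samples have medium-sized maxPores.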

Final classification of wood species

This experiment evaluates the capability of the proposed system to classify the test samples accurately based on the trained wood database. With the proposed database management approach, the SVM classifier classified 1418 test samples correctly, which gives a classification accuracy of \(\frac{1418}{1440} \times 100 = 98.5\;\%\) (1.5 % classification error). For comparison, the accuracy of classifying the 48 species without the proposed database management is \(\frac{1417}{1440} \times 100 = 98.4\;\%\). The proposed database management aims to treat the uncertainty and to simplify the classification process, which is expected to result in higher classification accuracy. This is supported by the increase of about 0.1 percentage points (one additional correctly classified test sample) when the database management is included in the system. The result indicates that the proposed fuzzy rules are able to arrange the large wood database efficiently based on pore features. The fuzzy logic rules used in the proposed database management emulate the experts’ inspection of wood pore features, paving the way for better classification accuracy.
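The two accuracies can be recomputed from the counts given above to show what the rounded figures hide. The helper name `accuracy_percent` is hypothetical, and only the counts stated in the text are used.

```python
def accuracy_percent(correct, total):
    """Classification accuracy as a percentage."""
    return 100 * correct / total


with_management = accuracy_percent(1418, 1440)
without_management = accuracy_percent(1417, 1440)
gain_in_points = with_management - without_management
extra_correct_samples = 1418 - 1417
```

Before rounding, the gain is slightly under 0.07 percentage points, corresponding to a single test sample, so the improvement in accuracy is small compared with the reduction in retraining effort.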

Benchmark with previous works

To test the robustness of the proposed database management system, experiments were carried out on 2 databases: a 25-species database and a 48-species database. First, the experiments were done on a small training database consisting of 25 wood species. Then, the number of training wood species was increased to 48. As the size of the database increases, the system’s performance may decline. The smaller the decline, the more robust the system is to database growth.

Table 4 compares the proposed system with previous systems in terms of robustness to increasing database size. The system by Khalid et al. (2008), which used the grey-level co-occurrence matrix (GLCM), shows the highest percentage of performance reduction. The performance reduction was smaller in the systems proposed by Khairuddin et al. (2011) and Yusof et al. (2013), which implemented genetic algorithm (GA) feature selection and kernel-GA nonlinear feature selection, respectively. The proposed system gives the smallest reduction, 0.4 %, compared to previous works. Hence, it can be concluded that the proposed system is more robust to growth of the wood database than previous systems.

Table 4 Comparison of robustness between the proposed system and previous systems

Conclusion

An automated wood species recognition system which incorporates fuzzy-based database management is presented. The fuzzy-based database management emulates the inspection of wood pore features normally performed by human experts in dividing the wood samples into four sub-databases. The advantage of dividing the database into four smaller databases is that when a new wood species is added, the system assigns it to one of the four sub-databases. Thus, only the SVM classifier of the assigned sub-database needs to be retrained instead of those of the entire database. This shortens the training time and emulates the experts’ reasoning when expanding the wood database. Moreover, the use of sub-databases reduces the processing complexity and the classification time. The experimental results show that the proposed system is more robust and easier to handle when more wood species are introduced into the system. Future work includes introducing better features with higher discriminative power, increasing the number of wood species and testing other classifiers.