1 Introduction

Illegal logging has been identified as a major cause of deforestation and loss of valuable biodiversity, ecosystem, resources and other assets associated with the tropical rainforest. Despite tighter conservation regulations, demand in wood products has continued to increase due to growing population. Therefore, intelligent systems for the recognition of wood species have been proposed and developed as one of the measures to curb illegal logging. The systems should be able to identify the timber species accurately and approximate the age of the tree for a particular log.

Normally, experts identify the wood species based on the pattern of the wood surface texture (Menon et al. 1993). Experts who examine wood surface texture scrutinize several important characteristics of the wood such as the arrangement of vessel or pores, wood parenchyma or soft tissue, rays parenchyma, fibers, phloem, latex traces and intercellular canals. For instance, different wood species will have vessels of various size, arrangement, quantity and density on their surfaces.

The IAWA lists define the internationally standardized features for wood identification (Ruffinatto et al. 2015). This process of manual inspection is tedious and time consuming. Hence, several intelligent systems for the recognition of wood species have been developed based on two approaches: spectrum based and image based systems.

Spectral analysis is one of the most useful techniques in modern science. The method is based on the measurement of internal properties of products obtained by using nondestructive techniques in transmission mode such as γ-rays, X-rays, microwaves or ultrasound. This approach allows thorough inspection of the wood characteristics since internal features of wood are embedded within. The macroscopic wood features that are analyzed using spectrum analysis method provide a lot of information that can be used to identify the wood species efficiently (Fuentalba et al. 2004; Choffel 1999; Piuri and Scotti 2010; Bass and; Wheeler 2000; Rojas et al. 2011). Despite the efficiency of spectrum-based approaches, they are normally time consuming and require special equipment, settings and well trained experts who can interpret the data. Additionally, these techniques are more suitable to be carried out in laboratories rather than on-site in real applications.

Image based wood species classification is simpler because wood surface cross section can be examined by human naked eye with the aid of a magnifying glass. The wood texture is examined from the timber surface using a magnifier with 10× magnification. Several automatic tropical wood recognition systems have been developed based on image analysis to furnish quantifiable, repeatable and reliable results. Examples are in the works of Hermanson and Wiedenhoeft 2011; Tou et al. 2007, 2008; Khalid et al. 2008; Bremanath et al. 2009 and Filho at al. 2010. Approaches based on image analysis normally extract distinctive textural features from the grey level images of the wood surfaces to identify and discriminate the tree species (Wheeler 2011). Myriad of textural features can be used to classify wood textures and they include statistical, structural, transform based, model based, multi resolution and numerous other ad-hoc features.

Statistical features are inherent to the distribution of grey levels in images and they are widely used to identify textural patterns (Peng et al. 2015). Generally, different wood species have different statistical features that determine the wood types. Image based wood recognition systems are usually designed to emulate visual inspection of wood textures by human experts. Fuzzy logic can be incorporated into the system to provide linguistic interpretation normally used by experts when handling approximate data for vaguely defined problems (Anninou and Groumpos 2014; Batuwita et al. 2010). For wood texture recognition involving many classes, fuzzy model is especially useful in initial data clustering (grouping) before the final classification of the wood species is made.

In this paper, a cascaded wood species classification method is proposed based on fuzzy logic and statistical features that imitate the way human experts perform inference when analyzing wood texture. First, vessel features are used to form a broad initial grouping of the wood types. Then, a neural network classifier is used with the statistical features to further classify the wood types to arrive at the final classification of the wood species.

2 Materials and methods

2.1 Image acquisition of the wood surface texture

The first step in the proposed wood species recognition system is the image acquisition of the wood surface texture. One of the characteristics that remain unique to each wood species even after undergoing the chemical procedures is the surface texture. Different wood species will have variations in size, arrangement, quantity and density of vessels on the wood surface. These features are very important and can be used to identify the wood species. One of the important features used to identify a wood species is the vessel density. Certain wood species might have numerous vessels while some wood species can have very few vessels on the wood surface. Vessel density is determined based on the quantity of vessels per square millimeters on the wood surface. Figure 1 presents two different samples from two different wood species.

Fig. 1
figure 1

Example of wood surface with a high density and b low density

The samples are wood cubes whose side is approximately 1 inch in dimension. The treatment of the wood samples was done by sanding the wood surfaces. A specially designed portable camera is used to capture wood texture images at 10 times magnification. The camera is equipped with a systematic focusing function whereby the distance between the camera and the wood sample is fixed to 10 cm. The housing of the camera is made of a tube based on theoretical optimal object distance and hence, the object is just required to be laid against the camera housing. The “tube function” is made in opaque material in order to cut all ambient light and its fluctuations. The size of each image is 768 × 576 pixels.

The original wood image is enhanced using homomorphic filtering technique to enhance the image presentation. Homomorphic filtering technique reduces all unwanted illumination and reflectance on the image (Woods and Gonzalez 2008). Homomorphic filtering uses a linear filter to do non-linear mapping to a different domain and later it was mapped back to the original domain. The algorithm reduces all unwanted illumination and reflectance on the image. Image brightness was also normalized, thus enhanced the contrast of the image. The algorithm on homomorphic filtering is explained in more detail in Woods and Gonzalez (2008). The summary of homomorphic filtering process is shown in Fig. 2.

Fig. 2
figure 2

The flowchart of homomorphic filtering process where ln natural log, DFT discrete fourier transform, IDFT inverse discrete Fourier Transform and exp exponential

Figure 3 shows the original image and filtered image after undergoing the homomorphic filtering process.

Fig. 3
figure 3

a Original image of wood sample, b homomorphic image of wood sample

2.2 Proposed feature extraction

Vessels appearances are considered as one of the significant features that can be used by experts to identify the wood species. This is because vessels are exclusive for every species. Therefore, this research will focus on size of vessels, quantity of vessels, types of vessels and arrangement of vessels on the wood texture. There are 24 features extracted from each wood image by using the proposed statistical feature extractor.

The statistical feature extraction process consists of two steps namely vessels extraction and fuzzy vessels management (Fig. 4). The statistical features will only allow distinct pores to be acknowledged as characteristics of a wood species. Then, a neural network classifier is used to classify the wood species based on the statistical wood features.

Fig. 4
figure 4

Flowchart of the proposed wood recognition system

The first step in the proposed statistical feature extraction is the vessels extraction process. Basically, binary images are created from the homomorphic filtered image where only black pores are present and only white pores are present as shown in Fig. 5.

Fig. 5
figure 5

a Homomorphic image, b binary images showing the black vessels only and c binary images showing the white vessels only

The binary images are created by using the default binarization based on Otsu’s method for optimal threshold selection. A threshold filter displays each pixel of an image in only one of two states, black or white. That state is set according to a particular threshold value. If the pixel’s brightness is greater than the threshold, the pixel is colored white, if less than, black (Ramli et al. 2015). This thresholding technique stores the intensities of the pixels in an array. The threshold is calculated by using total mean and variance. Based on this threshold value, each pixel is set to either 0 or 1. Therefore, the change of image takes place only once.

The following formulas are used to calculate the total mean and variance. The pixels are divided into 2 classes, C1 with gray levels [1,…, t] and C2 with gray levels [t + 1,…, L]. The probability distribution for the two classes is:

$$~{{C}_{1}}=~\frac{{{p}_{1}}}{{{w}_{1}}}\left( t \right),\ldots ,~\frac{{{p}_{t}}}{{{w}_{1}}}\left( t \right)\ \text{and}~$$
(1)
$$~{{C}_{2}}=~\frac{{{p}_{t+1}}}{{{w}_{2}}}\left( t \right),~\ldots ,~\frac{{{p}_{L}}}{{{w}_{2}}}\left( t \right)~$$
(2)

where w 1 (t) = \(\sum\nolimits_{i=1}^{t} {p}_{i}\) and w 2 (t) = \(\sum\nolimits_{i=t+1}^t{L}\,{{p}_{i}}\). The means for the two classes are

$${{\mu }_{1}}=~\frac{\mathop{\sum }_{i=t}i{{p}_{i}}}{{{w}_{1}}(t)}\ \text{and}\ {{\mu }_{2}}=~\frac{\mathop{\sum }_{i=t+1}^{L}i{{p}_{i}}}{{{w}_{2}}(t)}$$
(3)

Using discriminant analysis, Otsu defined the between-class variance of the threshold image as

$$\sigma _{B}^{2}=~{{w}_{1}}{{({{\mu }_{1}}-{{\mu }_{t}})}^{2}}+{{w}_{2}}{{({{\mu }_{2}}-{{\mu }_{t}})}^{2}}$$
(4)

For bi-level thresholding, Otsu verified that the optimal threshold t* is chosen so that the between-class variance is maximized; that is t* = \(Arg~Ma{{x}_{1<t<L}}~\{\sigma _{B}^{2}~\left( t \right)\}\).

After thresholding, the area of a detected region is approximated by the number of pixels it contains. Then the region is labelled automatically before its statistical parameters can be computed. Labelling is necessary to avoid a detected one from being analyzed twice.

Let \(\hat{\sigma }\) represent the estimated standard deviation of all the void area, \(\hat{\mu }\) represent the estimated mean of all the void area, and Si represent the area of the ith void. The ith void is defined as a pore if:

$$\frac{{{s}_{i}}-\hat{\mu }~}{\hat{\sigma }\text{ }\!\!~\!\!\text{ }}>\theta ,$$
(5)
$$\hat{\mu }=~\frac{1}{n}\underset{i=1}{\overset{n}{\mathop \sum }}\,\text{Si}$$
(6)
$$\hat{\sigma }=\frac{1}{n}\sum\limits_{i=1}^{n} \,{{\left( \text{ }\!\!~\!\!\text{ }{{S}_{i}}-\hat{\mu } \right)}^{2}}$$
(7)

where \(\theta\) is a weight used to adjust the threshold of the void area.

In order to establish the cardinal directions within each image, the special properties of Gabor filters were used to estimate the best directionality distribution of wood vessels. A 2-dimension (2D) Gabor filter is capable of detecting boundaries within a 2D image I(x,y) which can be formed by convolving the Gabor function with the image on a pixel basis, so that the filter output at an orientation \(\theta\) can be represented by (Gdyczynski and Manbachi 2014):

$$V(x,y:\theta )\text{ }=\text{ }G(x,y:\theta ) * I\left( x,y \right)$$
(8)

where * represents a 2D convolution. When such a function is convolved at various orientations to an image function, the highest Gabor coefficient output is obtained when the edges are perpendicular to the angle of the filter. Hence, by scanning the wood image over a range of filter angles (0°, 45°, 90°, 135°, 180°), the angles at which the edges occur are determined.

Then features related to the statistical properties of vessels are extracted from both binary images for each wood image. The second step in the proposed statistical feature extraction is the fuzzy management process. The fuzzy algorithm is explained in Kuncheva (2000). Let Ω = {ω1,…, ωc} be a set of class labels. Let x = [x1,…,xn]T \(\in\) Rn be a vector describing an object such as an image. Each component x expresses the value of a feature such as size, quantity, length, etc. The fuzzy management system is a mapping of D : Rn → Ω. The canonical model of the fuzzy system shall be considered as a black box at the input where x is submitted and the output produces values of c discriminant functions g 1(x),…,g c(x), expressing the support for the respective classes. The maximum membership rule assigns x to the class with the highest support. Fuzzy if-else rules are employed to categorize the vessels from both binary images into several features such as sizes of vessels, types of vessels and vessels arrangements.

Suppose a set of input images from a wood database with n-feature variables and m image samples. The input data (wood sample features) x p on the pattern space is represented by the following pattern matrix: x p  = [x p1 , x p2 , …, x pn ], p = 1,2,…,m. These training patterns are classified into M (M \(\le\) m) classes. In this paper, the wood data has 24 feature variables with 2100 training wood data and 30 classes, that is, n = 24, m = 2100, and M = 30. The 24 feature variables consist of statistical features such as sizes of vessels, types of vessels and vessel arrangements which are extracted from both binary images (black vessels only and white vessels only).

2.2.1 Sizes of vessels

The vessel size is also used to distinctively define the wood species. The flowchart of the statistical feature extraction for vessels sizes is presented in Fig. 6. The measurement of vessel size is defined by Menon et al. (1993). The size of vessels is measured based on the tangential of vessels.

Fig. 6
figure 6

Flowchart of the statistical feature extraction process for vessels sizes

The grouping model is obtained using fuzzy IF-THEN rules can be written as below:

IF x p1 is A j1 (x p1) THEN x p1 belongs to group M.

Where all variables are defined as follows:

x p1 input tangential of vessel of a wood sample.

A j1(x p1) antecedent linguistic membership value.

Group M consequent group.

The vessels are divided into 3 categories by using fuzzy rules: small vessels, medium vessels and large vessels as below.

  • IF the tangential of vessel is below 100 μm, THEN the vessel is considered as small vessel.

  • IF the tangential of vessel is between 100 and 200 μm, THEN the vessel is considered as medium vessel.

  • IF the tangential of vessel is above 200 μm, THEN the vessel is considered as large vessel.

The membership function (MF) deals with mapping an input space to a membership value between 0 and 1. The membership function μA (x) describes the membership of the elements x of the base set X in the fuzzy set A, whereby for μ A (x) a large class of functions can be taken. Trapezoidal membership function is applied as shown in Fig. 7. For example: the linguistic label “medium” is associated with a possibility distribution defined by four parameters (a, b, c, d).

Fig. 7
figure 7

Membership function to determine the size of pores

The parameters of a membership function as shown in Fig. 7 are defined as a = 95, b = 105, c = 195 and d = 205, respectively. The parameters are determined based on the measurement defined by Menon et al. (1993). Each wood image may consist of 3 types of vessels sizes: small vessels, medium vessels and large vessels. Hence, the quantity of vessels for each vessel size is computed for each wood sample. For example, x1 = v[p1 p2 p3] where x1 represents a wood sample, p1 represents the quantity of small vessels extracted from the wood image, p2 represents the quantity of medium vessels extracted from the wood image, and p3 represents the quantity of large vessels extracted from the wood image.

2.2.2 Types of vessels

Vessels may be present in singles, pairs and multiple on the wood surface. Single vessels are called solitary vessels. Pair vessels are present as two vessels combined and become a single vessel. Multiple vessels are present when more than 2 vessels combined to form a single vessel. There are three types of vessels extracted from the wood texture, namely solitary vessel, pair vessel and multiple vessel. The flowchart of the statistical feature extraction for types of vessels is presented in Fig. 8.

Fig. 8
figure 8

Flowchart of the statistical feature extraction process for types of vessels

The grouping model is obtained using fuzzy IF-THEN rules can be written as below:

IF x p1 is A j1 (x p1) THEN x p1 belongs to group M.

Where all variables are defined as follows:

x p1 input number of vessels within a region of a wood.

sample.

A j1(x p1) antecedent linguistic membership value.

Group M consequent group.

The types of vessel are determined based on fuzzy rules as below while the membership function is shown in Fig. 9:

  • IF vessel is individual vessel THEN it is a solitary vessel.

  • IF vessel is combination of 2 individual vessels THEN vessel is a pair vessel.

  • IF vessel is combination of 3 or more individual vessels THEN vessel is a multiple vessel.

Fig. 9
figure 9

Membership function to determine the types of vessels

Each wood image may consist of 3 types of vessels: solitary vessels, pair vessels and multiple vessels. Hence, the quantity of each type of vessels is computed for each wood sample. For example, x1 = [q1 q2 q3] where x1 represents a wood sample, q1 represents the quantity of solitary vessels extracted from the wood image, q2 represents the quantity of pair vessels extracted from the wood image, and q3 represents the quantity of multiple vessels extracted from the wood image.

2.2.3 Vessels arrangement

There are 6 types of vessel arrangements for the wood texture, namely smallsmall, medmed, bigbig, smallmed, smallbig, medbig. The arrangements of vessels are determined by calculating the mean distance between vessels by using Euclidean distance approach. The vessels arrangements are determined based on fuzzy rules as below:

  • IF distance between small vessel to small vessel THEN it is smallsmall.

  • IF distance between medium vessel to medium vessel THEN it is medmed.

  • IF distance between large vessel to large vessel THEN it is bigbig.

  • IF distance between small vessel to medium vessel THEN it is smallmed.

  • IF distance between small vessel to large vessel THEN it is smallbig.

  • IF distance between medium vessel to large vessel THEN it is medbig.

Each wood image may consist of 6 types of arrangements: smallsmall, medmed, bigbig, smallmed, smallbig, medbig. Hence, the quantity of each type of arrangements is computed for each wood sample. For example, x1 = [r1 r2 r3 r4 r5 r6] where x1 represents a wood sample, r1 represents the quantity of smallsmall arrangement from the wood image, r2 represents the quantity of smallmed arrangement extracted from the wood image, r3 represents the quantity of smallbig arrangement extracted from the wood image, r4 represents the quantity of medmed arrangement extracted from the wood image, r5 represents the quantity of medbig arrangement extracted from the wood image, and r6 represents the quantity of bigbig arrangement extracted from the wood image.

The total number of statistical features extracted from each wood image is 24 features (12 features from black vessels image and 12 features from white vessels image). The next step after feature extraction is the classification process where the wood species are classified into its own species based on the statistical features.

2.3 Classification

A supervised artificial neural network algorithm has been implemented in various identification and prediction problems such as prediction of tension properties of cork (Iglesias et al. 2015) and trend prediction of foreign exchange rates (Zafeiriou and Kalles 2013). In this paper, a back propagation neural network is used for classification purposes. The training dataset consists of input signals assigned to corresponding target. The target is the desired output z (representing the 30 classes of species). The first layer has weights coming from the input. Each subsequent layer has a weight coming from the previous layer. The last layer is the network output. Adaption is done which updates weights with the specified learning function. The trained network is then used to simulate for the test data. The output of the simulation resulted in 30 class data for the test data. Comparison of accuracy is computed to determine the performance of the proposed system with previous systems (Antikainen et al. 2015). The classification accuracy is calculated based on percentage of correctly classified test samples over number of test samples as:

$$\text{Accuracy }=\frac{\text{Num}.\text{ }\!\!~\!\!\text{ of }\!\!~\!\!\text{ correctly }\!\!~\!\!\text{ classified }\!\!~\!\!\text{ samples}}{\text{Total }\!\!~\!\!\text{ num}.\text{ }\!\!~\!\!\text{ of }\!\!~\!\!\text{ test }\!\!~\!\!\text{ samples}}\times 100$$
(9)

Besides that, a confusion matrix is tabulated to summarize the misclassification of each wood species. The confusion matrix is calculated by applying the classification model to test data in which the target values are already known. These target values are compared with the predicted (classified) target values.

3 Results and discussion

The wood database consists of 30 wood species where 100 images were taken from each wood species: 70 images for training and 30 images for testing. The training images are classified into 30 species based on the statistical features extracted using the proposed statistical feature extractor. Then, unknown test images are classified based on the training images stored in the database. Figure 10 shows the wood texture images of 30 wood species used in this research.

Fig. 10
figure 10

Wood texture for 30 different wood species. Species numbering refers to wood species presented in Table 1

The wood texture which is the wood grain form on the timber surface is very important in identifying wood species. The wood texture will not change even after going through several chemical procedures. The wood surface has several distinctive features that may be used to discriminate the wood species. Figure 11 shows the examples of types of vessels extracted from a wood image while Fig. 12 shows the examples of different sizes of vessels extracted from a wood image. These statistical properties of vessels on wood texture are used to classify the wood species by using the proposed automated wood species recognition system.

Fig. 11
figure 11

Types of vessels on wood surface texture

Fig. 12
figure 12

Different sizes of vessels on wood surface texture

Table 1 Example of 6 statistical features extracted from each wood image for 30 different wood species

Table 1 tabulates examples of statistical features extracted from the 30 wood images presented in Fig. 10. As shown in Table 1, the statistical features extracted are the quantity of small, medium and large vessels for black vessels images which are represented as f1, f2 and f3, respectively followed by the quantity of small, medium and large vessels for white vessels images which are represented as f4, f5 and f6, respectively.

Variations in statistical features within inter wood species are shown in Figs. 13 and 14. The graphs represent 24 statistical features extracted from 2 different wood species. There are 10 samples taken from the same wood species and it can be seen that the statistical features of the same wood species are almost similar to each other. On the other hand, the statistical features vary significantly for different wood species which will be used for classification purposes.

Fig. 13
figure 13

Graph of 24 statistical features for 10 samples of wood species Shorea laevis

Fig. 14
figure 14

Graph of 24 statistical features for 10 samples of wood species Palaquium stellatum

However, there are small variations within the wood samples of the same species for the 4th feature and the 6th feature in Fig. 13 which correspond to quantity of pair vessels and quantity of multiple vessels, respectively. This is due to the nonlinearity of wood texture. The nonlinearity in wood features is affected by the variation within wood species that are associated to its place of growth and age. Younger trees will have smaller vessels on its cross section and have lighter color. Trees that grow on lower ground will acquire more nourishment, as a result, they grow faster and the trunks are not as hard as trees that grow on the hills. Trees on the hill fight for nutrients hence they grow at slower pace. Their vessels are smaller than the trees that grow on lower ground. Nonlinearity in wood features could also exist when the wood samples are taken from different parts of the same tree (Fichtler and Worbes 2012; Wheeler and Baas 1998).

Number of neurons used in the neural network classifier may affect the classification performance. There are 3 different quantities of neurons (20 neurons, 40 neurons, 80 neurons) tested in the proposed back propagation neural network. As the number of neurons in the hidden layer was increased, the classification accuracy improved significantly. This is because more neurons are required to process large size database. Besides that, 10 iterations were performed for training and testing the wood data. The number of iterations affects the generalization accuracy as shown in Fig. 15. The highest classification accuracy of the test dataset is approximately 89% when using 80 neurons in hidden layer.

Fig. 15
figure 15

Graph of classification accuracies for training samples and testing samples for different number of neurons in hidden layer

In order to examine the classification performance for each wood species, a confusion matrix is tabulated as shown in Table 2. The confusion matrix shows how the predictions are made by the proposed wood species recognition system. There are 30 testing samples used for each wood species. The highest misclassification rate of 16.7% belongs to wood species Kokoona reflexa. This is due to the nonlinearities in wood texture whereby the statistical properties of vessels for wood species Kokoona reflexa are very close to several wood species such as Kokoona sessilis which contributed to the misclassification of wood species.

Table 2 Confusion matrix for 30 wood species for 30 test samples

As shown in Table 2, this paper focused on analyzing statistical features of vessels on 3000 wood images which results in 89.3% classification accuracy for 30 wood species. This shows that classifying wood species based on statistical properties of vessels is adequate and reliable to be implemented in the timber industries. The advantage of the proposed system compared to previous works is that the proposed system enables human intervention when analyzing the wood texture and to aid human–machine interface.

Finally, the proposed system is benchmarked with the previous works. Table 3 summarizes the results published in literature using image analysis approach to classify wood species. The results show that the proposed system managed to classify wood species more effectively compared to previous works that implemented feature extractor approach based on GLCM and PCA techniques.

Table 3 Comparison of classification accuracy and database between the proposed feature extractor with previous works

4 Conclusion

In this work, a cascaded wood species recognition system based on image analysis is presented. The proposed classification method imitates the approach taken by human experts in analyzing wood surface texture. First, statistical features are extracted from each wood species and used in a broad initial grouping of wood texture using fuzzy logic. Then a neural network classifier is used to refine the broad grouping into the final wood species classification. The supervised artificial neural network trained by back propagation algorithm is able to classify 30 wood species with an accuracy of approximately 89%. In spite of the improvement introduced by the proposed method compared to the performance of previous works using GLCM and PCA features, it is clear that there is a lot of room for improvement. It can be observed that the statistical features used for classification do not identify a species definitively in all cases. For instance, in an image of a wood species, there exist small, medium and large vessels in different quantities. Perhaps the pattern that these vessels form, rather than their statistics is more important. The challenge now is to find better features and gain better understanding of how human experts classify wood textures. Gabor filters or wavelet transform can be used to extract multiresolution textural features of vessels and parenchyma that form the wood pattern. Color information can also help in the discrimination of wood species that look alike.