Introduction

Defects such as knots and cracks affect the appearance quality of wood products. Before machining and working wood materials, the position, shape, and size of wood defects should be identified accurately. A vision system combined with analysis software is one measurement technology that could lead to great improvements in the quality of wood products.

Defect-detection methods have been developed over the last decades, and the results have already been applied in practical systems. In Germany, Schütt et al. (2004) implemented laser-scanning imaging technology and neural networks to classify surface and internal defects; however, a number of parameters need to be selected, including the training algorithm, number of layers, and performance criteria. In terms of computer vision, Lampinen et al. (1995) extracted geometric features of the wood surface and used a multi-layer perceptron, yielding a correct recognition rate of 84%.

Pham and Alcock (1998) summarized 32 feature vectors of four types and designed a neural network classifier; their study indicates that the learning rate has a great influence on the experimental results. Silvén et al. (2003) implemented unsupervised clustering methods to detect and identify wood defects; however, the result was susceptible to noise interference, and nearly half of the misclassifications occurred on deep natural textures and stains. Kwon et al. (2015) used variance-of-variance (VOV) features and random forests to inspect various surface types, such as wafers, solid car surfaces, pear-colored car surfaces, and striped metal. While this achieved a 92% recognition rate, it did not suffice for wood materials. Our previous work (Zhang et al. 2014b) extracted three types of features, including gray-scale texture features, moment invariant features, and geometric regional features, and designed a SOM neural network as a classifier. However, dead neurons always appeared during sample training, which degraded the classification results.

To improve the online classification of wood defects and overcome the disadvantages of high dimensionality and computational complexity in classification algorithms (Peck and Devore 2011), we focused our research on feature fusion and classifier design. We chose LDA, a classical pattern recognition method, for feature fusion, and designed a compressed-sensing classifier to solve the defect identification problem. The linear discriminant criterion and projection transformation reduce the feature dimension by creating a new projection space that maximizes the between-class distance and minimizes the in-class distance (Li et al. 2004, 2017).

The fused features separate samples from each other and make full use of the extracted training-sample information. Compressed sensing is a new sampling theory (Donoho 2006; Candes 2006) by which signals can be reconstructed through a nonlinear reconstruction algorithm based on a data dictionary. Compressed sensing does not require a complicated training process, so it uses less computing time and can achieve ideal classification results.

Materials and methods

Materials

The research mainly concerns three types of wood board defects: dead knots, live knots, and cracks. The wood species included Fraxinus mandshurica, Xylosma racemosum, Korean pine, and oak. The samples received a series of treatments, including drying and polishing, before the experiment was carried out. The size of the boards was 40 cm × 20 cm × 2 cm. Fifty samples were used for training and the other 50 samples for testing.

Experiment setup

The experiments were performed on the mechanical system shown in Fig. 1a, and the wood image acquisition system is shown in Fig. 1b. The acquisition system included a camera (Oscar F810C IRF), adjustable LED parallel lights, and a shading enclosure box. The LED parallel lights ensure homogeneous exposure so that clear images of the board surface can be obtained, and the shading enclosure box blocks interference from outside ambient light.

Fig. 1
figure 1

Computer vision system for defect detection of wood plates. a The overall system appearance, b schematic diagram

Method

The specific experiment process of our online classification method is shown in Fig. 2, which includes image collection, morphology segmentation, feature extraction, feature fusion, classifier design, and result assessment.

Fig. 2
figure 2

The process of online classification method

Image collection

The integral projection method is used to recognize the board border. The integral projection vector V in the vertical direction and vector L in the horizontal direction can be calculated by Eqs. 1 and 2, respectively.

$$ V(x) = \sum\limits_{y = 1}^{W} {f(x,y),(x = 1,2, \ldots ,H)} $$
(1)
$$ L(y) = \sum\limits_{x = 1}^{H} {f(x,y),(y = 1,2, \ldots ,W)} $$
(2)

where f(x, y) is the pixel value of the image, H is the image height, and W is the image width. The border of the board can be obtained from these projections. When a border appears in the camera's field of view, the camera shoots continuously until the next border appears. Each surface image is then captured and stored at a standardized 128 × 128 resolution with 8-bit gray levels. The surface images are shown in Fig. 3.
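The integral projections of Eqs. 1 and 2 can be sketched in a few lines of NumPy. This is a minimal illustration, not the system's actual code; the `find_border` helper and its threshold are hypothetical, since the text does not specify how the projections are thresholded:

```python
import numpy as np

def integral_projections(img):
    """Integral projections of a grayscale image (Eqs. 1 and 2).

    img: 2-D array of shape (H, W) holding pixel values f(x, y).
    Returns V (length H: sum of each row) and L (length W: sum of each column).
    """
    img = np.asarray(img, dtype=np.float64)
    V = img.sum(axis=1)  # V(x) = sum over y of f(x, y), x = 1..H
    L = img.sum(axis=0)  # L(y) = sum over x of f(x, y), y = 1..W
    return V, L

def find_border(projection, threshold):
    """Hypothetical border locator: indices where the projection exceeds a
    threshold, e.g. where a bright board stands out against a dark background."""
    return np.flatnonzero(projection > threshold)
```

Rows or columns whose projection exceeds the threshold are taken to belong to the board, so the first and last such indices give the border position.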

Fig. 3
figure 3

The wood surface images a live knot, b dead knot, and c crack

Morphology segmentation

Mathematical morphology is an image-processing method based on geometry. Our previous research has shown that this method has a number of advantages including continuous image skeleton, fewer breaking points, and rapid and exact image segmentation (Zhang et al. 2014a). With the application of mathematical morphology, the exact defect targets are separated from the background. The segmentation results of Fig. 3 are shown as Fig. 4.

Fig. 4
figure 4

The segmentation images of a live knot, b dead knot, and c crack

Wood defect feature extraction and fusion

The wood defect features extraction and fusion process is shown in Fig. 5.

Fig. 5
figure 5

Feature extraction and fusion flow

Feature extraction is an important part of defect identification. Reasonable features should carry as much defect information as possible while remaining simple to compute. Our previous work (Zhang et al. 2014b, 2015) showed that 25 features of three types, including geometry and regional features, texture features, and invariant moments, can represent the defects in a wood board image. Specifically, there were geometry features including area, perimeter, length, width, length-width ratio of the bounding rectangle, compactness, linearity, density, and rectangularity; four regional features including eccentricity, diameter, short axis, and long axis; seven gray-scale texture features including means (border and inner), standard deviation, third moment, smoothness, consistency, and entropy; and seven invariant moments of different orders.

However, when the dimension of the feature data is high, the amount of computation increases, and many multivariate analysis methods are unstable in high-dimensional space. Transforming high-dimensional data into low-dimensional data is therefore necessary. LDA can significantly reduce the dimension of the original pattern space while maximizing the sample scatter between classes and minimizing the scatter within each class. The algorithm is as follows:

In the space \( R^{n} \) there are m samples \( x_{1} ,x_{2} , \ldots ,x_{m} \), each an n-dimensional feature vector, and \( n_{i} \) is the number of samples belonging to class i. Assume there are c classes in total, so \( n_{1} + n_{2} + \cdots + n_{i} + \cdots + n_{c} = m \). Let \( S_{b} \) be the between-class scatter matrix, \( S_{w} \) the in-class scatter matrix, u the mean of all samples, and \( u_{i} \) the mean of the samples of class i. The mean of class i is:

$$ u_{i} = \frac{1}{{n_{i} }}\sum\limits_{x \in classi} x $$
(3)

In the same way, we can obtain the mean of the overall samples:

$$ u = \frac{1}{m}\sum\limits_{i = 1}^{m} {x_{i} } $$
(4)

According to the definition of the between-class dispersion matrix and the in-class dispersion, the following formulas can be obtained:

$$ S_{b} = \sum\limits_{i = 1}^{c} {n_{i} } (u_{i} - u)(u_{i} - u)^{T} $$
(5)
$$ S_{w} = \sum\limits_{i = 1}^{c} {\sum\limits_{{x_{k} \in classi}} {(u_{i} - x_{k} )(u_{i} - x_{k} )^{T} } } $$
(6)

To make the classification result desirable, the features should satisfy the following requirements: the in-class scatter should be small and the between-class scatter large (Niskanen et al. 2001). The Fisher discriminant criterion is expressed in (7); applying the Lagrange multiplier method yields (8):

$$ J_{F} (w) = \frac{{w^{T} S_{b} w}}{{w^{T} S_{w} w}} $$
(7)
$$ S_{b} w^{*} = \lambda S_{w} w^{*} $$
(8)

When J F (w) is at its maximum value, then w * can be obtained by (9).

$$ S_{w}^{ - 1} S_{b} w^{*} = \lambda w^{*} $$
(9)

where w* is the optimal projection vector, given by the eigenvector of \( S_{w}^{ - 1} S_{b} \) associated with its largest eigenvalue. Once w* is obtained, samples of d dimensions can be projected to a one-dimensional space, so the d-dimensional classification problem is transformed into a one-dimensional one. When samples of different types are separated in this way, maximum between-class scatter and minimum in-class scatter are reached simultaneously.
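The derivation in Eqs. 3-9 can be sketched as a small NumPy routine. This is a minimal sketch under the stated definitions, not the authors' implementation; in particular, using a pseudo-inverse to guard against a singular \( S_{w} \) is our assumption:

```python
import numpy as np

def lda_projection(X, y, n_components=3):
    """Fisher LDA projection (Eqs. 3-9).

    X: (m, d) sample matrix; y: length-m class labels.
    Returns W* of shape (d, n_components); projected data is X @ W*.
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y)
    d = X.shape[1]
    u = X.mean(axis=0)                      # overall mean (Eq. 4)
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        uc = Xc.mean(axis=0)                # class mean (Eq. 3)
        diff = (uc - u)[:, None]
        Sb += len(Xc) * diff @ diff.T       # between-class scatter (Eq. 5)
        Sw += (Xc - uc).T @ (Xc - uc)       # in-class scatter (Eq. 6)
    # Eigen-decomposition of S_w^{-1} S_b (Eq. 9); pinv in case S_w is singular
    vals, vecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(-vals.real)
    return vecs[:, order[:n_components]].real
```

Note that at most c − 1 eigenvalues are nonzero, so for the three defect classes here a three-dimensional projection is the natural upper bound.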

Defects classifier design based on compressed sensing

Suppose we want to detect p types of defects, \( b_{i,j} \) is the jth training sample (image vector) of defect type i, and \( A_{i} \) is the data dictionary composed of the samples of type i:

$$ A_{i} = \left[ {\begin{array}{*{20}c} {b_{i,1} } & {b_{i,2} } & \ldots & {b_{{i,n_{i} }} } \\ \end{array} } \right] $$
(10)

Then the complete data dictionary composed of the training samples of all defect types is:

$$ A = \left[ {\begin{array}{*{20}c} {A_{1} } & {A_{2} } & \ldots & {A_{P} } \\ \end{array} } \right] $$
(11)

When the training samples of defect type i are adequate, assume \( b_{i} \in R^{v \times 1} \) is a test sample of defect type i, composed in the same way as the training samples, and \( \alpha_{i,j} \in R \) is a weight coefficient. Then a test sample of defect type i can be expressed as:

$$ b_{i} = \alpha_{i,1} b_{i,1} + \alpha_{i,2} b_{i,2} + \cdots + \alpha_{{i,n_{i} }} b_{{i,n_{i} }} = A_{i} \alpha_{i}^{T} $$
(12)

where \( \alpha_{i} \in R^{{1 \times n_{i} }} \) is the weight coefficient vector:

$$ \alpha_{i} = \left[ {\alpha_{i,1} \;\alpha_{i,2} \; \cdots \;\alpha_{{i,n_{i} }} } \right] $$
(13)

Substituting Eq. 13 into Eq. 12 and expanding over the complete dictionary A gives Eq. 14:

$$ b_{i} = A\alpha_{{A_{i} }}^{T} $$
(14)

For any sample (\( i = 1,2, \ldots , \, p \)) that satisfies Eq. 12, solving Eq. 14 yields the vector \( \alpha_{{A_{i} }} \) shown in Eq. 15:

$$ \alpha_{{A_{i} }} = \left[ {\alpha_{1,1} \cdots \alpha_{{1,n_{1} }} \cdots \alpha_{i,1} \cdots \alpha_{{i,n_{i} }} \cdots \alpha_{p,1} \cdots \alpha_{{p,n_{p} }} } \right] $$
(15)

When the class of a test sample \( b_{i} \) is unknown, Eq. 14 is an underdetermined equation whose unique solution is difficult to obtain. Since \( \alpha_{{A_{i} }} \) is a sparse vector, compressed sensing theory can be used to solve the optimization problem in Eq. 16, giving an approximation of \( \alpha_{{A_{i} }}^{T} \):

$$ \widehat{\alpha }_{{A_{i} }}^{T} = \hbox{min} \left\| {\alpha_{{A_{i} }}^{T} } \right\|_{1} \quad s.t.\quad b_{i} = A\alpha_{{A_{i} }}^{T} $$
(16)

If a test sample belongs to a certain type, its features must be similar to those of the training samples of that type. The optimized solution can be obtained by the least-squares method, and the corresponding coefficients then have the largest values. Therefore, the class of a sample can be determined by comparing the average coefficient value over the training samples of each type.
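This decision rule can be sketched as follows. It is a hedged sketch, not the authors' code: the ℓ1 minimization of Eq. 16 is replaced by the least-squares solution the text mentions, and the `class_sizes` argument is a hypothetical way of encoding how the dictionary's columns are grouped by class:

```python
import numpy as np

def classify(b, A, class_sizes):
    """Classify a fused feature vector b against data dictionary A (Eq. 11).

    A: (v, n) matrix whose columns are training samples, grouped by class;
    class_sizes: number of columns belonging to each class, in order.
    Solves b = A @ alpha by least squares (standing in for the l1 problem of
    Eq. 16) and returns the index of the class with the largest mean coefficient.
    """
    alpha, *_ = np.linalg.lstsq(A, b, rcond=None)
    scores, start = [], 0
    for n_i in class_sizes:
        scores.append(alpha[start:start + n_i].mean())
        start += n_i
    return int(np.argmax(scores))
```

With well-separated classes, the coefficients concentrate on the columns of the matching class, so the mean coefficient per class acts as the similarity score described above.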

Results and discussion

Feature fusion test

To verify the necessity of feature selection, defect-detection comparison tests were carried out between the LDA feature-fusion method and the variance-selection method. Fifty sample images of live knots, dead knots, and cracks were used for feature selection and classification. In variance selection, the larger a feature's variance, the larger the between-sample dispersion and the better the between-sample separability. The variance is calculated as:

$$ D = E[(x - \mu )^{2} ] $$
(17)

The result of defect classification based on the variance feature-selection method is shown in Fig. 6. The peak recognition rate occurs when the feature-vector dimension is 7. As the number of features first increases, the recognition rate rises, but as more features are added, the recognition rate begins to decrease. This is because some of the features are redundant: they do not contribute to the classification but do influence its result.
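Variance-based selection per Eq. 17 amounts to ranking features by their sample variance and keeping the top k. A minimal sketch (the function name is illustrative, not from the paper):

```python
import numpy as np

def variance_select(X, k):
    """Rank features by variance D = E[(x - mu)^2] (Eq. 17), keep the top k.

    X: (m, d) feature matrix over m samples. Returns the indices of the k
    highest-variance features; in the paper, recognition peaked at k = 7.
    """
    D = np.var(X, axis=0)          # per-feature variance across samples
    return np.argsort(-D)[:k]      # indices of the k largest variances
```

Unlike LDA, this ranking ignores class labels, which is one reason redundant features can survive selection and depress the recognition rate.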

Fig. 6
figure 6

Result of classification using compressed sensor, based on variance feature

Figure 7 shows the fusion result of the 50 samples based on LDA theory; green triangles, blue circles, and red squares represent samples of cracks, live knots, and dead knots, respectively. Using the Fisher discriminant criterion, the integrated 25-dimensional features are projected into a three-dimensional space.

Fig. 7
figure 7

Result of feature fusion

In our previous study (Zhang et al. 2015), we proposed a fusion method based on principal component analysis (PCA). Here, we compared the LDA method with both the PCA and variance-selection methods in our experiment (Table 1).

Table 1 Result of feature comparison

According to Table 1, without feature selection the recognition rate is lowest, at 68%, and the recognition time is longest (0.7125 ms). The LDA linear discriminant method has the best recognition rate, 94%, and the shortest recognition time (0.0446 ms). With LDA, the recognition rate improved by 26% and the processing time decreased by 0.67 ms. The results reveal that feature fusion not only reduces identification time but also increases the recognition rate; the feature-fusion step is therefore necessary, and LDA is effective.

Classification test

The compressed-sensing classifier was designed as follows: first, calculate the mean of each sample type and the between-class and in-class scatter matrices according to LDA theory; second, calculate the optimal projection vector w* according to the Fisher discriminant criterion; finally, obtain the matrix A of training samples after the projection transformation.

Data dictionary A is as follows:

$$ A = \left[ {\begin{array}{*{20}c} { - 1.1629} & {0.3071} & { - 0.9353} \\ { - 1.8120} & { - 0.7272} & { - 0.1400} \\ {0.14118} & { - 0.1980} & {0.2416} \\ \end{array} } \right] $$
(18)

For a given testing sample, the 25 features of geometry and region, gray-scale texture, and invariant moments are extracted from the segmented image. The result after the projection transformation by w* is as follows:

$$ b_{i}^{T} = \left[ \begin{aligned} h^{T} \hfill \\ s^{T} \hfill \\ l^{T} \hfill \\ \end{aligned} \right] = \left[ {\begin{array}{*{20}c} { - 0.9220} & {0.4189} & { - 0.9450} \\ { - 1.4081} & {1.2360} & {0.5063} \\ {0.10839} & { - 0.9170} & {0.9214} \\ \end{array} } \right] $$

Classification is implemented according to Eq. 16, and \( \alpha_{Ai}^{T} \) is obtained by the least-squares method:

$$ \alpha_{{A_{i} }}^{T} = \left[ \begin{aligned} \alpha_{h}^{T} \hfill \\ \alpha_{s}^{T} \hfill \\ \alpha_{l}^{T} \hfill \\ \end{aligned} \right] = \left( {\begin{array}{*{20}c} {0.8980} & 0 & 0 \\ 0 & {0.6955} & 0 \\ 0 & 0 & {0.9135} \\ \end{array} } \right) $$

The 50 test images of live knots, dead knots, and cracks were classified in this test. The accuracy and time of classification are shown in Table 2.

Table 2 Result of classification test based on compressed-sensing method

In Table 2, the recognition accuracies for live knots, dead knots, and cracks are 90%, 95%, and 100%, respectively, and the recognition times are 44.199, 49.059, and 44.268 ms. The results of the new method indicate a very high defect recognition rate, and the computation is fast enough for online board sorting.

To test the performance of the compressed-sensing classifier, we compared it with a commonly used neural network classifier. Zhang et al. (2014a, b) showed that the SOM neural network requires fewer training samples and provides higher classification accuracy, so it was chosen for comparison with compressed sensing. In this experiment, the size of the competition layer was 500. The topological structure of the SOM neural network is shown in Fig. 8, and the accuracy and time of classification are shown in Table 3.

Fig. 8
figure 8

The topological structure of the SOM neural network

Table 3 The comparison with the neural network method

According to the experimental results in Table 3, the SOM classification takes 50.8 ms because of its complex iterative computation. The compressed-sensing method does not require complex computation, and its recognition time is about 10 ms shorter. Because the LDA-fused features used as classifier input contain almost all the information in the images, the recognition accuracy improved by 7%.

Conclusion

The LDA method was analyzed, and projection transformation was implemented for dimension reduction of the defect features. The fused features express the wood defects more reasonably and comprehensively, removing the redundant information contained in the original features. According to the results, LDA improved the recognition rate by 26% and reduced computation time by 0.67 ms compared with the non-selection method, and improved the recognition rate by 12% and reduced the time by 0.042 ms compared with the variance method. LDA not only has low computational complexity but also yields a linear space in which samples can be intuitively classified. We designed a compressed-sensing classifier based on the LDA-fused features, and the recognition rate improved by 7% compared with the SOM neural network. Our experiments were all conducted on the online detection system, and the results indicate that the proposed methods are effective for soft measurement in wood-defect detection.