1 Introduction

With developments in computer technology, communication, and e-commerce, the requirements for printed circuit boards (PCBs) have become increasingly strict, and surface-mount technology has progressed steadily. To increase the quality and stability of PCBs, problems such as misplaced, missing, and reversed-polarity parts must be addressed.

Automated optical inspection (AOI) machines are commonly used by manufacturers. Before testing, workers must spend time setting each parameter, including light colors, light angles, image lighting, and image contrast ratios. Pattern matching is then used to compare the test images with golden samples. Unstable image quality may force workers to retest images, thereby increasing companies' labor costs.

These problems can be addressed using two approaches: matching methods and character-verification methods. Matching methods, such as that developed by Cho et al. [1], use a pattern-matching algorithm to compare input images with standard images and thereby identify misplaced parts on the PCB; the images are converted through discrete wavelet transformations to boost accuracy and stability. Crispin et al. [2] also employed pattern matching but used a genetic algorithm to expedite PCB identification. Pattern matching demonstrates excellent performance and accuracy but is easily affected by image lighting, component deviation, and noise.

The primary aim of character verification is to verify laser-marked or inkjet-printed words on integrated circuit (IC) components while checking for missing and misplaced parts. Lee et al. [3] proposed feature extraction through a composition of Gabor filters, direction gradients, wavelet coefficients, and differences in edge spacing; after feature extraction, AdaBoost is used for classification. Epshtein et al. [4] proposed the stroke width transform, which also uses a contour border-detection algorithm and direction gradients. After the stroke width is defined, a connection method and a predefined threshold are used to segment the character areas.

For character verification, Nava et al. [5] extracted features through principal component analysis (PCA) and then classified the characters using the conditional probability of the Bayesian function. Neullens et al. [6] evaluated the performance of various optical character recognition (OCR) methods and proposed preprocessing steps for improving performance and stability.

Deep neural networks (DNNs) are considered a basic tool for extracting features from training data, and studies on character detection and verification have increasingly used DNNs. For example, to address small characters, which are difficult to detect, Zhang et al. [7] proposed a feature-enhancement network and designed an adaptive position-sensitive region-of-interest pooling layer to improve accuracy. Shi et al. [8] implemented a character-verification system as an end-to-end network: features are extracted using a convolutional neural network (CNN), the map-to-sequence method transforms these features into feature vectors, and a recurrent neural network (RNN) with connectionist temporal classification verifies the words.

Object-detection methods using deep learning on graphics processing unit (GPU) hardware [9, 10] exhibit adequate performance in many tasks; however, the considerable time required for training renders such detection methods inefficient.

The major contributions of this study are described as follows:

  • A character-verification system with deep learning is proposed to recognize IC components in images with clear characters.

  • An image-classification system is also proposed to classify images without characters or with blurry characters by using a CNN structure.

  • A novel refinement mechanism is used in both systems. It refines the CNN output scores and increases the accuracy of detecting misplaced, missing, and reversed-polarity parts.

  • Experiments were conducted on both systems. The results indicate that the proposed method exhibits a superior passing rate and requires less training time than the other methods.

The remainder of the paper is organized as follows. Section 2 introduces the proposed methods of this study. Details on the character-verification method and the image-classification method are described in Sections 3 and 4, respectively. Section 5 presents the experimental results, and Section 6 offers conclusions.

2 Proposed methods

The images requiring testing were captured from the PCB through the AOI machine by using high-magnification camera lenses, and the region of interest was then analyzed to determine the IC position. The images might therefore contain noise, such as uneven lighting, skewed angles, or low contrast. Two types of images were examined: those with clear characters, as in Fig. 1(a), and those with blurry characters, as in Fig. 1(b).

Fig. 1 (a) Clear-character IC component; (b) blurry-character IC component

Two systems, a character-verification system and an image-classification system, are proposed in this study. The primary goals of these systems are to achieve high accuracy and performance and decrease the number of manual-adjustment parameters. Figure 2 displays the structure of the character-verification system that is used to examine the clear-character IC component indicated in Fig. 1(a).

Fig. 2 Structure of the character-verification system

Fig. 3 Structure of the image-classification system

Figure 3 displays the structure of the image-classification system, which is used for the blurry-character IC components presented in Fig. 1(b).

3 Deep learning for character verification

3.1 Contour border detection

Contour border detection is the preprocessing step of the character-verification system. First, the input images were converted into grayscale. Next, Gaussian smoothing was used to remove noise, and Otsu's method [11] was used to automatically determine a threshold and convert the grayscale images into binary images. The border-following algorithm developed by Suzuki [12] was then used to extract the characters from the background. This algorithm derives chain codes from the border between a connected 1-pixel (foreground) component and a 0-pixel (background) component; 8-connectivity is used so that only the white border, not the inside of the character, is traced. Figure 4 reveals that each detected rectangular frame corresponds to a character area (a code sketch follows Fig. 4).

Fig. 4 Contour detection
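This preprocessing pipeline can be sketched with OpenCV, whose findContours function implements Suzuki's border-following algorithm. The following is a minimal illustration, not the production code of this study; the Gaussian kernel size and the assumption of white characters on a dark background are ours.

```python
import cv2

def detect_character_contours(gray):
    """Return bounding boxes of candidate character regions.

    gray: grayscale IC image (white characters on a dark background assumed;
    invert the binary image if the polarity is reversed).
    """
    # Gaussian smoothing removes noise; the 5 x 5 kernel is an assumed value.
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    # Otsu's method [11] selects the binarization threshold automatically.
    _, binary = cv2.threshold(blurred, 0, 255,
                              cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # findContours implements Suzuki's border following [12]; RETR_EXTERNAL
    # traces only outer (white) borders, not the inside of each character.
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)  # OpenCV >= 4
    # Each bounding rectangle corresponds to one candidate character area.
    return [cv2.boundingRect(c) for c in contours]
```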

3.2 CNN structure of character verification

LeNet-5, which consists of two sets of convolutional and average-pooling layers followed by two fully connected layers, is a straightforward and well-known architecture for character recognition. In this study, the CNN structure, modified from LeNet-5, is provided in Table 1 (a code sketch follows the table). To meet manufacturer requirements for training speed and accuracy, the grayscale input images were sized to 1 × 28 × 28. After training and testing experiments, 64 and 128 convolution kernels of size 5 × 5 were found to be the best parameters for the first and second convolution layers, respectively. Padding was applied so that the feature maps remained the same size after convolution. The rectified linear unit (ReLU) activation function was used to strengthen feature expression, and pooling was used to reduce the size of the feature maps. Flattening connected the feature maps to the fully connected layer, and dropout was set to 0.5 during training; abandoning 50% of the neurons prevents overfitting.

Table 1 CNN structure
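Because Table 1 is not reproduced here, the following PyTorch sketch only mirrors the stated design: 1 × 28 × 28 grayscale input, 64 and 128 convolution kernels of 5 × 5 with same padding, ReLU, pooling, flattening, and a fully connected layer with dropout 0.5. The hidden-layer width, the pooling type (max), and the 120-class output (Section 5.1.1) are assumptions, and the study itself used Caffe.

```python
import torch.nn as nn

class CharVerificationCNN(nn.Module):
    """LeNet-style CNN for 1 x 28 x 28 character images (sketch of Table 1)."""

    def __init__(self, num_classes=120):  # 120 character types (Sec. 5.1.1)
        super().__init__()
        self.features = nn.Sequential(
            # 64 kernels of 5 x 5; padding=2 keeps the 28 x 28 size.
            nn.Conv2d(1, 64, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                  # 28 x 28 -> 14 x 14 (type assumed)
            # 128 kernels of 5 x 5, again with "same" padding.
            nn.Conv2d(64, 128, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                  # 14 x 14 -> 7 x 7
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 7 * 7, 1024),     # hidden width is an assumption
            nn.ReLU(inplace=True),
            nn.Dropout(0.5),                  # drop 50% of neurons in training
            nn.Linear(1024, num_classes),     # softmax is applied by the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```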

Equation (1) presents the softmax function, which yields the probability distribution p over the C classes [22]; the class with the maximum probability is taken as the final result Y, as presented in Eq. (2).

$$ p_i = \frac{e^{z_i}}{\sum_{j=1}^{C} e^{z_j}} $$
(1)
$$ Y = \underset{c \in [1,C]}{\arg\max}\; p(c \mid X) $$
(2)

where pi is the softmax output for class i, zi is the network output for class i before softmax, C is the number of classes, and e is the exponential function.
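In code, Eqs. (1) and (2) amount to the following NumPy sketch (the max subtraction is a standard numerical-stability step, not part of the equations):

```python
import numpy as np

def predict_class(z):
    """z: length-C array of network outputs (logits) before softmax."""
    e = np.exp(z - z.max())      # subtract the max for numerical stability
    p = e / e.sum()              # Eq. (1): softmax probabilities
    return int(np.argmax(p))     # Eq. (2): class with the maximum probability
```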

3.3 Refinement mechanism of character verification

After training with the CNN structure, the characters, numbers, IC logos, symbols, and angles could be roughly but successfully recognized. However, image quality is sporadically unstable, causing contour detection to fail in ways that prevent the system from recognizing a character. One such problem is perceiving multiple characters as one character: Fig. 5 provides an example of three characters being detected as a single character because of noise or a lack of light. Another problem is blurry or fractured characters: lighting and low print quality cause a character to be separated into multiple regions, as demonstrated in Fig. 6.

Fig. 5 (a) Original image; (b) after contour detection

Fig. 6 (a) Original image; (b) after contour detection

The refinement mechanism for contour detection is introduced as follows, with three cases presented for solving the two aforementioned problems.

  • Character connection

To prevent the system from recognizing multiple characters as one, the opening operation in mathematical morphology was used. The procedure is displayed in Fig. 7.

Fig. 7 Procedure for character connection

First, the widths of all contour regions, referred to as word widths, were calculated. Next, contour regions wider than the average word width were identified as candidates for the opening operation. The opening operation was then applied repeatedly, up to N times, until the contour region split; N was set to 5 in the experiments. Finally, each resulting region was recognized again by the CNN. An output score greater than the threshold T indicated that the character had been accurately divided; otherwise, the word had been incorrectly split. A code sketch of this loop is given below.
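The following sketch illustrates the character-connection loop with OpenCV's opening operation; the structuring-element size and the recognize callback are illustrative assumptions, not values from this study.

```python
import cv2
import numpy as np

def split_merged_characters(binary_region, recognize, T, n_max=5):
    """Try to split a too-wide contour region by morphological opening.

    binary_region: binary image of one contour region (white characters).
    recognize: function mapping a binary sub-image to a raw CNN output score.
    T: adaptive threshold from Section 3.3; n_max: max opening attempts (N = 5).
    """
    kernel = np.ones((3, 3), np.uint8)  # structuring element (assumed size)
    region = binary_region.copy()
    for _ in range(n_max):
        # Opening = erosion then dilation; it erodes thin bridges between glyphs.
        region = cv2.morphologyEx(region, cv2.MORPH_OPEN, kernel)
        contours, _ = cv2.findContours(region, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        if len(contours) > 1:  # the region finally split
            boxes = [cv2.boundingRect(c) for c in contours]
            scores = [recognize(binary_region[y:y + h, x:x + w])
                      for x, y, w, h in boxes]
            # Scores above T mean the split produced valid characters;
            # otherwise the word was incorrectly split.
            return boxes if all(s > T for s in scores) else None
    return None  # opening never split the region
```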

The two directions of fractured characters are illustrated in Fig. 8.

Fig. 8 (a) Vertical fracture; (b) horizontal fracture

  • Vertical character fracture

The mechanism for vertical character fractures is presented in Fig. 9 (a code sketch follows the figure). Each contour region in the string is projected downward; if the projections of two regions overlap, the regions are combined and then recognized again by the CNN. An output score greater than the threshold T indicates that the character was successfully connected.

Fig. 9 Procedure of vertical character fracture

  • Horizontal character fracture

The flowchart for horizontal character fractures is displayed in Fig. 10 (a code sketch follows the figure). First, the angle of the string is determined. Next, the distance (word space) between adjacent contours is calculated, and contour regions whose spacing is smaller than the average word space are combined. Finally, the combined region is recognized again by the CNN. An output score greater than the threshold T indicates that the character was successfully connected.

Fig. 10 Procedure of horizontal character fracture

In this study, the proposed adaptive threshold T was set using the training data: the lowest of all output scores from the character-verification model was set as the threshold T.

To compare scores against the adaptive threshold T, the raw scores from the CNN are needed; therefore, the softmax function, which normalizes the output scores to between 0 and 1, was removed in the testing phase [22]. Equation (3) presents the output without softmax.

$$ z_i = \mathrm{pooling}\left(\mathrm{ReLU}\left(\mathrm{W} \otimes \mathrm{X} + \mathrm{b}\right)\right), \quad i \in [1, C] $$
(3)

where X is the input image, W is the weight matrix after training, b is the bias, ⊗ is the convolution operation, ReLU(·) is the nonlinear activation function, pooling(·) is the pooling operation, and zi is the i-th of the C outputs without softmax. A sketch of this calibration is given below.
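The following sketch shows how the adaptive threshold T can be calibrated and applied with raw (pre-softmax) scores. Taking the minimum over the true-class scores of the training data is our reading of Section 3.3, and the PyTorch framing is an assumption (the study itself used Caffe).

```python
import torch

@torch.no_grad()
def calibrate_threshold(model, train_loader):
    """Set T to the lowest raw (pre-softmax) score of the true class
    observed over the training data, as described in Section 3.3."""
    model.eval()
    T = float("inf")
    for images, labels in train_loader:
        logits = model(images)                      # Eq. (3): no softmax
        true_scores = logits[torch.arange(len(labels)), labels]
        T = min(T, true_scores.min().item())
    return T

@torch.no_grad()
def verify_character(model, image, T):
    """Accept the Top-1 class only if its raw score exceeds T."""
    logits = model(image.unsqueeze(0))[0]
    score, cls = logits.max(dim=0)
    return (int(cls), float(score)) if score > T else None  # None = reject
```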

4 Deep learning for image classification

Figure 1(b) displays the blurry-character or no-character IC component images used in this mechanism. The proposed image-classification method uses a CNN structure to classify the images, determining the direction of the IC component and the item number to which it belongs.

4.1 CNN structure of image classification

The CNN structure for image classification was revised from AlexNet, the champion of the ImageNet Large Scale Visual Recognition Challenge in 2012, which has five convolution layers and three fully connected layers. In this study, fewer layers were used: only four feature-expression layers were designed in the structure, followed by three fully connected layers. To prevent overfitting, dropout was set to 0.3 and 0.5 for the first two fully connected layers in the training phase. Six classes were then output as a probability distribution using softmax. The structure for image classification is presented in Table 2 (a code sketch follows the table). We also attempted to increase the number of layers; however, the classification results did not improve significantly, and training required more time.

Table 2 Structure of IC component image classification
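Because Table 2 is not reproduced here, the following PyTorch sketch only mirrors the stated constraints: four feature-expression layers, three fully connected layers with dropout 0.3 and 0.5, and six output classes for 250 × 250 images. Channel widths, kernel sizes, and the grayscale input are assumptions.

```python
import torch.nn as nn

def conv_block(c_in, c_out):
    # One feature-expression layer: convolution + ReLU + 2x2 pooling.
    return nn.Sequential(nn.Conv2d(c_in, c_out, 3, padding=1),
                         nn.ReLU(inplace=True),
                         nn.MaxPool2d(2))

class ICImageClassifier(nn.Module):
    """AlexNet-inspired classifier for 250 x 250 IC images (sketch of Table 2)."""

    def __init__(self, num_classes=6):
        super().__init__()
        # Four feature-expression layers; channel widths are assumed values.
        # Spatial sizes: 250 -> 125 -> 62 -> 31 -> 15 after the four poolings.
        self.features = nn.Sequential(conv_block(1, 32), conv_block(32, 64),
                                      conv_block(64, 128), conv_block(128, 128))
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(128 * 15 * 15, 512), nn.ReLU(inplace=True),
            nn.Dropout(0.3),                      # dropout 0.3 after first FC
            nn.Linear(512, 256), nn.ReLU(inplace=True),
            nn.Dropout(0.5),                      # dropout 0.5 after second FC
            nn.Linear(256, num_classes),          # softmax applied by the loss
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```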

4.2 Refinement mechanism of image classification

After training, the IC component images were input into the CNN structure, and the Top-1 prediction was selected as the image class; the IC part number and its angle could then be determined. Consequently, misplaced parts and reversed polarities could be identified by the system. However, if the output probability distribution was nearly uniform across all classes, then the image class was difficult for the CNN to predict.

To address this problem, the softmax was removed in the testing phase, and the system recorded all scores over the training data for the different image classes. The lowest score of each class was set as the threshold of that class. An input image whose score was lower than the threshold of its predicted class was then considered abnormal (a misplaced or missing part). A sketch of this refinement is given below.
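A per-class version of the threshold calibration from Section 3.3 can be sketched as follows, under the same assumptions as before:

```python
import torch
from collections import defaultdict

@torch.no_grad()
def calibrate_class_thresholds(model, train_loader):
    """Record the lowest raw score per class over the training data."""
    model.eval()
    thresholds = defaultdict(lambda: float("inf"))
    for images, labels in train_loader:
        logits = model(images)                   # softmax removed at test time
        for z, y in zip(logits, labels):
            y = int(y)
            thresholds[y] = min(thresholds[y], z[y].item())
    return dict(thresholds)

@torch.no_grad()
def classify_ic_image(model, image, thresholds):
    """Top-1 class, flagged abnormal if its raw score is below its class threshold."""
    logits = model(image.unsqueeze(0))[0]
    score, cls = logits.max(dim=0)
    if score.item() < thresholds[int(cls)]:
        return "abnormal"                        # misplaced or missing part
    return int(cls)
```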

5 Experimental results

To verify the efficiency of the proposed method, real IC components were used in the experiments. The system was evaluated on whether it could identify all of the strings and angles of the IC components, and a standard passing rate was used. If the characters did not match the database, the component was classified as misplaced. If no character was detected, the part was regarded as missing. If the system verified the characters correctly but the angle was incorrect, the component was classified as having reversed polarity.

For image classification, the system was evaluated on whether it could output the item number and the correct angle of the IC component, and a standard passing rate was used. Testing was conducted using the deep-learning tools Caffe and NVIDIA DIGITS in a learning environment with a single NVIDIA GTX 1080 Ti GPU.

5.1 Character verification on IC components

5.1.1 Training process of character verification

In the training process, each character was assigned a different image type according to its angle (0, 90, 180, or 270 degrees). Characters that look identical at different angles (for example, "8" looks the same at 0 and 180 degrees) were removed from the type lists. In total, 600 characters and 120 types were used as the training data, and each type used approximately 100 characters as the testing data. Table 3 lists the training parameters. Figure 11 presents the learning process; the system converged within 10 epochs, and training required only 3 min. A sketch of this type-generation step follows Fig. 11.

Table 3 Training parameters of character verification

Fig. 11 Learning graph of character verification
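As an illustration of this data-preparation step, each character image can be expanded into its four orientations while skipping angles that duplicate another orientation. The rotation-symmetric set below is purely illustrative, not the list used in this study.

```python
import numpy as np

# Angles at which a character duplicates its 0-degree appearance
# (illustrative examples only).
ROTATION_SYMMETRIC = {"8": {180}, "0": {180}, "O": {180}}

def make_rotation_types(char, image):
    """Yield (type_label, image) pairs for the four orientations of a character,
    skipping angles that duplicate an earlier orientation."""
    for k, angle in enumerate((0, 90, 180, 270)):
        if angle in ROTATION_SYMMETRIC.get(char, set()):
            continue                 # e.g., "8" at 180 degrees duplicates 0
        yield f"{char}_{angle}", np.rot90(image, k)
```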

5.1.2 Experimental results of character verification

IC components numbered IC001 to IC010 were used in the character-verification experiments, with approximately 1500 images of these IC component types; each image contained approximately 20 characters and symbols. The CNN with contour border detection was used in this experiment. Table 4 demonstrates that without the refinement mechanism, 529 images on average were misjudged and the passing rate was only 69.73%. As shown in Table 5, when the refinement mechanism was added, the passing rate increased substantially to 98.84%. Although the average execution time was longer than before, it still met manufacturers' standards.

Table 4 Character verification without CNN output score refinement
Table 5 Character verification with CNN output score refinement

Table 6 presents a passing-rate comparison of the proposed method with other methods, namely Shi et al. [8], SSD [9], YOLO [14], and a conventional AOI machine, under the same conditions. The high complexity of Shi et al. [8] and SSD [9] required more time to train their deep-learning networks. The conventional AOI machine also needed more setup time because the parameters for all IC components were adjusted manually. YOLO [14] exhibited a short average execution time; however, many characters were missed, causing YOLO to have a lower passing rate than the proposed method. The proposed method required less time for training, and it is generalizable because it does not require developing new training data, as the other methods do.

Table 6 Results of different methods

Figure 12 displays examples of success using the proposed method. The left side presents the results of contour border detection, and the right side displays the degrees and the strings, separated by a comma.

Fig. 12 Examples of success

Figure 13 displays some misjudged examples, most of which were caused by blurry characters, noise, or scores failing to reach the threshold. For example, the dots were not detected in Fig. 13(a) and (b), the number "4" was verified as "A" in Fig. 13(b), the number "8" was identified as "B" in Fig. 13(c), and the score failed to reach the threshold in Fig. 13(d), in which the fractured character "H" was not connected.

Fig. 13 Examples of misjudgment

5.2 Image classification on IC components

5.2.1 Training process of image classification

Six classes of IC images, numbered IC011 to IC016, were used in the experiment. Each class of image was classified in four directions, and each class had approximately 700 images. First, black pixels were padded to the image borders to ensure that all images were the same size; the images were then resized to 250 × 250 for training. In total, 10% of the training data was used as the testing data. Table 7 lists the parameters used in training, and the learning graph is displayed in Fig. 14. The total training time was 45 min. A sketch of the padding step is given below.
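The padding-and-resizing step can be sketched with OpenCV as follows; centering the image within the square and the interpolation mode are assumptions.

```python
import cv2

def pad_and_resize(image, size=250):
    """Pad an image to a square with black pixels, then resize to size x size."""
    h, w = image.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    # Pad the borders with black (value 0) so all images share one aspect ratio.
    squared = cv2.copyMakeBorder(image, top, side - h - top,
                                 left, side - w - left,
                                 cv2.BORDER_CONSTANT, value=0)
    return cv2.resize(squared, (size, size), interpolation=cv2.INTER_AREA)
```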

Table 7 Classification parameters of IC component images

Fig. 14 Learning graph of classification on IC component images

5.2.2 Experimental results of image classification

Table 8 lists the passing rates of six classes of IC images. The average passing rate was 99.48%, and the average execution time for each IC image was 23.2 ms.

Table 8 Passing rate of IC component image classification

Many deep-learning approaches, such as VGGNet [15], GoogLeNet [16], AlexNet [17], and ResNet [18], exhibit good performance in image classification. In a more recent algorithm, Liu et al. [19] proposed a new pooling technique that combines two consecutive convolutional layers into a pooling operation: convolutional layers from a deep CNN are used first, the pretrained CNN is then applied to densely sampled image regions, and the fully connected activations of each region are treated as the feature activations of a convolutional layer; another convolutional layer is trained on top as the pooling-guidance convolutional layer. To improve recognition accuracy and decrease the number of CNN parameters, Zhang et al. [20] proposed the hybrid model CNN-GRNN, which extracts features using a CNN and then classifies images with a general regression neural network (GRNN), which has only one variable and requires no iteration. Chen et al. [21] designed a more discriminative feature-coding network called LSO-VLADNet. Expanding the NetVLAD model, LSO-VLADNet is an end-to-end feature-coding network that can be jointly trained with a deep convolutional neural network for visual recognition. The feature-coding method is the core component of this framework; it links feature extraction and feature pooling and greatly influences image-classification performance.

The evaluation indexes consist of the passing rate and the training time. Figure 15 presents the passing rates of various methods using the same training and testing data for comparison. The number of epochs was set to 50, and the optimizer was Adam [13].

Fig. 15 Passing rate comparison of various methods

According to the experimental results in Fig. 15, the passing rate of the proposed method is better than those of the other methods, except ResNet [18] and CNN-GRNN [20]. Although ResNet [18] shows the highest passing rate among all methods, its training time differs markedly, as shown in Table 9. Furthermore, Table 10 shows that the proposed method has clear advantages in execution speed and loading time per image. CNN-GRNN also achieves a higher passing rate than the proposed method; however, both the CNN and GRNN networks must be trained, which makes the training procedure more complex and prolongs the training time. CNN-GRNN might therefore entail tedious work if employed by a manufacturer. Thus, the proposed method not only reduces the required time but is also accessible and flexible for manufacturers.

Table 9 Training time comparison of various methods
Table 10 Execution speed and loading time comparison of ResNet and proposed method

6 Conclusions

The proposed deep-learning methods were used in PCB testing, which involved character verification on IC components and classification of IC component images. The character-verification method employed contour border detection with a CNN and then refined the CNN output scores to increase accuracy. The image-classification method employed a different CNN structure with the same refinement mechanism to increase accuracy.

According to the experimental results, the passing rates of the two methods reached 98.84% and 99.48%, respectively, and the training times were shorter than those of the other methods. Both methods met manufacturer requirements and have been implemented on production lines. In future work, the program will automatically retest an image after it has been misjudged, and the program will be embedded in machines to shorten processing time after fetching massive numbers of images from cameras.