1 Introduction

In the manufacturing industry, it is important to achieve high product quality along with high productivity. The rapid growth of the product range and the reduction of manufacturing time, together with the increasing complexity of finished products, are first-priority issues in modern manufacturing [1]. The ultimate goal for manufacturers is to achieve 100% quality control, which means that every single part or product on the assembly or production line is inspected and either accepted or rejected. Such a state is not easy to achieve because, with the classic approach to final product quality control, the operator has to check every part manually. This process is time-consuming and highly repetitive, which can lead to operator inattention and thus to more mistakes. Especially for fully automated production or assembly lines, implementing 100% product quality control has proven to be a rather complicated task [2]. Automated visual inspection of products based on computer vision principles may be the right solution for such a situation [3, 4]. Furthermore, the main ways of creating a unified information space in the quality management system were presented in [5, 6].

2 Literature Review

The term “computer vision” is nowadays generally associated with systems that work automatically based on the information acquired by one or more cameras. In industrial production, the term machine vision is often used instead. Machine vision systems come in several types: camera sensors, intelligent cameras, PC-based systems, and custom systems, which differ in hardware performance. Figure 1 shows a simplified scheme of a machine vision system in which data are processed on a personal computer. This type of system is also used to solve our task, which is to recognize specific objects in the output images of the camera [7].

Fig. 1. Visual inspection system with industrial camera and computer data processing.

Systems for automated visual inspection based on machine vision generally consist of the following basic parts:

  • An imaging device – usually a camera that consists of an image sensor, lenses, polarizing glass, protective cover, and other special parts;

  • Suitable lighting method for use in specific application conditions;

  • Frame grabber (which is often no longer necessary when using modern digital cameras);

  • Personal computer with adequate hardware performance for further processing and evaluation of the image output.

The main principle of the machine vision system is that the camera captures images that are sent to the computer over a standard interface such as Camera Link, GigE, USB or Ethernet. These images are subsequently evaluated in the computer by a previously created algorithm [8].

Computer vision and machine learning have been widely implemented and used in different types of production [9,10,11], as well as in society [12,13,14]. A comprehensive overview of applications that use machine learning is presented in [15].

The study [16] focuses on the possibility of using the effective recognition algorithms of the OpenCV library in the computer vision area. It also discusses different algorithms in detail and compares their performance and recognition quality.

Deep learning is a part of machine learning, which in turn belongs to the larger group of artificial intelligence methods [17, 18]. Deep learning, like “ordinary” machine learning, is mainly based on the use of different types of so-called neural networks. The difference is that deep learning networks contain a larger number of hidden layers and are therefore called deep neural networks. Convolutional neural networks seem to be the most successful type of deep neural network for image processing, thanks to their significant results in tasks such as image classification and object detection. Convolutional networks, as their name suggests, work on the principle of convolution, which is a linear operation suitable mainly for processing data with a grid topology. Every network that uses convolution in at least one of its hidden layers in place of general matrix multiplication can be called a convolutional network [19].
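The convolution principle described above can be illustrated with a minimal sketch. It is written in plain Python for readability rather than in the MATLAB environment used in this work, and the function name and the tiny input values are illustrative assumptions. As in most deep learning libraries, the kernel is not flipped, so the operation is strictly a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum the products."""
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            # Weighted sum of the image patch under the kernel
            for i in range(kernel_h):
                for j in range(kernel_w):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        output.append(row)
    return output


# Illustrative 3x3 "image" and 2x2 kernel (made-up values)
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)  # 2x2 feature map
```

Because every output value is a weighted sum of input values, scaling the input scales the feature map by the same factor, which is what makes the operation linear.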

Every neural network consists of three general parts: an input layer, a hidden layer (or several layers in the case of deep neural networks) and an output layer. The input layer loads the input data, which can be of various formats; in our case of image processing, these are 2D image matrices. Hidden layers process the data from the input layer, and the types of hidden layers chosen depend on the main task that the neural network has to fulfill. In convolutional networks, the most commonly used layer types are Convolutional Layers, Dropout Layers, Batch Normalization Layers, ReLU Layers, Softmax Layers, and Fully Connected Layers. Besides the proper choice of layer types and their order, the values of so-called hyperparameters are also very important. These include filter size, number of filters, number of channels, stride, padding options, number of epochs and so on. By altering these and many other hyperparameters, it is possible to achieve results with a varying level of success; in simple terms, they make it possible to tune the performance of the created network [20].
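How filter size, stride and padding interact can be shown with the standard output-size relation for a convolutional or pooling layer. The following Python sketch is illustrative only; the function name is an assumption, and the sizes used in the example calls are chosen for demonstration.

```python
def conv_output_size(input_size, filter_size, stride=1, padding=0):
    """Spatial output size along one dimension of a convolution or pooling layer."""
    return (input_size + 2 * padding - filter_size) // stride + 1


# A 227-pixel side with an 11-pixel filter and stride 4 (the first AlexNet layer)
alexnet_first = conv_output_size(227, 11, stride=4)

# A 2-pixel filter with stride 1 and no padding shrinks each side by one pixel
small_filter = conv_output_size(227, 2)

# A 3-pixel filter with one pixel of padding preserves the input size
same_size = conv_output_size(227, 3, padding=1)
```

A larger stride or filter reduces the feature map faster, while padding counteracts the shrinkage, so these three values together determine how much spatial detail reaches the later layers.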

Depending on the network architecture, convolutional networks can serve to solve various tasks like classification, regression or object detection.

3 Research Methodology

Inspection of specific parts on a printed circuit board (PCB) was chosen as an example of a Deep Learning application for object detection. The main aim of this research is to test the possibility of creating a functional system for recognizing and classifying objects of a certain shape and type in specific images. In industrial applications, special industrial cameras are often used as recording devices. In this project, however, only an ordinary camera and ordinary artificial lighting were used, since the goal was only to create a demonstration network and test its functionality on several test samples.

The example was programmed in MATLAB R2018b from MathWorks. According to the producer, MATLAB is a programming platform designed specifically for engineers and scientists. Its heart is the MATLAB language, a matrix-based language that allows the most natural expression of computational mathematics.

This platform includes various specialized modules, such as modules for Deep Learning and Machine Vision. The main task of this experiment was to detect specific types of components and decide whether the type, placement, and orientation of these components are correct. During the experiment, three types of components were detected: the Valor FL1173 and Valor PT0018 transformers and the Parallel Tasking II 3Com chip, displayed in Fig. 2.

Fig. 2. Sample of training pictures.

As the first step, it was necessary to design an appropriate system. According to the previously defined task, the system was divided into two separate subsystems. The first one is designed to detect the objects of interest. Regions with Convolutional Neural Networks (R-CNN) were selected for this purpose. The proposed R-CNN was created using the so-called Transfer Learning method, which reuses an existing network and changes only its last few layers. The core of this “transferred” network is AlexNet, as this network is easily available and provides a solid foundation for object recognition and detection. To detect the specific objects in this experiment, the last three layers of AlexNet were replaced by a new Fully Connected Layer, Softmax Layer, and Classification Layer. The new layers were added to detect three specific objects in the picture, and training was carried out using hundreds of pictures for each type of detected component. A sample of the training images is shown in Fig. 2.
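The layer replacement used in transfer learning can be sketched schematically as follows. This Python fragment is not the MATLAB code used in the experiment: the layer list is a shortened, hypothetical stand-in for the pretrained network, and the extra "background" output is an assumption that is common for R-CNN detectors but is not stated in this article.

```python
# Hypothetical, shortened stand-in for a pretrained network: (name, description) pairs.
pretrained = [
    ("input", "image input"),
    ("features", "convolutional feature extractor"),
    ("fc_old", "fully connected, 1000 outputs"),
    ("softmax_old", "softmax"),
    ("output_old", "classification, 1000 classes"),
]


def replace_head(layers, num_outputs):
    """Keep all but the last three layers and append a new classification head."""
    kept = list(layers[:-3])
    new_head = [
        ("fc_new", f"fully connected, {num_outputs} outputs"),
        ("softmax_new", "softmax"),
        ("output_new", f"classification, {num_outputs} classes"),
    ]
    return kept + new_head


component_classes = ["Valor FL1173", "Valor PT0018", "Parallel Tasking II"]
# Assumption: one additional output for "background" regions
adapted = replace_head(pretrained, len(component_classes) + 1)
```

Only the new head starts from untrained weights, which is why a few hundred images per component can be enough to adapt the network.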

The following step was to verify the trained R-CNN network on testing samples. The result of this process is shown in Fig. 3. The detected objects are marked with white bounding boxes. The figure also shows annotations beside the rectangular boxes that define the position of the components. Each annotation consists of the label of the detected component, a number from 0 to 1 that represents the confidence with which the object was detected, and an evaluation of the orientation correctness. The results of this experiment have shown that the confidence achieved by the designed system ranged from 0.883 to 0.9994 for various components.

Fig. 3. Results of created network verification on testing images.

One of the tasks consists of detecting and classifying the orientation of the components detected in the previous steps. The chosen parts have a square or rectangular shape, so each of them can be mounted in more than one way. An analysis of the mounting possibilities showed that the transformers can be mounted in two ways and the Parallel Tasking 3Com chip in as many as four, but only one position is right for each component.
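The relation between the number of mounting possibilities and the orientation verdict can be sketched as follows. The Python fragment assumes that the possible orientations are evenly spaced rotations and that class index 0 denotes the correct position; neither assumption is stated explicitly in this article, and all names are illustrative.

```python
# Number of mounting possibilities per component type, as given in the text
ORIENTATION_COUNT = {"transformer": 2, "chip": 4}


def interpret_orientation(component, class_index):
    """Convert a predicted orientation class into a rotation angle and a verdict."""
    count = ORIENTATION_COUNT[component]
    if not 0 <= class_index < count:
        raise ValueError(f"class index {class_index} is out of range for {component}")
    angle = class_index * (360 // count)  # assumption: evenly spaced rotations
    verdict = "Correct" if class_index == 0 else "Wrong"  # assumption: class 0 is correct
    return angle, verdict


chip_upside_down = interpret_orientation("chip", 2)
transformer_reversed = interpret_orientation("transformer", 1)
chip_correct = interpret_orientation("chip", 0)
```

Under these assumptions a transformer can be reported at 0° or 180°, and the chip at 0°, 90°, 180° or 270°.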

4 Results

A separate convolutional neural network (CNN) was designed for every component. CNNs are basically composed of only a few types of layers, and their order is often similar. The structure of the network used for component classification is the following: imageInputLayer([227 227 3]); convolution2dLayer(2,2); reluLayer; maxPooling2dLayer(2,'Stride',1); convolution2dLayer(2,2); fullyConnectedLayer(4); softmaxLayer; classificationLayer().
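To make the listed structure more concrete, the following Python sketch traces how the feature-map size changes through the layers before the fully connected layer. It assumes that both convolution layers use a 2×2 filter with 2 filters, stride 1 and no padding (the MATLAB defaults when only these two arguments are given); the tuple encoding of the layers is an illustrative convention, not part of the original program.

```python
# (kind, size, extra): extra is the number of filters for "conv", the stride for "pool".
LAYERS = [("conv", 2, 2), ("relu", None, None), ("pool", 2, 1), ("conv", 2, 2)]


def trace_shapes(input_shape, layers):
    """Return the (height, width, channels) shape after the input and after each layer."""
    height, width, channels = input_shape
    shapes = [(height, width, channels)]
    for kind, size, extra in layers:
        if kind == "conv":
            # Stride 1, no padding: each side shrinks by (filter size - 1)
            height, width, channels = height - size + 1, width - size + 1, extra
        elif kind == "pool":
            height = (height - size) // extra + 1
            width = (width - size) // extra + 1
        # "relu" is element-wise and leaves the shape unchanged
        shapes.append((height, width, channels))
    return shapes


square = trace_shapes((227, 227, 3), LAYERS)
rectangular = trace_shapes((113, 227, 3), LAYERS)

# Number of values fed into the fully connected layer for the square input
fc_inputs = square[-1][0] * square[-1][1] * square[-1][2]
```

Under these assumptions the spatial size is reduced only slightly, so most of the trainable weights sit in the fully connected layer.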

The layers listed above are those used for training the CNN for the orientation classification of the Parallel Tasking component. The parameters of the first layer reflect the fact that the detected object is square-shaped, and the same holds for VALOR FC. The situation is different for VALOR FL, whose shape is rectangular. In this case, the imageInputLayer parameters were set to [113 227 3], which represents its rectangular shape. The choice of the other layers was based on the network simplicity criterion and on the fact that this network is only one part of a bigger system that consists of four separate neural networks. Such a composite system may be somewhat slower, but it is an appropriate solution to the selected task. All in all, the middle layers of the created network consist of three basic types: convolution2dLayer, ReLU Layer and maxPooling2dLayer.

The last three layers represent the classification process. For this task, the most important value is the output size of fullyConnectedLayer, which equals the number of possible mounting orientations of the detected component. The values were selected according to Table 1, and the neural network was then trained with the following options: solver: sgdm; MaxEpochs: 20; InitialLearnRate: 0.0005; ExecutionEnvironment: gpu (the network was trained and tested on an NVIDIA GTX 1080 Ti graphics card).
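The sgdm option stands for stochastic gradient descent with momentum. Its update rule can be sketched in Python as follows; the momentum value of 0.9 is an assumption (the article does not state it), and the quadratic toy function only stands in for the real training loss.

```python
def sgdm_step(params, grads, velocity, learn_rate=0.0005, momentum=0.9):
    """One update of stochastic gradient descent with momentum."""
    # The velocity accumulates a decaying sum of past gradient steps
    new_velocity = [momentum * v - learn_rate * g for v, g in zip(velocity, grads)]
    new_params = [p + v for p, v in zip(params, new_velocity)]
    return new_params, new_velocity


# Toy problem: minimize f(x) = x^2, whose gradient is 2x
x, v = [1.0], [0.0]
for _ in range(20):  # 20 update steps, not 20 epochs
    grad = [2.0 * x[0]]
    x, v = sgdm_step(x, grad, v)
```

In real training, each epoch consists of many such steps, one per mini-batch of training images.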

Table 1. The possibilities of component orientation during the assembly process.

Afterward, a separate program was created, into which the four created neural networks and the testing pictures were loaded. As the first step, the program recognized the desired object (representing a specific component) in the picture and marked it with the previously described bounding box. As mentioned above, the annotation of this bounding box consists of the object label, confidence number, and orientation value. The final output from the main program is a picture with all the required information.

Figure 3 shows the results of the created network verification. In the picture on the left, the created network detected all desired components correctly, with confidence in the range between 0.8838 and 0.999, which is a satisfactory result.

All parts were assembled correctly, so the network also correctly detected their orientation, which was in all cases 0° and Correct. In the picture on the right, the network correctly classified all components with approximately the same confidence. However, not all components were oriented correctly. The Parallel Tasking chip was rotated upside-down, and the created network had no problem detecting its orientation: it marked the chip as wrong, with a rotation of 180° from the desired position.
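The flow of such a program, with one detection network followed by a separate orientation network for each component type, can be sketched in Python as follows. The function names, the annotation format and the dummy stand-ins for the trained networks are all hypothetical, and the numbers are placeholders rather than measured results.

```python
def inspect_image(image, detector, orientation_classifiers):
    """Run detection, then send each detected region to the classifier for its label."""
    annotations = []
    for label, confidence, box in detector(image):
        classify = orientation_classifiers[label]
        angle, verdict = classify(image, box)
        annotations.append(f"{label}: {confidence:.2f}, {angle} deg, {verdict}")
    return annotations


# Placeholder stand-ins so that the sketch runs; the values are dummies, not results.
def dummy_detector(image):
    return [("chip", 0.5, (10, 10, 40, 40)), ("transformer", 0.5, (60, 10, 30, 50))]


dummy_classifiers = {
    "chip": lambda image, box: (180, "Wrong"),
    "transformer": lambda image, box: (0, "Correct"),
}

report = inspect_image(None, dummy_detector, dummy_classifiers)
```

Keeping the classifiers in a dictionary keyed by the detected label means that a further component type only requires one more entry.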

5 Conclusions

In order to achieve high-quality products, the manufacturer must be able to ensure that the products shipped to the market meet the declared quality and performance. Different methods are used during the product control process, and one of the basic methods is visual quality control. In industry, and in serial production in particular, it is necessary to inspect huge quantities of products in a short time. Therefore, it is appropriate to employ automated visual inspection systems in this area. Such systems also have many advantages over manual inspection by workers: they can work 24 h a day and 7 days a week, they provide a simple way to collect data from the inspection process and, last but not least, they replace monotonous work. In these systems, the camera captures an image that is subsequently processed by a particular algorithm.

Deep convolutional neural networks are one of the most suitable methods for processing such types of input. The method presented in this article is based on the principle of transfer learning: an existing functional network is taken, some of its layers are changed, and the network is adapted to a particular application using a brand new set of training data. The program consists of four networks with the common aim of detecting, locating and classifying specific Printed Circuit Board components. The present research is considered an initial attempt at building neural networks for visual quality control of specific types of products in industrial practice.