1 Introduction

Worldwide, the steel industry is one of the most important strategic industries, and quality is an important competitive factor in its success. Detecting surface defects accounts for a large share of the quality control process needed to satisfy customers' needs [1, 2]. Defect detection and classification can be performed manually by human inspectors; however, manual inspection is slow and subject to human error and hazards. Therefore, traditional automatic inspection systems were developed to detect various faults. These include eddy current testing, infrared detection, magnetic flux leakage detection, and laser detection. These methods cannot detect all faults, especially tiny ones [3]. This has motivated many researchers [4, 5] to develop computer vision systems capable of detecting and classifying defects in ceramic tiles [6], textile fabrics [7], and steel [8]. Achieving defect detection, localization, and classification in real time is one of the challenges in the steel production process. Therefore, the main aim of this chapter is to propose parallel algorithms that detect and classify patches, scratches, and scale defects on steel strip surfaces in real time.

The rest of this chapter is organized as follows. Section 2 reviews related work. Section 3 presents the proposed algorithms. Section 4 discusses the experimental setup and results. Section 5 concludes this chapter.

2 Related Work

Image processing plays a major role in the steel production industry in enhancing product quality. In the literature, many image-processing algorithms have been proposed to detect various defects using feature extraction techniques. Many features have been used for defect localization and type identification, including color, texture, shape, and geometric features [9]. The common techniques for feature extraction in steel images fall into four approaches [10]: statistical methods, structural algorithms, filtering methods, and model-based techniques, as shown in Fig. 1a.

Fig. 1 Related work

Statistical approaches usually use properties of the histogram curve to detect defects, for example histogram statistics, autocorrelation, local binary patterns, grey level co-occurrence matrices [11], and the multivariate discriminant function [12]. Structural approaches rely on basic image processing and edge detection operations. Because different defects can exhibit similar edge information, it is hard to classify defect types with these approaches. Filter-based methods convolve the image with filter masks to compute the energy or response of the filter. Filters can be applied in the frequency domain [13], in the spatial domain, or in the combined spatial-frequency domain [14]. Model-based approaches, which include fractals, random field models, autoregressive models, and the epitome model [10], extract a model or a shape from images. Figure 2 lists methods used to detect two types of surface defects on steel strips.

Fig. 2 Defects detection and classification techniques

There are many approaches to extracting features in parallel. Lu et al. [15] proposed an adaptive pipeline parallel scheme for constant input workloads and implemented an efficient version of it for variable input workloads; they report speedups of up to 52.94% and 58.82% with only 3% performance loss. Zhang et al. [16] proposed a model that generates the gray level run length matrix (GLRLM) and extracts multiple features for many ROIs in parallel on a graphics processing unit (GPU), achieving a five-fold speedup over an improved sequential equivalent.

The classification process is the main consideration in the inspection system. Generally, there are two types of classification methods, supervised and unsupervised, as presented in Fig. 1b. In supervised classification, training samples are labeled, and their features are given to the classifier to generate the training model. The training model then predicts the pre-defined classes for test samples [10]. These methods include SVM, neural networks, nearest neighbors, etc. Yazdchi et al. [17] applied a neural network (NN) to classify steel images and achieved an accuracy of 97.9%. Yun et al. [18] suggested a support vector machine (SVM) classifier for defect detection on scale-covered steel wire rods. In unsupervised classification, the classifier learns on its own and is not fed with labeled data; it simply tries to group similar objects together based on the similarity of their features [19]. The most common methods include K-means, SOM (self-organizing map) [20], and LVQ (learning vector quantization) [13]. Figure 2 lists some defect detection and classification methods. The key parameters of defect classification methods are accuracy and efficiency. This chapter employs the SVM.

3 Proposed Algorithms

This chapter develops parallel algorithms to detect and classify patches, scratches, and scale defects on steel strip surfaces. Figure 3 shows the high-level design of the proposed defect detection and classification technique. The first phase preprocesses the image to enhance it and remove noise. The second phase detects defects in the steel image and segments it into defective ROIs. The third phase extracts Haralick features from the gray level co-occurrence matrix (GLCM). Finally, these features are used as inputs to the SVM classifier.

Fig. 3 High-level architecture of the proposed algorithm

3.1 Preprocessing Phase

Steel surface images are subject to various types of noise caused by the image acquisition setup, lighting conditions, or material reflections. Preprocessing is therefore an important step to eliminate light reflections and noise. It consists of image enhancement and noise reduction. Image enhancement comprises two steps that make the image clearer. First, the RGB image is converted to grayscale and resized to M × N. Then, a contrast stretching operation enhances the image by stretching the intensity values to the full range from 0 to 255. To remove noise, this chapter uses a median filter, which removes salt-and-pepper noise and also smooths the image [21, 22].
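The two preprocessing operations can be illustrated with a minimal Python sketch. It assumes the image is already grayscale and stored as a list of rows of integers in the range 0 to 255; the function names `contrast_stretch` and `median_filter3` and the border handling (clamping to the nearest pixel) are illustrative assumptions, not details of the chapter's OpenCV/CUDA implementation.

```python
def contrast_stretch(img):
    """Linearly map the intensity range [min, max] of img onto [0, 255]."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[round((p - lo) * 255 / (hi - lo)) for p in row] for row in img]


def median_filter3(img):
    """3 x 3 median filter. Assumption: borders are handled by clamping
    coordinates to the nearest valid pixel."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            out[y][x] = sorted(window)[4]  # middle of the 9 sorted values
    return out


# A single "salt" pixel (255) in a flat region is removed by the filter.
noisy = [[100, 100, 100], [100, 255, 100], [100, 100, 100]]
clean = median_filter3(noisy)
stretched = contrast_stretch([[50, 100], [150, 200]])
```

Because an isolated outlier never reaches the middle of the sorted 3 × 3 window, the median filter removes it without averaging it into its neighbors, which is why it suits salt-and-pepper noise.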

3.2 Defect Detection Phase

In this phase, the algorithm divides the M × N grayscale steel image into blocks (ROIs) of size W × H. It then extracts statistical features for each ROI and uses the multivariate discriminant function [12] to decide whether the ROI is defective.

  1. Features Extraction: The proposed algorithm divides the M × N grayscale image into ROIs of size W × H, where W ≪ M and H ≪ N. To characterize the shape of the surface defects and decide whether a ROI is defective, the algorithm extracts the following statistical features for each ROI: difference (δ), mean (μ), and variance (υ), as in Eqs. (3), (4), and (5). It then forms the mean vector (MV) of each ROI as in Eq. (6). Extracting these features requires many operations and may take too long for real-time defect detection. Consequently, this chapter uses a Summed Area Table (SAT) [3] to reduce the time required to compute them. The SAT quickly generates the sum of the values in a rectangular subset of a grid using Eq. (1), where i(x, y) is the pixel value of the given image and S(x, y) is the value of the summed area table [23]. For an M × N image, the SAT is created with O(M × N) complexity. Once it is created, the sum of the pixels in any rectangle within the original image can be computed in constant time, O(1), by Eq. (2), where (x0, y0) is the top-left corner of the rectangle and x1 and y1 are its extents along x and y.

$$ S\left(x,y\right)=i\left(x,y\right)+S\left(x-1,y\right)+S\left(x,y-1\right)-S\left(x-1,y-1\right) $$
(1)
$$ \mathrm{SUM}=S\left({x}_0-1,{y}_0-1\right)+S\left({x}_0+{x}_1,{y}_0+{y}_1\right)-S\left({x}_0+{x}_1,{y}_0-1\right)-S\left({x}_0-1,{y}_0+{y}_1\right) $$
(2)
$$ \delta =\mathrm{Diff\_Value}\left(\mathrm{ROI},W,H\right) $$
(3)
$$ \mu =\mathrm{Mean\_SAT}\left(\mathrm{Image},{x}_0,{y}_0,W,H\right) $$
(4)
$$ \upsilon =\mathrm{Variance\_SAT}\left(\mathrm{Image},{x}_0,{y}_0,W,H\right) $$
(5)
$$ \mathrm{MV}={\left[\delta \kern0.37em \mu \kern0.37em \upsilon \right]}^T $$
(6)

Consequently, the SAT avoids iterating over the pixels of each ROI repeatedly and significantly reduces the time required to process the images. In this chapter, we implemented the SAT algorithm in parallel using CUDA [24, 25], as shown in Fig. 4.
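The recurrence in Eq. (1) and the constant-time rectangle sum in Eq. (2) can be sketched sequentially in Python as follows. The sketch pads the table with a zero row and column so that the S(x − 1, ·) and S(·, y − 1) terms never index outside the table, and it takes the rectangle width and height as arguments instead of the extents x1 and y1. Computing the variance from a second table built over squared pixel values is an assumption made here for illustration; the chapter does not spell out how Variance_SAT is computed, and all function names are hypothetical.

```python
def build_sat(img):
    """Summed area table with a zero border: S has (H + 1) x (W + 1) entries
    and S[y + 1][x + 1] is the sum of img[0..y][0..x] (Eq. 1)."""
    h, w = len(img), len(img[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        for x in range(w):
            S[y + 1][x + 1] = img[y][x] + S[y][x + 1] + S[y + 1][x] - S[y][x]
    return S


def rect_sum(S, x0, y0, w, h):
    """Sum of the w x h rectangle whose top-left pixel is (x0, y0),
    using four table look-ups (Eq. 2)."""
    return S[y0 + h][x0 + w] - S[y0][x0 + w] - S[y0 + h][x0] + S[y0][x0]


def roi_mean_var(S, S2, x0, y0, w, h):
    """Mean and population variance of a ROI.
    Assumption: S2 is a SAT built over the squared pixel values."""
    n = w * h
    mean = rect_sum(S, x0, y0, w, h) / n
    var = rect_sum(S2, x0, y0, w, h) / n - mean * mean
    return mean, var


img = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
S = build_sat(img)
S2 = build_sat([[p * p for p in row] for row in img])
mean, var = roi_mean_var(S, S2, 1, 1, 2, 2)  # ROI covering 5, 6, 8, 9
```

Both tables are built once per image in O(M × N); after that, every ROI costs a fixed number of look-ups regardless of its size.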

Fig. 4 Parallel SAT Algorithm in CUDA

  2. Defect Detection: The defect detection algorithm divides the image into ROIs and decides whether each ROI belongs to the defective group (G1) or the non-defective group (G2). MV1 and MV2 are the mean vectors that contain the statistical features of G1 and G2, respectively, and MVROI denotes the mean vector that contains the features of a ROI [12]. The two groups represent the defective and non-defective pixels in the image. To separate the pixels into these two groups, we create two Gaussian mixture models (GMMs) [26]. An iterative Expectation-Maximization (EM) algorithm estimates the maximum-likelihood parameters of both GMM1 and GMM2 as in Eq. (7), starting from guessed values of the weight α, mean m, and variance σ [27]. EM consists of three steps: the first chooses initial parameter values; the second, the E-step, evaluates the responsibilities using the current parameter values; and the third, the M-step, re-estimates the parameters using the current responsibilities [28]. By the maximum likelihood function ML(p) in Eq. (8), a pixel p belongs to G1 if its likelihood under GMM1 is larger than or equal to its likelihood under GMM2; otherwise, it belongs to G2.

(7)
(8)
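The E-step and M-step described above can be illustrated with a small sequential Python sketch. It is a simplification: instead of the chapter's two separate models GMM1 and GMM2, it fits one mixture of two 1D Gaussians to pixel intensities and then assigns each pixel to the more likely component. The initialization rule, the variance floor of 1e-6, the fixed iteration count, and the function names are assumptions chosen for illustration.

```python
import math


def gauss(x, m, var):
    """Density of a 1D Gaussian with mean m and variance var."""
    return math.exp(-(x - m) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_two_gaussians(xs, iters=50):
    """Fit weights alpha, means m, and variances var of a two-component mixture."""
    lo, hi = min(xs), max(xs)
    # Step 1: initial guesses (assumption: spread across the intensity range).
    alpha = [0.5, 0.5]
    m = [lo + 0.25 * (hi - lo), lo + 0.75 * (hi - lo)]
    var = [((hi - lo) / 4) ** 2 + 1e-6] * 2
    for _ in range(iters):
        # E-step: responsibility of each component for each pixel.
        resp = []
        for x in xs:
            p = [alpha[k] * gauss(x, m[k], var[k]) for k in range(2)]
            s = (p[0] + p[1]) or 1e-300  # guard against underflow to zero
            resp.append((p[0] / s, p[1] / s))
        # M-step: re-estimate parameters from the responsibilities.
        for k in range(2):
            nk = sum(r[k] for r in resp) or 1e-300
            alpha[k] = nk / len(xs)
            m[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var[k] = sum(r[k] * (x - m[k]) ** 2 for r, x in zip(resp, xs)) / nk + 1e-6
    return alpha, m, var


def most_likely_component(x, alpha, m, var):
    """Index (0 or 1) of the component with the larger weighted likelihood."""
    p = [alpha[k] * gauss(x, m[k], var[k]) for k in range(2)]
    return 0 if p[0] >= p[1] else 1


# Hypothetical intensities: a dark cluster and a bright cluster.
pixels = [18, 20, 22, 20, 19, 21, 198, 200, 202, 201, 199, 200]
alpha, m, var = em_two_gaussians(pixels)
```

Within one iteration, the per-pixel work in the E-step and the per-pixel terms of the M-step sums are independent of each other, which is what makes a one-thread-per-pixel GPU mapping natural.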

To decide whether the image is defective, we apply the multivariate discriminant function, Ω, to each ROI in the image [5, 12]. The multivariate discriminant function applies the Mahalanobis distance rule \( {\varDelta}^2 \) in Eq. (9) [12, 29]. If the Mahalanobis distance between the ROI and G1 is less than or equal to the Mahalanobis distance between the ROI and G2, then the ROI is defective; otherwise, the ROI is non-defective, as in Eq. (10). The multivariate discriminant function in Eq. (11) is derived from Eqs. (9) and (10), with the common covariance matrix CCV used for both groups, where T denotes the matrix transpose [12]:

$$ {\varDelta}^2\left(\mathrm{MVROI},\mathrm{MV}\right)={\left(\mathrm{MVROI}-\mathrm{MV}\right)}^T{\mathrm{CV}}^{-1}\left(\mathrm{MVROI}-\mathrm{MV}\right) $$
(9)
$$ {\varDelta}^2\left(\mathrm{MVROI},{\mathrm{MV}}_1\right)\le {\varDelta}^2\left(\mathrm{MVROI},{\mathrm{MV}}_2\right) $$
(10)
$$ \varOmega ={\left({\mathrm{MV}}_1-{\mathrm{MV}}_2\right)}^T{\mathrm{CCV}}^{-1}\ \mathrm{MVROI}-\frac{1}{2}{\left({\mathrm{MV}}_1-{\mathrm{MV}}_2\right)}^T{\mathrm{CCV}}^{-1}\left({\mathrm{MV}}_1+{\mathrm{MV}}_2\right) $$
(11)

To apply the discriminant function Ω, we need to calculate the covariance matrix, CV, of each of the groups G1 and G2 by Eq. (12), where N_i denotes the number of pixels in group i and x_ij denotes the j-th pixel of group i. The common covariance matrix (CCV) is then calculated by Eq. (13):

$$ {\mathrm{CV}}_i=\frac{1}{N_i-1}\ \sum \limits_{j=1}^{N_i}\left({x}_{ij}-{\mathrm{MV}}_i\right){\left({x}_{ij}-{\mathrm{MV}}_i\right)}^T\ \mathrm{for}\ i=1,2. $$
(12)
$$ \mathrm{CCV}=\sum \limits_{j=1}^2\left({N}_j-1\right)\ \frac{{\mathrm{CV}}_j}{n-2}\kern0.75em \mathrm{where}\ n=\sum \limits_{j=1}^2{N}_j $$
(13)

The ROI is defective if the value of the discriminant function Ω is non-negative; otherwise, the ROI is non-defective, as in Eq. (14). An image is considered defective if it contains at least one defective ROI; otherwise, the image is non-defective [12].

$$ \varOmega \left(\mathrm{block}\right)=\begin{cases}\mathrm{defective\ block}, & \varOmega \ge 0\\ \mathrm{non\text{-}defective\ block}, & \varOmega <0\end{cases} $$
(14)

Applying the discriminant rule, Ω, to all ROIs in the image gives results like those in Fig. 5, where the numbers are the values of the discriminant rule for each ROI [12]. The image has no defect if all ROIs have a negative discriminant value.
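The decision rule can be illustrated with a short Python sketch. It uses the standard two-group linear discriminant, Ω = (MV1 − MV2)^T CCV^{-1} (MVROI − (MV1 + MV2)/2), for which Ω ≥ 0 holds exactly when the ROI is at least as close to G1 as to G2 in Mahalanobis distance under the common covariance matrix. The small Gaussian-elimination solver, the function names, and the numeric values are assumptions for illustration only.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def mahalanobis2(v, mv, ccv):
    """Squared Mahalanobis distance between feature vector v and mean mv."""
    d = [a - b for a, b in zip(v, mv)]
    return sum(di * wi for di, wi in zip(d, solve(ccv, d)))


def discriminant(mv_roi, mv1, mv2, ccv):
    """Omega for one ROI; the ROI is flagged defective when Omega >= 0."""
    w = solve(ccv, [a - b for a, b in zip(mv1, mv2)])  # CCV^-1 (MV1 - MV2)
    mid = [(a + b) / 2 for a, b in zip(mv1, mv2)]
    return sum(wi * (x - c) for wi, x, c in zip(w, mv_roi, mid))


# Hypothetical feature vectors [delta, mu, upsilon] and covariance matrix.
ccv = [[2.0, 1.0, 0.0], [1.0, 3.0, 0.0], [0.0, 0.0, 4.0]]
mv1, mv2 = [10.0, 5.0, 8.0], [2.0, 1.0, 0.0]
roi = [9.0, 4.0, 7.0]
omega = discriminant(roi, mv1, mv2, ccv)
is_defective = omega >= 0
```

Expanding the two squared distances shows that Δ²(MVROI, MV2) − Δ²(MVROI, MV1) = 2Ω, so a single dot product per ROI replaces two distance evaluations once CCV^{-1}(MV1 − MV2) has been computed.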

Fig. 5 Defective image and its discriminant result [12]

To speed up the EM algorithm, this chapter computes the E-step and M-step of each iteration for all pixels in parallel using CUDA [30], as shown in Fig. 6. The parallel EM algorithm has a main function, PEM(), that launches the GPU kernel UpdateKernel() to process the E-step and M-step for all pixels in parallel, as seen in Fig. 7. UpdateKernel() runs on a 1D grid of 1D thread blocks, each containing MT threads [25]. Each thread calculates both the E-step and the M-step for one pixel. For an image with NP pixels, the complexity of the sequential EM is O(Maximum of Iteration × NP), whereas the proposed parallel EM, with one thread per pixel, is O(Maximum of Iteration).

Fig. 6 Parallel EM algorithm

Fig. 7 UpdateKernel() invoked by EM algorithm in CUDA

3.3 Defect Classification

In the past decade, researchers have presented several methods for steel defect classification [31]. Nevertheless, these methods suffer from high computational cost and low accuracy. This work proposes a classification algorithm to classify scratch, patches, and scale defects. The algorithm has two modules. First, the features extraction module takes the defective image and calculates the GLCM and Haralick features [32, 33]. Second, once the features are extracted, the classification module uses a support vector machine (SVM) to recognize the corresponding defect class.

  1. Features Extraction Module: The GLCM describes the texture of an image by counting how frequently pairs of pixels with specific values and a specified spatial relationship occur in the image. Each element (i, j) of the GLCM denotes how many times the gray levels i and j occur as a pair of pixels located at a defined distance δ along a chosen direction θ. Haralick defined a set of 14 textural feature measures [33]. This work selected the six textural features shown in Table 1 as input to the SVM classifier that classifies the defect in the steel image. The computation time of the texture features depends on the number of gray levels, here 256 levels from 0 to 255. This chapter computes the Haralick features in parallel to reduce the execution time of the proposed classification algorithm [34]. To extract the features from the GLCM in parallel, this work developed the P_Haralick_Featuers() function, which launches the HaralickKernel() kernel on 2D thread blocks with n_x threads in the x-direction and n_y threads in the y-direction [25]. Each thread computes the six features for one pixel. To accumulate the feature values from all threads, the kernel uses CUDA's atomicAdd() function, as shown in Fig. 8: in one atomic operation, a thread reads a memory address, adds its feature value to it, and writes the result back to memory. Because the GLCM is a 256 × 256 matrix (256 gray levels), the complexity of extracting the Haralick features with the sequential algorithm is O(256 × 256); with the parallel algorithm it is O(1).

Table 1 Haralick features
Fig. 8 HaralickKernel() to extract Haralick features
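The GLCM construction and the accumulation of texture features can be sketched sequentially in Python. The six features the chapter actually uses are those of Table 1, which is not reproduced in this text, so the four measures below (energy, contrast, homogeneity, entropy) are only an illustrative subset of common Haralick-style features; the function names, the default offset (one pixel to the right), and the small 4-level test image are likewise assumptions.

```python
import math


def glcm(img, dx=1, dy=0, levels=256):
    """Normalized gray level co-occurrence matrix for pixel pairs
    separated by the offset (dx, dy)."""
    h, w = len(img), len(img[0])
    P = [[0.0] * levels for _ in range(levels)]
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                P[img[y][x]][img[y2][x2]] += 1
                total += 1
    if total:
        for row in P:
            for j in range(levels):
                row[j] /= total
    return P


def haralick_subset(P):
    """Accumulate an illustrative subset of texture features over the GLCM.
    Each (i, j) term is independent, so a GPU version can give one term to
    each thread and combine the partial values with atomic additions."""
    feats = {"energy": 0.0, "contrast": 0.0, "homogeneity": 0.0, "entropy": 0.0}
    for i, row in enumerate(P):
        for j, p in enumerate(row):
            if p > 0:
                feats["energy"] += p * p
                feats["contrast"] += (i - j) ** 2 * p
                feats["homogeneity"] += p / (1 + (i - j) ** 2)
                feats["entropy"] -= p * math.log(p)
    return feats


# Hypothetical 4 x 4 image with 4 gray levels.
img = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 2, 2, 2], [2, 2, 3, 3]]
P = glcm(img, levels=4)
features = haralick_subset(P)
```

The resulting feature values would be concatenated into the vector passed to the classifier.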

  2. Defect Classification Module: To classify steel strip surface defects, this chapter uses a multi-class SVM. The classification process is divided into two steps: training and testing. Both steps use feature vectors: the training step uses them to learn the labels of different defects, and the testing step uses them to classify defects [2, 18]. This work extracts the features in parallel to reduce the classification time. In the training phase, we pass these features along with their corresponding labels to the SVM to generate the SVM model. In the testing phase, which is the second and final step of the proposed system, the images of a test dataset of steel strips are first checked for defects; if an image has defective ROIs, its Haralick features are extracted. These features are then given to the SVM model trained in the first step, which predicts the defect class. Figure 9 shows the classification steps.

Fig. 9 Defect classification steps

3.4 Evaluation Criteria of Defect Detection and Classification

This section introduces the performance criteria to check the effectiveness and accuracies of the defect detection and classification algorithms.

  1. Detection Accuracy: The defect detection accuracy in Eq. (15) measures the accuracy and effectiveness of the defect detection algorithms [35, 36]:

$$ \mathrm{DA}=\frac{\mathrm{TP}+\mathrm{TN}}{\mathrm{TP}+\mathrm{TN}+\mathrm{FP}+\mathrm{FN}} $$
(15)

where TN is the number of true negatives, TP the number of true positives, FN the number of false negatives, and FP the number of false positives. A true positive is a defective steel image identified as defective. A true negative is a defect-free steel image identified as defect-free. A false positive is a defect-free steel image identified as defective. A false negative is a defective steel image identified as defect-free.

Classification Accuracy: The accuracy of the classification algorithm is calculated as in Eq. (16):

$$ \mathrm{Accuracy}\left({d}_j\right)=\frac{N_c\left({d}_j\right)}{N_t\left({d}_j\right)} $$
(16)

where d_j is defect class j, j = 1, ..., W; N_c(d_j) is the number of images correctly classified as defect class d_j; N_t(d_j) is the total number of images in that defect class; and W is the total number of defect classes. The total accuracy of the defect classification algorithm is the probability of a correct prediction by our classifier, averaged over all defect classes for a set of images:

$$ \mathrm{Total}\ \mathrm{accuracy}=\frac{\sum_{j=1}^W\mathrm{Accuracy}\left({d}_j\right)}{W} $$
(17)
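Equations (15) to (17) translate directly into code. The Python sketch below is a minimal illustration; the function names and the label lists are hypothetical and are not results from the chapter.

```python
def detection_accuracy(tp, tn, fp, fn):
    """Eq. (15): fraction of images whose defective/defect-free status is correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def class_accuracy(y_true, y_pred, label):
    """Eq. (16): correctly classified images of one defect class over all
    images that truly belong to that class."""
    total = sum(1 for t in y_true if t == label)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    return correct / total


def total_accuracy(y_true, y_pred):
    """Eq. (17): unweighted mean of the per-class accuracies."""
    labels = sorted(set(y_true))
    return sum(class_accuracy(y_true, y_pred, d) for d in labels) / len(labels)


# Hypothetical ground truth and predictions for ten test images.
y_true = ["scratch"] * 4 + ["patches"] * 4 + ["scale"] * 2
y_pred = (["scratch"] * 3 + ["patches"]
          + ["patches"] * 4
          + ["scale", "scratch"])
```

Because Eq. (17) averages per-class accuracies, each defect class carries equal weight regardless of how many images it contains, unlike the overall fraction of correctly classified images.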
  2. Performance Criteria: Computing time is the main criterion for studying the performance of the proposed defect detection and classification algorithm. The time required to detect and classify defects on a steel surface is divided into two main parts, detection time and classification time, as shown in Eq. (18):

$$ {\mathrm{Total}}_{\mathrm{Time}}=\sum \limits_{i=1}^B\left({Dt}_i+{Ct}_i\right) $$
(18)

where the steel surface image is divided into B ROIs, Dt_i is the time required to decide whether ROI i has a defect, and Ct_i is the time required to classify the type of defect in defective ROI i; Ct_i equals zero if ROI i has no defects. In addition, this work uses speedup to measure the performance of the proposed algorithm, as in Eq. (19):

$$ \mathrm{Speedup}=\frac{T_s}{T_p} $$
(19)

where T_s denotes the execution time of the sequential algorithm and T_p denotes the execution time of the parallel algorithm.
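As a brief illustration of Eqs. (18) and (19), the following Python sketch sums per-ROI detection and classification times and computes a speedup. The timing values are made-up placeholders, not measurements from the experiments, and the function names are hypothetical.

```python
def total_time(detect_ms, classify_ms):
    """Eq. (18): sum of detection and classification time over all B ROIs.
    classify_ms[i] is 0 for ROIs without defects."""
    return sum(d + c for d, c in zip(detect_ms, classify_ms))


def speedup(t_seq, t_par):
    """Eq. (19): sequential execution time divided by parallel execution time."""
    return t_seq / t_par


# Placeholder timings in milliseconds for an image split into three ROIs.
detect_ms = [1.0, 1.0, 1.0]
classify_ms = [0.0, 2.5, 0.0]  # only the second ROI is defective
elapsed = total_time(detect_ms, classify_ms)
gain = speedup(330.0, 200.0)
```

A speedup above 1 means the parallel version is faster; a value of 2 would mean it takes half the sequential time.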

4 Experiment Results

This section presents the experimental results of the proposed parallel algorithms for detecting scratch, patches, and scale defects.

  1. Setup: The experimental platform is an Intel(R) Core™ i7-8550U CPU with a clock rate of 1.8 GHz and 8 GB of DDR4 RAM, together with an NVIDIA GeForce 940MX graphics card with 2 GB of DDR3 memory running at 1122 MHz. All experiments were conducted on the Windows 10 64-bit operating system with Visual C++ 2017, OpenCV 3.4.3, and CUDA toolkit 10 as the development environment. The NEU dataset, which contains 1800 grayscale steel images, was used. It includes six types of defects (inclusion, crazing, patches, pitted surface, rolled-in scale, and scratches), with 300 samples for each type. Moreover, to study the tolerance of the proposed algorithm to noise, salt-and-pepper noise was added to about 1%–2% of the steel images in the dataset. The dataset is divided into a training set (70%) and a testing set (30%).

  2. Experiment: The experiments were conducted in three stages: preprocessing, defect detection, and defect classification. In the first stage, the steel images are resized to 400 × 400, and a 3 × 3 median filter is then used to remove noise. The second stage, defect detection, includes four steps. The first step creates two Gaussian mixture models for each image by maximum likelihood to divide the image pixels into two groups: a defective group and a non-defective group. Figure 10 shows the two GMMs for a steel image with a scratch defect. The second step calculates the statistical features (mean, difference, and variance) for these groups. In the third step, each image is divided into ROIs of 40 × 40 pixels, the summed area table is used to extract the statistical features of each ROI, and the discriminant rule decides whether each ROI is defective or non-defective. The fourth step displays the defective ROIs, if any, as shown in Fig. 11. The defect classification stage is divided into two phases. In the training phase, the SVM classifier takes the vectors of the six extracted Haralick features with their associated labels for all images in the training set and generates a training model. In the testing phase, the trained SVM takes the Haralick feature vector of a test image from the testing set and predicts its defect class.

Fig. 10 Steel image and GMM models

Fig. 11 Defect detection result

  3. Results: This section presents the results of the proposed algorithms step by step. In this chapter we develop three defect detection algorithms: (1) sequential without SAT (SEQ) [12], (2) the proposed sequential algorithm with SAT (SSAT), and (3) the proposed parallel algorithm with SAT (PSAT), developed in CUDA. This section reports the median execution time of the three implementations when detecting and classifying three defects. Table 2 shows the median defect detection and classification time in milliseconds (ms) for the SEQ, SSAT, and PSAT algorithms. They detect three defects (patches, scratch, and scale) with an image size of 400 × 400 pixels and a ROI size of 40 × 40 pixels. Table 2 contains the steel images with their defective ROIs, the defect type, the median execution time of the defect detection and classification algorithms, and the speedup. The rightmost column of the table gives the speedup of PSAT over SEQ.

Table 2 Execution times for three algorithms

Figure 12 shows the median execution time of the sequential and parallel algorithms when detecting and classifying the three defects. It shows that the PSAT algorithm is the fastest, especially in detecting scratch defects, and achieves a speedup of about 1.50×. Figure 13 plots the median defect detection and classification time for different dimensions of steel surface images with a block (ROI) size of 40 × 40 pixels. It shows that the median execution time increases linearly with image size. The proposed PSAT algorithm outperforms the other algorithms as the image size increases.

Fig. 12 Median defect detection and classification time

Fig. 13 Median defect detection and classification time with image size

The proposed algorithm divides the image into non-overlapping ROIs (partitions). The number of ROIs is specified based on defect location. A defect may be split across two ROIs, so the smaller part of the defect in one ROI may not be classified as a defect type. This case will undoubtedly affect the accuracy of the proposed algorithm, so the number of ROIs must be chosen carefully to reduce defect splitting. Figure 14 shows that the PSAT algorithm has the shortest execution time in milliseconds for all block sizes, while SEQ has a significantly longer execution time. SEQ divides the image into ROIs of size W × H and handles each ROI separately, while PSAT launches 2D thread blocks of W × H threads whose threads execute the kernel in parallel to decide whether each ROI is defective. As a result, PSAT shows a 1.4× speedup over SSAT and a speedup of more than 1.6× over SEQ. The accuracy of the proposed defect detection algorithms SSAT and PSAT is about 95.66%.

Fig. 14 Median defect detection and classification time with ROI size

5 Conclusion

The major aim of this chapter is to design and develop a parallel algorithm that automates the defect inspection process in the steel industry. This work employed the SAT to improve the defect detection algorithm in [12]. In addition, it presented the detailed implementation of the proposed SAT-based sequential algorithm and of the parallel algorithm. Once a defective image is detected, an SVM classifier is used to classify the type of defect (scratch, patches, or scale). The experimental results in this chapter verified that the developed techniques speed up steel surface defect detection and classification compared with the existing techniques. Finally, the proposed parallel algorithm achieves speedups over the sequential algorithm developed in [12] of about 1.65 times for scratch defects, about 1.5 times for patch defects, and 1.39 times for scale defects, for an image size of 400 × 400 and with an accuracy of about 95.66%.