Cascading Convolutional Neural Network for Steel Surface Defect Detection

Lin, Chih-Yang; Chen, Cheng-Hsun; Yang, Ching-Yuan; Akhyar, Fityanul; Hsu, Chao-Yung; Ng, Hui-Fuang

doi:10.1007/978-3-030-20454-9_20

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 965)

Included in the following conference series:

International Conference on Applied Human Factors and Ergonomics

Abstract

Steel is the most important material in the world of engineering and construction. Modern steelmaking relies on computer vision technologies, such as optical cameras, to monitor the production and manufacturing processes, which helps companies improve product quality. In this paper, we propose a deep learning method to automatically detect defects on the steel surface. The architecture of the proposed system is separated into two parts. The first part uses a revised version of the single shot multibox detector (SSD) model to locate possible defects. Then, a deep residual network (ResNet) classifies them into three types: Rust, Scar, and Sponge. The combination of these two models is investigated and discussed thoroughly in this paper. This work additionally employs a real industry dataset to confirm the feasibility of the proposed method and to make sure it is applicable to real-world scenarios. The experimental results show that the proposed method can achieve higher precision and recall scores in steel surface defect detection.

1 Introduction

The billet is an upstream product of the rod and wire from Sinosteel. The process of producing billet from casting to production involves cooling, sand-blasting, rusting, inspection, grinding and heating, and finally rolling into strips. Billets are approximately 145 mm × 145 mm in size, and can be supplied to strip and wire factories for rolling into strip steel, wire rods and linear steel. However, inspection is necessary to ensure the quality of the product before it is sent out.

The surface temperature of billets reaches as high as 700 to 900 °C [1] in the production environment. These conditions make defect detection on billets difficult to achieve. Traditional billet defect detection methods are divided into visual inspection [2, 3] and magnetic particle inspection [4]. Because visual inspection is more cost- and time-efficient, we focus only on visual inspection in this paper. Since different defects have different causes, the types of defects found can indicate how the defects formed and can be used to improve the steelmaking process.

In this paper, we develop a billet defect detection technology based on convolutional neural networks. We propose a hierarchical structure to detect defects with a revised SSD and ResNet50 [5, 6]. The experimental results show the effectiveness of the proposed method.

2 Architecture Overview

2.1 Structure of SSD

With the rise of convolutional neural networks, many models have evolved, such as Faster RCNN [7], Mask RCNN [8], Single Shot Multibox Detector (SSD) [9], and You Only Look Once (YOLO) [10]. All of these models have object detection capabilities. Among them, we chose the SSD300 version as the basic model. The reasons for selecting SSD300 are as follows:

Faster RCNN and Mask RCNN are two-stage methods, which first generate region proposals and then classify and refine them. In contrast, SSD and YOLO are one-stage methods that predict boxes and classes in a single pass, which is more efficient.
SSD300 detects faster than the other models. According to the original SSD paper [9], the detection speed of SSD300 is 59 FPS (frames per second).
The architecture of SSD300 is simpler than other models’ architectures and easier to adjust.
SSD has multi-scale predictions.

The original SSD300 uses anchor boxes that combine horizontal and vertical rectangles, as shown in Fig. 1(a). In this work, we use only three horizontal rectangles, as shown in Fig. 1(b). When many prediction boxes overlap on one object, as shown in Fig. 2(a), non-maximum suppression (NMS) in SSD keeps the highest-scoring box and discards the redundant ones, as shown in Fig. 2(b).
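To make the NMS step concrete, the following is a minimal sketch of greedy non-maximum suppression in plain Python. It is not the authors' implementation: the box format `(xmin, ymin, xmax, ymax)`, the function names, and the overlap threshold of 0.45 are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.45):
    """Greedy NMS: return the indices of the kept boxes, highest score first.

    iou_threshold is an illustrative value, not one reported in this paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping horizontal boxes and one separate box (made-up values).
boxes = [(10, 10, 110, 40), (12, 12, 112, 42), (200, 50, 300, 80)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

Here the second box overlaps the first with an IoU of about 0.84, so only the first and third boxes survive.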

In Figs. 3–5, we present our revisions to the SSD architecture, which are based on the characteristics of the collected defect images. Billet defects are mainly small. Therefore, in Fig. 4, a 75 × 75 feature map at convolutional block 3 of VGG16 is added, and the last two feature maps (3 × 3 and 1 × 1) are removed. To compare the advantages and disadvantages of the SSD structures, both the original SSD module (SSD300) and the modified SSD module (revised-SSD300) are trained. In addition, the revised-SSD300 is extended to a revised-SSD600 with an input size of 600 × 600, as shown in Fig. 5. In total, three models are trained for comparison.
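The effect of swapping feature maps can be illustrated by counting default (anchor) boxes. The sketch below assumes the standard SSD300 layout from the original SSD paper [9] (feature maps of 38, 19, 10, 5, 3, and 1 with 4, 6, 6, 6, 4, and 4 boxes per location). For the revised-SSD300 it assumes three boxes per location on every map, which is our reading of the three horizontal rectangles rather than a figure stated in this paper.

```python
def count_default_boxes(feature_map_sizes, boxes_per_location):
    """Total number of default boxes over square feature maps."""
    return sum(s * s * k for s, k in zip(feature_map_sizes, boxes_per_location))


# Original SSD300 layout as described in the SSD paper [9].
original_maps = [38, 19, 10, 5, 3, 1]
original_boxes = [4, 6, 6, 6, 4, 4]

# Revised layout: 75 x 75 added, 3 x 3 and 1 x 1 removed.
# Three boxes per location on every map is an assumption for illustration.
revised_maps = [75, 38, 19, 10, 5]
revised_boxes = [3] * len(revised_maps)

n_original = count_default_boxes(original_maps, original_boxes)  # 8732
n_revised = count_default_boxes(revised_maps, revised_boxes)     # 22665
```

Under these assumptions, most of the revised model's boxes (16,875 of 22,665) come from the new 75 × 75 map, which is consistent with the focus on small defects.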

2.2 Introduction of SENet and ResNet

In our main task, we need to detect two defects, called “sponge” and “scar” defects. The task of SSD is to determine whether the defects exist and where they are. We also add a Squeeze-and-Excitation Network (SENet) [11] structure to our model to boost the results by learning adaptive weights for each feature map. SENet is not a complete network structure, but rather a small module inserted between convolution blocks. When SENet is applied, our method is called Revised-SSDSE, which is shown in Fig. 5.
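The idea of adaptive per-channel weights can be sketched as follows. This is a plain-Python illustration of a squeeze-and-excitation block in the spirit of [11], not the layer used in Revised-SSDSE: the weight matrices `w1` and `w2` are hypothetical, biases are omitted, and the feature maps are nested lists instead of tensors.

```python
import math


def se_block(feature_maps, w1, w2):
    """Squeeze-and-excitation on C channels, each an H x W list of lists.

    w1 has shape (C // r) x C and w2 has shape C x (C // r); both are
    hypothetical weights that would normally be learned during training.
    """
    # Squeeze: global average pooling gives one number per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2
    ]
    # Scale: multiply every value of a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(feature_maps, gates)]
    return scaled, gates


# Two 2 x 2 channels with made-up values and weights.
maps = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = se_block(maps, w1=[[0.5, 0.5]], w2=[[1.0], [-1.0]])
```

In this toy case the first channel is amplified relative to the second, because its gate is sigmoid(2) while the second channel's gate is sigmoid(-2).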

Sometimes, a non-defect factor, called the “rusty factor” and shown in Fig. 9(c), is present in the dataset. Rusty factors are not defects, but they have various shapes and features and significantly affect our results. To detect rusty factors in the dataset, the 3 × 3 and 1 × 1 layers must be added back to the revised SSD network.

After SSD determines the existence and location of defects, ResNet identifies the defect type. In this paper, we use ResNet50 [12] to classify three categories, as shown in Fig. 9. Each defect region from SSD is resized to 224 × 224 to fit the input size of ResNet50. The combination of revised-SSDSE600 and ResNet50 forms the complete hierarchical structure, as shown in Fig. 6.
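The cascade described above can be sketched as a small pipeline. In this sketch, `detect` and `classify` are hypothetical stand-ins for the trained SSD and ResNet50 models, the image is a nested list of pixel values, and nearest-neighbour resizing is an assumption, since the paper does not state which interpolation is used. Boxes are assumed to be non-empty and given in integer pixel coordinates.

```python
def crop(image, box):
    """Cut the region (xmin, ymin, xmax, ymax) out of an H x W image."""
    xmin, ymin, xmax, ymax = box
    return [row[xmin:xmax] for row in image[ymin:ymax]]


def resize_nearest(image, out_h=224, out_w=224):
    """Nearest-neighbour resize to the 224 x 224 classifier input size."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def run_cascade(image, detect, classify):
    """Stage 1 proposes defect boxes; stage 2 labels each resized crop."""
    results = []
    for box in detect(image):
        patch = resize_nearest(crop(image, box))
        results.append((box, classify(patch)))
    return results


# Stub models for illustration only; real models would replace these.
def fake_detect(image):
    return [(10, 20, 60, 40)]


def fake_classify(patch):
    return "scar" if len(patch) == 224 and len(patch[0]) == 224 else "unknown"


image = [[0] * 200 for _ in range(100)]  # 100 rows x 200 columns
results = run_cascade(image, fake_detect, fake_classify)
```

Keeping the two stages behind simple callables makes it easy to swap the detector (for example SSD300 for revised-SSDSE600) without touching the classification step.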

3 System Requirements

3.1 Hardware and Software

The hardware and software environment used in this paper is given in Table 1. The software part of the system includes Anaconda and GPU environment settings.

Table 1. Hardware and software environment.

3.2 Data Annotation

We use the LabelImg v1.6.0 tool (see Note 1) to mark defect locations and non-defect classes in the dataset for the SSD model. LabelImg supports several operating systems, such as Windows, Linux and Mac OS X. In this work, we use a Windows environment. After the labeling process is completed, the labels are saved in XML format.
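As an illustration of how such label files can be read back, the sketch below parses an annotation with Python's standard `xml.etree.ElementTree` module. It assumes the Pascal VOC style layout that LabelImg writes by default (`object`, `name`, and `bndbox` elements); the sample file name, class names, and coordinates are made up.

```python
import xml.etree.ElementTree as ET


def parse_annotation(xml_text):
    """Return a list of (class_name, (xmin, ymin, xmax, ymax)) tuples."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        box = tuple(
            int(float(bndbox.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), box))
    return objects


# Hypothetical annotation in the assumed Pascal VOC style layout.
SAMPLE = """<annotation>
  <filename>billet_0001.jpg</filename>
  <size><width>600</width><height>600</height><depth>3</depth></size>
  <object>
    <name>scar</name>
    <bndbox><xmin>120</xmin><ymin>80</ymin><xmax>260</xmax><ymax>110</ymax></bndbox>
  </object>
  <object>
    <name>sponge</name>
    <bndbox><xmin>300</xmin><ymin>400</ymin><xmax>380</xmax><ymax>430</ymax></bndbox>
  </object>
</annotation>"""

labels = parse_annotation(SAMPLE)
```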

4 Experimental Results

The detection results are affected by camera types, illumination, number of defective samples, and other factors. The training process is performed as follows.

Collect various defect samples.
Mark the defect samples and generate corresponding XML files containing defect information.
Train marked defect samples through the neural network structure and save the training results.

4.1 Initial Test

In the initial test, we prepared a dataset of 464 Scar and 246 Sponge images, 90% of which were used for training and 10% for validation. The experimental results are shown in Table 2. The results in Fig. 7 show that the performance of the revised-SSD300 is similar to that of the revised-SSD600, but better than that of SSD300. SSD300 produces too many redundant boxes, as presented in Fig. 7(b).
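The 90/10 split of the 710 images can be reproduced along these lines. This is a sketch only: the paper does not say whether the split was random or stratified by class, so the simple seeded shuffle and the `(label, index)` sample representation below are assumptions.

```python
import random


def split_dataset(samples, val_fraction=0.1, seed=0):
    """Shuffle the samples reproducibly and return (train, val) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Placeholder samples standing in for the 464 Scar and 246 Sponge images.
samples = [("scar", i) for i in range(464)] + [("sponge", i) for i in range(246)]
train, val = split_dataset(samples)
# 710 images in total: 639 for training and 71 for validation.
```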

Table 2. Initial results of models.

We tested the daily images provided by the onsite database and used the following quantities as the benchmark for calculating system performance [13]:

True Positive (TP).
True Negative (TN).
False Positive (FP).
False Negative (FN).
Precision (P) in Eq. (1).
Recall (R) in Eq. (2).
F-Measure, a comprehensive evaluation index that combines Precision and Recall into a single score (their harmonic mean), in Eq. (3).

$$ \text{P} = \frac{\text{TP}}{\text{TP} + \text{FP}} \tag{1} $$

$$ \text{R} = \frac{\text{TP}}{\text{TP} + \text{FN}} \tag{2} $$

$$ \text{F-Measure} = \frac{2 \times \text{P} \times \text{R}}{\text{P} + \text{R}} \tag{3} $$
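Eqs. (1)–(3) translate directly into code. The sketch below is illustrative: the function names and the counts in the first example are hypothetical, and returning 0.0 for an empty denominator is a convention chosen here, not something specified in the paper.

```python
def precision(tp, fp):
    """Eq. (1): fraction of reported defects that are real defects."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp, fn):
    """Eq. (2): fraction of real defects that are reported."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f_measure(p, r):
    """Eq. (3): harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Hypothetical counts: 8 correct detections, 2 false alarms, 2 missed defects.
p = precision(tp=8, fp=2)   # 0.8
r = recall(tp=8, fn=2)      # 0.8
f = f_measure(p, r)         # about 0.8

# Pure arithmetic illustration with P = 100% and R = 77.6%.
f_example = f_measure(1.0, 0.776)  # about 0.874
```

The second example shows how the F-Measure penalizes an imbalance: even with perfect precision, a recall of 77.6% pulls the combined score down to roughly 0.874.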

4.2 Final Test

According to Tables 3 and 4, after a seven-day training period, the highest precision and recall of the revised-SSD300 were 100% and 77.6%, respectively. The revised-SSD600 had a better recall due to its high-resolution images. However, the combination of the revised-SSDSE600 and ResNet50 achieved the highest precision and recall rates.

Table 3. Final test results of revised-SSD300.

Table 4. Final test results of revised-SSD600.

Note that Fig. 10 shows that if we fed the defect bounding boxes from SSD directly to ResNet, as shown in Fig. 8, training was hard to converge because the bounding boxes were fitted too tightly to the defects. Therefore, we enlarged the bounding boxes, as shown in Fig. 9. After the bounding boxes were extended, training converged, which made the performance in Table 5 achievable.
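The enlargement step can be sketched as follows. The paper does not report how much the boxes were extended, so the `margin` of 20% per side is a hypothetical value, as are the function name and the `(xmin, ymin, xmax, ymax)` box format; the result is clipped to the image borders.

```python
def enlarge_box(box, image_w, image_h, margin=0.2):
    """Grow a box by `margin` of its width and height on each side.

    margin=0.2 is an illustrative value, not one taken from the paper.
    """
    xmin, ymin, xmax, ymax = box
    dx = (xmax - xmin) * margin
    dy = (ymax - ymin) * margin
    return (
        max(0, round(xmin - dx)),
        max(0, round(ymin - dy)),
        min(image_w, round(xmax + dx)),
        min(image_h, round(ymax + dy)),
    )


# A 100 x 50 box inside a 600 x 600 image grows to 140 x 70.
enlarged = enlarge_box((100, 50, 200, 100), image_w=600, image_h=600)
# A box touching the image corner is clipped instead of leaving the image.
clipped = enlarge_box((580, 580, 600, 600), image_w=600, image_h=600)
```

Including some surrounding background in each crop gives the classifier context around the defect, which matches the observation that tightly fitted boxes made training hard to converge.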

Table 5. Final test results of the combination of revised-SSDSE600 and ResNet50.

5 Conclusions

In this paper, we design a hierarchical model to build a defect detection system for steel billets. We modified the SSD architecture by changing the sizes of the feature maps and anchor boxes to fit the shape of the defects. The experimental results demonstrate the effectiveness of the proposed method. In future work, we will collect more rust defect images, because rust appears in many variations.

Notes

1. https://github.com/tzutalin/labelImg

References

1. Peacock, G.R.: Review of noncontact process temperature measurements in steel manufacturing. In: SPIE Conference on Thermosense XXI, pp. 171–189. SPIE, Florida (1999)
2. Yun, J.P., Choi, S., Kim, J.-W., Kim, S.W.: Automatic detection of cracks in raw steel block using Gabor filter optimized by univariate dynamic encoding algorithm for searches (uDEAS). NDT E Int. 42, 389–397 (2009)
3. Duan, X., Duan, F., Han, F.: Study on surface defect vision detection system for steel plate based on virtual instrument technology. In: 2011 International Conference on Control, Automation and Systems Engineering (CASE), pp. 1–4. IEEE, Singapore (2011)
4. Balchin, N.C., Blunt, J.: Health and Safety in Welding and Allied Processes, 5th edn. Woodhead Publishing, Cambridge (2002)
5. Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs/1409.1556 (2014)
6. Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25 (NIPS 2012), pp. 1097–1105. Curran Associates, Inc., Nevada (2012)
7. Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28 (NIPS 2015), pp. 91–99. Curran Associates, Inc., Canada (2015)
8. He, K., Gkioxari, G., Dollár, P., Girshick, R.B.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988. IEEE, Venice (2017)
9. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016. Lecture Notes in Computer Science, vol. 9905. Springer, Cham (2016)
10. Redmon, J., Kumar Divvala, B., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. pp. 779–788. IEEE (2016)
11. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7132–7141. IEEE, Salt Lake City (2018)
12. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE (2016)
13. Powers, D.M.W.: Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. 2017 (2007)

Acknowledgement

This work was supported by the Ministry of Science and Technology, Taiwan, under Grants MOST 106-2218-E-468-001, MOST 107-2221-E-155-048-MY3, and MOST 108-2634-F-008-001, and under Grants from China Steel Corporation RE106728 and RE107705.

Author information

Authors and Affiliations

Department of Electrical Engineering, Yuan-Ze University, Taoyuan, Taiwan
Chih-Yang Lin, Cheng-Hsun Chen, Ching-Yuan Yang & Fityanul Akhyar
Automation and Instrumentation System Development Section, China Steel Corporation, Kaohsiung, Taiwan
Chao-Yung Hsu
Department of Computer Science, FICT, Universiti Tunku Abdul Rahman, Petaling Jaya, Malaysia
Hui-Fuang Ng

Corresponding author

Correspondence to Chih-Yang Lin.

Editor information

Editors and Affiliations

Institute for Advanced Systems Engineering, University of Central Florida, Orlando, FL, USA
Tareq Ahram

Cite this paper

Lin, CY., Chen, CH., Yang, CY., Akhyar, F., Hsu, CY., Ng, HF. (2020). Cascading Convolutional Neural Network for Steel Surface Defect Detection. In: Ahram, T. (eds) Advances in Artificial Intelligence, Software and Systems Engineering. AHFE 2019. Advances in Intelligent Systems and Computing, vol 965. Springer, Cham. https://doi.org/10.1007/978-3-030-20454-9_20

DOI: https://doi.org/10.1007/978-3-030-20454-9_20
Published: 11 June 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20453-2
Online ISBN: 978-3-030-20454-9
eBook Packages: Engineering, Engineering (R0)
