Abstract
Steel is the most important material in the world of engineering and construction. Modern steelmaking relies on computer vision technologies, like optical cameras to monitor the production and manufacturing processes, which helps companies improve product quality. In this paper, we propose a deep learning method to automatically detect defects on the steel surface. The architecture of our proposed system is separated into two parts. The first part uses a revised version of single shot multibox detector (SSD) model to learn possible defects. Then, deep residual network (ResNet) is used to classify three types of defects: Rust, Scar, and Sponge. The combination of these two models is investigated and discussed thoroughly in this paper. This work additionally employs a real industry dataset to confirm the feasibility of the proposed method and make sure it is applicable to real-world scenarios. The experimental results show that the proposed method can achieve higher precision and recall scores in steel surface defect detection.
Access provided by Autonomous University of Puebla. Download conference paper PDF
Similar content being viewed by others
Keywords
1 Introduction
The billet is an upstream product of the rod and wire from Sinosteel. The process of producing billet from casting to production involves cooling, sand-blasting, rusting, inspection, grinding and heating, and finally rolling into strips. Billets are approximately 145 mm × 145 mm in size, and can be supplied to strip and wire factories for rolling into strip steel, wire rods and linear steel. However, inspection is necessary to ensure the quality of the product before it is sent out.
The surface temperature of billets reaches as high as 700 to 900° [1] in the production environment. These conditions make defect detection on billets difficult to achieve. Traditional billet defect detection methods are divided into visual inspection [2, 3] and magnetic particle inspection [4]. However, visual inspection is more cost and time efficient; therefore, we will only focus on visual inspection in this paper. The types of defects found can indicate the cause of defect formation and be used to improve the steelmaking process, since different defects have different causes.
In this paper, we develop a billet defect detection technology based on convolutional neural network. We propose a hierarchical structure to defect defects with revised SSD and ResNet50 [5, 6]. The experimental results show the effectiveness of the proposed method.
2 Architecture Overview
2.1 Structure of SSD
With the rise of convolutional neural networks, many models have evolved, such as Faster RCNN [7], Mask RCNN [8], Single Shot Multibox Detector (SSD) [9], and You Only Look Once (YOLO) [10]. All of these models have object detection capabilities. Among them, we chose the SSD300 version as the basic model. The reasons that we selected SSD300 are as follow:
-
Faster RCNN and Mask RCNN are two-stage methods, which means that the training process is performed in two steps. In contrast, SSD and YOLO are one-stage methods, which are more efficient.
-
The detection speed is better than that of other models. According to the author’s paper, the detection speed of SSD300 is 59 FPS (frames per second).
-
The architecture of SSD300 is simpler than other models’ architectures and easier to adjust.
-
SSD has multi-scale predictions.
The original SSD300 contains anchor boxes that are a combination of horizontal and vertical rectangles as shown in Fig. 1(a). In this work, we only use three horizontal rectangles as shown in Fig. 1(b). When there are many prediction boxes on an object, as shown in Fig. 2(a), non-maximum suppression (NMS) in SSD can solve this problem as shown in Fig. 2(b).
In Figs. 3–5, we present our revisions to the SSD architecture based on characteristics of collected defect images. Billet defects are mainly small ones. In Fig. 4, a 75 × 75 feature map is added to convolutional block 3 of the VGG16 layer, and the last two feature maps (3 × 3 and 1 × 1) are removed. In order to compare advantages and disadvantages of various SSD structures, the original SSD module (SSD300) and modified SSD module (revised-SSD300) will be trained. In addition, the revised-SSD300 will be extended to a revised-SSD600 with an input size of 600 × 600 as shown in Fig. 5. Therefore, in total, three models will be trained for comparison.
2.2 Introduction of SENet and ResNet
In our main task, we need to detect two defects, called “sponge” and “scar” defects. The task of SSD is to determine whether the defects exist and where they are. We also added Squeeze-and-Excitation Net (SENet) [11] structure in our model to boost the results with adaptive weights for each feature map. SENet is not a complete network structure, but rather a small architecture in between convolution blocks. When SENet is applied, our method is called Revised-SSDSE, which is shown in Fig. 5.
Sometimes, another non-defect factor, called “rusty factor” as shown in Fig. 9(c), will be present in the dataset. The rusty factors, which are not defects, have various shapes and features and significantly affect our results. In order to detect rusty factors in the dataset, the 3 ∗ 3 and 1 ∗ 1 layers must be added back to the revised SSD network.
After determining the existence and location of defects, ResNet should identify the name of the defect. In this paper, we use ResNet50 [12] and classify three categories of defects as shown in Fig. 9. The defect from SSD will be resized to 224 × 224 to fit the input size for ResNet50. The combination of revised-SSDSE600 and ResNet50 forms the complete hierarchical structure as shown in Fig. 6.
3 System Requirements
3.1 Hardware and Software
The hardware and software environment used in this paper is given in Table 1. The software part of the system includes Anaconda and GPU environment settings.
3.2 Data Annotation
We use LabelImg v1.6.0Footnote 1 tool to mark defect locations and non-defect classes in the dataset for the SSD model. LabelImg supports several operating system platforms, like Windows, Linux and Mac OS X. In this work, we use a Windows environment. After the labeling process is completed, the label result is saved in an XML format.
4 Experimental Results
The detection results are affected by camera types, illumination, number of defective samples, and other factors. The training process is performed as follows.
-
Collect various defect samples.
-
Mark the defect samples and generate corresponding XML files containing defect information.
-
Train marked defect samples through the neural network structure and save the training results.
4.1 Initial Test
In the initial test, we prepared defect data with 464 Scar and 246 Sponge images in the dataset, 10% of which were validation and 90% of which were training data. The experimental results are shown in Table 2. The results in Fig. 7 show that the performance of the revised-SSD300 is similar to that of the revised-SSD600, but better than SSD300. There are too many redundant boxes when SSD300 is applied, as presented in Fig. 7b.
We test the daily images provided by the onsite database and used the following parameters as an accurate benchmark for calculating the system performance [13]:
-
True Positive (TP).
-
True Negative (TN).
-
False Positive (FP).
-
False Negative (FN).
-
Precision (P) in Eq. (1).
-
Recall (R) in Eq. (2).
-
F-Measure is a comprehensive evaluation index, which is used to understand whether two values of Precision and Recall are good, as shown in Eq. (3).
4.2 Final Test
According to Tables 3 and 4, after a seven-day training period, the highest precision and recall of the revised-SSD300 were 100% and 77.6%, respectively. The revised-SSD600 had a better recall due to its high-resolution images. However, the combination of the revised-SSDSE600 and ResNet50 achieved the highest precision and recall rates.
Note that Fig. 10 shows that if we used defect bounding boxes directly from SSD for ResNet as shown in Fig. 8, the training process was hard to converge because the bounding boxes were too fitted to the defects. Therefore, we enlarged the range of the bounding boxes as shown in Fig. 9. After extending the bounding box, the training process could converge, which enabled the performance in Table 5 to be achieved.
5 Conclusions
In this paper, we design a hierarchical model to build a defect detection system for steel billets. We have modified the architecture of SSD by changing the sizes of feature maps and the sizes of anchor boxes to fit the shape of defects. The experimental results demonstrate the effectiveness of the proposed method. In further work, we will collect more rust defect images, because rusty types include many variations.
References
Peacock, G.R.: Review of noncontact process temperature measurements in steel manufacturing. In: SPIE Conference on Thermosense XXI, pp. 171–189. SPIE, Florida (1999)
Yun, J.P., Choi, S., Kim, J.-W., Kim, S.W.: Automatic detection of cracks in raw steel block using Gabor filter optimized by univariate dynamic encoding algorithm for searches (uDEAS). NDT E Int. 42, 389–397 (2009)
Duan, X., Duan, F., Han, F.: Study on surface defect vision detection system for steel plate based on virtual instrument technology. In: 2011 International Conference on Control, Automation and Systems Engineering (CASE), pp. 1–4. IEEE, Singapore (2011)
Balchin, N.C., Blunt, J.: Health and Safety in Welding and Allied Processes, 5th edn. Woodhead Publishing, Cambridge (2002)
Simonyan, K., Zisserman, A.: Very Deep Convolutional Networks for Large-Scale Image Recognition. CoRR abs/1409.1556 (2014)
Krizhevsky, A., Sutskever, I., Hinton, G.E.: ImageNet classification with deep convolutional neural networks. In: Advances in Neural Information Processing Systems 25 (NIPS 2012), pp. 1097–1105. Curran Associates, Inc., Nevada (2012)
Ren, S., He, K., Girshick, R., Sun, J.: Faster R-CNN: towards real-time object detection with region proposal networks. In: Advances in Neural Information Processing Systems 28 (NIPS 2015), pp. 91–99. Curran Associates, Inc., Canada (2015)
He, K., Gkioxari, G., Dollár, P., Girshick, R.B.: Mask R-CNN. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 2980–2988. IEEE, Venice (2017)
Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C.Y., Berg, A.C.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) Computer Vision – ECCV 2016. Lecture Notes in Computer Science, vol. 9905. Springer, Cham (2016)
Redmon, J., Kumar Divvala, B., Girshick, R., Farhadi, A.: You only look once: unified, real-time object detection. pp. 779–788. IEEE (2016)
Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 7132–7141. IEEE, Salt Lake City (2018)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 770–778. IEEE (2016)
Powers, D.M.W.: Evaluation: from Precision, Recall and F-measure to ROC, Informedness, Markedness and Correlation. 2017 (2007)
Acknowledgement
This work was supported by the Ministry of Science and Technology, Taiwan, under Grants MOST 106-2218-E-468-001, MOST 107-2221-E-155-048-MY3, and MOST 108-2634-F-008-001, and under Grants from China Steel Corporation RE106728 and RE107705.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Lin, CY., Chen, CH., Yang, CY., Akhyar, F., Hsu, CY., Ng, HF. (2020). Cascading Convolutional Neural Network for Steel Surface Defect Detection. In: Ahram, T. (eds) Advances in Artificial Intelligence, Software and Systems Engineering. AHFE 2019. Advances in Intelligent Systems and Computing, vol 965. Springer, Cham. https://doi.org/10.1007/978-3-030-20454-9_20
Download citation
DOI: https://doi.org/10.1007/978-3-030-20454-9_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-20453-2
Online ISBN: 978-3-030-20454-9
eBook Packages: EngineeringEngineering (R0)