The Narrow Window Evaluation Model of Converter Operation Process Based on the Logistic Regression Algorithm

Chen, Chao; Wang, Nan; Yu, Haiyang; Chen, Min

doi:10.1007/978-3-030-36540-0_4

Part of the book series: The Minerals, Metals & Materials Series (MMMS)

Abstract

Narrow window control of the converter end-point carbon content and temperature is a main target for improving the refining level and achieving intelligent manufacturing. Using practical production data for Q235 steel obtained in converter production, the operation process parameters were discretized with the chi-square binning method combined with the ideal target interval of end-point carbon content and temperature. The key process parameters affecting the ideal end-point of the converter were identified by their WOE values, and the coded data were scored with the logistic regression algorithm. The evaluation model of the converter operation process established in this paper can reasonably identify bad operating process parameters: for converter production data that do not meet the ideal end-point target interval, the recall ratio is 84% and the accuracy ratio is 88%. In addition, the evaluation results indicate that the carbon content and silicon content of the hot metal are the main factors affecting the converter end-point at this steel plant; thus, optimizing the hot metal condition can achieve narrow window control of the converter end-point.

Introduction

Intelligent manufacturing is an important indicator of the modernization level of an iron and steel enterprise. It is also an important condition for minimizing cost and energy consumption in production. The converter steelmaking process is a key part of the steel production route, and the stability of its hot metal conditions has an important impact on the quality of steel products. Precise control of the hot metal condition and the smelting operation in the converter helps to obtain a stable end-point carbon content and temperature, thereby promoting narrow window control of the smelting process. Converter production is a multi-factor interaction process, so the hot metal condition, the operating parameters, and the smelting cycle all have an important effect on the converter end-point temperature and carbon content. Therefore, developing an evaluation model of the converter operation process will help to identify the main factors affecting converter end-point control, to optimize the relevant process parameters, and to achieve precise narrow window control. The twenty-first century is the information era, and steel companies hold massive amounts of data [1]. At the same time, with the rapid development of cloud platforms and distributed system technology, data storage and computing capability have greatly improved [2,3,4], which makes it possible to collect data from steel companies and to extract valuable production information through machine learning. Bai et al. [5] developed a quality control system and applied it online in steel production, which benefited Cheng De Steel. Luo et al. [6] collected flame image data from converter smelting and used a convolutional neural network to predict the end-point carbon content of the converter with high prediction precision.
Lv et al. [7] established an intelligent quality control system for a sintering production line by collecting actual production data from the sintering plant, which improved the yield of sinter. These works are of great significance for raising the level of intelligent manufacturing in steel mill production. However, the production information of a steel plant currently combines human experience with machine sensing information, so it is necessary to use big data to construct models that provide operators with more production information and allow production to be adjusted in time. Based on the chi-square binning method and the logistic regression algorithm, this paper uses actual converter production data to bin the data and score the operation parameters, and then evaluates the degree of influence of the process parameters on the ideal target interval of the converter according to the WOE value. The evaluation model constructed in this paper can feed back the converter production process parameters in a timely manner, thus guiding production practice toward narrow window control of the converter end-point.

Converter Process Parameters Selection

Based on actual production data collected at the mill, about one year of converter production data was sorted out. The data parameters are shown in Table 1; they mainly include the raw material conditions, the process operation parameters, and the end-point targets. Among them, the end-point temperature and carbon content are the target variables, and the steel scrap addition, the hot metal condition, the smelting cycle, etc., are process operation parameters. This paper divides the ideal target interval based on the actual requirements of steel production and, according to that interval, uses the chi-square binning method and the logistic regression algorithm to construct a converter evaluation model with a nonlinear evaluation function, so as to guide actual converter production and realize narrow window control of the end-point.

Table 1 Relative parameters of converter production

Figure 1 shows the frequency distribution histograms of the end-point temperature and carbon content. The overall distributions of both the end-point carbon content and the temperature are approximately normal. If the ideal target interval is too wide, the constructed evaluation model will have low evaluation efficiency; if it is too narrow, the model will be too harsh. The ideal target interval should therefore be chosen with both aspects in mind.

Considering the requirements of the low carbon steel Q235B, the ideal target interval was divided as follows. The ideal target interval of the end-point carbon content is 0.02–0.05%. To satisfy the temperature requirement of the refining process, the ideal target interval of the end-point temperature is 1660–1680 °C. According to the selected ideal target intervals, the actual converter production data are classified by end-point carbon content and temperature. Data inside the ideal target interval are classified as Class I (ideal data), and data outside it are classified as Class II (non-ideal data).
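The labelling rule can be illustrated with a minimal Python sketch. It assumes that a record counts as Class I only when both the carbon content and the temperature lie inside their intervals, and that the interval bounds are inclusive; neither detail is stated explicitly in the text, and the function name and sample records are hypothetical.

```python
# Ideal target intervals quoted in the text (bounds assumed inclusive).
CARBON_RANGE = (0.02, 0.05)            # end-point carbon content, wt%
TEMPERATURE_RANGE = (1660.0, 1680.0)   # end-point temperature, deg C


def classify_record(carbon_pct, temperature_c):
    """Return "I" when both end-point values are inside the ideal intervals, else "II"."""
    carbon_ok = CARBON_RANGE[0] <= carbon_pct <= CARBON_RANGE[1]
    temperature_ok = TEMPERATURE_RANGE[0] <= temperature_c <= TEMPERATURE_RANGE[1]
    return "I" if carbon_ok and temperature_ok else "II"


# Hypothetical records: (end-point carbon in wt%, end-point temperature in deg C).
records = [(0.03, 1670.0), (0.06, 1665.0), (0.04, 1690.0)]
labels = [classify_record(c, t) for c, t in records]
```

Only the first hypothetical record satisfies both intervals, so it is the only one labelled Class I.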

Evaluation Model Construction

For converter steel production, many process parameters affect the end-point carbon content and temperature, mainly the hot metal conditions and the operating conditions. Figure 2 shows the Pearson correlation coefficients between the process parameters of the steelworks and the end-point targets. None of the process factors shows a significant linear correlation with the end-point temperature or carbon content, which increases the difficulty of precise control in the converter operation to achieve the ideal end-point temperature and carbon content.
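For reference, the Pearson coefficient used in Fig. 2 can be computed as in the following sketch. The sample values are invented for illustration and are not plant data; the second case shows why a coefficient near zero does not rule out a nonlinear dependence.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: an exactly linear pair and a purely quadratic pair.
r_linear = pearson([1, 2, 3], [2, 4, 6])
xs = [-2, -1, 0, 1, 2]
r_quadratic = pearson(xs, [x * x for x in xs])
```

The quadratic pair is fully determined by x yet has zero linear correlation, which is the kind of relationship a purely linear screening would miss.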

With the rapid development of big data, the mode of analysis has shifted from causal relationship models to correlation relationship models [8]. With the help of a big data model, the relevant factors affecting converter production can be found. By collecting actual production data from the converter steel plant and applying big data technology, the underlying relationships among the hot metal parameters, the operating parameters, and the smelting cycle can be mined in depth, and precise narrow window control of the converter end-point temperature and carbon content can be realized.

Establishment of the Chi-Square Binning Method Model

The chi-square binning (box) method is an applied statistical model. Through statistical analysis of existing data, it evaluates the degree of influence of a variable on the target [9, 10]. Here it is used to group the converter production parameters so that the large volume of production data can be summarized and analyzed, which helps in analyzing the relationship between the converter process factors and the end-point target.

The algorithm flow of the chi-square binning method is: (i) sort every parameter from low to high; (ii) treat data with the same value as one interval; (iii) use Eqs. (1) and (2) to calculate the chi-square value of each interval; (iv) compare the chi-square values of adjacent intervals and merge similar intervals whose chi-square value does not exceed the chi-square threshold, then repeat the calculation and merging until the proper number of bins is reached.

$$ E_{\text{j}} = N_{\text{i}} \times C_{\text{j}} $$

(1)

$$ X^{ 2} = \sum\limits_{{{\text{j}} = 1}}^{2} {\frac{{\left( {A_{\text{j}} - E_{\text{j}} } \right)^{2} }}{{E_{\text{j}} }}} $$

(2)

where A_j is the number of instances of class j in the interval; E_j is the expected frequency corresponding to A_j; N_i is the total number of samples in interval i; and C_j is the proportion of samples belonging to class j (taken as a proportion so that E_j in Eq. (1) is an expected count).
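The merging procedure can be sketched in Python as follows. This is an illustrative reading of Eqs. (1) and (2), not the authors' implementation: it assumes that C_j is the pooled share of class j in the two adjacent intervals being compared and that the statistic is summed over both intervals, as in the usual ChiMerge formulation. The default threshold of 3.841 (the 5% critical value for one degree of freedom) and the sample counts are hypothetical choices.

```python
def pair_chi_square(left, right):
    """Chi-square statistic of two adjacent, non-empty intervals.

    left and right are [n_class_I, n_class_II] counts (A_j in Eq. (2)).
    """
    total = sum(left) + sum(right)
    chi2 = 0.0
    for counts in (left, right):
        n_i = sum(counts)                         # N_i: samples in this interval
        for j in range(2):
            c_j = (left[j] + right[j]) / total    # C_j: pooled share of class j
            expected = n_i * c_j                  # E_j, Eq. (1)
            if expected > 0:                      # skip empty cells to avoid 0/0
                chi2 += (counts[j] - expected) ** 2 / expected
    return chi2


def chi_merge(intervals, max_bins, threshold=3.841):
    """Merge the most similar adjacent pair until both stop conditions hold."""
    bins = [list(b) for b in intervals]
    while len(bins) > 1:
        scores = [pair_chi_square(bins[k], bins[k + 1]) for k in range(len(bins) - 1)]
        k = min(range(len(scores)), key=scores.__getitem__)
        if len(bins) <= max_bins and scores[k] > threshold:
            break                                 # all neighbours differ enough
        bins[k] = [bins[k][0] + bins[k + 1][0], bins[k][1] + bins[k + 1][1]]
        del bins[k + 1]
    return bins


# Hypothetical counts per initial interval: [Class I, Class II].
initial = [[10, 0], [9, 1], [1, 9], [0, 10]]
merged = chi_merge(initial, max_bins=2)
```

In this toy case the two low intervals and the two high intervals have similar class distributions, so each pair is merged and two clearly different bins remain.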

The purpose of the chi-square binning method is to discretize the continuous data, which is convenient for further analysis. Through this method, every parameter is divided into a certain number of intervals, chosen so that the target distributions of the intervals differ as much as possible; each interval therefore represents a specific operating condition. By analyzing the importance of each interval, we can evaluate the impact of different converter process parameters on the end-point target.
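Once the cut points of a parameter are known, discretizing a new measurement reduces to a lookup. The sketch below uses Python's standard `bisect` module; the cut points and values are hypothetical, and treating each interval as closed on the left is an assumption.

```python
from bisect import bisect_right


def bin_index(value, edges):
    """Index of the interval containing value.

    edges holds the sorted interior cut points, so indices run from 0 to len(edges).
    """
    return bisect_right(edges, value)


# Hypothetical cut points for one process parameter.
edges = [0.30, 0.45, 0.60]
values = [0.25, 0.30, 0.52, 0.75]
codes = [bin_index(v, edges) for v in values]
```

A value equal to a cut point falls into the interval that starts at that cut point.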

The importance of each interval is determined by its WOE (weight of evidence) value. The WOE value of each interval represents its effect on the end-point target. With the definition in Eq. (3), a negative WOE value indicates that the interval has a good effect on the end-point target, and a positive WOE value indicates a bad effect. The larger the absolute value of WOE, the greater the impact on the end-point target. The WOE value is calculated as shown in Eq. (3).

$$ WOE = \ln \left( {\frac{{P_{\text{bad}} }}{{P_{\text{good}} }}} \right) $$

(3)

where the subscript good denotes ideal data and the subscript bad denotes non-ideal data. P_good is the proportion of all good data that falls in the interval, and P_bad is the proportion of all bad data that falls in the interval. It follows from Eq. (3) that a positive WOE value represents a negative influence on the end-point target, while a negative WOE value represents a positive influence.
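Equation (3) can be evaluated per interval as in the following sketch. The counts are hypothetical, and the code assumes every interval contains at least one good and one bad record; how empty cells were handled in the paper is not stated.

```python
import math


def woe_per_bin(good_counts, bad_counts):
    """WOE of each interval following Eq. (3): ln(P_bad / P_good)."""
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    values = []
    for good, bad in zip(good_counts, bad_counts):
        if good == 0 or bad == 0:
            raise ValueError("WOE is undefined for an interval with an empty class")
        p_good = good / total_good    # share of all good records in this interval
        p_bad = bad / total_bad       # share of all bad records in this interval
        values.append(math.log(p_bad / p_good))
    return values


# Hypothetical counts for three intervals of one parameter.
good = [30, 50, 20]   # Class I (ideal) records per interval
bad = [10, 20, 70]    # Class II (non-ideal) records per interval
woe = woe_per_bin(good, bad)
```

The third interval holds most of the non-ideal records, so its WOE is positive, while the first two intervals have negative WOE.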

In addition, through chi-square binning and the WOE calculation, it is possible to determine how each interval of each process parameter contributes to reaching the ideal target interval of the converter end-point. Generally, the number of bins is chosen from experience. This paper instead uses an iterative method to determine a reasonable number of bins. The IV (information value) represents the amount of information contained in a variable: the higher the IV, the more information the variable carries about the end-point target. The IV value is calculated as shown in Eq. (4).

$$ IV = \sum\limits_{i = 1}^{n} {\left( {P_{\text{bad}} - P_{\text{good}} } \right) \times WOE_{\text{i}} } $$

(4)

Through iterative calculation, the number of bins at which the IV value reaches its maximum is taken as the optimal number of bins for the variable, and the iteration terminates when the IV value converges. The final IV values and the final numbers of bins are shown in Fig. 3. As can be seen from the figure, the number of bins is 10 for the smelting cycle, 16 for the converter O₂ consumption, 4 for the hot metal addition, 30 for the steel scrap addition, 5 for the light roasting addition, 48 for hot metal w(C), 25 for hot metal w(Si), 5 for hot metal w(Mn), 22 for hot metal w(P), 14 for the hot metal temperature, and 12 for the lime addition. The IV value represents the converter end-point information contained in each converter process parameter. The IV values of hot metal w(C) and hot metal w(Si) are high, indicating that the hot metal composition at this steelmaking plant is the most important factor for the converter end-point.
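The selection of the bin number by IV can be sketched as below. The function follows Eq. (4); the candidate binnings and their counts are hypothetical, and picking the candidate with the largest IV is a simplified stand-in for the iterative procedure, whose details the text does not give.

```python
import math


def information_value(good_counts, bad_counts):
    """IV of one variable following Eq. (4), with WOE as defined in Eq. (3)."""
    total_good = sum(good_counts)
    total_bad = sum(bad_counts)
    iv = 0.0
    for good, bad in zip(good_counts, bad_counts):
        p_good = good / total_good
        p_bad = bad / total_bad
        iv += (p_bad - p_good) * math.log(p_bad / p_good)
    return iv


# Hypothetical candidate binnings of one parameter: bin number -> (good, bad) counts.
candidates = {
    2: ([80, 20], [30, 70]),
    3: ([30, 50, 20], [10, 20, 70]),
}
iv_by_bins = {n: information_value(g, b) for n, (g, b) in candidates.items()}
best_bins = max(iv_by_bins, key=iv_by_bins.get)
```

Each term of the sum is non-negative because the two factors always share the same sign, so IV grows as the good and bad distributions over the intervals move apart.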

Process Parameters WOE Value Statistics

WOE statistics were computed for the 2418 production records collected from the steelmaking plant; the results are shown in Table 2. The mean WOE value of the converter process parameters is −0.1, indicating that the overall operation of the converter has a good influence on the end-point target; the WOE variance is 0.72, indicating that the converter operation fluctuates greatly; the WOE maximum is 2.1, indicating that some past converter operations had a strongly negative impact and should be avoided in actual production.

Table 2 WOE statistical result of converter process parameters

Figure 4 compares the mean WOE value of each process parameter between Class I and Class II. For every parameter, the WOE value in the Class I data is lower than in the Class II data; the Class I values are all below 0 and the Class II values are all above 0, indicating that WOE is a usable indicator. The sign of the WOE value represents the direction of the influence of the converter operating process on the ideal target interval of the end-point. The WOE results indicate that the binning in this paper is reasonable. The difference in mean WOE between the two classes is largest for hot metal w(C) and hot metal w(Si), indicating that the values of these two process parameters have a great influence on the ideal target interval of the converter. The WOE values of hot metal w(Mn), light roasting addition, smelting cycle, and hot metal addition are all around 0, indicating that these four process parameters have a relatively stable influence on the ideal target interval of the converter.

Establishment of Logistic Regression Model

Every WOE value represents the influence of its bin on the end-point target. By adding the WOE values linearly, the total influence of the relevant parameters can be obtained. A higher total influence value means that the operation is less likely to reach the ideal interval. Using the logistic regression algorithm shown in Eqs. (5) and (6), the total influence of the parameters is calculated through the linear term $ \theta^{T} x $, and the nonlinear relationship is then mapped through the function $ \frac{1}{1 + e^{-x}} $.

$$ h_{\theta}(x) = \frac{1}{1 + e^{-\theta^{T} x}} $$

(5)

$$ p(y \mid x;\theta) = \left( h_{\theta}(x) \right)^{y} \left( 1 - h_{\theta}(x) \right)^{1 - y} $$

(6)

where θ represents the vector of regression coefficients.
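Equations (5) and (6) can be written out directly, as in the sketch below. The fitting routine is a plain gradient-ascent stand-in for a library solver and is included only to show how θ follows from maximizing the likelihood in Eq. (6). It assumes that y = 1 marks non-ideal (Class II) data and that each row starts with a constant 1 for the intercept; the WOE-coded rows, labels, learning rate, and iteration count are all hypothetical.

```python
import math


def h(theta, x):
    """Eq. (5): logistic function of the linear term theta^T x."""
    z = sum(t * v for t, v in zip(theta, x))
    return 1.0 / (1.0 + math.exp(-z))


def likelihood(theta, x, y):
    """Eq. (6): probability of label y (0 or 1) given x and theta."""
    p = h(theta, x)
    return p ** y * (1.0 - p) ** (1 - y)


def fit(rows, labels, learning_rate=0.1, epochs=2000):
    """Maximize the summed log of Eq. (6) by batch gradient ascent."""
    theta = [0.0] * len(rows[0])
    for _ in range(epochs):
        gradient = [0.0] * len(theta)
        for x, y in zip(rows, labels):
            error = y - h(theta, x)          # derivative of the log-likelihood
            for k, v in enumerate(x):
                gradient[k] += error * v
        theta = [t + learning_rate * g / len(rows) for t, g in zip(theta, gradient)]
    return theta


# Hypothetical WOE-coded rows: [1 (intercept), WOE of parameter A, WOE of parameter B].
rows = [
    [1.0, -0.8, -0.5], [1.0, -0.3, 0.2], [1.0, 0.4, -0.1],
    [1.0, 0.9, 0.7], [1.0, 0.6, 0.3], [1.0, -0.5, 0.1],
]
labels = [0, 0, 1, 1, 1, 0]                  # 1 = Class II (non-ideal), assumed
theta = fit(rows, labels)
score_high = h(theta, rows[3])               # row with mostly positive WOE values
score_low = h(theta, rows[0])                # row with mostly negative WOE values
```

With this sign convention, rows dominated by positive WOE values receive scores closer to 1, matching the statement that a higher total influence means the ideal interval is less likely to be reached.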

Equation (5) is the logistic regression function, and Eq. (6) gives the corresponding probability of the class label y. The model uses the built-in logistic regression model of Python sklearn version 0.19. In the calculation results, the mean value of the Class I data is 0.7 and the mean value of the Class II data is 0.8. To identify the bad data, the threshold is set to 0.7: if the calculation result exceeds 0.7, the data is judged to be unqualified.

As can be seen from Table 3, the total number of data records is 2418. According to the model results, the overall accuracy is 79%. For Class II data, the recall rate is 84% and the accuracy rate is 88%, indicating that the constructed model finds unqualified data well.
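The way such figures are obtained from model scores can be sketched as follows. The scores and labels are invented and do not reproduce Table 3; the 0.7 threshold is the one quoted in the text, and reading the per-class "accuracy rate" as precision is an assumption.

```python
def evaluate(scores, is_class_ii, threshold=0.7):
    """Overall accuracy plus recall and precision for Class II at a score threshold."""
    flagged = [score > threshold for score in scores]   # "exceeds" read as strictly greater
    pairs = list(zip(flagged, is_class_ii))
    tp = sum(1 for f, t in pairs if f and t)            # Class II correctly flagged
    fp = sum(1 for f, t in pairs if f and not t)        # Class I wrongly flagged
    fn = sum(1 for f, t in pairs if not f and t)        # Class II missed
    tn = sum(1 for f, t in pairs if not f and not t)    # Class I correctly passed

    def ratio(numerator, denominator):
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(scores)),
        "recall": ratio(tp, tp + fn),
        "precision": ratio(tp, tp + fp),
    }


# Hypothetical scores: six Class II records followed by four Class I records.
scores = [0.90, 0.80, 0.75, 0.72, 0.60, 0.50, 0.71, 0.30, 0.65, 0.20]
is_class_ii = [True] * 6 + [False] * 4
metrics = evaluate(scores, is_class_ii)
```

Raising the threshold generally trades recall for precision, so the three numbers should be read together rather than in isolation.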

Table 3 Model effectiveness assessment

Conclusion

Based on the actual production data of the converter, this paper uses the chi-square binning method to discretize the data. For the discretized data, the WOE value of each bin is calculated, and the impact of each bin on the end-point target is evaluated according to its WOE value. Hot metal w(C) and hot metal w(Si) were found to have the greatest influence on the end-point target, so to achieve narrow window control, hot metal w(C) and hot metal w(Si) should be stabilized first. For the discrimination model, the overall accuracy rate of the converter operation process evaluation model is 79%; for data that do not meet the ideal target interval, the discriminative accuracy rate is 88% and the recall rate is 84%.

References

1. Qiu Y, Luo H (2018) Research on logistics cost management of steel enterprises under the background of big data. Wuhan Uni Sci Tech
2. Yen CC, Hsu JS (2009) Pagerank algorithm improvement by page relevance measurement. In: International conference on fuzzy systems, vol 5, no 8, pp 502–506
3. Dean J, Ghemawat S (2010) MapReduce: a flexible data processing tool. Commun ACM 53(1):72–77
4. Althebyan Q, Alqudah O, Jararweh Y, Yaseen Q (2014) Multi-threading based map reduce tasks scheduling. In: International conference on information and communication systems. IEEE, p 1
5. Bai RG, Xu LS, Bao K, Wang FL (2018) The application of big data process quality control system in steel production. Chin Metall 28(08):76–80
6. Luo T, Liu H, Wu QS, Wang B (2018) Prediction method of carbon content in converter steelmaking end point based on convolutional neural network. Inf Technol 42(12):142–147
7. Lv Q, Liu S, Liu XJ, Bi ZX, Li JP (2018) Intelligent quality control system for sintering production line based on big data technology. Iron Steel 53(7):1–9
8. Cheng XQ (2014) Overview of big data systems and analysis technologies. J Softw 9:1889–1908
9. Li YH (2010) Establishment of credit scorecard model. Sci Inf 37(13):48–49
10. Lin ZQ, Zhang PY, Cui ZY (2013) Construction and implementation of SME credit score card system. Banker 7:20–23

Author information

Authors and Affiliations

School of Metallurgy, Northeastern University, Shenyang, 110819, China
Chao Chen, Nan Wang, Haiyang Yu & Min Chen

Corresponding author

Correspondence to Nan Wang.

Editor information

Editors and Affiliations

Central South University, Changsha, China
Zhiwei Peng
Michigan Technological University, Houghton, MI, USA
Jiann-Yang Hwang
Montana Technological University, Butte, MT, USA
Jerome P. Downey
RHI Magnesita, Leoben, Austria
Dean Gregurek
The University of Queensland, Brisbane, QLD, Australia
Baojun Zhao
Istanbul Technical University, Istanbul, Turkey
Onuralp Yücel
Atilim University, Ankara, Turkey
Ender Keskinkilic
Central South University, Changsha, China
Tao Jiang
Elkem Carbon AS, Kristiansand, Norway
Jesse F. White
King Fahd University of Petroleum and Minerals, Dhahran, Saudi Arabia
Morsi Mohamed Mahmoud

About this paper

Cite this paper

Chen, C., Wang, N., Yu, H., Chen, M. (2020). The Narrow Window Evaluation Model of Converter Operation Process Based on the Logistic Regression Algorithm. In: Peng, Z., et al. 11th International Symposium on High-Temperature Metallurgical Processing. The Minerals, Metals & Materials Series. Springer, Cham. https://doi.org/10.1007/978-3-030-36540-0_4

DOI: https://doi.org/10.1007/978-3-030-36540-0_4
Published: 24 January 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-36539-4
Online ISBN: 978-3-030-36540-0
eBook Packages: Chemistry and Materials Science, Chemistry and Material Science (R0)
