Modeling hot strip rolling process under framework of generalized additive model
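Source journal: Journal of Central South University (English Edition), 2019, Issue 9
Authors: LI Wei-gang (李维刚), YANG Wei (杨威), ZHAO Yun-tao (赵云涛), YAN Bao-kang (严保康), LIU Xiang-hua (刘相华)
Pages: 2379 - 2392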
Key words: industrial big data; generalized additive model; mechanical property prediction; deformation resistance prediction
Abstract: This research develops a new mathematical modeling method by combining industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with generalization and precision. Specifically, the proposed modeling method includes the following steps. Firstly, the influence factors are screened using mechanism knowledge and data-mining methods. Secondly, the unary GAM without interactions is constructed, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models of mechanical property and deformation resistance for hot-rolled strips are established. Verification with actual industrial data demonstrates that the new models have good prediction precision, and the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
Cite this article as: LI Wei-gang, YANG Wei, ZHAO Yun-tao, YAN Bao-kang, LIU Xiang-hua. Modeling hot strip rolling process under framework of generalized additive model [J]. Journal of Central South University, 2019, 26(9): 2379-2392. DOI: https://doi.org/10.1007/s11771-019-4181-9.
J. Cent. South Univ. (2019) 26: 2379-2392
DOI: https://doi.org/10.1007/s11771-019-4181-9
LI Wei-gang(李维刚)1, 2, YANG Wei(杨威)1, ZHAO Yun-tao(赵云涛)1,YAN Bao-kang(严保康)1, LIU Xiang-hua(刘相华)3
1. Engineering Research Center for Metallurgical Automation and Measurement Technology of Ministry of Education, Wuhan University of Science and Technology, Wuhan 430081, China;
2. National-provincial Joint Engineering Center of High Temperature Materials and Lining Technology, Wuhan University of Science and Technology, Wuhan 430081, China;
3. Research Institute of Science and Technology, Northeastern University, Shenyang 110819, China
© Central South University Press and Springer-Verlag GmbH Germany, part of Springer Nature 2019
1 Introduction
Industrial processes are highly continuous and involve complex mechanisms linking their parameters, so industrial data exhibit specific characteristics such as multi-layer irregular sampling, multiple temporal and spatial time series, and non-veracity with outliers. Hence, existing modeling methods for industrial processes encounter numerous challenges, which can be divided into two main categories: 1) poor quality of modeling data and 2) high user requirements for model performance. Poor-quality modeling data are problematic because existing modeling methods often require uncontaminated data; otherwise, the model is strongly affected by a small number of outliers and model mismatch can easily occur. With respect to the high user requirements for model performance, the problem lies primarily in the practicality of the model, which is mainly embodied in its reliability and generalization. Reliability refers to the high accuracy of the model within its scope of application, and high generalization means that the model is suitable for a wide range of cases. However, a contradiction exists between reliability and generalization, and appropriate trade-offs must be made according to the specific needs of users, which is another challenge encountered in industrial system modeling. Traditional industrial process modeling methods include statistical modeling and mechanism modeling. Statistical modeling uses statistics, intelligent computing and other methods to build the model based on industrial process data. This method is relatively simple and suitable for modeling complex industrial processes with large amounts of historical data, and many achievements have been made based on this method [1-6].
However, a simple statistical model is not enough to describe the complexity of the process mechanism. As a result, the coefficients of models with the same structure can differ considerably, so the reliability of the established model is not guaranteed. Secondly, because new products do not have enough production data for modeling, the statistical model is difficult to apply to the design of new products. Moreover, using the statistical model to optimize historical products also carries a great risk: once production is organized according to the optimized scheme, the distribution of the new data differs from that of the original modeling data, and under the new data distribution the accuracy of the model may be significantly reduced. On the other hand, mechanism modeling is based on the laws involved in the process, such as the laws of conservation of energy, dynamics and material balance. Differential equations are established according to the corresponding laws to construct the mechanism model. Thus, a model obtained by mechanism modeling has strong interpretability, and this modeling method has been applied in many research fields [7-13]. However, in specific applications such as strip mechanical property prediction, some sub-models in the mechanism model need to be established by statistical methods in order to improve the accuracy of the model, so the shortcomings of the statistical model also appear in the mechanism model. Meanwhile, many factors affect the industrial process, and the evolution rules of the related model parameters are difficult to describe accurately. As a result, mechanism modeling is much more difficult than statistical modeling. In addition, it is necessary to simplify the object and apply certain assumptions in mechanism modeling, resulting in a substantial error between the obtained model and the actual industrial process.
Hence, to address these drawbacks, we developed a new system modeling method by combining industrial big data and process mechanism. Specifically, based on mechanism knowledge, data mining methods and manual analysis, this research divides the complex industrial process problem into several sub-problems, and then builds the sub-model corresponding to each sub-problem. A complex industrial process model is subsequently constructed in a generalized additive form. Finally, two practical instances in a hot strip rolling process are solved using the new system modeling method, and the prediction results indicate that the obtained models have good generalization and high prediction accuracy.
The remainder of this paper is organized as follows. Section 2 elaborates on the modeling method, and Section 3 describes the two applications of the proposed method. Selected conclusions are presented in Section 4.
2 Proposed modeling method
The mechanical property prediction model of hot rolled strips is a typical industrial process model that can be used to predict mechanical properties such as tensile strength, yield strength and elongation. It can be applied to reduce the amount of strip sampling, control the mechanical properties of hot rolled products, optimize the composition of the strip and design new products [14-17]. Since the 1970s, the mechanical property prediction model of hot rolled strips has been a focus of attention in the metallurgical industry. Although this research has lasted for more than 40 years, it is still immature because of the complexity of the research object and the low quality of the collected data. This immaturity has made it difficult to improve the reliability and accuracy of the model. In addition, the model is usually built from historical data; however, once production is organized according to the optimization scheme, the distribution of the new data will differ from that of the original modeling data. Therefore, the model must have good generalization.
Hence, taking mechanical property prediction of hot rolled strips as an example, specific steps of the proposed modeling method are listed in the following paragraphs. Figure 1 shows the general procedure of the new industrial process modeling method.
Step 1: Screening of influence factors of the built model. Reasonable screening of influence factors can improve the accuracy of the built model. Many factors influence the mechanical property of hot rolled strips, and certain invisible disturbances also exist. Effectively identifying the influence factors necessitates the combination of the process mechanism, data mining methods and the prior knowledge.
According to the metallurgical mechanism, data acquisition process and prior knowledge, the influence factors of strip mechanical property are first classified into four categories: chemical composition, process parameters, detection parameters and abnormal marks. Because many factors are involved in the rolling process and complex interactions exist among subsets of them, this classification not only indicates the area in which the factors can be found but also reduces, within certain limits, the number of wrongly selected influence factors.
First, some recognized influence factors can be identified using prior knowledge, such as chemical compositions including C, Si, Mn, S, P, N, Nb, V and Ti, and process parameters including TF (reheating temperature), TFD (finishing delivery temperature), TC (coiling temperature), Tr (roughing temperature), Tfe (finishing entry temperature) and HFD (finishing delivery thickness). Then, the random forests algorithm [18] is used to select a subset of possible influence factors. The procedure for the random forests algorithm is described as follows:
Figure 1 Modeling method combining industrial data with process mechanism under framework of generalized additive models
Algorithm: Random forests
1. For b=1 to B, where B is the total number of random-forest trees:
a) Draw a bootstrap sample Z* of size N from the training data.
b) Grow a random-forest tree $T_b$ on the bootstrapped data by recursively repeating the following steps for each terminal node of the tree until the minimum node size $n_{\min}$ is reached.
i. Select m variables at random from the p variables.
ii. Pick the best variable/split-point among the m.
iii. Split the node into two daughter nodes.
2. Output the ensemble of trees $\{T_b\}_1^B$.
To make a prediction at a new point x:
Regression: $\hat{f}_{\mathrm{rf}}^{B}(x)=\frac{1}{B}\sum_{b=1}^{B}T_b(x)$.
Classification: Let $\hat{C}_b(x)$ be the class prediction of the bth random-forest tree. Then, $\hat{C}_{\mathrm{rf}}^{B}(x)=\text{majority vote}\,\{\hat{C}_b(x)\}_1^B$.
An important feature of the random forests algorithm is its use of out-of-bag (OOB) samples: for each observation zi=(xi, yi), we construct its random forest predictor by averaging only those trees corresponding to bootstrap samples in which zi does not appear. OOB samples can be used to measure the prediction strength of each variable. When the bth tree is grown, the OOB samples are passed down the tree and the prediction accuracy is recorded. Then, the values of the jth variable are randomly permuted in the OOB samples, and the accuracy is again computed and recorded. The decrease in accuracy as a result of this permutation is averaged over all trees and is used as a measure of the importance of the jth variable in the random forest.
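The OOB permutation measure described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the paper: depth-one regression trees (stumps) stand in for fully grown trees to keep the code short, accuracy is measured by mean squared error, and all function names and the synthetic data in the usage note are hypothetical.

```python
import random

def avg(values):
    values = list(values)
    return sum(values) / len(values)

def fit_stump(X, y, features):
    """Best single split over the candidate features (squared-error criterion)."""
    best = None
    for j in features:
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = avg(left), avg(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    if best is None:  # every candidate feature is constant in this sample
        m = avg(y)
        return (features[0], 0.0, m, m)
    return best[1:]

def predict(stump, row):
    j, t, ml, mr = stump
    return ml if row[j] <= t else mr

def mse(stump, rows, targets):
    return avg((predict(stump, r) - t) ** 2 for r, t in zip(rows, targets))

def oob_permutation_importance(X, y, n_trees=30, m=1, seed=0):
    """Average increase in OOB error after permuting each variable."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    importance = [0.0] * p
    for _ in range(n_trees):
        bag = [rng.randrange(n) for _ in range(n)]       # bootstrap sample
        in_bag = set(bag)
        oob = [i for i in range(n) if i not in in_bag]   # out-of-bag rows
        if not oob:
            continue
        candidates = rng.sample(range(p), m)             # m variables at random
        stump = fit_stump([X[i] for i in bag], [y[i] for i in bag], candidates)
        rows, targets = [X[i] for i in oob], [y[i] for i in oob]
        base = mse(stump, rows, targets)
        for j in range(p):
            col = [r[j] for r in rows]
            rng.shuffle(col)                             # permute variable j
            permuted = [r[:j] + [c] + r[j + 1:] for r, c in zip(rows, col)]
            importance[j] += (mse(stump, permuted, targets) - base) / n_trees
    return importance
```

For example, on synthetic data in which only the first of two variables drives the response, the first importance value should come out clearly larger than the second.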
Notably, the influence factors found by the random forests show good distinction in the data, but the algorithm cannot determine whether these factors actually affect the dependent variable. Thus, the causality diagram, mechanism knowledge and other methods are comprehensively used to judge the authenticity of the influence factors. Figure 2 shows the overall concept of influence factor determination. Because of the diversity and complexity of the influence factors, the search for influence factors is a repeated process of hypothesis, analysis, inspection and correction.
Figure 2 Determination of influence factors
Step 2: Construction of the unary generalized additive model (GAM) without interactions. This step includes data cleaning, sub-model establishment, sub-model validation, etc. These steps are repeated until a reliable model is obtained or no evidence is found to prove the model unreliable. Specific descriptions of these steps are given as follows:
1) Cleaning the process data. Practical industrial process data contain large amounts of useful information but also certain useless and even misleading information, which is also true for the rolling process data. Thus, data cleaning must be performed first. Binning is a method commonly applied for data cleaning. A sorted data value is smoothed by consulting its “neighborhood”, i.e., the values around it, and the sorted values are distributed into a number of bins [19]. The bins can be of equal frequency, where the number of data values in each bin is constant, or of equal width, where the interval range of values in each bin is constant. However, in practice, these two types of binning cannot effectively reflect the distribution of the raw data. Hence, given the non-veracity with outliers and the non-uniform distribution of rolling process data, a dynamic binning method for data cleaning is proposed in which the number of data values in each bin is not constant but is instead based on the distribution density of the data points.
Figure 3 Two typical distributions of hot rolling process data:
The distribution of the hot rolling process data can be divided into two categories: zonal distribution and reunion distribution. The proposed data cleaning method differs for the two distributions.
Assume a sorted data set D={(xi, yi), 1≤i≤N}.
When D is zonal distributed, an interval with width δ is set first according to the scatter plot. Subsequently, the interval is moved from left to right along the x-axis until all points are divided into different intervals, and the principle of division states that each interval of width δ contains as many points as possible. The kth interval is defined as Ik, and the original data in interval Ik are replaced by their average, i.e., the mean of the x values and the mean of the y values of the points in Ik.
When D is reunion distributed, an initial value cnt is first set, the interval [x1, xN] is divided into bins, and the number of points in each bin is counted sequentially. The mean value of the data in the accumulated bins is calculated when either the count exceeds 40, or the cumulative number of bins exceeds 4 and the count exceeds 20. Finally, the original data in each group of adjacent bins are replaced by the mean value.
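The zonal case can be illustrated with a short sketch. It assumes one simple reading of the procedure: each width-δ interval is anchored at the leftmost point not yet assigned, and every point in the interval is replaced by the interval mean. The function name `zonal_bin_average` and the sample points are hypothetical.

```python
def zonal_bin_average(points, delta):
    """Replace the points in each width-delta interval by their mean (x, y)."""
    pts = sorted(points)
    cleaned = []
    i = 0
    while i < len(pts):
        start = pts[i][0]              # anchor the interval at the leftmost free point
        j = i
        while j < len(pts) and pts[j][0] - start <= delta:
            j += 1                     # collect every point inside [start, start + delta]
        group = pts[i:j]
        mean_x = sum(p[0] for p in group) / len(group)
        mean_y = sum(p[1] for p in group) / len(group)
        cleaned.append((mean_x, mean_y))
        i = j
    return cleaned

# Hypothetical sample: two dense groups and one isolated point
sample = [(0.0, 1.0), (0.1, 3.0), (1.0, 10.0), (1.05, 12.0), (5.0, 7.0)]
cleaned = zonal_bin_average(sample, delta=0.5)
```

Because the number of points per interval follows the local density of the data, dense regions are averaged over many records while sparse regions keep their few points, which is the behaviour the dynamic binning aims for.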
2) Building the sub-models. The relationships between the mechanical properties and the influence factors are so complex that mechanical property prediction is a high-dimensional nonlinear problem. Therefore, the concept of "decomposition" is proposed in this paper. The complex problem is divided into several sub-problems, and the sub-models corresponding to each sub-problem are obtained under the framework of GAM. The local scoring algorithm [20], which is an iterative procedure, is used to estimate each sub-model. In each iteration, the adjusted dependent variable is formed and the backfitting algorithm is used to estimate the single variable function of each independent variable via a non-parametric or a parametric estimation method. For the non-parametric estimation method, the cubic spline is a practical choice. For the parametric estimation method, we must assume the functional form of each sub-model. First, the scatter plot of each independent variable against the dependent variable is obtained. Second, a preliminary hypothesis is proposed for the functional form of each sub-model based on each scatter plot. The available choices of functional forms include polynomial, exponential, trigonometric, logarithmic, etc.
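The procedure for the local scoring algorithm is described as follows:
Initialize. Compute the initial estimates:
$\alpha^{0}=g\left(\frac{1}{n}\sum_{i=1}^{n}y_i\right)$, $f_1^{0}=\dots=f_p^{0}=0$
where α is the intercept; g(·) is the link function; $f_j^{0}$ is the initial value of fj, and fj is an arbitrary single variable function of the jth independent variable, also known as the jth sub-model; p is the number of independent variables included in this model; and n is the number of records for the sample data.
Enter the iteration, for k=1, 2, …
First, form the adjusted dependent variables:
$Z^{k-1}=\eta^{k-1}+\left(y-\mu^{k-1}\right)\left(\frac{\partial\eta}{\partial\mu}\right)^{k-1}$, with $\eta^{k-1}=\alpha^{k-1}+\sum_{j=1}^{p}f_j^{k-1}(x_j)$
where $\mu^{k-1}=g^{-1}(\eta^{k-1})$.
Second, enter the backfitting algorithm cycles:
Initialize: $\alpha^{k}=E(Z^{k-1})$;
Estimate each $f_m^{k}$ for m=1, …, p:
$f_m^{k}=E\left(Z^{k-1}-\alpha^{k}-\sum_{j<m}f_j^{k}(x_j)-\sum_{j>m}f_j^{k-1}(x_j)\,\middle|\,x_m\right)$
where $f_m^{k}$ denotes the estimate of $f_m$ at the kth iteration.
Stop. Execute the iteration process until either $\Delta(\eta^{k},\eta^{k-1})\le\varepsilon$, where ε is a small threshold, or the number of iterations reaches a preset value.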
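For the identity link, the derivative of η with respect to μ is 1, so the adjusted dependent variable equals y and local scoring reduces to plain backfitting. The sketch below illustrates only that special case. It makes two simplifying assumptions that are not from the paper: a bin-mean smoother replaces the cubic spline, and the stopping rule is the summed squared change of the fitted functions. All names and the synthetic data are hypothetical.

```python
import math
import random

def bin_smoother(x, r, n_bins=10):
    """Crude stand-in for a spline: mean of r within equal-width bins of x."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / n_bins or 1.0
    index = [min(int((v - lo) / width), n_bins - 1) for v in x]
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for b, v in zip(index, r):
        sums[b] += v
        counts[b] += 1
    return [sums[b] / counts[b] for b in index]

def backfit(columns, y, n_iter=20, tol=1e-6):
    """Fit y ~ alpha + sum_m f_m(x_m); returns alpha and fitted values of each f_m."""
    n, p = len(y), len(columns)
    alpha = sum(y) / n                         # intercept: mean response
    f = [[0.0] * n for _ in range(p)]          # all sub-models start at zero
    for _ in range(n_iter):
        change = 0.0
        for m in range(p):
            # partial residual: remove the intercept and every other sub-model
            partial = [y[i] - alpha - sum(f[j][i] for j in range(p) if j != m)
                       for i in range(n)]
            new = bin_smoother(columns[m], partial)
            centre = sum(new) / n
            new = [v - centre for v in new]    # keep each sub-model centred
            change += sum((a - b) ** 2 for a, b in zip(new, f[m]))
            f[m] = new
        if change < tol:
            break
    return alpha, f

# Synthetic check: y = 2 + sin(2*pi*x1) + x2^2
rng = random.Random(0)
x1 = [rng.random() for _ in range(500)]
x2 = [rng.random() for _ in range(500)]
y = [2.0 + math.sin(2 * math.pi * a) + b * b for a, b in zip(x1, x2)]
alpha, f = backfit([x1, x2], y)
fitted = [alpha + f[0][i] + f[1][i] for i in range(len(y))]
fit_rmse = math.sqrt(sum((p - t) ** 2 for p, t in zip(fitted, y)) / len(y))
```

Each fitted `f[m]` plays the role of one sub-model; plotting it against its variable gives an influence curve of the kind a unary GAM is meant to expose.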
3) Verifying the sub-models. When a sub-model is built, it is necessary to verify the reliability of the sub-model based on existing experience, mechanism knowledge and related data, i.e., to verify the authenticity of the law obtained by the sub-model. Real laws are often reproducible. In the verification of each sub-model, the reliability of the obtained law is verified by testing whether it can be reproduced with the data in different dimensions. If the law can be reproduced based on the data in most cases or there is no obvious evidence to demonstrate that the law is false, then the sub-model is considered reliable. Otherwise, the reasons for the failure are analyzed and the invalid sub-model is revised.
Specifically, all the sub-models are integrated to form a unary GAM without interactions, and the reliability of the model law is verified by testing whether these laws can be reproduced based on historical production data and whether reasonable theoretical explanations exist for them.
Step 3: Judging the validity of the unary GAM. The interactions between the factors are not considered in the model established by the above steps. For certain industrial process modeling problems, obvious interactions exist between certain factors. If the interactions between these variables are ignored, then the unary GAM is often invalid. Therefore, it is necessary to analyze the interactions between factors, and a binary GAM with interactions is built based on the established model.
Step 4: Building the binary GAM with interactions. First, in the analysis of the interactions between various factors, the following steps are applied: 1) mechanism knowledge is used to estimate the possible interactions between factors; 2) the causality chart is applied to analyze the causes of interactions; and 3) the authenticity of each interaction in different dimensions is verified on the basis of the data.
Then, the corresponding sub-models for the interacting variables are constructed. The ideas of hierarchical linear models [21] and multivariate analysis of variance [22] are used to build the sub-models, and the sub-model with the following form is usually considered:
g(y)=f(xi, xj)=fi(xi)fj(xj) (1)
In this manner, a binary function is decomposed into the product of two unary functions, and these unary functions can be obtained using data mining methods. Assume the obtained data set is denoted as {(xis, xjr, ysr)}.
First, the data are preprocessed: for a continuous independent variable, the range of the variable is divided into several intervals, and the mean of each interval is taken as a level.
Second, the sub-model of xi is constructed based on the data set at each level xjk (k=1, 2, …, m) of xj. The expression is given as
gik(y)=fik(xi), k=1, 2, …, m (2)
Third, the sub-model of xj is constructed based on the data set at each level xil (l=1, 2, …, n) of xi according to the following expression:
gjl(y)=fjl(xj), l=1, 2, …, n (3)
Finally, after the three aforementioned steps, a binary tensor product surface is obtained to facilitate the use of data for verification.
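The product form of formula (1) can be illustrated with a small sketch. It assumes a shortcut that the paper does not state: the responses are first averaged into a complete table over the levels of xi and xj, and the two unary factors are then read off the row means and the column means. The function names and the table values are hypothetical.

```python
def multiplicative_fit(table):
    """table[s][r]: mean response at level s of x_i and level r of x_j."""
    cells = [v for row in table for v in row]
    grand = sum(cells) / len(cells)
    row_effect = [sum(row) / len(row) for row in table]                # f_i at each level of x_i
    col_effect = [sum(col) / len(col) / grand for col in zip(*table)]  # f_j, scaled by grand mean
    return row_effect, col_effect

def product_surface(row_effect, col_effect):
    """Rebuild the two-variable surface f_i(x_i) * f_j(x_j) on the level grid."""
    return [[a * b for b in col_effect] for a in row_effect]

# Hypothetical table that is exactly multiplicative: y = a_s * b_r
a = [1.0, 2.0, 4.0]
b = [0.5, 1.0, 1.5, 2.0]
table = [[a_s * b_r for b_r in b] for a_s in a]
row_effect, col_effect = multiplicative_fit(table)
surface = product_surface(row_effect, col_effect)
```

The two factors are only determined up to a constant scale (one can be multiplied and the other divided by the same number), so only their product should be compared with the data. For a table that is not exactly multiplicative, the difference between `surface` and `table` indicates how well the product form holds.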
Step 5: Building the integrated model. With the aid of mechanism knowledge, prior knowledge and data mining methods, the relationships between the sub-models are explored and the sub-models are combined to form the integrated model. Subsequently, the integrated model is used in prediction and the predicted results are evaluated. The data with large prediction error are identified, the causes are analyzed to determine the factors that most affect the error, and a local correction based on these data is applied. In the local correction, the integrated model is gradually modified via the procedure described in Steps 1 and 2. The local correction also broadens the scope of application of the model.
The predicted results are evaluated using the root mean square error (RMSE, Erms) and the mean absolute percentage error (MAPE, Emap), which are expressed as:
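$E_{\mathrm{rms}}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_i-t_i\right)^{2}}$ (4)
$E_{\mathrm{map}}=\frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i-t_i}{t_i}\right|\times 100\%$ (5)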
where yi is the ith predicted value from the built model, ti is the ith measured value from the production field, and N is the number of records for the sample data.
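The two error measures can be computed as in the following sketch, which assumes the usual definitions of RMSE and MAPE with yi as the predicted and ti as the measured values. The function names and the two-record example are hypothetical.

```python
import math

def rmse(predicted, measured):
    """Root mean square error between predictions y_i and measurements t_i."""
    n = len(measured)
    return math.sqrt(sum((y - t) ** 2 for y, t in zip(predicted, measured)) / n)

def mape(predicted, measured):
    """Mean absolute percentage error, in percent, relative to the measurements."""
    n = len(measured)
    return 100.0 * sum(abs((y - t) / t) for y, t in zip(predicted, measured)) / n

# Hypothetical example: two strips measured at 500, predicted at 510 and 490
e_rms = rmse([510.0, 490.0], [500.0, 500.0])   # 10.0
e_map = mape([510.0, 490.0], [500.0, 500.0])   # 2.0 (percent)
```

RMSE keeps the unit of the predicted quantity, whereas MAPE is dimensionless, which makes it the more convenient measure for comparing models of different properties.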
3 Application verification
With the application and development of information technology, the steel industry has accumulated notably large amounts of production data. Determining how to use these data to establish a mathematical model, and subsequently how to use the model to guide industrial production, has become a hot issue in the steel industry. Thus, the new modeling method was applied to a hot strip rolling process, and prediction models for the mechanical properties and for the deformation resistance of hot rolled strips were established.
3.1 Mechanical property prediction models for hot rolled strips
Based on the aforementioned analysis, the prediction models for the mechanical properties of hot rolled strips were established using the new modeling method. These mechanical property prediction models were based on more than 70000 production data records of hot rolled strips collected at a hot rolling mill in China.
First, tensile strength was selected as the dependent variable. Given the non-veracity with outliers and non-uniform distribution of the industrial data, the dynamic binning method was used to clean the raw data. Second, because of the substantial interactions among carbon, nitrogen and niobium, the contents of Cs (residual carbon), Ns (residual nitrogen), Als (residual aluminium), Nbs (residual niobium), NbC (niobium carbide), NbN (niobium nitride), and AlN (aluminium nitride) were calculated based on the thermodynamic model to consider the interactions among the original contents of carbon, nitrogen, niobium and aluminium. Thus, there are 24 independent variables, and Table 1 summarizes the independent variables along with their statistics.
Table 1 Independent variables along with their statistics
Then, the random forests algorithm was applied to measure the importance of each independent variable, and 12 independent variables were finally selected from the original 24 for modeling: roughing temperature (Tr), finishing entry temperature (Tfe), coiling temperature (Tc), reheating temperature (Tf), finishing delivery thickness (Hfd), finishing delivery temperature (Tfd), residual carbon content (Cs), Si, Mn, P, NbC and NbN.
After that, the identity function was selected as the link function, and the single variable function of each independent variable was estimated using a cubic spline. Thus, the mathematical expression of the tensile strength prediction model was set as follows:
$T_s=\alpha+\sum_{i=1}^{12}S_i(X_i)$ (6)
where α is the intercept, Ts is the tensile strength, and Si(Xi) is the cubic spline function of the influence factor Xi.
At the end of the iterative calculation, α=539.16. Figure 4 shows the relationship curves of the main influence factors on the tensile strength, i.e., the Si(Xi) functions. The influence curve of each sub-model shown in Figure 4 is in accordance with the metallurgical mechanism. For example: 1) The tensile strength increases with the reheating temperature because more niobium dissolves into the austenite at a higher reheating temperature, resulting in more carbonitride precipitation in the subsequent cooling process. 2) The tensile strength increases with increasing residual carbon, silicon and manganese because each of these elements has a strong solid-solution strengthening effect. 3) Larger mass fractions of niobium nitride and niobium carbide precipitates correspond to greater tensile strength because the carbonitride precipitates at the grain boundaries of the deformed austenite, pinning the grain boundaries and preventing the austenite from recrystallizing. Therefore, the deformation effect of austenite is maintained, supplying additional nucleation sites for the subsequent ferrite transformation and refining the ferrite grains, which contributes to grain refinement strengthening.
Figure 4 Effects of main influence factors on tensile strength of hot rolled strips
According to formula (6) and the single variable function in Figure 4, the prediction model for the tensile strength of hot rolled strips can be built. The function values Si(Xi) of each influence factor can be interpolated based on the spline function shown in Figure 4.
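Evaluating such an additive model amounts to looking up each factor on its curve and adding the results to the intercept. The sketch below assumes piecewise-linear interpolation between tabulated points as a simple stand-in for evaluating the spline. The intercept 539.16 is the value reported above, whereas the curve points for Tc and Mn are hypothetical numbers chosen for illustration, not values read from Figure 4.

```python
from bisect import bisect_right

def interp(knots, x):
    """Piecewise-linear interpolation through sorted (x, s) points, clamped at the ends."""
    xs = [k[0] for k in knots]
    if x <= xs[0]:
        return knots[0][1]
    if x >= xs[-1]:
        return knots[-1][1]
    i = bisect_right(xs, x)
    (x0, s0), (x1, s1) = knots[i - 1], knots[i]
    return s0 + (s1 - s0) * (x - x0) / (x1 - x0)

def predict_additive(alpha, curves, record):
    """alpha plus the sum of the sub-model contributions S_i(X_i) for one record."""
    return alpha + sum(interp(curves[name], value) for name, value in record.items())

# Hypothetical tabulated sub-model curves: (factor value, contribution)
curves = {
    "Tc": [(550.0, 8.0), (600.0, 0.0), (650.0, -8.0)],
    "Mn": [(0.2, -20.0), (1.0, 0.0), (1.6, 30.0)],
}
ts = predict_additive(539.16, curves, {"Tc": 575.0, "Mn": 1.3})
```

Only two factors are tabulated here to keep the example short; a full model would carry one curve per selected influence factor.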
Similarly, the prediction model of yield strength can be obtained by the proposed method. The model is set as follows:
$Y_s=\beta+\sum_{i}S_i(X_i)$ (7)
where Ys is the yield strength. Figure 5 shows the relationship curves of selected main influence factors on yield strength, and β=491.67. The laws of these relationship curves are similar to those of the tensile strength, and they are all in accordance with the metallurgical mechanism.
Using the built models, the predicted and measured values of tensile strength and yield strength are compared in Figure 6. The RMSE and the MAPE of each model are calculated based on formulas (4) and (5). The results, which are presented in Table 2, show that the mechanical property prediction models based on the new modeling method have high practicability, reliability and generalization.
In order to further verify the performance of the new models, the results predicted by an artificial neural network (ANN) are compared with those of the new models for the same sample data [23]. The comparison of the prediction accuracy for tensile strength and yield strength is shown in Table 3. For tensile strength, the share of data with a relative error (Er) within ±6% is 96.58% for the new model and 93.67% for the ANN model, which means that the new model has higher prediction accuracy. In addition, the calculation process does not need any manual intervention or correction, the predicted values deviate from the measured values only within a small range, and the models have strong adaptability. Moreover, the new models can explain the effect of each independent variable on the dependent variables, which the artificial neural network model cannot.
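The hit-rate statistic used in this comparison, i.e., the share of records whose relative error lies within a given band, can be computed as sketched below. The function name and the four sample records are hypothetical; the relative error is assumed to be taken with respect to the measured value.

```python
def share_within(predicted, measured, limit_pct=6.0):
    """Percentage of records whose relative error is within +/- limit_pct percent."""
    hits = sum(1 for y, t in zip(predicted, measured)
               if abs(y - t) / t * 100.0 <= limit_pct)
    return 100.0 * hits / len(measured)

# Hypothetical records: relative errors of 0%, 8%, 5% and 20%
share = share_within([500.0, 540.0, 475.0, 600.0], [500.0, 500.0, 500.0, 500.0])
```

Unlike the MAPE, this measure is insensitive to how large the errors outside the band are, so the two statistics complement each other.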
Figure 5 Effects of main influence factors on yield strength of hot rolled strips
Figure 6 Predicted values and measured values of mechanical properties:
Table 2 Errors of mechanical property prediction models
Table 3 Prediction accuracy of tensile strength and yield strength by two models
3.2 Deformation resistance model of hot rolled strips
The deformation resistance model is the core of the rolling force model, and its prediction accuracy directly affects the rolling schedule setting and the thickness accuracy of the strip products [24-27]. Much progress has been made on deformation resistance models [28-31]. For example, TAO et al [28] predicted the flow behavior using a modified Arrhenius model and an artificial neural network model, respectively. The predictions of these constitutive models were compared using statistical measures, which indicate that the modified Arrhenius model is limited by its relatively low prediction accuracy under some deformation conditions, while the ANN model achieves very high prediction accuracy under all deformation conditions. SAMANTARAY et al [29] proposed a thermo-viscoplastic constitutive model to describe the flow saturation at a higher level of strain over a wide range of strain and temperatures. YU et al [30] presented a mathematical model of deformation resistance with high fitting precision for the flow stress of V-5Cr-5Ti alloys using non-linear regression based on experimental data. However, most studies of the deformation resistance model have been carried out based on experimental data rather than production data. In fact, production data contain a large amount of useful information, and a deformation resistance model established on the basis of these field-measured data is more suitable for practical rolling production.
Using the new modeling method, a deformation resistance model for hot rolled strips was established. This modeling case was conducted for the 1880 hot strip mill of Baosteel Corporation, China. The steel grades cover carbon manganese steel, microalloy steel, alloy steel, etc., and 50419 strip data records produced at the 1880 hot strip mill were collected over a period of time. Then, 6259 data records were selected randomly from them to build the model.
The selected independent variables of the deformation resistance model include strain (f), strain rate (φ), and rolling temperature (T), and 8 chemical compositions including C, Si, Mn, Ni, Cr, Nb, Ti and Mo. The deformation resistance is selected as the dependent variable. The strain, strain rate and actual deformation resistance are calculated using the hot rolling process data for each strip data record. The calculation formulas are described as follows [32, 33]:
(8)
(9)
(10)
where fi is the strain, φi is the strain rate, is the observed value of the deformation resistance, ri is the reduction for Fi stand, Hi and hi are the respective entrance thickness and delivery thickness of the Fi stand, Vi is the roll speed of the Fi stand, R′i is the roll flattened radius of the Fi stand,is the measured roll force of the Fi stand, w is the strip width, Qpi is the stress state influence coefficient of Fi stand, and subscript i represents each finishing stand pass for each strip.
To reduce the dimension of the built model, a comprehensive variable X is proposed for the original eight chemical compositions, i.e., $X=\sum_{i=1}^{8}a_i x_i$, where xi (i=1, …, 8) represents the contents of the eight components C, Si, Mn, Ni, Cr, Nb, Ti and Mo, respectively, and ai is the influence coefficient of each component.
In this case, the logarithmic function is selected as the link function, and the single variable function of each independent variable is estimated using a cubic spline function. Hence, the model expression of deformation resistance is set as follows:
$$\log(\sigma) = \alpha + S_x(X) + S_f(f) + S_\varphi(\varphi) + S_T(T) \qquad (11)$$
where σ denotes the deformation resistance, log(·) is the logarithmic function, α is the intercept, and Sx(X), Sf(f), Sφ(φ) and ST(T) are the cubic spline functions of the comprehensive variable of the chemical compositions, strain, strain rate and rolling temperature, respectively. The fitted intercept is α=5.291. The single variable function of each independent variable is shown in Figure 7.
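A minimal sketch of how an additive model of this shape can be fitted is given below. It is a simplification under stated assumptions, not the authors' procedure: it regresses the logarithm of the deformation resistance on unpenalized truncated-power cubic spline bases by ordinary least squares, rather than fitting a GAM with a log link by penalized likelihood. The function names, the knot placement and the dictionary layout are all hypothetical.

```python
import numpy as np

def cubic_basis(u, knots):
    """Truncated-power cubic spline basis: u, u^2, u^3 and (u - k)^3_+ per knot."""
    u = np.asarray(u, dtype=float)
    cols = [u, u**2, u**3] + [np.clip(u - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_additive_log_model(features, y, n_knots=3):
    """Least-squares fit of log(y) = alpha + sum_j S_j(x_j).

    features: dict name -> 1-D array (non-constant); y: positive 1-D array.
    Each S_j is centred on the training data so that alpha is identifiable.
    """
    log_y = np.log(np.asarray(y, dtype=float))
    blocks, spec = [], {}
    for name, x in features.items():
        x = np.asarray(x, dtype=float)
        lo, scale = x.min(), x.max() - x.min()
        u = (x - lo) / scale  # rescale to [0, 1] for better conditioning
        knots = np.quantile(u, np.linspace(0.0, 1.0, n_knots + 2)[1:-1])
        basis = cubic_basis(u, knots)
        mean = basis.mean(axis=0)
        blocks.append(basis - mean)
        spec[name] = (lo, scale, knots, mean)
    design = np.column_stack([np.ones_like(log_y)] + blocks)
    coef, *_ = np.linalg.lstsq(design, log_y, rcond=None)
    model = {"alpha": coef[0], "terms": {}}
    pos = 1
    for (name, (lo, scale, knots, mean)), block in zip(spec.items(), blocks):
        width = block.shape[1]
        model["terms"][name] = (lo, scale, knots, mean, coef[pos:pos + width])
        pos += width
    return model

def predict(model, features):
    """Evaluate exp(alpha + sum_j S_j(x_j)) for new feature arrays."""
    n = len(next(iter(features.values())))
    eta = np.full(n, model["alpha"], dtype=float)
    for name, (lo, scale, knots, mean, beta) in model["terms"].items():
        u = (np.asarray(features[name], dtype=float) - lo) / scale
        eta += (cubic_basis(u, knots) - mean) @ beta
    return np.exp(eta)
```

Because every spline block is centred on the training data, the fitted `alpha` equals the mean of log(y), which mirrors the role of the intercept α in the model expression.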
Figure 7 displays the influence of the independent variables on the deformation resistance. The following observations are noted: 1) The deformation resistance decreases with increasing rolling temperature, which is characteristic of general austenite rolling. 2) The deformation resistance initially increases rapidly with increasing strain rate; when the strain rate exceeds 60 s⁻¹, it tends to saturate and remains nearly unchanged. 3) The deformation resistance decreases with increasing strain. This is not consistent with the general view, obtained from Gleeble thermal simulation tests, that the hot rolling deformation resistance increases with strain. However, the trend reflects the actual finish rolling process, and it is corroborated by the hot rolling deformation resistance models currently used by Mitsubishi Corporation, Japan, and many other electrical suppliers, in which the influence of strain is usually expressed as an exponential term whose exponent is negative, with a typical value of about -0.08. A possible reason is that dynamic recrystallization occurs during hot rolling, and a larger strain promotes more dynamic recrystallization. It should be noted that if the cumulative strain of the finishing passes is used in the analysis instead, the deformation resistance is found to increase with increasing cumulative strain. 4) The deformation resistance increases with the comprehensive variable of chemical compositions, showing that the effect of the integrated chemical composition on deformation resistance is positive.
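To give a sense of scale for the negative strain exponent in observation 3), the sketch below assumes that the strain enters as a multiplicative power term, i.e., deformation resistance proportional to f raised to an exponent of about -0.08. This power form is an interpretation of the "exponential term" mentioned above, and the helper name is hypothetical.

```python
def strain_factor_ratio(f_from, f_to, exponent=-0.08):
    """Ratio sigma(f_to) / sigma(f_from) when sigma is proportional to f**exponent."""
    if f_from <= 0 or f_to <= 0:
        raise ValueError("strain values must be positive")
    return (f_to / f_from) ** exponent

# Doubling the per-pass strain under the assumed power form.
ratio = strain_factor_ratio(0.2, 0.4)
print(f"{ratio:.3f}")  # about 0.946, i.e. roughly a 5% lower deformation resistance
```

Under this assumption the strain effect is mild compared with the temperature and strain rate effects, which is in line with the small magnitude of the exponent.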
According to formula (11), Figure 7 and α=5.291, the mathematical expression of the model is given as follows:
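$$\sigma = \exp\left(5.291 + S_x(X) + S_f(f) + S_\varphi(\varphi) + S_T(T)\right) \qquad (12)$$
where σ is the predicted deformation resistance, and the function values of Sx(X), Sf(f), Sφ(φ) and ST(T) can be obtained by interpolating the spline functions shown in Figure 7.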
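Evaluating the model from tabulated curves can be sketched as follows. The grids and curve values in `CURVES` are hypothetical placeholders, not values read from Figure 7, and linear interpolation with `numpy.interp` stands in for proper spline interpolation. Only the intercept 5.291 is taken from the text.

```python
import numpy as np

ALPHA = 5.291  # intercept reported in the text

# Hypothetical tabulated curves (grid, S-values); NOT read from Figure 7.
CURVES = {
    "X": (np.array([0.0, 0.5, 1.0]), np.array([-0.10, 0.00, 0.10])),
    "f": (np.array([0.1, 0.3, 0.6]), np.array([0.05, 0.00, -0.04])),
    "phi": (np.array([1.0, 60.0, 150.0]), np.array([-0.20, 0.05, 0.05])),
    "T": (np.array([850.0, 950.0, 1050.0]), np.array([0.15, 0.00, -0.15])),
}

def predict_resistance(X, f, phi, T, curves=CURVES, alpha=ALPHA):
    """Return exp(alpha + S_x(X) + S_f(f) + S_phi(phi) + S_T(T)).

    Each S is linearly interpolated on its grid; np.interp holds the end
    values constant outside the grid, so no extrapolation takes place.
    """
    values = {"X": X, "f": f, "phi": phi, "T": T}
    eta = alpha
    for name, (grid, s_values) in curves.items():
        eta = eta + np.interp(values[name], grid, s_values)
    return np.exp(eta)
```

Holding the end values constant outside the tabulated range is a deliberate choice in this sketch: it avoids extrapolating the smooth functions beyond the conditions covered by the data.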
Figure 7 Effects of influence factors on deformation resistance of hot rolled strips:
To evaluate the accuracy of the newly built model, the deformation resistance of the 50419 sample strips was predicted using both the new model (referred to as model I) and the Baosteel 1880 online deformation resistance model (referred to as model II). The MAPE and RMSE of the two models are shown in Table 4, indicating that the prediction accuracy of model I is higher than that of model II.
Table 4 Errors of deformation resistance models
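The two error measures can be computed as in the short sketch below, assuming the usual definitions of the mean absolute percentage error (in percent) and the root mean square error. The function names are illustrative.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.mean(np.abs((predicted - actual) / actual))

def rmse(actual, predicted):
    """Root mean square error, in the same unit as the inputs."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.sqrt(np.mean((predicted - actual) ** 2))
```

MAPE is scale-free and therefore comparable across steel grades with different strength levels, whereas RMSE keeps the unit of the deformation resistance and penalizes large individual errors more heavily.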
4 Conclusions and future research
This paper proposes a system modeling method that combines industrial big data with process mechanism analysis under the framework of GAM. The approach pursues practicality by balancing generalization and precision, which offers a new concept for industrial process modeling. The main strategy of the new method is to divide a complex high-dimensional nonlinear problem into several sub-problems. First, the influence factors are screened by combining mechanism knowledge, data mining methods and prior knowledge. Second, the steps for building the unary GAM without interactions are specified, including cleaning the data, building the sub-models and verifying the sub-models. Third, the interactions between the various factors are explored, and the sub-models for the interactions are constructed. Finally, the relationships between the sub-models are analyzed, and these sub-models are combined to form the integrated model.
Applying the proposed modeling method, prediction models for the mechanical properties of hot rolled strips were established based on production data. The relationship curves of the various influence factors on the tensile strength and yield strength were obtained, and the trends of these curves are in accordance with the metallurgical mechanism. Using the obtained models, the tensile strength and yield strength of hot rolled strips were predicted on a test set. The prediction results indicate that the obtained models have good generalization and high prediction accuracy. Moreover, using the proposed modeling method, a deformation resistance model was established based on production data collected from the 1880 finishing mill at Baosteel Corporation. The influence curves of strain, strain rate and rolling temperature on the deformation resistance were obtained. Verification with actual industrial data demonstrates that the accuracy of the new model is higher than that of the Baosteel 1880 online deformation resistance model. The new model offers high calculation precision and can be applied in online process control of hot rolling.
In the future, we plan to apply this new modeling method to additional complex industrial process modeling scenarios and to further improve the related methods, such as the influence factors analysis, sub-model building, reliability verification and the interaction analysis.
References
[1] LIU G X, JIA L N, KONG B, FENG S B, ZHANG H R, ZHANG H. Artificial neural network application to microstructure design of Nb-Si alloy to improve ultimate tensile strength [J]. Materials Science & Engineering A, 2017, 707: 452-458.
[2] LI Quan-shan, LI Da-zi, CAO Liu-lin. Modeling and optimum operating conditions for FCCU using artificial neural network [J]. Journal of Central South University, 2015, 22(4): 1342-1349.
[3] MOHANTY I, SARKAR S, JHA B, DAS S, KUMAR R. Online mechanical property prediction system for hot rolled IF steel [J]. Ironmaking Steelmaking, 2014, 41(8): 618-627.
[4] ZHOU P, YUAN M, WANG H, CHAI T. Data-driven dynamic modeling for prediction of molten iron silicon content using elm with self-feedback [J]. Mathematical Problems in Engineering, 2015, 2015(9): 1-11.
[5] XU Ke, AI Yong-hao, WU Xiu-yong. Application of multi-scale feature extraction to surface defect classification of hot-rolled steels [J]. International Journal of Minerals Metallurgy and Materials, 2013, 20(1): 37-41.
[6] ASADI S, SHAHRABI J, ABBASZADEH P, TABANMEHR S. A new hybrid artificial neural networks for rainfall–runoff process modeling [J]. Neurocomputing, 2013, 121(18): 470-480.
[7] XIANG Y, LIU Y. Mechanism modelling of shot peening effect on fatigue life prediction [J]. Fatigue Fract Eng Mater Struct, 2010, 33(22): 116-125.
[8] dos SANTOS A A, BARBOSA R. Model for microstructure prediction in hot strip rolled steels [J]. Steel Res Int, 2010, 81(1): 55-63.
[9] YANG Biao, SUN Jun, LI Wei, PENG Jin-hui, LI You-ling, LUO Hui-long, GUO Sheng-hui, ZHANG Zhu-ming, SU He-zhou, SHI Ya-ming. Numerical modeling dynamic process of multi-feed microwave heating of industrial solution media [J]. Journal of Central South University, 2016, 23(12): 3192-3203.
[10] GUPTA A, GOYAL S, PADMANABHAN K A, SINGH A K. Inclusions in steel: micro–macro modelling approach to analyse the effects of inclusions on the properties of steel [J]. International Journal of Advanced Manufacturing Technology, 2015, 77(1-4): 565-572.
[11] CAO Jian-guo, WANG Tian-cong, LI Hong-bo, QIAO Yu, WEN Dun, ZHOU Yun-song. High-temperature constitutive relationship of non-oriented electrical steel based on modified Arrhenius model [J]. Journal of Mechanical Engineering, 2016, 52(4): 90-96, 102. (in Chinese)
[12] SUN Yi-kang. Model and control of cold and hot rolling mill for sheets and strips [M]. Beijing: Metallurgical Industry Press, 2010. (in Chinese)
[13] CAO Jian-guo, ZHANG Jie, ZHANG Shao-jun. Rolling equipment and automotive control [M]. Beijing: Chemistry Industrial Press, 2010. (in Chinese)
[14] DAS S K. Neural network modelling of flow stress and mechanical properties for hot strip rolling of TRIP steel using efficient learning algorithm [J]. Ironmaking & Steelmaking, 2013, 40(4): 298-304.
[15] SUI X, LV Z. Prediction of the mechanical properties of hot rolling products by using attribute reduction ELM [J]. International Journal of Advanced Manufacturing Technology, 2016, 85(5-8): 1395-1430.
[16] LI Wei-gang, YANG Wei, ZHAO Yun-tao, HU Heng-fa. Mechanical property prediction model of hot-rolled strip via big data and metallurgical mechanism analysis [J]. Journal of Iron and Steel Research, 2018, 30(4): 301-308.
[17] HORE S, DAS S K, BANERJEE S. An adaptive neuro-fuzzy inference system-based modelling to predict mechanical properties of hot-rolled TRIP steel [J]. Ironmaking & Steelmaking, 2016, 44(9): 1-10.
[18] YANG Wei, LI Wei-gang, ZHAO Yun-tao, YAN Bao-kang, WANG Wen-bo. Mechanical property prediction of steel and influence factors selection based on random forests [J]. Iron & Steel, 2018, 3: 44-49. (in Chinese)
[19] HAN J, KAMBER M. Data mining: Concepts and techniques [M]. San Francisco: Morgan Kaufmann, 2006.
[20] HASTIE T J, TIBSHIRANI R J. Generalized additive models [J]. Stat Sci, 1986, 1: 297-310.
[21] BRYK A S, RAUDENBUSH S W. Hierarchical linear models in social and behavioral research: Applications and data analysis methods [M]. Newbury Park, CA: Sage Publications, 1992.
[22] CHATTERJEE S, HADI A S. Regression Analysis by Example [M]. 5th ed. New York: Wiley, 2012.
[23] WU Si-wei, LIU Zhen-yu, ZHOU Xiao-guang, SHI Nai-an. Prediction of mechanical properties and process parameters selection based on big data [J]. Journal of Iron and Steel Research, 2016, 28(12): 1-4. (in Chinese)
[24] WANG Jian, WANG Xiao-fan, YANG Hai-tao, YU Chao, XIAO Hong. A new mathematical model for predicting flow stress of X70HD under hot deformation [J]. Journal of Central South University, 2015, 22(6): 2052-2059.
[25] GUPTA A K, SINGH S K, REDDY S, HARIHARAN G. Prediction of flow stress in dynamic strain aging regime of austenitic stainless steel 316 using artificial neural network [J]. Mater Des, 2012, 35(223): 589-595.
[26] LI B, NAUMAN J. Significance and development of a next-generation Level 2 model as a metallurgical system [C]// Materials Science and Technology Conference and Exhibition, MS and T'08. Pittsburgh, 2008: 1066-1077.
[27] MAHESHWARI A K. Prediction of flow stress for hot deformation processing [J]. Computational Materials Science, 2013, 69(1): 350-358.
[28] TAO Zhi-jun, YANG He, LI Heng, MA Jun, GAO Peng-fei. Constitutive modeling of compression behavior of TC4 tube based on modified Arrhenius and artificial neural network models [J]. Rare Metals, 2016, 35(2): 162-171.
[29] SAMANTARAY D, PATEL A, BORAH U, ALBERT S K, BHADURI A K. Constitutive flow behavior of IFAC-1 austenitic stainless steel depicting strain saturation over a wide range of strain rates and temperatures [J]. Materials and Design, 2014, 56: 565-571.
[30] YU Xing-zhe, SONG Yue-qing, CUI Shun, LI Ming, LI Zeng-de. The mathematical model research of deformation resistance of V-5Cr-5Ti alloys [J]. Journal of Plasticity Engineering, 2008, 15(6): 122-124.
[31] ZHOU Ji-hua, GUAN Ke-zhi. Plastic deformation resistance [M]. Beijing: China Machine Press, 1989. (in Chinese)
[32] REN Yong, CHEN Xiao-ru. Mathematical model of rolling process [M]. Beijing: Metallurgical Industry Press, 2008. (in Chinese)
[33] LIU Xiang-hua, HU Xian-lei, DU Lin-xiu. Math models of rolling parameters and its applications [M]. Beijing: Chemistry Industrial Press, 2007. (in Chinese)
(Edited by FANG Jing-hua)
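Chinese abstract
Modeling of the hot strip rolling process under the framework of generalized additive models
Abstract: Under the framework of generalized additive models, this study combines industrial big data with process mechanism analysis to propose a new modeling method that yields practical models balancing generalization ability and prediction precision. The new modeling method comprises four main aspects. First, the influence factors are screened using mechanism knowledge and data-mining methods. Second, the steps for building the unary generalized additive model without interactions are proposed, including cleaning the data, building the sub-models and verifying the sub-models. Subsequently, the interactions among the influence factors are studied, and the binary generalized additive model with interactions is constructed. Finally, the relationships among the sub-models are analyzed, and the integrated model is built. Based on the proposed modeling method, a mechanical property prediction model and a deformation resistance model for hot-rolled strips are established. Verification with actual industrial data shows that the new models have good prediction precision; the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results show that the proposed modeling method offers a new approach to industrial process modeling.
Key words: industrial big data; generalized additive model; mechanical property prediction; deformation resistance prediction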
Foundation item: Project(51774219) supported by the National Natural Science Foundation of China
Received date: 2018-05-11; Accepted date: 2018-12-18
Corresponding author: LI Wei-gang, PhD, Professor; Tel: +86-18802781626; E-mail: liweigang.luck@foxmail.com; ORCID: 0000-0003-3268-127X