1 Preface

Establishing a reliable and effective monitoring system is the key to raising the level of intelligent, information-based design of structural platforms and to monitoring and managing the structural damage and service life of in-service aircraft [1]. Selecting sensors sensibly and arranging the sensor network optimally is the basic problem that must be solved when building a structural health monitoring system [2]. In general, more sensors must be installed to perceive the state of an aircraft structure in greater detail. On the one hand, the more sensors are placed in the structure, the more accurate the structural state information becomes. However, because of cost, added weight and structural constraints, placing a large number of state sensors in an aircraft structure is currently unrealistic, and too many sensors introduce a large amount of redundant data. On the other hand, an unsuitably positioned sensor may compromise the validity of its data and reduce the precision of the monitoring system.

A growing number of researchers, both domestic and international, have studied sensor network optimization and proposed a variety of sensor layout optimization methods [3,4,5]. Each method has its own advantages and scope of application, and can be used to optimize the number, position and weight of the sensors in a network. So far, sensor network optimization has mainly been applied to bridges, large buildings and similar structures. In aircraft structures, strain sensor networks are still arranged mainly according to engineering experience with the structural load transmission paths, and there is little research on optimizing them.

How to optimize the number, location and network form of the sensors so as to obtain true and accurate structural state data and achieve global perception of the structure has become an important topic in structural health monitoring. At present, the main quantity monitored in aircraft structural health monitoring is load/strain information [6]. This paper therefore studies the optimal layout of strain sensors on a wing box; the research route is shown in Fig. 1.

Fig. 1 Research route. The flowchart runs from start to end through: original monitoring points, apply training load, build training set and test set, correlation coefficient method, optimize the original monitoring points, multiple linear regression, load inversion, and comparison of results.

The main steps are as follows: the load matrix is generated according to the actual load of the wing box, and the training set and test set are formed; the strain values of the original monitoring points under the load matrix are obtained from the finite element model of the monitored structure and preliminarily sorted; the correlation coefficient method is used to screen the large volume of original data, finding the relationships between the monitoring points and removing monitoring points that are highly correlated with one another; and multiple linear regression is used to obtain the error between the actual load and the inverted load, from which the final number and locations of the monitoring points are determined.

2 Optimization and Inversion Method of Monitoring Points

To balance the safety and economy of the aircraft structure, a sensor monitoring system must ensure flight safety with as little equipment as possible, without degrading the validity of the monitoring data, the stability of data transmission, the data processing speed or other performance indexes. At the same time, too much data is prone to the curse of dimensionality. Removing irrelevant features reduces the difficulty of structural health monitoring, simplifies the model and lowers the computational cost. Achieving this requires optimizing the sensor network of the aircraft structure.

Mathematically, the optimal design of an aircraft structure sensor network amounts to selecting, by an optimization method, a small number of monitoring positions from many candidate sensor locations without significantly degrading the monitoring effect, so that monitoring performance is balanced against cost and data processing load. The ultimate goal is a network with as few sensor types and sensors as possible, placed at the “hot spots” wherever possible, so that the reliability and accuracy of the monitoring results are not reduced.

The optimal layout of structural sensors can be treated as a feature selection problem in feature engineering: a small number of useful features (i.e. the locations of monitoring points) are selected from a large number of candidates. Not all features are equally useful. Attributes that are not relevant to the problem need to be deleted; some features are more important than others, and some are redundant. Feature selection is the automatic selection of the subset of features that is most important to the problem.

In this paper, the monitoring points are optimized with the correlation coefficient method of feature selection, which eliminates the many closely correlated monitoring points; the error between the actual load and the inverted load is then obtained by multiple linear regression.

2.1 Correlation Coefficient Method

The correlation coefficient is a statistical index first proposed by the statistician Karl Pearson and is mainly used to study the degree of correlation between two variables. It is defined differently for different objects of study [7]. The Pearson correlation coefficient, also known simply as the correlation coefficient or simple correlation coefficient, measures the degree of linear correlation between two variables. It is usually denoted by the letter r and defined as in Eq. (1).

$$ r(X,Y) = \frac{Cov(X,Y)}{\sqrt{Var[X]\,Var[Y]}} $$
(1)

In the formula, Cov(X, Y) is the covariance between variable X and variable Y; Var[X] is the variance of variable X; Var[Y] is the variance of variable Y.
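As a minimal sketch, the Pearson coefficient can be computed directly from its standard definition, with both variances under the square root. The function name `pearson_r` and the strain readings are hypothetical and serve only as an illustration; they are not data from this study.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson coefficient: Cov(X, Y) / sqrt(Var[X] * Var[Y])."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    cov = np.mean(dx * dy)      # population covariance of X and Y
    var_x = np.mean(dx ** 2)    # population variance of X
    var_y = np.mean(dy ** 2)    # population variance of Y
    return cov / np.sqrt(var_x * var_y)

# Hypothetical strain readings at two monitoring points over five load cases.
strain_a = [100.0, 220.0, 310.0, 405.0, 490.0]
strain_b = [52.0, 108.0, 160.0, 199.0, 251.0]

r = pearson_r(strain_a, strain_b)
print(f"r = {r:.4f}")
```

The same value is returned by `np.corrcoef(strain_a, strain_b)[0, 1]`, because the normalization factor cancels between numerator and denominator.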

To analyze correlation with the correlation coefficient, the variables to be compared must first be defined. In the strain/load sensor monitoring network, each original strain monitoring point is treated as a variable, and its strain values under the different load conditions are the observations of that variable. Computing the correlation coefficient between each pair of variables then gives the degree of correlation between the strain responses of the monitoring points over the load conditions.

The strain responses of different monitoring points are correlated to different degrees. Some monitoring points have strain responses that are highly correlated with one another across the load conditions. In that case only one of them needs to be retained: the strain responses of the other highly correlated points are redundant data, so those points are redundant monitoring points and can be removed.
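The idea of redundant monitoring points can be illustrated with a small correlation matrix. The sketch below assumes a strain matrix with one row per load condition and one column per monitoring point. The data, the variable names and the 0.99 threshold are illustrative assumptions.

```python
import numpy as np

# Hypothetical loads at two loading points over six load conditions.
load_1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
load_2 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])

# Hypothetical strain matrix: rows are load conditions, columns are points.
strain = np.column_stack([
    3.0 * load_1,        # point 0
    1.5 * load_1,        # point 1: proportional to point 0, hence redundant
    2.0 * load_2,        # point 2
    load_1 + load_2,     # point 3: responds to both loads
])

# Each column is one variable, so rowvar=False.
corr = np.corrcoef(strain, rowvar=False)

threshold = 0.99  # assumed screening threshold
n_points = corr.shape[0]
redundant_pairs = [
    (i, j)
    for i in range(n_points)
    for j in range(i + 1, n_points)
    if abs(corr[i, j]) > threshold
]
print(redundant_pairs)
```

Only points 0 and 1 form a redundant pair here, so one of the two could be dropped without losing information about the load.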

The purpose of optimizing the strain/load monitoring network is to minimize the number of strain monitoring points while keeping the accuracy of load identification within practical requirements. A set of strain monitoring points with the lowest mutual correlation is therefore selected as the final characteristic strain monitoring points, since these points best represent the strain response of the structure under the various load conditions. A strain sensor network arranged at these optimized points allows the aircraft structural load to be identified more efficiently and economically.

2.2 Multiple Linear Regression

In regression analysis, a model with two or more independent variables is called multiple regression. In practice a phenomenon usually depends on several factors, so predicting or estimating the dependent variable from an optimal combination of several independent variables is more effective and more realistic than using a single independent variable. Multivariate linear regression finds the relationship between a dependent variable and several independent variables, and is widely used for prediction and inversion [8].

The classical expression of multivariate linear regression is:

$$ y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \cdots + \beta_{n} x_{n} $$
(2)

Here the dependent variable \(y\) and the independent variables \(x_{1}, x_{2}, \ldots, x_{n}\) are known, and the goal of multivariate linear regression is to obtain the regression coefficients \(\beta_{0}, \beta_{1}, \ldots, \beta_{n}\). The predicted and actual values for different dependent variables are shown in Table 1.
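One common way to obtain the coefficients of Eq. (2) is ordinary least squares. The following is a sketch under that assumption; the text does not state which solver was used, and the helper names `fit_linear_regression` and `predict` as well as the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear_regression(X, y):
    """Return [beta_0, beta_1, ..., beta_n] by ordinary least squares."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # A leading column of ones carries the intercept beta_0.
    A = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict(beta, X):
    """Evaluate y = beta_0 + beta_1 x_1 + ... + beta_n x_n for each row of X."""
    X = np.asarray(X, dtype=float)
    return beta[0] + X @ beta[1:]

# Synthetic check: noise-free data generated from known coefficients.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(30, 3))
true_beta = np.array([0.5, 2.0, -1.0, 0.25])
y_demo = true_beta[0] + X_demo @ true_beta[1:]

beta_hat = fit_linear_regression(X_demo, y_demo)
print(np.round(beta_hat, 4))
```

Because the synthetic data contain no noise, the fitted coefficients should reproduce `true_beta` to within floating-point precision. With measured strains the fit would only be approximate.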

Table 1 Predictive value and actual value

3 Optimization of Monitoring Point Layout Based on a Certain Wing Box

3.1 Brief Description of Model

The wing box is connected to the load-bearing wall and to the actuators by bolts. The test load is applied by two actuators on the right side of the wing box. During the test the wing box is installed on the load-bearing wall upside down (upper wing surface facing down). The test installation is shown in Fig. 2.

Fig. 2 Wing box schematic diagram. The wing box is attached to the bearing wall, with 2 actuators at the bottom of its free end.

A finite element model is established for the wing box. As shown in Fig. 3, the parts are simplified as follows: the wing skin, rib panels and webs are modeled with CQUAD4 elements, and the stringers, rib chords and spar caps with CROD elements.

Fig. 3 Finite element model of wing box test piece. The model is broader at one end, tapers towards the center, and beyond a certain point continues straight like a cuboid.

3.2 Construction of Original Monitoring Point and Model Data

Excluding the loaded transition section and the restrained end of the wing box, there are 228 CQUAD4 elements (each providing strain in 3 directions) and 312 CROD elements, giving a total of 996 monitoring points that can be used to monitor the wing skin, rib panels, webs, stringers, rib chords and spar caps.
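The total of 996 can be checked directly, assuming (the text does not say so explicitly) that each CROD element contributes a single axial strain value. The symbol \(N_{\text{points}}\) is introduced here only for this check:

$$ N_{\text{points}} = 228 \times 3 + 312 \times 1 = 684 + 312 = 996 $$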

In the experiment, the peak-valley range of loading point 1# is −5811 to 53,761.4 N, and that of loading point 2# is −2241.1 to 20,761.3 N. Ninety loading conditions are randomly generated for loading points 1# and 2#: 80 for training and 10 for testing. The distribution of the training set and test set data is shown in Fig. 4.
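A load matrix of this kind could be generated as sketched below. The text only states that the conditions are random, so the uniform and independent sampling, the seed and the function name `generate_load_cases` are assumptions.

```python
import numpy as np

# Peak-valley ranges quoted in the text, in newtons.
LOAD_1_RANGE = (-5811.0, 53761.4)
LOAD_2_RANGE = (-2241.1, 20761.3)

def generate_load_cases(n_cases=90, n_train=80, seed=0):
    """Draw random load pairs and split them into training and test sets."""
    rng = np.random.default_rng(seed)
    # Assumption: each load is sampled uniformly and independently in its range.
    load_1 = rng.uniform(*LOAD_1_RANGE, size=n_cases)
    load_2 = rng.uniform(*LOAD_2_RANGE, size=n_cases)
    loads = np.column_stack([load_1, load_2])
    # The first n_train cases are used for training, the rest for testing.
    return loads[:n_train], loads[n_train:]

train_loads, test_loads = generate_load_cases()
print(train_loads.shape, test_loads.shape)
```

Each row of the returned arrays is one loading condition, with the load at point 1# in the first column and the load at point 2# in the second.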

Fig. 4 Load distribution diagram. Scatter plot of the load of actuator 2 versus the load of actuator 1 (axes in multiples of 10^4), with the training set and the test set plotted separately.

3.3 Monitoring Point Optimization

Based on the finite element model of the wing box, the load data are applied to joints 1# and 2#, and the 996 monitoring points are sorted in descending order of their maximum strain. The correlation coefficients between the strain data of the monitoring points are then calculated, and monitoring points whose correlation coefficient exceeds 0.99 in absolute value are merged, with the redundant ones deleted. The remaining 13 monitoring points are shown in Fig. 5.
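One plausible reading of this screening step is a greedy pass over the sorted points. This is a sketch, not necessarily the authors' exact procedure: the function name `screen_monitoring_points`, the use of the absolute value of strain for ranking, and the rule of keeping the higher-strain point of each correlated group are assumptions.

```python
import numpy as np

def screen_monitoring_points(strain, threshold=0.99):
    """Greedy correlation screening.

    strain: array of shape (n_load_cases, n_points).
    Returns the indices of retained points, highest peak strain first.
    Columns with zero variance should be removed beforehand.
    """
    strain = np.asarray(strain, dtype=float)
    # Rank candidates by peak absolute strain, largest first.
    peak = np.max(np.abs(strain), axis=0)
    order = np.argsort(-peak, kind="stable")
    corr = np.corrcoef(strain, rowvar=False)
    kept = []
    for idx in order:
        # Retain a point only if it is not highly correlated with a kept point.
        if all(abs(corr[idx, k]) <= threshold for k in kept):
            kept.append(int(idx))
    return kept

# Hypothetical data: six load conditions, four candidate points.
load_1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
load_2 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
strain = np.column_stack([3.0 * load_1, 1.5 * load_1, 2.0 * load_2, load_1 + load_2])

kept = screen_monitoring_points(strain)
print(kept)
```

In this toy case point 1 is dropped because it is proportional to point 0, which has the larger peak strain. Ranking by peak strain favors locations where the signal is large relative to sensor noise.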

Fig. 5 Distribution map of monitoring points. Five views of the wing box: the left and right longitudinal cross-sections carry points 2 and 4; the top face carries points 1, 3, 5, 7, 9, 10, 11, 12 and 13; the bottom face carries points 6 and 8.

3.4 Multivariate Linear Regression Analysis

With the strains at the 13 selected monitoring points as attributes (features), the first 80 of the 90 samples are used as the training set and the last 10 as the test set.

The error comparison for loading point 1# is shown in Fig. 6. The normalized multivariate linear regression equation is:

$$ \begin{aligned} y_{1} & = 0.0120 + 0.0584x_{1} + 0.2225x_{2} + 0.0173x_{3} + 0.1179x_{4} + 0.1388x_{5} \\ & \quad + 0.2722x_{6} - 0.0716x_{7} + 0.0799x_{8} - 0.0185x_{9} - 0.1584x_{10} \\ & \quad - 0.3151x_{11} + 0.1301x_{12} - 0.0103x_{13} \\ \end{aligned} $$

Fig. 6 Expected and predicted values and error at loading point 1#. One panel plots the expected and predicted load and the absolute error against test set number; the other plots the relative error (%) against test set number.

Here \(y_{1}\) is the load at loading point 1# and \(x_{1}, \ldots, x_{13}\) are the strain values at monitoring points 1–13.
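The fitted equation for \(y_{1}\) can be evaluated as in the sketch below. The coefficients are copied from the equation above. Inputs and output are in normalized units, and since the normalization scheme is not specified in the text, no conversion back to newtons is attempted. The function name `predict_y1` is hypothetical.

```python
import numpy as np

# Coefficients copied from the y_1 regression equation in the text.
BETA_0 = 0.0120
BETA = np.array([
    0.0584, 0.2225, 0.0173, 0.1179, 0.1388, 0.2722, -0.0716,
    0.0799, -0.0185, -0.1584, -0.3151, 0.1301, -0.0103,
])

def predict_y1(x_normalized):
    """Evaluate y_1 = beta_0 + sum(beta_i * x_i) for 13 normalized strains."""
    x = np.asarray(x_normalized, dtype=float)
    if x.shape != (13,):
        raise ValueError("expected 13 normalized strain values")
    return BETA_0 + float(BETA @ x)

# With all normalized strains at zero, only the intercept remains.
y1_zero = predict_y1(np.zeros(13))
# With all normalized strains at one, the result is the intercept plus the sum of coefficients.
y1_ones = predict_y1(np.ones(13))
print(y1_zero, y1_ones)
```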

The error comparison for loading point 2# is shown in Fig. 7. The normalized multivariate linear regression equation is:

$$ \begin{aligned} y_{2} & = 0.0194 + 0.1378x_{1} + 0.0226x_{2} + 0.1848x_{3} - 6.274 \times 10^{ - 4} x_{4} \\ & \quad + 0.1422x_{5} + 0.0173x_{6} + 0.1638x_{7} - 0.0448x_{8} + 0.0127x_{9} \\ & \quad + 0.2432x_{10} + 0.2006x_{11} + 0.2754x_{12} + 0.0893x_{13} \\ \end{aligned} $$

Fig. 7 Expected and predicted values and error at loading point 2#. One panel plots the expected and predicted load and the absolute error against test set number; the other plots the relative error (%) against test set number.

Here \(y_{2}\) is the load at loading point 2# and \(x_{1}, \ldots, x_{13}\) are the strain values at monitoring points 1–13.

The results in Figs. 6 and 7 show that multivariate linear regression predicts the loads well, with a relative error of about ±3%, which meets engineering needs.
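The ±3% figure can be checked per test case with a relative error of the form sketched below. The text does not give its exact error formula, so the definition (predicted minus actual, divided by actual) and the load values are assumptions for illustration.

```python
import numpy as np

def relative_error_percent(actual, predicted):
    """Signed relative error in percent: 100 * (predicted - actual) / actual."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * (predicted - actual) / actual

# Hypothetical actual and inverted loads in newtons (not results from the paper).
actual_load = np.array([10000.0, 25000.0, 40000.0])
predicted_load = np.array([10200.0, 24500.0, 40400.0])

errors = relative_error_percent(actual_load, predicted_load)
within_band = bool(np.all(np.abs(errors) <= 3.0))
print(errors, within_band)
```

A relative measure of this kind becomes unstable when the actual load is close to zero, so an absolute error is a useful companion metric for such cases.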

4 Conclusion

In this paper, the optimization of strain monitoring points for load inversion is studied using the correlation coefficient method and multiple linear regression from feature engineering. The correlation coefficient method effectively eliminates a large amount of redundant data: after optimization, the number of monitoring points is reduced from 996 to 13. With multiple linear regression on the 13 screened points, the load inversion error is within 3%, which is accurate enough for engineering application.