1 Introduction

Structural-system identification (SI) [3, 15, 22, 35, 39, 41] refers to methods for inverse calculation of structural systems using data to calibrate a mathematical or digital model. The calibrated models are then used to either estimate or predict the future performance of structural systems and, eventually, their remaining useful life. Non-linear structural systems with spatial and temporal variations present a particular challenge for most inverse identification methods [4, 14, 21]. In dynamic analysis of civil structural systems, prior research efforts primarily focused on matching experimental data with either mechanistic models (i.e., known mechanical models) [31, 38] or black-box models with only input/output information (i.e., purely data-driven approaches) [10, 13, 34]. Examples of these approaches include eigensystem identification algorithms [37], frequency domain decomposition [7], stochastic optimization techniques [30], and sparse identification [8]. The majority of these approaches, however, fail to capture highly non-linear behaviors.

Fig. 1.
Overview: We consider structures whose dynamics are governed by a known partial differential equation (PDE), but with unknown parameters that potentially vary in both space and time. These unknown parameters are modeled with neural networks, which are then embedded within the PDE. In this illustration, the unknown parameters, modulus P and damping C, vary spatially. The network weights are learned by solving the PDE to obtain the structural response (deflection in this case) and propagating the error between the predicted response and the measured ground truth response through the PDE solve and the neural networks.

In this paper, we consider the class of non-linear structural problems with unknown spatially distributed parameters (see Fig. 1 for an overview). The parameters correspond to geometric and material variations and energy dissipation mechanisms, which could be due to damping or other system imperfections that are not typically captured in designs. As an instance of this problem class, we consider forced vibration responses in beams with spatially varying parameters. The primary challenges in such problems arise from the spatially variable nature of the properties and the distributed energy dissipation. This is typical for built civil structures, where energy dissipation and other hard-to-model phenomena physically drive the dynamic response behavior. In addition, it is very common to have structural systems with unknown strength distributions, which can be driven by geometric non-linearities or indiscernible/hidden material weaknesses. Finally, a typical challenge in structural systems is the rarity of measured data, especially for extreme loading cases.

We propose a framework, dubbed NeuralSI, for nonlinear dynamic system identification that allows us to discover the unknown parameters of partial differential equations from measured sensing data. The developed model performance is compared to conventional PINN methods and direct regression models. Upon estimating the unknown system parameters, we apply them to the differential model and efficiently prognosticate the time evolution of the structural response. We also investigate the performance of NeuralSI under a limited training data regime across different input beam loading conditions. This replicates the expected challenges in monitoring real structures with limited sensors and sampling capabilities.

NeuralSI contributes to the fields of NeuralPDEs, structural identification, and health monitoring:

  1. NeuralSI allows us to learn unknown parameters of fundamental governing dynamics of structural systems expressed in the form of PDEs.

  2. We demonstrate the utility of NeuralSI by modeling the vibrations of nonlinear beams with unknown parameters. Experimental results demonstrate that NeuralSI achieves two-to-three orders of magnitude lower error in predicting displacement distributions in comparison to PINN-based baselines.

  3. We also demonstrate the utility of NeuralSI in temporally extrapolating displacement distribution predictions well beyond the training data measurements. Experimental results demonstrate that NeuralSI achieves four-to-five orders of magnitude lower error compared to PINN-based baselines.

2 Related Work

Significant efforts have been directed toward physics-driven discovery or approximation of governing equations [15, 21, 26]. Such studies have further been amplified by the rapid development of advanced sensing techniques and machine learning methods [16, 17, 19, 32]. Most of the work to date has mainly focused on ordinary differential equation systems [21, 40]. Neural ODEs [9] have been widely adopted due to their capacity to learn and capture the governing dynamic behavior from directly collected measurements [2, 28, 40]. They represent a significant step above the direct fitting of a relation between input and output variables. In structural engineering applications, Neural ODEs generally approximate the time derivative of the main physical attribute through a neural network.

More recently, data-driven discovery algorithms for the estimation of parameters in differential equations have been introduced. These methods, typically referred to as physics-informed neural networks (PINNs), include differential equations, constitutive equations, and initial and boundary conditions in the loss function of the neural network, and adopt automatic differentiation to compute the derivatives of the network output with respect to its inputs [20, 29]. Variational Autoencoders were also trained to build baseline behavioral models, which were then used to detect and localize anomalies [23]. Many other applications have employed Neural ODEs for dynamic structure parameter identification in both linear and nonlinear cases [2, 21, 40]. On the other hand, only a few studies have explored Neural PDEs, in fields such as message passing [6], weather and ocean wave data [11], and fluid dynamics [5]. In [18], a Graph Neural Network was used to solve flow phenomena, and a NeuralPDE solver package based on PINNs was developed in Julia [42].

3 Structural Problem – PDE Derivation

3.1 Problem Description

Many physical processes in engineering can be described by fourth-order time-dependent partial differential equations. Examples include the Cahn-Hilliard type equations in chemical engineering, the Boussinesq equation in geotechnical engineering, the biharmonic systems in continuum mechanics, the Kuramoto-Sivashinsky equation in diffusion systems [27], and the Euler-Bernoulli equation considered as an example case study in this paper. The Euler-Bernoulli beam equation is widely used in civil engineering to estimate the strength and deflection of beam structures. The dynamic beam response is defined by:

$$\begin{aligned} F(t)=\frac{\partial ^2 }{\partial x^2} \biggl (P(x)E_0I\frac{\partial ^2 u}{\partial x^2}\biggr ) + \rho A\frac{\partial ^2 u}{\partial t^2} + C(x) \frac{\partial u}{\partial t} \end{aligned}$$
(1)

where u(x, t) is the displacement as a function of space and time, P(x) is the modulus coefficient, and \(E_0\) is the reference modulus value of the beam. I and A refer to the beam geometry (the second moment of area and the cross-sectional area, respectively), and \(\rho \) is the density. F is the distributed force applied to the beam. C(x) represents damping, which is related to energy dissipation in the structure. In this paper, we restrict ourselves to spatial variation of the beam’s properties and leave the most general case, with variations of all variables in both space and time, for a future study.

The fourth-order spatial derivative and the second-order time derivative describe the relation between the beam deflection and the load on the beam [1]. Figure 2 shows an illustration of the beam problem considered here, with the deflection u(x, t) as the physical response of interest. The problem can also be formulated as a function of moments, stresses, or strains. The deflection formulation presents the highest order of differentiation in the PDE. It was selected so that the solution can be extended to other applications beyond structural engineering.

Fig. 2.

Simply supported dynamic beam bending problem. Dynamic load can be applied to the structure with its values changing in time. The geometry, modulus, and other properties of the beam can also vary spatially with x. The deflection of the beam is defined as u(x, t).

To accurately represent the behavior of a structural component, its properties need to be identified. Though the beam geometry is straightforward to measure, the material property and damping coefficient are hard to estimate. The beam reference modulus \(E_0\) is expected to have an estimated range based on the choice of material (e.g., steel, aluminum, composites, etc.), but unforeseen weaknesses in the build conditions can introduce unexpected nonlinear behavior. One of the objectives of this work is to capture this indiscernible randomness from response measurements. In addition, as discussed above, the damping is unpredictable at the design stage and is usually determined experimentally. For the simply supported beam problem, the boundary conditions are defined as:

$$\begin{aligned} {\left\{ \begin{array}{ll} u(x=0,t)=0\text {;}\quad \quad u(x=L,t)=0 \\ \frac{\partial ^2 u(x=0,t)}{\partial x^2}=0 \text {;}\quad \frac{\partial ^2 u(x=L,t)}{\partial x^2}=0 \end{array}\right. } \end{aligned}$$
(2)

where L is the length of the beam. Initially, the beam is static and stable, so the initial conditions of the beam are:

$$\begin{aligned} {\left\{ \begin{array}{ll} u(x,t=0)=0 \\ \frac{\partial u(x,t=0)}{\partial t}=0 \end{array}\right. } \end{aligned}$$
(3)

4 NeuralSI

4.1 Discretization of Space

To tackle this high-order PDE efficiently, a numerical approach based on the method of lines is employed to discretize the spatial dimension of the PDE. The result is then solved as a system of ordinary differential equations (ODEs). The implemented discretizations for the spatial derivatives of different orders are expressed as:

$$\begin{aligned} A_4^*u/\varDelta x^4 = \frac{\partial ^4 u}{\partial x^4}\text {;}\quad A_3^*u/\varDelta x^3 = \frac{\partial ^3 u}{\partial x^3}\text {;}\quad A_2^*u/\varDelta x^2 = \frac{\partial ^2 u}{\partial x^2} \end{aligned}$$
(4)

where, in the fourth-order discretization, \(A_4^*\) is an \(N \times N\) band matrix modified according to the boundary conditions, N is the number of elements used for the space discretization, and \(\varDelta x\) is the distance between adjacent elements in the spatial domain. The same principle applies to the derivatives of other orders.
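As an illustration of what such a boundary-modified band matrix can look like, the sketch below builds a fourth-difference matrix for the simply supported conditions in (2). It is a minimal example under stated assumptions, not the paper's implementation: the unknowns are taken to be 16 interior nodes (so \(\varDelta x = L/17\)), the pinned ends enter through u = 0, and the zero-curvature condition is imposed through a ghost node \(u_{-1} = -u_{1}\). The function names are hypothetical.

```python
import numpy as np

def second_difference(n):
    """Tridiagonal [1, -2, 1] stencil on n interior nodes, with u = 0 at both ends."""
    return (np.diag(-2.0 * np.ones(n))
            + np.diag(np.ones(n - 1), 1)
            + np.diag(np.ones(n - 1), -1))

def fourth_difference(n):
    """Pentadiagonal [1, -4, 6, -4, 1] stencil for a simply supported beam.

    The two corner entries become 5 instead of 6 because u'' = 0 at a pinned
    end implies the ghost value u_{-1} = -u_{1} (assumed ghost-node treatment).
    """
    a4 = (np.diag(6.0 * np.ones(n))
          + np.diag(-4.0 * np.ones(n - 1), 1) + np.diag(-4.0 * np.ones(n - 1), -1)
          + np.diag(np.ones(n - 2), 2) + np.diag(np.ones(n - 2), -2))
    a4[0, 0] = a4[-1, -1] = 5.0
    return a4

n, length = 16, 0.4                      # assumed: 16 interior nodes, 40 cm beam
dx = length / (n + 1)
x = dx * np.arange(1, n + 1)

# Check against u = sin(pi x / L), whose exact fourth derivative is (pi/L)^4 u.
u = np.sin(np.pi * x / length)
d4u = fourth_difference(n) @ u / dx**4
rel_err = np.max(np.abs(d4u - (np.pi / length) ** 4 * u)) / (np.pi / length) ** 4
print(f"relative error of the discrete fourth derivative: {rel_err:.4f}")
```

With these boundary conditions the fourth-difference matrix equals the square of the second-difference matrix, which gives a quick consistency check on the corner entries.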

4.2 The Proposed NeuralSI Schematic

A pictorial schematic of NeuralSI is shown in Fig. 1. The Julia differential equation package [28] allows for very efficient computation of the gradient from the ODE solver. This makes it feasible to be used for neural network backpropagation. Thus, the ODE solver can be considered as a neural network layer after defining the ODE problem with the required fields of initial conditions, time span, and any extra parameters. Inputs to this layer can either be output from the previous network layers or directly from the training data.

The network in NeuralSI for the beam problem takes as input the location of the deformation sensors installed on the structure for continuous monitoring of its response. A series of dense layers are implemented to produce the output, which are the parameters that represent the structural characteristics. The parameters are re-inserted into the pre-defined ODE to obtain the final output, i.e., the structure’s dynamic response. The loss is determined by the difference between the dynamic responses predicted by NeuralSI and those measured by the sensors (ground truth).

4.3 Training Data Generation

With future lab testing in mind, we simulate a beam with a length, width, and thickness of 40 cm, 5 cm, and 0.5 cm, respectively. The density \(\rho \) is \(2700\,\text {kg}/\text {m}^3\) (aluminum as base material). The force F(t) is defined as a nonlinear temporal function. Considering polynomial or harmonic variations of material properties as possible cases [33], we assign the beam a nonlinear modulus E(x) in the form of a sinusoidal function. The modulus ranges from 70 GPa to 140 GPa (again using aluminum as a base reference). The damping coefficient C(x) is modeled as a ramp function. The PDE can be rewritten and expressed as:

$$\begin{aligned} F(t)&=E_0I\Bigl (\frac{\partial ^2 P(x)}{\partial x^2} \frac{\partial ^2 u}{\partial x^2} + 2\frac{\partial P(x)}{\partial x} \frac{\partial ^3 u}{\partial x^3} + P(x)\frac{\partial ^4 u}{\partial x^4}\Bigr ) + \rho A\frac{\partial ^2 u}{\partial t^2} + C(x) \frac{\partial u}{\partial t} \\ &=\frac{E_0I}{\varDelta x^4}\Bigl (A_2^*P(x)\, A_2^*u + 2A_1^*P(x)\, A_3^*u + P(x)\, A_4^*u\Bigr ) + \rho A\frac{\partial ^2 u}{\partial t^2} + C(x) \frac{\partial u}{\partial t} \end{aligned}$$
(5)
$$\begin{aligned} F(t) = \left\{ \begin{array}{ll} 1000 &{} \quad t \le 0.02\,\text {s} \\ 0 &{} \quad t > 0.02\,\text {s} \end{array} \right. \end{aligned}$$
(6)

where the estimated modulus reference \(E_0\) is 70 GPa, and P(x) and C(x) are modulus coefficient and damping that can vary spatially with x. The pre-defined parameters \(P_0 (x)\) and \(C_0 (x)\) are shown in Fig. 3.

Fig. 3.
Pre-defined structural properties and resultant dynamic response. Structural parameters P and C are defined as a sinusoidal and a ramp function. Force is applied as a step function of 1000 N and reduced to zero after 0.02 s.

The PDE presented in (5) is solved via the differential equation package in Julia. The RK4 solver method is selected for this high-order PDE. The time span was set to 0.045 s to cover 3 complete oscillations of the bending response. The number of spatial elements and the number of time steps are chosen as 16 and 160, respectively, to balance the training time cost against the response resolution (so that the peak deflections are captured). The deflections u(x, t) are presented as a displacement distribution of size \(16\times 160\), from which ground truth data is obtained for training.
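To make the data-generation step concrete, the following is a minimal NumPy sketch of a method-of-lines solve for the beam, written under several assumptions that are not taken from the paper. It discretizes the conservative form in (1), \(\partial ^2/\partial x^2 (P E_0 I\, \partial ^2 u/\partial x^2)\), instead of the expanded form in (5); it uses a hand-written fixed-step RK4 with 16 sub-steps per stored time step (chosen here only to keep the explicit scheme stable) rather than the Julia solver; the force value 1000 is treated as a uniform distributed load; and the profiles `P0` and `C0`, including the damping magnitude, are hypothetical stand-ins for the curves in Fig. 3.

```python
import numpy as np

# Beam constants stated in the text.
LENGTH, WIDTH, THICK = 0.40, 0.05, 0.005      # m
RHO, E0 = 2700.0, 70e9                        # kg/m^3, Pa
AREA = WIDTH * THICK
INERTIA = WIDTH * THICK**3 / 12.0             # rectangular cross-section
N, N_OUT, T_END = 16, 160, 0.045

dx = LENGTH / (N + 1)                         # assumption: 16 interior nodes
x = dx * np.arange(1, N + 1)
A2 = (np.diag(-2.0 * np.ones(N)) + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1))          # second difference, zero at both ends

# Hypothetical ground-truth profiles (shapes from the text, values assumed).
P0 = 1.5 + 0.5 * np.sin(2.0 * np.pi * x / LENGTH)   # 70..140 GPa relative to E0
C0 = 20.0 * x / LENGTH                               # ramp, magnitude assumed

def force(t):
    """Step load of Eq. (6): 1000 up to 0.02 s, zero afterwards."""
    return 1000.0 if t <= 0.02 else 0.0

def rhs(t, y, P, C):
    """Semi-discrete beam equation as a first-order system y = [u, du/dt]."""
    u, v = y[:N], y[N:]
    moment = P * (A2 @ u) / dx**2             # P(x) u_xx, zero at the pinned ends
    stiffness = E0 * INERTIA * (A2 @ moment) / dx**2
    acc = (force(t) - stiffness - C * v) / (RHO * AREA)
    return np.concatenate([v, acc])

def solve_beam(P, C, substeps=16):
    """Fixed-step RK4 from rest; returns the N x N_OUT displacement field."""
    dt = T_END / (N_OUT * substeps)
    y, t = np.zeros(2 * N), 0.0
    out = np.empty((N, N_OUT))
    for k in range(N_OUT):
        for _ in range(substeps):
            k1 = rhs(t, y, P, C)
            k2 = rhs(t + dt / 2, y + dt / 2 * k1, P, C)
            k3 = rhs(t + dt / 2, y + dt / 2 * k2, P, C)
            k4 = rhs(t + dt, y + dt * k3, P, C)
            y = y + dt / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
            t += dt
        out[:, k] = y[:N]
    return out

u_true = solve_beam(P0, C0)
print(u_true.shape, float(np.abs(u_true).max()))
```

The conservative form is used here because both u and the bending moment vanish at pinned ends, so the same zero-boundary second-difference matrix can be applied twice without needing separate first- and third-order operators.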

4.4 Network Architecture and Training

The network architecture is a combination of multiple dense layers and a PDE-solver layer. The input to the network is the spatial coordinates x of the measurements, and the network output is the prediction of the dynamic response u(x, t). It is worth mentioning that the structural parameters P and C are produced by the dense layers of separate networks, and the PDE layer takes those parameters to generate a response displacement distribution of size \(16\times 160\). The activation function for predicting the parameter P is a linearly scaled sigmoid function, so that the output stays in a reasonable range. For the prediction of parameter C, a network of the same architecture is used, but the last layer has no activation function since the range of the damping value is unknown.
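The two parameter heads described above can be sketched as follows. This is an illustrative forward pass only: the layer widths, the tanh hidden activation, the weight initialization, and the bounds `lo` and `hi` of the scaled sigmoid are assumptions made for the example, and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_dense(sizes):
    """Random weights and zero biases for a stack of dense layers."""
    return [(rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x, output_activation=None):
    """Dense layers with tanh on hidden layers (assumed) and an optional output activation."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = h @ w + b
        if i < len(layers) - 1:
            h = np.tanh(h)
    return h if output_activation is None else output_activation(h)

def scaled_sigmoid(z, lo=0.5, hi=3.0):
    """Sigmoid rescaled linearly to [lo, hi]; the bounds are assumed values."""
    return lo + (hi - lo) / (1.0 + np.exp(-z))

# Normalized sensor locations along the beam, one row per spatial element.
x = np.linspace(0.0, 1.0, 16).reshape(-1, 1)

p_net = init_dense([1, 32, 32, 1])   # modulus-coefficient head
c_net = init_dense([1, 32, 32, 1])   # damping head, same architecture

P = forward(p_net, x, scaled_sigmoid).ravel()   # bounded output
C = forward(c_net, x).ravel()                   # unbounded output
print(P.min(), P.max(), C.shape)
```

Bounding P keeps the stiffness passed to the solver within a physically plausible band during early training, while C is left unconstrained, matching the reasoning given in the text.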

Fig. 4.
NeuralSI network architecture and training. The network has several dense layers and the output is split into P and C. Those parameters are taken to the PDE solver for structural response prediction. Samples are taken randomly from the response for training the network.

The modulus coefficient might become very high during training and lead to erroneous predictions with very high-frequency oscillations. We therefore used minibatch training, with a batch size of 16, to help escape local minima. The loss function is defined as the mean absolute error (MAE) between samples from the predicted and ground truth displacement distributions:

$$\begin{aligned} loss = \frac{1}{n}\sum _{i=1}^n \left| u_i-\hat{u}_i\right| \end{aligned}$$
(7)

where n is the number of training samples in a minibatch, and \(u_i\) and \(\hat{u}_i\) are the values of the true and predicted dynamic responses at the i-th training point of that minibatch.
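The loss in (7) and the random sampling of training points can be sketched as below. This is an assumed setup for illustration: the displacement fields are random stand-ins rather than solver output, the 20% sample ratio is one of the values studied later in the paper, and `minibatches` is a hypothetical helper.

```python
import numpy as np

def mae(u_true, u_pred):
    """Mean absolute error between matching samples, as in Eq. (7)."""
    return float(np.mean(np.abs(np.asarray(u_true) - np.asarray(u_pred))))

def minibatches(n_space, n_time, ratio, batch_size, rng):
    """Yield (row, column) index arrays for a random subset of the response grid."""
    total = n_space * n_time
    n_train = int(round(ratio * total))
    chosen = rng.choice(total, size=n_train, replace=False)
    for start in range(0, n_train, batch_size):
        flat = chosen[start:start + batch_size]
        yield np.unravel_index(flat, (n_space, n_time))

rng = np.random.default_rng(0)
u_true = rng.normal(size=(16, 160))   # stand-in for the measured response
u_pred = u_true + 0.01                # stand-in for the solver prediction

batches = list(minibatches(16, 160, ratio=0.2, batch_size=16, rng=rng))
rows, cols = batches[0]
batch_loss = mae(u_true[rows, cols], u_pred[rows, cols])
print(len(batches), batch_loss)
```

Each batch carries both a spatial and a temporal index, so a predicted value can be matched with the measurement taken at the same location and time.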

Furthermore, inspired by the effectiveness of positional embeddings for representing spatial coordinates in transformers [36], we adopt the same approach for the spatial input to the network. It is worth noting that the temporal information in the measurements is only used as an aid for mapping and matching the predictions with the ground truth. We use ADAMW [24] as our optimizer, with a learning rate of 0.01 (Fig. 4).
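The text does not give the exact form of the embedding, so the sketch below assumes the standard sinusoidal encoding used in transformers, applied to the element index as the position. The embedding width `dim` and the frequency base are assumed values.

```python
import numpy as np

def positional_embedding(positions, dim=8, base=10000.0):
    """Sinusoidal embedding: sines in even columns, cosines in odd columns.

    `dim` must be even; each column pair uses one geometrically spaced frequency.
    """
    positions = np.asarray(positions, dtype=float)[:, None]
    freqs = base ** (-np.arange(0, dim, 2) / dim)
    angles = positions * freqs
    emb = np.empty((positions.shape[0], dim))
    emb[:, 0::2] = np.sin(angles)
    emb[:, 1::2] = np.cos(angles)
    return emb

# One embedding row per spatial element of the beam.
emb = positional_embedding(np.arange(16))
print(emb.shape)
```

The embedded rows would replace the raw coordinate as input to the dense layers, giving the network several frequencies from which to build a spatially varying parameter.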

5 Results and Performance

The evaluation of NeuralSI is divided into two parts. In the first part, we evaluate predictions of the parameters P and C from the trained neural network. We assume that each structure has a unique response. To determine how well the model predicts the parameters, the Fréchet distance [12] is employed to estimate the similarity between the ground truth and predicted functions. In this case, the predicted P and C are compared to the original \(P_0\) and \(C_0\), respectively.
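For sampled curves, a common way to compute this metric is the discrete Fréchet distance, obtained by dynamic programming over pairs of points. The sketch below assumes that variant and represents each parameter profile as a polyline of (x, value) points; whether the paper uses this exact variant or scaling is not stated, and the test curves are made up.

```python
import numpy as np

def discrete_frechet(p, q):
    """Discrete Fréchet distance between two polylines given as (n, d) arrays."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    n, m = len(p), len(q)
    dist = np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1)
    ca = np.empty((n, m))            # ca[i, j]: best coupling cost up to (i, j)
    ca[0, 0] = dist[0, 0]
    for i in range(1, n):
        ca[i, 0] = max(ca[i - 1, 0], dist[i, 0])
    for j in range(1, m):
        ca[0, j] = max(ca[0, j - 1], dist[0, j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(ca[i - 1, j], ca[i - 1, j - 1], ca[i, j - 1])
            ca[i, j] = max(best_prev, dist[i, j])
    return float(ca[-1, -1])

x = np.linspace(0.0, 1.0, 16)
truth = np.column_stack([x, 1.5 + 0.5 * np.sin(2 * np.pi * x)])   # made-up profile
shifted = truth + np.array([0.0, 0.1])                            # uniform offset

print(discrete_frechet(truth, truth), discrete_frechet(truth, shifted))
```

Unlike a pointwise average, this distance is governed by the worst-matched portion of the curve, so a single locally wrong estimate raises it even when the rest of the profile fits well.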

The second part of our evaluation is the prediction of the dynamic responses, which is achieved by solving the PDE using the predicted parameters. The metric used to assess the prediction is the mean absolute error (MAE) between the predicted and ground truth displacement distributions. The prediction can be extrapolated by solving the PDE over a longer time span and compared with the extrapolated ground truth. The MAE is also calculated from the extrapolated data to examine the extrapolation ability of NeuralSI. Moreover, the dynamic response can be visualized on different elements separately (i.e., separate spatial locations x) for a more fine-grained comparison of the extrapolation results.

5.1 Results

We first trained and evaluated NeuralSI with different combinations of the number and size of dense layers, the percentage of data used for training, and the minibatch size. The best results were achieved with a minibatch size of 16, training for a total of 20 epochs, and a learning rate of 0.001 (the first 10 epochs use a learning rate of 0.01).

Fig. 5.

Predicted beam parameters: modulus coefficient (top) and damping (bottom). Observe that the modulus coefficient P matches well with the sinusoidal ground truth, since the modulus dominates the magnitude of the response. The damping C fluctuates because the response is less sensitive to it than to P, but the outputs still present a trend of increasing damping magnitude from the left end of the beam to the right end.

Figure 5 shows the output of modulus coefficient P and damping C from NeuralSI. For the most part, the predictions match well with the target modulus and damping, respectively. Compared to the modulus coefficient P, the predicted damping C has a larger error since the response is less sensitive to it. A small difference in damping magnitude will not affect the dynamic response as much as a change in the modulus parameter. However, the non-linearity of the modulus and damping is predicted accurately, and it is easy to identify whether the system is under-damped or over-damped based on the predicted damping parameters.

Fig. 6.
NeuralSI predictions. The interpolation results (top row) are calculated from 0 to 0.045 s and temporal extrapolation results (bottom row) are from 0.045 s to 0.09 s. Peak error is only around 0.3% of the peak value from the ground truth, and the error magnitude remains the same for extrapolation.

Figure 6 visualizes the ground truth and predicted dynamic displacement response, along with the error between the two. We observe that the maximum peak-peak value in the displacement error is only 0.3% of the ground truth. We also consider the ability of NeuralSI to extrapolate and display the dynamic response by doubling the prediction time span. It is worth mentioning that the peak error in temporal extrapolation does not increase much compared to the peak error in temporal interpolation. The extrapolation results are also examined at different elements from different locations. Figure 7 presents the response at the beam midspan and at quarter length. There are no observed discrepancies between the ground truth and the predicted response.

5.2 Hyperparameter Investigation

Based on the parameters chosen above, we tested the effect of the number of dense layers, the training sample ratio, and the minibatch size on the parameter identification and the prediction of dynamic responses.

Number of Layers: The number of layers is varied by consecutively adding an extra layer with 32 hidden units right after the input. From Fig. 8, the performance of the network is affected if the number of layers is below 4. This is explained by the fact that the network does not have sufficient capacity to precisely estimate the unknown structural parameters. It is noted that the size of the input and output are determined by the minibatch size and the number of elements used for discretization. A higher input or output size will automatically require a bigger network to improve prediction accuracy. Additionally, the Fréchet distance decreases as the size of the neural network increases, which demonstrates that the prediction of beam parameters is more accurate.

Fig. 7.

Elemental response. Spatial elements from the beam are selected to examine the temporal response. The ground truth and predicted responses match closely. (a) Element at beam midspan; (b) element at quarter length of the beam.

Fig. 8.

Hyperparameter performance. A sufficient number of layers, more training samples, and a small minibatch size produce a good combination of hyperparameters and a low MAE loss (top row). The Fréchet distances (bottom row) are calculated for P and C, respectively. The Fréchet distance fluctuates across sample ratios because its values are relatively small.

Sample Ratio: The number of training samples plays an important role in the model and in real in-field deployment scenarios. The number and the efficiency of sensor arrangements will be directly related to the number of samples required for accurately estimating the unknown parameters. It is expected that a reduced amount of data is sufficient to train the model given the strong domain knowledge (in the form of PDE) leveraged by NeuralSI. From Fig. 8, when 20% of the ground truth displacement samples are used for training, the loss drops noticeably. With an increased amount of training data, the network performance can still be improved. Furthermore, observe that there is a slight effect of data overfitting when using the full amount of data for training. The Fréchet distance of damping is not stable since our loss function optimizes for accurately predicting the dynamic deflection response, instead of directly predicting the parameters. As such, the same error could be obtained through different combinations of those parameters.

Minibatch Size: The minibatch size plays an important role in the efficiency of the training process and the accuracy of the estimated parameters. It is worth mentioning that a smaller minibatch size helps escape local minima and reduces errors. However, it also increases the number of iterations in a single epoch, which is computationally expensive. From Fig. 8 we observe that both the MAE and the Fréchet distance are relatively low when the minibatch size is smaller than 32.

6 Comparison of NeuralSI with a Direct Response Mapping Deep Neural Network and a PINN

The NeuralSI framework is compared with traditional deep neural networks (DNN) and PINN methods. The tested DNN has 5 dense layers and a Tanh activation. The inputs are the spatial and temporal coordinates x and t, respectively, of the displacement response, and the output is the beam deflection u(x, t) at that spatio-temporal position. The optimizer is LBFGS and the learning rate is 1.0. With a random choice of 20% of the samples, the loss stabilizes after 500 epochs.

The PINN method is defined with a similar strategy to existing solutions [20, 29]. The neural network consists of 5 dense layers with a Tanh activation function. The loss is defined as a weighted aggregate of the boundary condition loss (second derivative with respect to the input x at the boundaries), the governing equation loss (fourth-order derivative with respect to x and second-order derivative with respect to t), and the loss between the predicted and ground truth displacement responses. We used LBFGS as the optimizer with a learning rate of 1.0. The training was executed for 3700 epochs.
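One way to write such a weighted aggregate, given here only as a hedged sketch, is shown below. The weights \(w_d\), \(w_f\), \(w_b\), the point counts \(n_d\), \(n_f\), \(n_b\), and the use of absolute values instead of squared residuals are assumptions made for illustration; the residual term follows (1) and the boundary term follows the curvature condition in (2), with \(\hat{u}\) denoting the network prediction.

$$\begin{aligned} \mathcal {L} &= w_d\,\mathcal {L}_{data} + w_f\,\mathcal {L}_{PDE} + w_b\,\mathcal {L}_{BC}, \\ \mathcal {L}_{data} &= \frac{1}{n_d}\sum _{i=1}^{n_d} \left| u_i-\hat{u}_i\right| , \\ \mathcal {L}_{PDE} &= \frac{1}{n_f}\sum _{j=1}^{n_f} \left| \frac{\partial ^2 }{\partial x^2} \biggl (P(x)E_0I\frac{\partial ^2 \hat{u}}{\partial x^2}\biggr ) + \rho A\frac{\partial ^2 \hat{u}}{\partial t^2} + C(x) \frac{\partial \hat{u}}{\partial t} - F(t)\right| _{(x_j,\,t_j)}, \\ \mathcal {L}_{BC} &= \frac{1}{n_b}\sum _{k=1}^{n_b} \left| \frac{\partial ^2 \hat{u}}{\partial x^2}\right| _{(x_k \in \{0,\,L\},\,t_k)} \end{aligned}$$

In this form the governing equation enters only as a penalty evaluated at sampled points, whereas NeuralSI satisfies the discretized equation through the solver itself.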

Fig. 9.
Spatio-temporal displacement distribution predictions and comparisons between DNN, PINN and NeuralSI for both interpolation (top) and extrapolation (bottom). The DNN method fails to learn the interpolation response, while the PINN can predict most of the responses correctly, with only a few errors at the corners of the displacement response. Predictions from NeuralSI have two orders of magnitude lower error in comparison to PINN. With the learned structural parameters, NeuralSI maintains the same magnitude of error in extrapolation results. Both DNN and PINN completely fail at extrapolation and lead to considerable errors.

Fig. 10.
Performance comparison between DNN, PINN, and NeuralSI for both interpolation and extrapolation (a) MAE, (b) Inference time, and (c) Trade-off between MAE and inference time. NeuralSI offers significantly lower error while being as expensive as solving the original PDE, thus offering a more accurate solution when the computational cost is affordable. NeuralSI obtains the extrapolation results by solving the whole time domain starting from \(t=0\), while DNN and PINN methods directly take the spatio-temporal information and solve for extrapolation.

The predictions of the dynamic deformation responses for the two baseline methods and NeuralSI, and the corresponding displacement distribution errors, are shown in Fig. 9. In NeuralSI, we used the ImplicitEulerExtrapolation solver for a 4\(\times \) faster inference. We further optimized the PDE function with ModelingToolkit [25], which provides another 10\(\times \) speedup, for a total of 40\(\times \) speedup over the RK4 solver used for training. Due to the limited amount of training data, the DNN fails to predict the response. With extra information from the boundary conditions and the governing equation, the PINN method results in an MAE loss of 0.344, and the prediction fits the true displacement distribution well. Most of the values in the displacement distribution error are small, except at some corners. However, both methods fail to extrapolate the structural behavior temporally. The extrapolation of the DNN predictions produces large discrepancies compared to the ground truth. Similarly, the PINN method fails to match the NeuralSI performance, while faring much better than the DNN, as expected due to the added domain knowledge. The MAE values were computed and compared with those of the proposed method trained with 20% of the data, as shown in Fig. 10.

7 Conclusion

In this paper, we proposed NeuralSI, a framework that can be employed for structural parameter identification in nonlinear dynamic systems. Our solution models the unknown parameters via a learnable neural network and embeds it within a partial differential equation. The network is trained by minimizing the errors between predicted dynamic responses and ground truth measurement data. A major advantage of the method is its versatility and flexibility; thus, it can be extended to other PDEs with high-order derivatives and nonlinear characteristics. The trained model can be used either to explore structural behavior under different initial conditions and loading scenarios, which is vital for structural modeling, or to produce high-accuracy extrapolation, which is essential in prognosis of a system's response. An example beam vibration case study was analyzed to demonstrate the capabilities of the framework. The estimated structural parameters and the dynamic response variations match well with the ground truth (MAE of \(10^{-4}\)). NeuralSI is also shown to outperform direct regression through deep neural networks and PINN methods by three to five orders of magnitude.