1 Introduction

Bridges are vital elements of a nation’s highway infrastructure. Their importance to the economy and society is often evident during repair or replacement works from the traffic disruptions and the consequent imposed costs on businesses. Most highway bridges in the UK and in other western countries were built during the modern engineering era, especially after WWII. The majority of them are either close to or past their design working life [1–3]. For example, according to data from the Federal Highway Administration’s National Bridge Inventory [4], 33 % of bridges in the USA have reached their design life. Of these structures, 42 % are classified as functionally obsolete or structurally deficient. Therefore, new technologies and methodologies that can support bridge owners and operators in their constant endeavour to keep their ageing assets safe and serviceable are highly sought after.

Significant research has been devoted over the last few decades to structural health monitoring (SHM), which broadly refers to the subject of measuring and evaluating the performance of structures in order to support structural management. SHM has traditionally relied on a structural identification (St-Id) paradigm using physics-based (PB) models where the goal is to use measurements to update a numerical (e.g., finite element) model of the structure [5] and then employ the model to make predictions on structural behaviour. Research has focused mostly on vibration-based St-Id, i.e., the evaluation of dynamic properties collected from forced or ambient vibrations using a priori PB models [6]. In vibration-based St-Id, changes in structural performance are identified by relating changes in stiffness to variations in vibration characteristics such as modal frequencies. St-Id approaches based on static measurements are also increasingly finding acceptance among SHM researchers. In particular, displacement and strain measurements collected during load tests have been successfully employed to update numerical models [7, 8].

While PB models are important for understanding structural behaviour, such models are unlikely to be readily available for the majority of bridges. Their generation can also be resource and time-intensive. Furthermore, computed models may not replicate behaviour of real-life structures due to uncertainties and approximations in the modelling process [8]. For example, while the elastic modulus of concrete generally varies with temperature, humidity and age [9], these variations are seldom taken into account in a PB model and in turn, affect the reliability of the St-Id process. Incorporating effects of ambient conditions, which are now recognized to significantly affect structural behaviour, in PB models is also difficult [10, 11]. Therefore robust approaches for measurement interpretation that are generic and readily applicable without requiring detailed a priori knowledge of structures have tremendous value in the context of extracting information from monitoring for bridge management.

Data-driven approaches, which rely purely on the collected measurements for measurement interpretation, offer great promise for long-term continuous monitoring. These approaches, which employ non-physics based (NPB) models such as those derived from machine learning techniques [5], can detect deviations from normal structural behaviour. Data-driven approaches generally consist of two distinct phases. In an initial learning phase, patterns of normal structural behaviour are defined from measurements taken during a period when the structure is free of damage. Previous research has shown that measurements collected over a duration of at least 1 year, and in certain cases, even more than a decade, may be required to establish the patterns of normal structural behaviour [5]. Subsequently, in the long-term monitoring phase, new measurements are compared against computed patterns to detect deviations from normal behaviour.

Data-driven methods have typically been illustrated on numerical models with damage often modelled as loss in stiffness in a region close to the location of sensors [12]. Posenato et al. [13] investigated a number of data-driven methods for anomaly detection and compared their capabilities for detecting damage simulated in a numerical model. This work used the trends introduced in measurements by variations in seasonal temperatures for tracking structural behaviour. Laory et al. [14] later showed that removing seasonal temperature trends from measurements detrimentally affects the performance of anomaly detection techniques and that an ensemble approach to anomaly detection that combines results from several algorithms is capable of detecting anomalies faster and more reliably [15]. Only a few studies have investigated the performance of data-driven methods on measurements from full-scale structures [16–18]. These have concluded that the excellent performance of such methods on simulated data is seldom replicated on measurements from real-life structures. Del Grosso and Lanata [19] investigated the application of proper orthogonal decomposition (POD) for damage detection on strain measurements from long-term monitoring of post-tensioned concrete beams that were exposed to ambient conditions. They found that known damage events cannot be reliably identified by POD due primarily to the effects of variations in ambient conditions and, in particular, those due to changes in temperature distributions.

There are two different approaches to handling environmental effects in measurements. The first, which has limited success, treats environmental effects as undesirable noise in measurements and removes these effects from measured response using numerical techniques [17, 20]. The second approach, which offers more promise, is to explicitly account for the effects of each ambient parameter on response within the St-Id process. Kulprapha and Warnitchai [21] developed techniques for calibrating PB models using temperature and response measurements and showed that such models accurately represent measured behaviour. Kromanis and Kripakaran [22] have proposed a data-driven methodology, referred to as regression-based thermal response prediction (RBTRP) methodology, in which regression models that accept temperature measurements as input for predicting response are derived from measured response and temperature distributions. The methodology was successfully illustrated on measurements from a laboratory-scale and a full-scale bridge supporting the idea of predicting structural response from distributed temperature measurements.

This paper integrates the RBTRP methodology presented in [22] within a comprehensive approach for measurement interpretation that characterises thermal response, and subsequently analyses the predicted thermal response in combination with measured response using anomaly detection techniques. This approach is referred to as the Temperature-Based Measurement Interpretation (TB-MI) approach. The underlying concept in this approach was illustrated in a preliminary study using simulated measurements obtained from a numerical model in [23]. This paper, however, uses real measurements from a laboratory truss structure, which has been specifically set up to investigate thermal effects in bridges under a range of structural scenarios. It also focuses on tasks related to measurement pre-processing such as dimensionality reduction and measurement down-sampling, which are often required to deal with the high spatial and temporal density in measurement sets from full-scale bridges. In this study, dimensionality reduction of measurements is performed using principal component analysis (PCA); the effect of the number of principal components chosen to represent measurements on the performance of the RBTRP methodology is investigated in detail. The optimal measurement sampling frequencies needed to capture the full variability in the thermal response of a bridge are also evaluated. These factors have never before been investigated in detail in the field of SHM, and particularly in the context of characterising thermal effects in measurements. Lastly, this research proposes a novel anomaly detection technique to detect anomaly events from the time-series of prediction errors, which are computed as the difference between predicted and measured structural behaviour. The performance of this technique, called the signal subtraction method, is evaluated on measurements from the laboratory truss.

2 Thermal response of bridges

The response of a structural system is determined by inputs to the system such as forces and ambient conditions, and its structural properties. Knowledge of structural conditions, material properties and applied forces (e.g., traffic, temperature) will therefore enable evaluation of structural response. Variations in environmental conditions and in particular temperatures have a major influence on the structural response of bridges. Real-life bridges experience complex temperature distributions that can vary nonlinearly in all three spatial dimensions [24–26]. Potgieter and Gamble [26] showed using measurements from an existing box girder bridge that stresses and forces due to nonlinear temperature distributions are often of magnitudes comparable to those due to live loads. This is also confirmed by real measurements from full-scale structures wherein time-series of response measurements often resemble those of measured ambient temperatures. In contrast, traffic loads are seen to have relatively little effect on overall quasi-static structural response [7]. To illustrate this fact, bridge response measured over 1 day is plotted in relation to ambient temperatures in Fig. 1 for two structures. The figure shows tilt measurements from the footbridge at the National Physical Laboratory (NPL) and measurements of bearing displacement of the Cleddau Bridge. The plots show clearly that structural response closely follows the diurnal temperature cycle, implying that temperature variations play a key role in determining deformations in these structures. The effects of traffic loads can be considered as noise superimposed on the thermal response.

Fig. 1 Plot of measurements of bearing displacement from the Cleddau Bridge (left) and tilt from the NPL footbridge (right) over 1 day in relation to diurnal changes in ambient temperature

We therefore hypothesize that the quasi-static structural response of a bridge, in general, may be approximated as its thermal response, and that characterizing the thermal response is sufficient to enable interpretation of measurements from long-term monitoring. The proposed data interpretation approach, which is referred to from hereon as temperature-based measurement interpretation (TB-MI) approach, builds on this premise. It investigates to what extent correlations between temperature distributions and structural response can help assess structural performance of bridges.

2.1 TB-MI approach

This research builds on an approach referred to as the RBTRP methodology [22] that has been developed to characterize structural response from measured temperature distributions. It couples the RBTRP methodology with anomaly detection techniques to create the TB-MI approach. A schematic of the envisioned approach is shown in Fig. 2. The steps within the TB-MI approach are described in further detail in Fig. 3. The approach employs a three-stage process, which includes (1) collection of distributed measurements for a reference training period, (2) application of the RBTRP methodology, and (3) interpretation of predictions from the RBTRP methodology. The RBTRP methodology is used to predict the thermal response of the structure, which is then employed in conjunction with real-time measurements of structural response for detecting anomalous structural behaviour. Data can be visualized by engineers at any stage of the interpretation process.

Fig. 2 Schematic diagram of temperature-based measurement interpretation (TB-MI) approach

Fig. 3 Flow chart illustrating TB-MI approach

2.2 Prediction of thermal response

The RBTRP methodology, illustrated in a flow chart in Fig. 3, consists of two phases. In the first phase, which is a learning phase, data from monitoring is processed to generate regression models that estimate structural response from distributed temperature measurements. In the second phase, which involves the application of models to real-time performance monitoring, predicted response is compared with collected measurements for subsequent analysis using anomaly detection techniques. These two phases are described in further detail below.

  1. Model generation phase: the model generation phase is a key step for successful application of the RBTRP methodology. The aim is to generate one or more statistical models for each sensor location such that they are capable of predicting the corresponding structural response from knowledge of temperature distributions. The model generation phase involves a series of iterations over the following interlinked steps:

    (a) Selection of a reference set: the reference set refers to a set of measurements that are representative of the baseline conditions of the structure. The duration corresponding to the measurements chosen for the reference set is called the reference period. A sufficiently long reference period, usually at least 1 year, is chosen in order to cover the expected daily and seasonal variability in measurements.

    (b) Data pre-processing: outliers are removed from measurements using inter-quartile range (IQR) analysis. IQR identifies outliers in a window of values from within a time-series as those values that differ significantly from the median value. Consequently, values in a time-series that represent gradual and permanent shifts, such as those due to damage or deterioration, will not be considered outliers. However, sudden and temporary shifts (of lengths smaller than the size of the moving window) may be considered outliers. This is not a major limiting factor since such events are often temporary occurrences such as abnormal loading or an extreme event, which do not permanently alter structural performance. Moving average methods and low-pass filters are also applied to remove measurement noise. Measurement time histories are then down-sampled, as needed, to reduce the size of the data-set. Measurements are often collected at high frequencies, e.g., the bearing displacements of the Cleddau Bridge [Fig. 1 (left)] are recorded every second. However, since quasi-static changes in both temperature and response are gradual, not all measurements may be required for model training. Down-sampling measurements can decrease the time required to train regression models while having negligible impact on prediction accuracy. This also helps prevent over-training of regression models and enables better generalization of the relationship between thermal response and temperature measurements. The effect of down-sampling on prediction accuracy is systematically evaluated within the RBTRP methodology. The evaluation starts with a low measurement sampling frequency and then doubles the sampling frequency iteratively to determine the optimal measurement sampling frequency for response prediction, i.e., the sampling frequency above which improvement in prediction accuracy is negligible.

    (c) Dimensionality reduction: the dimensionality of the data-sets is reduced using principal component analysis (PCA), which is a widely-employed statistical technique that takes advantage of inherent correlations between variables in the data-set. It involves finding a set of principal component (PC) vectors that define an orthogonal transformation from the original set of linearly-correlated variables to a new set of uncorrelated variables. In this research, the first few PC vectors of the temperature measurements that capture almost all the variability in the original data are chosen to transform the raw temperature measurements to a low-dimensional PC space.

    (d) Generation of statistical models: training and test data-sets are composed from the data in the PC space corresponding to the reference period. While the size of the training sets will vary depending on the down-sampling rate, the data-sets will generally capture the full variability in the measurements since the duration of the reference period is left unchanged. Regression-based models are generated through an iterative process of training and testing on these two data-sets. The inputs to the regression models are mainly the PC equivalents of the temperature measurements taken at the current measurement time-step. Data from the PC space corresponding to temperatures from prior measurement time-steps can also be provided as input to the models to incorporate thermal inertia effects. The term thermal inertia refers to the phenomenon of temperatures in certain parts of a structure lagging behind ambient temperatures and temperatures in other regions of the structure. Thermal inertia is common in concrete and masonry bridges, which are more voluminous than metallic structures and have high thermal mass. Thermal inertia effects are accounted for in the response by providing the PC equivalents of both current ($D_{i}$) and former ($D_{i-j}$) temperature measurements as input to the regression models. Here, $i$ refers to the most recent measurement time-step and $i-j$ to one that is $j$ time-steps prior to $i$. $D_{i}$ and $D_{i-j}$ are the PC equivalents of the measurement sets corresponding to the two time-steps.

    (e) Model evaluation: the above-mentioned steps are performed iteratively for various kinds of regression models such as multiple linear regression (MLR) and support vector regression (SVR) to evaluate their appropriateness for thermal response prediction for the structure. For a given sensor location, the RBTRP methodology selects either the regression model with the highest prediction accuracy or an ensemble of models that are observed to demonstrate good performance over the validation period. Therefore, response at each sensor location within an SHM system could be predicted with a specific regression model or a certain ensemble of models. This paper, however, employs results obtained using only SVR models for thermal response prediction since an earlier study [22] comparing the performance of various regression algorithms for thermal response prediction demonstrated the superior performance of SVR. Readers interested in a more detailed description of the RBTRP methodology can also refer to that study [22].

  2. Regression model application: the regression models identified in the model generation phase are employed for prediction of thermal response from real-time distributed measurements of temperatures. As will be shown in the next section, analysis of model predictions and measured response can support detection of anomalous structural behaviour, and thereby enable tracking structural performance.
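The pre-processing and input-assembly steps of the model generation phase can be sketched in a few lines of Python. The sketch is illustrative only and is not the implementation used in this study: the function names, the window length of 11 samples, the multiplier of 1.5 on the IQR and the choice of replacing an outlier with the window median are all assumptions. PCA and the training of regression models (e.g., SVR) are omitted and would typically be delegated to a numerical library.

```python
from statistics import fmean, median, quantiles


def remove_outliers_iqr(series, window=11, k=1.5):
    """Moving-window IQR filter (window and k are assumed values).

    A value is treated as an outlier when it lies outside
    [Q1 - k*IQR, Q3 + k*IQR] of the window centred on it, and is then
    replaced by the window median.
    """
    half = window // 2
    cleaned = list(series)
    for i, value in enumerate(series):
        chunk = series[max(0, i - half): i + half + 1]
        q1, _, q3 = quantiles(chunk, n=4)
        spread = q3 - q1
        if value < q1 - k * spread or value > q3 + k * spread:
            cleaned[i] = median(chunk)
    return cleaned


def moving_average(series, window=3):
    """Trailing moving average used here as a simple de-noising step."""
    return [fmean(series[max(0, i - window + 1): i + 1])
            for i in range(len(series))]


def downsample(series, step):
    """Keep every step-th measurement of a time history."""
    return series[::step]


def lagged_inputs(pc_scores, lags=(1,)):
    """Assemble regression inputs [D_i, D_(i-j), ...] for each time-step i.

    pc_scores is a list of PC-space vectors, one per time-step; lags holds
    the values of j used to represent thermal inertia.
    """
    rows = []
    for i in range(max(lags), len(pc_scores)):
        row = list(pc_scores[i])
        for j in lags:
            row.extend(pc_scores[i - j])
        rows.append(row)
    return rows
```

With `lags=(1,)`, each input row holds the PC equivalents of the current time-step followed by those of the preceding time-step; the first `max(lags)` time-steps are dropped because they have no complete history.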

3 Data-driven anomaly detection

The main goal of a real-time anomaly detection methodology, in the context of SHM, is to identify deviations in structural behaviour from collected measurements. Measurements from the reference period, i.e., the data used for training regression models, are assumed to represent the structure in its baseline (normal) state. The objective of anomaly detection is then to recognize deviations in system behaviour from this baseline condition. Since this often requires analysis of time-series of measurements, such methodologies draw upon existing knowledge in the domain of signal processing. Anomalies in time-series are often classified as either:

  1. Outliers, which are often due to temporary factors unrelated to the main parameters of the system under observation, or

  2. Novel trends, which imply shifts from current condition to new states.

Derivation of prediction error (PE) signals: the first step towards anomaly detection is to compare predictions from regression models computed using the RBTRP methodology with measurements of structural response, and thereby derive prediction errors (PEs). PEs are computed as the differences between predicted and measured response as shown below:

$$\Delta y_{s} = p_{s} - m_{s}$$
(1)

where $\Delta y_{s}$ is the prediction error and $p_{s}$ and $m_{s}$ are the predicted and measured response, respectively, at a sensor $s$. As more and more measurements are collected, PEs are correspondingly evaluated and these are chronologically sequenced to form a time-series. The time-series of PEs are from hereon referred to as PE signals.

Treatment of PE signals: PE signals like any other time-series can be noisy and contain outliers. Therefore they are pre-processed using the same procedures as for treating measurements. PE signals are first cleansed using outlier removal techniques such as IQR analysis and three sigma (three-σ) analysis [13]. Then de-noising or data smoothing methods such as moving average techniques [27] are applied to the signals. However, smoothing the signals may lead to loss of information that is critical to detecting anomalies. Therefore selection of smoothing technique and values for its related parameters such as length of the moving window needs to be done with care.

Interpretation of PE signals: there are two fundamentally different approaches to detecting anomalous structural behaviour from PE signals. PE signals corresponding to various sensor locations can either be analysed individually or be analysed in groups. A simple example of the former approach, which is also called univariate analysis, is the popular method of setting threshold limits on the PE for a sensor location; exceeding these limits is taken to indicate damage near that sensor location. The latter approach, also termed multivariate analysis, relies on the implications of the bridge acting as a well-connected structural system. Damage to one or more components usually alters existing correlations between responses at various locations on the structure. In large and complex structures that have vast numbers of sensors, clustering sensors into groups according to their correlations and then analysing measurements from these clusters for changes in correlations can help detect damage [28].

In this study, PE signals are analysed using a novel numerical technique, the signal subtraction method (SSM). SSM belongs to the class of multivariate signal analysis techniques. It exploits correlations between two sensor locations for damage detection. In SSM, two PE signals are linearly combined to generate a new signal, which is then analysed for anomaly detection. Mathematically, the process of combining two signals in SSM can be described using the following equation:

$$T_{kl} = \left( {\frac{{w_{k} }}{{r_{k} }}} \right)\Delta y_{k} - \left( {\frac{{w_{l} }}{{r_{l} }}} \right)\Delta y_{l}$$
(2)

$T_{kl}$ is the new signal (SSM signal) resulting from the subtraction process. $\Delta y_{k}$ and $\Delta y_{l}$ are values of the PE signals corresponding to sensors $k$ and $l$ respectively. $r_{k}$ and $r_{l}$ are scaling factors for the two PE signals. These are equal to the range of signal values in the reference period, i.e., the difference between the maximum and minimum values in the reference period. $w_{k}$ and $w_{l}$ are weights specified according to the accuracies of the sensors at the corresponding locations and the accuracy of the model for thermal response prediction. In this study, measurements from all elements are assumed to be equally important. Therefore, weights of all PE signals are equal to 1. Using SSM on all sensor combinations may not be computationally viable due to the combinatorial explosion in computational complexity. The number of combinations $C_{q}$ from $q$ sensors taken two at a time is given by:

$$C_{q} = \frac{q!}{{2!\left( {q - 2} \right)!}}$$
(3)

However, a small number of sensor combinations can be chosen based on engineering judgment. Sensors whose measurements are strongly correlated are potential candidates for SSM.
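Equations (1) to (3) translate directly into code. The following is a minimal sketch with hypothetical function names; it assumes that the PE signals of the two sensors are sampled at the same time-steps and that the scaling factors are computed beforehand from the reference period.

```python
from math import comb


def prediction_errors(predicted, measured):
    """Eq. (1): PE = predicted response - measured response."""
    return [p - m for p, m in zip(predicted, measured)]


def signal_range(pe_reference):
    """Scaling factor r: range of a PE signal over the reference period."""
    return max(pe_reference) - min(pe_reference)


def ssm_signal(pe_k, pe_l, r_k, r_l, w_k=1.0, w_l=1.0):
    """Eq. (2): weighted, range-scaled difference of two PE signals.

    The weights default to 1, as assumed for all sensors in this study.
    """
    return [(w_k / r_k) * a - (w_l / r_l) * b for a, b in zip(pe_k, pe_l)]


def number_of_pairs(q):
    """Eq. (3): number of sensor combinations taken two at a time."""
    return comb(q, 2)
```

For instance, `number_of_pairs(10)` evaluates to 45, which shows how quickly the number of candidate SSM signals grows with the number of sensors.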

The reference period chosen for SSM is the same as the one used for training the regression models in Sect. 2.2. Features computed for the reference period are used to determine threshold limits that describe normal structural behaviour. Threshold bounds, for example, can be defined in terms of the variability of the signal during the reference period; the bounds can be prescribed as $[\mu - n\sigma,\ \mu + n\sigma]$, where $n$ is a positive integer and $\sigma$ and $\mu$ are the standard deviation and the mean of the signal for the reference period. While defining threshold bounds closer to the mean value of the signal, i.e., keeping $n$ too small, will increase the likelihood of false-positives and false-negatives, larger threshold bounds, i.e., increasing the value for $n$, will imply that only damage events of high severity are detected [20]. SSM signals during the reference period are expected to be normally distributed, especially when considering how these signals are derived. When a signal represents a Gaussian process [15, 29], values for $n$ are commonly set to 3 and 6, which correspond to confidence levels of 99.73 % and 99.99 % respectively. An anomaly is said to be detected when the tracked feature of an anomaly detection technique continuously exceeds the predefined threshold bounds.
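The thresholding step can be sketched as follows. The bounds are set n standard deviations either side of the reference-period mean, and an anomaly is flagged once the signal stays outside the bounds for several consecutive samples. The function names and the parameter `min_run`, which quantifies "continuously", are assumptions introduced for illustration; the text does not prescribe a specific run length.

```python
from statistics import fmean, pstdev


def threshold_bounds(reference, n=3):
    """Bounds at mean +/- n standard deviations of the reference signal."""
    mu = fmean(reference)
    sigma = pstdev(reference)
    return mu - n * sigma, mu + n * sigma


def first_sustained_exceedance(signal, bounds, min_run=5):
    """Index at which the signal first leaves the bounds for at least
    min_run consecutive samples (min_run is an assumed parameter).
    Returns None if no such run exists.
    """
    low, high = bounds
    run = 0
    for i, value in enumerate(signal):
        run = run + 1 if (value < low or value > high) else 0
        if run >= min_run:
            return i - min_run + 1
    return None
```

Requiring a run of exceedances rather than a single one keeps isolated spikes, which survive outlier removal and smoothing, from being reported as anomalies.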

4 Continuous monitoring experiment

In this research, an aluminium structure representing a warren truss, typically used for short-span railway bridges, is designed and manufactured specifically to serve as a test-bed for the validation of the proposed TB-MI approach. This section presents the experimental setup and results of application of the proposed approach to measurements collected from this structure.

4.1 Test-bed

The truss shown in Fig. 4 is continuously monitored in the structures laboratory at the University of Exeter [30]. It is composed of (1) channel sections for the top chord, bottom chord and the outer diagonal members, and (2) flat rectangular bars for inner diagonal members and vertical elements. All members of the truss are made of aluminium. Aluminium is chosen over steel for the structure since it has a much higher coefficient of thermal expansion ($\alpha = 23.1 \times 10^{-6}\ \mathrm{K}^{-1}$), almost twice that of steel. Therefore, thermal strains in an aluminium structure of the same size will be nearly double those in a steel structure. This is beneficial when attempting to understand temperature effects using small-scale models in the laboratory. Results obtained from analysing measurements from this truss structure can also be scaled up for steel structures since both aluminium and steel follow linear-elastic material models, albeit with very different values for elastic moduli.
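As a worked illustration of this factor of about two, consider the free (unrestrained) thermal strain $\varepsilon = \alpha\,\Delta T$ for a hypothetical temperature change of $\Delta T = 10$ K. The value of $\alpha$ for steel used below, $12 \times 10^{-6}\ \mathrm{K}^{-1}$, is a typical textbook figure assumed here for illustration and is not taken from this study.

$$\varepsilon_{\mathrm{Al}} = 23.1 \times 10^{-6}\ \mathrm{K}^{-1} \times 10\ \mathrm{K} = 231 \times 10^{-6}$$

$$\varepsilon_{\mathrm{steel}} = 12 \times 10^{-6}\ \mathrm{K}^{-1} \times 10\ \mathrm{K} = 120 \times 10^{-6}$$

$$\frac{\varepsilon_{\mathrm{Al}}}{\varepsilon_{\mathrm{steel}}} = \frac{23.1}{12} \approx 1.9$$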

Fig. 4 A photograph of the truss, with zoomed-in views of connection and support details

Each end of the bottom chord is bolted to a steel shoe (see detail in a cyan box in Fig. 4). The shoe is fixed on four threaded rods which are cast in a concrete block. The shoes are adjusted so that a 25 mm clearance is provided between concrete blocks and the bottom chord. The concrete blocks are firmly fixed to the iron floor. The left support of the truss is always fully fixed (i.e., clamped to prevent both rotations and translations). The right support can be configured to act as a fully fixed support or as a roller (see detail in a blue box in Figs. 4, 5). The bottom and top chords have splice connections. These are to the left of mid-span of the truss (see detail in a yellow box in Fig. 4). All joints including the splices are made up of six bolts. None of the connections are likely to be perfectly pinned. In fact, all of them are expected to behave as semi-rigid connections. A typical configuration of bolts used in connections where two diagonal elements and one vertical element are connected to top or bottom chord is shown in a green box in Fig. 4.

Fig. 5 A sketch of the test-bed showing its principal dimensions, locations of heaters, thermocouples (black dots), strain gauges (S−i, where i = 1, 2, …, 10) and the joints (J−i where i = 1, 2, 3, 4) where damage is simulated

Structural response of the test-bed is monitored with 10 foil strain gauges (gauge length 6.35 mm). Distributed temperatures are measured with 31 thermocouples. The geometry and principal dimensions of the truss together with locations of sensors and heaters are provided in Fig. 5. The figure also indicates the joints that have been damaged in order to simulate anomaly events and the possible configurations of boundary conditions (pinned or released). The following joints are selected for simulating damage in this study:

  1. J−1 and J−2: at these joints, two diagonal elements and one vertical element are connected to the bottom and top chords respectively.

  2. J−3: this joint is a splice connection on the bottom chord.

  3. J−4: this joint is between an outer diagonal element and the bottom chord.

These joints have been chosen for damage simulation since they cover all the types of joints present in the truss.

4.2 Simulated scenarios

The TB-MI approach is evaluated on measurements from the laboratory truss, which is set up to simulate various scenarios that differ in the following parameters:

  • Temperature distributions,

  • Boundary conditions, and

  • Damage (location and severity).

In this study, the following two kinds of temperature loads are considered:

  1. Load case A: this refers to accelerated temperature cycles simulated by turning infra-red heaters on and off to heat and cool the structure respectively. The purpose is to simulate real-life temperature cycles in bridges as realistically as possible in a very short duration to enable testing and performance evaluation of data interpretation techniques. One simulated diurnal cycle lasts 90 min, of which heaters are switched on for 45 min. Thus 16 diurnal cycles are emulated per day. A thermal image of the test-bed taken at night and shortly after heaters are turned off is shown in Fig. 6. Temperatures at the top chord of the truss are up to 10 °C higher than those at the bottom chord. This vertical temperature gradient is similar to those observed in other test-beds [21] and full-scale structures [26].

    Fig. 6 Temperature distribution captured with thermal imaging camera

  2. Load case B: this corresponds to ambient temperature cycles. The structures lab is open to the outside environment. Hence the ambient temperature in the vicinity of the test-bed is close to the outside air temperature.

Note that load case A cannot be applied in isolation as ambient effects are always present. Thus there are two possible load combinations: (A + B) and B.

Response and temperature measurements are recorded at 10 s intervals (0.1 Hz) for load case (A + B) and at 1 min intervals ($1.7 \times 10^{-2}$ Hz) for load case B. The measurement frequency has been reduced for load case B since temporal changes in temperature distributions due to ambient effects are very gradual. Measurements are also appropriately down-sampled for measurement interpretation.
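The cycle and sampling figures quoted above follow from simple arithmetic, which the short sketch below reproduces. The helper names are hypothetical and serve only to make the bookkeeping explicit.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def cycles_per_day(cycle_minutes):
    """Number of simulated diurnal cycles that fit into one real day."""
    return SECONDS_PER_DAY / (cycle_minutes * 60)


def samples_per_cycle(cycle_minutes, interval_s):
    """Number of measurements recorded during one temperature cycle."""
    return cycle_minutes * 60 / interval_s


def sampling_frequency_hz(interval_s):
    """Sampling frequency corresponding to a measurement interval."""
    return 1.0 / interval_s
```

A 90 min cycle gives `cycles_per_day(90)` equal to 16, and at 10 s intervals each accelerated cycle is described by `samples_per_cycle(90, 10)` = 540 measurements. For load case B, one real diurnal cycle of 1440 min sampled every 60 s yields 1440 measurements, so a single cycle is resolved at a comparable order of magnitude in both load cases.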

The two possible boundary conditions at the right support are combined with the two temperature load cases to form three different structural scenarios as listed in Table 1. The three scenarios are required to close the loop of simulated scenarios, i.e., a fixed constraint at the right support, a roller constraint at the right support and back to fixed constraint at the right support. This permits the research to evaluate if the truss reverts to the original configuration upon completion of the experimental tests. Measurements for scenarios X and Y are collected for approximately 12 days each while those for scenario Z are collected for 96 days.

Table 1 List of structural scenarios as determined by load and boundary conditions

In addition to varying thermal loading and boundary conditions, damage is simulated by removing bolts from joints. This approach to simulating damage is more realistic than abruptly reducing the stiffness of a truss element by cutting out material, as performed in previous studies [19, 31]. In real-life bridges, damage is typically more localized and more subtle than a significant loss of material in a load-bearing structural element. Furthermore, removing bolts offers the opportunity to evaluate the sensitivity of SSM to damage by gradually increasing the damage severity, which is similar to the incremental progression of damage in real structures. Each act of either changing a joint configuration or altering the boundary condition is termed an event, which should ideally be recognized by the TB-MI approach as an anomaly or change in structural performance. The list of events is provided in Table 2 along with relevant details such as the joint that is affected, the number of removed bolts and the corresponding structural scenario. For example, damage event #3 refers to load scenario X when joints J−1 and J−2 are damaged and eight bolts in total are removed. Measurements are collected from the truss for the three structural scenarios listed in Table 1. The level of damage is gradually increased for each scenario by increasing the number of damaged joints and the total number of removed bolts. Before switching to a different scenario, the structure is repaired by replacing all the removed bolts. Five damage events are considered in load scenarios X and Y; three damage events are considered in load scenario Z. Events #5 and #12 correspond to the structure being repaired. However, the structure is unlikely to revert to its original state because of the manner in which the bolts are tightened. In real-life structures, bolts connecting structural elements are often tightened with a torque wrench, so that a prescribed force is applied to each bolt. Bolted connections in the test-bed, however, are tightened manually without measuring the applied torque. Thus the stiffness differs from connection to connection, and a given connection may not regain its original stiffness when its bolts are put back.

Table 2 List of events with details of the events and the corresponding loading and boundary condition scenarios

4.3 Measurement time-histories

Measurements are collected from the strain gauges and thermocouples for the scenarios listed in Table 1. Figure 7 shows plots of temperature and strain measurements from strain gauge S−3 and a thermocouple located in the vicinity of S−3. Temperatures and strains measured during a simulated diurnal cycle are also shown in zoomed-in views in Fig. 7 (right). Their patterns resemble those of measurements collected over 1 day from full-scale structures. The duration of each structural scenario is given in Fig. 7 (top). The amplitude of strains increases when longitudinal translations are allowed (scenario Y). Effects of ambient temperature variations are not evident during scenarios X and Y since the effects of the accelerated diurnal cycles are superimposed on them. When the heaters are turned off in scenario Z, ambient effects drive the response and the high-frequency patterns due to the heaters disappear.

Fig. 7

Time-histories of temperatures at the bottom chord (top) and strains measured with S−3 (bottom), with zoomed-in views of a simulated diurnal cycle (right) around the time of damage event #1

4.4 Prediction of thermal response

The RBTRP methodology is employed to generate regression models for all three scenarios. Application of the methodology to scenario X is first illustrated. Results for scenarios Y and Z are provided subsequently. The reference period for scenario X is composed of the first 25,920 measurements (equivalent to 3 days of monitoring). Outliers and noise are removed with IQR and smoothing techniques, respectively.
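The pre-processing step mentioned above can be illustrated with a minimal Python sketch. The source only states that IQR and smoothing techniques are used; the fence multiplier `k = 1.5`, the strategy of holding the last accepted value, the centred moving average and its window length are all assumptions made here for illustration, and the function names are hypothetical.

```python
from statistics import mean, quantiles


def remove_outliers_iqr(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with the last accepted value.

    k = 1.5 and the hold-last-value strategy are assumptions, not taken from the paper.
    The output has the same length as the input so time alignment is preserved.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    in_range = [lo <= v <= hi for v in values]
    # Start with the first accepted value so that leading outliers can be filled too.
    fill = next(v for v, ok in zip(values, in_range) if ok)
    cleaned = []
    for v, ok in zip(values, in_range):
        if ok:
            fill = v
        cleaned.append(fill)
    return cleaned


def moving_average(values, window=5):
    """Centred moving average; the window is truncated at both ends of the series."""
    half = window // 2
    return [mean(values[max(0, i - half): i + half + 1]) for i in range(len(values))]


strains = [10.0, 10.2, 10.1, 55.0, 10.3, 10.2, 10.4, 10.1]  # made-up microstrain values
cleaned = remove_outliers_iqr(strains)
smoothed = moving_average(cleaned, window=3)
```

In this toy series the spike of 55.0 falls outside the upper fence and is replaced by the preceding accepted reading before smoothing is applied.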

Two different approaches for the selection of training and test periods are investigated.

  1. Training method 1 (TM1): starting from the first measurement in the reference period, one half of measurements of the reference period is chosen as the training set and the other half is selected to test the accuracy of regression models. To be more precise, in TM1, measurements taken during the first 2 days of the reference period form the training set and the rest form the test set.

  2. Training method 2 (TM2): both the training and test set, although mutually exclusive, are composed of measurements spread over the entire reference period.
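The difference between the two training methods can be sketched as index splits over the reference period. This is a hypothetical illustration: the source does not say how TM2 distributes measurements between the two sets, so the alternating-block scheme and the block length of 100 samples are assumptions. The counts follow from the 10 s measurement interval (8,640 measurements per day).

```python
SAMPLES_PER_DAY = 24 * 60 * 60 // 10          # 10 s intervals -> 8,640 per day
REFERENCE_LENGTH = 3 * SAMPLES_PER_DAY        # 25,920 measurements (3 days)


def split_tm1(n_samples, n_train):
    """TM1-style split: one contiguous training block followed by the test block."""
    return list(range(n_train)), list(range(n_train, n_samples))


def split_tm2(n_samples, block=100):
    """TM2-style split (assumed scheme): alternate fixed-size blocks between the
    training and test sets so that both span the entire reference period."""
    train, test = [], []
    for i in range(n_samples):
        (train if (i // block) % 2 == 0 else test).append(i)
    return train, test


# TM1: first 2 days for training, remainder for testing.
tm1_train, tm1_test = split_tm1(REFERENCE_LENGTH, 2 * SAMPLES_PER_DAY)
# TM2: mutually exclusive sets interleaved over all 3 days.
tm2_train, tm2_test = split_tm2(REFERENCE_LENGTH)
```

With TM1 the test set only contains conditions that occur after training ends, whereas with the interleaved split both sets sample the full range of conditions in the reference period.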

This study then systematically evaluates the effect of the following three parameters of the RBTRP methodology, which can affect the performance of the regression models generated for response prediction.

  1. Down-sampling of measurements

  2. Thermal inertia parameter j

  3. Number of principal components (PC) chosen after transforming input temperatures into PC space.

After selection of the reference period, down-sampling of the measurements is recommended to avoid over-training regression models. This study finds the optimal down-sampling rate, defined as the sampling frequency above which negligible improvements are observed in response prediction accuracy. For this purpose, the sampling frequency is increased iteratively by a factor of two, starting from 1 measurement every 10,240 s (\(1 \times 10^{-4}\) Hz) and ending at 1 measurement every 20 s (0.05 Hz). Down-sampled temperature measurements are transformed to PC space. The minimum number of PCs required to achieve a reasonably high prediction accuracy is evaluated, and the chosen PCs are input to regression models within the RBTRP methodology. The influence of the thermal inertia parameter j on model predictions is also investigated for all scenarios. For scenarios X and Y, the thermal inertia parameter is not considered essential due to the small scale of the structure and the minimal influence of ambient conditions. However, for scenario Z, ambient temperature variations drive deformations in the structure, and hence thermal inertia is expected to be a factor. As stated before, the chosen PCs corresponding to the current measurement time-step and to the j time-steps prior to the current one are given as input to the regression model to account for thermal inertia effects.
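The frequency sweep described above can be written out as a short Python sketch. It assumes the sweep is implemented by halving the sampling interval from 10,240 s down to 20 s and by keeping every n-th measurement of the 10 s record; the function names are hypothetical.

```python
def sampling_intervals(coarsest_s=10240, finest_s=20):
    """Sampling intervals in seconds, halved at each step (frequency doubled)."""
    intervals = []
    t = coarsest_s
    while t >= finest_s:
        intervals.append(t)
        t //= 2
    return intervals


def downsample(series, base_interval_s, target_interval_s):
    """Keep every n-th sample, where n is the ratio of target to base interval."""
    step = target_interval_s // base_interval_s
    return series[::step]


intervals = sampling_intervals()
frequencies_hz = [1.0 / t for t in intervals]

# Example: the 3-day reference period recorded at 10 s, reduced to one sample per 2560 s.
reference = list(range(25920))
reduced = downsample(reference, base_interval_s=10, target_interval_s=2560)
```

Under these assumptions the sweep contains ten intervals (10,240 s, 5,120 s, ..., 20 s). Intervals of 2,560 s, 320 s and 160 s correspond to roughly \(4 \times 10^{-4}\), \(3.1 \times 10^{-3}\) and \(6.3 \times 10^{-3}\) Hz, respectively.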

4.4.1 Results for TM1: scenario X

The RBTRP methodology is first evaluated for scenario X using TM1. As stated before, results are presented only for SVR models since SVR has been observed to produce robust and accurate models in previous studies [22].

In order to understand the influence of the thermal inertia parameter j on the performance of regression models, values for j are increased gradually starting from j = 0. However, only a minimal improvement in prediction accuracy is observed for scenario X. The average RMSE for scenario X is the lowest for j = 1; for this value, the error reduces by 1.5 % when evaluated in terms of the range of measured strains. Setting j = 1 implies that measurements from the current as well as the previous time step are included as input to the regression model for thermal response prediction.
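To make the role of j concrete, the following sketch assembles regression inputs from PC scores at the current time step and the j preceding time steps. It is an illustrative construction, not the authors' implementation; `lagged_inputs` and the toy scores are hypothetical.

```python
def lagged_inputs(pc_scores, j):
    """Build one input row per time step t >= j.

    pc_scores: list of rows, one per time step, each holding the chosen PC scores.
    Each output row is [scores(t), scores(t-1), ..., scores(t-j)] concatenated.
    """
    rows = []
    for t in range(j, len(pc_scores)):
        row = []
        for lag in range(j + 1):
            row.extend(pc_scores[t - lag])
        rows.append(row)
    return rows


# Toy example with two PCs and four time steps (made-up values).
pcs = [[1.0, 0.1], [2.0, 0.2], [3.0, 0.3], [4.0, 0.4]]
inputs_j0 = lagged_inputs(pcs, 0)   # current time step only
inputs_j1 = lagged_inputs(pcs, 1)   # current and previous time step
```

Because the first j time steps have no complete history, the measured response used as the regression target would have to be trimmed in the same way (dropping its first j values) to stay aligned with the input rows.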

The influence of down-sampling on prediction performance is also evaluated by varying the sampling frequency as described in the previous section. The results are given in Table 3. The average RMSE is minimum when the sampling frequency is equal to \(4 \times 10^{-4}\) Hz.

Table 3 Average RMSE of the predictions of the regression models obtained using various down-sampling frequencies and training methods

The study then evaluates the influence of the number of PCs on prediction accuracy. Prediction errors of the regression models generated with increasing numbers of PCs are plotted in Fig. 8. The figure shows clearly that approximately one third of the PCs (i.e., 10 out of 31 in this case) are sufficient to produce accurate response predictions. However, the actual optimal number varies according to sensor location. For example, the optimal numbers of PCs to accurately predict response at the locations of sensors S−3 and S−7 are 10 and 11, respectively (see Fig. 8). Note also that the prediction error does not decrease monotonically with increasing number of PCs, especially for sensor S−3.

Fig. 8

Prediction error and the number of PCs for sensor locations S−3 (left) and S−7 (right), scenario X, TM1, sampling frequency \(4 \times 10^{-4}\) Hz

Figure 9 shows prediction error signals corresponding to sensors S−3, S−4 and S−7 generated using an SVR model that is trained using TM1 with the thermal inertia parameter set to 1 and a measurement sampling frequency of \(4 \times 10^{-4}\) Hz. The number of PCs used varies between 10 and 14. The PE signal corresponding to a sensor is hereafter referred to simply as PE followed by the sensor name. For example, PE S−3 refers to the prediction error signal corresponding to sensor S−3. Figure 9 shows that the amplitudes of PE S−3 and PE S−4 change abruptly when the right support is released at event #6 (see Table 2). Similarly, PE S−3 shifts immediately after event #4 (see Table 2). These abrupt changes can be indicators of anomalous structural behaviour. Such shifts, however, are not discernible at the time of other events in any of the PE signals.

Fig. 9

PE S−3, PE S−4 and PE S−7 generated using training method TM1. Numbers in boxes represent damage events

PE S−3 and PE S−4 (see Fig. 9), which correspond to sensors located on the bottom chord, drift slightly towards a new mean after the end of the training period. This is attributed to prediction errors increasing as ambient temperatures reach values that were not encountered during the training period. However, PE S−7 (see Fig. 9), which corresponds to a sensor on the top chord, does not show such a drift after the training period. This is probably because the top chord is free to expand and contract owing to a lack of restraint in the longitudinal direction (X axis), and because it is exposed to higher temperature variations than the bottom chord, given that ambient effects are comparatively low during this scenario.

4.4.2 Results for TM2: scenario X

The performance of the RBTRP methodology is now evaluated using training method TM2. As with the previous case, results are presented only for SVR-based regression models. The optimal measurement sampling frequency is evaluated as \(3.1 \times 10^{-3}\) Hz (see Table 3). The optimal value for the thermal inertia parameter j is 1 and the improvement in prediction accuracy is 1.5 %, similar to that for TM1. Only results obtained for a sampling frequency of \(3.1 \times 10^{-3}\) Hz and j = 1 are illustrated for this scenario, although other values offer similar results with only a marginal change in prediction accuracy.

The prediction error decreases gradually as the number of PCs is increased (see Fig. 10). A significant drop in the prediction error can be observed for sensor locations measuring large strains (sensors installed on the top chord) when the number of PCs is increased from 1 to 3. The prediction error reduces only marginally when the number of PCs exceeds 12 (see Fig. 10); hence, twelve PCs are chosen as the optimal input to the regression models. In contrast to the results obtained using training method TM1, the prediction errors decrease monotonically with increasing number of PCs.

Fig. 10

Prediction error and the number of PCs for sensor locations S−3 (left) and S−7 (right), scenario X, TM2, sampling frequency \(4 \times 10^{-4}\) Hz

PE S−3, PE S−4 and PE S−7 computed using TM2 are plotted in Fig. 11. The mean RMSE of predictions at all sensor locations is lower than that observed using TM1 (see Table 3). The main reason for the comparatively smaller RMSE is that the training set encompasses measurements that are spread over the whole reference period. Since the full range of peak-to-peak temperatures is included in the training period, the resulting statistical models are more robust, as they predict response only for conditions on which they have been trained. These results illustrate the importance of selecting a training set that encompasses the expected variability in the data sets. They also show that model predictions are unlikely to match structural behaviour during extreme weather events, such as a record hot summer, since the models have not been trained on data taken during such environmental conditions.

Fig. 11

PE S−3, PE S−4 and PE S−7 generated using TM2

A gradual decrease in PE signals corresponding to sensors installed on the bottom chord is no longer observed around measurement #20,000, when the training period used for TM1 ended (compare PE S−3 and PE S−4 in Fig. 9 with those in Fig. 11). PE signals corresponding to the sensors installed on the top chord (S−6 to S−9) remain fairly stationary until the time of event #4. PE S−7, which closely resembles the PE signals from the other sensors, is plotted as an example in Fig. 11.

While the prediction error is low, PE S−3, PE S−4 and the PE signals for other sensors on the bottom chord show low-amplitude periodic patterns after the reference period (and before event #1) that appear to correspond to diurnal temperature changes. This phenomenon is common to PE signals obtained using both model training methods; however, it is more noticeable in PE signals generated using TM2, possibly due to the improved prediction of the effects of diurnal variations simulated using the heaters. The predictions cannot fully account for ambient temperature changes since the training period, which lasts only three days, is too short to capture the full variability in ambient changes.

4.4.3 Results for scenario Y

Training method TM2 is chosen over TM1 to generate regression models for scenarios Y and Z since it has been shown to produce better performance for scenario X. Regression models for scenario Y are generated using the same approach as for scenario X. The optimal measurement down-sampling frequency is determined as \(6.3 \times 10^{-3}\) Hz. The mean RMSE of predictions is close to 1 % of the strain range for the reference period, indicating that the models predict the response accurately. For illustration purposes, PE S−3, PE S−5 and PE S−10 are plotted in Fig. 12. As for scenario X, periodic distortions in the signals corresponding to diurnal temperature changes can be discerned after the reference period. This phenomenon is common to all PE signals. The only event that is visually detectable from the PE signals is event #11, as observed from the plot of PE S−5. This is probably due to sensor S−5 being close to the joint (J−4) that is damaged in event #11. None of the other events can be detected simply by visually examining the PE signals.
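The error measure quoted above, RMSE expressed as a percentage of the measured strain range, can be computed as in the following sketch. The function names and the sample values are hypothetical; the normalisation by the peak-to-peak range of the measured response follows the description in the text.

```python
from math import sqrt


def rmse(measured, predicted):
    """Root-mean-square error between measured and predicted response."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def rmse_percent_of_range(measured, predicted):
    """RMSE normalised by the peak-to-peak range of the measured response, in percent."""
    span = max(measured) - min(measured)
    return 100.0 * rmse(measured, predicted) / span


# Made-up microstrain values: every prediction is off by exactly 1 over a range of 100.
measured = [0.0, 50.0, 100.0, 50.0]
predicted = [1.0, 49.0, 101.0, 49.0]
error_pct = rmse_percent_of_range(measured, predicted)
```

Normalising by the strain range makes errors comparable between sensors that experience very different strain amplitudes, for example those on the top and bottom chords.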

Fig. 12

PE S−3, PE S−5 and PE S−10 generated using TM2

4.4.4 Results for scenario Z

In scenario Z, only ambient temperature is applied, i.e., temperature load case B. The length of the reference period is 55 days during which there are almost 80,000 measurement time steps. The length of the reference period has been chosen to ensure that it covers peak-to-peak temperature variations. The optimal values for thermal inertia parameter j and measurement sampling frequency are determined. A value for j between 10 and 15 is observed to be optimal depending on the sensor location. The optimal measurement sampling frequency is evaluated to be \(5.2 \times 10^{-3}\) Hz. This translates to approximately one measurement per hour or 24 measurements per day. This can be useful guidance when setting sampling rates for measurement systems. A minimum sampling frequency of one measurement per hour may be required to capture thermal inertia effects. The number of PCs given as input to regression models is between 14 and 21, depending upon the sensor for which the model is constructed.

Time-histories of temperatures measured with a thermocouple installed on the bottom chord are plotted in Fig. 13 (top) together with PE S−3 and PE S−4 [Fig. 13 (bottom)]. The regression models predict the structural response accurately, as is evident from the low prediction errors in the plots of PE S−3 and PE S−4. While events #13 and #14 do not appear to affect PE S−3 and PE S−4, both signals show significant deviations from their previously observed stationarity after event #15. However, during the same period, the ambient temperature in the structures laboratory, where the truss is being monitored, also deviates from previously observed patterns due to abnormally cold weather. Therefore, it is difficult to state reliably whether the deviations in PE S−3 and PE S−4 are due to event #15 or to abnormal temperature changes (Fig. 7). Subsequent discussion on anomaly detection will cover this aspect.

Fig. 13

Time-history of temperature measured near sensors S−3 and S−4 (top) and PE S−3 and PE S−4 generated using TM2 (bottom)

4.5 Anomaly detection from PE signals

In this section, PE signals from experimental scenarios are examined for anomalies. Clusters of PE signals are analysed with SSM. Scenario X serves as a demonstrator in this study. Scenarios Y and Z are used to emphasize the robustness of the TB-MI approach.

The first step after gathering PE signals is to prepare them for anomaly detection via signal processing. This would normally involve both smoothing and outlier removal. Prior to smoothing, the PE signals are examined visually. While the signals as plotted in Figs. 11, 12 and 13 seem to be noisy, upon closer examination (see Fig. 14) they are seen to be fairly continuous. Figure 14 plots a zoomed-in view of PE S−3 obtained for scenario X. The plot, which includes 1000 data points, contains no visible outliers and little noise, and appears much smoother than the zoomed-out view of PE S−3 in Fig. 11. The author attributes these high-frequency patterns to temperature variations from both ambient conditions and the simulated thermal cycles. Such patterns are observed in all PE signals. Thus no pre-processing of PE signals is required prior to anomaly detection. A PE signal is computed from the differences between predicted and measured response. As long as the inputs to the regression model and the measured response are both treated for outliers and noise, the PE signal will also be free of outliers and noise, and hence will not require pre-processing.

Fig. 14

A zoomed-in view of PE S−3 obtained for scenario X shortly after the reference period

4.5.1 Scenario X

Subtracted signals derived from combinations of PE signals corresponding to sensors located away from damage exhibit no or negligible deviations from their baseline conditions. Therefore, in this paper, anomaly detection is demonstrated using only a few subtracted signals that are generated from PE signals of sensors located closer to the damaged elements. Figure 15 displays four subtracted signals that reflect the structural behaviour of the truss. Each signal refers to a combination of two PE signals, e.g., subtracted signal TS3S4 refers to a combination of PE S−3 and PE S−4 [see Eq. (2)]. Combinations that include PE S−3 indicate a sudden change at event #4 (see TS3S4 in Fig. 15). Subtracted signals that include PE S−4 depart gradually from the confidence interval after each subsequent event, starting from event #1. Examples illustrating this behaviour are given in Fig. 15. Subtracted signals corresponding to a combination of sensors on elements of the truss that are not spatially close to the location of damage also diverge from their respective baseline conditions. This can be seen in Fig. 15 for TS4S8, which combines PE S−4 and PE S−8 corresponding to sensors S−4 and S−8. For event #3, joint J−2 of the truss, which is not directly linked to the elements having sensors S−8 and S−9, is damaged. This event is detectable from TS4S8, which begins to depart from the confidence interval. At event #5, when the structure is repaired, the signals do not revert to their previous patterns. When boundary conditions are changed (event #6), subtracted signals shift abruptly. From the above, one can conclude that SSM can detect changes in structural behaviour, and that the sensor locations of the PE signals used to generate the SSM signal may also help in revealing the location of damage.
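The pairing and naming of subtracted signals can be illustrated with a short Python sketch. Eq. (2) is not reproduced in this section, so the plain element-wise difference used below is an assumption (the actual definition may, for instance, scale the PE signals before subtraction). The function name and the toy PE values are hypothetical.

```python
from itertools import combinations


def subtracted_signals(pe_signals):
    """Form one subtracted signal per pair of PE signals.

    pe_signals: dict mapping a sensor label such as 'S3' to its PE time series.
    Returns a dict keyed like 'TS3S4'. The plain difference PE_a - PE_b is an
    assumed stand-in for the paper's Eq. (2).
    """
    labels = sorted(pe_signals, key=lambda s: int(s[1:]))  # numeric order: S3 < S4 < S10
    result = {}
    for a, b in combinations(labels, 2):
        result[f"T{a}{b}"] = [x - y for x, y in zip(pe_signals[a], pe_signals[b])]
    return result


# Made-up PE values for three sensors.
pe = {"S4": [0.5, 0.5], "S3": [1.0, 2.0], "S8": [0.0, 1.0]}
ts = subtracted_signals(pe)
```

For n sensors this yields n(n − 1)/2 subtracted signals, which is why only the few signals involving sensors close to the damaged elements are discussed in the text.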

Fig. 15

Subtracted signals TS3S4, TS4S5, TS4S8 and TS4S9 generated with SSM

4.5.2 Scenario Y

In scenario Y, the restraint at the right support (in Fig. 5) limiting free translation in longitudinal direction is removed by modifying the boundary condition to simulate a roller support. Therefore, the range of strains in the bottom chord increases (see Fig. 7).

All combinations of two PE signals are derived using SSM. Subtracted signals that are most affected by the events happening on the truss are discussed below. In particular, subtracted signals computed from PE signals corresponding to sensors on the bottom chord such as TS1S3 and TS2S3, and subtracted signals corresponding to sensor S−5 such as TS5S7 and TS3S5 are shown in Fig. 16. The computed signals show the following:

  • A gradual drift after event #9, and

  • An abrupt shift at the occurrence of event #11.

Fig. 16

Subtracted signals TS1S3, TS2S3, TS3S5 and TS5S7 for scenario Y

The sensors S−2 and S−3 are in close proximity to joint J−3, which is directly affected by event #9. Therefore, subtracted signals TS1S3 and TS2S3 (see Fig. 16), which are generated by combining PE S−3 with PE S−1 and PE S−2 respectively, depart noticeably from the confidence interval after event #9. At event #11, five bolts are removed from joint J−4, which is close to sensor S−5. This event can be detected as a shift in signal patterns when analysing subtracted signals which include PE S−5 (see TS3S5 and TS5S7 in Fig. 16).

Compared to scenario X, where the boundary conditions are set to prevent translation, very few of the events in scenario Y are detectable by SSM. This is attributed to the fact that mechanically induced response (i.e., strains that cause stress) is less prevalent when free thermal movements are allowed at the roller support. This can be explained by the two-dimensional nature of the structure and its small scale, with diagonal members lacking in axial stiffness. In practice, bridges, even those with roller supports, will experience thermal stresses due to temperature distributions varying across all three dimensions [24, 25]. Consequently, stiffness loss at a joint, as simulated in this study, is likely to affect correlations between strains in different elements across the joint in such structures.

4.5.3 Scenario Z

In scenario Z, the truss is exposed only to ambient temperature variations. The first 55 days (79,200 measurements) form the reference period. The first 20 days (28,800 measurements) of the reference period form the training period and the rest are used to derive the confidence interval. On the 75th, 77th and 79th day (events #13, #14 and #15) of scenario Z, two bolts are removed from the splice joint J−3 in the bottom chord. Results from analysing the PE signals using SSM are described below.
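The split of the reference period and the use of a confidence interval can be sketched as follows. The measurement counts follow from the 1 min sampling interval of load case B. How the confidence interval is defined is not stated in this section, so the band of mean ± 3 standard deviations and the requirement of 10 consecutive exceedances are assumptions made only for illustration; all names are hypothetical.

```python
from statistics import mean, stdev

SAMPLES_PER_DAY = 24 * 60                  # 1 min intervals in load case B
TRAIN_END = 20 * SAMPLES_PER_DAY           # 28,800: end of training period
REFERENCE_END = 55 * SAMPLES_PER_DAY       # 79,200: end of reference period


def confidence_interval(signal, start, end, k=3.0):
    """Band of mean +/- k standard deviations over signal[start:end] (assumed form)."""
    baseline = signal[start:end]
    mu, sigma = mean(baseline), stdev(baseline)
    return mu - k * sigma, mu + k * sigma


def first_sustained_exceedance(signal, lower, upper, start, run=10):
    """Index where the first run of `run` consecutive out-of-band samples begins."""
    count = 0
    for i in range(start, len(signal)):
        if signal[i] < lower or signal[i] > upper:
            count += 1
            if count == run:
                return i - run + 1
        else:
            count = 0
    return None


# Toy subtracted signal: 100 baseline samples followed by a sustained shift.
toy = [0.0, 1.0] * 50 + [5.0] * 20
lower, upper = confidence_interval(toy, 0, 100)
onset = first_sustained_exceedance(toy, lower, upper, start=100)
```

For scenario Z the band would be derived from `signal[TRAIN_END:REFERENCE_END]`, i.e., from the 35 days of the reference period that are not used for training.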

Subtracted signals computed from PE signals corresponding to sensors on the bottom chord are discussed as these are the most likely to be affected by the events for this scenario. Subtracted signals are stable during the reference period and do not indicate the occurrence of event #13. However, they do show a gradual shift after event #14. Specifically, subtracted signals derived from PE signals corresponding to sensors S−2 and S−3, which are located closer to the damaged joint, deviate from the confidence interval. TS2S3, TS2S4 and TS3S4 are plotted in Fig. 17 to illustrate the above.

Fig. 17

Subtracted signals TS2S3, TS2S4 and TS3S4 for scenario Z

5 Summary and conclusions

This paper presents a temperature-based measurement interpretation (TB-MI) approach for the online detection of anomalous structural behaviour from distributed measurements obtained through continuous monitoring of structures. The TB-MI approach consists of two phases. The first phase is a learning phase where regression models are trained to predict structural response from distributed temperatures. Regression models are generated using the regression-based thermal response prediction (RBTRP) methodology [22]. The second phase relates to real-time application wherein residuals computed from predicted and measured response are analysed using SSM, a novel anomaly detection technique introduced in this paper. The performance of the TB-MI approach is studied on measurements from a laboratory truss, from which the following conclusions are drawn:

  1. The experimental setup consisting of the truss with its sensors and the arrangement of infrared heating lamps has enabled a realistic simulation of diurnal temperature variations and damage scenarios to investigate the TB-MI approach.

  2. Down-sampling of temperature measurements, when carried out up to a limit, only marginally affects the prediction accuracy of the regression model while notably reducing the time for model training. Results from the study support using a minimum measurement frequency of 24 measurements per day (\(5.2 \times 10^{-3}\) Hz) to capture thermal effects in real-life structures.

  3. As a rule of thumb, the PCs that cover nearly 99.9 % of the variance in measurements are sufficient to achieve good response predictions. For example, the optimal number of PCs for the laboratory truss is between 11 and 13 although the truss is equipped with 31 thermocouples.

  4. Anomaly events can be detected when PE signals are examined with SSM. SSM can offer support for determining the location of the event, and thereby help in diagnosing the cause of the change in structural performance.

  5. Having a roller support at one end of the truss releases longitudinal movements and thereby eliminates thermal stresses in the test-bed. This boundary condition, which prevents detection of anomaly events, may not however be realistic as real-life structures have thermal stresses even in the presence of boundary conditions that allow free longitudinal translation.
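The rule of thumb in conclusion 3 can be expressed as a small selection routine: retain the smallest number of PCs whose cumulative share of the total variance reaches the target. This is a minimal sketch; the function name and the variance values are hypothetical and are not results from the study.

```python
def n_components_for_variance(variances, target=0.999):
    """Smallest number of PCs whose cumulative variance share reaches `target`.

    variances: variance explained by each PC (e.g., eigenvalues of the
    covariance matrix of the temperature measurements).
    """
    total = sum(variances)
    running = 0.0
    for k, v in enumerate(sorted(variances, reverse=True), start=1):
        running += v
        if running / total >= target:
            return k
    return len(variances)


# Made-up variances for five PCs: cumulative shares are 95 %, 99 %, 99.95 %, ...
example_variances = [95.0, 4.0, 0.95, 0.04, 0.01]
n_selected = n_components_for_variance(example_variances)
```

In this toy case three PCs are retained, since the first two cover only 99 % of the variance.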

While the proposed TB-MI approach has shown promising results, further research into anomaly detection is required to scale it up to practical applications. This will include comparing the performance of SSM to other anomaly detection techniques, and investigating application of the TB-MI approach to measurements from full-scale structures. Further work will also focus on defining flexible thresholds that account for long-term shifts in response due to expected and normal changes in material behaviour such as from creep.