Introduction

Monitoring the vibration of structures is a fundamental and critical operation, because this type of testing is required for both safety and maintenance purposes. In the case of bridges, the need to periodically assess the structure stability and runability makes this activity particularly relevant. In recent years, vision-based measuring systems for the estimation of bridge vibrations have been developed and applied. They exploit different image processing algorithms to track structurally significant features through a sequence of images extracted from the acquired video, and thus estimate their displacement time histories.

The known advantages of camera-based measurement devices, which make them attractive for bridge monitoring, include: remote contactless monitoring (i.e. no critical problems in setting up the measurement system, especially in the case of bridges with frequent train crossings); the possibility to perform a multipoint measurement with a single camera; quick and simple set-up of the measurement device (no transducers or cables on the structure); and no need to access critical areas of the bridge (especially below the bridge to fix transducers). These aspects are even more attractive in the case of bridges crossing rivers or deep valleys.

An additional advantage of using a camera for displacement measurements is that every row or column in the pixel matrix can be considered a sensor on its own: the complete camera therefore corresponds to a very high number of parallel sensors, allowing for distributed sensing or giving the chance to exploit this redundancy to improve measurement reliability. For this reason, cameras are usually referred to as ‘dense’ sensors. However, cameras show the common problems of relative displacement transducers and have a limited frequency bandwidth.

The main goal of this work is to define the performance of vision-based displacement measurement techniques in terms of sensitivity, resolution and uncertainty in the harsh environment of field measurement, where it might strongly deviate from the behaviour observed during laboratory testing.

Cameras grab images of a portion of the bridge and therefore make it possible to measure the motion of several points of the structure; however, as the field of view increases, the pixel-to-millimetre ratio becomes lower (i.e. the measuring resolution becomes poorer) and the uncertainty of the displacement estimation increases. The definition of the field of view is therefore a compromise between measurement accuracy and the possibility to monitor a large portion of the bridge.

In this paper two different types of cameras are used to monitor the response of a bridge to a train pass-by. The acquired images are analyzed using three different image processing techniques (Pattern Matching, Edge Detection and Digital Image Correlation) and the results are compared with a reference measurement provided by a single-point laser interferometer, which is adopted as the reference for the qualification of the vision-based measuring systems. Tests with strongly different zoom levels are shown and the corresponding uncertainty values are estimated.

State of the Art

Thanks to developments in digital cameras, to the growth of computer processing capabilities and to new image processing libraries, the use of vision-based systems for measuring target vibration has become popular in recent years, with particular focus on the vibration monitoring of civil structures [1, 2], where traditional techniques, based on accelerometers, lasers or LVDTs, can be improved or replaced by vision devices [3].

Image-based measurements are interesting in the case of very long structures, such as bridges [4, 5], where the use of contact displacement sensors is more cumbersome because of the need for a fixed reference point close to the structure under test.

Dynamic measurements by means of cameras can be performed by grabbing a video of the moving object. Knowing the grabbing frequency of the device, it is possible to link every frame to a position of the object at a fixed time. Every frame can be analysed by an image processing algorithm in order to identify the object position in the image sensor matrix. If the grabbing frequency satisfies the sampling theorem (the acquisition frequency must be at least twice the maximum frequency of the acquired signal) and the exposure time is set to a value short enough to avoid motion blur, the measurement can be considered correct and its uncertainty will depend on the environmental conditions, the image scaling factor and the computational uncertainty of the specific image processing algorithm used for the analysis [6, 7].
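The two acquisition conditions recalled above can be checked with a few lines of arithmetic. The following Python sketch is only illustrative: the function names are hypothetical, and the target speed and scaling factor used in the example are assumed values, not measured ones. It computes the highest signal frequency compatible with a given frame rate and the blur, in pixels, produced by a target moving during the exposure.

```python
def nyquist_frequency_hz(frame_rate_fps):
    """Highest signal frequency that a given frame rate can describe."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return frame_rate_fps / 2.0


def motion_blur_px(speed_mm_s, exposure_s, scale_mm_per_px):
    """Image-plane smear (px) of a target moving at constant speed
    during one exposure, for a given mm/px scaling factor."""
    if scale_mm_per_px <= 0:
        raise ValueError("scaling factor must be positive")
    return speed_mm_s * exposure_s / scale_mm_per_px


# Frame rates of 17 fps and 25 fps, as used later in this paper.
for fps in (17.0, 25.0):
    print(f"{fps:4.0f} fps -> max frequency {nyquist_frequency_hz(fps):.1f} Hz")

# Assumed values: target speed 10 mm/s, scaling factor 1 mm/px,
# exposure time 0.750 ms.
print(f"blur = {motion_blur_px(10.0, 0.750e-3, 1.0):.4f} px")
```

A blur much smaller than the sub-pixel resolution of the processing algorithm suggests that the exposure time is adequate for the expected target speed.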

One of the issues of this kind of measurement is the target used for the measurement itself: in most cases planar black and white targets are attached to the structure in order to improve the reliability of the measurement technique and to reduce uncertainty [8–12]. In other circumstances, the natural texture of the structure under investigation can be used for the measurement [8–10]. The main advantages of the markerless case are that the set-up preparation is much faster and that direct access to the structure to be monitored is not necessary: no permissions and no target maintenance are required. An alternative approach to the identification of features inside the image is to study the motion blur itself. Motion-blurred-image-based methods use a long exposure time to acquire motion-blurred images and estimate the vibrational parameters from the motion blur information. However, these methods usually need some prior information about the motion function, for example the type of motion law or the vibration frequency [13–15].

In [8] and [16] bridge vibration measurement is carried out using fit-for-purpose targets fixed to the structure; in this case the target is made of one black circle on a white background, while in [9] a planar target with four circles is used. In [10] cross-shaped targets are used and the viewing system is equipped with an additional reference system, which decreases the sensitivity of the camera support to ambient vibrations. Two types of targets are used in [11]: ring-shaped and random ones; in this case multiple targets are measured simultaneously with a single camera, allowing multi-point measurements. In some applications active targets are used; for example in [5] LEDs are used for suspension bridge vibration monitoring. Markerless solutions are proposed in the literature to monitor cable vibrations in cable-stayed bridges [17], overhead power transmission lines [18] and bridge structures. In [19] vibration measurements obtained through image acquisition and processing are used to develop a modal analysis of a simple structure.

The choice of the pattern type is related to the algorithm used to analyse the acquired images in order to estimate the motion. It is possible to use state-of-the art algorithms, such as pattern matching [11], edge detection [18], digital image correlation [20, 21] or optical flow [22]. Nevertheless some authors developed and proposed software tools specifically optimised to recover the structure motion relying on image sequences; for example in [23] and [24] the point reconstruction rules are used, while a robust object search algorithm is proposed in [25].

Depending on the type of structure being monitored, a two-dimensional displacement measurement might be enough, while in some circumstances it is necessary to estimate the full 3D motion of the structure. In the case of 3D measurements, the stereoscopic approach is usually adopted [23–25], because techniques that require structured light projection are hardly applicable in outdoor conditions. However, there are some disadvantages in using a stereoscopic approach for structural monitoring: the required hardware and software are usually expensive and, most of all, the measurement volume is limited to the intersection of the fields of view of the two cameras. Moreover, the advantage of using a three-dimensional measurement is low, because the out-of-plane displacements are usually negligible with respect to the in-plane displacement. Since in civil structure monitoring the camera-target distance is usually much larger than the target displacement, possible variations of the mm/px scale factor are negligible. If a target of known geometry is available on the structure, it is also possible to recover the six-degree-of-freedom motion of the target itself using pose estimation techniques [26, 27], where a single camera is used. These techniques are widely applied in close-range measurement [28, 29], where the size of the target is non-negligible with respect to the camera-target distance, while in structural vibration monitoring, where the camera-target distance is usually much larger than the target size, the application of pose estimation is more challenging [30].

In this work different types of image processing algorithms are used (pattern matching, edge detection and digital image correlation), and different types of target are used accordingly, as detailed in the next section. In this way each processing technique is applied to the most suitable target type, selected according to the criteria underlined in the scientific literature mentioned above. The goal of this analysis is to obtain a benchmark of the different measuring solutions and to analyse the uncertainty as the image scaling factor changes.

Design of the Vision-Based Measurement System and Testing Layout

In vision-based bridge monitoring a compromise must be found between two needs: on one side a wide field of view (FOV) can theoretically capture the whole dynamic deformed shape; on the other side, as the FOV increases, the image resolution in terms of px/mm decreases accordingly (i.e. the mm/px scaling factor grows). If the wish to capture the complete deformed shape of the bridge pushes towards a wide view, there is a serious risk that this choice degrades the resolution, so that the peak-to-peak vibration amplitude is confined within a few pixels, thus worsening the signal-to-noise ratio. For this reason, in the past, most dynamic measurements have been performed on small portions of the structure, trying to increase contrast, developing fit-for-purpose targets to be fixed to the structure, staying rather close to the target (or using a high zoom level) and assuming the motion of the small structure portion in the field of view to be rigid, relying on the redundancy offered by the grabbed images to increase the reliability of the results. A single image makes it possible to measure the motion of multiple targets; therefore, in the case of rigid motion, averaging the displacements estimated from different targets yields a more robust motion estimation.

The steps taken in this paper aim at evaluating whether it is reasonably possible to move from a few measurement targets to dense measurements along a bridge span by using cameras. Once its limits are established, this method can be really attractive, as it works with no sensors and no power supply on the bridge, and with a simple approach manageable by any operator. This allows more frequent checks (since the measurement setup can be easily assembled and disassembled), dramatically changing the approach to maintenance and testing.
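To make this compromise concrete, the following minimal Python sketch relates the framed width to the mm/px scaling factor and to the number of pixels spanned by a given peak-to-peak displacement. It assumes a frontal view with a uniform scale over the image; the function names and the numerical values (framed widths, a 1920 px wide sensor, an 8 mm peak-to-peak displacement) are illustrative assumptions.

```python
def scale_mm_per_px(framed_width_mm, sensor_width_px):
    """Scaling factor (mm/px) when framed_width_mm fills the sensor width."""
    if framed_width_mm <= 0 or sensor_width_px <= 0:
        raise ValueError("framed width and pixel count must be positive")
    return framed_width_mm / sensor_width_px


def amplitude_in_pixels(peak_to_peak_mm, scale):
    """Number of pixels spanned by a peak-to-peak displacement."""
    if scale <= 0:
        raise ValueError("scaling factor must be positive")
    return peak_to_peak_mm / scale


SENSOR_WIDTH_PX = 1920   # assumed sensor width
PEAK_TO_PEAK_MM = 8.0    # assumed peak-to-peak displacement

for framed_width_m in (1.0, 5.0, 15.0):  # assumed framed widths
    s = scale_mm_per_px(framed_width_m * 1000.0, SENSOR_WIDTH_PX)
    a = amplitude_in_pixels(PEAK_TO_PEAK_MM, s)
    print(f"{framed_width_m:5.1f} m framed -> {s:.2f} mm/px, "
          f"displacement spans {a:.2f} px")
```

With these assumed numbers the same displacement shrinks from several pixels to about one pixel as the framed width grows, which is the loss of signal-to-noise ratio described above.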

The chosen test bed is a 50 m long steel trussed railway bridge crossing a river in northern Italy (Fig. 1(a)).

Fig. 1 (a) Tested structure; (b) laser interferometer at mid span; (c) camera framing the bridge

Trains run on this bridge at low speed during the whole day, approximately every 30 min. In this work, vision-based measurements are exploited in order to quantify the bridge sag during the train pass-by: videos of the structure during several train pass-bys have been acquired and analysed in order to estimate the vertical displacement at different points along the bridge main span. The videos are acquired by means of two digital cameras (one is a consumer camcorder, Canon Legria HF 21, 1920 × 1080 px, 25 fps, Fig. 1(c), and the other is a professional digital camera, the Allied GX3300, 3296 × 2472 px, 17 fps) to compare the performance of the two types of hardware in this kind of measurement.

The upper and the lower horizontal beams of the structure are connected by 13 vertical and 14 inclined trusses on each side of the bridge (see Fig. 2, where a schematic sketch of the structure is reported). These trusses represent the points where the vertical displacement of the bridge will be measured. As described in the next sections, three different set-up approaches are adopted: “frontal” measurements with targets fixed to the structure side, “in-axis” measurements and “frontal” measurements without targets. Table 1 summarizes the test conditions and the applied vision techniques.

Fig. 2 Measuring layout

Table 1 Tests conditions

The first tests employ, on every vertical truss, two targets fixed onto the structure, both printed on A4 sheets (210 × 297 mm). The first target (Fig. 3(a)) is composed of a pair of round blobs and is used both to fix the scaling factor (i.e. mm to px conversion) and to measure the displacement using pattern matching. The local scaling factor is computed for every measuring point in every clip, knowing the physical centre-to-centre distance of the two circles in millimetres and estimating, by means of standard blob analysis, their distance (in px) in the first acquired frame of every video. The second target (two black lines, Fig. 3(b)) is analysed by an edge detection algorithm. Only data extracted from the evaluation of the tilted line edge are presented in this work: edge detection algorithms, in fact, have proven to work better when the edge is not aligned to the grid of the camera sensor [31, 32]. In order to provide a low-uncertainty reference for these measurements, a Polytec Scanning Vibrometer PSV300 laser interferometer (resolution = 2.56 μm) has been placed at the bridge mid span (Fig. 1(b)) and acquired with National Instruments hardware at a 1 kHz sampling frequency. The interferometer can be assumed to be placed in correspondence with target 7 in Fig. 2 and it is used as a reference to compute the measurement uncertainty associated with the vision-based approach. By varying the portion of the bridge framed by the camera (i.e. the zoom level and the distance between the camera and the structure), it is possible to study the link between the scaling factor and the related measurement uncertainty.
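The computation of the local scaling factor from the two-blob target can be sketched as follows. This is a minimal illustration in plain Python, not the blob analysis tool actually used for the tests: the threshold, the regions of interest, the synthetic image and the 150 mm centre-to-centre distance are all assumed values, and the function names are hypothetical.

```python
import math


def dark_blob_centroid(image, threshold, row_range, col_range):
    """Centroid (row, col) of the pixels darker than `threshold`
    inside a rectangular region of interest."""
    r_sum = c_sum = count = 0
    for r in range(*row_range):
        for c in range(*col_range):
            if image[r][c] < threshold:
                r_sum += r
                c_sum += c
                count += 1
    if count == 0:
        raise ValueError("no dark pixel found in the region of interest")
    return r_sum / count, c_sum / count


def local_scale_mm_per_px(centre_a, centre_b, distance_mm):
    """mm/px factor from two blob centres and their known physical distance."""
    d_px = math.hypot(centre_a[0] - centre_b[0], centre_a[1] - centre_b[1])
    if d_px == 0:
        raise ValueError("blob centres coincide")
    return distance_mm / d_px


# Synthetic first frame: white background with two 3 x 3 dark blobs
# centred at (row 4, col 5) and (row 14, col 5).
image = [[255] * 10 for _ in range(20)]
for centre_row in (4, 14):
    for r in range(centre_row - 1, centre_row + 2):
        for c in range(4, 7):
            image[r][c] = 0

upper = dark_blob_centroid(image, 128, (0, 10), (0, 10))
lower = dark_blob_centroid(image, 128, (10, 20), (0, 10))
scale = local_scale_mm_per_px(upper, lower, 150.0)  # assumed 150 mm spacing
print(f"centres {upper} and {lower} -> {scale:.2f} mm/px")
```

Because the factor is computed per target and per clip, any error in the estimated centre-to-centre distance propagates directly into the converted displacement, which is one of the calibration contributions to the uncertainty.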

Fig. 3 Targets fixed on the bridge: (a) for pattern matching and (b) for edge detection analyses

In the second part of the work, a different measurement layout is proposed and validated, where the acquisition device grabs images of the structure along its longitudinal axis (Fig. 4(a)). In this case too, measurements rely on high-contrast patterns fixed on the structure. The speckle pattern commonly used in digital image correlation measurements [33] is exploited (Fig. 4(b)) in order to enhance the average intensity gradient of the pattern, a parameter that has been proved to be strictly linked to the uncertainty of the matching process [34, 35]. A border of blobs allows the scaling factors to be computed for the given configuration. It has been proved that the measurement performance of digital image correlation depends on the target pattern quality [36]. The object displacement is evaluated by taking a subset (a part of the image) of the first frame and searching for it in the other frames of the video, so as to maximize a given similarity function. The displacement result, assigned to the centre point of the subset, is an average of the displacements of the pixels inside the subset. The uniqueness of each subset is only guaranteed if the surface has a non-repetitive, isotropic, high-contrast pattern. This uniqueness also reduces the measurement uncertainty of other image processing algorithms that work on image pixel areas [13], such as pattern matching techniques applied to speckle pattern targets.
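The subset search described above can be illustrated with a minimal integer-pixel sketch in plain Python. It is not the algorithm implemented in the commercial software used for the tests, which also provides sub-pixel results; here the similarity function is assumed to be the zero-normalised cross-correlation (ZNCC), and the speckle image, the subset size and the search range are synthetic, hypothetical choices.

```python
import math
import random


def zncc(a, b):
    """Zero-normalised cross-correlation of two equally sized pixel lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(sum((x - mean_a) ** 2 for x in a)
                    * sum((y - mean_b) ** 2 for y in b))
    return num / den if den > 0 else 0.0


def extract(image, top, left, size):
    """Flatten the size x size subset whose top-left corner is (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


def track_subset(reference, current, top, left, size, search):
    """Integer shift (d_row, d_col) that maximises the ZNCC between the
    reference subset and the current frame, within +/- `search` pixels."""
    template = extract(reference, top, left, size)
    rows, cols = len(current), len(current[0])
    best_score, best_dr, best_dc = -2.0, 0, 0
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            t, l = top + dr, left + dc
            if t < 0 or l < 0 or t + size > rows or l + size > cols:
                continue  # candidate subset falls outside the image
            score = zncc(template, extract(current, t, l, size))
            if score > best_score:
                best_score, best_dr, best_dc = score, dr, dc
    return best_dr, best_dc, best_score


# Synthetic speckle pattern, then the same pattern shifted by
# 2 rows and 1 column (pixels with no source are set to mid-grey).
rng = random.Random(0)
ref = [[rng.randint(0, 255) for _ in range(24)] for _ in range(24)]
cur = [[ref[r - 2][c - 1] if r >= 2 and c >= 1 else 128
        for c in range(24)] for r in range(24)]

d_row, d_col, score = track_subset(ref, cur, top=8, left=8, size=8, search=3)
print(f"estimated shift: {d_row} rows, {d_col} columns (ZNCC = {score:.3f})")
```

With a repetitive or low-contrast pattern several candidate positions would give similar correlation values, which is why the uniqueness of the subset matters for the uncertainty of the match.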

Fig. 4 In-axis test: measurement layout

In detail, four targets are fixed on the bridge, at points 3, 5, 7 and 9 of the previously presented layout (Fig. 4), while the camera is fixed under the bridge, close to one end of the structure. This camera position makes it possible to measure the displacement of different points along the entire bridge span without the need for a wide field of view (preserving a good mm/px scaling factor). The targets should be parallel to the sensor plane to ensure that the millimetre-to-pixel scaling factor is the same over the whole pattern surface. The supports that connect each target to the bridge were designed to facilitate the correct mounting of the target itself, minimizing the risk of possible rotations. Moreover it should be noted that, since all the targets are more than 10 m away from the camera, possible errors of a few degrees in the target orientation produce a negligible variation of the target-to-camera distance and therefore do not affect the scaling factor appreciably. A wire potentiometer, which is a common and rather cheap displacement transducer, has been placed alongside the laser vibrometer.

Finally, the third part of the work uses the same measurement setup as the first approach, but investigates the targetless measurement approach.
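The order of magnitude of this orientation effect can be estimated with a short calculation. The sketch below assumes a simple pinhole model, in which the mm/px factor is proportional to the camera-target distance, and considers only the depth change of the target edge caused by a small tilt; the function name and the numerical values (a half-size of 0.15 m, a 3° tilt, a 10 m distance) are illustrative assumptions.

```python
import math


def relative_scale_change(target_half_size_m, tilt_deg, distance_m):
    """Relative change of the mm/px factor at the edge of a target tilted
    by tilt_deg, assuming the factor is proportional to the distance."""
    if distance_m <= 0:
        raise ValueError("distance must be positive")
    depth_shift_m = target_half_size_m * math.sin(math.radians(tilt_deg))
    return depth_shift_m / distance_m


# Assumed values: half-size 0.15 m, tilt 3 degrees, distance 10 m.
change = relative_scale_change(0.15, 3.0, 10.0)
print(f"relative scale change at the target edge: {100 * change:.3f} %")
```

With these assumed values the change stays below 0.1 %, which supports treating the scaling factor as constant over the target.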

As a final consideration about the measurement setup, the software tools used for the analysis and their performance are listed below:

  • DIC: Vic-2D from Correlated Solutions, Inc. The metrological performance of DIC is usually studied by estimating bias and random effects [33, 37]; the magnitude of both effects is of the order of a few hundredths of a pixel, but it depends on several parameters, above all the pattern characteristics and the image quality (contrast and noise).

  • Edge detection: LabVIEW 2010 edge detection tool. The tool estimates the edge position with sub-pixel resolution. The edge detection uncertainty is strongly related to the image quality, as underlined in [31].

  • Pattern matching: LabVIEW 2010 pattern matching tool. The tool estimates the pattern position with sub-pixel resolution, also allowing for the search of scaled and rotated patterns. The resolution of the measurement is below 0.01 px, but the uncertainty strongly depends on the image acquisition conditions as well as on the processing parameters (above all the subset size).

As underlined above, the metrological performance of these techniques strongly depends on the experimental and processing conditions; for this reason an uncertainty analysis is performed in the present paper.

Experiments

In this section the test results are reported. The first series of tests aims at verifying the measurement performance in frontal view, i.e. with the camera placed at the bridge side, with the optical axis normal to the train motion and pointing at several targets fixed onto the steel structure (Fig. 1(c)). The second series of tests concerns an unconventional measurement setup that uses a perspective view of the bridge in order to acquire the sag of the entire span with one single video, thus compressing the longitudinal axis along the bridge span. The third series of tests is carried out to verify the performance of the camera when it is used to detect the bridge displacement without any target fixed to the structure, relying only on the natural texture of the structure. It must be noted that these tests are performed under uncontrolled environmental conditions and are applied to the monitoring of a civil structure, where the image analysis must deal with the specific requirements of this practical application. In comparison with other similar works on real structures [16, 38], this one is focused on quantifying the uncertainty of these techniques when applied under uncontrolled environmental conditions. In detail, the results are presented in terms of uncertainty as a function of the image scaling factor, of the relative position between the camera and the bridge (frontal or perspective) and, finally, of the image processing algorithm (edge detection, pattern matching and digital image correlation).

Frontal Measurements with Targets Fixed to the Structure Side

These measurements are intended to qualify the results obtained by image analysis when the camera is placed in front of the bridge side, normal to its main axis, at mid-span. As explained in the previous section, several targets, one every 3.65 m, are fixed to the structure (Fig. 3). The tests are performed with both the GX and the Canon cameras described in the previous section, at the maximum image resolution available for each camera, in order to guarantee the widest measurement field. For this reason the frame rates are fixed to 17 fps for the GX camera and 25 fps for the Canon camera, which are the highest values at full image resolution. The corresponding maximum detectable displacement frequencies (8.5 Hz and 12.5 Hz, i.e. half the frame rate) are sufficient to describe the bridge displacement due to the train pass-by, as confirmed by the interferometer data acquired with a sampling frequency of 1 kHz. Moreover, the GX exposure time (the only one settable) is fixed to 0.750 ms in order to avoid motion blur in each frame. The test specifications are summarized in Tables 2 and 3: for every test, the local scaling factor (mm/px) is given for each target acquired by the camera. During test 1 the cameras are near the structure and the zoom is set so that only one target is enclosed in the field of view. Under these conditions the highest px/mm ratio is achieved, but for just one measurement point. All the further tests are made by increasing the number of targets in the field of view, so that a wider description of the vertical displacement of the bridge deck can be obtained from a single video (the real gain of having a single camera vs. many sensors). However, increasing the number of targets has the drawback of decreasing the px/mm ratio, which worsens the measurement accuracy. For this reason, only results obtained from tests with no more than 5 targets framed by the Canon camera and 4 targets by the GX camera are considered, because a further increase in the mm/px scaling factor leads to unacceptable measurement uncertainty.

Table 2 Canon frontal tests: framed targets and local scaling factor (mm/px)
Table 3 GX frontal tests: framed targets and local scaling factor (mm/px)

The aim of these tests is to evaluate how the accuracy of the estimated displacements is affected by the image scaling factor and consequently by the width of the field of view. This evaluation is carried out on a real structure and under uncontrolled environmental conditions, which is the situation in which the whole process is expected to operate for structural monitoring purposes. For every target the pattern matching technique is applied to the two blobs (Fig. 3(a)), whereas the edge detection is applied to the tilted line and a mean value of its position at every frame is extracted (Fig. 3(b)). In this way it is possible to estimate the target displacement as a function of time.

Figure 5(a) shows the results obtained by means of the Canon camera for test 1 (measurement point #7), where the x-axis is time and the y-axis gives the vertical displacement at mid span. The vertical displacement of the bridge due to the train crossing, which is up to 8 mm, can be appreciated. The black line identifies the laser displacement measurement, which can be considered the reference for the camera qualification. The light and dark grey lines represent the displacements estimated with the edge detection and the pattern matching techniques, respectively. The figure shows a good agreement among the data obtained with the three techniques. A direct comparison in the time domain between the displacement measured by the camera (the edge and the pattern analysis) and the reference (the laser) is given in Figure 5(b), where the discrepancies between the camera and the laser measurements are shown: the maximum discrepancy is about 0.4 mm in absolute value.
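As an illustration of how an edge position can be located with sub-pixel resolution and converted into a displacement, the following plain-Python sketch uses the centroid of the intensity gradient along a one-dimensional profile crossing the edge. This is a generic technique chosen for the example, not the algorithm of the edge detection tool used in the tests; the intensity profiles and the 2 mm/px scaling factor are assumed values.

```python
def subpixel_edge_position(profile):
    """Edge location along a 1D intensity profile, computed as the
    centroid of the absolute intensity gradient (in pixel units)."""
    grads = [abs(profile[i + 1] - profile[i]) for i in range(len(profile) - 1)]
    total = sum(grads)
    if total == 0:
        raise ValueError("flat profile: no edge to locate")
    # Gradient i is located halfway between samples i and i + 1.
    return sum((i + 0.5) * g for i, g in enumerate(grads)) / total


def displacement_mm(profile_first, profile_current, scale_mm_per_px):
    """Edge displacement between two frames, converted to millimetres."""
    shift_px = (subpixel_edge_position(profile_current)
                - subpixel_edge_position(profile_first))
    return shift_px * scale_mm_per_px


# Assumed profiles across a bright-to-dark edge in two frames: in the
# second frame the edge has moved by half a pixel.
first_frame = [200, 200, 200, 40, 40, 40]
later_frame = [200, 200, 200, 120, 40, 40]

print(subpixel_edge_position(first_frame))             # edge position in px
print(displacement_mm(first_frame, later_frame, 2.0))  # assumed 2 mm/px
```

Repeating the calculation on several profiles along the line and averaging the results, as done for the tilted line of the target, reduces the effect of image noise on the estimated position.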

Fig. 5 Canon camera: (a) displacements measured at point 7; (b) discrepancies between camera and laser; (c) spectra amplitudes

In Fig. 5(c) the spectra amplitudes (1–12 Hz) of the bridge sag obtained by the camera based measurements are compared with the reference interferometer. Both edge detection and pattern matching are able to provide a reliable estimation of the dynamic displacement of the structure: the peaks are identified at similar frequencies and with close amplitudes.
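A single-sided amplitude spectrum of a displacement record, of the kind compared here, can be obtained with a discrete Fourier transform. The sketch below is a minimal plain-Python illustration under assumed conditions (a synthetic 2 Hz sine of 3 mm amplitude sampled at 25 fps for 2 s, no windowing); it is not the processing chain used for the figure, and the function name is hypothetical.

```python
import cmath
import math


def amplitude_spectrum(samples, sampling_rate_hz):
    """Single-sided amplitude spectrum of a real signal by direct DFT.
    The mean value is removed before the transform."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]
    freqs, amps = [], []
    for k in range(n // 2 + 1):
        acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
        # Bins other than DC and Nyquist collect half of the energy each
        # from positive and negative frequencies, hence the factor 2.
        factor = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
        freqs.append(k * sampling_rate_hz / n)
        amps.append(factor * abs(acc) / n)
    return freqs, amps


# Synthetic record: 3 mm amplitude at 2 Hz, 25 fps, 50 frames (2 s).
fs = 25.0
signal = [3.0 * math.sin(2 * math.pi * 2.0 * i / fs) for i in range(50)]
freqs, amps = amplitude_spectrum(signal, fs)
peak = max(range(len(amps)), key=amps.__getitem__)
print(f"peak at {freqs[peak]:.1f} Hz, amplitude {amps[peak]:.2f} mm")
```

The frequency resolution is the inverse of the record length, so the short records available during a train pass-by limit how finely close peaks can be separated.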

The same evaluations are made for the other tests, but a direct comparison with the reference signal is possible only for target #7 (Fig. 2), because no reference transducers are available for the other bridge sections considered for vision-based measurements. However, it should be noted that the camera is mounted with the optical axis almost normal to the bridge; therefore, in each test, the scaling factor is nearly the same for all the targets, and the validation of the measurements at point #7 can reasonably be extended to the other targets. In order to give a complete description of what can be measured with a single camera, all the estimated displacements of test 5 (Table 2) taken by the Canon camera are shown together in Fig. 6(a). In this case, data are obtained by means of the pattern matching algorithm applied to all five targets. The results give a description of the bridge sag at different positions along the deck.

Fig. 6 Canon camera test 5: (a) displacements measured from point 03 to point 07 applying pattern matching algorithm; (b) spectra amplitudes of the bridge sag computed at point 07

In Fig. 6(b) the spectra amplitudes of test 5 are shown and a higher noise level can be qualitatively noticed in correspondence with target 07, as a consequence of the lower px/mm scaling factor. The same conclusion can be drawn by comparing the signal spectra estimated by the camera with that measured by the interferometer at point #7 (Fig. 6(b)): an overall slight overestimation of the structure vibrations can be noticed in all the harmonic components characterized by low vibration levels. This is a consequence of the lower signal-to-noise ratio of the camera-based measurements with respect to the reference interferometer, due to the poor mm/px values.

These results are confirmed by those obtained from the GX camera under equivalent test conditions. Figure 7(a) shows the bridge vertical displacement due to a train transit. As in Fig. 5, the black line identifies the laser displacement measurement, which can be considered the reference for the camera data validation. The light and dark grey lines represent the displacements estimated with the edge detection and pattern matching techniques, respectively. The estimated bridge sag (about 7–8 mm) is similar to the data shown in Fig. 5 (note that Figs. 5 and 7 correspond to two different trains crossing the bridge). Finally, Fig. 7(b) shows the GX estimation of the bridge sag at different positions along the deck, in an approach similar to that of Fig. 6(a), which relates to the Canon camera.

Fig. 7 GX camera: (a) displacements measured at point 7 of test 1; (b) displacements measured from point 04 to point 07 of test 4

As regards the measurement validation, Fig. 8 summarizes the results obtained with the Canon camera images at position #7; the root mean square (RMS) value of the discrepancy Δ = (yref − ycamera) between the reference signal (yref) and the camera-based displacement estimation (ycamera) is shown. The evaluation has been performed only for the time record corresponding to the train crossing the bridge. The values are plotted as a function of the image scaling factor in terms of the mm/px ratio. The data quantify the trend of the measuring uncertainty with respect to the scaling factor. The results, as expected, show an increasing trend as the mm/px ratio increases (i.e. as the field of view of the camera increases).

Fig. 8 RMS of the discrepancy of target p07 as a function of resolution

The results obtained by the image processing techniques seem to point out that the measurement uncertainty is mainly affected by the resolution value, even in the presence of uncontrolled environmental conditions. Indeed, Fig. 9 shows the RMS dynamic component (the standard deviation) and the RMS static component (the mean value), which are related to the RMS value by the formula:

$$ \mathrm{RMS}=\sqrt{\frac{1}{N}\sum_{i=1}^{N}\Delta_i^2}=\sqrt{\mu(\Delta)^2+\sigma(\Delta)^2} $$

where μ(Δ) and σ(Δ) are respectively the mean and the standard deviation of the discrepancies between the camera and the reference signal for a chosen target, and N is the number of samples. The results are shown for both the Canon and GX cameras.

Fig. 9 Mean and standard deviation of the discrepancy of target p07 as a function of resolution: (a) Canon camera; (b) GX camera
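The decomposition of the RMS discrepancy into its mean and standard deviation components can be checked numerically with a few lines of Python. The sketch assumes that the reference and camera records have already been synchronised and resampled on the same time base, and that σ is the population standard deviation (division by N); the function name and the sample values are hypothetical.

```python
import math


def discrepancy_stats(reference, camera):
    """RMS, mean and (population) standard deviation of the discrepancy
    between a reference record and a camera-based record."""
    if len(reference) != len(camera) or not reference:
        raise ValueError("records must be non-empty and of equal length")
    deltas = [r - c for r, c in zip(reference, camera)]
    n = len(deltas)
    mean = sum(deltas) / n
    std = math.sqrt(sum((d - mean) ** 2 for d in deltas) / n)
    rms = math.sqrt(sum(d * d for d in deltas) / n)
    return rms, mean, std


# Assumed displacement samples in mm, for illustration only.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_cam = [0.5, 2.5, 2.0, 4.5]

rms, mean, std = discrepancy_stats(y_ref, y_cam)
print(f"RMS = {rms:.4f} mm, mean = {mean:.4f} mm, std = {std:.4f} mm")
print(f"sqrt(mean^2 + std^2) = {math.hypot(mean, std):.4f} mm")
```

The identity holds exactly only with the population standard deviation; with the sample standard deviation (division by N − 1) the two sides differ slightly for short records.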

The analysis of the mean and standard deviation trends helps to understand the RMS behaviour in Fig. 8. The standard deviation of the Canon camera shows a clear increasing trend up to the 5 mm/px resolution value, whereas the mean takes random values between −0.20 mm and 0.20 mm. The GX camera shows the same behaviour: an increasing trend as a function of the available resolution values and a random mean value between −0.05 mm and 0.20 mm. This means that the uncertainty linked to the dynamic component of the bridge vibrations is strongly affected by the image scaling factor, whereas the uncertainty of the static measurement of the displacement due to the train mass depends on uncontrolled bias errors (uncertainty in the mm/px ratio, relative position between the bridge and the camera, lighting conditions, etc.). If the mm/px scaling factor increases, the physical target dimensions are represented by fewer pixels in the image and consequently the algorithms (edge detection and pattern matching) have fewer points for the target displacement estimation. This is the reason why the measurement uncertainty increases with the scaling factor value.

Finally, Fig. 10 shows the standard deviations of the discrepancies between the interferometer and the vision data in the case of no train on the bridge. In this case the bridge is in static condition and the relative displacement measured by the cameras, with respect to the mean position detected in the first thirty frames used as reference, should be zero. Indeed, if there is no train, the bridge should be almost still. The variation of the camera measurement, estimated in several acquisitions at different distances and expressed in terms of standard deviation, can be linked to the measurement uncertainty. The purpose is to analyse the uncertainty under this condition and to compare it with the measuring uncertainty obtained during the train transit. As can be seen in Fig. 10, the values and the trend of the standard deviation match well those of Fig. 9, both for the Canon and the GX camera. The conclusion is that the measurement uncertainty does not depend upon the vibration condition of the bridge; on the contrary, it is only due to the image scaling factor and possibly to errors in the calibration process. It is therefore possible to estimate the uncertainty of camera-based measurements by simply framing the unloaded structure, without a real need for a calibrated external reference.
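The estimation of the uncertainty from a record of the unloaded structure can be sketched as follows. The Python example is a minimal illustration under assumed conditions: the tracked positions are synthetic, the 2 mm/px scaling factor is hypothetical, and the standard deviation is computed over the whole record after referring the positions to the mean of the first thirty frames.

```python
import math


def static_uncertainty_mm(positions_px, scale_mm_per_px, n_reference=30):
    """Mean offset and standard deviation (mm) of the apparent displacement
    of a still target, referred to the mean of the first frames."""
    if len(positions_px) < n_reference:
        raise ValueError("record shorter than the reference window")
    reference = sum(positions_px[:n_reference]) / n_reference
    disp = [(p - reference) * scale_mm_per_px for p in positions_px]
    mean = sum(disp) / len(disp)
    std = math.sqrt(sum((d - mean) ** 2 for d in disp) / len(disp))
    return mean, std


# Synthetic record of a still target: the detected position toggles
# between 100.0 px and 100.2 px because of image noise (assumed values).
positions = [100.0, 100.2] * 30

offset, sigma = static_uncertainty_mm(positions, scale_mm_per_px=2.0)
print(f"mean offset = {offset:.3f} mm, standard deviation = {sigma:.3f} mm")
```

Since the same pixel-level scatter is multiplied by the scaling factor, the standard deviation in millimetres grows with the mm/px ratio, consistently with the trend discussed above.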

Fig. 10 Standard deviation of the discrepancy in correspondence of no train transit: (a) Canon camera; (b) GX camera

In-Axis Measurements with Target Fixed to the Structure

The results presented in the previous paragraph are related to what has been named the “frontal” approach in the measurement setup: the camera sensor is in a plane parallel to the bridge, with the optical axis of the lens almost normal to the measurement surface.

It has been shown and quantified that, as expected, the measurement uncertainty is strongly related to the resulting scaling factor: as the portion of the structure framed by the camera gets wider, the millimetre to pixel ratio increases and consequently the performance of the vision-based measurement system worsens, leading to an unacceptable signal-to-noise ratio when a large number of points needs to be tracked.

This is peculiar to the presented frontal approach: the two main directions of the camera sensor matrix investigate two phenomena characterized by totally different orders of magnitude. Along the horizontal direction (bridge axis) a large portion of the span needs to be framed, whereas along the vertical direction a sag of a few millimetres has to be measured. This has been done maintaining the same scaling factor in both directions.

In order to keep the measurement uncertainty to acceptable values also in case many points need to be tracked, a different approach is designed and tested.

The targets are fixed, as in the frontal approach, at the most significant sections of the bridge, but the camera is placed with the optical axis almost parallel to the longitudinal direction of the bridge (Figs. 4 and 11), with a small angle between these two directions: the bridge span is now observed in a perspective view. In this way, the distance between the targets (21.9 m between target 3 and target 9) is compressed by the perspective, allowing all these targets to be framed in a single image while preserving a limited mm/px ratio and therefore a limited measuring uncertainty. A comparison between the frontal and the in-axis measuring conditions can be made by considering Table 1, test number 5, where the mm/px ratio is close to 9.2 with a field of view of around 14.5 m, whereas the bridge portion framed in the in-axis setup is around 22 m with a mm/px ratio between 0.35 and 1.7.
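
In the perspective view each target has its own scaling factor because it lies at a different distance from the camera. A minimal sketch of this dependence is given below under a pinhole-camera approximation (target distance much larger than the focal length). The function name, the focal length, the pixel pitch and the distances are hypothetical values chosen for illustration; they are not the parameters of the cameras used in the tests.

```python
def scaling_factor_mm_per_px(distance_m, focal_length_mm, pixel_pitch_um):
    """Pinhole-model size of one pixel projected at the target distance, in mm/px."""
    distance_mm = distance_m * 1000.0
    pixel_pitch_mm = pixel_pitch_um / 1000.0
    return pixel_pitch_mm * distance_mm / focal_length_mm

# Hypothetical setup: 100 mm lens, 5 um pixel pitch, targets at increasing distances
for distance in (10.0, 20.0, 30.0):
    ratio = scaling_factor_mm_per_px(distance, focal_length_mm=100.0, pixel_pitch_um=5.0)
    print(f"{distance:5.1f} m -> {ratio:.2f} mm/px")
```

Under this approximation the mm/px ratio grows linearly with the camera-to-target distance, so the farthest target is the one with the coarsest resolution.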

Fig. 11

In axis test: an image acquired by the camera along with the scaling factor of every framed target

In this test only the Canon camera is used, but both digital image correlation and pattern matching analyses are performed on the acquired data. Since the two image processing techniques give very similar results, only the digital image correlation data will be shown. Indeed, the discrepancy between the two techniques at target #7 has a maximum of 0.030 px and a standard deviation of 0.016 px (0.017 mm and 0.009 mm respectively). A wire potentiometer, commonly used in bridge sag measurement tests, has been placed next to the laser vibrometer.

In Fig. 11 an image acquired by the camera is shown along with the target numbering and the respective scaling factors (Table 4).

Table 4 Scaling factors of the in-axis tests targets

It can be noticed that the millimetre to pixel ratio of target number 7, where the reference transducer is placed, is comparable to that presented earlier in the single target frontal test, so comparable results in terms of measurement uncertainty are expected, at least for that point.

In Fig. 12(a) the vertical sag of the bridge measured by the three different measurement systems at point 7 is shown. First of all, the wire potentiometer appears unfit to serve as a reference: probably due to some internal friction or to a long wire, this sensor (which has no filters) follows with a certain delay with respect to the other two. This is confirmed by looking at the differences between the two displacement sensors, assuming the laser vibrometer as the reference. This is shown in Fig. 12(b), where the discrepancy reaches a peak of nearly 1 mm on an 8 mm displacement. The data obtained from images perform better than the potentiometer: the standard deviation of the discrepancy curve is about 0.12 mm, and is therefore fully compatible with the one derived from the single framed pattern of the previous section (Fig. 5).

Fig. 12

(a) Displacements measured at point 7 and (b) discrepancies with respect to the laser vibrometer

Frontal measurements, described in the previous section, show how the measurement uncertainty depends on the image scaling factor expressed in terms of mm/px ratio. The importance of the image scaling factor is demonstrated in Fig. 10, where the measurement standard deviation of a still target increases as the mm/px ratio increases. The same evaluation is proposed here for the in-axis tests, where the targets are framed at different distances. The standard deviation is evaluated for each of the four targets, using the data obtained from six clips of the bridge without train transits. As can be seen in Fig. 11, target 7, located at the reference laser, is well focused, whereas the other targets (9, 5 and 3) are out of focus with different blurring levels. The effect of blurring on the estimated displacement has to be analysed. In the previous section the influence of focus was not investigated because the targets were all at the same distance from the camera. The depth of field of a camera is strongly related to the acquisition system characteristics (in particular the lens aperture, the focal length and the camera-to-target distance). If these three parameters are known, it is possible to estimate the depth of field (see for example [39]). However, the experimental results shown in Fig. 13 prove that the standard deviation expressed in pixels does not change significantly as the camera-to-target distance changes, even if the target falls outside the region of good focus. In other words, perfect target focusing is not a requirement to minimize the measuring uncertainty. In Fig. 13 the standard deviation is also expressed in millimetres: here, on the contrary, the trend is very clear and shows that the uncertainty increases as the scaling factor (in mm/px) increases. This conclusion is in complete agreement with the results obtained for frontal measurements.
In the end, the in-axis tests establish that the scaling factor is the main parameter affecting the measurement uncertainty, even under different focusing conditions.
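
As an illustration of how the depth of field can be estimated from the lens aperture, the focal length and the camera-to-target distance, a minimal sketch based on the standard thin-lens hyperfocal formulas is given below. It is not necessarily the method of [39]. The function name, the circle of confusion (0.03 mm) and the numerical values in the example are assumptions and do not describe the cameras used in the tests.

```python
def depth_of_field_mm(focal_length_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Near and far limits of acceptable focus (thin-lens approximation), in mm."""
    f = focal_length_mm
    s = focus_distance_mm
    # Hyperfocal distance for the assumed circle of confusion
    hyperfocal = f * f / (f_number * coc_mm) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2.0 * f)
    if s >= hyperfocal:
        far = float("inf")             # everything beyond the near limit is in focus
    else:
        far = s * (hyperfocal - f) / (hyperfocal - s)
    return near, far

# Hypothetical example: 50 mm lens at f/8 focused at 5 m
near, far = depth_of_field_mm(50.0, 8.0, 5000.0)
print(f"in focus from {near / 1000:.2f} m to {far / 1000:.2f} m")
```

Targets whose distance falls outside the returned interval are expected to appear blurred, as happens for targets 9, 5 and 3 in Fig. 11.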

Fig. 13

Standard deviation evaluated by in-axis tests (expressed in pixel and millimetres)

The results show that the “in-axis” approach performs better than the frontal tests and, at the same time, allows four different points to be tracked thanks to the compression of the framed area along the horizontal direction (Fig. 14(a)). Moreover, they establish that the measurement uncertainty is independent of the vision technique used for the image processing: similar uncertainty levels are estimated with pattern matching, edge detection or digital image correlation, if the scaling factor is the same.

Fig. 14

Vertical (a) and transversal (b) displacements measured from point 9, 7, 5 and 3 applying pattern matching algorithm in the in-axis test

The main drawbacks of the in-axis measurement procedure are the following: different measurement uncertainties are associated with different points along the structure (although these can be easily quantified by framing the nominally still structure, as previously demonstrated); the targets belong to different focus planes and consequently show different blurring levels; access to the bridge is still required to fix the targets.

Targetless Analysis

The analyses presented in the previous paragraphs were based on the presence of fit-to-the-purpose targets at the measuring points; this approach presents certain advantages but it requires access to the structure before the test, which cannot always be guaranteed in real applications. Vision-based displacement techniques can work even without any target, relying on the natural texture of the structure. The performance of camera-based displacement measurements in targetless conditions is analysed in this paragraph. As in the previous case, the evaluation is performed under uncontrolled environmental conditions, as measurements are carried out on a real bridge.

Targets on the measurement points present two main advantages: on one hand, a framed known geometry makes the computation of the mm/px scaling factor straightforward; on the other hand, it is possible to design sharp edges and high contrasted patterns, basically increasing the signal to noise ratio of the measurement process.

In order to make an unbiased comparison with the analyses performed using targets, only the second aspect (i.e. the possibility to obtain sharp and high-contrast targets) will be investigated, maintaining the previously computed scaling factor.

The comparison has been performed only close to point #7 (Fig. 15), where the laser interferometer gets the reference bridge sag; in this test the camera grabs bridge images with a scaling factor of 9.25 mm/px.

Fig. 15

Target-less analyses: first frame of the movie and tracked features

Two different features have been extracted and tracked in the video (Fig. 15). The first is a horizontal stripe due to the presence of a white tape stuck to the bridge. This feature represents a less intrusive pattern with respect to that analysed in the previous paragraph and it has been tracked with both the edge detection and the pattern matching algorithms. The extracted pattern dimension is 18 × 83 px (corresponding to 166 × 768.5 mm) and the edge detection data have been averaged on the same width (18 px). The second feature is a riveted section of the bridge. Due to the lack of evident edges, only pattern matching can be applied for this analysis, on the 57 × 58 px pattern of Fig. 15 (corresponding to 528 × 537 mm).
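
The text does not detail the pattern matching implementation. As a generic illustration only, the sketch below tracks a pattern with a brute-force zero-mean normalised cross-correlation and refines the peak with a three-point parabolic fit. This is a swapped-in textbook technique, not the algorithm used in the tests; all function names and the synthetic images are hypothetical, and only the 9.25 mm/px scaling factor is taken from the text.

```python
import numpy as np

def ncc_map(image, template):
    """Zero-mean normalised cross-correlation over all valid template positions."""
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    rows, cols = image.shape[0] - th + 1, image.shape[1] - tw + 1
    scores = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            w = image[r:r + th, c:c + tw]
            w = w - w.mean()
            denom = t_norm * np.sqrt((w ** 2).sum())
            if denom > 0:              # flat windows keep a score of zero
                scores[r, c] = (w * t).sum() / denom
    return scores

def parabolic_offset(left, centre, right):
    """Sub-pixel offset of the vertex of a parabola through three samples."""
    denom = left - 2.0 * centre + right
    return 0.0 if denom == 0 else 0.5 * (left - right) / denom

def match_position(image, template):
    """Return the (row, col) of the best match with sub-pixel refinement."""
    scores = ncc_map(image, template)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    dr = dc = 0.0
    if 0 < r < scores.shape[0] - 1:
        dr = parabolic_offset(scores[r - 1, c], scores[r, c], scores[r + 1, c])
    if 0 < c < scores.shape[1] - 1:
        dc = parabolic_offset(scores[r, c - 1], scores[r, c], scores[r, c + 1])
    return r + dr, c + dc

# Synthetic illustration: a random texture shifted down by 3 rows between two frames
rng = np.random.default_rng(2)
first_frame = rng.random((40, 40))
pattern = first_frame[10:18, 12:22]               # pattern extracted from the first frame
later_frame = np.roll(first_frame, 3, axis=0)     # simulated vertical motion
row0, _ = match_position(first_frame, pattern)
row1, _ = match_position(later_frame, pattern)
sag_mm = (row1 - row0) * 9.25                     # scaling factor of this test, mm/px
print(f"estimated vertical displacement: {sag_mm:.1f} mm")
```

The double loop is kept for readability; it is slow on full-resolution frames, where the search is normally restricted to a small window around the previous position.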

In Fig. 16 the discrepancies with respect to the data recorded by the interferometer during the train transit are plotted and some synthetic data are reported. Considering the first pattern (tape), a slight increase in the standard deviation of the discrepancies, with respect to the data obtained from the target, can be noticed for both edge detection (from 0.31 to 0.36 mm) and pattern matching (from 0.33 to 0.35 mm) analyses. Considering the small differences in the measurement performance with respect to the previously presented data, this approach can still be considered attractive in cases where either the presence of the operator on the structure has to be minimized or the surface presents inherent textures comparable to the tested one.

Fig. 16

Target-less analyses: discrepancies with respect to the interferometer data

Conversely, the third curve in Fig. 16 describes a more critical situation: the riveted plate connecting the two beams is characterized by very low gradients in its texture, and this results in a higher measurement uncertainty in the analysis. The standard deviation of the discrepancy is about 0.82 mm (more than twice that obtained with the tape), with peaks that reach 2.5 mm on an 8 mm displacement.
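
Since the texture gradient appears to drive the quality of targetless tracking, a candidate feature can be screened before the test with a simple contrast indicator. The sketch below uses the mean grey-level gradient magnitude of the patch; this indicator, the function name and the synthetic patches are assumptions introduced for illustration and were not used in the analysis above.

```python
import numpy as np

def mean_gradient_magnitude(patch):
    """Mean grey-level gradient magnitude of an image patch (grey levels per pixel)."""
    gy, gx = np.gradient(np.asarray(patch, dtype=float))  # derivatives along rows, columns
    return float(np.hypot(gx, gy).mean())

# Synthetic illustration: a sharp horizontal stripe versus a nearly uniform surface
stripe = np.zeros((18, 83))
stripe[6:12, :] = 255.0                                   # bright band, similar to a tape
rng = np.random.default_rng(3)
uniform = 128.0 + rng.normal(0.0, 2.0, (57, 58))          # low-contrast texture
print(mean_gradient_magnitude(stripe), mean_gradient_magnitude(uniform))
```

A patch with a higher value offers stronger intensity variations for the matching algorithm to lock onto; the indicator gives a relative ranking only and does not predict the uncertainty in millimetres.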

Concluding Remarks

In this work, a vision-based technique is proposed to measure both the static and the dynamic response of bridges to train pass-by. In order to measure a large bridge portion with a single camera, a compromise between field of view and measurement resolution is necessary. The relation between the measurement uncertainty and the resolution in terms of mm/px scaling factor was studied in a real environment. Three different state-of-the-art image processing algorithms (edge detection, pattern matching and digital image correlation) were used to estimate the vertical sag of a railway bridge subjected to train pass-by. The collected data were compared with those of a laser interferometer, used as reference, in order to quantify the vision-based measuring uncertainty.

At first, the displacement was measured in correspondence of fit-to-the-purpose targets mounted on the structure: the root mean square of the discrepancy between camera-based measurements and the reference transducer shows a linearly increasing trend with respect to the setup scaling factor. The RMS of the discrepancy has been decomposed into its two components: the standard deviation, representing the random component of the discrepancy and characterized by a deterministic trend if plotted against the scaling factor, and the mean discrepancy (bias), that, on the contrary, does not show any evident trend as the image scaling factor changes. Furthermore, it has been proven that the measuring uncertainty estimated under static condition (no loading of the structure) is able to correctly quantify the measuring system uncertainty, suggesting an easy way to estimate such a parameter in an on-the-field application.

In the last part of the paper the displacement of the bridge deck was measured without using the targets, relying only on the natural texture of the bridge. In these conditions the measurement reliability is strongly affected by the contrast of the structure texture; however, in favourable conditions, only small differences in the measuring performance were found with respect to the measurements with targets mounted on the bridge. The markerless approach can still be considered attractive in cases where either the presence of the operator on the structure has to be minimized or the surface presents inherent textures comparable to the tested one.