1 Introduction

Visual information has long been a privileged medium for understanding and transmitting knowledge, for example by stimulating content and concept assimilation in STEAM (science, technology, engineering, arts, and mathematics) areas. Sonification, which can be defined as the practice of generating sound from data sets [7], can complement knowledge acquisition within STEAM areas. It unveils different contents and concepts by deepening, sedimenting, and combining aesthetic perception and scientific understanding [16].

The motivation behind the current research is driven by the development of sonification techniques at the service of science and technology learning [20, 25, 27]. Among the areas of engineering that benefit from the contributions of sonification, we can highlight vibration analysis, structures under loading conditions, electrical phenomena, modelling and control, computer science, signal processing, and telecommunications [13, 18, 23].

The objectives of the current study are threefold. First, to identify and extract relevant information from the footbridge’s vibrations caused by human interactions. Second, to define sonification processes and strategies to be applied to the collected footbridge data and information, aiming to disclose in the sonic result three relevant conditions – 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace. Third, to evaluate if information is correctly conveyed through sonification.

This paper is organised as follows. Section 1 covers the context, motivation, and objectives. Section 2 presents the fundamental theory on sonification techniques and related work to our study. Section 3 details the structure and physical properties of the Faculty of Engineering of the University of Porto (FEUP) stress-ribbon footbridge. Section 4 presents the design method adopted and describes our solution. Section 5 focuses on the evaluation of our method and resulting sonifications by a listening or perceptual test. Finally, Sect. 6 presents the conclusions of our study, pointing towards the future endeavours of the research.

2 Related Work

According to Walker and Nees [26], sonification techniques can have a threefold categorisation: 1) audification, 2) parameter mapping sonification (PMSon), and 3) model-based sonification (MBS). In this study, we focused on the first two categories, which have been applied in related vibration analysis contexts.

Audification is the most direct method of sonification. This technique directly translates data into the audible frequency range of human perception (i.e., 20 Hz–20 kHz) and, in the digital domain, to an audio sample rate (e.g., 44.1 kHz). Audification aims to format the data to better expose it to human perception. It is particularly useful in cases where numerical data analysis methods fail and it is not possible to determine which aspects of the data encode relevant information. By interacting with data in an analytical listening process, otherwise undetectable patterns can be identified. Representative audifications of vibration patterns, namely seismic data, can be found in the work of Kilb et al. [19] and Hutko et al. [17]. They applied time compression to translate the subsonic seismic range (below 20 Hz) to the human audible range.
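
As an illustrative aside, the effect of time compression can be sketched in a few lines of Python. The function name and the example values (a 0.5 Hz component, a 100 Hz recording rate, a 44.1 kHz playback rate) are assumptions chosen for illustration and are not taken from [17, 19].

```python
def playback_frequency(f_hz, record_rate_hz, playback_rate_hz):
    """Frequency heard when samples recorded at record_rate_hz are
    played back, unchanged, at playback_rate_hz (time compression)."""
    if record_rate_hz <= 0 or playback_rate_hz <= 0:
        raise ValueError("sample rates must be positive")
    # Every frequency is scaled by the same speed-up factor.
    speedup = playback_rate_hz / record_rate_hz
    return f_hz * speedup


def playback_duration(n_samples, playback_rate_hz):
    """Duration in seconds of n_samples played at playback_rate_hz."""
    return n_samples / playback_rate_hz


# Hypothetical example: a 0.5 Hz component recorded at 100 Hz.
heard_hz = playback_frequency(0.5, 100, 44100)
```

Under these assumed values, the subsonic component is shifted well inside the audible range, while the recording becomes correspondingly shorter.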

Parameter mapping sonification does not require data to be sampled at an audio rate. Changes in data are associated with changes in the sonic domain [15]. This association needs to be adapted to the perceptual characteristics and constraints of human hearing to optimise its interpretation. The wide range of possible parameter mappings raises challenges in terms of consistency and comprehensibility. A common issue in PMSon development is the confidence that variations in sound are audible and informative when, in reality, these variations may not be evident to most listeners. Adopting an interactive approach during the mapping process can avoid this problem [26]. An example of parameter mapping is the sonification of the Mies van der Rohe pavilion in Barcelona [14]. The sonification of the pavilion unveils its structure, namely its repeating patterns (e.g., the distance between identical materials), and provides an extended architectural experience for blind people.

3 The FEUP Stress-Ribbon Footbridge

Figure 1 shows the FEUP stress-ribbon footbridge, designed by ENCIL [8]. Its deck is a very slender concrete slab suspended from encased tensioned cables and continuous over two spans. The two spans are 28 m and 30 m long, with a two-metre rise from the abutments to the intermediate pier [2].

Fig. 1. Views of the FEUP stress-ribbon footbridge.

The high flexibility of the footbridge and the significant vibration levels observed motivated the development of a numerical study to analyse its dynamic behaviour, followed by an experimental identification of the corresponding modal shape parameters. Figure 2 shows the footbridge’s natural frequencies and the vertical modal shapes direction. Consequently, humans can excite the footbridge with specific actions.

Fig. 2. Natural frequencies and modal shapes direction of the stress-ribbon footbridge under analysis [3].

Damping is a crucial factor in the characterisation of structural vibration systems that can be excited near resonance under service loads, as in the case of the FEUP footbridge. Concrete footbridges typically have damping coefficients of around 1%. The FEUP footbridge dynamic tests were performed to identify modal damping factors associated with the most relevant modal shapes in terms of human-induced vibrations. The measured damping factors for the first two modal shapes, at frequencies of about 1 and 2 Hz, were 1.7% and 2.6%, respectively [2]. However, damping increases with the amplitude of the footbridge’s vibrations. Consequently, damping increases with the number of people interacting with the footbridge.

Furthermore, the footbridge has distinct features that highlight its proneness to vibrate in response to human actions. By fixing specific dimensions – 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace – we can control the type of vibration. The number of people interacting with the bridge has an impact on the amplitude of the footbridge’s vibrations. The frequency of their pace impacts the footbridge’s response depending on which modal shape is being excited. Finally, the steadiness of their pace has an impact on the regularity of the footbridge’s vibrations.

4 Sonification of the FEUP Stress-Ribbon Footbridge

This study aims to sonify the FEUP stress-ribbon footbridge vibrations. In Sect. 4.1, we report the adoption of acceleration data to capture the footbridge vibration patterns under the following three conditions: 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace. In Sect. 4.2, we detail an analysis of the acceleration data that conveys higher-level information. Two sonification techniques adopt this information to intuitively unveil the three conditions above: audification and PMSon. Audification provides a direct ‘translation’ of the footbridge data stream into sound and is detailed in Sect. 4.3. Parameter mapping is the more creative of the two techniques. It aims at conveying relevant information from the footbridge’s acceleration data with musical content and is detailed in Sect. 4.4. Both sonification techniques have been implemented in the software environment Pure Data [21]. Video demos, the captured data, and the code are available online for dissemination purposes at https://doi.org/10.5281/zenodo.7250401.

4.1 Data Collection

To study the FEUP footbridge’s dynamic behaviour under controlled conditions, an experiment was conducted involving the measurement of the footbridge vibrations induced by a variable number of people under different actions. Knowing in advance the natural frequencies and modal shapes of the FEUP stress-ribbon footbridge, described in Sect. 3, we excited the footbridge at its resonant frequencies (shown in Fig. 2).

Fig. 3. Location of the two seismographs used for the data collection process, positioned at one-third and a half of the longest span of the footbridge.

The captured footbridge acceleration data, a, was expressed as the ratio \(a=v/g\), where v is the vertical acceleration of the bridge and g is the acceleration due to gravity (equal to 9.81 m/s\(^2\)). The footbridge response to human actions was assessed for the following three conditions: 1) the number of people interacting with the bridge, 2) their walking speed, and 3) the steadiness of their pace. As shown in Fig. 3, the footbridge’s response was measured with two wireless seismographs located at one-third and at the centre of the longest span, as these are the most significant locations concerning the footbridge’s vibrations.

Table 1 shows the full list of 13 actions that were collected. The acceleration data, a, was collected at a 100 Hz sample rate, thus capturing frequencies up to 50 Hz. The adopted sampling rate safely ensures that no frequency content was lost, as our experiments have shown that the footbridge’s frequencies are in the range of 0–10 Hz [22].
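
The sampling argument above can be checked with a minimal Python sketch. The function names are hypothetical; the values (100 Hz sample rate, 10 Hz upper bound on the footbridge’s frequencies) are those stated in the text.

```python
def nyquist_hz(sample_rate_hz):
    """Highest frequency representable at the given sample rate."""
    if sample_rate_hz <= 0:
        raise ValueError("sample rate must be positive")
    return sample_rate_hz / 2.0


def headroom(sample_rate_hz, max_signal_hz):
    """Ratio between the Nyquist frequency and the highest frequency
    expected in the signal; values above 1 mean no aliasing of that band."""
    if max_signal_hz <= 0:
        raise ValueError("signal frequency must be positive")
    return nyquist_hz(sample_rate_hz) / max_signal_hz


# Values from the text: 100 Hz sampling, footbridge content up to 10 Hz.
margin = headroom(100, 10)
```

With these values, the Nyquist frequency sits a factor of five above the highest footbridge frequency reported.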

Table 1. Human actions used during the data collection process separated by 1) number of people interacting with the bridge, 2) frequency of the walking speed, and 3) type of human action.

4.2 Data Analysis

The data analysis consisted of extracting relevant higher-level information from the footbridge’s acceleration data, a, to objectively measure features and structural characteristics of the bridge. First, the maximum values of the temporal acceleration data, a, are analysed to study whether there is a correlation between human actions and vertical vibrations. Second, three spectral descriptors (harmonicity, spread, and flatness) are used to capture information on the frequency domain representation of the acceleration data.

Fig. 4. Representation of maximum values of the vertical acceleration data, v, from the FEUP footbridge. It shows a direct proportionality between the number of people interacting with the footbridge and the range of vertical oscillations.

Figure 4 shows the relationship between the maximum values of the acceleration, a, and the different human actions listed in Table 1. It shows a direct proportionality between the number of people interacting with the footbridge and the vertical acceleration values. Furthermore, when exciting the footbridge at 1.82 Hz, a smaller vertical vibration is obtained compared to that at 2 Hz. This is because the frequency of 1.82 Hz is misaligned with the resonant frequency of 2 Hz, where vibrations have greater magnitude. The maximum acceleration across all actions results from five people flexing their knees. This is due to the systematic excitation of the footbridge at a section of the maximum modal shape.
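
To give a rough sense of why a small misalignment with the resonant frequency matters, the sketch below evaluates the textbook steady-state amplification factor of a single-degree-of-freedom oscillator. Treating one mode of the footbridge as such an oscillator is a simplifying assumption made here for illustration only, and the damping ratio of 0.026 is borrowed from the measured values in Sect. 3 purely as an example input; this is not the analysis performed in the study.

```python
import math


def dynamic_amplification(f_exc_hz, f_nat_hz, damping_ratio):
    """Steady-state amplification of a single-degree-of-freedom
    oscillator under harmonic forcing (textbook formula)."""
    r = f_exc_hz / f_nat_hz  # frequency ratio
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (2.0 * damping_ratio * r) ** 2)


# Assumed illustrative inputs: 2 Hz natural frequency, 2.6% damping.
at_resonance = dynamic_amplification(2.0, 2.0, 0.026)
off_resonance = dynamic_amplification(1.82, 2.0, 0.026)
```

Under these assumptions, the amplification at 1.82 Hz is several times smaller than at 2 Hz, which is consistent in direction with the smaller vibrations observed in Fig. 4.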

Spectral analysis of the temporal acceleration data was performed by applying the fast Fourier transform with a window size of 512 samples, 50% overlap, and a Hamming amplitude window. The spectral descriptors used are described below:

Harmonicity, h, provides an indicator of the quantity and quality of the harmonics in a signal [9]. It quantifies how well the spectral peaks with higher magnitudes align with the harmonic peaks expected from a given fundamental frequency. A harmonic spectrum is typically driven by a steady human action on the footbridge, as opposed to an inharmonic spectrum, which is driven by unsteady temporal actions.

Spread, s, is a measure of the concentration of a spectrum’s energy around its centroid [12]. It is reported in units of Hz. A regular and steady human action pace will have a considerably low spread value because all energy is concentrated around a single spectral frequency peak.

Flatness, f, is interpreted as a measure of the noisiness in a signal. It is computed as the ratio between the geometric and the arithmetic mean of the spectrum [9]. A very flat spectrum with equal energy across all frequency bands should have a high flatness value, while the flatness of a regular and steady pace is considerably low.
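
The spectral analysis and two of the descriptors above can be sketched as follows, using only the Python standard library. This is a minimal sketch under stated assumptions: a naive discrete Fourier transform replaces the fast Fourier transform (it yields the same magnitudes, only more slowly), spread is taken as the magnitude-weighted standard deviation around the centroid, and flatness is computed on magnitudes with a small epsilon to avoid the logarithm of zero. Exact definitions differ between libraries, and harmonicity is omitted because its computation is not fully specified in the text. All function names are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frames(signal, size=512, overlap=0.5):
    """Split a signal into windowed frames (512 samples, 50% overlap)."""
    hop = max(1, int(size * (1 - overlap)))
    w = hamming(size)
    return [[signal[s + i] * w[i] for i in range(size)]
            for s in range(0, len(signal) - size + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of bins 0..n/2 via a naive DFT (stand-in for the FFT)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(frame))
        mags.append(math.hypot(re, im))
    return mags


def spectral_spread(mags, bin_hz):
    """Magnitude-weighted standard deviation around the centroid, in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    centroid = sum(k * bin_hz * m for k, m in enumerate(mags)) / total
    var = sum(((k * bin_hz - centroid) ** 2) * m for k, m in enumerate(mags)) / total
    return math.sqrt(var)


def spectral_flatness(mags, eps=1e-12):
    """Ratio of geometric to arithmetic mean of the spectrum."""
    n = len(mags)
    geometric = math.exp(sum(math.log(m + eps) for m in mags) / n)
    arithmetic = sum(mags) / n
    return geometric / (arithmetic + eps)
```

With the 100 Hz sample rate and 512-sample window reported above, each bin spans 100/512 Hz (about 0.195 Hz), so `bin_hz` would be set to that value when the descriptors are applied to the footbridge data.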

4.3 Audification

The audification of the FEUP footbridge vibrations followed three steps: 1) translation of the acceleration data, a, to the human hearing range, 2) mapping the translated data, \(\hat{a}\), to the parameters of a frequency modulation (FM) synthesizer, and 3) audio synthesis.

We translated the acceleration data, a, to fit the human audible range and resampled the data at an audio processing rate (e.g., 44.1 kHz). Figure 5 shows the processing pipeline used to translate the acceleration data, a, into the audible human range, referred to as translated acceleration data, \(\hat{a}\). First, 0.2 is added to ensure that all values are in the positive range. Second, the acceleration data values are multiplied by a scalar (5000) to scale the pitch range. Lastly, 200 is added so that all values are audible and do not fall below the lower limit of human hearing (20 Hz). To resample the data at a 44.1 kHz rate, we linearly interpolated the resulting translated acceleration data, \(\hat{a}\).
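
The translation steps can be summarised in a short Python sketch. The constants (0.2, 5000, 200) and the two sample rates are those given in the text; the function names and the exact handling of the last sample during interpolation are assumptions, and the authors’ Pure Data implementation may differ in such details.

```python
def translate(a_values, offset=0.2, scale=5000.0, floor_hz=200.0):
    """Apply the three steps in order: add offset, scale, add floor."""
    return [(a + offset) * scale + floor_hz for a in a_values]


def upsample_linear(values, src_rate_hz=100, dst_rate_hz=44100):
    """Linearly interpolate a sequence from src_rate_hz to dst_rate_hz."""
    if len(values) < 2:
        return list(values)
    n_out = int((len(values) - 1) * dst_rate_hz / src_rate_hz) + 1
    out = []
    for j in range(n_out):
        pos = j * src_rate_hz / dst_rate_hz      # position in source samples
        i = min(int(pos), len(values) - 2)       # left neighbour index
        frac = pos - i                           # distance to left neighbour
        out.append(values[i] * (1.0 - frac) + values[i + 1] * frac)
    return out


# An acceleration of 0 g maps to (0 + 0.2) * 5000 + 200.
a_hat = translate([0.0, -0.2])
```

In this sketch, any acceleration value at or above -0.2 yields a translated value of at least 200, which keeps the result above the 20 Hz lower limit of hearing.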

Fig. 5. Process used for the translation of the footbridge acceleration data to the human audible range.

Both translated acceleration data, \(\hat{a}\), and higher-level information extracted from the footbridge vibrations have been mapped to three input parameters of an FM synthesis module. The choice of FM synthesis technique is due to the quality of the input parameters – modulating amplitude, carrier amplitude, and carrier frequency – that link to the sonification conditions under study [4], as detailed next. Figure 6 shows the FM synthesis diagram with the corresponding mappings to the footbridge’s data.

Fig. 6. Diagram of a frequency modulator synthesizer with the mappings selected in this study. The two hexagons represent the two oscillators of an FM synthesizer. Boxes with dashed lines are external data while boxes with continuous lines are the three parameters of a frequency modulator we can control.

We mapped the footbridge’s translated acceleration data, \(\hat{a}\), to the modulator frequency and carrier amplitude of the FM synthesizer, aiming to convey relevant information on the number of people and the walking speed. In fact, a single person interacting with the footbridge translates into a narrower pitch range than that of multiple people. The rationale for this mapping is the behaviour of the footbridge, which has reduced vibrations for a single person compared to multiple people. Regarding the walking speed, the faster the walking pace, the shorter the resulting oscillatory patterns within the pitch range.

After that, we mapped the higher-level spectral descriptors (detailed in Sect. 4.2) to the modulation amplitude parameter of the FM synthesizer, aiming to convey information about the steadiness of the pace. By computing the product of spread, s, and flatness, f, scaled by an integer (5), we impose clear differences in the ‘richness’ or ‘distortion’ of the resulting synthesized sound. The steadier the pace, the more regular the footbridge’s vibrations, which results in lower spread and flatness values and thus a less rich and less distorted sound. On the other hand, an unsteady pace translates into more widely spread spectral peaks and a noisier and flatter spectrum, thus creating a richer and more distorted sound.
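
For readers less familiar with FM synthesis, the following Python sketch shows a generic two-oscillator FM voice and the scaled product of spread and flatness used as a modulation amount. It is a sketch under stated assumptions, not the authors’ Pure Data patch: the phase-modulation form of FM, the parameter names, and the example input values are all hypothetical, and how the patch wires each footbridge value to each oscillator is not reproduced here.

```python
import math


def modulation_amount(spread_hz, flatness, scale=5):
    """Product of spread and flatness, scaled by an integer (5 in the text)."""
    return scale * spread_hz * flatness


def fm_sample(t, carrier_hz, carrier_amp, modulator_hz, mod_amount):
    """One sample of a two-oscillator FM signal (phase-modulation form)."""
    phase = (2 * math.pi * carrier_hz * t
             + mod_amount * math.sin(2 * math.pi * modulator_hz * t))
    return carrier_amp * math.sin(phase)


def render(duration_s, carrier_hz, carrier_amp, modulator_hz, mod_amount,
           rate_hz=44100):
    """Render a short buffer of FM samples at the given audio rate."""
    n = round(duration_s * rate_hz)
    return [fm_sample(i / rate_hz, carrier_hz, carrier_amp, modulator_hz, mod_amount)
            for i in range(n)]


# Hypothetical descriptor values for a fairly steady pace.
amount = modulation_amount(spread_hz=2.0, flatness=0.1)
buffer = render(0.01, carrier_hz=440.0, carrier_amp=0.5,
                modulator_hz=110.0, mod_amount=amount)
```

A modulation amount of zero reduces the voice to a plain sine tone, while larger amounts add sidebands, which is the generic mechanism behind the ‘richer’ and ‘more distorted’ sound described above.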

Finally, the output of the FM synthesis is processed by reverberation to creatively shape the timbral qualities of the sound.

4.4 Parameter Mapping Sonification

The second sonification strategy adopted is parameter mapping. Its major challenge is the selection of musical processes to be driven by the acceleration data, a, and the analysed information from the bridge dynamics. Next, we detail the adopted method establishing links between the information we aim to convey in the resulting signal – 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace – and musical processes.

The number of people interacting with the footbridge was mapped to the number of notes in an arpeggiator. This mapping aims to provoke an intuitive connection between the steps of a walking person and the notes of an arpeggio. In detail, a single person walking across the footbridge triggers a two-note arpeggio, played as a regular (and somewhat slower) two-note rhythmic figure, whereas multiple people walking across the footbridge trigger a three-note arpeggio played as a faster rhythmic figure (triplets).

The walking speed of the person (or people) interacting with the footbridge was mapped to the tempo of the generated music. Therefore, if the person (or people) walks at a slow pace, the generated music will be slower than for a faster walking pace, which results in a faster tempo. To define the tempo in beats per minute (BPM), we extracted the peaks (i.e., local maxima) from a smoothed version of the acceleration data, a. Then, we applied a harmonic product spectrum algorithm that analyses the peak data to estimate a steadier tempo value over time.
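
A simplified version of this tempo estimation can be sketched in Python. Note the substitutions: a moving average stands in for the unspecified smoothing, and the median inter-peak interval replaces the harmonic product spectrum step, so this is a simpler technique than the one used in the study. The function names, the window width, and the assumption of one acceleration peak per beat are all hypothetical.

```python
import math
from statistics import median


def moving_average(x, width=5):
    """Centred moving average; the window is truncated at the edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def local_maxima(x):
    """Indices of interior samples larger than their left neighbour and
    at least as large as their right neighbour."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] >= x[i + 1]]


def tempo_bpm(a, sample_rate_hz=100, width=5):
    """Tempo from the median interval between peaks (one peak per beat
    is an assumption); returns None with fewer than two peaks."""
    peaks = local_maxima(moving_average(a, width))
    if len(peaks) < 2:
        return None
    intervals = [(q - p) / sample_rate_hz for p, q in zip(peaks, peaks[1:])]
    return 60.0 / median(intervals)


# Synthetic check: a 2 Hz oscillation sampled at 100 Hz for 5 s.
signal = [math.cos(2 * math.pi * 2.0 * i / 100.0) for i in range(500)]
bpm = tempo_bpm(signal)
```

The median makes the estimate tolerant of a few irregular steps, which loosely mirrors the goal of obtaining a steadier tempo value over time.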

The steadiness of the pace was mapped to 1) a percussive sound triggered by the peaks of the footbridge acceleration data, a, and 2) the consonance of a chord generator engine. A steady pace results in a regular percussive sound that behaves similarly to a metronome. On the other hand, an unsteady pace results in highly irregular rhythmic patterns. Harmonicity, h, was used to select the level of consonance of a generative chord system inspired by the work of Ferguson and Parncutt [10]. The higher the harmonicity level, the more consonant the generated chords are. The level of harmonicity is a powerful indicator to identify the steadiness of a pace as it is strongly connected with the regularity of the spectrum of the acceleration data.

Finally, all generated musical instructions in Pure Data were routed to virtual synthesizers and audio devices. In particular, the sound library Spitfire Labs [24] within Logic Pro [1] was used for the arpeggio and the chords. The arpeggio was performed by a piano VST. The chords were sent to a synthesizer VST with a low attack to keep a steady (continuous) sound in the background. Pure Data has been used for both mappings and playback of the percussive sounds, which were sent to a reverberation unit for creative endeavour and timbre manipulation.

5 Evaluation

We adopted the protocol proposed in [11] to assess the efficiency of the two proposed sonification techniques through an online perceptual test, detailed in Sect. 5.1. Participants were informed about the test goals and the details of the sonification techniques under evaluation. The results are shown and discussed in Sect. 5.2.

5.1 Perceptual Test

The perceptual test assessed which of the two adopted sonification techniques – audification and PMSon – best exposes the following three conditions: 1) the number of people interacting with the footbridge, 2) the walking speed, and 3) the steadiness of the pace. Furthermore, we assessed the impact of training on the information assessment.

For each condition, participants had to select one category that best describes the sonification. In detail, participants had to select whether the sound reflected a single person or multiple people crossing the footbridge; whether the walking speed was at a slow or fast pace; and, lastly, whether the pace was steady or unsteady. Additionally, for all conditions, a third category labelled I don’t know was included to account for the case where participants could not establish a connection between the categories and the audio heard.

The perceptual test was separated into four main sections: A) evaluation of the intuitiveness of the audification with a blind perceptual test, B) re-evaluation of the audification after a short training phase, C) evaluation of the intuitiveness of PMSon with a blind perceptual test, D) re-evaluation of PMSon after a short training phase. Sections A/B and C/D of the perceptual test include the same audio examples to assess the impact of a training phase and previous exposure to the sonifications. The order of the audio examples was randomised to avoid order effects.

Each section included eight different audio examples with three different questions each. A total of 96 questions were assessed by the participants. Audio examples were about 10 s long and were taken from the sonification of the footbridge acceleration data detailed in Sect. 4.1. The audio files were selected to guarantee all the combinations between the three dimensions under analysis – 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace. Table 2 shows all the combinations of the three dimensions used in the perception test.
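
The counts above follow from simple combinatorics, as the short Python sketch below illustrates. The category labels are taken from the answer options described in Sect. 5.1; the variable and function names are hypothetical.

```python
from itertools import product

PEOPLE = ("single person", "multiple people")
SPEED = ("slow", "fast")
PACE = ("steady", "unsteady")


def example_conditions():
    """All combinations of the three binary dimensions under analysis."""
    return list(product(PEOPLE, SPEED, PACE))


def total_questions(sections=4, questions_per_example=3):
    """Sections x audio examples per section x questions per example."""
    return sections * len(example_conditions()) * questions_per_example


conditions = example_conditions()
n_questions = total_questions()
```

Three binary dimensions give the eight audio examples per section, and four sections with three questions per example give the 96 questions assessed by the participants.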

Table 2. List of the examples used in the perceptual test with all the combinations between the three categories under analysis.

5.2 Results and Discussion

A total of \(n=23\) subjects completed the perceptual test (average age of \(29.42\,\pm \,10.84\)). Five participants identified as female and one did not disclose their gender. Seven participants conducted the perceptual test using loudspeakers; eight participants used earphones, and the remaining eight adopted headphones. Most participants identified as musical amateurs (i.e., they play an instrument as a hobby). Six participants reported no musical training, and four participants identified as musical experts (i.e., professional musicians or musicologists).

Tables 3, 4, and 5 show the results of the conducted perceptual test per dimension under analysis. Each table shows the percentage of correct answers per sonification technique, with and without the training phase, for all combinations listed in Table 2. Each table has a total of 32 percentage values.

Table 3. Percentage of correct answers from the online perceptual test regarding the detection of the number of people interacting with the bridge.

We can analyse the gathered results in terms of the most effective sonification method as well as the impact of training. To compare the two sonification methods, we must observe the same conditions across both methods. To analyse the impact of training, it is necessary to compare the same conditions across the training and no-training data, for example by comparing the first and third rows of Tables 3, 4, and 5.

The results of our perceptual test show that PMSon is prone to better exhibit the three conditions under study. In fact, only five audification assessments out of a total of 96 across the three conditions were better classified. Additionally, PMSon shows an improvement of at least 21% across all the dimensions under analysis. However, PMSon presents a particular difficulty in distinguishing between a slow and fast pace with multiple people interacting with the bridge at an unsteady pace – with and without training – with three higher values for audification. Furthermore, the results of the perceptual test show that training positively impacts the perception of the conditions under study in both techniques. The recognition of the three categories under analysis considerably improved after the short training phase. This suggests that the outcome of the perceptual test can reasonably improve with deeper training.

Table 4. Percentage of correct answers from the online perceptual test regarding the detection of the walking speed of the people interacting with the bridge.
Table 5. Percentage of correct answers from the online perceptual test regarding the detection of the steadiness of the pace of the people interacting with the bridge.

6 Conclusions and Future Work

In this study, we adopted two sonification techniques – audification and parameter mapping – to analyse the oscillatory behaviour of a stress-ribbon footbridge. The acceleration data captured by seismographs and used for the sonification includes human action-driven vibration measurements, with three fixed conditions: 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace.

Audification translates the footbridge acceleration data to the human audible range and resamples it at an audio rate. The mapped data is adopted to control parameters of an FM synthesizer. PMSon maps the fixed conditions – 1) the number of people interacting with the footbridge, 2) their walking speed, and 3) the steadiness of their pace – directly to musical properties.

A perceptual test assessed the degree to which the sonification techniques conveyed the three conditions above. The results have shown that PMSon is more effective in eliciting information. Furthermore, the results showed that a short training before exposure to the sonification may slightly improve the understanding of the conditions. Finally, the conditions are somewhat independent, as the individual conditions can still be identified when they are combined.

In future work, we will consider the application of model-based sonification to our problem by adding interactive aspects of real-time gait analysis and footbridge vibration response.