1 Introduction

Voice production involves self-sustained, flow-induced oscillations of the vocal folds. The vocal folds are a layered structure of three different tissues, namely the epithelium, the lamina propria and the thyroarytenoid muscle. Prolonged phonation can damage these layers through mechanical fatigue. Vocal fatigue is the root cause of several conditions, ranging from vocal fold lesions to sleep apnea. Once the vocal folds are fatigued, the surrounding structures compensate for the reduced vocal-fold activity and, with continued phonation, become fatigued and damaged in turn. Hence, it is of prime importance, both clinically and economically, to study fatigue and to identify precursors to irreversible damage so that such damage can be avoided.

Existing literature in the field of vocal fatigue extensively covers objective and subjective parameters for detecting fatigue, mechanisms of fatigue, etc. [1]. So far, no attempt has been made to quantify fatigue and develop a framework to predict precursors to fatigue. Here, we use the simplified two-mass model, as formulated in [2], to extract the vibration characteristics of the vocal folds. Then, taking into consideration the material properties of the multilayered vocal fold, the stress-time histories are computed. The Rainflow-counting (RFC) algorithm, along with Miner's rule, is then used to compute the accumulated damage.

The paper is organized as follows. Section 2 gives a brief overview of the mathematical model. Section 3 presents a brief overview of vocal fatigue and its quantification using the Rainflow-counting algorithm and Miner's rule. Section 4 presents the system response over different durations of phonation, along with the corresponding stress histories and damage levels. Section 5 gives concluding remarks.

2 Mathematical Model

The model used in our study is the two-mass model developed by Herzel et al. [2], which is a simplification of the Ishizaka and Flanagan model [3]. Herzel's model retains the basic possibility of a phase difference between the lower and upper edges of the vocal fold, which has been proven to be a necessary condition for phonation. For the sake of simplicity, we have ignored the cubic nonlinearities introduced in [3] to account for the nonlinear nature of vocal fold tissue. We have also ignored the role of subglottal and supraglottal structures and their resonances in phonation. Although this is a gross simplification of the actual dynamics of phonation, the model still reproduces the bifurcations observed in excised larynges [4], which supports the use of the simplified equations for analyzing fatigue.

Fig. 1 Schematic representation of the two-mass model

The motions of the masses are described by the following equations:

$$m_{i} \ddot{x}_{i} + r_{i} \dot{x}_{i} + k_{i} x_{i} + \Theta \left( { - a_{i} } \right)c_{i} \left( {a_{i} /2l} \right) + k_{c} \left( {x_{i} - x_{j} } \right) = F_{i} (x_{1} ,x_{2} )$$
(1)

where

$$\Theta \left( x \right) = \left\{ {\begin{array}{*{20}c} {1,} & {x > 0} \\ {0,} & {x \le 0} \\ \end{array} } \right.$$
(2)

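In Eq. (1), \(m_{i} , k_{i} ,k_{c} ,c_{i} ,r_{i}\) and \(P_{i}\) are the masses, spring constants, coupling constant, additional spring constants for collision, damping constants and pressures inside the glottis, respectively. Further, \(a_{i}\) are the lower and upper glottal areas, \(l\) is the length of the glottis and \(d_{i}\) are the thicknesses of the masses. The index i = 1, 2 denotes the lower and upper masses, respectively, and j denotes the other mass.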

The forces \(F_{i}\) acting on the masses \(m_{i}\) are given by

$$F_{i} = ld_{i} P_{i}$$
(3)

The following parameters correspond to our standard, symmetric vocal fold:

m1 = 0.125, m2 = 0.025, r1 = 0.02, r2 = 0.02, k1 = 0.08, k2 = 0.008, c1 = 3k1, c2 = 3k2, kc = 0.025, d1 = 0.25, d2 = 0.05, a01 = a02 = 0.05 and \(\rho = 0.0013\).

All of the above values are expressed in centimeters, grams, milliseconds and their combinations.

For further details on the mathematical model, refer to [2].

2.1 Phonation Threshold Pressure (PTP)

PTP is defined as the minimum lung pressure required, all other conditions remaining constant, for phonation onset (defined as the loss of stability of the system via a Hopf bifurcation and the onset of oscillations).

Applying Bernoulli's equation to our model, together with the assumption that a jet builds up, gives:

$$P_{\text{s}} = P_{1} + \left( {\frac{\rho }{2}} \right)\left( {\frac{U}{{a_{1} }}} \right)^{2} = P_{0} + \left( {\frac{\rho }{2}} \right)\left( {\frac{U}{{a_{ \text{min} } }}} \right)^{2}$$
(4)

Here,

$$a_{ \text{min} } = \text{min} \left( {a_{1l} ,a_{2l} } \right) + \text{min} \left( {a_{1r} ,a_{2r} } \right)$$
(5)

where \(P_{\text{s}}\) is the subglottal pressure, \(P_{0}\) is the supraglottal pressure, \(U\) is the volume flow velocity, \(\rho\) is the air density, and the subscripts \(l\) and \(r\) denote the left and right vocal folds.
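As a worked step, and under the assumption that the supraglottal pressure is negligible (\(P_{0} = 0\), in keeping with the neglect of supraglottal structures in Sect. 2), Eq. (4) can be solved for the pressure acting on the lower mass. The result has the same form as the open-glottis branch of Eq. (11):

$$\left( {\frac{\rho }{2}} \right)U^{2} = P_{\text{s}} a_{\text{min}}^{2} \quad \Rightarrow \quad P_{1} = P_{\text{s}} - \left( {\frac{\rho }{2}} \right)\left( {\frac{U}{{a_{1} }}} \right)^{2} = P_{\text{s}} \left[ {1 - \left( {\frac{{a_{\text{min}} }}{{a_{1} }}} \right)^{2} } \right]$$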

Accumulation of fatigue in the vocal folds due to prolonged phonation changes several of their characteristics, and all such changes have been observed to ultimately change PTP [5]. Since PTP has been shown to be a good indicator of fatigue, as it correlates closely with the perceived exertion of phonation (PPE) [6], it is used here as the parameter that varies with time and accounts for the fatigue accumulated in the system. Data from [6] were used to find the variation of PTP with time during a continuous vocal loading task. Since the functional form of this variation is not available in the literature, we have approximated it as a linear function:

$$P_{\text{s}} = 0.2030\left( {\frac{t}{1800}} \right) + 2919$$
(6)

where \(P_{\text{s}}\) is in cm of H\(_{2}\)O and \(t\) is in seconds.

This relation serves our objective of setting benchmark PTP values for different levels of fatigue, as presented later in this paper.
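Because Eq. (6) is expressed in cm of H2O while the model parameters are given in centimeters, grams and milliseconds, a unit conversion is needed before the pressure can be used in the equations of motion. The following is a minimal Python sketch of that conversion and of the time-dependent part of Eq. (6). The function names are hypothetical, the conversion uses the conventional value 1 cm H2O = 98.0665 Pa, and only the rise of PTP relative to its initial value is computed.

```python
# Conventional value: 1 cm H2O = 98.0665 Pa, and 1 Pa = 1e-5 g/(cm ms^2).
CMH2O_TO_MODEL_UNITS = 98.0665 * 1e-5  # g/(cm ms^2) per cm H2O


def cmh2o_to_model_units(p_cmh2o):
    """Convert a pressure from cm H2O to g/(cm ms^2)."""
    return p_cmh2o * CMH2O_TO_MODEL_UNITS


def ptp_rise_cmh2o(t_s, slope=0.2030, interval_s=1800.0):
    """Rise of PTP (cm H2O) above its initial value after t_s seconds.

    Only the time-dependent term of Eq. (6) is evaluated; the slope and
    the 1800 s interval are the values printed in that equation.
    """
    return slope * (t_s / interval_s)
```

For example, `cmh2o_to_model_units(ptp_rise_cmh2o(3600.0))` gives the increase in PTP after one hour of phonation in the units used by the model.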

3 Vocal Fatigue and Its Quantification

The hallmark of vocal fatigue is considered to be the self-report of an increased sense of effort with prolonged phonation. Vocal fatigue occurs due to several biomechanical and neuromuscular factors. These include fatigue of the respiratory and laryngeal muscles, fatigue of the non-muscular vocal-fold tissues, and changes in the viscous properties of the vocal folds.

The potential mechanisms that contribute to vocal fatigue are:

  1. Neuromuscular fatigue

  2. Non-muscular tissue fatigue and viscosity

Mechanical fatigue falls under non-muscular tissue fatigue. It reflects the amount of cyclic stress that a material can tolerate before breaking down. Fatigue is the progressive structural damage that results from the stress accompanying repeated straining of the material. Because the three layers of the vocal folds deform, they are subjected to mechanical stresses with every cycle of vocal-fold oscillation during phonation. These sustained cyclic stresses result in fatigue damage of the vocal fold tissue. Many fatigue damage theories have been proposed over the years, and they indicate that fatigue damage is strongly associated with the cycle ratio \(n_{i} /N_{i}\), where \(n_{i}\) is the number of stress cycles of a specific stress amplitude exerted on the specimen and \(N_{i}\) is the number of cycles to failure at that stress amplitude.
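For reference, the Palmgren-Miner linear damage hypothesis sums these cycle ratios over all stress amplitude levels present in the loading. Here \(k\) (an index limit introduced only for this illustration) is the number of distinct amplitude levels, and failure is conventionally predicted when the sum reaches unity:

$$D = \sum\limits_{i = 1}^{k} {\frac{{n_{i} }}{{N_{i} }}}$$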

In this model, the vocal folds are subjected to cyclic loads. Since the amplitude of the loading changes with time, it is difficult to determine which cycles contribute to fatigue and what their amplitudes are. Several cycle-counting techniques have therefore been introduced to reduce a complicated variable-amplitude loading history to a number of discrete, constant-amplitude loading events that can be associated with fatigue damage. Of these, the Rainflow-counting (RFC) method is generally regarded as giving the best estimates of fatigue life. The RFC method breaks a load-time history down into its constituent fatigue cycles; combined with the S-N curve, the fatigue damage is then evaluated using the Palmgren-Miner linear damage accumulation theory.
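The counting procedure can be illustrated with a short Python sketch. It is not the WAFO implementation used in this study; it follows the widely used three-point rainflow procedure, and the function names as well as the `cycles_to_failure` callable (a stand-in for an S-N curve) are hypothetical names introduced for illustration.

```python
def turning_points(series):
    """Reduce a sampled history to its reversals (local extrema)."""
    pts = []
    for y in series:
        if pts and y == pts[-1]:
            continue  # skip repeated samples
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (y - pts[-1]) > 0:
            pts[-1] = y  # still moving in the same direction
        else:
            pts.append(y)
    return pts


def rainflow(series):
    """Return a list of (stress_range, count); count is 1.0 or 0.5."""
    cycles, stack = [], []
    for point in turning_points(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            if len(stack) == 3:
                # Range y contains the starting point: count a half cycle.
                cycles.append((y, 0.5))
                stack.pop(0)
            else:
                # Range y is a closed loop: count a full cycle and remove it.
                cycles.append((y, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains is counted as half cycles.
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5))
    return cycles


def miner_damage(cycles, cycles_to_failure):
    """Palmgren-Miner sum; cycles_to_failure maps amplitude to N."""
    return sum(count / cycles_to_failure(rng / 2.0) for rng, count in cycles)
```

In use, `rainflow` would be applied to the computed stress history and `miner_damage` to the resulting cycles, with `cycles_to_failure` read off the chosen S-N curve. Whether the S-N curve is expressed in stress amplitude (as assumed here) or in stress range must match the source of the curve.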

4 Results and Discussions

The governing equations of motion are written in state-space form and solved in MATLAB using an adaptive time-step Runge–Kutta ODE solver, to obtain time-responses of the system for different values of the control parameter.

The governing equations can be written as:

$$\dot{x}_{1} = v_{1}$$
(7)
$$\dot{v}_{1} = \frac{1}{{m_{1} }}\left( {P_{1} ld_{1} - r_{1} v_{1} - k_{1} x_{1} - \Theta \left( { - a_{1} } \right)c_{1} \frac{{a_{1} }}{2l} - k_{c} \left( {x_{1} - x_{2} } \right)} \right)$$
(8)
$$\dot{x}_{2} = v_{2}$$
(9)
$$\dot{v}_{2} = \frac{1}{{m_{2} }}\left( { - r_{2} v_{2} - k_{2} x_{2} - \Theta \left( { - a_{2} } \right)c_{2} \frac{{a_{2} }}{2l} - k_{c} \left( {x_{2} - x_{1} } \right)} \right)$$
(10)
$$P_{1} = P_{\text{s}} \left[ {1 - \Theta \left( {a_{ \text{min} } } \right)\left( {\frac{{a_{ \text{min} } }}{{a_{1} }}} \right)^{2} } \right]\Theta \left( {a_{1} } \right)$$
(11)
$$a_{1} = a_{01} + 2lx_{1}$$
(12)
$$a_{2} = a_{02} + 2lx_{2}$$
(13)
$$a_{\text{min}} = \left\{ {\begin{array}{*{20}l} {a_{1} ,} & {{\text{if }}\, 0 < a_{1} < a_{2} } \\ {a_{2} ,} & {{\text{if }}\,0 < a_{2} \le a_{1} } \\ {0,} & {\text{otherwise}} \\ \end{array} } \right.$$
(14)

The function \(\Theta\)(x) is approximated as:

$$\Theta \left( x \right) = \left\{ {\begin{array}{*{20}l} {{ \tanh }\left[ {50(x/x_{0} )} \right],} & {x > 0} \\ {0,} & {x \le 0} \\ \end{array} } \right.$$
(15)

The differential equations above are solved with the ODE solver over a time span of 1 h, and the time responses are plotted. The displacement amplitudes of the masses are observed to be within 0.15 mm (see Fig. 2). Similarly, the time responses are plotted for the 2nd and 3rd hours of phonation, and the amplitudes are observed to be within 0.2 mm and 0.3 mm, respectively (see Figs. 3 and 4). As expected, a phase difference is observed between the motions of the two masses, which, as mentioned earlier, is a necessary condition for phonation. In this way, time responses can be obtained for different durations of phonation so as to understand the dynamics of the vocal folds.
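The state-space system can be sketched in Python as follows. This is an illustration under stated assumptions, not the MATLAB code used in this study: a fixed-step fourth-order Runge-Kutta scheme replaces the adaptive solver, the sharp step function of Eq. (2) is used instead of the smoothed form of Eq. (15), the subglottal pressure is held constant rather than varied with time, the upper-mass coupling term is written as \(k_{c}(x_{2} - x_{1})\) following Eq. (1) with i = 2, and \(a_{\text{min}}\) is taken as the smaller glottal area clipped at zero. The glottal length `GLOTTIS_LENGTH`, the pressure `PS_DEFAULT` and the initial state are assumed values that are not listed in the text.

```python
# Parameters listed in Sect. 2 (units: cm, g, ms).
M1, M2 = 0.125, 0.025
R1, R2 = 0.02, 0.02
K1, K2 = 0.08, 0.008
C1, C2 = 3 * K1, 3 * K2
KC = 0.025
D1 = 0.25
A01 = A02 = 0.05
GLOTTIS_LENGTH = 1.4  # ASSUMPTION: l (cm) is not listed in the text
PS_DEFAULT = 0.008    # ASSUMPTION: illustrative constant P_s, g/(cm ms^2)


def theta(x):
    """Sharp step function of Eq. (2)."""
    return 1.0 if x > 0 else 0.0


def rhs(state, ps):
    """Right-hand side of Eqs. (7)-(10) for state (x1, v1, x2, v2)."""
    x1, v1, x2, v2 = state
    l = GLOTTIS_LENGTH
    a1 = A01 + 2 * l * x1
    a2 = A02 + 2 * l * x2
    a_min = max(0.0, min(a1, a2))
    # Eq. (11): pressure on the lower mass, zero when the lower edge is closed.
    p1 = ps * (1.0 - (a_min / a1) ** 2) if a1 > 0 else 0.0
    dv1 = (p1 * l * D1 - R1 * v1 - K1 * x1
           - theta(-a1) * C1 * a1 / (2 * l) - KC * (x1 - x2)) / M1
    dv2 = (-R2 * v2 - K2 * x2
           - theta(-a2) * C2 * a2 / (2 * l) - KC * (x2 - x1)) / M2
    return (v1, dv1, v2, dv2)


def rk4_step(state, dt, ps):
    """Advance the state by one classical Runge-Kutta step of size dt."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state, ps)
    k2 = rhs(shifted(state, k1, dt / 2), ps)
    k3 = rhs(shifted(state, k2, dt / 2), ps)
    k4 = rhs(shifted(state, k3, dt), ps)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def simulate(duration_ms, dt=0.01, ps=PS_DEFAULT, state=(0.1, 0.0, 0.0, 0.0)):
    """Return the list of states from t = 0 to duration_ms (assumed start)."""
    history = [state]
    for _ in range(int(round(duration_ms / dt))):
        state = rk4_step(state, dt, ps)
        history.append(state)
    return history
```

A call such as `simulate(50.0)` returns the displacement and velocity histories of both masses over 50 ms. Reproducing the hour-long runs reported here would additionally require updating `ps` with time according to Eq. (6).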

Fig. 2 A section of the time response obtained for 1 h of phonation

Fig. 3 A section of the time response obtained for 2 h of phonation

Fig. 4 A section of the time response obtained for 3 h of phonation

From these time responses, the corresponding stress histories are obtained using tensile and compressive stress–strain equations. A linear relationship between stress and strain is assumed to obtain the stress–time history:

$$\sigma = E\varepsilon$$
(16)

where \(E\) is the Young's modulus of the vocal folds in the transverse direction, valid for small strains [7], and \(\varepsilon\) is the strain.

From the stress histories, the maximum stress amplitude for 1 h of phonation is observed to be 0.0086 \(\text{g}/(\text{cm}\,\text{ms}^{2})\) (see Fig. 5). Likewise, the maximum stress amplitudes for 2 and 3 h of phonation are observed to be 0.012 and 0.0139 \(\text{g}/(\text{cm}\,\text{ms}^{2})\), respectively (see Figs. 6 and 7). Hence, it can be inferred that, as time progresses and fatigue accumulates, the vocal folds experience a higher magnitude of stress.
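For readers more used to SI units, the stress unit of the model converts to pascals as shown below, so the three reported amplitudes correspond to roughly 860 Pa, 1200 Pa and 1390 Pa:

$$1\,\frac{\text{g}}{{\text{cm}\,\text{ms}^{2} }} = \frac{{10^{ - 3} \,\text{kg}}}{{10^{ - 2} \,\text{m} \times 10^{ - 6} \,\text{s}^{2} }} = 10^{5} \,\text{Pa},\quad 0.0086\,\frac{\text{g}}{{\text{cm}\,\text{ms}^{2} }} = 860\,\text{Pa}$$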

Fig. 5 A section of stress history obtained for 1 h of phonation

Fig. 6 A section of stress history obtained for 2 h of phonation

Fig. 7 A section of stress history obtained for 3 h of phonation

Our next objective is to find the amount of fatigue damage incurred for different durations of continuous phonation using the RFC algorithm. Applying the Rainflow-counting algorithm requires the stress-time history and the S-N curve of the vocal folds.

Existing literature does not detail the S-N curve of this multilayered, anisotropic structure, and determining it is beyond the scope of this paper. We have therefore used the S-N curve of the Achilles tendon [1] (see Fig. 8), since its characteristics are similar to those of the vocal fold and especially to those of the lamina propria, which is the most prone to fatigue damage of the three layers that compose the vocal fold.

Fig. 8 S-N curve of Achilles tendon

Now, using the WAFO toolbox in MATLAB, the amount of damage (D) is calculated for different lengths of phonation and is tabulated in Table 1.

Table 1 Amount of damage for different lengths of phonation

As seen from the data, the damage value is higher for a longer duration of phonation. The PTP value at the end of phonation is also found to be higher for a longer phonatory duration. Hence, these PTP values can be linked to their respective damage indices.

5 Concluding Remarks

This study is thus a preliminary attempt at quantifying the amount of damage incurred in the vocal folds and linking it to a measurable, objective quantity (PTP). This has been done as a step towards recognizing precursors to permanent, serious damage to the vocal folds. The study hence provides a framework for quantifying fatigue for preventive applications. To make it directly useful in practice, a few improvements are in order: a separate study is required to estimate the S-N curve of a realistic model of the vocal folds; tissue nonlinearities have to be taken into account; perturbations in the aerodynamic forcing have to be considered; and the role of subglottal and supraglottal structures in aiding phonation has to be addressed. Future work will focus on improving the accuracy of the damage estimates by implementing these improvements. Further, the results of this study will prove useful when corroborated with in vivo and ex vivo experimental studies on vocal fold fatigue, and the authors intend to take these up in future projects.