The tongue is the primary propulsive agent to accomplish oropharyngeal swallowing [1], and it is generally accepted that disordered lingual strength and coordination can contribute to the loss of safe and efficient swallowing [25]. Objective evaluation of lingual functioning therefore is a valid addition to clinical swallowing evaluations and should replace unreliable subjective assessments [3]. The availability of normative data is a prerequisite when seeking to determine the impact of tongue strength on observed swallowing deficits. Several studies have documented the range of maximum isometric tongue strength with respect to age and gender in healthy individuals [2, 6–9]. In contrast, studies on lingual endurance and its role in swallowing are only recently emerging [2, 8, 10–12]. Additional research that examines differences in tongue strength across age and gender ideally should be conducted with large, stratified samples and compared with prior investigations to determine if agreement exists and findings have been replicated. As life expectancy increases, there is a strong need to include sufficient numbers of elderly persons in these study populations. European reference data, however, are still lacking, obstructing widespread clinical application of objective tongue function assessment. Cross-national variation in health is well established but cross-national research using performance-based assessment is rare [13]. Skeletal muscle strength is influenced by race [14, 15] and wide variations exist even within one continent [16]. No such studies have been performed in assessing the bulbar muscles, although informal strength differences have already been reported [17]. Therefore, the current study aimed to provide the first European data on tongue strength and endurance and to supplement the exploration of age- and sex-related differences in these measures. The influence of additional parameters was also evaluated to address potential methodological questions.

Methods

Participants

This study included 420 participants, ranging in age from 20 to 96 years old, who were evenly distributed among seven age categories (20–30, 31–40, 41–50, 51–60, 61–70, 71–80, and 81 years and older) in order to maximize detection of age-related influences. Mean age was 54.8 years old with a standard deviation of 20.9 years. Each age category consisted of 60 persons, 30 males and 30 females, for a total of 210 males and 210 females. The oldest category comprised 51 persons (24 males, 27 females) between the ages of 81 and 90 and 9 persons (6 males, 3 females) older than 90 years. Participants were recruited from the general public. Informed consent was obtained from all participants according to the institution’s procedures and policies for human-subjects research.

To be included in the study the participants had to be of Caucasian descent, within the specified age range, report being in good general physical and mental health, and speak Dutch. General exclusion criteria were any history of major medical illness such as respiratory disease; neurologic trauma, disease, or insult; major head or neck surgery; or any form of cancer. All participants had to pass several additional exclusion criteria in an attempt to ensure the inclusion of representative participants. The presence of exclusion criteria was determined by interview and an oral mechanism examination. Any person with a history of dysphagia was excluded because this could be indicative of an as-yet undocumented underlying abnormal tongue function or strength. Other exclusion criteria were having had major oral cavity surgery (i.e., any procedure beyond routine dental surgery, including wisdom teeth extraction), dyspnea as it could interfere with endurance testing, dysarthria or apraxia, any oral motor impairment, and the playing of a wind instrument or working as a professional speaker or debater, because those that do have been shown to have higher tongue endurance than typical adults [11]. Tobacco and alcohol use within socially accepted levels was not considered an exclusion criterion provided no medical condition could be attributed to these agents. The use of medication was allowed provided the underlying condition was not an exclusion criterion.

Instrumentation

The Iowa Oral Performance Instrument (IOPI) (model 2.1; IOPI Medical LLC, Carnation, WA) was used to measure the variables related to tongue function. The IOPI is a portable device that measures the amount of pressure exerted on a small air-filled bulb. Pressures obtained are digitally displayed (expressed in kPa) on an LCD panel on the instrument. A series of LED lights representing percentages in 10 % increments of a manually set pressure acts in combination with the built-in timer as a tool for measuring endurance. As an instrument that measures tongue function, the IOPI has been utilized in a number of published experiments [3, 11, 18] and has established high inter- and intrajudge reliability [3, 6]. To ensure accurate measurement, calibration was checked and adjusted if necessary prior to obtaining measures from each participant. A new bulb was used for every participant because of hygienic concerns and to minimize measurement error due to possible compliance variations of the bulb after extended use.

Procedures

Tongue strength was measured by obtaining maximal tongue elevation pressures. Instructions to the participants were to “place this bulb in your mouth on the midline of your tongue and push it against the roof of your mouth as hard as you can.” In order to maximize standard placement, the examiner demonstrated placing the bulb along the central groove of the tongue blade. Since previous research indicated that maximal measures of tongue strength and endurance are best assessed with an unconstrained jaw, participants were encouraged to gently rest their incisors on the tubing of the IOPI bulb [2, 19, 20]. All trials were motivated by verbal encouragement from the examiner [19] and lasted ~7–10 s. The strength measurement was completed three times, with a brief resting period of about 30 s between each trial while the examiner recorded the peak pressure obtained. The highest pressure across the three trials was used as the participants’ maximal isometric pressure (MIP) instead of the mean pressure, as other researchers do [8, 21]. Since the correlation between averaged and maximal pressure is high and both are similarly related to oral-phase swallowing function [3], the use of the maximal pressure is more efficient in a clinical setting because it requires no calculation.

Endurance measures were gathered following the strength task after a break of at least 5 min. Participants were asked to follow the same placement instructions used in the strength measurements and sustain 50 % of their MIP for as long as possible. The timing was recorded using the built-in stopwatch. Participants were always able to monitor their performance via the LED array located on the right side of the device. The rules of measurement were taken from previous research [22]. Timing starts when the pressure meets or exceeds the target pressure and stops when the pressure drops steeply, the pressure is maintained between 40 and 50 % of MIP for 2 s or more, or the pressure stays below 40 % of MIP for at least 0.5 s. These rules allow for transient changes in pressure and are used likewise when testing patients where pressure variations are to be expected (e.g., movement disorders) [23]. Only one trial was done for endurance since the time available for individual participant testing was limited, therefore precluding adequate recovery time needed when performing multiple endurance trials. Short-interval (defined as less than 30 min) multiple endurance measures done in a limited number of participants showed no significant differences (Kays 2010, unpublished raw data) [20], further backing up the adopted protocol.
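The stopping rules above can be sketched in code. The following Python function is a minimal illustration only, assuming a hypothetical list of pressure samples taken at a fixed interval; the IOPI itself was timed with its built-in stopwatch, and the function name, parameters, and sampling scheme are assumptions made for this example. The "steep drop" criterion is not quantified in the text and is therefore left out of the sketch.

```python
from typing import Optional, Sequence


def endurance_seconds(
    samples: Sequence[float],   # hypothetical pressure readings in kPa
    mip: float,                 # participant's maximal isometric pressure in kPa
    dt: float,                  # assumed sampling interval in seconds
    target_frac: float = 0.50,  # target is 50 % of MIP
    floor_frac: float = 0.40,   # lower bound of the tolerated band
    band_limit_s: float = 2.0,  # stop after 2 s between 40 and 50 % of MIP
    floor_limit_s: float = 0.5, # stop after 0.5 s below 40 % of MIP
) -> float:
    """Return the endurance time for one trial under the two quantified rules."""
    target = target_frac * mip
    floor = floor_frac * mip
    # Convert the time limits to whole sample counts to avoid float drift.
    band_n = max(1, round(band_limit_s / dt))
    floor_n = max(1, round(floor_limit_s / dt))

    start: Optional[int] = None
    band_count = 0   # consecutive samples in [40 %, 50 %) of MIP
    floor_count = 0  # consecutive samples below 40 % of MIP

    for i, p in enumerate(samples):
        if start is None:
            # Timing starts when pressure meets or exceeds the target.
            if p >= target:
                start = i
            continue
        if p >= target:
            band_count = floor_count = 0
        elif p >= floor:
            band_count += 1
            floor_count = 0
        else:
            floor_count += 1
            band_count = 0
        if band_count >= band_n or floor_count >= floor_n:
            # Each sample covers one interval of length dt.
            return (i + 1 - start) * dt

    if start is None:
        return 0.0  # target pressure was never reached
    return (len(samples) - start) * dt
```

In this sketch a brief dip below the target resets as soon as the pressure recovers, which mirrors the stated intent of tolerating transient pressure changes. How consecutive dips in different zones should be combined is not specified in the text, so the counters are simply kept independent here.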

Factors

To investigate possible factors influencing the maximum anterior and posterior tongue strength and endurance, several conditions were evaluated. Factors studied were age, sex, bulb position, visual feedback, the order of testing, and the number of trials needed to reach maximum values.

Age and Sex

The age range of the participants covered the whole adult life span. People aged 81 years and older were put into one group for practical reasons. The age decade was used when comparing between-groups differences; the exact age was used when evaluating correlations between strength and endurance parameters.

Bulb Position

In this study both anterior and posterior tongue body function were assessed. To measure the anterior position strength, the bulb was placed longitudinally along the hard palate just posterior to the upper alveolar ridge, where compression was exerted by the anterior tongue (~10 mm posterior to the tongue tip). Posterior strength was measured with the distal end of the bulb at the posterior edge of the hard palate, where contact is made by the posterior tongue (~10 mm anterior to the most posterior circumvallate papilla). There were no participants who could not tolerate the posterior position because of a gag response. Once the bulb was appropriately positioned on the anterior and posterior tongue, the researcher indicated the point where the tubing running from the intraoral bulb to the connective tube met the upper incisors using a permanent marker. While the anatomy of the participants clearly varied due to differences in the shape of the upper alveolar ridge and palate, the consistent instruction to the participants and the placement demonstration and visual inspection of the individual markings by the researchers allowed for reliable bulb placement between trials and across participants. A similar approach was used by other authors [12, 24].

Visual Feedback

To evaluate the influence of visual feedback, tongue strength was measured with and without feedback in a randomized starting order. When allowed visual feedback, the participants had a clear view of the LCD display of the IOPI during each of the three trials. Verbal encouragement was provided in both conditions. Endurance testing was always performed with visual feedback.

Order of Testing

To determine whether there was an effect caused by the order of testing, half of the participants in each decade group were randomized to either sequence, i.e., anterior or posterior location as the first test position.

Repeated Trials

The standard procedure used by all previous researchers involves three consecutive trials to determine the maximum isometric pressure. Little information can be found on the need to adhere to this procedure.

Data Analysis

Data on anterior and posterior tongue strength and endurance were analyzed for normal distribution by using the Kolmogorov–Smirnov and Shapiro–Wilk tests. Tongue strength (measured as isometric pressure) was normally distributed and therefore analyzed on the original scale, while endurance measures (measured as time) were log10-transformed to reach normality. Descriptive statistics [means and 95 % confidence intervals (CI) of the mean, and minimum and maximum values] were calculated for all variables. An analysis of variance (ANOVA) was computed to determine if the MIP and endurance variables differed significantly based on age or gender, with the Tukey HSD procedure used for pairwise comparisons. All between-group comparisons for tongue measures employed independent t tests (two-tailed). An α of 0.05 was used to determine significance for all comparisons. The effect size was calculated using η2 or partial η2. Pearson correlation coefficients were calculated between age and strength and endurance parameters. A repeated-measures ANOVA was performed to evaluate the effect of three trials in attaining maximal MIPs. All computations were made using SPSS v19.0 (IBM, Armonk, NY, USA).
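As an illustration of the log10 transformation step, the sketch below computes a mean and 95 % confidence interval on the log10 scale and back-transforms them to seconds. It is not the SPSS procedure used in the study: it relies on a normal approximation for the interval, which is reasonable only for large samples, and the function names are hypothetical. The back-transformed mean is a geometric mean; whether the endurance means reported in this paper were back-transformed is not stated, so this is shown purely as one possible way of handling such data.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Mean and normal-approximation confidence interval of the mean."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / math.sqrt(n)          # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95 % interval
    return m, m - z * se, m + z * se


def log10_mean_ci(times_s, level=0.95):
    """Analyze endurance times on the log10 scale, then back-transform to seconds."""
    logs = [math.log10(t) for t in times_s]
    m, lo, hi = mean_ci(logs, level)
    return 10 ** m, 10 ** lo, 10 ** hi
```

A call such as `log10_mean_ci(endurance_times)` would return the geometric mean together with an asymmetric interval on the original time scale.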

Results

Descriptive statistics for the different variables by age group and gender are given in Table 1.

Table 1 Means and standard deviations for dependent variables

Age- and Sex-related Changes

There was no significant interaction between gender and age in any of the four parameters [MIPant: F(6, 406) = 1.52, p = 0.17; MIPpost: F(6, 406) = 0.39, p = 0.89; Endant: F(6, 406) = 1.13, p = 0.35; Endpost: F(6, 406) = 1.18, p = 0.32]. This allows age and gender to be analyzed separately.

In the male subgroup, all ANOVA assumptions were met. There was a statistically significant difference at the p < .05 level for all variables when comparing the seven age groups. All effect sizes (calculated using η2) were medium (0.08 for anterior endurance, 0.09 for posterior endurance) or large (0.37 for anterior strength, 0.26 for posterior strength). Post hoc analyses indicated that there was no significant difference in MIPant among males in the 20–60-year-old range, nor among the age groups of 61–70, 71–80, and 80+ years old. Regarding MIPpost, no significant deterioration was seen in males between 20 and 70 years old, whereas males from 71 to 80+ years old showed significantly lower pressure capability. Anterior endurance differed significantly only between males in the 41–50 and 80+ groups; all other pairwise comparisons were not significant. Posterior endurance showed similarly limited significant differences, with males aged 31–50 years demonstrating longer endurance than males aged 71–80+ years.

In the female subgroup, strength and anterior endurance fulfilled ANOVA assumptions with a resulting significant difference. The assumptions, however, were not met for the endurance of the posterior tongue; using the Robust Test of Equality of Means (Welch’s F), a significant effect of age on this variable was nevertheless confirmed. All variables had large effect sizes (0.30 and 0.24 for anterior and posterior strength, respectively; 0.22 and 0.15 for anterior and posterior endurance, respectively). Post hoc analyses in the female subgroup showed that females from 20 up to 70 years old have similar and higher MIPant than that of females aged 71–80+ years old. These findings differ by a decade from the male results. Posterior tongue strength data again demonstrate two significantly different groups: 20–70 years old and 71–80+ years old, in line with the male results. Anterior endurance in females is not significantly different in the age range of 20–80 years old, but shows a significant decline in 80+ year olds. Posterior endurance shows similar, though more limited, differences, where 80+-year-old females show clearly shorter endurance than some of their younger counterparts, again reflecting the general trend seen in the male subgroup.

Further exploration of gender differences in both strength and endurance using an independent t test revealed significant but small differences. As a group, males obtained significantly higher MIP scores (both anterior and posterior) on the IOPI than females, while only anterior tongue endurance was longer in males. Posterior tongue endurance was not significantly different between males and females. Table 2 provides a summary of the findings.

Table 2 Gender differences by combined age

A breakdown of the differences between males and females in each decade on both strength and endurance, analyzed using independent t tests, with effect size expressed as η2, is given in Table 3. There are only a few significant differences, and these are present only at the anterior tongue and between people at the extremes of the age decades. Males invariably reach higher values.

Table 3 Gender differences across the age decades

The Pearson’s r correlation coefficients between age and MIP and endurance variables are summarized in Table 4. The relationship between anterior and posterior strength is striking, explaining 59 % of the variance both parameters share; a similar but smaller correlation was found between the anterior and posterior tongue endurance.

Table 4 Pearson product-moment correlations between age and tongue strength and endurance

Bulb Position

A paired-samples t test was conducted to evaluate the difference in strength and endurance between the anterior and posterior testing location. There was a statistically significant difference in tongue strength in the feedback condition when comparing the anterior (M = 44.27, CI = 42.83–45.71) with the posterior location [M = 41.08, CI = 39.67–42.48, t(420) = 6.45, p < 0.0005]. The η2 statistic (0.09) indicated a moderate effect size. Similar results in the non-feedback condition were found between anterior (M = 43.11, CI = 41.60–44.62) and posterior tongue strength [M = 39.92, CI = 38.49–41.36, t(420) = 6.22, p < 0.0005]. The effect size was also moderate (0.08). Endurance was also significantly different for the anterior (M = 22.56, CI = 20.78–24.12) and posterior position [M = 14.96, CI = 13.96–15.90, t(420) = 10.92, p < 0.0005]. Here, the effect size was large (0.22).
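The η2 values reported for these paired comparisons are consistent with the common conversion η2 = t2 / (t2 + N − 1), where N is the number of pairs. The short sketch below applies that conversion to the reported t values. That this is the formula used by the authors is an assumption; the text does not state it, and the function name is hypothetical.

```python
def eta_squared_from_t(t: float, n_pairs: int) -> float:
    """Eta squared for a paired-samples t test: t^2 / (t^2 + (N - 1))."""
    t2 = t * t
    return t2 / (t2 + (n_pairs - 1))


# Reported t values for the anterior vs. posterior comparisons (N = 420 pairs).
for label, t in [
    ("strength, feedback", 6.45),
    ("strength, non-feedback", 6.22),
    ("endurance", 10.92),
]:
    print(f"{label}: eta squared = {eta_squared_from_t(t, 420):.2f}")
```

Under this assumption the three values round to 0.09, 0.08, and 0.22, matching the effect sizes given in the text.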

Visual Feedback

To determine the impact of allowing participants visual feedback during maximal strength testing, a paired-samples t test was conducted between measures with and without visual feedback. Anterior tongue strength was significantly higher with feedback (M = 44.27, CI = 42.83–45.71) than without [M = 43.11, CI = 41.60–44.62, t(420) = 3.56, p < 0.0005]. The η2 statistic (0.03) indicated a small effect size. Similar results were found for the posterior location between the feedback (M = 41.08, CI = 39.67–42.48) and the non-feedback condition [M = 39.92, CI = 38.49–41.36, t(420) = 4.11, p < 0.0005]. The effect size was also small (0.04).

Order of Testing

To determine the effect of the order of testing, an independent t test was performed for both strength and endurance values by comparing the means of both procedural groups. No statistically significant differences were found for any of the measurements.

Repeated Trials for Maximal Strength

A one-way repeated-measures ANOVA was performed to determine if there was a significant difference between the three trials to reach maximal tongue pressure, both anterior and posterior, and between the condition with or without feedback. Since Mauchly’s test indicated that the assumption of sphericity had been violated for all test conditions (p < 0.05), multivariate tests are reported. In the feedback conditions, the pressures obtained across the three trials did not differ significantly. In the non-feedback conditions, pressures of the three trials were significantly affected by the number of the trial [anterior pressure: Pillai’s Trace V = 0.057, F(2, 418) = 12.65, p < 0.001, partial η2 = 0.057 (small); posterior pressure: Pillai’s Trace V = 0.098, F(2, 418) = 22.58, p < 0.001, partial η2 = 0.098 (moderate)]. Post hoc pairwise comparisons of the non-feedback results showed that for both anterior and posterior, the first trial produced a significantly higher maximal pressure while the difference between the second and third trials was not significant.
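The partial η2 values for the non-feedback conditions can be recovered from the reported F statistics with the standard relation partial η2 = (F × df1) / (F × df1 + df2). For a single within-subject factor tested multivariately, Pillai's trace takes the same value as partial η2. The sketch below is a minimal check of the reported figures; the function name is hypothetical.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared from an F statistic and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


anterior = partial_eta_squared(12.65, 2, 418)   # non-feedback, anterior pressure
posterior = partial_eta_squared(22.58, 2, 418)  # non-feedback, posterior pressure
print(f"anterior: {anterior:.3f}, posterior: {posterior:.3f}")
```

Both results agree with the reported partial η2 values to three decimals (approximately 0.057 and 0.098).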

Discussion

As the global population is aging, there is an increasing need for data on the effects of aging on swallowing in healthy elderly individuals over the age of 80. Therefore, a substantial effort was made to include this subgroup of people in our study, including people up to 96 years old, mirroring the study population of Crow and Ship [8]. Since the study group consisted of a large number of participants, grouping was by decade, similar to the study by Utanohara et al. [21] (although they did not include people over the age of 80), to allow maximal detection of differences between groups.

Interestingly, the mean maximum tongue pressures in this Belgian population seem to be significantly lower than those found in similar American studies, a discrepancy that also seems to be applicable to the available literature on Asian results. Variations in the number of participants, their ages, and measurement tools should be taken into consideration when comparing these results. Mean MIPant in our study was 44.27 kPa (CI = 42.83–45.71), while Stierwalt and Youmans [2] found a mean of 59.78 kPa (CI = 57.88–61.68), and these values are in line with other American studies [6, 10]. Utanohara et al. [21] found in a very large dataset a mean of 39.03 kPa (CI = 38.39–39.67). The average MIPpost of 41.08 kPa in our study (CI = 39.67–42.48) was also lower than that reported by Clark and Solomon [19] (M = 53.6 kPa, CI = 51.28–55.92); no comparable Asian data are available. Comparing endurance data is hindered by lack of reference data. The mean Endant was 22.39 s (CI = 20.78–24.12), which is similar to that of Lazarus et al. [18] (M = 23.7 s; CI = 20.2–27.1), but considerably shorter than that found by other researchers [2, 8, 11, 12]. The mean posterior endurance in our study group was 14.90 s (CI = 13.96–15.90); only Kays et al. [12] provided data for comparison on a small group of participants (M = 26.1, CI = 20.2–32.0). There is no clear explanation for this discrepancy as it is the first report on cross-national variation in bulbar muscle strength. This subject clearly merits further research since, if corroborated, it would necessitate the development and use of different regional standard values, and cross-national surveys of tongue strength should control not only for age and gender but also for race or nationality.

In this study, no interaction effect between age and gender was found for any of the tongue measures. For the tongue strength measures, this finding is in accordance with that of Clark et al. [19]. The study by Utanohara et al. [21], however, seems to indicate that there still could be some minor interaction effect since the differences in tongue elevation strength are greater in younger than in older participants.

When looking at the effect of age on strength measures in males and females, we can clearly see that people older than 70 years are significantly weaker than younger people; however, male anterior tongue elevation already starts to decrease by the age of 60. Correlation data indicate that the decline in overall tongue strength with age is rather gradual. The correlation coefficient for MIPant regression as a function of age was significant and larger than reported in previous studies [6–8, 19]. Similar results were obtained for our MIPpost results [19]. Data on the effect of age on endurance are sparse and most often indicate no effect with advancing age [2, 8]; only Kays et al. [12] found a reduction of endurance with age in their study on a limited number of participants. Our data show that endurance remains stable throughout the major part of life. In males, only subsets of participants differed significantly, and interestingly between middle-aged (31–50 years old) and elderly (71+ years old); in females, endurance started to deteriorate only after the age of 80. Our findings contradict earlier suggestions that age-related swallowing changes begin at ~45 years of age [25]. These differences may be attributed to large age differences between age groups [2, 4, 6, 7] or the lack of inclusion of the very elderly (80+ years old); only Crow and Ship [8], Stierwalt and Youmans [2], and Clark et al. [19] included this subgroup.

The effect of gender in this well-balanced study population remains significant but minor. Males, in general, show higher pressures and longer endurance; however, when taking age into consideration, differences are small and few. These results are in agreement with the majority of studies when looking at strength [2, 6, 8, 12, 21], though several studies have found no significant difference [7, 9, 19]. Limited data on the effect of gender on endurance allow no clear conclusion; Crow and Ship [8] and Stierwalt and Youmans [2] found no effect of gender for the anterior tongue, while Kays et al. [12] found a gender effect at both the anterior and posterior locations.

There are limited data in the literature on the effect of bulb location, with some studies using different equipment (3-bulb array), and most of the previous studies measuring only the anterior portion of the tongue (“just posterior to the alveolar ridge”). Since the posterior tongue provides critical forces for transferring food and liquid from the oral cavity into the pharynx, establishing its measures of strength and endurance is warranted. Our data show that the anterior part of the tongue demonstrates both higher strength and longer endurance than the posterior part of the tongue. This is in accordance with recent studies [12, 19, 26]. Previous data from smaller studies indicated that the tongue blade [4] or even the posterior part [9] was the strongest part of the tongue.

When participants receive visual feedback during strength testing, there is a small but significant increase in maximal pressure. Since the purpose of our study was to document maximal values, the pressures obtained without feedback were not further analyzed.

Our data clearly show that there is no significant effect of the order of testing, meaning that maximally loading a part of the tongue for a limited number of trials has no negative impact on the results of the other location. No similar data are available in the literature for comparison, however.

Finally, the commonly accepted practice of performing three trials to determine the maximal tongue pressure has received very little attention from a methodological point of view. Our results demonstrate that when healthy participants are allowed visual feedback, the variation among the pressures obtained across trials is not significant. Without feedback, the first trial generally led to the maximal tongue pressure. These results are in line with findings by Crow and Ship [8], who also found no significant difference between the three trials. Butler et al. [26] found a marginal effect of trial order on MIPpost, where the first trial elicited the highest pressures in patients with dysphagia. When faced with limited time while evaluating patients, clinicians can probably get reliable data on maximal tongue pressure using a single motivated trial.

Conclusions

The contribution of tongue function, including adequate strength, to successful mastication and deglutition is well established, but until now there has not been an investigation on normal values of maximum isometric tongue strength and endurance in a European population that varied across age and gender. This study provides the first European data on tongue strength and endurance. The dataset is very comprehensive, based on a large number of healthy adults with carefully selected exclusion criteria, in order to detect possible subtle differences between subgroups that may not have been found in studies lacking sufficient power. The testing procedure not only investigated the anterior tongue location, it also measured the posterior part of the tongue blade. Different parameters like visual feedback and order of testing were tested for significant influence on the outcome parameters. The results of our study are in agreement with some previous data, but differ on other points. The current study expands the dataset describing age- and sex-related differences in strength and endurance measures obtained in a large group of nondysphagic men and women. Future studies will explore how these measures relate to specific aspects of swallowing function.