Introduction

This is the second communication from a series of experiments, wherein the relationship between whole-body metabolism and body mass was evaluated in healthy humans from basal through to maximal-exercising states. In the first experiment, the association with metabolic rate was explored during basal and resting states (Bowes et al. 2021). Just as others had found for animals (Sarrus and Rameaux 1838; Kleiber 1932; White and Seymour 2003), an allometric (power) model provided a biologically valid, powerful and superior description of that relationship, relative to linear scaling. Moreover, a unique intra-specific (human), body-mass exponent for metabolic rate was revealed (metabolic rate = mass^(0.50–0.55)). During exercise, there exists inter-specific (species) evidence that, whilst that allometric relationship is retained, the scaling exponent is elevated (Taylor et al. 1981; Bishop 1999; Weibel et al. 2004), especially during maximal-intensity exercise. Human evidence for such a change also exists, although it remains inconclusive, and possibly unconvincing. Therefore, the primary purpose of this contribution was to explore the veracity of different scaling models during maximal-intensity exercise, so that, when combined with basal observations from the same individuals (Bowes et al. 2021), those analyses would span the complete metabolic range (aerobic scope). Collectively, those descriptive observations would facilitate the derivation of mass-specific metabolic rate and oxygen consumption from existing absolute data.

During maximal exercise, whole-body oxygen consumption increases to ~ 20 times that observed at rest (Taylor et al. 1981 [animals]; Clark et al. 1983 [humans]), due to increases in the type and mass of the metabolically active tissues, and an intensity-dependent metabolic transition (Darveau et al. 2002; Weibel 2002). As a consequence, the skeletal-muscle contribution to metabolism increases from ~ 20% at rest (Elia 1992; Gallagher et al. 1998; Müller et al. 2011) to ~ 90% when exercising maximally (Mitchell and Blomqvist 1971; Hochachka 1994). Collectively, those factors could modify the relationship between oxygen consumption and body mass, relative to that observed at rest. Therefore, others have postulated that the scaling exponent would increase with elevations in metabolic intensity, perhaps in the form of an allometric cascade (Darveau et al. 2002; Hochachka et al. 2003; Suarez et al. 2004), reaching a theoretical maximum of 0.92 (mass^0.92; Darveau et al. 2002).

Since the comparative literature reveals that such an exponent increase seems to occur when moving from rest [e.g. mass^0.67 (White and Seymour 2003)] through to maximal exercise [mass^0.87 (Weibel et al. 2004)], that possibility needed to be explored within humans. Moreover, since it is plausible that the species specificity of the allometric relationship, evident at rest (Heusner 1982; White and Seymour 2003; Sieg et al. 2009), may continue across the entire metabolic range, human-specific data were required to provide a more complete evaluation of those possibilities. Ideally, data should be gathered using the same individuals used to scale basal metabolic rate (Bowes et al. 2021).

However, for human data, ratiometric scaling (mass^1.0) dominates the exercise literature, even though allometric scaling is commonly used for both resting (Kleiber 1932; Sieg et al. 2009; White and Kearney 2014) and exercising states (Taylor et al. 1981; Bishop 1999; Weibel et al. 2004) within the comparative literature. The ratiometric scaling of peak oxygen consumption (peak aerobic power) typically converts the mass-dependent (positive) bias evident within those absolute data into a systematic negative bias, when expressed as mass-specific equivalents (Vaage and Hermansen 1977). This occurs because neither the dependent nor the independent variables change isometrically (a pre-requisite of ratiometric scaling; Packard and Boardman 1988), but disproportionately. The effect is that the validity of those mass-independent data, as well as the capacity for predicting data beyond those experimental mass ranges, remains questionable.
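The direction of that bias can be sketched algebraically. The lines below assume, purely for illustration, that absolute oxygen consumption (Y) follows a power function of body mass with a positive coefficient b and an exponent k below one; b and k are generic symbols here, not fitted values.

```latex
\begin{aligned}
Y &= b \times \text{mass}^{k}, \qquad b > 0,\; 0 < k < 1,\\
\frac{Y}{\text{mass}} &= b \times \text{mass}^{k-1},\\
\frac{d}{d\,\text{mass}}\left(\frac{Y}{\text{mass}}\right) &= (k-1)\, b \times \text{mass}^{k-2} < 0 .
\end{aligned}
```

Under that assumption, the ratio Y/mass declines as mass increases, so heavier individuals receive systematically lower mass-specific values, even though their absolute values are higher.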

Recognising that limitation, some have adopted non-linear scaling to analyse human peak aerobic power (e.g. Secher et al. 1983; Bergh et al. 1991; Nevill et al. 1992; Batterham and Jackson 2003; Markovic et al. 2007; Lolli et al. 2017). Unfortunately, efforts to minimise sampling bias, by standardising inter-individual variability and obtaining evenly distributed and adequately sized samples across the mass range investigated (White and Seymour 2005), seem not to have been made. Indeed, data were often obtained from samples of convenience, or from existing databases. Not surprisingly, the resulting exponents are highly variable, even when restricted to healthy, adult males [e.g. mass^0.63 (Nevill et al. 1992); mass^0.94 (Batterham and Jackson 2003)]. Those inconsistencies complicate the adoption and application of a uniform scaling approach for human data, and will neither convince those who prefer linear scaling to consider another possibility, nor will they help clarify the relationship between body mass and oxygen consumption during exercise.

Therefore, the first aim was to evaluate how human, peak aerobic power would scale against body mass within a homogeneous sample of healthy, adult males. It was hypothesised that data collected across the metabolic range would scale allometrically with mass. Secondly, it was hypothesised that the scaling exponent for maximal (treadmill) exercise would be significantly larger than observed at rest. To test the possibility that all non-resting states would have a significantly larger scaling exponent than resting states, data obtained during steady-state walking (4.8 km h−1) were also scaled. It was further hypothesised that the size of the scaling exponent would increase in a stepwise manner from rest, to steady-state walking and then to maximal running. Those hypotheses were evaluated using a single population sample (N = 60), which varied primarily in body mass, with all individuals providing data for each metabolic intensity. Those data were scaled using both linear and allometric methods; the latter reflect the state of the art in comparative physiology. Finally, to test the hypothesis that the scaling exponent for peak aerobic power would be applicable to broader population samples, historical data were extracted from an in-house database, and analysed.

Methods

Participants

For this investigation, data from two different population samples of males were analysed. In the first instance, a homogeneous sample of healthy University students was used (N = 60; Table 1), all of whom had already been studied under basal conditions (Bowes et al. 2021). Those participants provided written, informed consent before undertaking both steady-state ambulatory (treadmill walking) and maximal running exercise (treadmill). The resulting data enabled a comparison of scaling models that could be applied to both ends of the metabolic range, with walking representing an intermediate state, at the lower end of the exercising range. All experimental procedures were approved by a Human Research Ethics Committee (University of Wollongong, Australia: HE14/469) following national regulations (National Health and Medical Research Council), and in compliance with the Declaration of Helsinki. A second sample (N = 54) came from an in-house, historical database of maximal-exercise data (running), collected by the authors, and for which the same level of technical precision and experimental control was known to exist. Those individuals were used to represent a random sample of physically active University men, all of whom had provided written, informed consent to similarly, and previously, approved procedures. Those data were used to evaluate the robustness of the maximal-exercise scaling model.

Table 1 Physical characteristics of the experimental subjects

Experimental participants

The experimental sample came from a pool of 71 healthy men, 68 of whom had been studied under basal conditions (Bowes et al. 2021), and who now participated in steady-state walking. However, only 61 subjects performed maximal exercise, which left 60 individuals who participated in all three trials. They were the experimental participants. Of the 11 remaining participants, three suffered unrelated injuries, five withdrew and three were deemed to be outliers with regard to their relative adiposity or absolute peak aerobic power (Cook’s distance), so their data were excluded. All remaining subjects were either competitive athletes or previous research participants, and engaged in endurance exercise at least three times per week. Strict recruitment criteria were used to minimise the effects of inter-individual variations on the scaling models, as described elsewhere (Bowes et al. 2021). In the first instance, sexual dimorphism was prevented. Age-dependent mass differences in the metabolically active tissues (Holliday et al. 1967; Holliday 1971; Müller et al. 2011) were minimised by recruiting adults aged 18–40 years (Table 1). To minimise data clustering, and its adverse impact upon scaling (White and Seymour 2005), 10–15 participants were initially recruited across each of five body-mass classes (Table 1; extra small: 55–65 kg; small: 66–76 kg; medium: 77–87 kg; large: 88–98 kg; extra large: > 99 kg), although the loss of three extra-large individuals represents a sampling limitation. Finally, variations in body composition were minimised by standardising subcutaneous adiposity, such that only those with average-to-low adiposity were included (i.e., height-adjusted sum of six skinfold thicknesses ≤ 88 mm; Ross and Wilson 1974).

In-house, historical participants

Data were also extracted from an in-house database of published observations for peak aerobic power (Booth et al. 2001; Taylor et al. 2012; Peoples et al. 2016, 2017; Craddock et al. 2021). In those investigations, one-off (incremental) treadmill runs to volitional exhaustion were performed on physically active subjects. Data were extracted for participants aged 18–40 years (N = 54), and covering a body-mass range of 54.6–101.7 kg {age: 26 years [standard deviation (SD) 6.3]; body mass: 76.3 kg (SD 8.7); height 179.5 cm (SD 6.3); absolute peak aerobic power: 4.44 L min−1 (SD 0.54)}. However, clustering occurred within the small-to-medium body-mass classes (extra small: N = 7; small: N = 21; medium: N = 22; large: N = 3; extra large: N = 1), as defined for the experimental sample. Those data were initially used to evaluate the robustness of the maximal-exercise scaling exponent derived from the experimental subjects, when applied to data collected for the same exercise mode, but without the physical and physiological attributes of those individuals being standardised. Secondly, to evaluate the possible impact of applying scaling procedures to a heterogeneous sample of convenience, over which no attempt was made to obtain either a wide or even distribution of body masses, the peak aerobic-power data of those 54 individuals were independently scaled.

Procedures

Experimental overview

Incremental (graded) exercise tests were administered to the experimental participants on three, non-consecutive days, starting at the same time. That approach minimised the impact of familiarisation and practice effects on performance (Shephard et al. 1968), and increased comfort and confidence during subsequent treadmill exercise. Data from the last maximal-exercise test were used to represent peak aerobic power. During each laboratory visit (~ 23 °C and ~ 50% relative humidity), subjects commenced with blocks of steady-state walking (which also served as a warm-up), followed by seated rest and then an incremental forcing function (maximal treadmill ramp) to volitional exhaustion. Subjects wore running shoes, shorts, t-shirt and a torso safety harness. Each ramp test commenced at a unique, pre-determined running speed for each person (8–12 km h−1; horizontal), with only the gradient increasing thereafter (1% steps each minute: ~ 12 min). The criterion for test termination was volitional failure, with verbal encouragement provided. Steady-state exercise data were collected during horizontal walking (15 min at 4.8 km h−1), before the third incremental exercise test.

Prior to commencing this experiment, a pilot trial was undertaken in which 10 individuals (females and males) with similar endurance-training behaviours, performed three maximal-exercise trials, as per the main experiment. Oxygen consumption was measured continuously during each trial. Peak aerobic power did not change significantly across those trials [trial 1: 3.69 L min−1 (SD 0.77); trial 2: 3.71 L min−1 (SD 0.75); trial 3: 3.59 L min−1 (SD 0.86); P > 0.05]. It was therefore interpreted that, for similarly trained subjects, treadmill familiarity and learning effects would not introduce measurement artefacts.

Experimental standardisation

Subjects were asked to present in a well-rested (~ 8 h sleep) and normally hydrated state, having abstained from strenuous physical activity over the previous 24 h. They were also requested to consume a high-carbohydrate, low-fat meal > 2 h before testing, and to maintain normal fluid intakes.

Measurements

Anthropometric and skinfold-thickness measurements were taken for the experimental subjects, as described elsewhere (Bowes et al. 2021). From those indices, height-adjusted body-mass and skinfold-thickness data were derived, following the method of Ross and Wilson (1974): mass (kg) × [170.18/height (cm)]^3; and skinfold sum (cm) × [170.18/height (cm)]. Those values were used to minimise inter-individual variability in relative muscularity and adiposity during recruitment.
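The two height adjustments can be sketched as a pair of small functions. This is a minimal illustration of the formulae as written above; the function names and the example participant are hypothetical and do not come from the study.

```python
REFERENCE_HEIGHT_CM = 170.18  # reference stature used in the adjustment formulae


def height_adjusted_mass(mass_kg: float, height_cm: float) -> float:
    """Scale body mass to the reference stature using the cubed height ratio."""
    return mass_kg * (REFERENCE_HEIGHT_CM / height_cm) ** 3


def height_adjusted_skinfold_sum(skinfold_sum: float, height_cm: float) -> float:
    """Scale a skinfold sum to the reference stature using the height ratio."""
    return skinfold_sum * (REFERENCE_HEIGHT_CM / height_cm)


# Hypothetical participant, for illustration only (not a study value).
example_mass = height_adjusted_mass(80.0, 180.0)
example_skinfolds = height_adjusted_skinfold_sum(60.0, 180.0)
```

A person at exactly the reference stature is unchanged by either adjustment, whilst taller individuals are scaled downwards.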

Oxygen consumption was measured during rest and exercise (open-circuit respirometry), with expired gas fractions and expiratory flows sampled from a two-way, low-resistance, non-rebreathing valve (7400 series, Hans Rudolph Inc., Kansas, USA). Those data were analysed continuously (TrueOne 2400, ParvoMedics Inc., Utah, USA) and used to derive oxygen consumption, carbon dioxide production and minute ventilation (15-s averages). Two-point gas [room air plus alpha standard gases (16.00% oxygen, 4.00% carbon dioxide, 80.00% nitrogen)] and a range of flow calibrations (50 to > 300 L min−1) preceded every trial. Steady-state oxygen consumption was averaged over the last 5 min, whilst the highest oxygen consumption observed during incremental running was taken as each individual’s peak value.

Design and analysis

To minimise subsequent scaling errors, assumptions underlying both the sampling and scaling procedures were evaluated. Those processes began with an assessment of the sample homogeneity. One-way Analysis of Variance was used to compare both raw and height-adjusted morphological data across the body-mass groups, with differences evaluated for statistical significance using Tukey’s HSD post hoc procedure. If between-group differences existed, Analysis of Covariance was used to evaluate whether or not that variable, when unadjusted, had a significant interaction with the scaling relationship; such interactions did not occur. For all statistical analyses, alpha was set at the 0.05 level. Data are reported as means with standard deviations, and 95% confidence intervals for the scaling coefficients.

The relationship between peak aerobic power and body mass was modelled both linearly and non-linearly (hypothesis one). Indeed, for those scaling models to be valid, each had to satisfy both biological and statistical assumptions, because best fit does not necessarily equate with the correct solution (Sholl 1948; Motulsky and Ransnas 1987; Nevill et al. 1992). The biological criterion was an origin intercept (zero–zero), which was appraised using untransformed data, and evaluated using t tests. Since a non-zero oxygen consumption when the body mass is zero is biologically impossible (Krogh 1916; Kleiber 1950), that criterion was heavily weighted.

To provide an evaluation of the model shape, non-linear data were initially log-transformed [log-linear (log10 base); as described elsewhere; Bowes et al. 2021] so that the assumptions underlying linear regression could be appraised (normality, linearity, homeoscedasticity and an absence of first-order correlations). Normality was assessed using skewness and kurtosis z scores (± 3.29 when 50 < N < 300; Kim 2013), as well as the Shapiro–Wilk test. Linearity and homeoscedasticity were evaluated using scatter and residual plots (respectively). The Durbin–Watson test was used to assess first-order correlations between variables.
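Two of those checks can be illustrated with textbook formulae. The sketch below assumes the adjusted sample skewness and its standard error for the z score, and the conventional Durbin–Watson statistic for the residuals; whether the statistical software used in this study applied exactly those definitions is an assumption, and the function names are hypothetical.

```python
import math


def skewness_z(values):
    """Return the sample skewness divided by its standard error."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Adjusted sample skewness (third standardised moment with bias correction).
    skew = n / ((n - 1) * (n - 2)) * sum(((v - mean) / sd) ** 3 for v in values)
    # Standard error of the skewness for a sample of size n.
    se = math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se


def durbin_watson(residuals):
    """Return the Durbin-Watson statistic for an ordered list of residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator
```

A skewness z score would then be compared against the ± 3.29 limits, and a Durbin–Watson value close to 2 is conventionally read as an absence of first-order correlation.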

To enhance the accuracy of each scaling model, the following statistical procedures were followed. Firstly, Cook’s distance test was used to identify potential outliers within the absolute oxygen-consumption data that might disproportionately influence scaling, with the cut-off set to 4/(N − k − 2); where N was the sample size and k the number of coefficients. Data from three individuals were omitted (Analysis of Covariance or an abnormally low absolute value). The scaling models were then bootstrapped (1000 repetitions using resampling) to generate coefficient means, and Bias-Corrected and Accelerated analyses were used to yield 95% confidence intervals that were independent of skewness produced during bootstrapping.
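The outlier screen can be sketched for a simple least-squares line. This is an illustration only: it uses the standard Cook's distance formula for a two-coefficient model, takes k = 2 in the cut-off as an assumption, and the function names are hypothetical.

```python
def cooks_distances(x, y):
    """Cook's distance for each point in a least-squares line y = a + b*x."""
    n = len(x)
    p = 2  # fitted coefficients: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)
    distances = []
    for xi, e in zip(x, residuals):
        # Leverage of this observation in a simple linear regression.
        leverage = 1.0 / n + (xi - x_mean) ** 2 / sxx
        distances.append((e ** 2 / (p * mse)) * leverage / (1.0 - leverage) ** 2)
    return distances


def flag_influential(x, y, k=2):
    """Indices whose Cook's distance exceeds the 4/(N - k - 2) cut-off."""
    cutoff = 4.0 / (len(x) - k - 2)  # k = 2 is an assumption of this sketch
    return [i for i, d in enumerate(cooks_distances(x, y)) if d > cutoff]
```

Flagged observations are candidates for closer inspection; in this study, exclusion also rested on the Analysis of Covariance outcome or an abnormally low absolute value.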

The relationship between peak aerobic power and body mass was modelled using both linear (first-order polynomial: Eq. 1A) and non-linear scaling (allometric: Eq. 2A), with ratiometric analysis also performed. For steady-state walking, both scaling models were again used, as described elsewhere (Bowes 2018), with only the latter being reported here. Non-linear data were also log-transformed for use with analyses that assumed linearity (log-linear; Eq. 3A). For all equations, body masses are in kg and oxygen-consumption data are in absolute units (L min−1). The following equation numbers appear throughout the manuscript, but with the alphabetical qualifiers varying according to their application.

$$\text{Linear regression: Oxygen consumption} = a + b \times \text{mass} \quad \left[\text{L min}^{-1}\right], \tag{1A}$$

where a = ordinate intercept [L min−1], b = linear coefficient [slope; L kg−1 min−1].

$$\text{Allometric regression: Oxygen consumption} = b \times \text{mass}^{k} \quad \left[\text{L min}^{-1}\right], \tag{2A}$$

where b = linear coefficient [L kg−1 min−1], k = coefficient of allometry (scaling exponent) for body mass [kg].

$$\text{Log-linear: } \log_{10}(\text{oxygen consumption}) = \log_{10}(b) + k \times \log_{10}(\text{mass}) \quad \left[\text{L min}^{-1}\right], \tag{3A}$$

where b = linear coefficient [L kg−1 min−1], k = scaling exponent for body mass [kg].
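The log-linear form (Eq. 3A) can be fitted by ordinary least squares on the log10-transformed variables, with b recovered by back-transforming the intercept. The sketch below uses synthetic, noise-free data generated from an arbitrary power function; the function name and all values are hypothetical.

```python
import math


def fit_log_linear(masses_kg, vo2_l_min):
    """Fit log10(VO2) = log10(b) + k * log10(mass); return (b, k)."""
    log_m = [math.log10(m) for m in masses_kg]
    log_v = [math.log10(v) for v in vo2_l_min]
    n = len(log_m)
    mean_m = sum(log_m) / n
    mean_v = sum(log_v) / n
    sxx = sum((lm - mean_m) ** 2 for lm in log_m)
    sxy = sum((lm - mean_m) * (lv - mean_v) for lm, lv in zip(log_m, log_v))
    k = sxy / sxx                       # slope = scaling exponent
    b = 10.0 ** (mean_v - k * mean_m)   # back-transformed intercept
    return b, k


# Synthetic data from an arbitrary power function (not study data).
masses = [55.0, 65.0, 75.0, 85.0, 95.0, 105.0]
vo2 = [0.2 * m ** 0.7 for m in masses]
b_hat, k_hat = fit_log_linear(masses, vo2)
```

With noise-free data, the generating coefficient and exponent are recovered; with scattered data, the back-transformed coefficients need not equal those from a direct non-linear fit of Eq. 2A.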

A limitation of the present study, which was primarily descriptive, rather than mechanistic in nature, is that, whilst both the linear coefficient and the coefficient of allometry (Eq. 2A) will vary, only variations in the latter have been explored. Therefore, it would be remiss to ignore the impact of variations in the linear coefficient, which can independently, and in combination with changes in the scaling exponent, significantly affect the prediction model. For example, when using Eq. 2A to predict absolute peak aerobic power from body mass alone, a 10% variation in the linear coefficient would result in an equivalent relative change across the complete body-mass range. One suspects that few would have that objective. Instead, it is much more likely that others would choose to derive mass-specific values for peak aerobic power, having first measured absolute oxygen consumption. That conversion only requires the coefficient of allometry (Eq. 4), which is the central focus of this manuscript.

$$Y_{\text{adjusted}} = Y_{\text{measured}} / \text{mass}^{k} \quad \left[\text{mL kg}^{-k}\,\text{min}^{-1}\right], \tag{4}$$

where Y_adjusted = mass-specific oxygen consumption, Y_measured = absolute oxygen consumption [L min−1], k = scaling exponent for body mass [kg].
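Eq. 4 can be sketched as a one-line conversion. The function name and the example individual are hypothetical, and the multiplication by 1000 (L to mL) is an assumption made here so that the output matches the mL-based unit quoted for Eq. 4.

```python
def mass_specific_vo2(vo2_l_min: float, mass_kg: float, k: float) -> float:
    """Apply Eq. 4: absolute oxygen consumption divided by mass**k.

    The input is converted from L/min to mL/min first (an assumption of
    this sketch), giving a result in mL kg^-k min^-1.
    """
    return vo2_l_min * 1000.0 / mass_kg ** k


# Hypothetical individual, for illustration only (not a study value).
ratiometric = mass_specific_vo2(4.0, 80.0, 1.0)    # k = 1: mL kg^-1 min^-1
allometric = mass_specific_vo2(4.0, 80.0, 0.75)    # k = 0.75: mL kg^-0.75 min^-1
```

Values computed with different exponents carry different units, so they can only be compared with other values derived using the same exponent.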

Differences in the goodness-of-fit among those models were compared using the root-mean-square error (RMSE), the Akaike Information Criterion (AIC), coefficients of determination (r^2) and F tests (Motulsky and Ransnas 1987). Pearson’s correlation coefficient was used to evaluate the remaining mass bias (residuals) within the scaled oxygen-consumption data, with correlations that approached zero indicating successful removal of that bias. Between-model comparisons were made using Analysis of Covariance, with a significant interaction effect demonstrating that those models had unique slopes (linear: linear coefficients; non-linear: exponent). Differences among the non-linear exponents for oxygen consumption during basal conditions, steady-state walking and maximal exercise were evaluated using Analysis of Covariance. That test required a sample of at least 52 participants to detect between-group slope (exponent) differences of a large effect size (Cohen’s f = 0.40), with a statistical power of at least 80% (Faul et al. 2007). Thus, both the experimental and historical samples were adequately powered (a priori) for these analyses. The exponent for peak aerobic power was also compared with both the peak animal exponent (mass^0.87; Weibel et al. 2004) and the proposed theoretical maximal exponent (mass^0.92; Darveau et al. 2002) using t tests.
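Three of those indices can be sketched from observed and model-predicted values. The exact RMSE and AIC definitions used by the statistical software are not stated here, so the versions below are assumptions: RMSE divides the squared error by N, and AIC uses the Gaussian log-likelihood with the error variance counted as one additional parameter. The function names are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error, dividing the squared error by N."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def aic_gaussian(observed, predicted, n_coefficients):
    """AIC from the Gaussian log-likelihood of the residuals.

    The error variance is counted as one parameter in addition to the
    fitted coefficients (an assumption of this sketch).
    """
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    log_likelihood = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sse / n) + 1.0)
    return 2.0 * (n_coefficients + 1) - 2.0 * log_likelihood


def pearson_r(x, y):
    """Pearson correlation, e.g. between body mass and scaled values or residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)
```

Lower RMSE and AIC values indicate a closer fit, whilst a correlation near zero between body mass and the scaled data indicates that little mass bias remains.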

To evaluate the robustness of the final experimental scaling model, when applied to data obtained from an independent population sample, an in-house sample of convenience was used. Those data were also allometrically scaled (Eq. 2A) to evaluate the impact of performing that analysis without controlling for variations in the physical and physiological characteristics, or the range and distributions of the body masses, of those subjects.

Results

Homogeneity of the experimental sample

Whilst the experimental participants were distributed across a more than twofold elevation in body mass (Table 1), inter-class differences were not observed for either their height-adjusted body mass or adiposity (Table 1; P > 0.05), except for comparisons between the lightest and heaviest groups (P < 0.05). When the impact of those differences on the scaling models was evaluated, significant effects were not observed for either measure (P > 0.05). Those outcomes support the interpretation that the experimental sample was relatively homogeneous, and that the influences of relative body mass (proportionality) and adiposity had been sufficiently minimised for the current scaling purposes.

Integrity of the experimental data

The absolute oxygen consumption of the experimental subjects increased ~ 16-fold from the supine, basal state [0.27 L min−1 (SD 0.03)] through to volitional exhaustion on the third maximal-exercise trial [4.25 L min−1 (SD 0.71)]. The respective heart rates were 59 beats min−1 (SD 1) and 194 beats min−1 (SD 10). Those data are consistent with values expected at the end of an incremental (running) forcing function, and, in combination with the non-significant changes in peak aerobic power across the three pilot trials, permit one to suggest that they represented valid measures of peak aerobic power. As expected, those data displayed a positive mass bias (Fig. 1), with values ranging from 3.01 L min−1 (extra-small subject: 59.8 kg) to 6.27 L min−1 (extra-large subject: 107.6 kg). When expressed ratiometrically, a negative mass bias became evident: 56.0 mL kg−1 min−1 (mean of the extra-small subjects) and 48.8 mL kg−1 min−1 (extra-large subjects).

Fig. 1 Time-series oxygen-consumption data from five experimental participants, one from each body-mass class, during incremental treadmill running to volitional exhaustion. Data are 15-s averages measured continuously. Oxygen-consumption plateaux are infrequently observed, even in experienced and familiarised subjects

Scaling-model assumptions

Both the body-mass and peak aerobic-power data were normally distributed (Table 2; Shapiro–Wilk test, P > 0.05), as confirmed from the z scores for skewness and kurtosis, which were well within the pre-defined range for samples of this size (Table 2; ± 3.29, P > 0.05; Kim 2013). The raw and log-transformed independent and dependent variables were linearly related [Fig. 2A (linear: r^2 = 0.65) and Fig. 2C (log-linear: r^2 = 0.67)], with an even distribution of residuals (homeoscedasticity) about those regressions (Fig. 2B and D, respectively). Indeed, the sum of those residuals approached zero (Fig. 2D; r = − 0.001; P > 0.05). Notwithstanding the intrinsic relationship between body mass and peak aerobic power, auto-correlation was not observed within either model (Table 2; Durbin–Watson test, P > 0.05). Accordingly, both linear and allometric scaling were statistically justified.

Table 2 Normality and auto-correlation (first-order correlation) tests performed on peak aerobic power (dependent variable) and body mass (raw and log-linear forms), as well as the model residuals (linear and log-linear regressions; N = 60)

Fig. 2 Scatter plots of the independent (body mass) and dependent (peak aerobic power) variables from the experimental participants (N = 60) obtained at volitional exhaustion. Those data were used to evaluate the assumptions of linearity between those variables, and are presented as absolute values on both linear (A) and logarithmic scales (C). The respective residual plots for both formats are presented in B and D, with E containing the residuals resulting from the ratiometric analysis of the data presented in A. Each residual represents the difference between the measured peak aerobic power and the predicted value derived for each body mass, as determined from linear-regression analysis. Figures 2B, D and E were then used to evaluate homeoscedasticity

Scaling peak aerobic power

The ratio standard demonstrated a visually [Figs. 2E and 3 (dashed line)] and statistically inferior fit (RMSE = 0.456; AIC = 80.033), when compared with the linear (Eq. 1B) and allometric models (Eq. 2B; the corresponding statistical parameters appear in subsequent paragraphs). The positive, absolute body-mass bias for peak aerobic power (Fig. 2A; r = 0.81) was converted into a negative bias following ratiometric scaling (r = − 0.40; P < 0.05; Fig. 2E), resulting in the mass-specific data for the ten heaviest participants (48.77 mL kg−1 min−1) differing significantly from the corresponding data for the ten lightest individuals (56.03 mL kg−1 min−1; P < 0.05). Thus, ratiometric analysis did not remove the inherent mass bias, but merely changed its form, so that method was not further explored.

Fig. 3 Allometric (power: Eq. 2B) and linear regression (least-squares, first-order polynomial: Eq. 1B) models of the relationship between absolute, peak aerobic power and body mass within the experimental subjects (N = 60; circles). Also presented is the commonly used, ratiometric treatment of those data. The non-zero ordinate intercept (1.05 L min−1; r^2 = 0.65; P < 0.05) for the linear model violates the biological assumption underpinning all scaling models (presence of an origin intercept), and is thermodynamically impossible. The allometric model revealed a mass exponent of 0.75 (r^2 = 0.67; P < 0.05)

When least-squares, linear regression was used to model that relationship [Figs. 2A, 3 (dotted line); Eq. 1B], a statistically strong linear model was obtained (RMSE = 0.418; AIC = 71.457), which accounted for approximately 65% (r^2) of the variation in oxygen consumption on the basis of body mass. Whilst the goodness-of-fit indices (RMSE and AIC) for both the linear and ratiometric models were similar, only the linear-regression model satisfactorily normalised data for variations in body mass (Fig. 2B; r = 0.005; P > 0.05). Thus, goodness-of-fit, on its own, was insufficient for evaluating the appropriateness of those models. Nevertheless, predictive errors were evident when the linear model was applied to masses lower than the experimental body-mass range. That was epitomised by a significant, non-zero ordinate intercept (Fig. 3; 1.05 L min−1; P < 0.05). Therefore, whilst linear scaling was statistically suitable, it was unable to model that relationship for individuals lighter than the smallest experimental subject, and its rejection was theoretically justified.

$$\text{Peak aerobic power} = 1.049\,\{0.325\text{–}1.677\} + 0.040\,\{0.032\text{–}0.050\} \times \text{mass}, \tag{1B}$$

where oxygen consumption units are L min−1 and body mass is in kg, 95% confidence intervals are shown within the braces.

In contrast, allometric scaling implicitly satisfied the biological criterion [Fig. 3 (solid line); Eq. 2B], whilst simultaneously providing a statistically strong fit to the experimental data (P < 0.05). That model did not differ significantly from the linear-regression outcome [r^2 = 0.67 (log-linear format; Eq. 3B); RMSE = 0.414; AIC = 70.560; F test, P > 0.05]. Accordingly, allometric modelling was superior to linear scaling for both describing and normalising the relationship between peak aerobic power and body mass in healthy, adult males. The allometric model was also expanded to include an ordinate intercept (metabolic rate = b × mass^k + c), as was performed when analysing basal and resting data (Bowes et al. 2021), to evaluate the possibility that it might further improve the model fit. However, that model could not be confidently applied, due to the absence of data below 56 kg. Therefore, just as had been found with the resting data and body masses > 10 kg, the addition of an intercept did not improve the model fit.

$$\text{Peak aerobic power} = 0.160\,\{0.075\text{–}0.292\} \times \text{mass}^{0.750\,\{0.615\text{–}0.924\}}, \tag{2B}$$

where oxygen consumption units are L min−1 and body mass is in kg, 95% confidence intervals are shown within the braces.

$$\log_{10}(\text{peak aerobic power}) = -0.863\,\{-1.140 \text{ to } -0.595\} + 0.784\,\{0.641\text{–}0.935\} \times \log_{10}(\text{mass}), \tag{3B}$$

where oxygen consumption units are L min−1 and body mass is in kg, 95% confidence intervals are shown within the braces.
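The practical difference between the linear and allometric models can be illustrated by evaluating the point estimates of Eqs. 1B and 2B at several body masses. This sketch ignores the confidence intervals, the function names are hypothetical, and the 0 and 20 kg masses lie outside the sampled range; they are included only to show how each model behaves towards the origin.

```python
def peak_vo2_linear(mass_kg: float) -> float:
    """Eq. 1B point estimates: intercept 1.049 L/min, slope 0.040 L/kg/min."""
    return 1.049 + 0.040 * mass_kg


def peak_vo2_allometric(mass_kg: float) -> float:
    """Eq. 2B point estimates: coefficient 0.160, exponent 0.750."""
    return 0.160 * mass_kg ** 0.750


# Compare the two models within and below the sampled body-mass range.
for mass in (0.0, 20.0, 60.0, 80.0, 100.0):
    print(
        f"{mass:6.1f} kg  linear {peak_vo2_linear(mass):.2f}  "
        f"allometric {peak_vo2_allometric(mass):.2f} L/min"
    )
```

Within the sampled range, the two sets of predictions are close to one another, whereas only the allometric model returns zero oxygen consumption at zero mass.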

In combination with previous analyses of basal metabolism (Bowes et al. 2021), it was evident that the metabolic rate of men scaled allometrically with body mass across the complete metabolic range (aerobic scope), and hypothesis one was accepted. The human exponent for peak aerobic power (mass^0.75) did not differ significantly from the inter-specific, mammalian exponent (mass^0.87; P > 0.05; Weibel et al. 2004), but was significantly smaller than the maximal exponent postulated by Darveau et al. (2002; mass^0.92; P < 0.05).

Allometric scaling of steady-state oxygen consumption

Having demonstrated elsewhere the appropriateness of allometrically scaling steady-state oxygen consumption (Bowes 2018), data obtained from the experimental subjects during horizontal walking (4.8 km h−1) were scaled in both allometric (Eq. 2C) and log-linear formats (Eq. 3C). A statistically strong allometric model was again obtained (RMSE = 0.071).

$$\text{Steady-state oxygen consumption} = 0.023\,\{0.014 - 0.037\} \times \text{mass}^{0.865\,\{0.759 - 0.982\}},$$
(2C)

where oxygen consumption is in L min−1 and body mass is in kg; 95% confidence intervals are shown within the braces.

$$\log_{10}(\text{steady-state oxygen consumption}) = -1.630\,\{-1.855 \text{ to } -1.421\} + 0.857\,\{0.748 - 0.976\} \times \log_{10}(\text{mass}),$$
(3C)

where oxygen consumption is in L min−1 and body mass is in kg; 95% confidence intervals are shown within the braces.

Scaling oxygen consumption across metabolic intensities

Since 60 individuals participated in all three trials, and since allometric modelling was superior for each metabolic state, a unique opportunity arose to evaluate the effect of a change in metabolic intensity on the scaling exponent. To achieve that, absolute oxygen-consumption data for the basal state were scaled (Eq. 2D; RMSE = 0.019; AIC = − 301.908), rather than using the metabolic rates reported elsewhere (MJ day−1; Bowes et al. 2021), and compared with both the steady-state and peak scaling models. The basal scaling exponent (mass0.57) differed slightly, but not significantly (P > 0.05), from the exponent reported for basal metabolic rate (mass0.50–0.55), due to differences in subject numbers between investigations (60 and 68, respectively). Relative to the basal exponent, the exponents for both steady-state (Eq. 2C; mass0.87) and maximal exercise (Eq. 2B; mass0.75) were significantly greater (P < 0.05), so hypothesis two was accepted. However, the exponents for steady-state and maximal exercise did not differ significantly (P > 0.05). Therefore, a stepwise elevation in the body-mass exponent appeared not to accompany increases in metabolic rate, and hypothesis three could not be accepted.

$$\text{Basal oxygen consumption} = 0.023\,\{0.015 - 0.038\} \times \text{mass}^{0.567\,\{0.458 - 0.671\}},$$
(2D)

where oxygen consumption is in L min−1 and body mass is in kg; 95% confidence intervals are shown within the braces.

To evaluate the significance of the exponent change when moving from resting to exercising states, and also to assess the interchangeability of those exponents, peak data from the experimental sample were re-scaled using the basal exponent (mass0.57). The residuals from that analysis revealed a significant, positive mass bias (r = 0.34; P < 0.05), thereby demonstrating the importance of using the scaling exponent appropriate to the metabolic intensity of interest, even when analysing data from the same individuals. However, since the exercising exponents were not significantly different (P > 0.05), it seems that only two exponents (basal and exercising) are necessary to account for mass-dependent variations in oxygen consumption across the metabolic range of healthy men.
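The logic of that residual check can be sketched with synthetic data. The example below assumes noise-free peak values generated directly from Eq. 2B over a hypothetical mass range of 56 to 116 kg, so the correlation it produces is far stronger than the r = 0.34 observed in the real, noisy sample; only the direction of the bias is illustrated.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical masses (kg); peak values are synthetic and follow Eq. 2B exactly.
masses = [56 + 4 * i for i in range(16)]
peak = [0.160 * m ** 0.750 for m in masses]

# Normalise the same peak data with the basal and the peak exponents.
scaled_with_basal_k = [v / m ** 0.57 for v, m in zip(peak, masses)]
scaled_with_peak_k = [v / m ** 0.75 for v, m in zip(peak, masses)]

# A positive correlation with mass means the basal exponent under-corrects for mass.
r_bias = pearson_r(masses, scaled_with_basal_k)
# With the matching exponent, the normalised values collapse onto one constant.
spread = max(scaled_with_peak_k) - min(scaled_with_peak_k)

print(f"r (basal exponent applied to peak data): {r_bias:.3f}")
print(f"spread after scaling with the peak exponent: {spread:.2e}")
```

Dividing by too small an exponent leaves heavier individuals with systematically larger normalised values, which is the positive mass bias described above.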

An assessment of scaling robustness: peak aerobic power

The robustness of the allometric model for maximal treadmill exercise (Eq. 2B) was evaluated by applying it to data from the in-house database. Both sets of data were overlaid (Fig. 4), revealing good visual agreement, with the possible exception of two to three historical subjects (~ 80 kg).

Fig. 4

Peak aerobic-power and body-mass data for the experimental subjects (N = 60; circles), with the corresponding allometric regression, overlaid with the co-ordinates obtained from the in-house, historical sample (N = 54; triangles)

Nevertheless, data clustering was evident, with ~ 80% of the historical participants having masses between 66 and 87 kg, so the body-mass distribution was uneven. Furthermore, 76% of those values were above the allometric regression (Fig. 4), resulting in under-predictions when using Eq. 2B. Whilst the average prediction error (residual) was 0.32 L min−1 (SD 0.47), the greatest absolute difference was 1.55 L min−1 (range −0.84 to 1.55), with 71% of the absolute residuals exceeding 0.25 L min−1. For the experimental participants, the residuals ranged from −0.77 to 0.92 L min−1, with 65% of the absolute residuals exceeding 0.25 L min−1. The similarity of those outcomes indicates that the experimental model was sufficiently robust to scale the historical data, and hypothesis four was cautiously accepted.

Finally, to evaluate the impact of applying a unique allometric regression to the historical database, those data were independently modelled (Eq. 2E). Since those data did not conform to the criteria necessary for generating a meaningful, independent mathematical scaling model (White and Seymour 2005), the purpose of that exercise was only to evaluate the implications of that practice. From a statistical perspective, the in-house model (Eq. 2E) was similar to the experimental model (Eq. 2B) in its goodness-of-fit (ANCOVA: P > 0.05; F test: P > 0.05; RMSE = 0.458). However, the body-mass exponent (mass0.55) approximated that observed for the basal oxygen consumption of the experimental subjects (mass0.57; P > 0.05). That outcome was theoretically unsound, and was not further considered.

$$\text{Peak aerobic power} = 0.404\,\{0.154 - 1.116\} \times \text{mass}^{0.553\,\{0.323 - 0.788\}},$$
(2E)

where oxygen consumption is in L min−1 and body mass is in kg; 95% confidence intervals are shown within the braces.

Discussion

In this manuscript, the scaling of whole-body metabolism against body mass has been re-visited in healthy men during two steady states (basal rest and walking), as well as during one of the most unsteady of physiological states: volitional exhaustion. Four significant outcomes have arisen. Firstly, whilst both the linear and non-linear scaling of peak aerobic power were supported statistically, ratiometric scaling merely converted a positive mass bias into a negative bias, and linear regression yielded a biologically impossible ordinate intercept. This confirms that the peak aerobic power of male adults scales allometrically with body mass (mass0.75), as it does for other mammals (mass~0.87; Taylor et al. 1981; Bishop 1999; Weibel et al. 2004). Therefore, across the entire metabolic range (aerobic scope) of these participants, oxygen consumption scaled allometrically. Secondly, the exponent for peak aerobic power was significantly larger than observed at rest (mass0.57). Thirdly, whilst steady-state oxygen consumption also scaled allometrically with body mass (mass0.87), its exponent did not differ significantly from that derived for peak aerobic power. Hence, the scaling exponent did not increase in a stepwise manner with increments in metabolic activity. Finally, the exponent for peak aerobic power could be used to normalise similar historical data, albeit with the expected noise. Nonetheless, the independent application of allometric scaling to those historical data, which infringed some sampling pre-requisites, resulted in an improbable relationship with body mass.

Linear versus non-linear scaling of peak aerobic power

The principal objective behind modelling these relationships is to remove the influence of inter-individual mass variations. The resulting mass-specific data may then be used to explore mechanistic questions, to make comparisons across individuals of varying size or to predict the effects of changes in overall mass during ambulatory activities, regardless of the reason for such changes (e.g., adiposity changes, load carriage). However, the validity of all models depends upon the mathematical precision of those procedures and the adherence of that relationship to biological principles. Whilst both are essential, the latter is of greater importance, since goodness-of-fit has little meaning if the relationship violates first principles (Sholl 1948; Motulsky and Ransnas 1987; Nevill et al. 1992; Rogers et al. 1995).

As shown by others, the peak aerobic power of humans (Vaage and Hermansen 1977; Secher et al. 1983; Bergh et al. 1991; Nevill et al. 1992; Rogers et al. 1995; Batterham and Jackson 2003; Markovic et al. 2007; Lolli et al. 2017) and other mammals (Taylor et al. 1981; Bishop 1999; Weibel et al. 2004) scales allometrically. Whilst linear analyses are frequently used for human data, ratiometric scaling is unable to remove the body-mass bias (Fig. 2E), and a single linear regression cannot be applied to the full body-mass range without violating the First Law of Thermodynamics as body mass approaches zero (Fig. 3). Thus, both linear procedures provided inaccurate descriptions of the underlying biological relationships, and yielded spurious outcomes when applied to morphologically diverse samples. Therefore, neither method is recommended for scaling peak aerobic power.
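Both shortcomings of the linear procedures can be reproduced in a brief sketch. It assumes noise-free, synthetic peak data that follow Eq. 2B exactly across a hypothetical mass range of 56 to 116 kg; the helper `ols_line` is an illustrative name, and real data would scatter around these trends.

```python
def ols_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical masses (kg); peak values are synthetic and follow Eq. 2B exactly.
masses = [56 + 4 * i for i in range(16)]
peak = [0.160 * m ** 0.750 for m in masses]

# Ratiometric scaling: dividing by mass leaves values that fall as mass rises.
ratio = [v / m for v, m in zip(peak, masses)]
ratio_slope, _ = ols_line(masses, ratio)

# Linear regression: a straight line through the same data has a positive
# intercept, implying oxygen consumption at zero body mass.
linear_slope, linear_intercept = ols_line(masses, peak)

print(f"slope of (peak / mass) against mass: {ratio_slope:.5f}")
print(f"linear-regression intercept: {linear_intercept:.2f} L/min")
```

Because the exponent is below one, the per-kilogram values decline with mass (a negative bias), whilst the fitted straight line cannot pass through the origin.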

The novelty of the current procedures centres around the methods used by comparative physiologists (Taylor et al. 1981; Bishop 1999; Weibel et al. 2004; White and Seymour 2005; Sieg et al. 2009; White and Kearney 2014), for it is they who have most advanced this research, and developed recruitment and analytical principles for others to follow. Most notably, those principles require the recruitment of a sufficiently large sample of homogeneous participants covering a wide body-mass range, with those individuals distributed somewhat equally across that range. To our knowledge, the current procedures constitute the first attempt to thoroughly evaluate that relationship within humans. Collectively, these animal and human experiments provide convincing evidence that peak aerobic power varies non-linearly with body mass, as does basal metabolic rate (Bowes et al. 2021). Therefore, across the complete metabolic range, oxygen consumption for humans can be described using a simple allometric relationship with mass.

Nevertheless, a reduced sensitivity of the allometric model for peak aerobic power was observed, relative to data obtained from the two steady-state trials, as indicated by the poorer goodness-of-fit, the broader confidence intervals for the linear coefficient and greater discrepancies between the raw and log-transformed regression coefficients. Since peak data are inherently more variable than steady-state data, and since the experimental samples covered only a relatively narrow body-mass range, the reduced sensitivity was not surprising. Consequently, it is possible that a small change in the regression coefficients, but within the 95% confidence intervals, could produce an equally good model fit. However, bootstrapping was used within these analyses to maximise the likelihood of determining the most appropriate coefficients. Notwithstanding these limitations, it is recommended that the current maximal-exercise exponent (mass0.75) be used to derive mass-independent, oxygen-consumption data from data obtained during maximal treadmill exercise in men. However, there exists considerable variability in the scaling exponents reported for peak aerobic power, which must lead to some disquiet amongst students and applied physiologists.

What then is the allometric scaling exponent for peak aerobic power in humans?

The exponent for the peak aerobic power of the experimental subjects (mass0.75) did not differ significantly from the inter-specific exponent for maximally exercising, non-human mammals (mass0.87; P > 0.05; Weibel et al. 2004). Under resting conditions, the non-human, mammalian exponent (mass0.67; White and Seymour 2003) was also larger than that observed for the basal metabolic rate of the current experimental subjects (mass0.50–0.55; P < 0.05; Bowes et al. 2021). That difference was attributable to a species specificity of those resting exponents (White et al. 2009; White and Kearney 2014), which may not obtain during maximal exercise. However, within the human literature, one finds scaling exponents for peak aerobic power varying between 0.67 and 0.94 (e.g. Secher et al. 1983 [mass0.67]; Bergh et al. 1991 [mass0.71]; Nevill et al. 1992 [mass0.67]; Rogers et al. 1995 [mass0.75]; Batterham and Jackson 2003 [mass0.94]; Markovic et al. 2007 [mass0.67–0.89]; Lolli et al. 2017 [mass0.70]).

It seems that the highest design priority for those last seven (human) studies might have been to maximise their sample sizes, with all but two reports (Secher et al. 1983; Rogers et al. 1995) using large databases, or data extracted from the literature. Whilst large samples are beneficial, the quality of those data must be weighted more highly (White and Seymour 2005). Indeed, in none of those studies does one find evidence for obtaining an even distribution of body masses or for minimising inter-subject heterogeneity for variables known to independently influence peak aerobic power. Instead, two groups used both genders, as well as adolescents and adults (Rogers et al. 1995; Lolli et al. 2017), two used data from both mass-supported and unsupported exercise (Secher et al. 1983; Lolli et al. 2017), two groups studied elite athletes (Secher et al. 1983; Bergh et al. 1991), with two others using data from elderly (Batterham and Jackson 2003; Lolli et al. 2017), and even obese participants (Lolli et al. 2017). It is suggested that differences in the supply of, and demand for, energy among such participants might independently modify the scaling exponents.

To illustrate that variability, it is helpful to consider the largest of those investigations (Lolli et al. 2017): 6514 individuals across 36 primary investigations. Twenty-two of those studies focussed on, or included a significant number of, adolescents; one study used obese boys and three emphasised elderly adults. If the current subject exclusion criteria had been applied, 72% of those studies would have been eliminated (2005 participants). Similarly, women would have been excluded (N = 916). To standardise health status, people with a high risk of metabolic syndrome (N = 571) and physical disabilities (N = 45; spinal-cord injuries) would have been eliminated. Thus, > 50% of the participants would have been excluded if steps were taken to minimise heterogeneity. The authors were aware of that variability, and attempted to counteract that effect statistically.
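The arithmetic behind those proportions can be retraced as follows. This sketch assumes that the four excluded groups do not overlap, which the text implies but does not state, and the dictionary labels are illustrative.

```python
# Counts quoted in the paragraph above (Lolli et al. 2017, as summarised there).
total_studies = 36
excluded_studies = 22 + 1 + 3  # adolescent, obese-boy and elderly studies

total_participants = 6514
excluded_participants = {
    "adolescent, obese and elderly studies": 2005,
    "women": 916,
    "high risk of metabolic syndrome": 571,
    "physical disabilities": 45,
}

# Assumption: the groups are mutually exclusive, so their counts can be summed.
excluded_total = sum(excluded_participants.values())
study_fraction = excluded_studies / total_studies
participant_fraction = excluded_total / total_participants

print(f"studies eliminated: {study_fraction:.0%}")
print(f"participants excluded: {excluded_total} ({participant_fraction:.0%})")
```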

Another important consideration was the inclusion of data from mass-supported exercise. In the first instance, a significant proportion of the metabolic rate during treadmill exercise must be attributable to supporting an upright posture and to vertical displacements of the body mass. Thus, running activates a greater muscle mass than does wheelchair ergometry (N = 45), and, to a lesser extent, cycling (N = 22). Rowing is different again, possibly activating the greatest muscle mass, but it is unclear how many rowers were tested. The inclusion of those data within analyses aimed at producing mass-independent data seems counter-intuitive, as does the derivation of fat-free scaling exponents, when adipose tissue, just like any load, contributes to the overall carried mass and its metabolic cost during load-bearing activities.

Nevertheless, the scaling exponent for peak aerobic power derived by Lolli et al. (2017; mass0.70) is close to that obtained from the current experimental subjects (mass0.75), both of which are centrally positioned within the human data range. What are the implications of that similarity? Those authors counselled against using their exponent for scaling. Instead, they recommended scaling to fat-free mass. The current authors, whilst not challenging those analyses, have difficulty accepting the inclusion of such heterogeneous samples. The similarity of the two exponents is difficult to reconcile, and is perhaps a chance occurrence.

To provide some insight into that outcome, in-house, historical data (N = 54) were analysed (Eq. 2E). Although those data were both statistically and visually similar to the experimental data (Fig. 4), the equivalence of the scaling exponent (mass0.55) to the basal oxygen-consumption exponent (mass0.57) demonstrates that, unless appropriate recruitment criteria are followed (White and Seymour 2005), the generation and subsequent use of scaling models should be approached with caution. Indeed, this highlights the sensitivity of the scaling exponent to even slight violations of those criteria, with data from similarly aged, morphologically alike, endurance-trained males yielding a significantly different outcome.

Is the scaling exponent sensitive to an elevation in metabolic intensity?

The scaling exponent for peak aerobic power (mass0.75) increased significantly relative to the basal state (mass0.57). Since that difference was observed within the same sample, it supports the hypothesis that the exponent for oxygen consumption is sensitive to an elevation in metabolic intensity, at least when moving from a basal state to maximal exercise. The mechanisms responsible for that elevation include proportional increases in the metabolically active body mass, in combination with an intensity-dependent progression through the metabolic pathways (Weibel 2002; Darveau et al. 2002).

The exponent change between rest and maximal exercise was anticipated. It was also hypothesised that the exponent would increase in a stepwise manner with increments in metabolic rate. That did not occur; the exponents for walking and maximal running were statistically equivalent (P > 0.05). Therefore, a progressive elevation in the exponent (an allometric cascade) was not evident. It is possible that, unlike the change in skeletal-muscle activity between rest and running, the recruitment of inactive tissues was minimal when moving from walking to maximal running, with the primary change being a variation in the metabolic rate of already active muscles. It is also possible that the variability of those peak data, or even gait differences, may have masked an exercise-intensity effect. Resolution of this issue awaits evidence from a similarly robust experimental design with a much larger sample.

Nonetheless, within each scaling model (Eqs. 1–3), both coefficients could change. In the current analyses, the principal emphasis was upon the exponent (k), with the linear coefficient (b) not being analysed statistically. Inspection of Eqs. 2B–2D reveals that the linear coefficients for the basal and walking conditions were smaller and less variable. However, within the current mass range, those differences did not have a meaningful effect on the capacity of the maximal-exercise exponent to generate mass-independent data across both exercise intensities. This implies that unloaded, horizontal exercise could be scaled across all intensities using a common exponent (mass0.75–0.87), and the recommended exponent for that purpose is 0.75. What remains to be established is whether or not those same relationships obtain when loads are carried and when ambulatory gradients are modified.

Conclusion

From this experiment, it has been demonstrated that, across the entire metabolic range (aerobic scope) of healthy men, oxygen consumption scales allometrically with body mass. During maximal exercise, ratiometric scaling was unable to remove the body-mass bias, whilst linear regression produced biologically inappropriate data for low body masses. Neither of those methods is recommended. The size of the scaling exponent increased significantly between resting (mass0.57) and maximal-exercise states (mass0.75), but remained relatively stable when changing from steady-state walking (mass0.87) to maximal running. Therefore, across all non-resting, metabolic intensities, it is recommended that the exponent for peak aerobic power (mass0.75) be used to derive mass-specific oxygen consumption for adult men [Y_adjusted = Y_measured/mass^k (k = exponent)]. For those remaining unconvinced, it is recommended that, instead of linear analysis, absolute oxygen consumption (L min−1) and body masses should be reported (including data ranges), with individual data provided whenever possible.
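A minimal sketch of that normalisation is given below. The function name, its default argument and the two example men are hypothetical; the only values taken from this report are the recommended exponent (0.75) and the coefficients of Eq. 2B used to generate the example inputs.

```python
def mass_specific_vo2(vo2_l_min: float, mass_kg: float, k: float = 0.75) -> float:
    """Return Y_adjusted = Y_measured / mass^k, in L min^-1 kg^-k."""
    if mass_kg <= 0:
        raise ValueError("body mass must be positive")
    return vo2_l_min / mass_kg ** k


# Two hypothetical men whose peak values lie exactly on Eq. 2B.
light = mass_specific_vo2(0.160 * 60.0 ** 0.75, 60.0)
heavy = mass_specific_vo2(0.160 * 100.0 ** 0.75, 100.0)

print(f"60 kg: {light:.3f}   100 kg: {heavy:.3f}  (L min^-1 kg^-0.75)")
```

When two individuals conform to the same allometric relationship, their adjusted values coincide despite the difference in body mass, which is the sense in which the normalised data are mass-independent.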