Introduction

Graded and ramp-like incremental tests are widely used to evaluate physiological capacity in both clinical and athletic settings. Such tests can also be used to delineate physiological markers that are used to predict performance and guide training. These markers may include peak power output (\({\dot{{\rm W}}} \)peak), maximal oxygen uptake (\({\dot{\text{V}}\text{O}}_{2} \max\)) and submaximal physiological variables such as aerobic or lactate threshold, anaerobic threshold, ventilatory thresholds, and critical power (CP). However, \({\dot{{\rm W}}} \)peak and power output at these submaximal markers, whether expressed as absolute work rate or as a percentage of \({\dot{{\rm W}}} \)peak, will vary according to test design, including stage duration and incremental rate (Adami et al. 2013; Amann et al. 2004; Bentley et al. 2003; Luttikholt et al. 2006; Morton 1994; Peiffer et al. 2005; Weston et al. 2002).

In the field of cycling particularly, \({\dot{{\rm W}}} \)peak is considered a significant performance parameter and even a better predictor of performance than \({\dot{\text{V}}\text{O}}_{2} \max\) (Balmer et al. 2000; Hawley et al. 1992; Lamberts et al. 2012). The effects of stage duration and stage increment on \({\dot{{\rm W}}} \)peak are well understood in a qualitative manner, in that a longer (or shorter) stage duration and/or a smaller (or greater) stage increment result in lower (or higher) \({\dot{{\rm W}}} \)peak (Morton 1994; Zuniga et al. 2012). However, for a valid comparison and meaningful interpretation of \({\dot{{\rm W}}} \)peak data across studies and tests and for consistent use of \({\dot{{\rm W}}} \)peak for establishing work intensities associated with the physiological markers mentioned above, the impact of different test protocols on \({\dot{{\rm W}}} \)peak, i.e., the ‘systematic error’ associated with design of an incremental exercise test (EXT), must be understood in a quantitative manner.

Luttikholt et al. (2006) were the first to present a mathematical model to predict \({\dot{{\rm W}}} \)peak from one EXT to another on the basis of two empirical power–duration relationships representative of moderate to well-trained cyclists. The model was validated against actual \({\dot{{\rm W}}} \)peak data from three different EXTs: EXT with a 25 W increment every 3 min (EXT25/3), EXT25/5 and EXT30/1. The level of agreement between actual and predicted \({\dot{{\rm W}}} \)peak was found to be sufficiently high for the modelling procedure to be of practical use. The goal of this article is to extend the modeling concept to a wider range of cycling abilities and performance characteristics and for a wider range of EXT designs with commonly used stage increments in the range 10–50 W and stage durations in the range 1–5 min.

Methods

Assumptions and principles underlying the prediction model

The basic assumptions and principles underlying the prediction model are described elsewhere (Luttikholt et al. 2006). In summary:

  1. The incremental cycle exercise test to exhaustion involves work stages with a fixed duration and a fixed increment in work rate, without work interruptions between stages (e.g., for blood sampling). The test continues until volitional exhaustion of the subject.

  2. Sustaining a given constant work rate for a time t (min) causes fatigue in a quantifiable manner determined by the ratio t/Tlim, where Tlim is the maximum time (min) for which that work rate could be sustained in a constant work rate test. For example, fatigue accumulation associated with sustaining a work rate with Tlim = 30 min for a time of 3 min is quantified as 3/30 × 100% = 10%.

  3. Fatigue accumulated after completing a given work stage is carried over to the next work stage.

  4. Volitional exhaustion, i.e., ‘task failure’, occurs at the point in time when, in the final work stage, aggregated fatigue across work stages reaches 100%.

  5. \({\dot{{\rm W}}} \)peak is predicted using the formula

    $${\dot{{\rm W}}}{{\rm peak}} = {\dot{{\rm W}}}{{\rm completed}} + \left( {t/{\text{SD}} \times {\text{increment}}} \right)$$
    (1)

    where \({\dot{\rm{W}}} \)completed = work rate for the highest fully completed stage, t = the time (min) the final (non-completed) stage was sustained (if t > 0) and SD = stage duration (min).
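The five principles above can be sketched as a short routine. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `predict_wpeak` and `toy_tlim`, the parameter names, and the toy CP-type curve (CP = 200 W, W′ = 18 kJ) are hypothetical and were chosen only to keep the arithmetic easy to follow.

```python
import math
from typing import Callable


def predict_wpeak(tlim: Callable[[float], float], start_w: float,
                  increment_w: float, stage_min: float) -> float:
    """Predict peak work rate (W) for a continuous stepwise test.

    tlim(work_rate) must return the time limit (min) at that constant work
    rate, or math.inf if the work rate causes no fatigue. The curve must
    become finite at some work rate, otherwise the loop never ends.
    """
    fatigue = 0.0  # accumulated fatigue as a fraction; 1.0 means 100%
    work_rate = start_w
    while True:
        limit = tlim(work_rate)
        stage_cost = stage_min / limit  # principle 2: t / Tlim
        if fatigue + stage_cost >= 1.0:  # principle 4: task failure in this stage
            t = (1.0 - fatigue) * limit  # minutes sustained in the final stage
            completed = work_rate - increment_w
            return completed + t / stage_min * increment_w  # Eq. 1
        fatigue += stage_cost  # principle 3: fatigue is carried over
        work_rate += increment_w


def toy_tlim(work_rate: float) -> float:
    """Hypothetical CP-type curve: Tlim (min) = 300 / (work rate - 200 W)."""
    return math.inf if work_rate <= 200.0 else 300.0 / (work_rate - 200.0)


# 25 W steps every 3 min from 200 W: accumulated fatigue is 0, 0.25 and 0.75
# after the first three stages; failure comes 1 min into the 275 W stage.
print(round(predict_wpeak(toy_tlim, 200.0, 25.0, 3.0), 1))  # 258.3
```

Passing the power–duration relationship in as a function keeps the fatigue bookkeeping separate from the choice of curve, so the same routine could in principle be run with any of the relationships described in the following paragraphs.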

The Tlim values associated with the work rates of an EXT constitute critical input for the prediction model (see point 2 above). Power–duration relationships will vary between individuals due to inter-individual differences in terms of absolute performance level (power outputs) and ‘fatigue resistance’, i.e., the increase in Tlim (min) at a given decrease in work rate (W). Also, for a given individual, the power–duration relationship may change over time, e.g., when the training focus shifts from long-duration base training to high-intensity interval training.

For the purpose of testing the effect of ‘individuality’ of the power–duration relationship on \({\dot{\rm{W}}} \)peak, we established three power–duration relationships, representing a wide range of inter-individual variability in ‘resistance to fatigue’. The power–duration relationships encompass work rates with Tlim in the range 2–360 min. Work rates with Tlim < 2 min were found not to be relevant because a subject will have terminated an EXT at a preceding stage with a lower work rate, or because such high work rates cannot be sustained for a long enough time t (min) for the associated fatigue (t/Tlim) to affect \({\dot{\rm{W}}} \)peak. Also, work rates with Tlim > 360 min were not considered as they do not significantly contribute to fatigue (and, therefore, will not significantly affect \({\dot{\rm{W}}} \)peak) given the ratio between stage duration (≤ 5 min) and Tlim (> 360 min) <  ~ 1%. In the power–duration relationships, we took the maximum mean power for a duration of 6 min (MMP6) as the reference for the maximum mean power for other durations t (MMPt). We chose MMP6 as a reference because it represents a close approximation of the work rate associated with \({\dot{\text{V}}\text{O}}_{2} \max\)(\({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\)), given that average Tlim at \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\) is  ~ 6 min (Caputo et al. 2003; Rønnestad 2014; Stone et al. 2011). As a close approximation of \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\), MMP6 constitutes an important physiologic parameter in that it is a strong indicator of absolute cycling performance level and in that it demarcates the boundary between work rates in the supramaximal or extreme domain and work rates in the submaximal domain. 
In addition, using MMP6 as a reference work rate enabled us to model the effect of stage increment and stage duration on \({\dot{\rm{W}}} \)peak without having to take into account the actual absolute level of cycling performance capabilities. We established the section of the power–duration curves spanning durations in the range 6–360 min utilizing the empirical relationships between (maximum) mean running speed (v) and distance (D) of world class male runners. These relationships can be described by scaling law equations of the type \(v = c \cdot D^{\alpha}\), where ‘c’ is a normalization constant and α is the scaling exponent, which varies depending on the ‘fatigue resistance’ of the athlete. We chose scaling law equations with values for c of 10.27, 11.59 and 12.76 in combination with values for scaling exponent α of − 0.0575, − 0.070 and − 0.081, respectively, reflecting the performance level and the high/medium/low fatigue resistance in all-time top 30 male runners specialized in distances of (half) marathon, 10,000 and 3000 m, respectively (García-Manso et al. 2012). With each of these three scaling law equations we iteratively determined the maximum mean running speed for durations of 6, 7, 8, 9, 10, 12, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 70, 80, 90, 120, 180, 240 and 360 min. Next, the maximum mean running speeds for the selected durations were converted to an associated oxygen uptake when running on a horizontal track as follows (Léger et al. 1984):

$${\dot{\text{V}}\text{O}}_{2} = 2.209 + 3.163 \times {\text{v}} + {\text{A}} \times {\text{v}}^{3}$$
(2)

where \({\dot{\text{V}}\text{O}}_{2}\) is oxygen uptake (ml/kg·min) and v is running speed (km/h). The term \({\text{A}} \times {\text{v}}^{3}\) represents the oxygen cost of running against air resistance. The parameter A is proportional to the ‘projected area’, which depends on the height and weight of the subject. For a reference subject of 180 cm height and 70 kg body mass we calibrated A = 0.001752.
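The two steps just described, speed for a given duration and oxygen uptake for a given speed, can be sketched as follows. Two assumptions are made here that the text does not state: the scaling law is taken to use v in m/s and D in m, and the speed is obtained in closed form by substituting D = v·t (giving v = (c·t^α)^(1/(1−α))) rather than by the iterative procedure the authors used. The function names are hypothetical.

```python
def speed_for_duration(c: float, alpha: float, minutes: float) -> float:
    """Mean speed (m/s) sustainable for `minutes`, from v = c * D**alpha.

    Assumes v in m/s and D in m; substitutes D = v * t and solves for v.
    """
    seconds = minutes * 60.0
    return (c * seconds ** alpha) ** (1.0 / (1.0 - alpha))


def vo2_running(speed_kmh: float, a: float = 0.001752) -> float:
    """Eq. 2: oxygen uptake (ml/kg/min) at a running speed in km/h."""
    return 2.209 + 3.163 * speed_kmh + a * speed_kmh ** 3


# Low fatigue resistance curve (c = 12.76, alpha = -0.081) at 6 min.
v_ms = speed_for_duration(12.76, -0.081, 6.0)
print(round(v_ms * 3.6, 1))  # about 24.4 km/h

# Eq. 2 at 10 km/h: 2.209 + 31.63 + 1.752
print(round(vo2_running(10.0), 3))  # 35.591
```

Note the unit change between the two functions: the scaling law is evaluated in m/s under the assumption above, whereas Eq. 2 expects km/h.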

The oxygen uptake associated with a given running speed v was converted to a cycling power output:

$${\text{P}} = \left( {{\dot{\text{V}}\text{O}}_{2} - 0.476} \right) \times 88.25$$
(3)

where P is cycling power output (W) corresponding with oxygen uptake (l/min) associated with running speed v.

Equation 3 is based on average values of the parameters in the oxygen uptake-power output relationships found by Hawley et al. (1992) and Lee et al. (2000). We calibrated the conversion between running speed and cycling power by considering a world class maximum mean running speed of 24.45 km/h for a duration of 6 min to be equivalent to a world class maximum mean cycling power output of 497 W sustained by a 70 kg male cyclist (7.1 W/kg) for that same duration. With the assumptions made, the translation of results from running to cycling appears to be valid (Luttikholt, unpublished observations).
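As a small sketch of Eq. 3 and the unit handling around it: Eq. 2 yields oxygen uptake per kg body mass, whereas Eq. 3 takes absolute oxygen uptake in l/min, so a body mass is needed in between. The function names below are hypothetical, the 70 kg default mirrors the reference subject, and the authors' additional calibration of running speed against cycling power is not reproduced here.

```python
def vo2_litres_per_min(vo2_ml_kg_min: float, body_mass_kg: float = 70.0) -> float:
    """Convert relative oxygen uptake (ml/kg/min) to absolute uptake (l/min)."""
    return vo2_ml_kg_min * body_mass_kg / 1000.0


def cycling_power(vo2_l_min: float) -> float:
    """Eq. 3: cycling power output (W) for an oxygen uptake in l/min."""
    return (vo2_l_min - 0.476) * 88.25


def vo2_for_power(power_w: float) -> float:
    """Eq. 3 rearranged: oxygen uptake (l/min) for a cycling power in W."""
    return power_w / 88.25 + 0.476


# Oxygen uptake that Eq. 3 associates with the 497 W reference power output.
print(round(vo2_for_power(497.0), 2))  # 6.11
```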

As Eq. 2 is valid for running speeds between 8 and 25 km/h, and given that maximum mean running speeds for durations < 6 min of the world class runners considered by García-Manso et al. (2012) were close to or exceeded 25 km/h, we relied on the critical power (CP) model to establish the power–duration relationships for durations in the range 2–6 min. The CP model relates Tlim to a given work rate (WR) greater than CP:

$${\text{Tlim}} = {\text{W}}^{'} /\left( {{\text{WR}} - {\text{CP}}} \right)$$
(4)

where W′ is the maximum amount of work that can be performed above CP.
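A brief sketch of Eq. 4 follows, assuming W′ in joules and work rates in watts so that Tlim comes out in seconds (the function names are hypothetical). Because MMP6 = CP + W′/360 s under this model, the Tlim of a work rate expressed relative to MMP6 does not depend on CP itself, which is what allows the curves to be expressed relative to MMP6.

```python
import math


def tlim_seconds(work_rate_w: float, cp_w: float, w_prime_j: float) -> float:
    """Eq. 4: time limit (s) at a constant work rate above CP."""
    if work_rate_w <= cp_w:
        return math.inf  # the CP model predicts no limit at or below CP
    return w_prime_j / (work_rate_w - cp_w)


def tlim_above_mmp6(delta_w: float, w_prime_j: float, ref_s: float = 360.0) -> float:
    """Time limit (s) at a work rate `delta_w` watts above MMP6.

    Uses MMP6 = CP + W'/ref_s, so CP cancels out of Eq. 4.
    """
    margin = delta_w + w_prime_j / ref_s  # work rate minus CP
    return math.inf if margin <= 0.0 else w_prime_j / margin


# W' = 17.5 kJ, work rate 60 W above MMP6 (Tlim in min).
print(round(tlim_above_mmp6(60.0, 17500.0) / 60.0, 2))  # 2.69
```

With W′ = 17.5 kJ, a work rate 60 W above MMP6 gives about 2.69 min, which appears to match the Tlim reported later in the article for the final stage of EXT25/1.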

We selected values of 14.5, 17.5 and 20.5 kJ for the W′, representing low/average/high W′ in endurance-trained male athletes (Vanhatalo et al. 2011). Finally, we constructed the three power–duration relationships spanning the ‘full’ range of durations between 2 and 360 min by ‘connecting’ the segments of the power–duration curve in the range 2–6 min, described by the CP-model with values for the W′ of 14.5, 17.5 and 20.5 kJ, with the segments of the power–duration relationship in the range 6–360 min, described by scaling law equations with exponent − 0.0575, − 0.070 and − 0.081, respectively (see Fig. 1a and b and Table 1). Consequently, the three connected segment pairs represent matching combinations of ‘fatigue resistance’. The resulting power–duration relationships 1, 2 and 3 represent ‘high’, ‘medium’ and ‘low’ resistance to fatigue, i.e., a ‘high’, ‘medium’ and ‘low’ increase of Tlim with a given decrease in work rate, respectively. An important feature of the power–duration relationships is that the differences in power output between any pair of durations solely depend on the scaling exponent α and are independent of the absolute value of the power outputs (García-Manso et al. 2012; Vandewalle 2018). This also applies to the curvature of the CP-model.

Fig. 1

a The three power–duration relationships for durations in the range 2–360 min. The segments with Tlim in the range 6–360 min (to the right of the vertical line at Tlim = 6 min) were based on scaling law equations with values for the scaling exponent α of − 0.0575, − 0.070 and − 0.081. The segments with Tlim in the range 2–6 min (to the left of the vertical line at Tlim = 6 min) were based on the CP-model with values for the W′ of 14.5 kJ, 17.5 kJ and 20.5 kJ (see also enlarged section in b). Work rates (W) indicated along the y-axis are relative to MMP6 (maximum mean power for a duration of 6 min). Work rates > MMP6 are in the supramaximal domain, whereas work rates < MMP6 are in the submaximal domain. b Enlarged section of the three power–duration relationships in a for durations in the range 2–12 min

Table 1 The values of the W′ of the CP-model and of the scaling exponent α of the scaling law equations underlying the three established power–duration relationships for durations between 2 and 6 min and 6 and 360 min, respectively

Testing the validity of predicted \({\dot{\rm{W}}} \)peak differences between EXTs

For the purpose of testing the validity of the predicted \({\dot{\rm{W}}} \)peak differences (Δ\({\dot{\rm{W}}} \)peak) between EXTs, we relied on articles in exercise and sports science literature with actual \({\dot{\rm{W}}} \)peak data for at least two different EXTs. We considered EXTs and associated \({\dot{\rm{W}}} \)peak data for inclusion if they met the following criteria: (1) both work rate increment and stage duration were well-defined and constant; (2) work rate increments were in the range 10–50 W; (3) stage durations were in the range 1–5 min; (4) the exercise test was continuous; (5) \({\dot{\rm{W}}} \)peak definition was equivalent to the definition of \({\dot{\rm{W}}} \)peak described under point 5 of ‘Assumptions and principles’ (see above).

Statistical analyses

The agreement between actual and predicted Δ\({\dot{\rm{W}}} \)peak from the different EXTs was determined using the 95% limits of agreement (95% LoA) method (bias ± 1.96 × standard deviation) (Bland et al. 1999). The assumptions of normality for the distribution of the differences between actual and predicted Δ\({\dot{\rm{W}}} \)peak and that of homoscedasticity were confirmed using the Shapiro–Wilk test and 2-tailed Pearson correlation coefficients, respectively. Where underlying data from the selected studies with actual \({\dot{\rm{W}}} \)peak data allowed, the standard error of the difference (SED) between actual mean-\({\dot{\rm{W}}} \)peak data was estimated on the basis of the Standard Error of the Mean (SEM) of each mean-\({\dot{\rm{W}}} \)peak.
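The limits-of-agreement calculation can be sketched as below. The function names and the input values are hypothetical (the numbers are made up for illustration and are not taken from Table 4). The SED helper assumes the two means come from independent samples, which is one common way to combine two SEMs; the text does not state which formula was used.

```python
from math import sqrt
from statistics import mean, stdev


def limits_of_agreement(actual, predicted):
    """Return (bias, lower, upper) using bias ± 1.96 × SD of the differences."""
    diffs = [a - p for a, p in zip(actual, predicted)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def standard_error_of_difference(sem_a: float, sem_b: float) -> float:
    """SED of two means, assuming independent samples."""
    return sqrt(sem_a ** 2 + sem_b ** 2)


# Made-up actual and predicted differences (W), for illustration only.
actual = [50.0, 30.0, 20.0, 40.0]
predicted = [48.0, 33.0, 19.0, 36.0]
print([round(x, 2) for x in limits_of_agreement(actual, predicted)])
# [1.0, -4.77, 6.77]
```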

Results

Time to task failure

An important (intermediate) result in predicting \({\dot{\rm{W}}} \)peak was calculating the point in time in the EXT when accumulated fatigue reached 100% (see under point 4 of ‘Assumptions and principles’ above). For illustration purposes, Table 2 shows the results for EXT25/1 and EXT25/3 on the basis of Tlim values of power–duration relationship 2 (Fig. 1a and b). Based on the time t sustained in the last work stage, \({\dot{\rm{W}}} \)peak in EXT25/1 and EXT25/3 was calculated as follows:

$${\dot{{\rm W}}}{{\rm peak}}25/1 = \left( {{\text{MMP}}6 + 35} \right) + 0.778 \times 25 = \left( {{\text{MMP}}6 + 35} \right) + 19{\text{ W}} = {\text{MMP}}6 + 54{\text{ W}}.$$
$${\dot{{\rm W}}}{{\rm peak}}25/3 = \left( {{\text{MMP}}6 - 15} \right) + \left( {1.65/3} \right) \times 25 = \left( {{\text{MMP}}6 - 15} \right) + 13{\text{ W}} = {\text{MMP}}6{-}2{\text{ W}}.$$
Table 2 Example: calculation of accumulated fatigue (at the point of completion of the work stages in column 1) and of time to task failure in EXT25/1 and EXT25/3 based on power–duration relationship 2 (Fig. 1a and b)

The predicted difference between \({\dot{\rm{W}}} \)peak25/1 and \({\dot{\rm{W}}} \)peak25/3 was (MMP6 + 54 W) − (MMP6 − 2 W) = 56 W.
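The arithmetic of this example can be retraced with Eq. 1, taking t = 0.778 min with SD = 1 min for EXT25/1 and t = 1.65 min with SD = 3 min for EXT25/3. The helper name below is hypothetical and work rates are expressed as offsets from MMP6.

```python
def wpeak_offset(completed_offset_w: float, t_min: float,
                 stage_min: float, increment_w: float) -> float:
    """Eq. 1 with work rates expressed as offsets from MMP6 (W)."""
    return completed_offset_w + t_min / stage_min * increment_w


fast = wpeak_offset(35.0, 0.778, 1.0, 25.0)   # EXT25/1
slow = wpeak_offset(-15.0, 1.65, 3.0, 25.0)   # EXT25/3
print(round(fast, 2), round(slow, 2), round(fast - slow, 1))  # 54.45 -1.25 55.7
```

The unrounded difference of about 55.7 W is consistent with the 56 W quoted in the text, in which the intermediate values are rounded to whole watts.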

Figure 2 demonstrates how fatigue accumulated in EXT25/1, EXT25/2, EXT25/3, EXT25/4 and EXT25/5 on the basis of Tlim in power–duration relationship 2.

Fig. 2

Development of accumulated fatigue (%) in EXT25/1, EXT25/2, EXT25/3, EXT25/4 and EXT25/5 based on power–duration relationship 2. The symbols denote the accumulated fatigue at the point of completion of the work rates of the stages used in the example in Table 2. The work rates are relative to MMP6 (maximum mean power for a duration of 6 min). For illustration purposes, we also showed the predicted difference of 56 W between \({\dot{\rm{W}}} \)peak25/1 (in the supramaximal domain) and \({\dot{\rm{W}}} \)peak25/3 (in the submaximal domain)

The effects of stage increment and stage duration on \({\dot{\rm{W}}} \)peak

The effects of stage increment and stage duration on \({\dot{\rm{W}}} \)peak for the three power–duration relationships are summarized in Fig. 3 and detailed in Table 3. Predicted \({\dot{\rm{W}}} \)peak in Fig. 3 and Table 3 are relative to \({\dot{\rm{W}}} \)peak25/3 (for the rationale for using \({\dot{\rm{W}}} \)peak25/3 as a reference: see under “Discussion”). Within the group of EXTs considered, the greatest Δ\({\dot{\rm{W}}} \)peak occurred when the biggest (smallest) stage increment was combined with the shortest (longest) stage duration, i.e. EXT50/1 and EXT10/3, respectively. The range of Δ\({\dot{\rm{W}}} \)peak between \({\dot{\rm{W}}} \)peak50/1 and \({\dot{\rm{W}}} \)peak10/3 is 102 and 130 W in power–duration relationship 1 and 3, respectively.

Fig. 3

Effect of stage increment and stage duration on \({\dot{\rm{W}}} \)peak differences between EXTs for each of the three power–duration relationships. The stage increments are on the x-axis and the stage durations are indicated at the far right end of the corresponding curves. The values on the y-axis denote the differences between \({\dot{\rm{W}}} \)peak relative to \({\dot{\rm{W}}} \)peak25/3. ¹EXTs with 45 W increments were not found in literature. Hence, we refrained from calculating and including data points for EXTs with 45 W increments. ²For illustration purposes, we showed the predicted difference of 56 ± 5 W between \({\dot{\rm{W}}} \)peak25/1 and \({\dot{\rm{W}}} \)peak25/3 (see also Table 3)

Table 3 Predicted \({\dot{\rm{W}}} \)peak differences between EXTs (EXTs sorted by stage increment)

Figure 3 and Table 3 allow a distinction between ‘slow’ EXTs with \({\dot{\rm{W}}} \)peak < ~ \({\dot{\rm{W}}} \)peak25/3, and ‘fast’ EXTs with \({\dot{\rm{W}}} \)peak > ~ \({\dot{\rm{W}}} \)peak25/3. ‘Slow’ EXTs are characterized by a 10 W increment combined with ≥ 2 min stage durations, 15 or 20 W increments combined with ≥ 3 min stage durations, 25, 30 or 35 W increments combined with ≥ 4 min stage durations or 40, 45 or 50 W increments combined with ≥ 5 min stage durations. ‘Fast’ EXTs are characterized by a 1 min stage duration combined with ≥ 10 W stage increments, a 2 min stage duration combined with ≥ 20 W stage increments, a 3 min stage duration combined with ≥ 30 W stage increments, or a 4 min stage duration combined with ≥ 45 W stage increments. EXTs with ≥ 5 min stage duration cannot be ‘fast’ unless increments are much greater than 50 W. Simultaneous increases (decreases) in stage increment and stage duration have opposing effects on \({\dot{\rm{W}}} \)peak. As a consequence, different EXT designs may have the same \({\dot{\rm{W}}} \)peak. For example, the EXTs within each of the following groups have the same \({\dot{\rm{W}}} \)peak: EXT10/1, EXT25/2 and EXT50/3; EXT10/2, EXT20/3, EXT30/4 and EXT50/5; EXT15/2, EXT25/3 and EXT40/4; EXT30/3 and EXT50/4 (Fig. 3 and Table 3). It follows from Fig. 3 and Table 3 that the effect of the power–duration relationship on Δ\({\dot{\rm{W}}} \)peak is relatively small compared to the effect of a ≥ 5 W difference in stage increment or a ≥ 1 min difference in stage duration. The effect of individuality of the power–duration relationship on Δ\({\dot{\rm{W}}} \)peak increases with increasing stage increment and decreasing stage duration, i.e., the ‘faster’ an EXT, the greater the effect. The relative range of that effect on Δ\({\dot{\rm{W}}} \)peak is defined by power–duration relationships 1 and 3 and is ± 10% of Δ\({\dot{\rm{W}}} \)peak in power–duration relationship 2. For the vast majority of the EXTs considered, the absolute range of the effect of individuality of the power–duration relationship on Δ\({\dot{\rm{W}}} \)peak is less than ± 5 W, which is insignificant given the estimated within-subject difference ≥ ~ 5 W between \({\dot{\rm{W}}} \)peak of duplicate EXTs (see under “Discussion”). Only in ‘fast’ EXTs with a 1 min stage duration in combination with work rate increments > 25 W does the effect of individuality of the power–duration relationship on Δ\({\dot{\rm{W}}} \)peak exceed 5 W.

Actual \({\dot{\rm{W}}} \)peak data and comparison of predicted Δ\({\dot{\rm{W}}} \)peak and actual Δ\({\dot{\rm{W}}} \)peak

The actual \({\dot{\rm{W}}} \)peak data found in the literature for at least two different EXTs are presented in Table 4. As a measure of validity, we calculated the differences between actual Δ\({\dot{\rm{W}}} \)peak and predicted Δ\({\dot{\rm{W}}} \)peak (see last column of Table 4). Predicted Δ\({\dot{\rm{W}}} \)peak were derived from the data in Table 3. Table 4 shows that the bias, i.e., the mean of (actual Δ\({\dot{\rm{W}}} \)peak − predicted Δ\({\dot{\rm{W}}} \)peak), for power–duration relationships 1, 2 and 3 is small (6.9, 2.2 and − 2.3 W, respectively). The 95% LoA (bias ± 1.96 × standard deviation) of (actual Δ\({\dot{\rm{W}}} \)peak − predicted Δ\({\dot{\rm{W}}} \)peak) for each power–duration relationship include ‘zero’. Therefore, we conclude for each of the three power–duration relationships that the overall mean of actual Δ\({\dot{\rm{W}}} \)peak is not significantly different from the overall mean of predicted Δ\({\dot{\rm{W}}} \)peak.

Table 4 Actual differences between \({\dot{\rm{W}}} \)peak found in literature and comparison of actual \({\dot{\rm{W}}} \)peak differences (actual Δ\({\dot{\rm{W}}} \)peak) with predicted \({\dot{\rm{W}}} \)peak differences (predicted Δ\({\dot{\rm{W}}} \)peak)

Discussion

The purpose of this study was to quantify the effects of stage increment and stage duration on Δ\({\dot{\rm{W}}} \)peak between EXT protocols. The study also investigated the effect of individuality of cycling performance capabilities on Δ\({\dot{\rm{W}}} \)peak. In presenting Δ\({\dot{\rm{W}}} \)peak we used \({\dot{\rm{W}}} \)peak25/3 as a reference, for two reasons. First, amongst \({\dot{\rm{W}}} \)peak of the EXTs considered, we found \({\dot{\rm{W}}} \)peak25/3 to have the closest agreement with MMP6, i.e., a difference of only 3 W (as an average across the individuality-range of the underlying power–duration relationships). So, consistent with our assumption that MMP6 is a close approximation of \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\), \({\dot{\rm{W}}} \)peak25/3 too is close to \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\). Secondly, using \({\dot{\rm{W}}} \)peak25/3 as a reference for \({\dot{\rm{W}}} \)peak comparison builds on the recommendation of de Pauw et al. (2013) to apply EXT25/3 as a standard in incremental exercise testing.

\({\dot{\rm{W}}} \)peak has a standard error in the range 1–2% (Balmer et al. 2000; Lamberts 2009). In practice, with \({\dot{\rm{W}}} \)peak25/3 of well-trained male cyclists typically in the range 300–450 W (de Pauw et al. 2013), a within-subject difference between \({\dot{\rm{W}}} \)peak of duplicate EXTs  ≥ ~ 5 W is considered ‘significant’. This implies that a Δ\({\dot{\rm{W}}} \)peak between different EXTs ≥ ~ 5 W must be considered ‘significant’. We demonstrated that, at a given stage increment, the smallest difference in stage duration considered (1 min) already resulted in a ‘significant’ Δ\({\dot{\rm{W}}} \)peak. Similarly, at a given stage duration, the smallest difference in stage increment considered (5 W) already resulted in a ‘significant’ Δ\({\dot{\rm{W}}} \)peak.

Modeling aspects

The model is based on the assumption that sustaining a given constant work rate for a time t (min) is associated with an ‘amount’ of accumulated fatigue which can be quantified by the ratio t/Tlim, where Tlim is the maximum time (min) for which that work rate could be sustained in a constant work rate test. By adding together the amounts of accumulated fatigue per work stage we calculated when fatigue accumulated during the whole test reached 100%. This point in time coincided with test termination. In calculating fatigue accumulation we did not consider work rates with an associated Tlim > ~ 360 min (see Table 1). We made this cut-off considering that sustaining a work stage at a lower work rate, even for the longest (stage) duration considered (5 min), is associated with negligible fatigue accumulation, as reflected by the ratio (stage duration/Tlim) ≤ 5 min/360 min ≤ ~ 1%. We verified that the effect on \({\dot{\rm{W}}} \)peak of ≤ 1% fatigue accumulation is < 1 W. The fact that work rates with an associated Tlim > ~ 360 min do not contribute to fatigue accumulation is consistent with these work rates being below the aerobic threshold, i.e., without an increase in blood lactate concentration above baseline. Work rates with an associated Tlim < ~ 2 min did not need to be considered given that in most EXTs task failure will have occurred at a preceding lower work rate. For example, task failure in EXT25/1 occurred during a work rate with an associated Tlim = 2.69 min (see Table 2). A work rate with an associated Tlim ~ 2 min was only achieved in the fastest protocols with a 1 min stage duration and ≥ 40 W stage increments. We note that our definition of fatigue is specific for the modeling and consistent with the traditional concept that metabolic processes and disturbance of homeostasis are the key determinants of fatigue during exercise of this type. 
Our model did not take into account a person’s subjective level of fatigue which is addressed by a more comprehensive concept in which ‘the symptom of fatigue emerges from interactions between two domains of fatigability (performance and perceived fatigability)’ (Enoka et al. 2018).

We showed that simultaneous increases (decreases) in stage increment and stage duration have opposing effects on \({\dot{\rm{W}}} \)peak, resulting in different EXTs sometimes having the same \({\dot{\rm{W}}} \)peak. We identified groups in which EXTs had the same \({\dot{\rm{W}}} \)peak, one of them being the group including EXT10/2, EXT20/3, EXT30/4 and EXT50/5. That these EXTs have the same \({\dot{\rm{W}}} \)peak is in agreement with the finding that for blood lactate concentrations in an EXT to achieve a quasi-steady-state (= 95% of steady-state-level), increments of 10, 20, 30, 40 and 50 W would require a minimum stage duration of 2, 3, 4, 4.75 and 5 min, respectively (Stockhausen et al. 1997). Apparently, EXT10/2, EXT20/3, EXT30/4, EXT40/4.75 and EXT50/5 induce a similar blood lactate response. Given these similar blood lactate responses, and assuming that lactate accumulation correlates with disturbance of homeostasis and development of fatigue, \({\dot{\rm{W}}} \)peak10/2, \({\dot{\rm{W}}} \)peak20/3, \({\dot{\rm{W}}} \)peak30/4, \({\dot{\rm{W}}} \)peak40/4.75 and \({\dot{\rm{W}}} \)peak50/5 were expected to be similar. (Note: we verified \({\dot{\rm{W}}} \)peak40/4.75 to be similar too).

Figure 3 and Table 3 permitted the identification of ‘slow’ EXTs, with \({\dot{\rm{W}}} \)peak in the submaximal domain (\({\dot{\rm{W}}} \)peak <  ~ \({\dot{\rm{W}}} \)peak25/3), and ‘fast’ EXTs with \({\dot{\rm{W}}} \)peak in the supramaximal domain (\({\dot{\rm{W}}} \)peak >  ~ \({\dot{\rm{W}}} \)peak25/3). Consistent with the above findings of Stockhausen et al. (1997), we conclude that ‘slow’ EXTs have a stage duration that, given the associated stage increment, is (more than) long enough for blood lactate concentrations to achieve a (quasi-)steady state. Consequently, fatigue accumulation already is mostly induced during stages in the heavy-intensity domain such that, once reaching work stages in the severe-intensity domain, task failure is ‘prematurely’ induced i.e., before \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\) can be attained. Similarly, ‘fast’ EXTs have a stage increment that, given the associated stage duration, is too great for blood lactate to achieve a (quasi-)steady-state. The blood lactate response ‘lagging behind’ the work rate increase is consistent with a delay in fatigue accumulation at work rates in the heavy-intensity domain, allowing continuation of exercise in work stages across the severe-intensity domain into the supramaximal domain, resulting in \({\dot{\rm{W}}} \)peak >  ~ \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\). EXT15/2, EXT25/3 and EXT40/4 are ‘special’ in that their specific combinations of stage increment and stage duration strike the balance between the level of accumulated fatigue induced by completing the imposed work rates in the heavy-intensity domain, whilst ‘leaving enough fatigue to be accumulated’ during work stages in the severe-intensity domain for \({\dot{\rm{W}}} \)peak to approximate \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\). 
We conclude that the modeled accumulation of fatigue during incremental exercise based on the ratio stage duration/Tlim mimics the accumulation of blood lactate concentrations in incremental exercise.

We based the power–duration relationships for durations in the range between 6 and 360 min on the O2-uptake-duration relationship in running which we, in turn, derived from empirical scaling laws. In running, the O2-uptake increases in proportion to body mass with exponent 0.75 (\({\text{BM}}^{0.75}\)) (Berg et al. 1991; Svedenhag 1995). Consequently, the curvatures of the three power–duration relationships too depend on \({\text{BM}}^{0.75}\). To address this dependency on body mass we established the three relationships assuming a reference cyclist with a body mass (BM) of 70 kg and body fat content of ~ 10%. For a cyclist with a different body mass, the values (relative to MMP6) of the power–duration relationships for the 70 kg reference cyclist may be adjusted by multiplying with \(({\text{BM}}/70)^{0.75}\). For example, assuming an 80 kg athlete, the values (relative to MMP6) of the power–duration relationships would be ~ 10% higher. Figure 3 and Table 3 show that the Δ\({\dot{\rm{W}}} \)peak based on power–duration relationships 1 and 3 are some 10% lower and higher, respectively, than Δ\({\dot{\rm{W}}} \)peak based on relationship 2. At the same time, Fig. 3 shows that the effect of individuality of the power–duration relationship on Δ\({\dot{\rm{W}}} \)peak is relatively small compared to the effect on Δ\({\dot{\rm{W}}} \)peak of stage increment or stage duration. Therefore, the effect of body mass on Δ\({\dot{\rm{W}}} \)peak is relatively small too. It may be assumed that fat mass in itself does not affect the power–duration relationship. So, the power–duration relationship of the 70 kg reference endurance cyclist with a reference body fat content of 10% would be the same if her/his body mass increased to 80 kg by an increase in body fat only. In summary, the effect of EXT protocol on Δ\({\dot{\rm{W}}} \)peak can be expected to decrease (increase) with decreasing (increasing) (lean) body mass. 
The lower (higher) the (lean) body mass, the more likely the Δ\({\dot{\rm{W}}} \)peak values are close to or outside the low (high) end of the range of Δ\({\dot{\rm{W}}} \)peak associated with relationship 1 and 3, respectively (Fig. 3). Nevertheless, the effect of (lean) body mass on Δ\({\dot{\rm{W}}} \)peak will remain small relative to the effect of EXT protocol.
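The body mass adjustment described above amounts to a single scaling factor, sketched below; the function name is hypothetical and the 70 kg default mirrors the reference cyclist.

```python
def body_mass_factor(body_mass_kg: float, reference_kg: float = 70.0) -> float:
    """Multiplier for the MMP6-relative curve values: (BM / 70) ** 0.75."""
    return (body_mass_kg / reference_kg) ** 0.75


print(round(body_mass_factor(80.0), 3))  # 1.105, i.e. roughly 10% higher
print(round(body_mass_factor(60.0), 3))  # below 1 for a lighter cyclist
```

Following the text, an increase in fat mass alone would not be entered here, so lean rather than total body mass is arguably the more appropriate input.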

With the time range for which the CP model can be expected to be valid being ~ 2 to ~ 15 min (Jones et al. 2019), we were able to compare the values of power–duration relationships predicted by the scaling law equations with the values of the CP-model in the overlapping applicability range of durations between 6 and 15 min. We found the values of the power–duration relationships in this overlapping time range to be equivalent. For durations > ~ 15 min the power outputs predicted with the CP-model showed a levelling off relative to the power outputs predicted from using the scaling law equations. This is consistent with the CP-model trending towards a power-asymptote which effectively leads to overestimating exercise performance for long duration events such as the marathon (Vandewalle 2018), whereas the scaling law model shows a more ‘real-world’ continuing gradual decrease of power for long lasting exercises. However, it should be acknowledged that the CP-model appropriately separates the heavy-intensity from severe-intensity exercise domains within which physiological responses to exercise differ markedly (Jones et al. 2019).

Effect of overall work rate increase

With regard to determination of parameters of aerobic function such as \({\dot{\text{V}}\text{O}}_{2} \max\) and anaerobic threshold, EXTs with the same ratio between stage increment and stage duration, i.e. having the same ‘overall rate of work rate increase’, are ‘interchangeable’ (Davis et al. 1982; Zhang et al. 1991). However, our results show that, when it comes to \({\dot{\rm{W}}} \)peak, EXTs with the same overall work rate increase are not ‘interchangeable’. For example, EXT25/1 and EXT50/2 have the same overall work rate increase (25 W/min), but their \({\dot{\rm{W}}} \)peak values differ significantly (~ 14 W; see Table 3). Similarly, \({\dot{\rm{W}}} \)peak20/2 and \({\dot{\rm{W}}} \)peak40/4, with the same overall rate of work rate increase of 20 W/min, differ by ~ 11 W, and \({\dot{\rm{W}}} \)peak15/1 and \({\dot{\rm{W}}} \)peak30/2 (overall rate of work rate increase of 15 W/min) differ by ~ 8 W. In summary, given the same overall work rate increase, the EXT with the shorter stage duration will result in a higher \({\dot{\rm{W}}} \)peak than the EXT with the longer stage duration. Extrapolating the effect of overall work rate increase on \({\dot{\rm{W}}} \)peak in graded EXTs to ramp-like EXTs, e.g. 1 W increase every 2 s, suggests that \({\dot{\rm{W}}} \)peak in a ramp protocol will be significantly higher than \({\dot{\rm{W}}} \)peak in a graded EXT with the same overall work rate increase and a 1 min stage duration.
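
The ‘overall rate of work rate increase’ is simply the stage increment divided by the stage duration. The sketch below, with the hypothetical helper `overall_rate_w_per_min`, shows that protocols with different stage designs can share the same overall rate; it does not model \({\dot{\rm{W}}} \)peak itself.

```python
from fractions import Fraction


def overall_rate_w_per_min(increment_w, stage_duration_min):
    """Overall rate of work rate increase (W/min) = increment / duration."""
    return Fraction(increment_w) / Fraction(stage_duration_min)


# Graded protocols, written as (increment in W, stage duration in min).
protocols = {
    "EXT25/1": (25, 1),
    "EXT50/2": (50, 2),
    "EXT15/1": (15, 1),
    "EXT30/2": (30, 2),
}
for name, (increment, duration) in protocols.items():
    print(f"{name}: {float(overall_rate_w_per_min(increment, duration)):.0f} W/min")

# Ramp-like protocol: 1 W every 2 s, with the stage duration given in minutes.
ramp_rate = overall_rate_w_per_min(1, Fraction(2, 60))
print(f"Ramp 1 W / 2 s: {float(ramp_rate):.0f} W/min")
```

On this basis the 1 W per 2 s ramp has the same overall rate (30 W/min) as EXT30/1, which is the graded protocol it would be compared with.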

Validity of predicted \({\dot{\rm{W}}} \)peak differences

We tested the validity of predicted Δ\({\dot{\rm{W}}} \)peak by comparing with actual Δ\({\dot{\rm{W}}} \)peak data (see Table 4). We identified actual Δ\({\dot{\rm{W}}} \)peak data for subjects covering performance levels 1–4, ranging from (healthy) untrained or moderately active subjects (Hansen 1988; Adami et al. 2013) to highly trained cyclists (Weston et al. 2002). All \({\dot{\rm{W}}} \)peak data were derived from male subject groups.

With the exception of EXT25/3 in the study of Adami et al. (2013), and EXT25/3 and EXT25/5 in the study of Luttikholt et al. (2006), all protocols included in Table 4 (EXT10/1, EXT15/1, EXT25/1, EXT25/1.5, EXT25/2, EXT30/1, EXT35/1, EXT50/1 and EXT50/3) were ‘fast’ protocols (\({\dot{\rm{W}}} \)peak in the supramaximal domain).

To be able to draw conclusions regarding the agreement between predicted Δ\({\dot{\rm{W}}} \)peak and actual Δ\({\dot{\rm{W}}} \)peak we needed to understand the variability of both predicted and actual Δ\({\dot{\rm{W}}} \)peak. The variability of the predicted Δ\({\dot{\rm{W}}} \)peak is modeled by the effect on Δ\({\dot{\rm{W}}} \)peak of the individuality of the three power–duration relationships. As mentioned above, we considered Δ\({\dot{\rm{W}}} \)peak associated with power–duration relationships 1 and 3 to represent the 95% CI of predicted Δ\({\dot{\rm{W}}} \)peak. In 10 out of the 15 results for (actual Δ\({\dot{\rm{W}}} \)peak − predicted Δ\({\dot{\rm{W}}} \)peak) in Table 4, the 95% CI included zero. In 2 of the 5 remaining results where the 95% CI did not include zero (see data from Weston et al. (2002) and Hansen et al. (1988)), the results were marginally (1 W) outside the 95% CI. The remaining 3 results were all from the same paper by Adami et al. (2013). In each of these 3 results there was a greater than predicted positive difference between \({\dot{\rm{W}}} \)peak in the ‘fast’ protocols and \({\dot{\rm{W}}} \)peak25/3. EXT25/3 was the only protocol where the duration of the stages was long enough for blood lactate to reach a quasi-steady-state. This likely explains why the subjects, who were not endurance-trained, ‘underperformed’ in EXT25/3. The variability of actual Δ\({\dot{\rm{W}}} \)peak was estimated by calculating the SED of actual mean \({\dot{\rm{W}}} \)peak. Based on the underlying data of Luttikholt et al. (2006) we calculated SEDs of 4.0, 2.7 and 4.4 W between \({\dot{\rm{W}}} \)peak30/1 and \({\dot{\rm{W}}} \)peak25/3, \({\dot{\rm{W}}} \)peak25/3 and \({\dot{\rm{W}}} \)peak25/5, and \({\dot{\rm{W}}} \)peak30/1 and \({\dot{\rm{W}}} \)peak25/5, respectively. 
From the study of Hansen (1988), we calculated (having excluded subjects 6 and 7 because of missing \({\dot{\rm{W}}} \)peak data points) SED = 7.6 ± 1.2 W (mean ± standard deviation) between \({\dot{\rm{W}}} \)peak15/1 and \({\dot{\rm{W}}} \)peak30/1. The lower SEDs in the study of Luttikholt et al. (2006) relative to the SED in the study of Hansen (1988) are probably related to the subjects in the former study being moderate to well-trained, versus the subjects in the latter study being untrained/inexperienced. When assuming that actual Δ\({\dot{\rm{W}}} \)peak has an estimated SED = 6 ± 3 W (mean ± 95% CI), each actual Δ\({\dot{\rm{W}}} \)peak in the third column of Table 4 falls within the 95% CI of predicted Δ\({\dot{\rm{W}}} \)peak in the fourth column. The systematic bias between actual and predicted Δ\({\dot{\rm{W}}} \)peak is quite small (~ 2 W) and not significant for power–duration relationships 2 and 3. The bias between actual and predicted Δ\({\dot{\rm{W}}} \)peak for power–duration relationship 1 is somewhat greater (~ 7 W) but still not significant. The 95% LoAs indicate a high level of agreement between actual and predicted Δ\({\dot{\rm{W}}} \)peak, especially when based on power–duration relationships 2 and 3. Power–duration relationship 1 (representing a high resistance to fatigue) is associated with a somewhat wider 95% LoA. This may be explained by the majority of the actual \({\dot{\rm{W}}} \)peak data in Table 4 being derived from untrained subjects, who are likely to have a low resistance to fatigue. In summary, we conclude that actual Δ\({\dot{\rm{W}}} \)peak and predicted Δ\({\dot{\rm{W}}} \)peak are not significantly different and in good agreement.
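
For readers who wish to repeat this type of agreement analysis, the sketch below computes a systematic bias and 95% limits of agreement in the usual Bland–Altman manner (bias ± 1.96 × SD of the differences). This is an assumption about the procedure, and the Δ\({\dot{\rm{W}}} \)peak values are hypothetical; they are not the data of Table 4.

```python
from statistics import mean, stdev


def bland_altman(actual, predicted):
    """Return the bias and the 95% limits of agreement (lower, upper)."""
    diffs = [a - p for a, p in zip(actual, predicted)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, (bias - half_width, bias + half_width)


# Hypothetical paired differences in Wpeak between two EXTs (W).
actual_delta = [30.0, 12.0, 45.0, 20.0, 8.0]
predicted_delta = [27.0, 14.0, 40.0, 21.0, 10.0]

bias, (lower, upper) = bland_altman(actual_delta, predicted_delta)
print(f"bias = {bias:.1f} W, 95% LoA = {lower:.1f} to {upper:.1f} W")
print("LoA include zero:", lower <= 0.0 <= upper)
```

A bias close to zero with limits of agreement that span zero is the pattern interpreted above as good agreement.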

Practical applications

Knowing Δ\({\dot{\rm{W}}} \)peak between EXTs allows the prediction of \({\dot{\rm{W}}} \)peak from one EXT design to another. For example, if \({\dot{\rm{W}}} \)peak20/1 of a cyclist is 400 W and we want to know \({\dot{\rm{W}}} \)peak40/5, we simply use Table 3 to find that \({\dot{\rm{W}}} \)peak40/5 is 54 ± 4 W lower than \({\dot{\rm{W}}} \)peak20/1, resulting in \({\dot{\rm{W}}} \)peak40/5 = \({\dot{\rm{W}}} \)peak20/1 − (54 ± 4) = 346 ± 4 W. The three examples below demonstrate practical applications of the finding that \({\dot{\rm{W}}} \)peak15/2, \({\dot{\rm{W}}} \)peak25/3 and \({\dot{\rm{W}}} \)peak40/4, as a close approximation of \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\), can be useful in training and testing. Ventilatory thresholds, when expressed as a percentage of \({\dot{\text{V}}\text{O}}_{2} \max\) (%\({\dot{\text{V}}\text{O}}_{2} \max\)), are independent of EXT protocol, whereas \({\dot{\rm{W}}} \)peak associated with \({\dot{\text{V}}\text{O}}_{2}\)peak and the work rates associated with these thresholds are higher when the EXT is ‘faster’ (Weston et al. 2002). By ‘adjusting’ \({\dot{\rm{W}}} \)peak to \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\), the work rates associated with the ventilatory thresholds can be established as %\({\dot{\rm{W}}} \)peak15/2, %\({\dot{\rm{W}}} \)peak25/3 or %\({\dot{\rm{W}}} \)peak40/4. Furthermore, %\({\dot{\rm{W}}} \)peak15/2, %\({\dot{\rm{W}}} \)peak25/3 or %\({\dot{\rm{W}}} \)peak40/4 can be expected to be equivalent to % heart rate reserve (HRR) and %\({\dot{\text{V}}\text{O}}_{2}\) reserve (\({\dot{\text{V}}\text{O}}_{2}\)R). Hence, for estimating or prescribing exercise intensities, %\({\dot{\rm{W}}} \)peak15/2, %\({\dot{\rm{W}}} \)peak25/3 or %\({\dot{\rm{W}}} \)peak40/4 can be incorporated in established equivalency relationships between %HRR and %\({\dot{\text{V}}\text{O}}_{2}\)R in cyclists (Lounana et al. 2007). 
Lastly, \({\dot{\rm{W}}} \)peak25/3 has been recommended for unified classification of performance levels (De Pauw et al. 2013). For the purpose of performance level classification of the cyclist in the above example with \({\dot{\rm{W}}} \)peak20/1 = 400 W, we calculated (utilizing Table 3) \({\dot{\rm{W}}} \)peak25/3 = 400 − (45 ± 4) W = 355 ± 4 W. This value for \({\dot{\rm{W}}} \)peak25/3 falls in the power output range associated with performance level 3, i.e. 320–379 W.
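
The conversions in the examples above amount to subtracting a tabulated offset and carrying its uncertainty along. A minimal sketch follows; the helper `convert_wpeak` is hypothetical, the offsets (54 ± 4 W and 45 ± 4 W) are the values quoted in the text, and only the level 3 range given above is encoded.

```python
def convert_wpeak(wpeak_w, delta_w, delta_uncertainty_w):
    """Subtract a tabulated offset; return (estimate, lower, upper) in W."""
    estimate = wpeak_w - delta_w
    return (estimate,
            estimate - delta_uncertainty_w,
            estimate + delta_uncertainty_w)


WPEAK_20_1 = 400         # measured Wpeak in EXT20/1 (W), from the example
LEVEL_3_W = (320, 379)   # performance level 3 range quoted in the text (W)

# EXT20/1 -> EXT40/5, offset 54 +/- 4 W as quoted from Table 3.
wpeak_40_5, low_40_5, high_40_5 = convert_wpeak(WPEAK_20_1, 54, 4)
print(f"Wpeak40/5 = {wpeak_40_5} W ({low_40_5} to {high_40_5} W)")

# EXT20/1 -> EXT25/3, offset 45 +/- 4 W as quoted from Table 3.
wpeak_25_3, low_25_3, high_25_3 = convert_wpeak(WPEAK_20_1, 45, 4)
in_level_3 = LEVEL_3_W[0] <= wpeak_25_3 <= LEVEL_3_W[1]
print(f"Wpeak25/3 = {wpeak_25_3} W ({low_25_3} to {high_25_3} W), level 3: {in_level_3}")
```

In this example the whole interval of 351–359 W lies inside the level 3 range, so the classification does not depend on the ± 4 W uncertainty.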

Conclusions

The model quantified the effects of EXT protocol on \({\dot{\rm{W}}} \)peak in continuous incremental exercise tests. The importance of our modeling results is that \({\dot{\rm{W}}} \)peak data reported for a wide range of EXTs appearing in the exercise and sports science literature have now been made comparable. The effect of individuality of endurance performance abilities on Δ\({\dot{\rm{W}}} \)peak between different EXTs is small compared to the effect of stage increment or stage duration. Amongst the EXTs considered, \({\dot{\rm{W}}} \)peak15/2, \({\dot{\rm{W}}} \)peak25/3 and \({\dot{\rm{W}}} \)peak40/4 were equivalent and a close approximation of \({\text{P}} - {\dot{\text{V}}\text{O}}_{2} \max\). So, \({\dot{\rm{W}}} \)peak15/2, \({\dot{\rm{W}}} \)peak25/3 and \({\dot{\rm{W}}} \)peak40/4 constitute a logical physiologic reference for exercise intensity prescription and a practical measure for performance diagnostics and classification of absolute performance level in cycling.