Dear Editor-in-Chief Pietro E. di Prampero,

My colleagues and I recently read the study by Sanada et al. (2007) on the development of prediction models for maximal oxygen uptake \( (\dot{V}\text{O}_{\text{2max}}) \). We would like to focus our comments on two particular areas of the study: (1) the use of stepwise regression, and (2) the practical application of the prediction equations.

1. Use of stepwise regression: Stepwise regression allows a computer program to select a small set of the ‘best’ predictors from a larger set of potential predictors (Tabachnick and Fidell 2001). Stepwise procedures should not be used to develop prediction models because they produce an inflated R-squared (\( R^2 \)), yield inaccurate tests of statistical significance, and do not maximize the theoretical or practical value of the model (Berger 2004; Keppel and Wickens 2004). An essential problem is that estimates of population multiple correlations and tests of statistical significance fail to take into account how many variables were considered in the stepwise analysis. Inflation occurs whether the experimenter selects predictors after looking at the correlations or stepwise regression is used to select the ‘best’ predictors out of a larger set of potential predictors (Cohen et al. 2003). A more realistic estimate of the population multiple correlation is the ‘shrunken’ \( R^2 \) based on the total number of variables considered. In the Sanada et al. (2007) study, where the two strongest predictors from a set of 15 potential predictors produced an \( R^2 \) of 0.72 with a sample of N = 40, an estimate of the population multiple \( R^2 \) based on 15 predictors is the shrunken \( R^2 \) of 0.55. Contrary to the conclusions of Sanada et al. (2007) based on their inflated \( R^2 \), their model offers no improvement on models generated in larger studies, as shown in their Table 5.
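
The shrunken value quoted above can be checked with a short calculation. The sketch below assumes the conventional adjusted \( R^2 \) formula, 1 − (1 − \( R^2 \))(N − 1)/(N − k − 1), where k is the number of predictors considered; the letter does not name the formula used, so this choice is an assumption, and the function and variable names are purely illustrative.

```python
def shrunken_r2(r2: float, n: int, k: int) -> float:
    """Adjusted (shrunken) R^2 for a sample of size n and k predictors.

    Assumes the conventional adjustment:
        1 - (1 - R^2) * (n - 1) / (n - k - 1)
    """
    if n - k - 1 <= 0:
        raise ValueError("sample size must exceed k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Values reported in the letter: R^2 = 0.72, N = 40.
# k = 15 counts every candidate predictor that was considered;
# k = 2 counts only the two predictors retained in the final model.
r2_all_candidates = shrunken_r2(0.72, n=40, k=15)
r2_retained_only = shrunken_r2(0.72, n=40, k=2)

print(f"shrunken R^2, 15 candidates considered: {r2_all_candidates:.3f}")
print(f"shrunken R^2, 2 retained predictors:    {r2_retained_only:.3f}")
```

Under this assumed formula, counting all 15 candidate predictors gives approximately 0.545, in line with the 0.55 cited above, whereas counting only the two retained predictors gives approximately 0.70 and therefore understates the shrinkage.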

Ordinarily, a regression formula generated on one sample will produce a smaller \( R^2 \) when it is applied to a new sample (Pedhazur 1997). Thus, it is surprising that Sanada et al. (2007) found \( R^2 \) to be larger in the validation group (\( R^2 = 0.83 \)) than in the derivation group (\( R^2 = 0.72 \)) for which the model was generated. Perhaps this can be explained by large sampling error due to the extremely small sample size (N = 20) for the validation group.

In practice, it is always preferable for the investigator to control the order of entry of predictor variables based on theoretical considerations (Berger 2004). This procedure is called “hierarchical analysis,” and it requires the investigator to plan the analysis with care, prior to looking at the data. The double advantage of hierarchical methods over stepwise methods is that there is less capitalization on chance, and careful choice of the order of entry of predictors assures that results such as the \( R^2 \) added are maximally interpretable (Berger 2004). Kerlinger (1986) stated that, “… the research problem and the theory behind the problem should determine the order of entry of variables in multiple regression analysis.” (p. 545). For example, Malek et al. (2004b, 2005) used hierarchical analysis to develop nonexercise-based \( \dot{V}\text{O}_{\text{2max}} \) prediction models for aerobically trained men and women. The investigators purposefully controlled the order of entry for their predictor variables in order to determine the contribution of physical activity indices (i.e., duration, intensity of the exercise, and the length of time subjects performed habitual physical activity) to \( \dot{V}\text{O}_{\text{2max}} \) above and beyond traditional predictors such as age, height, and weight. The comment by Sanada et al. (2007) that, “Although they (Malek et al. 2004b) suggested that the prediction equation is a valid method, their study used six predictors and no statistical selection, such as stepwise regression analysis …” (p. 147) reflects a common misunderstanding of the pitfalls of using stepwise regression. Stepwise regression is entirely data driven, tests of statistical significance reported by popular statistics programs are incorrect, and the set of predictors identified by stepwise regression may not be ‘best’ in terms of generalizability or in terms of theoretical or practical value.
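
Capitalization on chance can be illustrated with a small simulation. The sketch below is a deliberate simplification and not a reanalysis of the Sanada et al. (2007) data: it generates an outcome and 15 candidate predictors that are pure noise for N = 40 simulated subjects, then compares the squared correlation of the single data-selected ‘best’ predictor with that of one predictor fixed in advance. Selecting one predictor stands in for a full stepwise procedure, and all names, the seed, and the number of replications are arbitrary assumptions.

```python
import random


def squared_correlation(x, y):
    """Squared Pearson correlation between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def simulate(n_subjects=40, n_candidates=15, n_replications=500, seed=0):
    """Average r^2 of the data-selected vs. a prespecified noise predictor."""
    rng = random.Random(seed)
    selected_total = 0.0
    prespecified_total = 0.0
    for _ in range(n_replications):
        # Outcome and predictors are independent noise: the true r^2 is 0.
        y = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        r2_values = []
        for _ in range(n_candidates):
            x = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
            r2_values.append(squared_correlation(x, y))
        selected_total += max(r2_values)     # chosen after looking at the data
        prespecified_total += r2_values[0]   # chosen before looking at the data
    return selected_total / n_replications, prespecified_total / n_replications


mean_selected, mean_prespecified = simulate()
print(f"mean r^2, data-selected predictor: {mean_selected:.3f}")
print(f"mean r^2, prespecified predictor:  {mean_prespecified:.3f}")
```

Because no predictor is truly related to the outcome, any excess of the data-selected value over the prespecified one reflects selection alone, which is the inflation that the significance tests reported after stepwise selection do not account for.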

2. Practicality of the Sanada et al. equations: Although we recognize and appreciate the efforts of Sanada et al. (2007) in conducting their study, the predictor variables (thigh skeletal muscle and stroke volume) that are used in their model are not readily available to the general population. Many more practical \( \dot{V}\text{O}_{\text{2max}} \) prediction equations derived with larger samples are already available. For example, formulas exist in the exercise science literature for treadmill and cycle ergometry (Vehrs et al. 2007; Malek et al. 2004a) as well as for various populations including adult men and women (Storer et al. 1990), teenage athletes (Wells et al. 1973), college students (George et al. 1997), older adults (Blackie et al. 1989), healthy Malaysian and Indian men (Singh et al. 1989; Verma et al. 1998), and aerobically trained individuals (Malek et al. 2004b, 2005).

Sanada et al. (2007) concluded that their results “… suggest that the thigh SM mass and cardiac dimensions are important determinants of \( \dot{V}\text{O}_{\text{2max}} \) in healthy young men.” (p. 147). This is not a new finding. Wagner’s papers on the determinants of \( \dot{V}\text{O}_{\text{2max}} \) and the integrative approach to \( \dot{V}\text{O}_{\text{2max}} \) elegantly address the question of which physiological variables contribute to \( \dot{V}\text{O}_{\text{2max}} \) (Wagner 1988, 1993, 1996). Therefore, our conclusion is that the Sanada et al. (2007) paper does not add novel information to the \( \dot{V}\text{O}_{\text{2max}} \) literature.