Introduction

The accurate measurement of stroke volume and cardiac output during exercise is essential for evaluating cardiovascular function during growth and maturation, in response to training and as a site of impairment during disease. Various methods of measuring cardiac output during exercise are available and have been extensively reviewed elsewhere (Warburton et al. 1999a, 1999b; Driscoll et al. 1989; Rowland and Obert 2002). The accepted “gold standard” methods of determining cardiac output (i.e. direct Fick and dye dilution techniques) are only applicable to submaximal “steady state” exercise conditions and their invasive nature and inherent risks preclude their use within studies of healthy children.

A variety of non-invasive methods have been developed, for example carbon dioxide (CO2) and acetylene (C2H2) rebreathing, and have been validated in adults against the direct techniques (Warburton et al. 1999a). These have also been used successfully to characterise children’s cardiac output responses during submaximal exercise (Rowland 1996). More recently, Doppler echocardiography has come to the fore as a valuable method of cardiac output evaluation in children: it is non-invasive and, significantly, is one of the few methods that do not require steady state exercise conditions, providing beat-to-beat measurement of cardiac function (Rowland and Obert 2002). The reliability of the Doppler technique with children has been demonstrated in conditions ranging from rest to maximal exercise (Vinet et al. 2001; Rowland et al. 1998). However, Doppler echocardiography is not without significant drawbacks, including the requirement for expensive equipment and technical expertise. Thus the technique is not ideally suited to routine investigations in non-medical exercise physiology laboratories.

The measurement of cardiac output from thoracic electrical bioimpedance (TEB) is not a recent development (Kubicek et al. 1966) and combines relative simplicity of use with non-invasive automated measurements of beat-to-beat cardiac output. However, until recently the methodology has proved problematic and has been criticized for lack of accuracy and reliability. Previous TEB devices have been based upon the equations of Kubicek, Sramek, Bernstein and coworkers (Kubicek et al. 1966; Bernstein 1986; Sramek 1982). These equations require two components to estimate stroke volume: baseline thoracic impedance (Z0) and the pulsatile variation of impedance (ΔZ), which represents variations in the volume and velocity of aortic blood flow. Devices based upon these methods require accurate placement of electrodes to determine Z0, and measurements are affected by perspiration, subcutaneous adiposity and poor electrical contact, resulting in questionable reproducibility (Warburton et al. 1999b).

More recently developed equipment, such as the PhysioFlow device used in this study, uses an alternative equation for estimation of stroke volume (Charloux et al. 2000) which does not rely on the measurement of baseline impedance (Z0). The validity of the technique has been established in adults versus the direct Fick method at both rest and exercise (Charloux et al. 2000; Richard et al. 2001) and the reliability in adults is suggested to be at least as good as that of more established methodologies (Charloux et al. 2000; Richard et al. 2001). To date the feasibility and reliability of this technique with children at maximal exercise have not been evaluated.

Despite the extensive and routine use of laboratory tests of peak oxygen uptake (Peak \(\dot V{\text{O}}_2 \)) to evaluate children’s cardiorespiratory fitness (Armstrong and Welsman 1994, 2000) the reliability of the measure with young people is not thoroughly described. Test–retest reliability of peak \(\dot V{\text{O}}_2 \) has been examined by several authors (Figuera-Colon et al. 2000; Rivera-Brown and Frontera 1998; Pivarnik et al. 1996) but results have largely been limited to reports of intraclass correlation coefficients (ICC). It is well recognised that ICC is not an ideal measure of repeatability and, although still a subject of debate (Hopkins 2000; Atkinson and Nevill 2000), a more appropriate measure of reliability is gained from the calculation of typical error over at least three trials (Hopkins 2000).

Therefore, the aim of this study was to examine the reliability of peak \(\dot V{\text{O}}_2 \) concurrently with TEB derived measures of stroke volume and cardiac output over three maximal cycle ergometer tests in children.

Subjects

The study was reviewed and given ethical approval by the institutional ethical committee. Subjects for this study were volunteers recruited from a local school. The nature of the testing was explained to the children and their parents/caregivers in writing and written informed consent was obtained from both parties prior to participation. Eleven boys and nine girls were recruited and completed the exercise tests. None of the children were participating in exercise training. Children attended the laboratory on four separate occasions each separated by a week. During the first visit the children were familiarised with the laboratory environment, personnel, equipment and test procedures. On the three subsequent visits the children completed the same incremental cycle ergometer test for the determination of peak \(\dot V{\text{O}}_2 , \) cardiac output and stroke volume.

Methods

Age was computed from date of birth to date of testing. Stature was measured to the nearest 0.01 m using a Harpenden stadiometer. Body mass was measured to the nearest 0.1 kg using calibrated Avery balance beam scales.

Determination of peak \(\dot V{\text{O}}_2 \)

Peak \(\dot V{\text{O}}_2\) was determined using an incremental protocol on an electronically controlled cycle ergometer (Lode Excalibur Sport, Groningen, The Netherlands). All tests were conducted in the morning at approximately the same time for each child. The protocol consisted of a 2 min warmup at 0 W with subsequent increments of 10 or 15 W, according to body size, imposed every 2 min until the point of voluntary exhaustion. Children cycled at 60 rpm throughout. Respiratory parameters were monitored continuously throughout the test using an Oxycon Sigma on-line analysis system (Mijnhardt, The Netherlands) that was recalibrated before each test with gases of known concentration. The highest oxygen uptake attained (30 s average) was accepted as peak \(\dot V{\text{O}}_2\) if the subject showed signs of intense effort (e.g. facial flushing, hyperpnea), heart rate had reached a value within 5% of the age predicted maximum or had reached a plateau over the final stages of the test, and the respiratory exchange ratio (RER) was at least 1.0. All subjects reported within this study satisfied these criteria on each test occasion.
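The acceptance criteria described above can be expressed as a simple check. The sketch below is illustrative only: the function name and arguments are hypothetical, "within 5% of the age predicted maximum" is interpreted here as at least 95% of that value, and the age predicted maximum itself is supplied by the caller because no prediction formula is given in the text.

```python
def accepts_peak_vo2(intense_effort: bool,
                     peak_hr: float,
                     predicted_max_hr: float,
                     hr_plateau: bool,
                     rer: float) -> bool:
    """Return True if a test satisfies the three stated criteria.

    intense_effort   -- observer judged signs of intense effort (subjective)
    peak_hr          -- highest heart rate reached (beats/min)
    predicted_max_hr -- age predicted maximum (beats/min), supplied by caller
    hr_plateau       -- heart rate levelled off over the final stages
    rer              -- respiratory exchange ratio at peak
    """
    # Assumption: "within 5%" means peak_hr >= 95% of the predicted maximum.
    hr_criterion = peak_hr >= 0.95 * predicted_max_hr or hr_plateau
    return intense_effort and hr_criterion and rer >= 1.0


# Hypothetical test results
print(accepts_peak_vo2(True, 198, 205, False, 1.08))
print(accepts_peak_vo2(True, 180, 205, False, 1.08))
```

The heart rate criterion is satisfied by either the 5% rule or a plateau, mirroring the "or" in the description, whereas effort and RER must both be met.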

Determination of stroke volume and cardiac output at peak \(\dot V{\text{O}}_2\)

Cardiac output and stroke volume were measured non-invasively throughout the incremental exercise tests using a TEB device (PhysioFlow PF-05 Lab1, Manatec Biomedical, France). The PhysioFlow device and methodology have been thoroughly described elsewhere (Charloux et al. 2000). In brief, the bioimpedance method of cardiac output determination uses changes in transthoracic impedance during cardiac ejection to calculate SV. The PhysioFlow emits a high-frequency (75 kHz) and low-amperage (1.8 mA) alternating electrical current via electrodes. Cardiac output is based upon the following formula:

$$ {\text{Cardiac output}}\left( {{\text{l}}\,{\text{min}}^{ - 1} } \right) = fc\left( {{\text{beats}} \cdot \min ^{ - 1} } \right) \times {\text{SVi}}\left( {{\text{ml}} \cdot {\text{m}}^{ - 2} } \right) \times {\text{BSA}}\left( {{\text{m}}^2 } \right) $$

where fc is heart rate based on R–R interval measurement determined from the first derivative of the ECG, dECG/dt, which provides a more stable signal than the ECG signal itself; BSA is body surface area calculated from the Haycock formula (Haycock et al. 1978), \( {\text{BSA}} = 0.024265 \times {\text{body mass}}^{0.5378} \times {\text{stature}}^{0.3964} \); and SVi is the stroke volume index (SV/BSA).
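As a worked illustration of the formula, the following sketch computes BSA and cardiac output for hypothetical input values. It is not the PhysioFlow software's implementation. The Haycock formula takes body mass in kg and stature in cm, so stature recorded in metres must be converted first. The division by 1000, which converts ml to l, is an assumption added here to reconcile the units; it does not appear in the formula as printed.

```python
def haycock_bsa(mass_kg: float, stature_cm: float) -> float:
    """Body surface area (m^2) from the Haycock et al. (1978) formula."""
    return 0.024265 * mass_kg ** 0.5378 * stature_cm ** 0.3964


def cardiac_output_l_min(heart_rate_bpm: float,
                         svi_ml_m2: float,
                         bsa_m2: float) -> float:
    """Cardiac output (l/min) = heart rate x stroke volume index x BSA."""
    stroke_volume_ml = svi_ml_m2 * bsa_m2      # SV = SVi x BSA, in ml
    # Assumption: divide by 1000 to express the product in l/min, not ml/min.
    return heart_rate_bpm * stroke_volume_ml / 1000.0


# Hypothetical child: 35 kg, 1.40 m, 195 beats/min, SVi of 45 ml/m^2
stature_m = 1.40
bsa = haycock_bsa(35.0, stature_m * 100.0)     # convert m to cm
q = cardiac_output_l_min(195.0, 45.0, bsa)
print(f"BSA = {bsa:.2f} m^2, cardiac output = {q:.1f} l/min")
```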

For measurement of impedance two pairs of electrodes were used, one pair transmitting and one pair sensing the impedance signal. Normally these would be placed at the base of the neck and on the xiphoid process. During preliminary studies we found that with the smaller size of children better signals were obtained with one of the neck electrodes attached to the centre of the forehead and the thorax electrodes placed off centre on the lower ribs avoiding the abdominal muscles. The position of the impedance electrodes is not critical as long as one is on the neck and one at the base of the thorax (Z0 evaluation is not required using the PhysioFlow) (Charloux et al. 2000). In addition two electrodes were used to measure a single ECG signal (positions V1 and V6). Leads were taped to the skin immediately below each electrode to minimise movement and prevent electrodes working loose during exercise.

Prior to exercise the PhysioFlow was autocalibrated after age, stature, mass and resting blood pressure had been entered into the software. Blood pressure was measured by the same observer throughout this study using a manual sphygmomanometer with the child seated at rest on the cycle ergometer. Systolic and diastolic (Korotkoff IV) blood pressure were measured four times, with the last three measures averaged for entry into the calibration process.

During the calibration period of 30 heart beats the child remained immobile and relaxed on the cycle ergometer. The calibration period provides the basic curves and data necessary to identify SV variations according to ΔZ (impedance waveform) and dZ/dt (its first derivative) variations during exercise (Charloux et al. 2000). The largest impedance variation during systole and the largest rate of variation of the impedance signal during the calibration period are retained. Examples of the PhysioFlow waveforms are presented in Fig. 1. For this study, cardiac parameters (heart rate, stroke volume and cardiac output) were averaged over a 30 s period to be consistent with the \(\dot V{\text{O}}_2\) measurement. Values of cardiac output and stroke volume at peak \(\dot V{\text{O}}_2\) were used for the reliability analyses.

Fig. 1

Waveforms obtained with the PhysioFlow. The upper trace is the electrocardiogram (ECG). The middle and bottom traces are impedance (ΔZ) and its first derivative, dZ/dt, respectively. Thoracic fluid inversion time (TFIT) and \(dZ/dt_{\max}\) are indicated on the first derivative waveform. Used by permission [Charloux A, Lonsdorfer-Wolf E, Richard R, Lampert E, Oswald-Mamosser M, Mettauer B, Geny B, Lonsdorfer J (2000) A new impedance cardiograph device for the non-invasive evaluation of cardiac output at rest and during exercise: comparison with the “direct” Fick method. Eur J Appl Physiol 82:313–320]

Data analysis

Descriptives (means and standard deviations) were calculated for anthropometric measurements and peak exercise variables for all subjects and for boys and girls separately using SPSS Version 11.0.

Reliability of peak \(\dot V{\text{O}}_2\), cardiac output and stroke volume at peak \(\dot V{\text{O}}_2\) were examined using the statistical methods recommended by Hopkins (2000). Initially consecutive pairwise comparisons were examined, i.e. test 1 vs. test 2 and test 2 vs. test 3, to identify any systematic changes in the reliability statistics between pairs of trials. These analyses yielded results for change in the mean, typical error with 95% confidence limits, change in the mean as a percentage and typical error as a coefficient of variation for each variable. They also enabled the data to be scrutinised for non-uniformity of errors (heteroscedasticity). These analyses were facilitated by a downloadable spreadsheet (Hopkins 2000, http://www.sportsci.org/resource/stats/index.html).
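The pairwise statistics can be sketched as follows. This is a minimal illustration, not the Hopkins spreadsheet itself: the function name and data are hypothetical, typical error is taken as the standard deviation of the difference scores divided by √2, and the percentage statistics are obtained from 100×natural log transformed values. Confidence limits are omitted because they require a chi-squared distribution, which is not available in the Python standard library.

```python
import math
import statistics


def pairwise_reliability(trial_a, trial_b):
    """Change in the mean and typical error for two consecutive trials."""
    diffs = [b - a for a, b in zip(trial_a, trial_b)]
    change_in_mean = statistics.mean(diffs)
    # Typical error = SD of difference scores / sqrt(2)
    typical_error = statistics.stdev(diffs) / math.sqrt(2)

    # Repeat on 100 x ln transformed data to express results as percentages
    log_diffs = [100 * math.log(b / a) for a, b in zip(trial_a, trial_b)]
    s = statistics.stdev(log_diffs) / math.sqrt(2)
    return {
        "change_in_mean": change_in_mean,
        "typical_error": typical_error,
        "change_percent": 100 * (math.exp(statistics.mean(log_diffs) / 100) - 1),
        "cv_percent": 100 * (math.exp(s / 100) - 1),
    }


# Hypothetical peak oxygen uptake values (l/min) for four children
test_1 = [1.52, 1.80, 1.65, 2.01]
test_2 = [1.58, 1.77, 1.70, 2.05]
for name, value in pairwise_reliability(test_1, test_2).items():
    print(f"{name}: {value:.3f}")
```

Plotting the difference scores against the later trial, as described above, would then allow a visual check for heteroscedasticity.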

Subsequently data from the three trials were analysed together. A general linear model with subjects and trials as effects and with estimation by analysis of variance was computed for each of the three exercise variables. The root mean square error with precision of estimation represented by 95% confidence limits represented the typical error of each measurement. The same general linear models were also computed on 100×natural log transformed data yielding typical error expressed as a coefficient of variation. These were converted to exact percentages using the formula:

$$ 100\left( {{\text{e}}^{s/100} - 1} \right) $$

where s is the typical error (root mean square error) and e is exponential e. As suggested by Hopkins (2000) data were analysed by subgroups to provide an indication of any systematic differences in reliability between the sexes using the general linear models described above. Finally, group intraclass correlation coefficients were calculated along with their 95% confidence limits to facilitate comparisons with previous studies.
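For readers wishing to reproduce this type of analysis, the sketch below computes the typical error as the root mean square error of a two-way (subjects × trials) analysis of variance, and converts the result from 100×natural log transformed data into an exact percentage with the formula above. The function names and data are hypothetical, and confidence limits and the intraclass correlation are not included.

```python
import math


def typical_error(data):
    """Root mean square error of a subjects x trials ANOVA.

    data -- list of rows, one per subject, each with one value per trial.
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    subject_means = [sum(row) / k for row in data]
    trial_means = [sum(row[j] for row in data) / n for j in range(k)]
    # Residual after removing subject and trial effects
    ss_error = sum(
        (data[i][j] - subject_means[i] - trial_means[j] + grand) ** 2
        for i in range(n) for j in range(k)
    )
    return math.sqrt(ss_error / ((n - 1) * (k - 1)))


def typical_error_cv(data):
    """Typical error as an exact percentage coefficient of variation."""
    logged = [[100 * math.log(x) for x in row] for row in data]
    s = typical_error(logged)
    return 100 * (math.exp(s / 100) - 1)


# Hypothetical maximal cardiac output (l/min), four children, three trials
cardiac_output = [
    [10.2, 11.0, 10.6],
    [12.5, 11.8, 12.1],
    [9.4, 10.1, 9.9],
    [13.0, 12.2, 12.9],
]
print(f"typical error = {typical_error(cardiac_output):.2f} l/min")
print(f"CV = {typical_error_cv(cardiac_output):.1f}%")
```

With only two trials this residual reduces to the standard deviation of the difference scores divided by √2, so the three-trial and pairwise analyses are directly comparable.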

To provide an indication of the construct validity of the PhysioFlow technique, the relationship between cardiac output (l min−1) and \(\dot V{\text{O}}_2\) (l min−1) was examined across all determinations using linear regression.
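A least-squares fit of this kind can be sketched in a few lines. The function name and the data are hypothetical and serve only to show how the intercept, slope and r² are obtained from paired determinations.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (intercept, slope, r2)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    return intercept, slope, r2


# Hypothetical paired determinations (l/min)
vo2 = [1.4, 1.6, 1.7, 1.9, 2.1]
cardiac_output = [10.1, 11.9, 11.6, 13.8, 14.5]
intercept, slope, r2 = fit_line(vo2, cardiac_output)
print(f"cardiac output = {intercept:.3f} + {slope:.3f} x VO2, r2 = {r2:.3f}")
```

The slope expresses the change in cardiac output per unit change in oxygen uptake, which is the quantity compared across studies and techniques.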

Results

Descriptive data for anthropometric and exercise variables by group and sex are presented in Table 1. Reliability statistics for pairwise comparisons of consecutive tests are summarised in Table 2. For all measures, the change in the mean, both in absolute terms and expressed as a percentage, was slightly smaller for tests 3 and 2 compared with tests 2 and 1. Differences in the typical error between pairwise comparisons were small and, when expressed as coefficients of variation, were equal to or less than 2% for all three measurements. With coefficients of variation of <5%, the reliability of peak \(\dot V{\text{O}}_2\) across tests was notably better than that observed for stroke volume and cardiac output. From the plots of difference scores (test 2 minus test 1) vs. test 2 and similarly (test 3 minus test 2) vs. test 3 there was no evidence of non-uniform errors (heteroscedasticity) in this fairly homogeneous group of 10-year-olds.

Table 1 Descriptive statistics for anthropometric and peak exercise variables over three test occasions
Table 2 Reliability data calculated from consecutive pairs of trials for peak exercise variables

Results from the general linear models analysing all three trials together are presented in Table 3. Data for boys and girls are also presented independently for indications of differences between the sexes. Intraclass correlation coefficients are also presented with 95% confidence limits for the three exercise measures in Table 3.

Table 3 Typical error (root mean square error) with confidence limits and typical error expressed as coefficients of variation for peak \(\dot V{\text{O}}_2\), stroke volume and cardiac output at peak \(\dot V{\text{O}}_2\) calculated over three trials

The relationship between cardiac output and \(\dot V{\text{O}}_2 \) in the children studied here is represented in Fig. 2. The derived regression equation was \( {\text{cardiac output}} = 1.454 + \left( 6.323 \times \dot V{\text{O}}_2 \right)\;{\text{l}} \cdot \min ^{ - 1} ,\;r^2 = 0.469 \).

Fig. 2

Relationship between maximal cardiac output and peak oxygen uptake in 20 boys and girls measured on three occasions

Discussion

The day-to-day variation in peak \(\dot V{\text{O}}_2\) in adults is generally accepted to be within 5% (Katch et al. 1982) but few studies have documented whether such good agreement is characteristic of young people. The majority of previous studies have reported intraclass correlation coefficients for treadmill and cycle ergometer derived peak \(\dot V{\text{O}}_2\) measured on two occasions. The more recent of these have reported values ranging from 0.86 to 0.94 with no apparent effect of training status, sex or age (Rivera-Brown et al. 1995; Turley et al. 1995; Pivarnik et al. 1996). There appear to be no data reporting the stability of peak \(\dot V{\text{O}}_2\) over three trials. Whilst accepting the limitations of intraclass correlation as a measure of reliability (Atkinson and Nevill 1998; Hopkins 2000), the value obtained in the present study of 0.96 calculated over three trials with confidence limits of 0.91–0.98 is comparable with these earlier studies.

A few studies have reported alternative reliability statistics, again calculated over two trials. Several studies have reported no significant differences in mean peak \(\dot V{\text{O}}_2\) values from two trials measured 2 days to 6 weeks apart (Figuera-Colon et al. 2000; Rivera-Brown and Frontera 1998; Pivarnik et al. 1996) and intra-individual coefficients of variation ranging from 4.4% to 7.5% have also been reported over the same time frame. Pivarnik et al. (1996) reported a standard error of measurement (similar to the typical error reported in the present study) of 1.8 ml in 10–16 year old untrained girls representing a percentage variation of approximately 4.4%.

As previously mentioned, intraclass correlation coefficients and paired t-tests are limited in their ability to fully describe within-individual variation between repeated measurements. In particular, correlation-based methods are highly dependent upon sample size and composition such that a high correlation coefficient observed in a large and/or heterogeneous subject group may mask considerable variation in within-individual agreement between scores (Atkinson and Nevill 1998; Hopkins 2000). Calculation of typical error is not subject to sample specificity in this way and is independent of any change in the mean (Hopkins 2000).

When three or more trials are conducted it is recommended that an initial consecutive pairwise analysis is conducted, i.e. test 1 vs. test 2; test 2 vs. test 3 etc. to evaluate the presence of any learning effects (Hopkins 2000). From Table 2 it is evident that although there was a small improvement in reliability over the tests as indicated by the values for change in the mean, the typical error expressed as a coefficient of variation varied by less than 1% across the two analyses. Consequently all trials were analysed together and, in addition, were examined by sex as it has been suggested that where subgroups exist these should be examined for any consistent bias (Hopkins 2000). Supporting previous studies failing to indicate any sex differences in the reliability of children’s peak \(\dot V{\text{O}}_2\) the typical error as a coefficient of variation differed by only 0.5% between the boys and the girls with a group coefficient of variation of 4.1%. This finding confirms that peak \(\dot V{\text{O}}_2\) is at least as reliable in untrained children as in adults.

The reliability of more established methods of cardiac output determination is most widely documented for submaximal exercise conditions in adults. For the “gold standard” methods of direct Fick and dye dilution, reliability has been established to be around 5–10% in adults (Warburton et al. 1998). The reliability of non-invasive methods of cardiac output determination has also been widely reported for steady-rate submaximal exercise conditions. Typical coefficients of variation of 5–6% for CO2 rebreathing (Zeidifard et al. 1972; Ferguson et al. 1968), C2H2 rebreathing (Warburton et al. 1998), and Doppler echocardiography (Espersen et al. 1995) have been reported.

Reliability data for cardiac output determination with children remain scarce but the available results demonstrate that measurements with children are at least as reliable as those derived with adults. During submaximal exercise reliability has been reported for CO2 rebreathing (plateau method) with coefficients of variation of 7–8% over five trials (Paterson et al. 1982). For CO2 rebreathing (exponential method) and Doppler echocardiography typical errors expressed as coefficients of variation of 14–16% and 6–8%, respectively have been observed (Nottin et al. 2001).

Data reporting reliability for cardiac output at maximal exercise are predominantly derived from Doppler echocardiography or C2H2 rebreathing, with typical coefficients of variation for test–retest determinations around 5–8% in adults (Warburton et al. 1998; Rowland et al. 1998). In children, only one previous study appears to have examined the reliability of cardiac output at maximal exercise. In fourteen 11-year-olds measured on two occasions with both Doppler echocardiography and CO2 rebreathing (exponential method), coefficients of variation for maximal cardiac output were 5.2% and 11.7% and intraclass correlation coefficients were 0.85 and 0.94, respectively (Nottin et al. 2001).

There are very few published data on the reliability of TEB determinations of cardiac output using the PhysioFlow device in adults and there appear to be no data available for children. In seven moderately trained adults tested 3 days apart Richard et al. (2001) noted a maximum 16% variation between two impedance cardiac output measures during an incremental cycle test to maximum. No significant differences were observed on a test–retest basis for cardiac output with a mean difference of 0.009 l min−1 with 95% confidence limits of −4.12 to 4.12. A correlation of r=0.99 was reported for measures throughout the incremental test. On the basis of these results the authors concluded the method to be reliable and observed that the protocol was likely to have influenced the reliability as this was calculated over all stages of an incremental test.

Charloux et al. (2000) examined the reproducibility of the PhysioFlow in 10 adult patients tested 5 min apart at rest. Coefficients of repeatability (Bland and Altman 1986) were 0.94 and 0.96 l min−1 for TEB and simultaneously measured Fick, respectively indicating similar reproducibility between the methods.
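For orientation, a coefficient of repeatability in the sense of Bland and Altman (1986) can be sketched as below. This is an assumption about the general approach rather than a reconstruction of the cited analysis: the coefficient is taken here as twice the standard deviation of the differences between repeated measurements, with that standard deviation computed about zero. The function name and data are hypothetical.

```python
import math


def coefficient_of_repeatability(first, second):
    """Twice the SD of paired differences, computed about zero.

    Assumes no systematic difference between the repeated measurements,
    so the mean difference is taken to be zero.
    """
    diffs = [b - a for a, b in zip(first, second)]
    sd_about_zero = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return 2 * sd_about_zero


# Hypothetical resting cardiac output (l/min) measured twice, 5 min apart
measure_1 = [5.0, 6.0, 4.0, 5.5]
measure_2 = [5.5, 5.5, 4.5, 5.0]
print(coefficient_of_repeatability(measure_1, measure_2))
```

A smaller coefficient indicates that repeated measurements on the same individual lie closer together, so similar values for two methods suggest similar reproducibility.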

Against this background, the present study with three trials on 20 subjects represents perhaps the most comprehensive reliability study of maximal cardiac output and stroke volume in children completed to date. The typical error expressed as a coefficient of variation for cardiac output of 9.3% for the group, with an intraclass correlation coefficient of 0.86, is broadly comparable with results from previous studies using different methodologies for both adults and children, and considerably better than results obtained with adults (Richard et al. 2001). When examined by sex there was no consistent bias observed, with values for boys and girls within 1%. When the data were analysed by consecutive pairs (Table 2), a small improvement in reliability (<2%) was obtained when a third test was completed; this does not represent a major learning effect and indicates that, with habituation trials, a coefficient of variation of less than 9% is achievable.

Similar results were evident for the PhysioFlow measurement of stroke volume with a group typical error expressed as a coefficient of variation of 9.3%. This value compares with coefficients of variation of 4.9 and 5.7% for Doppler-determined maximal SV in children (Nottin et al. 2001; Rowland and Blum 2000) and 8.5% in young adult males (Rowland et al. 1998). Previous studies have reported intraclass correlation coefficients of >0.90 for Doppler-determined SV in children and adults (Vinet et al. 2001; Nottin et al. 2001) compared to the value of 0.88 obtained in the present study.

The validity of the PhysioFlow technique has been established in adults versus direct Fick determinations of cardiac output from rest to maximal exercise in both patients and healthy adults (Richard et al. 2001; Charloux et al. 2000). The findings have confirmed that the PhysioFlow device provides clinically acceptable values. For example, Richard et al. (2001) noted a mean difference between PhysioFlow and Fick of −2.78% (12.33%) under non-steady-state conditions. Although assessment of validity was not the main aim of the present paper, the nature of the study enabled evaluation of the construct validity of the technique through examination of the relationship between cardiac output and \(\dot V{\text{O}}_2\). The slope of the CO– \(\dot V{\text{O}}_2\) equation describes the rate of change between the two variables with increasing levels of \(\dot V{\text{O}}_2\). In studies with children, values at maximal exercise of 4.50 and 6.98 have been reported for Doppler determinations (Rowland et al. 1997; Nottin et al. 2001) and 6.72 for CO2 rebreathing in a mixed sample of boys and girls (Nottin et al. 2001). Previous studies using traditional impedance methods with children identified slopes of 4.6 (Pianosi and Garros 1996; Rowland and Popowski 1996) and 5.2 (Pianosi 2004). Thus the value of 6.3 identified in the present study falls within the range reported from previous studies using a variety of techniques.

In conclusion, this study examined the reliability of peak \(\dot V{\text{O}}_2\) and PhysioFlow TEB measures of stroke volume and cardiac output in 20 children over three trials. The reliability statistics used are unbiased and provide the means for future studies using the same technique to estimate appropriate sample sizes (Hopkins 2000). Peak \(\dot V{\text{O}}_2\) was shown to be at least as reliable as in adults, with typical error expressed as a coefficient of variation of less than 5%. The reliability of the PhysioFlow device was shown to fall between reported values for CO2 rebreathing and Doppler echocardiography at maximal exercise in children. Although slightly less reliable than Doppler echocardiography, the PhysioFlow device evaluated here offers significant advantages over that technique, notably in terms of the expense and expertise required to perform the measurements during maximal exercise. The technique represents a valuable tool for exercise physiologists wishing to elucidate sex differences in cardiac parameters during exercise in children during growth and maturation, or for monitoring changes in cardiac output with training.