Introduction

Physique traits are associated with competitive success amongst athlete populations in many sports, with the specific traits varying with the sport, player or position (Olds 2001; Slater et al. 2010). Therefore, it has become essential for practitioners to monitor physique traits of athletes in response to growth, training, and/or dietary interventions (Ackland et al. 2012). Highly trained athletes likely see only small changes in body composition over time (Binkley et al. 2015; Harley et al. 2011), creating a need for highly precise techniques to identify small but potentially important changes in physique traits. A deeper understanding of these changes can enable better refinement of interventions and thus potentially enhance performance outcomes.

The most popular body composition assessment techniques used in practice include two-compartment (2C) models such as dual-energy X-ray absorptiometry (DXA) (ZiMian et al. 2010), air displacement plethysmography (BOD POD) and bioelectrical impedance spectroscopy (BIS), as well as surface anthropometry (SA) (Meyer et al. 2013), yet they are all vulnerable to inaccuracy and imprecision. Although the three-compartment (3C) model (fat mass [FM], fat-free mass [FFM], total body water [TBW]) has greater validity (Withers et al. 1998) and the four-compartment (4C) model (FM, FFM, TBW, bone mineral content [BMC]) is considered the current reference method (Withers et al. 1998), these models are also affected by subject presentation due to acute changes in factors such as hydration status (Kerr et al. 2017). Technical factors such as discrepancies in machine software/hardware (Toombs et al. 2012), subject positioning (Nana et al. 2012) or technician expertise (Hume and Marfell-Jones 2008) affect precision, and biological factors influenced by subject presentation also play an important role (Bunt et al. 1989; Dixon et al. 2009; Kerr et al. 2017; Nana et al. 2012). Exercise, plus food and fluid intake prior to assessment (Pietrobelli et al. 1998; Heiss et al. 2009; Dixon et al. 2009; Gallagher et al. 1998; Rouillier et al. 2015), core body temperature fluctuations (Fields et al. 2004) as well as muscle creatine and glycogen changes (Bone et al. 2016) are known to impact results. Arriving for assessment without controlling for these variables is referred to as non-standardised presentation. Industry recognises the need for quality control to minimise technical and biological variance, so manufacturers have provided some guidance for obtaining more accurate and reliable measurements. For example, COSMED and ImpediMed, the manufacturers of the BOD POD (COSMED USA 2010) and SFB7 (ImpediMed 2016) instruments respectively, recommend a 2 h food/fluid and exercise-free period, while bioelectrical impedance analysis (BIA) clinical guidelines advocate an 8 h fast and abstinence from alcohol consumption, plus avoidance of excessive exercise, 2 h prior to assessment (Kyle et al. 2004b). An investigation into the reliability of DXA has recommended assessment after an overnight fast (Nana et al. 2012), yet some manufacturers make no reference to subject presentation (Kelly et al. 2009; Lunar 2011). Standardised presentation refers to subjects presenting for assessment overnight fasted, rested and well hydrated, in accordance with current best practice guidance (Dixon et al. 2009; Nana et al. 2012; Utter et al. 2003). Despite recommendations that subjects present for assessment under these conditions, studies monitoring body composition change in athletes utilising a standardised subject presentation prior to measurement are rare. Accordingly, the impact of biological factors on interpretation of longitudinal monitoring of body composition remains to be elucidated.

The aims of this study were to (1) assess the changes in body composition after 6 months of self-selected training and diet by each technique, and (2) to evaluate the impact of subject presentation (standardised versus non-standardised) on interpretation of body composition change over time using SA, 2C, 3C and 4C models of assessment. It was hypothesized that failing to standardise subject presentation prior to body composition assessment would result in a misinterpretation of the true change in fat mass (FM) and fat-free mass (FFM), as inferred from results collected under standardised presentation. We hypothesized the direction of this change in interpretation of body composition would vary depending on the method used.

Materials and methods

Subjects

Thirty Caucasian athletic subjects volunteered to participate in this study. Inclusion criteria were male gender, at least 2 years of resistance training experience, and a body mass index (BMI) of ≥ 25 kg/m². Subjects were excluded from the study if they were more than 190 cm tall due to the limitation of the active scanning area of the DXA bed. The characteristics of subjects are presented in Table 1. All subjects were informed of the nature and possible risks of the investigation before giving their written informed consent. This study was conducted according to the guidelines laid down in the Declaration of Helsinki and all procedures involving subjects were approved by the Human Research Ethics Committee of the University of the Sunshine Coast (Ethics Approval Number S/12/450).

Table 1 Descriptive statistics for body composition variables

Experimental design

All subjects underwent two identical testing sessions in a single day which were repeated 6 months later (Fig. 1). Each session commenced with a total body DXA scan immediately followed by a BIS assessment, a BOD POD test and quantification of subcutaneous fat mass via SA, in that sequence. This order of testing was in accordance with prior assessments (Kerr et al. 2017) as it was deemed the most efficient and timely to undertake the tests. Each subject undertook the morning testing session (BASEam) under standardised conditions (early morning, overnight fasted, euhydrated and rested). A repeat test session (BASEpm) was undertaken at a random time later in the afternoon (~ 7 h later), after ad libitum food, fluid and physical activity without intervention. This protocol was then repeated 6 months later with test sessions referred to as POSTam and POSTpm. Comparison of these testing sessions allowed the calculation of mean and individual longitudinal change scores, 6 months later.

Fig. 1 Body composition was assessed on four occasions over a 6-month period

Subject presentation

Guidance was provided to ensure subject presentation was standardised for two of the testing sessions (BASEam and POSTam). Subjects were instructed to increase fluid intake on the days prior to testing to ensure a euhydrated state. They were required to present overnight fasted and well rested (no prior physical activity) on the mornings of BASEam and POSTam. They were asked to wear minimal fitted clothing with metal objects and jewellery removed, and clothing was checked for metal zips or studs. Hydration status was assessed from a mid-stream urine sample provided by the subjects early on the morning of testing. The specific gravity of the urine sample was measured using a digital refractometer (UG-Alpha, Atago Corporation, Tokyo, Japan). All subjects voided their bladder prior to tests. Stretch stature was measured with a stadiometer (Harpenden, Holtain Limited, Crymych, United Kingdom) to the nearest 0.1 cm. Body mass was measured on a calibrated scale to the nearest 0.01 kg (SECA GMBH, Germany).

Dual-energy X-ray absorptiometry

All DXA scans were undertaken in the total body mode on a pencil beam DXA scanner (Lunar DPX, GE Healthcare, Madison, WI, USA) with analysis performed using GE enCORE v.13.6 software (GE Healthcare) with the combined Geelong/Lunar reference database. The DXA was calibrated with phantoms as per the manufacturer’s guidelines each day before measurements were taken. All scans were conducted by the same Queensland Radiation Health licensed technician using the standard thickness mode as determined by the auto scan feature in the software, and all safety protocols as per the Institution’s Radiation Safety Protection Plan were adhered to. The coefficient of variation (CV) for the laboratory using this technique is 0.1, 2.2, 0.6, and 1.0% for body mass (BM), FM, lean mass (LM) and BMC respectively. The smallest worthwhile effect (SWE) for our laboratory using this technique is presented in Table 2.

Table 2 Longitudinal change in mean for all testing conditions

The scans were performed according to a protocol developed that emphasises a consistent positioning of subjects on the DXA scanning bed as previously described (Nana et al. 2012). Additionally, two Velcro straps were used to minimise any subject movement during the scan as well as provide a consistent body position for subsequent scans. One strap was secured around the ankles above the foot positioning pad and the other strap was secured around the trunk at the level of the mid forearms. All scans were analysed automatically by the DXA software but all regions of interest were reconfirmed by the same technician before being included in the subsequent statistical analysis.

Bioelectrical impedance spectroscopy

Immediately after each DXA scan whilst the subjects were still positioned on the DXA scanning bed, TBW was measured using the BIS device, SFB7 (ImpediMed, Brisbane, Australia). Subject positioning was standardised to ensure they lay in the supine position on the non-conductive foam mattress (dimensions being 75 cm in width and 236 cm in length) with hip abduction of 15°–30° (Kyle et al. 2004b) without contact to the metal side supports of the DXA scanner for a minimum of 15 min prior to BIS measurements. Prior to each testing session the BIS device was calibrated using a test cell provided by the manufacturer. The subject’s stature, body mass, age and gender were then programmed into the unit. Sites of attachment for the electrodes (ImpediMed, Brisbane, Australia) were first shaved at the foot and wrist if required, and cleaned with alcohol wipes before the dual-tab electrodes were attached in accordance with previous guidelines (Kerr et al. 2015). The SFB7 measures impedance using 256 frequencies between 4 and 1024 kHz to estimate TBW based on a Cole–Cole plot (Cornish et al. 1996). The Cole model shows impedance data across a spectrum of frequencies that has been plotted when resistance and reactance to the current are measured in biological tissue (Matthie 2008). Three measurements were taken consecutively and the median of these used in subsequent analysis. The BIS estimates body composition using the Pace et al. model (Pace and Rathbun 1945) to measure TBW, and subsequently FFM and then FM through simple subtraction from body mass, creating a field-based 2C model of physique assessment (Kyle et al. 2004a). The CV for the laboratory using this technique is 8.7% (FM) and 0.8% (FFM). The SWE for our laboratory using this technique is presented in Table 2.
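As an illustration of the 2C arithmetic described above, the following is a minimal sketch, not the SFB7's internal algorithm. It assumes an FFM hydration fraction of 0.732 (the value commonly attributed to Pace and Rathbun 1945, which is not stated in the text) and treats 1 L of body water as 1 kg. The function name and the example readings are hypothetical.

```python
from statistics import median

# Assumption: FFM hydration fraction attributed to Pace and Rathbun (1945).
# The value 0.732 is not given in the text and is used here for illustration.
FFM_HYDRATION = 0.732


def bis_two_compartment(tbw_readings_l, body_mass_kg, hydration=FFM_HYDRATION):
    """Return (TBW, FFM, FM) from consecutive TBW readings in litres."""
    tbw = median(tbw_readings_l)   # median of the consecutive readings
    ffm = tbw / hydration          # assumes 1 L of body water weighs 1 kg
    fm = body_mass_kg - ffm        # fat mass by simple subtraction
    return tbw, ffm, fm


# Hypothetical readings for a 95 kg subject
tbw, ffm, fm = bis_two_compartment([54.7, 55.2, 54.9], body_mass_kg=95.0)
print(f"TBW {tbw:.1f} L, FFM {ffm:.1f} kg, FM {fm:.1f} kg")
```

Because FM is obtained by subtraction, any error in the TBW estimate passes directly into both the FFM and FM values.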

Air displacement plethysmography

Immediately after BIS measurement, assessment of body density was undertaken using the BOD POD (BOD POD, Life Measurement Instruments, Concord, CA, USA) following the recommended procedures of the manufacturer (COSMED USA 2010) utilising a predicted thoracic lung volume (VTG) estimation (Crapo et al. 1982). A body mass value as well as stretch stature, gender and age were incorporated into an equation by the software to estimate a predicted thoracic lung volume (VTG). The subject cohort consisted of healthy male adults deemed acceptable for use of predicted VTG estimations (Crapo et al. 1982). Subjects were given a brief description of the procedure before entering the chamber for the first of two sequential body volume measurements, wearing only lycra clothing and a swim cap, with all metal objects removed prior to measurement. If the difference between these two measurements was > 150 mL a third measurement was taken. Body density was calculated by the BOD POD’s software system (COSMED V5.3.2) as follows:

$$D\ (\text{density}) = \frac{\text{Mass (scale)}}{\text{Volume (BOD POD)}}.$$

An estimate of FM and FFM was obtained after using the simple 2C model to calculate %BF as defined by the Siri equation (Siri 1961), as follows:

$$\%\text{BF} = \left(\frac{497.1}{\text{body density}}\right) - 451.9.$$

FFM (kg) and FM (kg) estimates were obtained using %BF values and subjects’ body mass. The CV for the laboratory using this technique is 3.4% (FM) and 0.5% (FFM). The SWE for our laboratory using this technique is presented in Table 2.
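The densitometric 2C calculation above can be sketched as follows. The constants are those given in the text; the function names and example inputs are hypothetical. The sketch assumes mass in kg and volume in L, so that density is in kg/L (numerically equal to g/cm³).

```python
def body_density(mass_kg, volume_l):
    """Body density in kg/L (numerically equal to g/cm^3)."""
    return mass_kg / volume_l


def percent_fat_2c(density):
    """%BF from body density using the 2C constants quoted in the text."""
    return 497.1 / density - 451.9


def partition(body_mass_kg, percent_fat):
    """Split body mass into (FM, FFM) in kg from %BF."""
    fm = body_mass_kg * percent_fat / 100
    return fm, body_mass_kg - fm


# Hypothetical subject: 90.0 kg with a measured body volume of 84.5 L
density = body_density(90.0, 84.5)
fat_pct = percent_fat_2c(density)
fm, ffm = partition(90.0, fat_pct)
print(f"density {density:.4f}, %BF {fat_pct:.1f}, FM {fm:.1f} kg, FFM {ffm:.1f} kg")
```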

Surface anthropometry

Immediately after completion of BOD POD assessment, duplicate skinfold measurements were taken according to the International Society for the Advancement of Kinanthropometry (ISAK) guidance by the same ISAK-certified technician as previously described (Norton et al. 1996). The intra-tester technical error of measurement (TEM) of 0.2 mm and 0.6% for the technician was calculated by taking the difference between the first and second measurement for each subject (d), squaring it (d²), summing the squared differences across subjects (Σd²), dividing by 2n (where n is the number of subjects), and taking the square root. Therefore:

$$\text{TEM} = \sqrt{\frac{\sum d^{2}}{2n}}.$$
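The TEM calculation can be sketched as below. The absolute TEM follows the formula above. The relative TEM (%) is assumed here to be the absolute TEM divided by the grand mean of all measurements, a common convention that the text does not spell out. Function names and data are hypothetical.

```python
from math import sqrt


def technical_error_of_measurement(first, second):
    """Absolute TEM from paired first and second measurements (same units)."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equally sized, non-empty series")
    sum_d2 = sum((a - b) ** 2 for a, b in zip(first, second))
    return sqrt(sum_d2 / (2 * len(first)))


def relative_tem(first, second):
    """Relative TEM (%): assumed to be TEM divided by the grand mean."""
    tem = technical_error_of_measurement(first, second)
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100 * tem / grand_mean


# Hypothetical duplicate skinfold readings (mm) for two subjects
first = [10.0, 12.0]
second = [12.0, 12.0]
print(technical_error_of_measurement(first, second))  # 1.0
```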

The sum of eight skinfolds was determined following measurements of the triceps, biceps, subscapular, iliac crest, supraspinale, abdominal, quadriceps, and calf skinfolds using previously described definitions of technique, with a calibrated skinfold caliper (Harpenden, Baty International, UK). The mean of duplicate or median of triplicate measures was then used in the subsequent analysis. Due to the similar procedure, equipment and population used, the 4C validated Evans equation of three skinfolds (triceps, abdominal and thigh) was utilised to calculate %BF as follows (Evans et al. 2005):

$$\%\text{BF} = 8.997 + 0.24658 \times (3\text{SKF}) - 6.343 \times (\text{gender}) - 1.998 \times (\text{race}),$$

where 3SKF is the sum of the three skinfolds, gender is coded as 0 = female, 1 = male, and race is coded as 0 = white, 1 = black.

FFM (kg) and FM (kg) estimates were obtained using %BF values and subjects’ body mass. The CV for the laboratory using this technique is 0.8%, 0.9% and 0.2% for absolute values (mm), FM and FFM respectively. The SWE for our laboratory using this technique is presented in Table 2.
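A minimal sketch of the Evans three-skinfold calculation, using the coefficients and coding given above. The function name, argument names and example skinfold values are hypothetical, and skinfolds are assumed to be in mm.

```python
def evans_percent_fat(triceps_mm, abdominal_mm, thigh_mm, male=True, black=False):
    """%BF from the sum of three skinfolds (mm) with the coefficients in the text."""
    skf3 = triceps_mm + abdominal_mm + thigh_mm
    gender = 1 if male else 0   # coded 0 = female, 1 = male
    race = 1 if black else 0    # coded 0 = white, 1 = black
    return 8.997 + 0.24658 * skf3 - 6.343 * gender - 1.998 * race


# Hypothetical white male with skinfolds of 10, 20 and 15 mm
fat_pct = evans_percent_fat(10.0, 20.0, 15.0)
body_mass_kg = 90.0
fm = body_mass_kg * fat_pct / 100
print(f"%BF {fat_pct:.2f}, FM {fm:.1f} kg, FFM {body_mass_kg - fm:.1f} kg")
```

Note that body mass enters only in the final conversion to kg, which is why the skinfold sum itself (mm) is less sensitive to presentation than the derived FM and FFM.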

Three- and four-compartment models

Utilising the body density values obtained by the BOD POD and the TBW estimations from the BIS, a 3C model was created for percentage of body fat calculated using the Siri equation as described by Withers et al., being a 3C model derived from prior research on highly trained individuals (Withers et al. 1998).

$$\%\text{BF} = \frac{211.5}{\text{body density}} - 78.0 \times \left(\frac{\text{TBW}}{\text{body mass}}\right) - 134.8.$$

Similarly for the 4C model, the additional variable of BMC measured from DXA was incorporated to calculate percentage of body fat using the Siri equation as described by Withers et al., being a 4C model derived from prior research on highly trained individuals (Withers et al. 1998). The BMC was converted to bone mineral mass by multiplying it by 1.0436 (Withers et al. 1998) before being incorporated into the following equation: 

$$\%\text{BF} = \frac{251.3}{\text{body density}} - 73.9 \times \left(\frac{\text{TBW}}{\text{body mass}}\right) + 94.7 \times \left(\frac{\text{bone mineral mass}}{\text{body mass}}\right) - 179.0.$$

FFM (kg) and FM (kg) estimates were obtained using %BF values and subjects’ body mass. The CVs for the laboratory using the 3C and 4C models are 3.7% (FM) and 0.5% (FFM), and 4.0% (FM) and 0.5% (FFM), respectively. The SWEs for our laboratory using the 3C and 4C models are presented in Table 2.
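The two multi-compartment equations can be sketched as follows, using the constants given above. The sketch assumes body density in g/cm³, and TBW, BMC and body mass all in kg (taking 1 L of water as 1 kg); these unit choices, the function names and the example inputs are assumptions for illustration.

```python
BMC_TO_BONE_MINERAL = 1.0436  # conversion factor quoted in the text


def percent_fat_3c(density, tbw_kg, body_mass_kg):
    """3C %BF from body density and total body water."""
    return 211.5 / density - 78.0 * (tbw_kg / body_mass_kg) - 134.8


def percent_fat_4c(density, tbw_kg, bmc_kg, body_mass_kg):
    """4C %BF, adding bone mineral mass derived from DXA BMC."""
    bone_mineral_kg = bmc_kg * BMC_TO_BONE_MINERAL
    return (251.3 / density
            - 73.9 * (tbw_kg / body_mass_kg)
            + 94.7 * (bone_mineral_kg / body_mass_kg)
            - 179.0)


# Hypothetical subject: density 1.065 g/cm^3, TBW 55 kg, BMC 3.5 kg, mass 90 kg
print(f"3C %BF {percent_fat_3c(1.065, 55.0, 90.0):.1f}")
print(f"4C %BF {percent_fat_4c(1.065, 55.0, 3.5, 90.0):.1f}")
```

In both equations the TBW term carries a large negative coefficient, so a shift in the TBW estimate between sessions moves %BF in the opposite direction.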

Statistical analysis

A customised spreadsheet http://sportsci.org/resource/stats/relycalc.html#excel was used to derive reliability statistics for comparing precision in the estimate of FFM and FM using the reference 4C model, with those obtained by the 2C and 3C models plus SA (FFM, FM and skinfolds sum of 8). These statistics included the difference in the mean between measurements, and confidence limits. The determination of smallest worthwhile effect (SWE) from differences in the means was calculated from two testing sessions, 24 h apart, conducted in our laboratory under standardised presentation testing (Kerr et al. 2017). Formulas for SWE (%) and SWE (kg) are as follows:

$$\text{SWE}\,\% = 100 \times \text{EXP}\left(\frac{\text{SD}}{3} \times \frac{0.2}{100}\right) - 100,$$

where EXP refers to the exponential of the values in the brackets next to it.

$$\text{SWE (kg)} = \text{SWE}\,\% \times \text{mean} / 100.$$
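The two SWE formulas can be sketched as below. The text does not state the units of SD; the sketch assumes it is a between-subject SD expressed in percent units. Function names and example values are hypothetical.

```python
from math import exp


def swe_percent(sd):
    """SWE (%) from the formula above; sd is assumed to be in percent units."""
    return 100 * exp(sd / 3 * 0.2 / 100) - 100


def swe_kg(sd, mean_kg):
    """SWE (kg): the percent SWE scaled by the group mean."""
    return swe_percent(sd) * mean_kg / 100


# Hypothetical example: SD of 15 (percent units) and a mean FFM of 80 kg
print(f"SWE {swe_percent(15.0):.3f} %, {swe_kg(15.0, 80.0):.3f} kg")
```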

The SWE was standardised by dividing by the standard deviation (Cohen’s effect size). To ensure the smallest worthwhile differences in body composition were standardised, one-third of the between-subjects SD was used for standardising [Δmean/(1/3 × SD)] as previously described (Nana et al. 2012). Therefore, the magnitudes of standardised effects were categorised as follows: < 0.20 trivial, < 0.60 small, < 1.20 moderate, and < 2.0 large (Hopkins et al. 2009). The change in mean was deemed substantial relative to the SWE when the standardised value reached the threshold for a small effect (≥ 0.2) (Nana et al. 2012). Nana et al. (2012) first used this statistical analysis to identify the change from the SWE across different conditions. The use of the SWE and Cohen’s effect size categories (small, moderate and large) has also been validated by Hopkins et al. (2009). This analysis was subsequently adopted by Kerr et al. (2017), who identified acute changes in the mean from the SWE. Individual change scores were plotted and visually interpreted using Bland–Altman analysis to compare the two different subject presentation regimes.
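The standardisation, magnitude thresholds and limits of agreement described above can be sketched as follows. This is an illustration rather than the spreadsheet used in the study. The limits of agreement are assumed to be the conventional mean difference ± 1.96 × SD of the differences, and the label "very large" for effects ≥ 2.0 is an assumption, since the text lists categories only up to large. All names and data are hypothetical.

```python
from statistics import mean, stdev


def standardised_change(delta_mean, between_sd):
    """Change in mean divided by one-third of the between-subject SD."""
    return delta_mean / (between_sd / 3)


def magnitude(effect):
    """Label a standardised effect; its sign (direction) is ignored here."""
    size = abs(effect)
    if size < 0.20:
        return "trivial"
    if size < 0.60:
        return "small"
    if size < 1.20:
        return "moderate"
    if size < 2.0:
        return "large"
    return "very large"  # assumed label, not listed in the text


def limits_of_agreement(change_standardised, change_non_standardised):
    """95% limits of agreement for paired individual change scores."""
    diffs = [b - a for a, b in zip(change_standardised, change_non_standardised)]
    bias = mean(diffs)
    half_width = 1.96 * stdev(diffs)
    return bias - half_width, bias + half_width


# Hypothetical individual FFM change scores (g) under the two regimes
am_change = [200.0, 450.0, -100.0, 300.0]
pm_change = [900.0, 150.0, 600.0, -250.0]
print(magnitude(standardised_change(400.0, 6000.0)))
print(limits_of_agreement(am_change, pm_change))
```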

Results

Body composition change with standardised presentation

Adherence to a best practice standardised subject presentation resulted in trivial longitudinal changes for raw values across all methods for FFM and FM estimations (Fig. 2a). The exception was for BIS (FM) where there was a large change of 7.2% from the SWE (Table 2).

Fig. 2 a–c Reliability results: change in mean ± 90% confidence intervals. DXA dual-energy X-ray absorptiometry, POD air displacement plethysmography, BIS bioelectrical impedance spectroscopy, 3C three-compartment model, 4C four-compartment model, SA surface anthropometry

Body composition change without standardised presentation at 6 months follow-up

Failing to account for subject presentation for follow-up testing resulted in trivial changes from the SWE in FFM for DXA and SA, small for BOD POD (1.1%) but large for BIS (2.2%), 3C (1.8%) and 4C (1.8%) models (Table 2). The change in mean from the SWE in FM estimates was trivial for SA, DXA and BOD POD, small for 3C (3.5%) and 4C (3.4%) models but large for BIS (6.4%). TBW (L) estimations obtained from BIS also had a large change from the SWE of 2.2% (Table 2). The individual change scores for FFM are compared for standardised and non-standardised presentation in Fig. 3a. For FFM, the 95% limits of agreement were largest for BIS, 3C and 4C models (− 5632 to 2223 g, − 3616 to 1246 g, − 3557 to 1216 g) and smallest for DXA, BOD POD and SA (− 2396 to 1145 g, − 2275 to 1273 g, − 1775 to 724 g). The individual change scores for FM are compared for standardised and non-standardised presentation in Fig. 3b. Again, the largest 95% limits of agreement were for BIS, 3C and 4C models (− 2857 to 4909 g, − 1772 to 2752 g, − 1698 to 2649 g) with smaller limits for DXA, BOD POD and SA (− 1332 to 1383 g, − 1530 to 1143 g, − 695 to 370 g).

Fig. 3 a Individual FFM change scores comparison and 95% limits of agreement. b Individual FM change scores comparison and 95% limits of agreement. DXA dual-energy X-ray absorptiometry, POD air displacement plethysmography, BIS bioelectrical impedance spectroscopy, 3C three-compartment model, 4C four-compartment model, SA surface anthropometry

Body composition change with non-standardised presentation

Failing to standardise subject presentation in both testing dates (BASEpm and POSTpm) resulted in trivial changes from the SWE in FFM for DXA, BOD POD, 3C and 4C models plus SA, but small for BIS (1.1%) (Table 2). The change in mean from the SWE in FM estimates for DXA, 3C and 4C models plus SA were trivial but moderate in BOD POD (3.3%) and large in BIS (9.4%). TBW (L) estimations obtained from BIS also had a small change from the SWE of 1.1% (Table 2).

Discussion

The key findings of this study were that failing to standardise subject presentation prior to body composition assessment resulted in a misinterpretation of changes in FM and FFM. When a standardised presentation in line with current best practice protocols (overnight fasted, rested, euhydrated) was implemented, the change in body composition for all methods after the 6-month period showed a conservative increase in FFM with a concomitant small decrease in FM (Fig. 2a). However, when presentation was not standardised 6 months later, the change to body composition was amplified in the methods. This was most apparent in the methods that included a TBW estimation, specifically BIS, 3C and 4C models (Fig. 2b). When presentation for both testing sessions (BASEpm and POSTpm) was non-standardised, the changes were reversed for 3C and 4C models (FM, FFM) as well as BIS and DXA (FFM), compared to the standardised presentation testing results (BASEam to POSTam) (Fig. 2c).

The individual change scores comparing the two different testing protocols show large individual differences in change scores for FFM and FM especially via BIS, 3C and 4C models. In addition to large changes in magnitude, some individuals showed an opposing direction of change in FM and FFM from non-standardised presentation in these methods. Overall, the changes in body composition were greater or reversed when current presentation guidance is not adhered to prior to testing and thus, interpretation of change will be misleading under these conditions. This may impact upon consequent dietary and training prescription and programming interventions.

Surface anthropometry

SA involves the measurement of characteristics such as stretch stature and girths as well as skinfold thicknesses of subcutaneous adipose tissue at specific landmarks around the body. The influence of TBW change on skinfold measurements has been found to be non-significant (Norton et al. 2000) suggesting that SA is unaffected by changes in hydration status. For both standardised and non-standardised presentation testing there was no change in SA values (mm) confirming prior research (Kerr et al. 2017) that this technique is robust and reliable. However, interpretation of change in composition is typically undertaken via regression equations that also include body mass which is influenced by changes in hydration status and gastrointestinal contents (Slater et al. 2010). The change in SA-derived estimations of FFM and FM was small but this study measured BM to obtain estimates of FM and FFM, so as expected there was a greater increase in the change of FFM from non-standardised presentation testing. Despite this, individual changes over the 6-month period were consistent in both direction and magnitude when assessed by SA, irrespective of subject presentation (Fig. 3a, b). Therefore, skinfold measurements (mm) can be undertaken at any time of day but the protocol of athlete presentation for SA (body mass) should follow previous recommendations of euhydrated, overnight fasted, post bladder and bowel evacuation with measurements taken in minimal clothing (Oppliger and Bartok 2002).

BOD POD

The BOD POD estimates physique traits by providing a valid measure of body volume and density, from which estimates of FFM and FM can be derived (Siri 1961; Brožek et al. 1963). One of the limitations of this densitometric 2C model is that the assumed TBW content of the FFM is 73.7% (Brožek et al. 1963), which previous research has shown not to be the case in all individuals, including large muscular males similar to those in this investigation (Kerr et al. 2015). The literature has shown that subjects presenting for BOD POD assessment in a dehydrated state may introduce error (Bunt et al. 1989), and dehydration in BOD POD assessments has produced a small underestimation in body fat (1.1%), which may be important when tracking longitudinal change (Utter et al. 2003). This study found that the modest change in FFM (400 g increase) over the 6-month period was increased (901 g increase) when food, fluid and physical activities were unrestricted prior to testing (Fig. 2b), although not by as much as for BIS, 3C and 4C models. In contrast, the change in FM (260 g decrease) from standardised presentation was diminished with non-standardised follow-up testing (4 g decrease). Of interest are the individual changes in FFM and FM, which show similarities in magnitude and direction, suggesting subject presentation is not as impactful as for other methods such as BIS, 3C and 4C models (Fig. 3a, b). Although the BOD POD has been found to be reliable and robust (Kerr et al. 2017) with a small amount of prior food and fluid intake, at least with respect to mean changes in FM, existing guidance on subject presentation for BOD POD tests could be improved. By extending the 2-h food/fluid and exercise restriction to overnight fasted, rested and euhydrated in accordance with the other techniques, reliability may be enhanced for monitoring longitudinal changes in physique traits. This warrants further investigation.

DXA

DXA is unique in that the technology estimates FM and FFM from the attenuation of X-rays at two photon energies as they pass through body tissue, which depends on tissue composition. Although previous research recommends standardising both subject presentation and scan protocol (Nana et al. 2012) for reduction of biological and technical error, manufacturers’ guidance is less clear (Lunar 2011; Kelly et al. 2009). The change in mean with standardised presentation for estimates of FFM using DXA technology was minimal, which implies that there has been, collectively at least, no real change. However, the mean change in FFM was higher (762 g) for non-standardised testing, suggesting that the gain in FFM was biological variation (Fig. 2b). Despite no meaningful change in magnitude for FFM and FM estimates from either presentation (Figs. 2c, 3b), including individual responses (95% limits of agreement of − 1332 to 1383 g), the impact on magnitude of longitudinal change in FFM indicates subject presentation should not be ignored. Consequently, this research confirms earlier work that recommends a standardised presentation protocol be implemented prior to assessment using the DXA technique (Nana et al. 2012) in order to have confidence that any detected changes in physique are meaningful. Manufacturers are recommended to include subject presentation guidance for total body scans in their recommendations for use to account for any biological variance in FFM estimations.

BIS

Deuterium dilution (D2O) is the reference method for laboratory based TBW measurement in 3C and 4C models but is expensive and cannot be used for serial measures acutely over time (van Marken Lichtenbelt et al. 1994), whereas BIS has been applied in both athlete and non-athletic populations, being safe, non-invasive and cost effective with instantaneous TBW results (Kerr et al. 2015; Moon et al. 2008). BIS is a measurement of body composition that scans 256 frequencies between 4 and 1000 kHz. It directly measures impedance using Cole modelling with Hanai mixture theory to determine TBW. Calculations of FFM from TBW are created using the Pace et al. model, with FM generated by simply subtracting FFM from body mass (Pace and Rathbun 1945). BIA scans a single frequency of 50 kHz to determine TBW; however, both instruments rely on several assumptions, including the human body being a series of cylinders that have equal resistivity to an electrical current that passes through water-containing tissue (FFM). Additionally, the techniques are insensitive to water changes in the trunk region and predictive algorithms assume a relative distribution of water between the limbs and trunk. Previous research using BIA technology has found that acute ingestion of fluid can lead to FM being overestimated by 3.2% (Saunders et al. 1998) and, in agreement, this study using BIS found that non-standardised testing (BASEpm to POSTpm) showed a more substantial increase in FM (701 g) compared to standardised testing (68 g). Additionally, the small increase in FFM identified via standardised testing (204 g) was recognised as a decrease (− 529 g) when presentation was not controlled for (BASEpm to POSTpm). In contrast, testing conditions of BASEam to POSTpm showed an underestimation of FM (− 958 g) and overestimation of FFM (1908 g), respectively. Previous research has shown that small amounts of fluid intake (590 mL) influence BIA-derived FM estimations (Dixon et al. 2009), confirming reliability of assessment is heavily dependent on subject presentation. Further, non-standardised presentation testing on both occasions showed a large and contradictory change in FFM and FM compared to the standardised presentation testing protocol, which suggests that results could be misinterpreted using this method. Of great interest from this study regarding non-standardised presentation was the random nature of individual responses in FFM and FM from BIS, with 95% limits of agreement ranging from − 5632 to 2223 g (Fig. 3a) and − 2857 to 4909 g (Fig. 3b), respectively. This makes small to moderate changes in physique traits difficult to quantify on an individual level, as mean changes (BASEam to POSTpm) revealed an overall large increase in FFM and loss of FM compared to BASEam to POSTam assessments. In contrast, BASEpm to POSTpm assessment mean changes for BIS identified a large increase in FM and loss in FFM.

Similar to other physique assessment methods there are several variables potentially influencing bioelectrical impedance measurements (BIA, MFBIA and BIS) making it far less predictable at an individual level. These include factors that impact TBW such as prior food and fluid intake, physical activity before measurement or medical conditions that affect fluid and electrolyte balance (Kyle et al. 2004b). However, due to the measurement of impedance to an electrical current, this technology is unique regarding its vulnerability to imprecision. Specifically, changes to cutaneous blood flow, skin electrolyte balance and ambient temperature can contribute to reduced reliability as the impact can be inconsistent across individuals (Dehghan and Merchant 2008). Indeed, any change to TBW will affect concomitant changes in fluid and electrolyte content (Saunders et al. 1998), confounding any change inferred from bioimpedance-derived estimates of physique traits (O’brien et al. 2002). Considering these findings, it is recommended that subject presentation follow previous clinical guidance (Kyle et al. 2004b) before utilising BIA or BIS technology for monitoring change in body composition. Manufacturers of these instruments are advised to extend their guidance for use to include prerequisites of overnight fasted, euhydrated and well-rested subjects prior to testing.

Three- and four-compartment models

The 3C model removes some assumptions associated with the 2C models through the inclusion of a measured TBW value. Similarly, the 4C model, which also includes a measure of BMC, is considered the current reference method (Withers et al. 1998). The 3C and 4C models in this study included TBW estimates from BIS, so intuitively we expected they would follow a similar direction and magnitude as BIS measures of physique traits. As expected, the overall mean change and direction of change in 3C and 4C model estimates of FFM (increased) and FM (decreased) were similar to BIS using the Pace et al. method for FM and FFM estimations, for all testing conditions (Fig. 2a–c). This corroborates previous research (Kerr et al. 2017) undertaken by our group, which has shown that acute biological error greatly alters 3C and 4C model body composition estimates. Additionally, the individual responses were inconsistent, with sufficient variation to make prediction of FFM and FM responses challenging (Fig. 3a, b). Therefore, if applying a reference 3C or 4C model using BIS estimates of TBW, it is critical that subject presentation is standardised for meaningful change in physique traits to be identified. This includes adherence to specific clinical guidance on subject presentation for BIS estimations of TBW (Kerr et al. 2017). The reference method of TBW estimation, D2O, could not be used in this study because of the frequency of assessments undertaken. The implications of using D2O warrant investigation, given this method is not influenced by factors such as skin temperature and peripheral blood flow, which are known to influence BIS estimates of TBW.
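
To illustrate why an acute shift in the TBW estimate moves FFM and FM in opposite directions, the sketch below applies a constant-hydration conversion of TBW to FFM. It assumes the commonly cited FFM hydration fraction of 0.732 attributed to Pace and colleagues and treats 1 L of water as 1 kg; the constant actually applied in this study is not stated in this section, and the function name and input values are hypothetical.

```python
# Assumed hydration fraction of fat-free mass (commonly cited value);
# not confirmed by this section as the constant used in the study.
ASSUMED_FFM_HYDRATION = 0.732


def ffm_fm_from_tbw(body_mass_kg, tbw_litres, hydration=ASSUMED_FFM_HYDRATION):
    """Estimate (FFM, FM) in kg from body mass and total body water.

    Treats 1 L of body water as 1 kg, which is an approximation.
    """
    if not 0.0 < hydration < 1.0:
        raise ValueError("hydration fraction must lie between 0 and 1")
    ffm = tbw_litres / hydration
    fm = body_mass_kg - ffm
    return ffm, fm


# Hypothetical subject: same body mass, TBW estimate 1 L higher on the
# second reading (e.g. an altered impedance reading, not true tissue gain).
ffm_1, fm_1 = ffm_fm_from_tbw(90.0, 55.0)
ffm_2, fm_2 = ffm_fm_from_tbw(90.0, 56.0)

print(f"Apparent FFM change: {ffm_2 - ffm_1:+.2f} kg")
print(f"Apparent FM change:  {fm_2 - fm_1:+.2f} kg")
```

Under this assumption, each litre of error in the TBW estimate is amplified by roughly 1/0.732 in FFM and mirrored with opposite sign in FM when body mass is unchanged, which is consistent with the direction of change described above for methods that rely on BIS-derived TBW.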

A limitation of this study is the use of the SFB7 BIS device instead of the reference D2O method for TBW assessment. The resolution of D2O is greater (600–800 mL) (Armstrong 2005) than for BIS; however, prior to inclusion of the BIS-derived TBW measures we undertook a validation study of the BIS system on a similar population of large muscular males (Kerr et al. 2015). Due to the constraints associated with use of D2O, BIS or BIA is increasingly being used in 3C and 4C models. Another potential limitation of this study was the use of predicted instead of measured Vtg during the BOD POD assessments. This may have impacted on results if Vtg changed during the 6 months between assessments. However, since all training undertaken by the subjects was self-selected, ascribing any such change to training would be speculative. Furthermore, the regression equation for predicted Vtg was validated previously in a cohort of healthy males similar to this group (Crapo et al. 1982). For this reason, and for timeliness, a predicted Vtg was used in this study. Finally, other multi-compartment models of physique assessment are available, including the Heymsfield 4C model and the Baumgartner 4C model (Wang et al. 1998). While it is claimed that they include corrections for soft tissue mineral and other updated constants relative to the Withers 4C model that was used in this study, soft tissue mineral is unlikely to change over a similar 6-month period of training in adult males, and so would produce only trivial change to outcomes (Wang et al. 2002). Furthermore, the same assumptions are made at both assessment periods and thus, it is believed similar outcomes would have been achieved using these newer 4C models.

In conclusion, after 6 months of self-selected training and diet, standardised presentation testing identified a modest increase in FFM and a minimal or negligible decrease in FM via all methods. However, non-standardised presentation for follow-up testing exaggerated changes in body composition, with a substantial increase in FFM for all methods but most notably for BIS and the 3C and 4C models, which included an estimation of TBW from BIS. Furthermore, utilising non-standardised presentation for both time points showed changes that were contradictory to standardised presentation testing for FFM using some methods (BIS) and for FM plus FFM using others (DXA, 3C and 4C models), indicating the impact that variation in TBW has during the day. Muscular athletic males with large amounts of FFM, such as the subjects in this study, experience substantial hydration fluctuations throughout the day, which explains why methods using TBW estimations have the highest biological change. The effects of non-standardised presentation may be more marked in athletic than in general or clinical populations because of larger training-induced shifts in TBW; nonetheless, this requires further exploration.

Standardising subject presentation had little impact on interpretation of change in FM when using SA, DXA and BOD POD. However, subject presentation markedly impacted on interpretation of change in FM at the individual level when using BIS, 3C and 4C models, making prediction of change difficult when subject presentation was not standardised in accordance with the guidance offered here. The results of this study confirm the importance of consistent standardised presentation (overnight fasted, rested and euhydrated) of subjects prior to testing to accurately interpret changes from body composition assessment. The BOD POD and SA methods were less affected by subject presentation, but observed changes still contained unacceptable biological variation, emphasising that standardised presentation of subjects is required before testing if accurate assessment of longitudinal physique change is desired. This study adds to the research emphasising the importance of subjects adhering to a standardised protocol prior to assessments when attempting to interpret change in physique over time. Manufacturers of 2C models are advised to update their recommendations for use to include guidance for subject preparation to minimise biological variability. Practitioners are advised to adhere to best practice presentation protocols to ensure real change in body composition in athletes can be confidently identified.