Introduction

Athletes often assess their body composition given its association with performance in sport. Body composition data can be used to evaluate the effectiveness of nutrition and conditioning interventions, and also indicate health and injury risks [13].

Dual-energy X-ray absorptiometry (DXA) was originally designed to assess bone mineral density but gained popularity in body composition measurement due to its low radiation dose, ease of use, and high throughput [3]. With its relatively high precision, DXA serves as a laboratory reference method for body composition assessment [1]. Despite the low dose of radiation involved with DXA, there is no consensus on the maximum number of scans per year that can be performed safely, and hence the technique is not suitable for regular assessment of body composition throughout a sport season [11]. The costs associated with owning and operating a DXA machine may also limit its widespread use in regular body composition monitoring [11].

Skinfold thickness assessment is a common field method used by sport science practitioners to determine body composition [15]. It is easy to administer, inexpensive, and reliable if performed by a trained practitioner [1]. However, factors such as the grip of the practitioner, the applied pressure of the caliper, and the type and brand of calipers used may affect skinfold measurement outcomes [4]. Despite these technical limitations, skinfold thickness assessment is less affected by activities of daily living, such as meal ingestion and hydration, than other methods for assessing body composition [12]. Many researchers consider the sum of skinfold thicknesses a valid body composition measure for Olympic and world-class athletes, with the number of measured body sites varying from six to eight [5, 8, 19]. However, not all sports have reference norms for skinfold thickness.

Researchers have developed and validated several equations to predict body fat percentage (BF%) from skinfold thicknesses in elite athletes [18, 20, 21, 22]. However, all of these equations were based on athletes in North America and Europe, and therefore may not generalize to athletes in Asia or elsewhere. Although researchers have developed and validated predictive equations for Asians and Asian Americans in the general population [6, 9], there are no such equations for elite Asian athletes. The equations developed by Davidson and colleagues [6], in particular, were based on the widely used equation by Durnin and Womersley (DW) [7], which predicts BF% from skinfold thickness and was validated against hydro-densitometry. Davidson et al. [6] refined the DW equation to include different racial groups and genders and validated their equation against DXA.

Given the relevance of BF% to sport and health, race- and gender-specific equations for predicting BF% from skinfold thickness data would allow coaches and athletic support personnel to monitor athletes’ body composition accurately and easily. Therefore, we sought to develop an equation to predict BF% from skinfold thicknesses and other anthropometric measurements in elite Asian athletes.

Methods

Study Design and Participants

We analyzed data from two studies. In the first study, we recruited 95 athletes from Singaporean national teams between February 2016 and March 2018 to investigate vitamin D’s relationships with muscle function and bone health [10]. Details of the recruitment process, including inclusion and exclusion criteria, are documented in the published article by Huang et al. [10]. In the second study, we recruited a separate, non-overlapping sample of 21 athletes from Singaporean national teams between May and June 2019 to examine body composition changes before and after Ramadan fasting (unpublished data, Ducker et al.). Pre-Ramadan anthropometric measurements were used for this analysis. Athletes were recruited via word-of-mouth by their respective sport scientists and through email invitations sent to national sporting associations in Singapore. Both studies recruited highly trained athletes who had represented Singapore at local and regional competitions [14].

We excluded athletes from both studies who were pregnant, had past injuries that contraindicated DXA scanning, or were on long-term medication that contraindicated participation. All athletes gave written informed consent prior to participation, and the Singapore Sport Institute’s Institutional Review Board (NU/FULL/003 and NU/FULL/015) approved both studies. We used the data from the vitamin D study to develop the prediction equation, and then validated that equation with data from the Ramadan study.

Anthropometry

In both studies, we measured height, weight, waist circumference, and skinfold thicknesses at eight sites (right side of body: triceps, subscapular, biceps, iliac crest, supra-spinal, abdomen, front thigh, and medial calf). Staff with Level 1 certification from the International Society for the Advancement of Kinanthropometry (ISAK) took all anthropometry measures. All skinfold measurements were taken in duplicate with calibrated calipers (Harpenden, UK). If the difference between duplicate measurements exceeded 5% for an individual skinfold site, staff took a third measurement. In analysis, we used the mean of duplicate or the median of triplicate measurements. Staff measured each participant’s height with a wall-mounted stadiometer (Holtain Limited, UK), weight with an electronic scale (Mettler-Toledo GmbH, Germany), and waist circumference at the narrowest reading above the navel with a non-elastic measuring tape (Lufkin, USA).
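The duplicate and triplicate rule described above can be sketched in Python as follows. This is an illustration only, not the authors' procedure: the function name `site_value` is hypothetical, and the sketch assumes the 5% difference is computed relative to the mean of the first two readings, which the text does not specify.

```python
from statistics import mean, median
from typing import Optional


def site_value(first: float, second: float, third: Optional[float] = None,
               tolerance: float = 0.05) -> float:
    """Return the skinfold value used in analysis for one site.

    Assumption: the 5% tolerance is relative to the mean of the first two
    readings; the source does not state the reference value.
    """
    relative_diff = abs(first - second) / mean([first, second])
    if relative_diff <= tolerance:
        # Duplicates agree: use their mean.
        return mean([first, second])
    if third is None:
        raise ValueError("duplicates differ by more than 5%; a third reading is required")
    # Duplicates disagree: use the median of the three readings.
    return median([first, second, third])
```

For example, readings of 10.0 and 10.2 differ by about 2% and are averaged, whereas readings of 10.0 and 12.0 require a third reading and the median of the three is used.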

Three staff members performed anthropometry measures for the first sample set, while two staff members performed anthropometry measures for the second sample set. For the second sample set, one staff member was responsible for landmarking the sites while the other performed the skinfold measures.

Dual-energy X-ray Absorptiometry (DXA)

In the vitamin D study, we assessed whole body composition with pencil beam DXA (Lunar DPX-L, GE Healthcare). In the Ramadan study, we assessed whole body composition with fan beam DXA using the Hologic Discovery Wi system (Hologic, USA) at a commercial radiological provider (AsiaMedic Imaging Centre, Singapore). Participants completed both skinfold measurements and DXA scanning within the same day. In both studies, prior to scanning, we asked all athletes about current injuries and orthopedic pins or implants, and they then removed any jewelry or metal objects that could interfere with scanning. For scanning, athletes dressed in minimal clothing and lay supine on the scanning bed with hands by their sides, level with their hips.

Statistical Methods

With the vitamin D study participants (n = 95) as the training sample, we performed regression analysis to predict BF%. We then used the Ramadan study participants (n = 21) as validation data to evaluate how well the prediction models predicted the athletes’ BF%. The data of one participant from the Ramadan study were excluded due to a missing triceps skinfold measurement. We developed the prediction model using the following variables: the sums of seven and eight skinfold thicknesses (SOS7 [all of the sites except iliac crest] and SOS8), age, weight, height, eight skinfold thicknesses (triceps, subscapular, biceps, iliac crest, supra-spinal, abdomen, front thigh, and medial calf), waist circumference, body mass index (BMI), sex, the natural logarithm transformations of weight [log(weight)], height, and abdomen, and the quadratic transformations of age (age²) and waist circumference (waist²). Because BMI is a function of weight and height, we did not include BMI in any model with height or weight to avoid potential overfitting and multicollinearity issues. Likewise, we did not include the summary skinfold thickness measures (SOS7 and SOS8) in the same model with each other or with any of the individual skinfold thicknesses. We computed the pairwise Pearson correlations among all variables, inspected scatterplots for linear relationships, and transformed covariates that had non-linear relationships with BF% prior to regression analysis (Figs. 1, 2). Additionally, we compared the Pearson correlations of the raw untransformed and the transformed covariates with BF% prior to the regression analysis to inform the choice of covariates to include in the models. Finally, we fitted several prediction models, some using transformed and others using raw untransformed covariates. However, the raw and transformed versions of the same covariate were never included as predictors in the same model.
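As a rough illustration of the candidate covariates listed above, the following Python sketch derives the summary and transformed variables from raw measurements. The function name, dictionary keys, and units (weight in kg, height in cm) are assumptions made for the example and are not taken from the study.

```python
import math

# The eight skinfold sites listed in the text (key names are hypothetical).
SITES = ["triceps", "subscapular", "biceps", "iliac_crest",
         "supra_spinal", "abdomen", "front_thigh", "medial_calf"]


def derive_covariates(skinfolds: dict, age: float, weight_kg: float,
                      height_cm: float, waist: float) -> dict:
    """Build the summary and transformed covariates named in the text."""
    sos8 = sum(skinfolds[site] for site in SITES)
    sos7 = sos8 - skinfolds["iliac_crest"]  # SOS7 excludes the iliac crest
    height_m = height_cm / 100.0
    return {
        "SOS7": sos7,
        "SOS8": sos8,
        "BMI": weight_kg / height_m ** 2,
        "log_weight": math.log(weight_kg),  # natural logarithm
        "log_height": math.log(height_cm),
        "log_abdomen": math.log(skinfolds["abdomen"]),
        "age2": age ** 2,
        "waist2": waist ** 2,
    }
```

Because BMI is computed from weight and height, and SOS7 and SOS8 from the individual sites, the sketch makes visible why these variables were not entered into the same model as their components.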

Fig. 1

Pair-wise correlation plot. Pair-wise correlation between variables is represented by the size and color of the corresponding circle. The size of the circle indicates the significance of the correlation: the larger the circle, the more significant the correlation. The color of the circle represents the magnitude of the correlation: the darker the color, the larger the correlation. Blue circles indicate positive correlations and salmon circles indicate negative correlations

Fig. 2

Plots of selected potential predictors against DXA-measured BF%. The four scatter plots illustrate the non-linear relationship between each predictor and DXA-measured BF%

To select the covariates for the model to be used with the training sample, we used backward stepwise regression, eliminating variables that had statistically nonsignificant (P > 0.05) associations with BF%. The maximum number of initial covariates included in the full regression model was 13, and the resulting number of subjects per variable (SPV) was 7.3. While various rule-of-thumb values are often used for the minimum SPV required in regression models, a recent study illustrated through extensive Monte Carlo simulations that a minimum SPV of approximately 2 was adequate for estimation of coefficients, standard errors, and confidence intervals in regression models [2]. Since regression models assume independent observations, we conducted the Durbin–Watson test to assess whether the model residuals violated this assumption. Values of the Durbin–Watson statistic between 1.5 and 2.5 are generally taken to indicate no autocorrelation. According to the results of the test, the assumption of independent observations was met (Durbin–Watson autocorrelation = −0.01, Durbin–Watson statistic = 2.00, P = 0.92). A common challenge with linear regression models is multicollinearity, which occurs when some of the predictors are correlated with each other. The variance inflation factor (VIF) measures the correlation among the predictors in a regression model and is used to assess multicollinearity. In our model building, we obtained the VIFs for a final model that included triceps, subscapular, log (abdominal), front thigh, waist², and gender. The VIFs associated with all the covariates were less than 10. To compare models derived from different starting sets of covariates and identify the best performing model, we used the Akaike information criterion. We then used this training sample model to predict BF% in the test sample and assessed its accuracy with mean squared error (MSE) and adjusted R². We also computed the Pearson correlation between the predicted and observed values and performed a paired t test between them. We defined statistical significance as P < 0.05 and conducted all analyses with R [17].
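Two of the diagnostics described above reduce to short formulas, sketched here in Python for illustration (the analyses themselves were conducted in R, and the function names below are hypothetical). The SPV is the number of subjects divided by the number of candidate covariates, and the Durbin–Watson statistic is the sum of squared successive residual differences divided by the sum of squared residuals.

```python
def subjects_per_variable(n_subjects: int, n_covariates: int) -> float:
    """SPV = number of subjects / number of covariates in the full model."""
    return n_subjects / n_covariates


def durbin_watson(residuals: list) -> float:
    """d = sum((e_t - e_{t-1})^2) / sum(e_t^2), residuals in observation order.

    Values near 2 suggest no first-order autocorrelation; this function
    returns only the statistic and does not compute a P value.
    """
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator
```

With 95 training subjects and 13 initial covariates, `subjects_per_variable(95, 13)` gives roughly 7.3, matching the value reported above. The statistic is approximately 2(1 − r), where r is the lag-1 autocorrelation of the residuals, which agrees with the reported pair of −0.01 and 2.00.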

Results

Athletes in the training sample represented 18 sports and those in the test sample represented 10 sports. The athletes in the two samples were similar in terms of gender, age, height, and weight, although they differed notably in ethnicity and body composition (Table 1). We conducted F tests to test the equality of the variances of age, height, weight, DXA, and SOS8 between the training and test samples. According to the F tests, the variances of these variables did not differ significantly between the two samples (all P values > 0.05). We also conducted two-sample t tests to compare the means of age, height, weight, DXA, and SOS8 between the samples. According to the two-sample t tests, the means of these variables did not differ significantly between the two samples (all P values > 0.05). Based on our correlation analyses, the eight measures of skinfold thickness were moderately to strongly correlated with each other (0.57 to 0.90, all P values < 0.0001) and with their SOS7 and SOS8 summaries (0.76 to 0.93, all P values < 0.0001). While the DXA-based measures of BF% were linearly correlated with most of the independent variables, the bivariate plots showed non-linear associations between BF% and age, BMI, weight, and waist circumference. Therefore, quadratic or log transformations were considered for age, BMI, weight, abdominal skinfold thickness, and waist circumference.

Table 1 Characteristics of athletes in samples 1 and 2 including distribution of sport type. M refers to male athletes and F refers to female athletes

Our proposed model is given by

$$BF\%=-8.304+0.239\times Triceps+0.230\times Subscapular+4.199\times \log(Abdominal)+0.517\times FrontThigh+0.001\times {Waist}^{2}-4.106\times Male$$

Table 2 shows our final prediction model. Triceps, subscapular, log (abdominal), and front thigh skinfold thicknesses, waist², and sex were all statistically significant at an alpha level of 0.05. Females had higher BF% than males on average. Skinfold thicknesses (triceps, subscapular, abdominal, and front thigh) all had positive relationships with BF%. The predicted and observed BF% values were not systematically different in either the training or test samples (paired t-test values close to 0). Moreover, the predicted and observed values were strongly correlated in both the training and test samples (r = 0.96 and r = 0.97, respectively).
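For readers who wish to apply the proposed equation, a minimal Python sketch follows. The coefficients are those given in the equation above and the logarithm is the natural logarithm, as stated in the Methods. The function name is hypothetical, and the units (skinfolds in mm, waist circumference in cm) are an assumption, since the text does not state them explicitly.

```python
import math


def predict_bf_percent(triceps: float, subscapular: float, abdominal: float,
                       front_thigh: float, waist: float, male: bool) -> float:
    """Predict BF% with the proposed equation.

    Assumed units (not stated in the source): skinfolds in mm, waist in cm.
    """
    return (-8.304
            + 0.239 * triceps
            + 0.230 * subscapular
            + 4.199 * math.log(abdominal)  # natural logarithm
            + 0.517 * front_thigh
            + 0.001 * waist ** 2
            - 4.106 * (1 if male else 0))
```

For two athletes with identical measurements, the male prediction is 4.106 percentage points lower than the female prediction, reflecting the sex coefficient.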

Table 2 Final coefficients for the predictive equation of BF%

The ratios between the DXA-measured BF% (reference) and predicted values, in percentage form (parity = 100%), ranged between 86% and 151% for individual athletes in the test sample. The absolute differences between observed and predicted values, expressed as percentages relative to observed values, ranged between 0.17% and 34%. Figures 3 and 4 show the ratios and absolute percent changes between the observed (reference) values and predicted values in relation to the distribution of observed (reference) values. For instance, Fig. 3 shows that about 35% (n = 7) of the test sample’s predictions had a reference/predicted ratio between 0.9 and 1.11, and most of these were located at the two tails of the test data (the shaded area). This suggests that the prediction was more accurate for subjects with high or low BF% levels than for those with intermediate BF% levels. Similarly, Fig. 4 illustrates that the predicted values are more accurate at high and low BF% levels than at intermediate BF% levels.
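The two per-athlete agreement measures used above can be written as a short Python sketch. The function names are hypothetical; the second formula follows the definition of absolute percent change given in the caption of Fig. 4.

```python
def reference_to_predicted_ratio(reference: float, predicted: float) -> float:
    """Reference/predicted ratio in percentage form (100 means parity)."""
    return 100.0 * reference / predicted


def absolute_percent_change(reference: float, predicted: float) -> float:
    """100 x |reference BF% - predicted BF%| / reference BF%."""
    return 100.0 * abs(reference - predicted) / reference
```

As a hypothetical example, a reference BF% of 20 and a predicted BF% of 16 give a ratio of 125% and an absolute percent change of 20%.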

Fig. 3

Histograms of DXA with different ratios of the reference/predicted values. Reference/Predicted denotes the ratio between reference BF% (DXA) and predicted BF% values by the proposed model. The numbers in the parentheses in bold represent the upper and lower limit of the Reference/Predicted ratio for each histogram. The value of n represents the number of subjects that have a Reference/Predicted value falling in the corresponding upper and lower limit for each histogram and the percentage in parentheses of the total test sample. Each histogram illustrates the distribution of the reference BF% (DXA) of these subjects (the shaded area) along with the distribution of the reference BF% (DXA) of all test subjects in the background (the white area). We provide this figure for illustrative purposes and note that the results are not generalizable due to the small sample size

Fig. 4

Histogram of DXA with different percent change between reference and predicted values. Absolute percent change = 100 × |reference BF% – predicted BF%| / reference BF%. The value of n represents the number of subjects that have an absolute percent change value falling in the corresponding limit for each histogram and the percentage in parentheses of the total test sample. Each histogram illustrates the distribution of the reference BF% (DXA) of these subjects (the shaded area) along with the distribution of the reference BF% (DXA) of all test subjects in the background (the white area). We provide this figure for illustrative purposes and note that the results are not generalizable due to the small sample size of the test data

Figure 5 illustrates the relationship between the BF% values fitted (predicted) by the proposed equation and the observed (reference) BF% values. The x-axis represents the observed (reference) BF% values, while the y-axis represents the fitted (predicted) BF% values. The diagonal line in the center of the plot represents the lowess smoothing line between the predicted and reference values. Since most of the data points were close to the diagonal line, we concluded that the regression model provided an adequate fit to the test data; that is, the predictions made by our proposed model are close to the reference. The adjusted R² represents the proportion of variation in the observed (reference) BF% values that can be explained by the regression model while penalizing the number of predictors. Our proposed model can explain about 64.5% of the variation in the observed (reference) BF% values, which is higher than that of the Davidson equation and the Henry equation (see Table 3).
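The accuracy measures discussed above can be sketched in Python as follows. The MSE follows the definition in the caption of Fig. 5. The adjusted R² uses the standard formula 1 − (1 − R²)(n − 1)/(n − p − 1), with R² computed from the residual and total sums of squares; whether the study computed it in exactly this way on the test data is an assumption, and the function names are hypothetical.

```python
def mean_squared_error(reference: list, predicted: list) -> float:
    """MSE = sum((reference - predicted)^2) / sample size."""
    n = len(reference)
    return sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n


def adjusted_r_squared(reference: list, predicted: list,
                       n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    Assumption: R^2 = 1 - SS_res / SS_tot, evaluated on the supplied data.
    """
    n = len(reference)
    mean_ref = sum(reference) / n
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r_squared = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_predictors - 1)
```

The penalty term grows with the number of predictors relative to the sample size, so with a small test sample and six predictors the adjusted R² can be noticeably lower than the unadjusted value.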

Fig. 5

Plot of reference vs. predicted values from our analysis, including mean squared error (MSE) and adjusted R². MSE = \(\Sigma\)(reference BF% – predicted BF%)²/sample size. BF: body fat; Adj.Rsq: adjusted R²

Table 3 Model assessment of the proposed equation of this study and the Asian-specific equation by Davidson et al. [6] and Henry et al. [9]

Discussion

Although athletes undergo regular skinfold thickness measurements for monitoring of body composition, no equations for predicting BF% in Asian athletes were previously available. Our prediction equation, based on samples of Asian athletes, includes waist circumference, skinfold thicknesses at four sites (triceps, subscapular, abdominal, and front thigh), and gender. Despite the large difference in ethnic distribution between the first and second sample sets, the high accuracy of the BF% predicted with the equation derived from the first sample set suggests that the model is robust.

Several equations are available for predicting BF% from skinfold thicknesses for athletes in North America and Europe [18, 20, 21, 22], but their application to Asian athletes is uncertain due to racial differences in body shape and fat distribution [23]. Specifically, Asians have greater upper-body fat and more prominent abdominal adiposity than Caucasians [23, 24]. A predictive equation based on a general population sample of Asian Americans [6] might not apply to Asians living in Asia, because dietary and other environmental factors may influence body fat deposition and hence the predictive power of the equation [9]. Furthermore, a predictive equation based on a sample of Chinese in Southeast Asia [9] might not generalize to Asian athletes because the sample participants were older and had higher BF% than most athletes.

Figure 6 compares the association between the observed (reference) BF% values and the BF% values predicted by our proposed equation and by the Davidson equation. The MSE of the proposed equation (MSE = 12.03) is smaller than that of the Davidson equation (MSE = 21.67), which indicates that the predictions of our proposed equation deviate less from the reference than those of the Davidson equation. Moreover, the regression line between the predictions of our proposed equation and the reference lies closer to the diagonal than that of the Davidson equation, which also suggests closer agreement. Further comparisons between our proposed equation and the Davidson equation can be found in Table 3. The predictions of our proposed equation have a narrower range of absolute percentage change and a smaller mean absolute percentage change than those of the Davidson equation. The proposed equation also has a much higher adjusted R² than the Davidson equation. In summary, our proposed equation performed better than the Davidson equation in predicting BF% values for Asian athletes. Similarly, we compared the observed (reference) DXA-measured BF% values with the BF% values predicted by our proposed equation and by the Henry equation [9]. The MSE of our proposed equation is also smaller than that of the Henry equation (MSE = 18.156). Additionally, the proposed equation had a higher adjusted R² (adjusted R² = 0.65) than the Henry equation (adjusted R² = 0.46). Further comparisons between the proposed equation and the Henry equation can be found in Table 3, all of which suggest that our proposed equation predicted the BF% of the Asian athletes more accurately than the Henry equation.

Fig. 6

Predicted values of BF% against observed reference BF% (DXA). a Proposed equation of this current study. b Asian-specific equation by Davidson et al. [6]. Our proposed equation fits the test data better than the Asian-specific equation by Davidson et al. MSE = \(\Sigma\)(reference BF% – predicted BF%)²/sample size

In the absence of DXA scanning, estimated BF% derived from predictive equations can help sport scientists manage athletes’ overall health and wellness. Information on fat-free mass derived from BF% is also useful for interpreting data on resting metabolic rate when assessing energy availability [20].

Equations derived from a specific athletic population, such as elite football players, predict BF% better in athletes than equations based on general population samples [18]. It is unclear, however, whether equations based on athletes competing in one sport are applicable to athletes competing in other sports. The athletes underlying our equation competed in a wide variety of sports and their diverse ethnicities imply that this equation may be applicable to Chinese, Malay, and Indian athletes at least.

Considering the prominent abdominal adiposity of Asians, our predictive equation included the abdominal skinfold thickness and waist circumference. In our study, the abdominal skinfold measure had a highly significant association with body fat (P value < 0.0001). The quadratic transformation of waist circumference was also significantly associated with body fat (P value = 0.024). Finally, we observed a significant association between front thigh skinfold thickness and body fat.

DXA measurements are affected by technical variations, such as machine models, and by biological variations, including recent exercise, recent meal ingestion, and level of hydration [16]. Because the main purpose of the vitamin D study (training sample) was to assess bone mineral density, we did not require the athletes to avoid exercise and meals before scanning. Additionally, data for this study were pooled from two separate studies, so different sport science practitioners measured the skinfold thicknesses of study participants. However, all sport science practitioners who measured skinfolds were ISAK-certified. The small and homogeneous validation sample was also a limitation of the study. Pooling additional data from Asian athletes may improve the accuracy of the predictive equation, given the relatively small sample in this study.

Conclusion

We developed a predictive equation to estimate BF% from skinfold thicknesses and waist circumference in Asian athletes. The development of such equations may help practitioners estimate BF% of Asian athletes when accepted options such as DXA are unavailable. In future research, it may be possible to pool data from additional Asian athletes and from a greater range of sports to improve the accuracy and precision of the equation.