Introduction

Obesity is a global public health crisis1 associated with increased risks of coronary artery disease, stroke, cancer and premature mortality.2, 3 Recognizing obesity as a disease in the clinical care setting is critical to the management of such patients.4 Accurate diagnosis of obesity is an essential first step in delivering effective treatment to the older adults most at risk.

Body mass index (BMI) is the most common method to diagnose obesity in primary care and subspecialty settings. Population-based studies have demonstrated the metabolic consequences of having a BMI ⩾25 kg m−2 and the mortality risk of a BMI ⩾30 kg m−2.2, 3 These cutoffs have been incorporated in public health campaigns and have become common practice. Other anthropometric measures have been suggested for use, including waist circumference (WC), as they additionally identify individuals at high overall cardiometabolic risk, independent of BMI.5 However, recent national guidelines have not fully endorsed their use.4

While BMI may reasonably predict adverse outcomes in global population-based adult studies, recent studies have demonstrated that traditional BMI cutoffs may, in fact, misrepresent the degree of adverse outcomes in older populations.5, 6 This is partly explained by the changes in body composition that occur with aging,7 including the gradual increase in fat mass, the decrease in muscle mass and quality (sarcopenia), and the degree of underlying systemic inflammation. Identifying the predictive validity and diagnostic accuracy of BMI in this older sub-population is critically important to provide reasonable recommendations to front-line clinicians. The purpose of this study was to determine the diagnostic performance of BMI to identify obesity based on body fat in older adults using established cutoffs for overweight and obesity. We also determined the differences in underlying metabolic abnormalities among those with varying degrees of body fat content, as assessed by body composition measurements, who were not otherwise classified as having obesity.

Materials and methods

The National Health and Nutrition Examination Surveys are cross-sectional surveys conducted by the Centers for Disease Control and Prevention since 1971. The survey samples non-institutionalized adults of the United States and oversamples minorities and elderly adults. It uses a complex stratified multistage probability sampling design that allows the results to be generalized to the rest of the population. All of the survey contents and procedures are available online at http://www.cdc.gov/nchs/nhanes.htm (accessed February 2015). Data for this analysis were limited to the 1999–2004 data sets. The survey has been approved by an internal Institutional Review Board, and was exempt from local review because of the deidentified nature of the results.

Of the 38 077 total participants screened, 31 125 were interviewed, and 29 402 were examined in a standardized mobile examination center. We limited our analysis to those aged 60 years and older as the relationship between obesity and BMI is less clear in an elderly population. In the cohort aged ⩾60 years, 7729 were screened, 5607 (72.5%) were interviewed and 4984 (64.5%) were examined. All subjects included in our analysis had body composition data obtained by dual-energy X-ray absorptiometry (DEXA). The 4984 participants fulfilling these criteria were classified by race (non-Hispanic White, non-Hispanic Black, Hispanic and Other), and by age group, where applicable (60–69.9, 70–79.9 and ⩾80 years). All baseline demographic characteristics were assessed using a self-report questionnaire.

Measurements were all performed on the right side of the body to the nearest 0.1 cm, except where casts, amputations and other factors prevented such assessment. Height was measured using a stadiometer after deep inhalation, and weight was measured using an electronic digital scale, calibrated in kilograms. Body mass index was calculated as weight (kg) divided by height (m) squared. WC was measured in the standing position at the iliac crest, crossing the mid-axillary line, with the measuring tape placed around the trunk. Blood pressure was measured in the mobile examination center by a trained examiner following the latest recommendations of the American Heart Association Human Blood Pressure Determination by a mercury sphygmomanometer.8 Determinations were recorded directly onto a computerized data collection form and the blood pressure reported to the examinee is that reported in this study. All DEXA data were obtained using a QDR-4500, Hologic scanner (Hologic, Bedford, MA, USA) by trained technicians. The procedure lasted roughly 3 min. DEXA exclusions consisted of subjects who were ⩾192.5 cm or weighed ⩾136.4 kg in this subgroup. Metal objects, except false dentition and hearing aids, were removed. Overall fat mass, muscle mass, bone measurements, appendicular skeletal muscle mass of all limbs and bone mineral content were assessed. Total body fat percent and lean mass percent were subsequently calculated. These techniques were similar in all NHANES (National Health and Nutrition Examination Survey) cycles.
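The BMI calculation and the DEXA exclusion limits described above can be sketched as follows. This is an illustrative sketch only, not the survey's or the authors' code; the function names are hypothetical, and it assumes height is recorded in centimetres and weight in kilograms.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Return BMI in kg/m^2: weight (kg) divided by height (m) squared."""
    if weight_kg <= 0 or height_cm <= 0:
        raise ValueError("weight and height must be positive")
    height_m = height_cm / 100.0  # convert centimetres to metres
    return weight_kg / (height_m ** 2)


def eligible_for_dexa(weight_kg: float, height_cm: float) -> bool:
    """Apply the exclusion limits stated in the text.

    Subjects who were >=192.5 cm tall or weighed >=136.4 kg were excluded.
    (Hypothetical helper; other exclusion reasons are not modelled here.)
    """
    return height_cm < 192.5 and weight_kg < 136.4
```

For example, a subject weighing 81 kg with a height of 180 cm has a BMI of about 25 kg m−2 and falls within both DEXA limits.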

Detailed specimen collection and processing instructions are discussed in the NHANES Laboratory/Medical Technologists Procedures Manual located on the NHANES website (http://www.cdc.gov/nhanes). Vials were stored under appropriate frozen (−20 °C) conditions until they were shipped for testing. Non-fasting routine biochemistries, including glucose and triglycerides, were performed with a Hitachi Model 704 multichannel analyzer (Boehringer Mannheim Diagnostics, Indianapolis, IN, USA). Specimens for total cholesterol, high-density lipoprotein, triglycerides and low-density lipoprotein cholesterol were shipped to Johns Hopkins University Lipoprotein Analytical Laboratory (Baltimore, MD, USA) for testing. Blood specimens for fasting glucose and insulin were processed, stored and shipped to the University of Missouri-Columbia (Columbia, MO, USA) for analysis, and C-reactive protein was measured at the University of Washington (Seattle, WA, USA). The homeostasis model assessment-1 was applied using published equations to estimate insulin resistance and β-cell function.9 HOMA-IR (Homeostasis Model Assessment-insulin resistance) was calculated as: (fasting insulin × fasting glucose (mg dl−1))/405. HOMA-B was calculated as (360 × fasting insulin)/(fasting glucose − 63), expressed as a percentage.
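The two HOMA equations above can be written as a short sketch. The function names are hypothetical, and the insulin unit (μU ml−1) is an assumption, since the text states units only for glucose; the constants 405, 360 and 63 are taken directly from the equations above.

```python
def homa_ir(fasting_insulin: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-IR = (fasting insulin x fasting glucose) / 405.

    Assumption: insulin in microU/ml (not stated in the text); glucose in mg/dl.
    """
    return (fasting_insulin * fasting_glucose_mg_dl) / 405.0


def homa_b(fasting_insulin: float, fasting_glucose_mg_dl: float) -> float:
    """HOMA-B (%) = (360 x fasting insulin) / (fasting glucose - 63)."""
    if fasting_glucose_mg_dl <= 63.0:
        # The equation is undefined at 63 mg/dl and negative below it.
        raise ValueError("fasting glucose must exceed 63 mg/dl")
    return (360.0 * fasting_insulin) / (fasting_glucose_mg_dl - 63.0)
```

Under these assumptions, a fasting insulin of 10 with a fasting glucose of 81 mg dl−1 gives a HOMA-IR of 2.0, and the same insulin with a glucose of 99 mg dl−1 gives a HOMA-B of 100%.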

Statistical analysis

All data were merged and analyzed according to the policies and procedures outlined by NHANES. Baseline characteristics are presented as weighted means with standard errors for all continuous variables, and weighted percentages for categorical determinations. Because of the known differences in body composition,7 baseline characteristics were stratified by sex. DEXA-derived body fat percent was considered the gold standard against which the diagnostic performance of BMI was determined. Obesity based on fat content measured with DEXA was defined as body fat ⩾25% for men and ⩾35% for women,10 based on values recommended by the American Association of Clinical Endocrinologists and those used in our previous studies.5, 6, 11 Subjects were also classified according to standard BMI cutoffs of 25 and 30 kg m−2, representing overweight and obesity.
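The two classification rules described above (the sex-specific body fat reference standard and the BMI cutoffs) can be illustrated with a minimal sketch. The names, the sex labels and the category strings are hypothetical choices for illustration, not the authors' code.

```python
# Sex-specific body fat cutoffs (%) for DEXA-defined obesity, as stated in the text.
BODY_FAT_CUTOFF = {"male": 25.0, "female": 35.0}


def obese_by_body_fat(body_fat_pct: float, sex: str) -> bool:
    """Reference standard: body fat >=25% (men) or >=35% (women)."""
    if sex not in BODY_FAT_CUTOFF:
        raise ValueError("sex must be 'male' or 'female'")
    return body_fat_pct >= BODY_FAT_CUTOFF[sex]


def bmi_category(bmi: float) -> str:
    """Index test: standard BMI cutoffs of 25 and 30 kg/m^2."""
    if bmi >= 30.0:
        return "obesity"
    if bmi >= 25.0:
        return "overweight"
    return "below 25"
```

A man with a BMI of 27 kg m−2 and 30% body fat would be labelled "overweight" by BMI yet classified as having obesity by the body fat standard, which is the kind of discordance this analysis quantifies.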

Diagnostic performance was assessed by determining sensitivity, specificity, positive and negative predictive values, and positive and negative likelihood ratios. Receiver operating characteristic curves were constructed for BMI to detect BF%-defined obesity for all subjects, separately by sex and ethnicity. We additionally report the distribution of individuals with a normal BMI but elevated body fat, and the differences in metabolic variables among subjects with a BMI <30 kg m−2 who fell above or below the sex-specific cut-points of body fat. We separately present cumulative distribution functions of percent body fat by BMI cutoff (30 kg m−2) in both sexes. T-tests assuming unequal variances compared metabolic variables between these two groups in each sex. Replicates and data review were performed for quality assurance. All analyses were conducted using STATA v.13 (STATA, College Station, TX, USA) accounting for strata, primary sampling unit and weighting. Separate weights were used for the fasting morning subsample. Interview weights were used according to NHANES procedures to account for the unequal probabilities of selection, participant non-response, non-response to the in-home interview and mobile center examination, and were also poststratified to match estimates of the US non-institutionalized adult population. A P-value <0.05 was considered statistically significant and Bonferroni multiple comparison adjustments were performed when necessary.
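The diagnostic performance measures listed above follow from a 2×2 table of BMI classification against body fat-defined obesity. The sketch below uses their standard definitions on unweighted counts; it is an illustration only, with hypothetical names, and does not reproduce the survey-weighted STATA analysis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticPerformance:
    sensitivity: float
    specificity: float
    ppv: float          # positive predictive value
    npv: float          # negative predictive value
    lr_positive: float  # positive likelihood ratio
    lr_negative: float  # negative likelihood ratio


def diagnostic_performance(tp: int, fp: int, fn: int, tn: int) -> DiagnosticPerformance:
    """Compute standard accuracy measures from unweighted 2x2 counts.

    tp/fn: reference-positive subjects above/below the BMI cutoff.
    fp/tn: reference-negative subjects above/below the BMI cutoff.
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Likelihood ratios are unbounded when the denominator is zero.
    lr_pos = sensitivity / (1 - specificity) if specificity < 1 else float("inf")
    lr_neg = (1 - sensitivity) / specificity if specificity > 0 else float("inf")
    return DiagnosticPerformance(
        sensitivity=sensitivity,
        specificity=specificity,
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
        lr_positive=lr_pos,
        lr_negative=lr_neg,
    )
```

With made-up counts of 40 true positives, 10 false positives, 60 false negatives and 90 true negatives, the sketch returns a sensitivity of 0.40, a specificity of 0.90 and a positive likelihood ratio of about 4.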

Results

Our final data set consisted of 2531 men and 2453 women aged ⩾60 years, as indicated in Table 1. Mean BMI was 28.0 and 28.5 kg m−2 in men and women, respectively, with 28.9% and 34.3% of older adults classified as having obesity based on BMI. Based on body fat, 87.5% and 89.1% of men and women, respectively, were classified as having obesity. In those aged ⩾80 years, a BMI ⩾30 kg m−2 had a very low sensitivity, negative predictive value and concordance rate as compared with younger cohorts. Notably, both lean mass and appendicular skeletal mass were higher in men compared with women.

Table 1 Baseline characteristics: National Health and Nutrition Examination Surveys 1999–2004

Table 2A presents the diagnostic performance for a BMI of 25 and 30 kg m−2. As the cutoff for BMI increases from 25 to 30 kg m−2, the sensitivity drops and the specificity increases in both sexes. Correct classification of obesity drops markedly with age with both cutoffs, but is markedly lower using a BMI ⩾30 kg m−2. The ideal BMI to identify obesity in men and women is 24.91 and 24.1 kg m−2, respectively (Figure 1). Table 2B presents the diagnostic performance of standard WC cutoffs, and Figure 2 presents the receiver operator curves, with optimal thresholds of 97.6 and 87.4 cm in men and women, respectively. In Table 3, we present data on metabolic variables in the subset of subjects in each sex with a BMI <30 kg m−2 and a low WC, stratified by body fat. Across both non-fasting and fasting samples, a number of indicators suggest heterogeneity in cardiometabolic dysfunction among those with a BMI <30 kg m−2. These differences were not observed in women with a low WC. Cumulative distribution functions are presented in Figure 3. We present in Figure 4 the distribution and weighted prevalence of body fat in those with a BMI <30 kg m−2. The line designates the standard body fat cutoff for obesity (men: 25%; women: 35%). Last, Table 4 presents the adjusted correlation coefficients between BMI and body fat, lean mass and appendicular skeletal mass, both by sex and age group.
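The text does not state how the ideal thresholds were selected from the receiver operator curves. One common criterion, shown here purely as an assumption and not as the authors' method, is Youden's J statistic (sensitivity + specificity − 1), maximized over candidate cutoffs. The function name and the toy data in the usage note are hypothetical, and no survey weights are applied.

```python
from typing import Sequence, Tuple


def youden_optimal_cutoff(
    values: Sequence[float], has_condition: Sequence[bool]
) -> Tuple[float, float]:
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A subject is test-positive when value >= cutoff. Candidate cutoffs are
    the observed values; ties keep the lowest cutoff. Unweighted sketch.
    """
    if len(values) != len(has_condition):
        raise ValueError("inputs must have the same length")
    positives = sum(1 for c in has_condition if c)
    negatives = len(has_condition) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one subject with and one without the condition")

    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        tp = sum(1 for v, c in zip(values, has_condition) if c and v >= cutoff)
        tn = sum(1 for v, c in zip(values, has_condition) if not c and v < cutoff)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j
```

On six made-up BMI values (22, 23, 24, 26, 27, 31) where only the first two subjects lack body fat-defined obesity, the sketch selects a cutoff of 24 with J = 1.0. Other criteria, such as the point closest to the top-left corner of the curve, can give different thresholds.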

Table 2A Diagnostic performance for BMI using cutoffs of ⩾25 and ⩾30 kg m−2 by age group and sex
Figure 1

(a and b) Receiver operator curves for BMI for all subjects aged ⩾60 years in the National Health and Nutrition Examination Survey 1999–2004 sample included in this analysis to detect body fat percentage by sex. (a) Men and (b) women.

Table 2B Diagnostic performance for waist circumference for diagnosing obesity using sex-specific cutoffs by age group and sex
Figure 2

(a and b) Receiver operator curves for WC for all subjects aged ⩾60 years in the National Health and Nutrition Examination Survey 1999–2004 sample included in this analysis to detect body fat percentage by sex. (a) Men and (b) women.

Table 3 Metabolic differences based on body fat content among men and women with a BMI <30 kg m−2
Figure 3

(a and b) Cumulative distribution functions of percent body fat in men (a) and women (b) in subjects with a body mass index ⩾30 and <30 kg m−2. Vertical lines represent percent body fat cutoffs for men (25%) and women (35%).

Figure 4

Variations in percent body fat in men (a) and women (b) in subjects with a body mass index <30 kg m−2. Line represents body fat cutoffs for each sex (⩾25% in men; ⩾35% in women).

Table 4 Adjusted correlation coefficients between BMI, BF%, lean mass and ASM by sex and age group

Discussion

Our study highlights the challenges of using BMI, the most widely used and accepted method to diagnose obesity in the clinical care setting, particularly in older adults. Given the changes in body composition observed in this patient population, our data caution clinicians against relying solely on this anthropometric measure when counseling patients on reducing their weight and lowering their cardiovascular risk.

To our knowledge, this is the first analysis using nationally representative data to determine the diagnostic performance of BMI using DEXA as the gold standard and focusing specifically on older adults. Previous studies using DEXA data from NHANES have focused on those with and without physical limitations.12 These authors noted excellent specificity but poor sensitivity, in addition to considerable misclassification based on body fat percent. Flegal’s analysis using differing anthropometric indices, including BMI, WC, waist hip circumference and waist to stature ratio, focused predominantly on correlation coefficients and agreement between metrics,13 rather than on diagnostic accuracy using the methods applied in this analysis. Our group has explored this relationship previously14 in a systematic review that demonstrated BMI ⩾30 kg m−2 had an overall pooled sensitivity of 50% and specificity of 90%. A specific analysis that used bioelectrical impedance analysis (BIA) demonstrated changes in diagnostic accuracy using similar cutoffs, albeit in a general population.11 In that study, sensitivity of a BMI ⩾30 kg m−2 for obesity peaked in the 40–49.9 years age group at 44% in men and in the 50–59.9 years age group at 54% in women. This dropped to 27% and 43% in the 70–79 years age group, respectively. Specificity remained high in both sexes (>90%), although negative predictive value dropped with age from 70% and 69% in men and women in the 20–29 years age group to 51% in both sexes in the 70–79 years age group. For all subjects, area under the curve was 0.88 with an ideal BMI of 25.5 kg m−2 (sensitivity 83%, specificity 76%). However, BIA is highly inaccurate in older adults and can be influenced by food consumption, exercise, ethnicity and certain medical conditions. As body water content differs in older adults, this may also influence its precision and accuracy.
Others have used body plethysmography and have observed similar results.15 DEXA scanning does not have these limitations and has less bias than BIA,16 and in older adults could be a better modality for the ascertainment of body fat. Our results are similar to those of others who have demonstrated the poor discrimination of body fat % by BMI in populations with coronary artery disease.17 In an Australian study, one group suggested the importance of gender- and age-specific thresholds when using BMI to indicate adiposity.18 Notably, BMI is inaccurate in assessing adiposity even in pediatric populations.19

Our analysis shows that the diagnostic accuracy of BMI is markedly poor in both sexes with increased age, reflected by the lower concordance indices. The ideal cut-point for BMI in this population-based cohort is ~25 kg m−2 in both sexes, a cutoff markedly lower than the current criterion to diagnose obesity. A BMI of 25 kg m−2 is associated with the lowest mortality point in a number of longitudinal studies. In fact, our results, coupled with those linking a BMI ~26–27 kg m−2 with the lowest mortality, suggest that traditional BMI cutoffs are likely inaccurate and conceivably should be revisited. While the degree of correlation was satisfactory between BMI, body fat and measures of muscle mass, we believe that the interplay between muscle mass and fat is likely not only to impact the degree of functional capacity in older adults but also to obscure the adequacy of using BMI as a simple measure of adiposity. Additionally, the majority of subjects with a BMI >25 kg m−2 have obesity based on body fat.

We believe that while BMI has its shortcomings, it still may be a useful measure. For instance, in older adults, previous studies have predicted a direct relationship between obesity, disability and mortality.2, 20 Recent consensus statements from the Foundation for the National Institutes of Health Sarcopenia Project21 have indeed incorporated BMI in grip strength cutoffs for clinical identification of subjects at risk for weakness. However, BMI should be used with great caution in clinical practice for the diagnosis of obesity alone, as informed by our study findings. Two major initiatives rely on BMI in an older adult population, including the Physician Quality reporting measures22 and the Medicare Obesity Benefit.23 Our data not only highlight the limitations of this measure but also demonstrate that the majority of older people in the United States who have obesity based on body composition may otherwise be classified as not having obesity based on a BMI <30 kg m−2.5 We purposefully presented discordant cases in Table 3 to determine the difference in metabolic profiles in those with a non-obese BMI but different body fat composition. This subset analysis shows that subjects with obesity mischaracterized by BMI (BMI <30 kg m−2) have differing metabolic profiles and that there may be sex-specific differences based on central obesity. Additionally, there are certain populations where an elevated BMI may lead to improved outcomes, a phenomenon known as the obesity paradox.24 Strongly encouraging the sole use of BMI in practice-based settings may target inappropriate populations or outcomes, and other measures including WC should be considered. In older adults, adiposity is localized centrally, and WC may be a possible alternative for adiposity assessment.
While DEXA may be widely available for measurement of body composition, in the United States it is not a reimbursable procedure for this indication; hence the need to consider alternative anthropometrics. Physical function and quality of life are important patient-specific outcomes in older adults, and primary care obesity interventions in this population should shift their targeted outcomes from weight or BMI to such measures, as advocated by others.23, 25

As with any cross-sectional study, we acknowledge the intrinsic methodological limitations of NHANES. While there is oversampling of older adults, we are limited by the number of subjects in the older age categories. Additionally, our results can only be extrapolated to community-based adults, and not institutionalized adults. While body fat % is considered the gold standard in defining obesity, the cutoffs appear to be arbitrary. Although other authors have repeatedly used these cutoffs and inadvertently referred to the 1995 WHO Technical Report,26 there remains no scientific rationale for using such cutoffs other than expert opinion.27, 28, 29 Future validation threshold studies by age, gender and race are critically needed.

Categorizing a continuous variable has two disadvantages: study power is lost, and values slightly above the threshold may carry only incremental and modest long-term risk, potentially resulting in overdiagnosis.30 Misclassification is possible as well, and this has implications for public health in the identification and management of higher risk populations. While DEXA scanning is a reasonably inexpensive modality to routinely assess body composition, it is performed for unrelated clinical indications and not for this sole purpose. Future research and advocacy for its use could provide more accurate assessments of obesity status compared with present anthropometric measures. The accuracy of DEXA in older adults is superior to that of bioelectrical impedance, in that the latter may underestimate truncal obesity and is highly dependent on water content, making it suboptimal for use in older adults.

Using DEXA-based body composition measures, our study confirms that BMI suboptimally identifies adiposity. While gold-standard methods such as CT and MRI may provide accurate whole-body and regional assessment of fat and muscle,31 these are clinically impractical and costly for routine assessment. We suggest that accurate measurements of adiposity be considered using DEXA in older adults, particularly when this test is performed for other indications, such as osteoporosis screening or monitoring. This could mitigate the challenges observed with using BMI as a clinical tool and its lack of diagnostic accuracy. Future studies should evaluate the added cost burden compared with the information that this modality can provide to a clinician.