Introduction

Breast cancer death rates in the US population have declined over the past 20–30 years due to advancements in early detection and treatment [1]. Yet with this growing population of patients who have been treated for breast cancer comes a growing population at risk for treatment-related complications. Breast cancer-related lymphedema (BCRL), a condition that results from disruption of the lymphatic system by breast cancer treatment, is among the most feared of these negative sequelae [2,3,4,5]. While BCRL may manifest initially as transient swelling in the ipsilateral arm, breast, or trunk, it can progress chronically to irreversible fibrosis and interstitial hypertrophy [6, 7]. BCRL further compromises quality-of-life in breast cancer survivors via associated symptoms of pain, heaviness, disfigurement, and functional impairment, and is associated with increased rates of infection and lymphedema-related hospitalizations [2, 8, 9]. While long-term follow-up studies are limited, the risk of BCRL should for now be considered lifelong [10, 11].

The recognition that BCRL is associated with significant post-treatment morbidity has led to increasing attention toward early identification and intervention. Unfortunately, the body of literature on BCRL is hampered by inconsistency in the measurement methods and diagnostic criteria for BCRL. Objective measures of BCRL include bioimpedance spectroscopy (BIS), perometry, circumferential tape measurement, and water volumetry, all of which present distinct sets of advantages and disadvantages [12]. Multiple organizations have released guidelines that address diagnostic modalities for the early detection of BCRL [13]. While newer modalities of BIS and perometry have received increasing emphasis in the most recent iterations of best practice guidelines, in clinical practice many centers continue to use the more economical and accessible alternative of circumferential tape measurement [14,15,16,17]. This study will focus in particular on a comparison between methods of objective volumetric assessment. BIS will not be addressed here due to lack of interchangeability with volumetric techniques, though this modality has received favorable assessments within a number of best practice guidelines [14, 16]. Notably, existing best practice guidelines do not recommend a particular screening strategy, reiterating only the importance of preoperative baseline measurement and regular surveillance of both arms [18,19,20].

Among the most frequently utilized volumetric measurement instruments are perometry and circumferential tape measurement with volumetric conversion. Perometry utilizes a frame of infrared light beam-receiver pairs to measure limb outline with sub-centimeter definition and thus derive limb volume by the disc model method [21]. It benefits from being quick, accurate, and highly reproducible, but comes at a high up-front cost. In contrast, tape measurement is inexpensive and easily accessible, but is time-consuming with inconsistent inter- and intra-rater reliability [12]. Moreover, there are numerous suggestions for how circumferential tape measurement should be performed. The National Lymphedema Network (NLN) recommends a minimum of six circumference measurements on both arms: mid-hand, wrist, elbow, upper arm just below the axilla, and 10-cm distal to and proximal to the lateral epicondyle [15]. The International Society for Lymphology recommends measurements at 4-cm intervals from the ulnar styloid (wrist) to the axilla [22]. Measurement strategies in independent clinical studies vary further still. While tape measurement has historically been used with an absolute circumference increase of 2-cm as the criterion for lymphedema, the accuracy of this measure is questionable [10, 23]. With multiple circumferential measurements, the frustum model (sum of multiple truncated cones) can be used to calculate total arm volume, which can then be compared directly to perometry.

Individual institutions are inevitably forced to weigh trade-offs among cost, efficiency, and accuracy when deciding to implement a screening program for BCRL. Measurement techniques are not interchangeable, and existing practice exhibits extensive inconsistency. Thus, the goal of this study was to use a novel method of simulated circumferential measurement to compare perometry to volumetric tape measurement for the detection of BCRL, utilizing both anatomic landmark- and interval-based techniques.

Materials and methods

Study design

Lymphedema screening program

From 2005 to 2017 at our institution, optoelectronic perometry was used to prospectively screen women with new diagnoses of breast cancer for lymphedema, with approval of the Partners Healthcare Institutional Review Board. This protocol for lymphedema screening is the standard-of-care at our institution [24]. All patients underwent preoperative baseline arm volume measurement as well as postoperative follow-up measurements at regular intervals of approximately 3–8 months.

Simulated tape and volumetric measurements

The perometer utilizes an array of optoelectronic infrared transmitters and lamp light-receiver pairs to calculate limb volume [21]. For each measurement, custom-modified PeroPlus 2000 software (Pero-system Messgeräte GmbH, Germany) was used to extract a pair of arm diameter measurements at each 4.7-mm segment. MATLAB 8.0 (The MathWorks, Inc., Natick, MA, USA) was used to smooth and visually extract anatomic landmark data, and then calculate arm segment volumes. Consistent with the perometer software, perometry volumes were calculated as the sum of averaged cylinders every 4.7-mm. Simulated tape measurement was performed as follows (Fig. 1): For the 4-cm intervals method, measurements were sampled every 4-cm beginning at the wrist and extending to the axilla (4-cm Intervals) [22]. An end correction was applied to account for the last segment of less than 4-cm (4-cm Intervals, End Correction). For the landmarks method, measurements were sampled at the wrist, elbow, and axilla plus two additional points, located either 10-cm distal to and 10-cm proximal to the elbow (Landmarks) or halfway between the wrist and elbow and halfway between the elbow and axilla (Landmarks, Midpoint) [15]. Volumes were then calculated by the sum of truncated cones (frustum) model for each of these methods. The volume of the hand was excluded given our experience of perometer inconsistency with this measurement and poor anatomic approximation of a frustum model [25].

Fig. 1

Common methods of breast cancer-related lymphedema (BCRL) quantification: a Landmarks method; b Landmarks, midpoint method; c 4-cm intervals method, noting that the last increment nearest to the axilla is less than 4-cm (including this defines the 4-cm intervals, end correction method); d perometry method
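The two volume models described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's MATLAB code: the function names, the example measurement sites and circumferences, and the choice to treat each perometer slice as a cylinder whose diameter is the mean of the two measured diameters are all assumptions made here for illustration. The frustum formula itself is the standard volume of a truncated cone expressed in terms of its two end circumferences.

```python
import math

def frustum_volume(c1, c2, h):
    """Volume (cm^3 = mL) of a truncated cone with end circumferences
    c1 and c2 (cm) separated by length h (cm)."""
    return h * (c1 ** 2 + c1 * c2 + c2 ** 2) / (12.0 * math.pi)

def tape_volume(positions_cm, circumferences_cm):
    """Sum of frusta between consecutive measurement sites.
    positions_cm are distances from the wrist, in increasing order."""
    sites = list(zip(positions_cm, circumferences_cm))
    total = 0.0
    for (x0, c0), (x1, c1) in zip(sites, sites[1:]):
        total += frustum_volume(c0, c1, x1 - x0)
    return total

def disc_volume(diameter_pairs_cm, slice_cm=0.47):
    """Sum of thin cylinders, one per slice. ASSUMPTION: each slice uses
    the mean of its two measured diameters as the cylinder diameter."""
    total = 0.0
    for d1, d2 in diameter_pairs_cm:
        radius = (d1 + d2) / 4.0  # mean diameter divided by two
        total += math.pi * radius ** 2 * slice_cm
    return total

# Hypothetical landmark-style measurement: wrist, 10 cm distal to elbow,
# elbow, 10 cm proximal to elbow, axilla (all values invented).
positions = [0.0, 15.0, 25.0, 35.0, 45.0]
circumferences = [16.0, 24.0, 26.0, 30.0, 33.0]
print(f"Frustum-model arm volume: {tape_volume(positions, circumferences):.0f} mL")
```

For a perfectly cylindrical limb the two models return the same volume, so any disagreement between them reflects how well straight-sided frusta between sparse sampling sites approximate the true limb contour.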

Quantifying arm volume changes

The previously validated relative volume change (RVC) formula was used to quantitatively determine the percentage arm volume change compared with preoperative baseline [RVC = (A2 × U1)/(U2 × A1) − 1] [26]. A1 is the preoperative and A2 the postoperative at-risk arm volume, while U1 and U2 are the analogous volumes on the contralateral side. Importantly, the RVC formula accounts for preoperative baseline as well as temporal changes in size of the at-risk and contralateral arms [18]. Clinically apparent lymphedema was defined as an RVC ≥ 10%, while low volume lymphedema was defined as an RVC of 5–10%, both occurring > 3 months after surgery [27].
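As a worked illustration of the RVC formula and thresholds, the following Python sketch computes RVC from four arm volumes and assigns a category. The function names, category labels, and example volumes are hypothetical, and the handling of values exactly at 5% (treated here as low volume) is an assumption, since the text gives only the ranges.

```python
def relative_volume_change(a1, a2, u1, u2):
    """RVC = (A2 * U1) / (U2 * A1) - 1.
    a1, a2: pre- and postoperative at-risk arm volumes.
    u1, u2: pre- and postoperative contralateral arm volumes."""
    return (a2 * u1) / (u2 * a1) - 1.0

def classify_rvc(rvc, months_since_surgery):
    """Apply the study's thresholds; only measurements taken more than
    3 months after surgery are classified."""
    if months_since_surgery <= 3:
        return "not assessed (within 3 months of surgery)"
    if rvc >= 0.10:
        return "clinically apparent"
    if rvc >= 0.05:  # ASSUMPTION: the lower bound is inclusive
        return "low volume"
    return "below threshold"

# Hypothetical volumes in mL: the at-risk arm grows while the
# contralateral arm grows slightly less over the same period.
rvc = relative_volume_change(a1=2000.0, a2=2250.0, u1=1950.0, u2=2000.0)
print(f"RVC = {rvc:.1%}: {classify_rvc(rvc, months_since_surgery=12)}")
```

Because the contralateral arm appears in the formula, a patient whose arms both enlarge by the same proportion (for example through weight gain) has an RVC of zero.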

Patient population

This study included 287 female patients diagnosed with unilateral breast cancer who underwent comprehensive lymphedema screening at our institution. This cohort was randomly selected from our institutional database after stratification by BMI (< 25, 25–30, and > 30 kg/m²), in order to ensure analysis of all arm sizes. All patients had bilateral baseline perometer measurements and at least one postoperative measurement. Only measurements greater than 3 months after final surgery were used to avoid misclassifying transient postoperative swelling as BCRL [7]. Patients with bilateral surgery, local recurrence, or subsequent distant metastases were excluded. Demographic, clinicopathologic, and intervention-related characteristics were obtained via medical record review.

Statistical methods

Statistical analysis was conducted with R (Version 2.15, http://www.R-project.org). The full set of perometer measurements was treated as the reference technique for this analysis. The Bland–Altman method was used to compute the mean difference and 95% confidence interval between simulated tape measurement methods and perometry for the total arm, upper arm, and forearm. Two-by-two confusion matrix analysis was performed for each simulated tape measurement technique against perometry, first for the full cohort, then stratified by BMI. Time to detection of low volume and clinically apparent BCRL was compared across methods by boxplot methods.
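The Bland–Altman comparison can be sketched as follows. The study's analysis was run in R; this Python version is only an illustration, and the function name, the normal-approximation multiplier of 1.96, and the paired volumes are assumptions introduced here. It reports the mean difference between a test method and the reference, the conventional 95% limits of agreement (mean ± 1.96 SD of the differences), and an approximate 95% confidence interval for the mean difference.

```python
import math
import statistics

def bland_altman(test_values, reference_values):
    """Summarize agreement between paired measurements.
    Differences are computed as test minus reference."""
    diffs = [t - r for t, r in zip(test_values, reference_values)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation
    se_mean = sd_diff / math.sqrt(n)    # standard error of the mean difference
    return {
        "mean_diff": mean_diff,
        "limits_of_agreement": (mean_diff - 1.96 * sd_diff,
                                mean_diff + 1.96 * sd_diff),
        # ASSUMPTION: normal approximation; a t quantile is more
        # appropriate for small samples.
        "ci_mean_diff": (mean_diff - 1.96 * se_mean,
                         mean_diff + 1.96 * se_mean),
    }

# Hypothetical paired upper-arm volumes in mL (tape model vs. perometry).
tape = [1310.0, 1420.0, 1185.0, 1560.0, 1290.0]
perometry = [1500.0, 1640.0, 1390.0, 1750.0, 1510.0]
summary = bland_altman(tape, perometry)
print(f"Mean difference: {summary['mean_diff']:.1f} mL")
print("95% limits of agreement: "
      f"{summary['limits_of_agreement'][0]:.1f} to "
      f"{summary['limits_of_agreement'][1]:.1f} mL")
```

A systematic bias shows up here as a mean difference far from zero even when the two methods are almost perfectly correlated, which is why agreement statistics are reported alongside the correlation coefficient.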

Results

Patient population

Median postoperative follow-up time in this cohort of patients with unilateral breast surgery was 34.7 months (range 3.2–110.0 months), with a median of 4 postoperative visits greater than 3 months after final surgery (range 1–29). Median age at diagnosis was 56 years (range 27–85 years), with median BMI at diagnosis of 27.3 kg/m² (range 16.9–50.7 kg/m²). Full demographic and clinicopathologic characteristics of the cohort can be found in Table 1.

Table 1 Demographic and clinicopathologic characteristics of patient cohort (n = 287 unless otherwise indicated)

Segmental arm volumes

A total of 4350 distinct arm volumes for the 287-patient cohort were analyzed. The correlation coefficient (r) was consistently > 0.98 for all tape measurement methods when compared to perometry. For both the ipsilateral and contralateral arms, there was no significant difference in total arm volume between tape measurement methods and perometry (Fig. 2a, 95% confidence interval for limits of agreement inclusive of the null value). Compared to perometry, both anatomic landmarks methods significantly underestimated upper arm volume (Fig. 2b, mean difference − 207 mL [− 13.9%] for the ipsilateral upper arm and − 202 mL [− 13.8%] for the contralateral upper arm) and overestimated forearm volume (Fig. 2c, mean difference + 170 mL [+ 21.8%] for the ipsilateral forearm and + 170 mL [+ 22.1%] for the contralateral forearm). For the 72 patients who developed RVC ≥ 10% by perometry, the major contributor to arm volume change was the upper arm in 46/72 patients (63.9%), the forearm in 8/72 patients (11.1%), and both segments equally in the remaining 18/72 patients (25.0%).

Fig. 2

Mean volumes by different measurement methods for the total arm (a), upper arm only (b), and forearm only (c). There was no significant difference in total arm volume between methods; however, the use of either landmarks method significantly underestimated upper arm volume and overestimated forearm volume. CI confidence interval

Detection of low volume and clinically apparent BCRL

Each of the four simulated circumferential measurement methods (4-cm intervals; 4-cm intervals, end correction; Landmarks; Landmarks, midpoint) was evaluated against perometry as an objective screening test for BCRL. Both landmark-based methods had greater sensitivity (93.1 and 90.3% vs. 81.9 and 77.8%), specificity (93.5% vs. 68.4 and 92.6%), positive predictive value (82.7 and 82.3% vs. 46.5 and 77.8%), and negative predictive value (97.6 and 96.6% vs. 91.9 and 92.6%) for detection of RVC ≥ 10% when compared to both interval-based methods. The same generally held true for detection of RVC 5–10% (Table 2). The sensitivity for detecting low volume lymphedema (RVC 5–10%) was lower than for clinically apparent BCRL (RVC ≥ 10%): 16.0–66.7% versus 77.8–93.1%, respectively. BMI did not affect these confusion matrix analyses. For the true positive cases where volumetric tape measurement correctly identified patients with RVC ≥ 10% and RVC 5–10%, time to detection of either clinically apparent or low volume BCRL did not differ compared to perometry (Fig. 3).
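The four screening metrics above follow directly from a two-by-two confusion matrix, as the Python sketch below shows. The cell counts are not reported in the text: they are illustrative values chosen here to be consistent with the percentages quoted for one landmark-based method at RVC ≥ 10%, under the assumption of a per-patient analysis in which 72 of 287 patients are positive by perometry. The function name is likewise hypothetical.

```python
def screening_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV, and NPV (as percentages) of a test
    method, with the reference method defining the true status."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "ppv": 100.0 * tp / (tp + fp),
        "npv": 100.0 * tn / (tn + fn),
    }

# ASSUMPTION: illustrative counts, not taken from the source.
# 72 reference positives (tp + fn) and 215 reference negatives (fp + tn).
metrics = screening_metrics(tp=67, fp=14, fn=5, tn=201)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

With these counts the sketch gives a sensitivity of 93.1%, specificity of 93.5%, positive predictive value of 82.7%, and negative predictive value of 97.6%. Predictive values depend on how common BCRL is in the screened population, so they would not transfer directly to a cohort with a different prevalence.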

Table 2 Sensitivity, specificity, and predictive value analyses compared to perometry (%)
Fig. 3

Time to detection of relative volume change (RVC) ≥ 10% (a) or 5–10% (b)

Discussion

This study compared both simulated anatomic landmark- and interval-based volumetric circumferential measurement techniques against perometry for the detection of BCRL after breast cancer treatment. There is remarkable heterogeneity in the literature for the definition of lymphedema, even when incorporation of objective diagnostic criteria can be agreed upon. While screening and measurement guidelines described in the National Lymphedema Network position statement recommend consistent pre- and post-treatment measurement of both arms, the criteria for treatment referral are left to individual institutions [15]. There are currently no robust head-to-head trials validating one technique against another, though a number of small prospective trials are ongoing, including at our institution [28,29,30,31]. In the context of increased vigilance for identifying low volume lymphedema and emphasis on early intervention, standardized quantification is an idealized goal; in the presence of practical barriers, an understanding of the comparative accuracy of common quantification methods is of minimum necessity.

Regarding a direct comparison of arm volumes between methods, we demonstrated that a high correlation coefficient is a deceptive metric across a cohort of patients with multiple repeated arm volume measurements. Had we stopped at this point, the subsequent finding of segmental discrepancies would have been overlooked. The broad misuse of correlation as a proxy for similarity in medical research is well documented, including in lymphedema research [32,33,34]. Though there was no significant difference between total arm volumes of either the ipsilateral or contralateral side, we showed that anatomic landmark-based methods underestimate volume of the upper arm and overestimate volume of the forearm. The absolute magnitude of this mean difference was greater than 200 mL for both the ipsilateral and contralateral upper arm. Notably, the threshold of 200 mL is an oft-quoted yet erroneous absolute volume threshold used to define lymphedema [10, 35,36,37,38,39]. These findings are particularly concerning for the many patients with edema localized to or most prominent in an isolated arm segment. Furthermore, of the 72 patients in our cohort who developed RVC ≥ 10% by perometry, only one quarter had edema distributed throughout the whole arm. The major contribution to RVC was the upper arm for nearly two-thirds of these patients. This pair of findings raises concerns about detection of localized lymphedema with landmark-based circumferential methods, given systematic upper arm underestimation. More generally, segmental lymphedema and tissue composition are areas of recent interest within BCRL. Stout et al. examined a cohort of 46 patients with subclinical lymphedema by perometry, showing that segmental volume changes of the forearm were correlated with total limb volume change before the formal diagnosis of subclinical BCRL [33]. Several small studies have utilized bioimpedance spectroscopy, dual energy X-ray absorptiometry, and ultrasound techniques to further explore this question [40,41,42].

Adriaenssens et al. compared perometry, water displacement, and volumetric circumferential measurements in a cohort of patients with varying degrees of BCRL, but limited their discussion to absolute volume differences [43]. To date, no previous study has compared perometry with volumetric circumferential measurement to assess accuracy of BCRL detection by a relative volume change formula, a criterion that incorporates both temporal baseline and asymmetry between ipsilateral and contralateral arms [18, 26]. In this retrospective analysis, we demonstrated that anatomic landmark-based circumferential measurement techniques were superior to 4-cm interval-based techniques at thresholds for both low volume (RVC 5–10%) and clinically apparent BCRL (RVC ≥ 10%) for this cohort of patients. This was true despite the smaller number of measurements required for the landmark-based techniques. Data from our group’s ongoing prospective study of BCRL measurement methods have shown that total measurement time is 22.4% greater for 4-cm interval-based than for landmark-based methods (unpublished data) [28]. The relative favorability of the method with fewer sampling points likely stems from the closer approximation of a frustum model based on anatomic landmarks compared to one based on arbitrary sampling intervals. Furthermore, while landmark-based methods performed well for detection of RVC ≥ 10%, their sensitivity was only 63.2–66.7% for RVC 5–10%. These findings are potentially practice changing for an institution limited to volumetric tape measurement techniques due to resource limitations.

Findings of this study must be considered in light of its limitations. First, the simulated nature of circumferential measurement in this study does not allow us to account for test–retest reliability and the human error of tape measurement [44]. Yet if anything, the method utilized in this study of extracting segmental circumference data from perometer exports represents an idealized version of guideline-recommended tape measurement strategies. Tidhar et al. have previously reported that the inter-rater reliability of volumetric circumferential measurement by trained physical therapists in lymphedema management was not acceptable for clinical practice, despite the high intra-rater reliability in that report [45]. Generalizability is most desirable in a clinical practice where a single provider may not be available to measure a given patient at every appointment. Therefore, the sensitivity and specificity parameters presented in this study represent the best-case scenarios for volumetric circumferential tape measurement techniques compared to perometry. Nonetheless, we consider this a hypothesis-generating study. Findings are meant to contextualize the body of existing research and influence prospective clinical trial design. Second, the exclusion of the hand is a fundamental limitation of current perometry methods. This is an area of limited research, though some patients seem to develop lymphedema that is disproportionately isolated to the hand [25]. Finally, the correlation of these findings with subjective symptom assessment and physical exam would serve to strengthen our conclusions, as combination methods are thought to improve accurate diagnosis of lymphedema [7, 12, 46].

The recognition of BCRL as an important complication of breast cancer treatment has led to multidisciplinary endorsement of screening programs that address the goals of early detection and intervention [12]. Despite these efforts, the inconsistent application of different objective BCRL quantification techniques makes it difficult for individual institutions and the field as a whole to draw conclusions. Given that circumferential tape measurement remains the most widely used method of limb volume assessment, the most reliable yet efficient strategy for tape measurement should be better defined and standardized [14]. This study suggested the superiority of landmark-based simulated circumferential measurement compared to interval-based techniques, a finding that has the potential to minimize the time providers spend measuring patients. At the same time, these findings generated questions about the underestimation of upper arm volume and overestimation of forearm volume compared to perometry, a significant finding for emerging concerns of segmental lymphedema demonstrated both here and in previous work. These questions warrant further research, specifically in the form of prospective screening trials that directly compare techniques of volumetric BCRL quantification.