INTRODUCTION

Poor health literacy is highly prevalent among individuals with diabetes and has been associated with worse glycemic control and increased diabetes-related complications.1 In particular, poor understanding of disease management targets, specifically glycemic control as expressed by the hemoglobin A1C (A1C), has been associated with worse glycemic control.2 , 3 Beyond the association with glycemic control, positive correlations have been demonstrated between better understanding of the A1C and higher scores for other diabetes outcomes, specifically dietary self-care, exercise self-efficacy, and blood sugar monitoring self-efficacy.2 These findings are concerning, given studies estimating that only around 25 % of patients understand the A1C value.2 , 4 While considerable data support the assertion that the A1C is confusing to patients, there is limited information about the sources of this confusion. A review of patient message boards raised some common issues: how an A1C can be “bad” if fingerstick blood glucose values are “good”, confusion about presentation as a percentage, and the non-intuitive assessment scale.5 , 6 Regardless of the specific barriers to understanding, we contend that there must be more effective ways to present information on diabetes control and disease management targets, and that improving this understanding can affect patients’ outcomes.

A 2008 study published in the British Medical Journal demonstrated the impact on outcomes of translating poorly understood medical values into more relevant and meaningful formats.7 Smokers were provided feedback on their lung function either in terms of FEV1 or “lung age.” Patients given their “lung age” had significantly higher rates of smoking cessation at the end of the study.7 A similar approach of using more intuitive metrics that may provide more effective feedback could be useful in communicating glycemic control to patients. For this study, two formats for communicating diabetes control were chosen for their familiarity: letter grades, A through F, and faces expressing representative emotions.

While the connection between efforts to improve understanding of diabetes control and subsequent improved disease outcomes may not be immediately apparent, we would argue that, as suggested by both the Health Belief Model of behavior change and standard economics, an increased sense of personal susceptibility and disease severity serves a key role in eventual behavior change.8 Other elements, such as adequate resources, self-efficacy, and ongoing reinforcement, are also needed. Still, depicting information on glycemic control in a more meaningful and provocative way could increase the personal significance of improving diabetes control and be a catalyst for eventual behavior changes.

In this randomized controlled trial involving patients with poorly controlled diabetes, we examined the effect of using new forms of feedback on diabetes control, letter grades and faces, on future glycemic control and general perceptions of diabetes and diabetes control. We hypothesized that providing feedback about current diabetes control in one of these new ways would not only have a greater impact on participants’ glycemic control, but would also differentially impact participants’ perceptions of their diabetes, particularly their assessment of current disease control. And, if there were changes in participants’ perceptions of diabetes control, changes in related perceptions, such as personal risk of diabetes-related complications, would be more likely.

METHODS

Study Design

The protocol was approved by the University of Pennsylvania institutional review board (IRB). The study’s funding source played no role in the study’s design, conduct, or reporting. Once consent was obtained, a research assistant randomized participants to one of three study arms by selection of one of 250 randomly ordered, sealed envelopes containing the assignment. A biostatistician generated the random numbers and created the envelopes. Randomization was not performed until eligibility was determined; however, blinding following randomization was not possible, given the nature of the intervention.

Study Population

Participants were recruited from urban, academic internal medicine practices. Potential participants were identified from the electronic medical record (E.M.R.) using the following filters: ICD code for diabetes and a recent A1C value greater than 8 %. We excluded children, pregnant women, and anyone with a documented cognitive deficit. Given American Diabetes Association (A.D.A.) guidelines stating that an A1C goal of 7 % may not be suitable for all patients, an A1C cutoff of greater than 8 % was chosen to ensure that an A1C decrease would likely be appropriate for all participants.9 Patients with type 1 diabetes were not specifically excluded. Providers were asked permission to enroll their eligible patients.

Study Outcomes

The primary study outcome was difference in A1C between baseline and 6 months. The 6-month A1C values were chosen over 12-month A1C values, given the low likelihood that this type of intervention would have an effect 12 months following the intervention if no effect was seen at 6 months. Secondary outcomes examined included changes in participant answers to four questions related to their perceptions of diabetes: current diabetes control, seriousness of the disease, their assessment of future personal risk of diabetes-related complications, and the value they placed on good diabetes control in preventing complications, which for the sake of brevity we will refer to collectively as diabetes perceptions.

Study Intervention

Participants were randomized to receive individualized information about their current glycemic control in the mail in one of three ways: (1) a “diabetes report card” containing letter grades ranging from A to F (grade arm); (2) a report card containing faces whose emotion reflected current glycemic control (face arm); or (3) a report card containing the actual A1C value (standard arm) (Fig. 1). The grade arm intervention design was based on our contention that most individuals, even those with low educational attainment, are familiar with the letter grade system of evaluation. While there is no evidence for this type of usage of letter grades in communicating personal health information to patients, the use of letter grades to express information related to health is not completely novel. Health inspectors in several large cities use a letter grade system to rate restaurant safety, and there has been recent publicity regarding the use of letter grades to rate hospital safety.10 , 11 Similarly, for the design of the face arm, while the use of faces to express personal health information has not been studied in this context, faces are widely used in other contexts, such as for rating levels of pain.12 They have also been used to provide feedback in non-medical contexts such as energy usage, where consumers using above-average amounts of energy who were shown a frowning face made larger usage cuts, while those shown a smiling face continued their low use.13

Figure 1 Example report cards.

For the grade and face study arms, the corresponding A1C values were determined based on A.D.A. recommendations. The best grade or happiest face was deemed equivalent to an A1C less than or equal to 7 %, while any value over 10 % was interpreted as a grade of “F” or as a crying face. Values between 7 % and 10 % were divided, and grades and faces reflecting level of control were assigned. Figure 1 shows examples of the “report cards” participants in each of the study arms received.
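The mapping from A1C to a letter grade can be illustrated with a minimal sketch. Only the endpoints are taken from the description above (an A1C of 7 % or less earns the best grade; any value over 10 % earns an “F”). The equal-width bands assigned to “B”, “C”, and “D”, and the function name `a1c_to_grade`, are assumptions made for illustration and may not match the cut points actually used on the report cards.

```python
def a1c_to_grade(a1c_percent: float) -> str:
    """Map an A1C value (in %) to a letter grade.

    From the text: A1C <= 7 % is the best grade, A1C > 10 % is an "F".
    Assumed for this sketch: the 7-10 % range is split into three
    equal one-point bands for "B", "C", and "D".
    """
    if a1c_percent <= 7.0:
        return "A"
    if a1c_percent > 10.0:
        return "F"
    if a1c_percent <= 8.0:   # hypothetical band: (7, 8]
        return "B"
    if a1c_percent <= 9.0:   # hypothetical band: (8, 9]
        return "C"
    return "D"               # hypothetical band: (9, 10]
```

Under these assumed bands, a participant at the reported mean baseline A1C of 9.9 % would receive a “D”.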

Data Collection

A telephone survey conducted at enrollment collected demographic information, as well as the participant’s personal diabetes history. We assessed participant numeracy, given our contention that accurate assessment of disease control requires interpretation of numerical values (A1C, blood sugars, etc.), as well as past evidence supporting the role of adequate numeracy in successful diabetes management.14 , 15 We used the Schwartz numeracy assessment tool, which comprises three questions assessing ability to handle basic numerical scenarios (e.g., “Imagine that we flip a fair coin 1,000 times. How many times…would it come up heads…?”).16 Given the lack of a brief, validated scale for measuring general perceptions of diabetes, the study team wrote the four survey questions regarding diabetes perceptions; answers to these questions were assessed on a 5-point Likert scale.

Following receipt of the “report card,” a post-intervention telephone survey was administered. The post-intervention survey, conducted on average 2–3 weeks following study enrollment, repeated the four questions on diabetes perceptions. Participants received $20 following completion of each survey. At 3 months post-enrollment, an updated “report card” was mailed. This “report card” corresponded to initial study arm enrollment; however, its content reflected the participant’s most recent A1C value. For patients who did not have a repeat value, the three-month “report card” reflected the enrollment A1C value. At 6 months following enrollment, A1Cs were collected using the E.M.R. With IRB approval, we also collected 12-month A1C values for all enrolled participants and tracked A1C data for eligible, non-enrolled individuals, who were not enrolled due to an inability to make phone contact, participation in another study, or achievement of needed sample size.

Statistical Analysis

The sample size for this study was chosen to provide 80 % power to detect a difference of 0.8 % between enrollment and 6-month A1C values. The effects of the most common interventions for diabetes, including lifestyle changes, metformin, sulfonylureas, and insulin show average A1C reductions of 1–2 %, 1.5 %, 1.5 %, and 1.5–2.5 %, respectively, while other medications show an A1C decrease less than 1 %.17 A difference of 0.8 % was chosen, as differences of this magnitude have been shown to be associated with clinical benefits18 , 19; in addition, the magnitude was less than the expected effect from a more intense intervention and thereby potentially achievable. Using an alpha value of 0.025 to account for the two primary comparisons, we targeted a sample size of 59 participants in each study arm, for a total of 177 enrolled participants. The study was powered to examine differences between each of the experimental arms and the standard arm.
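The sample size reasoning above can be sketched with the usual normal-approximation formula for comparing two means. The detectable difference (0.8 %), alpha (0.025), and power (80 %) come from the text; the standard deviation of A1C change is not reported in this section, so the value of 1.4 below is purely an assumption, as are the two-sided test and the function name `n_per_arm`.

```python
import math
from statistics import NormalDist


def n_per_arm(delta: float, sd: float, alpha: float = 0.025, power: float = 0.80) -> int:
    """Participants per arm for a two-sided, two-sample comparison of means
    (normal approximation): n = 2 * ((z_alpha/2 + z_beta) * sd / delta)^2."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


ASSUMED_SD = 1.4  # hypothetical SD of A1C change; not stated in the paper

print(n_per_arm(delta=0.8, sd=ASSUMED_SD))
```

With this assumed standard deviation the sketch returns 59 per arm, which agrees with the stated target; that agreement does not establish that the authors used this value or this formula.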

The data analysis was performed using Version 9.3 of the SAS System for Windows. For the primary study outcome, we analyzed data using multiple imputation to handle missing 6-month A1C values. The imputation model included the following participant factors: study arm, ethnicity, age, gender, education level, grades, past diabetes education course, income, diabetes duration, insulin use, history of complications, depression status, numeracy, and answers to diabetes perceptions survey questions. After imputing the data, we used ANOVA to test for differences in A1C change among the groups in each imputed data set, and then combined the results using standard formulae.20 We assessed for different rates of A1C follow-up between study arms and for any difference in baseline characteristics between participants who had a 6-month A1C and those who did not.
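The step of combining results across imputed data sets can be sketched as follows, assuming that the “standard formulae” are Rubin’s rules (the approach implemented by SAS PROC MIANALYZE). The function name `pool_rubin` and the input numbers are illustrative only and are not study data.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed data sets using Rubin's rules.

    estimates: point estimate from each imputed data set
    variances: squared standard error from each imputed data set
    Returns (pooled estimate, pooled standard error).
    """
    m = len(estimates)
    pooled = mean(estimates)                  # average of the m estimates
    within = mean(variances)                  # within-imputation variance
    between = variance(estimates)             # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between    # total variance
    return pooled, total ** 0.5


# Illustrative inputs only: five hypothetical estimates and their variances
estimate, std_error = pool_rubin(
    [-0.9, -0.8, -1.0, -0.85, -0.95],
    [0.09, 0.10, 0.08, 0.09, 0.09],
)
```

The between-imputation term is what widens the pooled standard error when a large share of values is imputed, as was the case for the 6-month A1C.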

We used Type III sums of squares (SS) tests to assess changes in diabetes perceptions following the study intervention. This allowed us to test for any aggregated group effect on outcomes, whether from face, grade, or standard group assignment, controlling for all other effects in the model. A more detailed description of the statistical methods used for this analysis is included in a technical note following the discussion. To test whether the change in perception differed between treatment arms, we introduced an indicator variable that described the period (pre-intervention or post-intervention) and focused on the interaction coefficient of the treatment effect (study arm) and this indicator. Standard errors were calculated robustly using the REPEATED statement in PROC GENMOD to account for the correlation among repeated observations from the same individual. All four models were adjusted for annual income, educational attainment, past diabetes education class, grades earned in school, and numeracy score. To test for balance of baseline covariates among treatment groups and between missing and non-missing participants, we used Pearson chi-squared or Fisher’s exact tests.

Three additional sensitivity analyses were performed. First, we compared the A1C distribution of the enrolled study population with that of the larger eligible, non-enrolled population, to ensure representativeness of the study population. Second, to assess for an intervention effect in the standard arm, we compared the change in A1C at 6 months of the participants randomized to the standard arm with that of the eligible, non-enrolled population. Last, instead of using multiple imputation to account for the missing participant 6-month A1C values, we analyzed the change in A1C between study arms using only the observed 6-month A1C values, with available 12-month A1C values substituted for any missing 6-month values.

RESULTS

Between May 2010 and November 2010, 177 participants enrolled in the study. Figure 2 shows the numbers of subjects who were identified using the E.M.R., agreed to study participation, completed each survey, and who had new A1C values at 6 months post-enrollment. The average baseline A1C of enrolled participants was 9.9 % (S.D. 1.7) (Table 1). The distribution of A1C values for enrolled participants was right-skewed, contributing to a higher than expected mean A1C. A similar A1C distribution was noted in the larger population of eligible, but non-enrolled, individuals. There was no statistically significant difference in A1C values at enrollment by study arm, gender, ethnicity, income, or educational attainment (Table 1). In this population of patients with poorly controlled diabetes, time since diagnosis and completion of a diabetes education course were also not associated with enrollment A1C values.

Figure 2 Assessment for eligibility, randomization, and follow-up.

Table 1 Baseline Characteristics of Participants (%, Unless Otherwise Noted as 95 % CI)

Approximately 50 % of 6-month A1C values were missing in each study arm (Fig. 2). There were no differences in the frequency of missing A1C values among arms (p = 0.88). We compared characteristics of participants with and without 6-month A1C values, including demographics, diabetes history, and enrollment A1C; these participants differed only in the percentage reporting insulin use (56 % for missing A1C vs. 72 % for non-missing A1C, p = 0.02). Given the many missing A1C values, the previously described multiple imputation method was employed to evaluate the primary outcome of change in A1C at 6 months. No significant differences in change in A1C values at 6 months existed between treatment groups; the change in A1C between enrollment and 6 months was -0.89 % (0.3), -0.55 % (0.3), and -0.74 % (0.37) for the face, grade, and standard arms, respectively (Table 2). Table 2 shows results from the multiple imputation, with results of significance tests shown in the last three columns. Performing the analysis using only subjects with observed 6-month A1C values (i.e., without multiple imputation) and using available 12-month A1Cs in place of missing 6-month values both yielded similar results. Apart from the random sampling difference due to unequal sample sizes, when considering only subjects with non-missing values, the distributions of the A1C changes at 6 months were similar for participants in the standard arm and the group of eligible, non-enrolled individuals. The t-test for equality of the means of these A1C differences was not significant (p = 0.17), suggesting there was no significant study effect on those in the standard arm.

Table 2 Change in A1C by Study Arm, Analyzed Using Multiple Imputation to Account for Missing Data. [Data from Five Imputed Data Sets (using MIANALYZE)]

In the pre-intervention survey, 23.7 % of participants described their current disease control as ‘good’ or ‘excellent’ in spite of an average A1C of 9.5 %, and 39.6 % reported ‘moderate’ control despite a mean A1C of 9.8 %. Notably, the majority of the group mistakenly describing their control as ‘good’ or ‘excellent’ reported having diabetes for over 5 years, and did not differ in their history of diabetes-related complications or educational attainment when compared to those with more accurate perceptions of current glycemic control. On the post-intervention survey, 21.8 % of participants described their control as ‘good’ or ‘excellent’ with a group mean A1C of 9.8 %, and 42 % described their control as ‘moderate’ with a mean A1C of 9.6 %. Twenty-one participants did not complete the post-intervention survey. There were no significant changes over time (pre-intervention vs. post-intervention) for any of the four perception outcomes (Table 3).

Table 3 Changes in Diabetes-Related Perceptions by Study Arm

Certain participant characteristics were found to be associated with diabetes perceptions, independent of study arm. When adjusting for educational attainment, diabetes education, reported grades, and numeracy, patients who reported past diabetes education had approximately 1.9 (1.09, 3.18) times higher odds of rating their diabetes control more negatively (i.e., more accurately), independent of the treatment received. These patients also had 11 (1.24, 96.5) times higher odds of placing greater value on good diabetes control as an important element in avoiding future complications. Patients with higher numeracy levels had 2.6 (1.23, 5.52) times higher odds of rating their current diabetes control as worse, and 6.25 (1.01, 36.6) times lower odds of devaluing the importance of good control.

DISCUSSION

In this randomized, controlled trial of adults with poorly controlled diabetes, replacing the poorly understood hemoglobin A1C with feedback in the form of letter grades and faces showing different emotions had no differential effect on glycemic control at 6 months. The intervention also had no noted effects on participants’ perceptions of their diabetes, specifically their assessment of current disease control. We did note high levels of misunderstanding of current diabetes control, with the majority of participants (71 %) describing their current control as ‘moderate’, ‘good’, or ‘excellent’ in spite of an average A1C greater than 9.5 %. This baseline misunderstanding could not be explained by demographic factors, specifically educational attainment, or disease-related factors, such as disease duration, highlighting both the complexity of health literacy and the lack of current information regarding how patients assess their diabetes control. Participants reporting past diabetes education courses had more accurate perceptions of their diabetes control following the intervention, regardless of study arm, suggesting that diabetes education may impact receptiveness to and comprehension of new health-related information, and supporting the A.D.A. recommendation that newly diagnosed patients be referred for diabetes education.9

The majority of our study population consisted of African-American women with low numeracy, limiting generalizability to other populations. Beyond this, the study had several limitations in its design that may have contributed to the negative findings. First, the formats chosen (grades, faces) and the scale used to assess diabetes perceptions were not piloted. Second, we had no information about participants’ control prior to study enrollment. Third, this study was based on the assumption that for patients, perceived diabetes control is associated with actual glycemic control and not other factors such as symptoms or complications. Fourth, resources were focused on the assessment of 6-month outcomes rather than on assessment of more proximal mediating factors, such as diabetes knowledge, activation level, and medication adherence. Fifth, unlike other successful non-pharmacologic interventions for chronic disease management in similar populations, such as financial incentives and peer mentoring, our study participants only received the information twice without any additional reinforcement.21–23 Sixth, the A1C change of 0.8 % used to power the study may not have been ideal. Last, although we did not find any indication of non-random missing data, the decision to not include a scheduled 6-month A1C collection in the study design resulted in many missing values, limiting the analysis.

A study currently underway attempts to address many of these limitations. In this two-phase mixed methods study, we conduct semi-structured interviews with patients to explore the ways in which they understand diabetes control, assess perceived barriers to understanding information about their diabetes, and elicit feedback on new, different formats for communicating this information. Unlike the completed study, we are piloting a wider array of potential communication formats, including more-positively framed formats, rather than formats that, for some, may seem punitive. In the second phase of the study, we will test the impact of these communication approaches on glycemic control, as well as on several intermediaries of behavior change, including diabetes knowledge, patient activation level, and immediate decision-making.

There is evidence that more effective communication affects diabetes outcomes. In a recent study of patients with diabetes, poorer ratings of overall communication were significantly associated with decreased adherence to cardio-metabolic medications.24 While increasing the comprehensibility and meaningfulness of medical information is only one component needed to improve patient–provider communication, it is arguably an easier one to address than other contributing factors, such as individual physician empathy, patient trust, and limited physician time. We acknowledge that in isolation, it can likely only support and motivate the behavior changes needed to improve long-term outcomes; however, this does not diminish the importance of finding more effective methods of providing essential information to our patients. And, though our study shows that replacing the A1C with letter grades or faces did not significantly improve patients’ understanding of their diabetes control or their glycemic control, these findings do not negate the possibility that better alternatives to the A1C exist and should be further explored.

TECHNICAL NOTE

*The SAS nomenclature (Type I, II, III and IV) for different types of sums of squares has been widely accepted when referring generally to different hypotheses tested in a classical ANOVA. Type III SS tests examine the significance of a partial effect, that is, the significance of an effect with all the other effects controlled in the model.

*To analyze the change in diabetes perceptions following the intervention, the responses to the questions were first collapsed into ordered categories. For the questions regarding current diabetes control, seriousness of diabetes, and concern for personal risk of future complications, we transformed the five-category Likert scale into three ordered categories (1, 2, 3) by collapsing answers on each end of the scale, e.g., “Poor” and “Terrible” and “Good” and “Excellent”, into one category each, and keeping the middle response, e.g., “Fair”, as a separate category. For the question regarding the value of good control in preventing complications, answers were dichotomized. Cumulative logistic regression was then used to investigate the association of different factors with the probability of falling into the different ordered categories; specifically, we modeled the odds of falling into more accurate vs. less accurate answer categories. Since all enrolled participants had an A1C of 8 % or greater at enrollment, the more accurate answers to the questions were: poorer control, diabetes more serious, higher risk of complications, and higher value placed on good control. Each study participant contributed two data points to the analysis (pre- and post-intervention), allowing us to make inferences on changes over time within study arms. This was accomplished by introducing an interaction term of time and study arm indicators in the model. The estimates in Table 3 quantify the change over time in odds for each treatment. An odds ratio of 1.0 would indicate that there is no effect of time on the perception outcome, while an odds ratio of 2.0 would indicate that at the post-treatment assessment, the odds of giving an answer in a more accurate category (diabetes poorly controlled, diabetes more serious, risk of complication higher, greater value of good control) are twice as large.
For all but one entry in Table 3, the confidence intervals around the odds ratios include one, indicating no significant effect of time (i.e., the intervention) on these perceptions. The p value comes from the test of equality of effects of time in all three study arms; this test is equivalent to jointly testing that both of the time x study arm interaction coefficients are 0. In all cases, there are no apparent differences by treatment group.