1 Introduction and Background

What percentage is 20 of 100? For most readers of this book, the answer is straightforward. Many patients, however, have difficulties grasping this and other basic statistical concepts (Davids et al. 2004; Lipkus et al. 2001; Schwartz et al. 1997; Woloshin et al. 2001). Statistical numeracy is part of a more general concept of quantitative or mathematical literacy (Golbeck et al. 2005; Speros 2005) and includes understanding the concept of a random toss and knowing how to perform elementary calculations with percentages (Davids et al. 2004; Lipkus et al. 2001). This knowledge is essential for understanding risks associated with different diseases, medical screenings, and treatments, and, consequently, for making informed decisions about health (Cokely and Kelley 2009; Estrada et al. 2004; Nelson et al. 2008; Peters et al. 2006; Reyna and Brainerd 2007; Rothman et al. 2006). This chapter describes a cross-cultural study investigating three important unanswered questions about statistical numeracy in the health context.

First, are there differences in the level of statistical numeracy between countries with different educational and medical systems—such as the USA and Germany? Several large national and international studies have included items that measure a broader concept of quantitative literacy, for example, the Programme for International Student Assessment (PISA 2003), the Trends in International Mathematics and Science Study (TIMSS; Gonzales et al. 2004), the National Assessment of Adult Literacy (NAAL; Kutner et al. 2006), and the International Adult Literacy Survey (IALS; Tuijnman 2000). Most of these studies, however, are limited to student populations and/or do not deal specifically with statistical numeracy—in particular not in the context of health. Given a stronger emphasis on mathematics and science education in the early grades in Germany compared with the USA (Rindermann 2007), it is possible that statistical numeracy is higher in Germany. However, the opposite could also be true. Because most health expenditure in the USA is privately based (55%) (see Chap. 1; World Health Organization 2012) and because patient-targeted advertising of prescription drugs is allowed, US residents may have more experience in dealing with information about medical risks, and consequently have higher statistical numeracy than the residents of Germany—where only 23% of health expenditure is privately based.

Second, what is the relationship between statistical numeracy and demographic characteristics such as age, sex, and education? To promote the ideal of informed and shared medical decision making (Barry 1999; Frosch and Kaplan 1999; Hanson 2008), it is essential to identify low-numeracy groups and either to educate them in using quantitative statistical information or to communicate information about health using nonquantitative formats such as visual displays and analogies (Edwards 2003; Galesic and Garcia-Retamero in press; Galesic et al. 2009; Garcia-Retamero and Galesic 2009; see Chaps. 7 and 9–11). However, all of the extant studies of statistical numeracy in health used nonprobabilistic samples of patients and students. Although informative about the numeracy skills of certain narrow groups, these studies do not allow for generalizations to any broader population. Consequently, they do not allow us to draw conclusions about the relationship between numeracy and demographic characteristics such as sex, age, and education.

Third, are objective measures of statistical numeracy equivalent to recently proposed subjective measures of this concept (Fagerlin et al. 2007)? In studies of convenience samples of patients and an Internet population, subjective measures were found to be less burdensome for the participants while approaching the predictive validity of the objective measures of statistical numeracy (Zikmund-Fisher et al. 2007). Subjective measures of numeracy, however, have not yet been administered to probabilistic national samples that would enable researchers to study the relationship between objective and subjective numeracy in different demographic subgroups or to conduct cross-cultural comparisons.

To answer these questions, we conducted two studies on probabilistic national samples in the USA and Germany. This enabled us to compare the statistical numeracy skills of the adult populations of these countries and of different sociodemographic groups within each country.

2 Study 1: Investigating Objective Statistical Numeracy in Probabilistic National Samples

In Study 1, conducted on probabilistic national samples in the USA and Germany, we investigated whether there are differences between the two countries in the level of objective statistical numeracy and sought to determine the relationship between numeracy and demographic characteristics.

2.1 Method

2.1.1 Participants

Study 1 was conducted from July 10 through 24, 2008, on probabilistic national samples in the USA (n = 1,009) and Germany (n = 1,001), using panels of households selected through probabilistic random digit dial telephone surveys and afterward supplied with equipment that enabled them to complete computerized questionnaires. Thus, existing Internet access or lack thereof did not affect households’ ability to become panel members. The panels allow for statistical inference to the general population. They were built and maintained by the online research panel Knowledge Networks in the USA [http://www.knowledgenetworks.com; 43,000 households (16% of those in the initial sample)] and the market research institute Forsa in Germany [http://www.forsa.de; 20,000 households (11% of those in the initial sample)]. These panels have already been used successfully in a number of studies in the areas of health, medicine, political and social sciences, economics, and public policy (Baker et al. 2003; Jacoby 2006; Lerner et al. 2003; Miller et al. 2006; Schlenger et al. 2002). Methodological studies have shown that data from such panels are comparable to the results obtained through traditional probabilistic surveys (Chang and Krosnick 2009). The possibility of using computerized questionnaires enabled us to ask relatively complex questions involving numerical and visual information about medical treatments on a nationally representative sample.

Of the panel members who were invited to participate in the study, 54% in the USA and 52% in Germany completed the questionnaire. This is a good response rate for this survey mode (Vehovar et al. 2002). The sample structure is shown in Table 2.1. According to official statistics, the percentage of the population with less education is much higher in Germany than in the USA, so we oversampled the less-educated population in the USA to ensure equivalent sample sizes of less-educated participants in both countries. To adjust for this and for minor discrepancies due to nonresponse, we used design (in the USA) and poststratification (in both countries) survey weights to bring the sample proportions in line with the population proportions. The goal of such weighting adjustments is to correct for known differences between sample and population in the hope of providing unbiased survey estimates (Bethlehem 2002; Gelman and Carlin 2002). Standard errors in all analyses were estimated using the Taylor series linearization method for estimating population characteristics from complex sample survey data, by means of commercially available software: SPSS Complex Samples procedures, SPSS version 17.0.1 (SPSS, Inc, Chicago, IL), and SUDAAN (RTI International, Research Triangle Park, North Carolina; Siller and Tompkins 2006).
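The poststratification step described above can be illustrated with a minimal sketch. It is not the procedure implemented in SPSS or SUDAAN; it only shows the basic idea of weighting each respondent by the ratio of the population share to the sample share of his or her stratum. The function name, the strata labels, and the shares are hypothetical.

```python
from collections import Counter


def poststratification_weights(sample_strata, population_shares):
    """Return one weight per respondent so that the weighted share of
    each stratum matches its (known) share in the population."""
    n = len(sample_strata)
    counts = Counter(sample_strata)
    weights = []
    for stratum in sample_strata:
        sample_share = counts[stratum] / n
        weights.append(population_shares[stratum] / sample_share)
    return weights


# Hypothetical toy data: the less-educated stratum is oversampled
# (50% of the sample) relative to its assumed population share (30%).
sample = ["low"] * 5 + ["high"] * 5
shares = {"low": 0.3, "high": 0.7}

w = poststratification_weights(sample, shares)
weighted_low_share = sum(wi for wi, s in zip(w, sample) if s == "low") / sum(w)
```

After weighting, the less-educated stratum contributes its population share rather than its oversampled share to any weighted estimate.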

Table 2.1 Structure of the sample of participants in Study 1 in terms of gender, age, and education

2.1.2 Stimuli and Procedure

Statistical numeracy was measured on a scale including three items developed by Schwartz et al. (1997) and six items developed by Lipkus et al. (2001), for a maximum score of 9 (see Table 2.2). The questions were translated into German by a native German speaker with excellent knowledge of English, back-translated into English by another person with equivalent language skills, and compared with the original English version. Any inconsistencies were resolved by a native German speaker and an excellent English speaker familiar with the research objectives. Finally, the English and German versions were compared and edited by a bilingual English and German speaker (see Chap. 15). When programming the questionnaire, special care was taken to ensure that the interface looked the same in the English and German versions. In sum, we believe that the materials in English and German were comparable. The Ethics Committee of the Max Planck Institute for Human Development approved the method used herein, and all participants consented to participation through an online consent form at the beginning of the survey.

2.2 Results

The statistical numeracy scale has satisfactory internal consistency: Cronbach’s alpha was 0.80 in the USA and 0.73 in Germany. The percentage of correct answers to each of the items is presented in Table 2.2. For further analysis, we transformed the original scores ranging from 0 to 9 to a scale of 0 to 100%, indicating the percentage of the nine items that were answered correctly.
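As a minimal, unweighted sketch of the two computations mentioned above, the following code rescales a respondent’s item scores to a percentage and computes Cronbach’s alpha with the standard formula, alpha = k/(k − 1) × (1 − sum of item variances / variance of total scores). The function names and the toy data are hypothetical, and the survey weights used in the study are ignored here.

```python
from statistics import variance


def percent_correct(item_scores):
    """Rescale one respondent's 0/1 item scores to a 0-100% scale."""
    return 100.0 * sum(item_scores) / len(item_scores)


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    item_variances = [variance([row[j] for row in responses]) for j in range(k)]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical respondent who answered 6 of the 9 items correctly.
score = percent_correct([1, 1, 0, 1, 1, 0, 1, 1, 0])

# Hypothetical toy data in which all items agree perfectly.
alpha = cronbach_alpha([[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]])
```

In the toy data every item gives the same answer for each respondent, so alpha reaches its maximum of 1; real scales, such as the one reported here, fall below that.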

Table 2.2 Percent correct answers for each item of the numeracy scale by country in Study 1 (see also Chap. 15)

As shown in Table 2.3, German participants had higher numeracy skills than those in the USA: On average 69 vs. 65% of the items were answered correctly. This difference remains after controlling for differences in sex, age, education, and income between the two countries.

Table 2.3 Average percentage of correctly answered items on the objective numeracy scale by country and demographic groups in Study 1

Within each country, sex, age, and education were all related to the numeracy score. In both countries, men had higher scores than women. Numeracy skills dropped with age (r = −0.12, 95% CI [−0.19, −0.05] in the USA, and r = −0.13 [−0.20, −0.06] in Germany) and increased with education (r = 0.50 [0.44, 0.56] in the USA, and r = 0.28 [0.21, 0.35] in Germany) and income (r = 0.32 [0.25, 0.39] in the USA, and r = 0.20 [0.13, 0.27] in Germany). When we entered sex, age, education, and income together in a regression model, all four showed independent effects in Germany, but in the USA only sex, education, and income explained differences in numeracy scores, and the effect of age was no longer present.

The inequality in numeracy skills was larger in the USA than in Germany, as reflected in the ratio between the scores at the 90th and 10th percentiles of the participants ordered by their scores: This ratio was 4.5 in the USA vs. 3.0 in Germany. The inequality is visible, in particular, in average scores of people with low educational attainment vs. highly educated people in the USA: 40 vs. 83% correct, compared to 62 vs. 81% in Germany (see Table 2.3). We discuss the implications of these results in Sect. 4.
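The inequality index used above, the ratio of the score at the 90th percentile to the score at the 10th percentile, can be sketched as follows. This is an unweighted illustration with hypothetical data and a hypothetical function name; the ratios reported in the chapter were estimated from weighted survey data, and the interpolation rule chosen here (Python’s "inclusive" method) is an assumption.

```python
from statistics import quantiles


def percentile_ratio(scores):
    """Ratio of the 90th-percentile score to the 10th-percentile score."""
    deciles = quantiles(scores, n=10, method="inclusive")  # 9 cut points
    p10, p90 = deciles[0], deciles[-1]
    if p10 == 0:
        raise ValueError("10th percentile is zero; the ratio is undefined")
    return p90 / p10


# Hypothetical toy data: scores spread evenly from 0 to 100.
toy_scores = list(range(101))
ratio = percentile_ratio(toy_scores)  # 90 / 10 = 9.0 for this toy data
```

A larger ratio indicates a wider gap between the highest- and lowest-scoring participants.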

3 Study 2: Investigating Subjective Statistical Numeracy in Probabilistic National Samples

In Study 2, we investigated whether subjective measures of statistical numeracy (Fagerlin et al. 2007) correspond to objective measures (Lipkus et al. 2001; Schwartz et al. 1997) in general populations of the USA and Germany. If a subjective numeracy scale can differentiate between people with objectively low and high numeracy skills across different demographic groups, this would speak to its wide applicability. In addition, we tested whether the subjective perceptions of one’s numeracy are dependent on the context in which they are measured, namely, before or after answering several difficult numerical questions. If the scale is sensitive to context, this would limit its applicability because the results in clinical practice would depend on patients’ recent experiences with quantitative information.

3.1 Method

3.1.1 Participants

Study 1 participants were ordered by their objective numeracy scores, and those with the highest and lowest scores were invited to participate in Study 2, conducted 3 weeks after Study 1 (August 1–15, 2008), resulting in a sample of 498 participants. Basic demographic characteristics of the sample are given in Table 2.4. This sample enables us to compare low- and high-numeracy groups within each country, as well as each of those groups between countries.

Table 2.4 Structure of the sample of participants in Study 2 in terms of numeracy, gender, age, and education

In the USA, 65.8% of all participants in Study 1 completed Study 2, and in Germany, 83.1%. The response rates among high- and low-numeracy participants were similar in both countries (i.e., it was not the case that the low-numeracy group had lower response rates). The low- and high-numeracy groups in Germany represent, respectively, approximately the bottom and top third of the population sorted by numeracy scores. Because of lower response rates in the USA, the low- and high-numeracy groups represent, respectively, approximately the bottom and top 40% of the population. Nevertheless, the average numeracy scores in both groups were still somewhat lower in the USA (Table 2.4).

3.1.2 Stimuli and Procedure

Subjective numeracy was measured with seven of the eight items developed by Fagerlin et al. (2007; see also Zikmund-Fisher et al. 2007). The items were answered on a six-point scale, where higher values indicate higher perceived numeracy. We excluded the item “How good are you at calculating a 15% tip?” because it is culturally specific to the USA (see Table 2.5). Chapter 15 lists all of the items used, together with their German translations. The questionnaire was developed in the same way as that for Study 1. Half of the participants were randomly assigned to complete these items before a set of questions involving relatively demanding numerical calculations of risk reductions, and the remaining half completed the items after answering the questions (for more details on these questions, see Garcia-Retamero and Galesic 2009; see also Chap. 10).

3.2 Results

To compare the scores on the subjective numeracy scale with the objective numeracy data, we recoded each item, originally answered on a scale of 1 to 6, to be 0 when the answer was 3 or less, or 1 when the answer was 4 or higher. The mean and standard deviation of answers to each of the items are presented in Table 2.5. For further analyses, we summed the recoded answers to the seven items and transformed the resulting scores to a scale of 0–100%, indicating the percentage of answers to the seven items that reflected high subjective numeracy.
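The recoding just described can be sketched in a few lines. The function name and the example ratings are hypothetical; the cutoff of 4 and the 0–100% rescaling follow the description above.

```python
def subjective_numeracy_score(ratings, cutoff=4):
    """Dichotomize 1-6 ratings (1 if rating >= cutoff, else 0) and
    return the percentage of items reflecting high subjective numeracy."""
    recoded = [1 if rating >= cutoff else 0 for rating in ratings]
    return 100.0 * sum(recoded) / len(recoded)


# Hypothetical respondent: four of the seven ratings are 4 or higher.
example_ratings = [1, 2, 3, 4, 5, 6, 6]
example_score = subjective_numeracy_score(example_ratings)
```

Dichotomizing discards some information from the six-point ratings, but it puts the subjective scores on the same 0–100% metric as the objective scores.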

Table 2.5 Mean ratings of items in the subjective numeracy scale by country and numeracy in Study 2

The subjective numeracy scale has satisfactory internal consistency. Cronbach’s alpha ranged from 0.75 to 0.87 across the two countries and groups with high vs. low objective numeracy skills. The scores on the scale were not sensitive to context: They were similar when the items were positioned before or after the tasks involving difficult calculations (average before/after difference = 2.8, 95% CI [−5.4, 11.0]); this was so for high- and low-numeracy groups in both countries.

How well does the subjective numeracy scale differentiate between participants who are very high vs. very low in terms of their objective numeracy skills (as determined in Study 1)? The average subjective numeracy scores for these two extreme groups were 38.9 (SE = 4.4) and 79.0 (SE = 2.5) in the USA, and 45.5 (SE = 3.7) and 80.0 (SE = 2.7) in Germany. These differences were stable across gender, age, education, and income groups. However, compared to the differences in objective numeracy scores between the two extreme groups (M = 35.6, SE = 2.8 vs. M = 90.9, SE = 1.1 in the USA, and M = 37.2, SE = 2.0 vs. M = 95.6, SE = 0.7 in Germany; see Table 2.4), the differences in subjective numeracy scores were smaller.

4 Discussion and Conclusions

An average citizen of the USA and Germany could answer only two-thirds of nine relatively simple items testing basic statistical numeracy skills (Table 2.3). Statistical numeracy was somewhat lower for women than for men, and it dropped slightly with age but only in Germany. Across most demographic groups, German participants achieved somewhat higher scores than did US participants. An exception was the group with the highest education, in which US participants fared somewhat better. Differences in education systems—in particular the stronger focus on mathematics and science education in Germany from an early age (Stigler et al. 1999; Tuijnman 2000)—are likely to be the main factor underlying the differences in statistical numeracy between countries.

The inequality between people with more or less education in the USA was much larger than in Germany. Whereas college-educated Americans could answer 83.1% of the items correctly on average, those with less than a high school diploma could do so for only 39.9% of the items. Even for those who had a high school education, the average percentage of correct answers in the USA was only 56.4%, lower than the average for German participants who had not completed a high school education (62.3%; Table 2.3).

The large differences in numeracy between persons with lower and higher educational levels have varying consequences in different medical systems. For instance, at least before the new health care reform, less educated US residents were particularly likely to be in a position of having to decide about their medical care. Although 99.7% of Germans have health insurance (see Chap. 1; see also Statistisches Bundesamt Deutschland 2011), 35% of US residents, in particular those of lower socioeconomic status, had no health insurance or insufficient coverage (Schoen et al. 2005) and had to decide whether to pay for various medical treatments and screenings (Schoen et al. 2007). Given their low statistical numeracy, they might have had difficulty making good decisions.

The present chapter, to the best of our knowledge, describes the first study investigating statistical numeracy skills in probabilistic national samples in the USA and Germany, allowing comparison of different demographic groups within each country as well as comparison between the two countries. It also describes the first cross-cultural comparison of objective and subjective measures of statistical numeracy.

At the same time, a limitation of the studies is that levels of numeracy in the general population could be even lower than our results suggest. To become members of the national panels from which our samples were selected, participants had to accept having a computer or special TV set with Internet access installed in their homes. It is possible that people with low numeracy refused this more often than did those with high numeracy skills. On the other hand, our sample represents accurately the overall population in terms of education. Furthermore, there is no particular reason to expect that numeracy but not general educational level would be related to higher rates of refusal.

Our findings have clear implications for medical practice. Physicians should not assume that all patients can understand simple statistical indicators that are often used to express risks and benefits of medical screenings and treatments. For example, approximately 20% of the participants in both Germany and the USA could not say which of the following numbers represents the biggest risk of getting a disease: 1, 5, or 10%. Ratios were even more difficult: Almost 30% could not answer whether 1 in 10, 1 in 100, or 1 in 1,000 represents the largest risk. Similarly, almost 30% of the study participants in both countries could not state what percentage 20 of 100 is, and most (76.5% and 53.7% of the participants in the USA and Germany, respectively) could not transform 1 of 1,000 to a percentage. Furthermore, many participants lacked an understanding of the concept of a random toss. When asked how many times a fair coin would come up heads in 1,000 flips, more than one-fourth of the study participants in both countries gave answers that were obviously incorrect (less than 400 or more than 600 times).
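The arithmetic behind these items can be made explicit in a short worked sketch. It converts the two ratios mentioned above to percentages and computes, from the binomial distribution, the exact probability that a fair coin yields fewer than 400 or more than 600 heads in 1,000 flips. The function names are hypothetical and serve only as an illustration of why answers outside that range count as obviously incorrect.

```python
from math import comb


def to_percent(numerator, denominator):
    """Express 'numerator of denominator' as a percentage."""
    return 100.0 * numerator / denominator


def prob_heads_outside(n_flips, low, high):
    """Exact probability that the number of heads in n_flips fair coin
    tosses is below `low` or above `high` (binomial with p = 0.5)."""
    inside = sum(comb(n_flips, k) for k in range(low, high + 1))
    outside = 2 ** n_flips - inside
    return outside / 2 ** n_flips


twenty_of_hundred = to_percent(20, 100)      # 20 of 100 is 20%
one_of_thousand = to_percent(1, 1000)        # 1 of 1,000 is 0.1%
p_extreme = prob_heads_outside(1000, 400, 600)
```

Because the expected number of heads is 500 with a standard deviation of about 16, a count below 400 or above 600 lies more than six standard deviations from the mean, so `p_extreme` is vanishingly small.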

Given the low levels of statistical numeracy of many patients, physicians could use items from the subjective numeracy scale to identify patients who may have problems understanding numerical information. If they have such a patient, physicians could communicate risks and benefits of treatments by means of formats that do not require high levels of numeracy, such as visual displays (see Chaps. 9, 10, and 11; see also Hanson 2008; Galesic et al. 2009; Lipkus 2007; Lipkus and Hollands 1999) and analogies (see Chap. 7; see also Garcia-Retamero and Galesic 2009; Galesic and Garcia-Retamero 2012), rather than numerical expressions. In this way, patients with low numeracy skills could understand statistical information and make better decisions about their health.