Learning Objectives

  1. Define “risk factor information system” and explain how such systems complement primary scientific research and vital statistics systems.

  2. Describe specific data collection methods employed by various risk factor information systems.

  3. Describe similarities and differences among national-level risk factor surveillance systems, and explain the rationale for specific risk factor information systems focused on special populations.

  4. Identify repositories that enhance the dissemination of risk factor data through consolidation or integration, and identify technologies that may change risk factor data collection.

Overview

Risk factor information systems monitor the prevalence of specific antecedents of premature disease and death. These systems focus on tracking behaviors, conditions, and exposures to increase awareness of the burden of disease in a community, support prioritization of public health resources, and allow measurement of the effectiveness of prevention programs. There are a variety of important risk factor information systems in use at the present time, both in the United States and internationally: some systems are designed to produce national or regional estimates, while others have a more local, community focus; some systems cover a broad range of health risk factors across all demographic groups, while others focus on a small number of disease-specific exposures in special populations; some systems require only subjective responses, while others collect additional measurements of the body and biological assays. There are numerous efforts underway that use information technology to make risk factor information more accessible and useful through integration and innovative presentation, and the future uses of new information technologies to augment risk factor surveillance are explored.

Introduction

For centuries, scientists have used vital statistics systems as primary data sources to study trends in morbidity and mortality. In the early 1500s, as a means of warning the public about local plagues, parish clerks in London began weekly postings of deaths and their causes, which came to be known as the Bills of Mortality [1]. In the 1600s, John Graunt (a haberdasher by trade) became fascinated with demographic patterns in these “lists of the dead,” and published his Natural and Political Observations Made upon the Bills of Mortality. His work was notable for a number of innovations, including the creation of “life tables” (charts of survivorship based on age) and frequency summaries by cause of death—spurring greater interest in the systematic capture and use of these data [2].

The practice of public health continued to evolve, driven in part by the effective use of vital statistics and other mortality data to characterize and prevent premature death [3]. Over time, public health practitioners developed important health indicators from these data, such as mortality rates and years of potential life lost (YPLL), that continue to be used to communicate and assess the severity of important public health problems in the modern era [4].
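As a rough illustration of how an indicator such as YPLL is derived from mortality data, the following Python sketch sums the years by which each death precedes a reference age. The reference age of 75, the function names, and the ages at death are assumptions chosen for illustration; they are not taken from the sources cited in this chapter.

```python
def years_of_potential_life_lost(ages_at_death, reference_age=75):
    """Sum the years lost before reference_age (an assumed cutoff).

    Deaths at or after the reference age contribute zero years.
    """
    return sum(max(reference_age - age, 0) for age in ages_at_death)


def ypll_rate_per_100k(ages_at_death, population, reference_age=75):
    """YPLL per 100,000 persons in the population at risk."""
    total = years_of_potential_life_lost(ages_at_death, reference_age)
    return total * 100_000 / population


# Hypothetical ages at death in a small community.
example_ages = [2, 45, 60, 74, 80]
print(years_of_potential_life_lost(example_ages))
print(ypll_rate_per_100k(example_ages, population=50_000))
```

Because each death is weighted by how early it occurs, a death in early childhood contributes far more to YPLL than a death in late adulthood.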

By the start of the twentieth century, the public health community had recognized that vital statistics and other mortality data lacked the breadth, depth, and timeliness to effectively detect, describe, and respond to modern threats to the public’s health, as increasing focus was placed on mitigating the antecedent behaviors, conditions, and exposures (hereafter referred to as risk factors) that strongly influence future disease, disability, and death [5]. From this need for richer and more current risk factor information, public health agencies developed specialized surveillance procedures and systems to support them.

This chapter introduces the concept of risk factor information systems, including the rationale for their use, and their role in preventing premature morbidity and mortality. Specific examples will be presented to acquaint the reader with: the breadth of conditions and populations under surveillance; the variety of methods that are employed to gather data and disseminate results; and some examples of the use of the data to improve public health. The chapter will then review examples of informatics innovations that may contribute to more efficient and effective use of risk factor information in the future.

Risk Transition in the Twentieth Century

The twentieth century saw a significant change in the nature of premature mortality worldwide. In 1900, the leading causes of death in the United States were infectious in origin—pneumonia and influenza, tuberculosis, and gastrointestinal infections; by the end of the century, the leading causes of death had taken a decidedly non-communicable turn—heart disease, cancer, noninfectious airway diseases, cerebrovascular disease, and accidents [6]. A similar shift occurred worldwide, with nearly two-thirds of deaths now attributable to chronic illnesses—mainly cardiovascular diseases, cancers, diabetes, and chronic lung diseases [7].

This risk transition from infectious to non-communicable causes of death was due in large part to important scientific advances: public health interventions, such as vaccinations and improved sanitation, that reduced the incidence of infectious diseases; and improvements in medical care that prevented premature death. In addition, extended longevity has led to an aging population (with older adults having the highest rates of chronic diseases) [8]. The world’s population was expanding, and people were living longer with diseases that took a slower toll on their health.

One important consequence of this risk transition was the recognition that measures of mortality were not sufficient to convey all outcomes of chronic disease. For example, the multi-systemic sequelae of Type 2 diabetes mellitus debilitate the individual long before death. This recognition led to increased interest in the quality, not simply the length, of life lost [9]. The past four decades have seen the advent of additional health indicators, such as quality-adjusted life years (QALYs) and disability-adjusted life years (DALYs), to provide public health with additional tools to communicate and assess the effect of chronic diseases on the health of populations [10].
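The arithmetic behind a DALY can be sketched as the sum of years of life lost to premature death (YLL) and years lived with disability (YLD). The Python sketch below uses a simplified form, without age weighting or discounting; the function names, the disability weight, and all input values are hypothetical.

```python
def years_of_life_lost(deaths, remaining_life_expectancy):
    """YLL: deaths times the years each person was expected to live."""
    return deaths * remaining_life_expectancy


def years_lived_with_disability(cases, disability_weight, duration_years):
    """YLD: cases times severity (0 = full health, 1 = death) times duration."""
    if not 0.0 <= disability_weight <= 1.0:
        raise ValueError("disability_weight must be between 0 and 1")
    return cases * disability_weight * duration_years


def disability_adjusted_life_years(yll, yld):
    """DALYs combine the mortality and morbidity components."""
    return yll + yld


# Hypothetical condition: 10 deaths, each 20 years early, plus
# 100 cases living 4 years with an assumed disability weight of 0.25.
yll = years_of_life_lost(deaths=10, remaining_life_expectancy=20)
yld = years_lived_with_disability(cases=100, disability_weight=0.25,
                                  duration_years=4)
print(disability_adjusted_life_years(yll, yld))
```

In this example the morbidity component makes up a third of the total burden, which is the part that a mortality-only indicator would miss.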

Public health placed great focus on identifying the causative factors that increased the risk of living with, and dying of, chronic disease. Scientific research in the latter half of the twentieth century revealed many of these underlying risk factors. For example, the Framingham Heart Study has provided generations of information regarding specific conditions or exposures that contribute to cardiovascular disease and premature death, including obesity [11], type 2 diabetes mellitus [12], smoking [13], and genetic associations [14], as well as risk factors for stroke and dementia [15]; in addition, the protective properties of healthful behaviors, such as proper diet [16] and exercise [17] were identified. Early studies linking tobacco smoking to bronchiogenic carcinoma set the stage for future work revealing the risk factors for lung and other cancers [18]. In addition, industrialization brought to prominence new risk factors for premature death and disability, including environmental contaminants, occupational hazards, and injuries and violence. A common thread among many unhealthful risk factors was that their effects accumulated over years, even decades, and the key was to identify these risk factors in individuals as early as possible.

The Nature of Risk Factors and the Causal Chain

In the United States [19] and worldwide [7], the leading risk factors for chronic diseases are tobacco, poor diet and physical inactivity, and alcohol consumption. These risk factors are all external (and, therefore, avoidable) exposures or behaviors that directly cause chronic disease, or create antecedent internal states (such as elevated cholesterol and hypertension) that cause chronic disease and death. Further, all of these identified risk factors contribute to one or more of the leading chronic diseases (heart disease, cancer, noninfectious airway diseases, cerebrovascular disease); conversely, these leading chronic diseases have one or more of these antecedent risk factors [20]. Risk factors represent the start of a causal chain of events that lead to disease, disability, and untimely death.

While tertiary prevention (the treatment of symptoms and complications of disease to prolong life and forestall death) and secondary prevention (the detection and treatment of disease before it becomes symptomatic) are important health activities, primary prevention (the identification or mitigation of the risk factors for disease before they cause disease) is the mainstay goal in public health. To effectively prevent a disease, it is important to have specific and timely information about the prevalence of its risk factors, and this information must be reliable and comparable in order to plan, manage, and evaluate appropriate interventions [21].

Risk factor information systems provide much of the information used to monitor the prevalence and trends of specific risk factors at the local, national, and international level. Public health leaders use these systems to prioritize those health problems that are relevant in their communities, and to concentrate resources on evidence-based prevention programs. Further, where standardized measurement of risk factors is employed in ongoing surveillance, comparisons can be made over time (supporting evaluation of prevention programs that have been implemented) and across geographies (where communities may forecast regional trends or assess interventions that have been effective in comparable locales). Some risk factor information systems, particularly those that are incorporated into vital statistics or otherwise report on acute causes of death and injury, may also augment scientific research by identifying new dangers to the public’s health.

The next sections of this chapter will provide specific examples of risk factor information systems, including the breadth of conditions and populations under surveillance, the variety of methods that are employed to gather data and disseminate results, and some examples of their effective use.

National (United States) Risk Factor Systems

In the United States, there are a number of important risk factor surveillance activities that have national scope, and collect information on the breadth of risk factors that lead to injury, disability, disease, and death. Three prominent systems are presented and compared (Table 18.1).

Table 18.1 Some national (United States) risk factor systems

The National Health Interview Survey (NHIS)

The National Health Interview Survey (NHIS) is a cross-sectional, multi-purpose survey of households that monitors the health of the civilian, non-institutionalized population of the United States. Established by the National Health Survey Act of 1956, and initiated in 1957, the survey has been administered by the National Center for Health Statistics (NCHS) of the Centers for Disease Control and Prevention (CDC) since 1960 [22].

Employees of the US Bureau of the Census conduct the annual survey throughout the year, following interview procedures defined by NCHS. The NHIS uses computer-assisted personal interviewing (CAPI) technology, allowing interviewers to enter responses directly into a computer as the survey is conducted, and promoting the efficient and accurate capture of data.

The NHIS sampling plan is intended to select participants in households that are statistically representative of the population of the United States, excluding those persons in long-term care facilities, active duty members of the Armed Services, the incarcerated, and US nationals living abroad. The sampling plan is multi-staged, and redesigned following every decennial census. The first stage identifies primary sampling units (PSUs) covering the 50 states and the District of Columbia; a PSU may be a county, a small group of contiguous counties, or a metropolitan statistical area. A PSU is further subdivided into area segments (containing 8–16 addresses) and permit segments (containing approximately 4 addresses from housing units built after the most recent census). To correct for statistical bias of under-represented populations, the NHIS oversamples (selects more) persons of black, Asian, and Hispanic heritage. Participation in the survey is voluntary and uncompensated, and the responses of participants remain confidential. In the 2010 survey, household interviews were completed for 89,976 persons in 34,329 households with a household response rate of 79.5 % [23].
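The general idea of a multi-staged sample can be sketched in Python as follows. This is a deliberately simplified illustration, not the NHIS design: it assumes equal-probability selection at both stages and omits area segments, permit segments, and oversampling. The function name, parameters, and sampling frame are hypothetical.

```python
import random


def two_stage_sample(psus, n_psus, households_per_psu, seed=0):
    """Select PSUs, then households within each selected PSU.

    psus maps a PSU identifier to a list of household identifiers.
    Returns (psu, household, design_weight) tuples, where the design
    weight is the inverse of the overall selection probability.
    """
    rng = random.Random(seed)
    chosen_psus = rng.sample(sorted(psus), n_psus)  # stage 1
    p_psu = n_psus / len(psus)
    sample = []
    for psu in chosen_psus:
        households = psus[psu]
        k = min(households_per_psu, len(households))
        p_household = k / len(households)
        weight = 1.0 / (p_psu * p_household)
        for household in rng.sample(households, k):  # stage 2
            sample.append((psu, household, weight))
    return sample


# Hypothetical frame: 10 PSUs with 20 households each.
frame = {f"PSU-{i}": [f"HH-{i}-{j}" for j in range(20)] for i in range(10)}
sample = two_stage_sample(frame, n_psus=4, households_per_psu=5)
estimated_total = sum(weight for _, _, weight in sample)
print(len(sample), estimated_total)
```

In a scheme like this, oversampling a group amounts to giving it a higher selection probability at some stage; the group then receives a correspondingly smaller weight so that estimates remain representative.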

The survey itself has two main parts: a Core questionnaire and Supplements. The Core questionnaire collects socio-demographic and basic health information, including important risk factors such as physical activity, tobacco use, and injuries and poisoning. The Core questionnaire has four components:

  • Household (basic demographic information about all members of the household);

  • Family (additional information about health-related issues and socio-demographic factors);

  • Sample Adult (additional health questions specific to one adult in the household); and

  • Sample Child (additional health questions specific to a child in the household—if any).

The Core questionnaire has remained relatively stable following a significant redesign in 1997, allowing for analysis of trends over time, but limiting comparability with prior years [24]. The Supplements portion of the NHIS includes questions on specific public health topics of interest, including cancer screening, complementary and alternative medicine, children’s mental health, and Healthy People 2010 objectives. Health information can be trended for specific socio-demographic groups and the country as a whole, but the sample size is not large enough for precise state-specific estimates. The survey questionnaires and the survey data can be accessed on links at the NHIS website. The data are also summarized in reports from NCHS and by researchers using the datasets.

NHIS data are generally used to monitor national trends in disease and disability, to track national health objectives (such as Healthy People 2020), and to evaluate Federal health programs. The data may also be used for public health research to describe the status of specific conditions in particular socio-demographic groups, or to identify new associations—such as linkages [25] between occupation and lung cancer, or to create or evaluate policy. Since the data are intended to be nationally representative, their utility for state and local public health monitoring of risk factors may be limited.

The National Health and Nutrition Examination Survey (NHANES)

The National Health and Nutrition Examination Survey (NHANES) is a multi-component survey designed to assess the health and nutritional status of adults and children in the United States. NHANES began in the early 1960s and, like the NHIS, is administered by the National Center for Health Statistics [26].

The NHANES sample is intended to be nationally representative. The sampling plan for NHANES is multi-staged, and includes PSUs (roughly corresponding to single counties) and secondary sampling units (SSUs) that are progressively divided from segments (generally equivalent to city blocks) to households and then individuals. Each annual sample selects from approximately 15 counties nationwide. The NHANES oversamples for persons age 60 and over, and also for persons of black or Hispanic heritage. The annual sample size is approximately 5,000 participants, who receive monetary compensation. The number of persons sampled for NHANES in the years 2009–2010 was 10,253 [27].

NHANES has two major components: an interview and a physical examination. The NHANES interview is administered using CAPI technology, and includes socio-demographic and health-related questions; categories of risk factors elicited include smoking, alcohol consumption, sexual practices, drug use, physical fitness and activity, and dietary intake. The NHANES examination is conducted by medical personnel, and includes medical, dental, and physiological measurements, as well as laboratory tests.

NHANES questionnaires and survey data can be accessed on links at the NHANES website. The data are also summarized in reports from NCHS and by researchers using the datasets. NHANES findings have been used to: assess nutritional status risk factors; establish national standards for measurements such as height, weight, and blood pressure; and even link chemical exposures to chronic diseases [28].

The Behavioral Risk Factor Surveillance System (BRFSS)

The Behavioral Risk Factor Surveillance System is a cross-sectional, telephone-based survey that collects state-level data about health-related risk behaviors, chronic conditions, and the use of preventive services by residents of the United States. The survey is conducted in and by all 50 states, plus the District of Columbia and three US territories, with technical assistance from the National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP) of the CDC [29].

Each state administers its survey continuously throughout the year, using its own employees or contractors. Approximately 350,000–400,000 participants nationwide are selected annually using random digit dialing (RDD) techniques to both landlines and cellular phones—a recent change to accommodate cell-phone only households [30]. Participants are adults 18 years or older; participation is voluntary, and there is no monetary compensation.

Each state’s BRFSS has three components: a standardized set of core questions that are asked every year (fixed core) or every other year (rotating core); optional modules that states may elect to use; and state-specific questions. The core categories of risk factors on the BRFSS include alcohol consumption, asthma, cardiovascular disease, diabetes, disabilities, exercise, and tobacco use, among other areas of interest. The use of standardized core questions allows for comparisons to be made across and within states over time.

Unlike the NHIS and NHANES, the BRFSS does not employ a multi-staged area sampling plan; participants are instead selected at random through RDD. To keep estimates representative, the BRFSS weights collected survey data based on age, race/ethnicity, sex, geography, marital status, education level, home ownership, and type of phone. As part of the Selected Metropolitan/Micropolitan Area Risk Trends (SMART) project, data may be analyzed for specific metropolitan and micropolitan statistical areas (MMSAs) with 500 or more respondents.
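One general technique for weighting survey data to several characteristics at once is raking (iterative proportional fitting), in which weights are adjusted one variable at a time until the weighted sample matches known population shares. The Python sketch below is a generic illustration of that technique, not a description of the BRFSS procedure; the function names, the two variables, and the target shares are hypothetical.

```python
def rake(respondents, targets, max_iter=100, tol=1e-9):
    """Adjust weights so weighted category shares match the targets.

    respondents: list of dicts of categorical values.
    targets: {variable: {category: population_share}}; shares sum to 1.
    """
    weights = [1.0] * len(respondents)
    for _ in range(max_iter):
        max_change = 0.0
        for var, shares in targets.items():
            totals = {}
            for r, w in zip(respondents, weights):
                totals[r[var]] = totals.get(r[var], 0.0) + w
            grand = sum(weights)
            for i, r in enumerate(respondents):
                factor = shares[r[var]] * grand / totals[r[var]]
                max_change = max(max_change, abs(factor - 1.0))
                weights[i] *= factor
        if max_change < tol:  # all margins already match
            break
    return weights


def weighted_share(respondents, weights, var, category):
    """Weighted proportion of respondents in a category."""
    hit = sum(w for r, w in zip(respondents, weights) if r[var] == category)
    return hit / sum(weights)


# Hypothetical sample that over-represents women and landline users.
respondents = (
    [{"sex": "F", "phone": "landline"}] * 6
    + [{"sex": "F", "phone": "cell"}] * 2
    + [{"sex": "M", "phone": "landline"}]
    + [{"sex": "M", "phone": "cell"}]
)
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "phone": {"landline": 0.6, "cell": 0.4},
}
weights = rake(respondents, targets)
print(round(weighted_share(respondents, weights, "sex", "F"), 3))
```

Raking needs only the marginal population share for each variable, not the joint distribution, which is convenient when many characteristics are involved.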

BRFSS data and documentation can be found on the BRFSS Annual Survey Data webpage. BRFSS data are used in all states to establish and track state and local health objectives, support and evaluate health policies, develop and plan health programs and public education, create new laws or regulations, implement disease prevention and health promotion activities, and monitor trends [31]. Some state-level uses include monitoring of diabetes trends [32], assessment of state smoking prevalence [33], and evaluation of smoking cessation programs [34], as well as tracking exposures [35].

Some common barriers to more widespread state and community use of BRFSS data include limited availability of regional and subgroup data, lack of data analysis skills, and inadequate staff resources [36]. For these reasons, CDC has used information technology to facilitate greater use of the data, including a web-based, menu-driven query system to create summary tables and graphs. In addition, CDC developed BRFSS Maps—a web-based application that uses geographic information system (GIS) technology to create interactive maps that display behavioral risk factor prevalence data at the state and MMSA level.

Although the survey is telephone-based, much research has been done to validate the reliability of the responses [37–39]. Concerns about decreasing response rates on landline phones prompted recommendations to include cellular phones in the random digit dialing methodology [40]. A dual-frame survey design covering both landline and cell phone numbers was recently added to the methodology so that the data remain valid, reliable, and representative [41].

Comparing the Systems

While the NHIS, NHANES, and BRFSS are all similar in terms of monitoring health status in the United States, including the prevalence of important health risk factors, there are important differences to consider.

In terms of statistical comparability, national estimates on the prevalence of specific risk factors are generally comparable [42], although estimates may differ when further stratifying by demographic subgroup [43]. These variances in estimates may be due to differences in methods of data collection and analysis [44]. Following a decline in BRFSS response rates (from 72 % in 1993 to 51 % in 2006) some differences in comparability have been observed on selected measures between BRFSS and NHIS, and between BRFSS and NHANES [45].

The NHIS and NHANES are limited to national-level estimates, while the BRFSS by design can produce state-level (and in some instances, city-level) results. Further, these data may not be directly comparable with data in other national systems such as HEDIS [46]. Consequently, where a similar risk factor is measured in more than one system, all relevant systems should be considered before making important public health assessments of prevalence or outcome.

Risk Factors in Special Populations

While large risk factor information systems may effectively monitor the health of specific demographic groups, geographic regions, or the nation as a whole, they may not be appropriate to monitor the prevalence of risk factors or the outcomes of targeted interventions in specific, high-risk populations. Specialized risk factor information systems have been developed to address this need, and the selected examples are intended to demonstrate the breadth of populations studied and the variety of methods employed (Table 18.2).

Table 18.2 Some risk factor surveillance systems for special populations (United States)

The Youth Risk Behavior Surveillance System (YRBSS)

The Youth Risk Behavior Surveillance System (YRBSS) uses a school-based survey to monitor the prevalence and trends of risk behaviors that place youth in the United States at most risk for premature morbidity, mortality, and social problems. The survey is conducted by state, local, and territorial education agencies as well as tribal governments, with technical assistance provided by the Division of Adolescent and School Health (DASH) of the CDC. Each survey is intended to be representative of the state or local educational jurisdiction that conducts it; the CDC conducts a separate national school-based survey that is intended to be representative of students across the United States. YRBSS data are used primarily by state and local education agencies to describe risk behaviors, create awareness, supplement staff development, set and monitor program goals, develop health education programs, support health-related legislation, and seek funding [47].

The Youth Risk Behavior Survey (YRBS) is the specific data collection instrument for the YRBSS. The YRBS is conducted biennially during odd-numbered years. The survey is self-administered and comprises 87 core multiple-choice questions across six categories of priority health-risk behaviors: behaviors that contribute to violence and unintentional injuries; tobacco use; alcohol and other drug use; sexual behaviors that contribute to pregnancy and sexually transmitted diseases; unhealthy dietary behaviors; and inadequate physical activity. To preserve anonymity, the survey does not collect personal identifiers, and participants are not compensated. The survey uses paper-and-pencil with results scanned in electronically for processing and analysis.

For each state or local education agency, a two-stage cluster sample design is used to produce samples representative of 95 % of students in grades 9–12. The first stage selects for schools with probability proportional to school enrollment; the second stage randomly selects appropriate classes within the identified schools. If the overall response rate for a survey is greater than 60 %, it is considered to be “weighted” and representative of the students attending public school in that state or local jurisdiction. The survey design specifically excludes certain groups of youth, including absentees and dropouts, and students that attend private school, alternative schools, or who are home-schooled [48].
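The two-stage cluster design described above can be sketched in Python. The sketch assumes systematic selection as the way to achieve probability proportional to enrollment, which is one common approach and is not specified in this chapter; the function names, school names, enrollments, and class lists are hypothetical.

```python
import random


def select_schools_pps(enrollments, n_schools, rng):
    """Stage 1: systematic selection proportional to enrollment.

    Walks the cumulative enrollment total and picks the school in
    which each evenly spaced selection point falls. A school larger
    than the sampling interval could be hit more than once.
    """
    total = sum(enrollments.values())
    interval = total / n_schools
    start = rng.uniform(0, interval)
    points = [start + i * interval for i in range(n_schools)]
    selected, cumulative, idx = [], 0, 0
    for school, size in sorted(enrollments.items()):
        cumulative += size
        while idx < n_schools and points[idx] < cumulative:
            selected.append(school)
            idx += 1
    return selected


def select_classes(classes_by_school, schools, classes_per_school, rng):
    """Stage 2: simple random selection of classes in each school."""
    chosen = {}
    for school in schools:
        classes = classes_by_school[school]
        k = min(classes_per_school, len(classes))
        chosen[school] = rng.sample(classes, k)
    return chosen


rng = random.Random(0)
# Hypothetical jurisdiction: 10 schools with 100 to 460 students.
enrollments = {f"School-{i:02d}": 100 + 40 * i for i in range(10)}
classes_by_school = {
    school: [f"{school}-class-{j}" for j in range(6)]
    for school in enrollments
}
schools = select_schools_pps(enrollments, n_schools=3, rng=rng)
classes = select_classes(classes_by_school, schools, 2, rng)
print(schools)
```

Selecting schools in proportion to enrollment and then taking a fixed number of classes per school gives students in large and small schools broadly similar chances of selection.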

The YRBSS has conducted other special national surveys in the past, specifically capturing populations not present in public schools, grades 9–12. In 1992, a Youth Risk Behavior Supplement was added to the NHIS; it included youth both attending and not attending school (the latter group was oversampled) [49]. In 1995, a mail-based National College Health Risk Behavior Survey was used to determine the prevalence of health-risk behaviors among college students [50]. In 1998, a school-based National Alternative High School Youth Risk Behavior Survey was administered to measure priority health-risk behaviors among students attending alternative high schools who are at high risk for failing or dropping out of regular high school, or who have been expelled from regular high school because of illegal activity or behavioral problems [51].

YRBSS data and documentation can be found on the YRBSS Data Files & Methods webpage. As education agencies have historically lacked the resources to conduct statistical analyses on complex survey data, DASH has used innovative information technology to make survey data more usable for its constituents. In the 1990s, DASH developed and distributed a CD-ROM based application that allowed users to query data. In 2001, DASH developed Youth Online, a web-based, menu-driven system created using user-centered design principles; the user experience was informed by the most common data requests of YRBSS stakeholders. Youth Online allows users to generate summary tables and graphs, and conduct ad hoc trend-analyses and comparisons with real-time evaluation of statistical significance. The utility of YRBSS is limited by the need for an appropriate response rate in order to provide comparable (weighted) data, and by the paucity of measures to demonstrate changes in prevalence or trends that result from monitoring these behaviors.

Pregnancy Risk Assessment Monitoring System (PRAMS)

The Pregnancy Risk Assessment Monitoring System (PRAMS) is a nationwide surveillance system that collects state-level information to monitor changes in specific maternal and child health indicators. The PRAMS is conducted by participating states, with technical assistance provided by the CDC’s Division of Reproductive Health [52].

The PRAMS is administered annually. For each participating state, the PRAMS sample is selected from all women who have had a recent live birth. Each state samples 100–300 women each month (approximately 1,300–3,400 each year). Low-weight births are usually oversampled, as are some high-risk populations, and some states oversample by race/ethnicity. All states currently use either a participation incentive (sent to all mothers in a sample) or reward (sent only to respondents) to enhance response rate [53].

PRAMS has two initial data collection methods: the primary method is a mailed survey questionnaire, with frequent follow-up mailings made to non-responders; the second method is a telephone survey, in the event of repeated non-response to the mailed survey [54]. The survey is standardized to permit comparisons among states, although some customizations are permitted. Specific risk factors monitored by PRAMS include barriers to and content of prenatal care, obstetric history, maternal use of alcohol and cigarettes, physical abuse, contraception, economic status, maternal stress, and early infant development and health status.

Mothers’ responses are linked to birth certificate data for subsequent analysis [55], and may be further linked to other available data sources, including: newborn screening; Medicaid; birth defects data; Women, Infants, and Children program (WIC); hospital discharge data; Sudden Infant Death Syndrome (SIDS) data, and Assisted Reproductive Technology (ART) data.

PRAMS data may be queried using CDC’s PRAMS Online Data for Epidemiologic Research (CPONDER) system. PRAMS data are used by researchers and for state program evaluation, and have been used to gain support for program initiatives directed at unintended pregnancy, to promote policies aimed at monitoring or reducing unintended pregnancy, to acquire additional funds for related programs (such as family planning), and to evaluate psychosocial risk and prenatal counseling [56–58].

National HIV Behavioral Surveillance System (NHBS)

The National HIV Behavioral Surveillance System (NHBS) tracks behaviors and care access among persons at high risk for HIV infection. The NHBS was created in 2003, and is administered by the National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention (NCHHSTP) at CDC [59].

The survey is conducted by public health staff and administered using a handheld personal computer device that facilitates the efficient collection of data. A standard survey instrument is used to collect core demographic information and information about specific risk factors, including sexual behavior, injection and non-injection drug use, HIV testing and results, and access to and use of prevention services. In addition to the core questions, local jurisdictions may add questions to help evaluate local HIV prevention programs. The survey is anonymous, and participants receive monetary compensation [60].

The NHBS samples are intended to be specific to the 20 participating jurisdictions, with a separate sample selected to be nationally representative. The survey focuses on the three populations at highest risk for HIV: men who have sex with men (MSM), injecting drug users (IDUs), and high-risk heterosexuals (HET). To recruit MSM, venues that are highly frequented by MSM are selected; for IDUs and HET, respondent-driven sampling (where participants recruit additional participants) is employed. Within each jurisdiction, 450–500 eligible persons are recruited from the at-risk population of interest, and participate in interviews and testing [61]. Data collected from NHBS are used to describe trends in key behavioral risk indicators and to evaluate HIV prevention programs; the data also further characterize the at-risk populations, identify gaps in prevention services, and identify new prevention opportunities.
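Respondent-driven sampling can be pictured as a chain of referrals that grows in waves from a few initial participants. The Python sketch below simulates that process on a toy social network. It is a sketch under stated assumptions, not the NHBS protocol: the limit of three referrals per participant, the function name, and the network are all hypothetical.

```python
import random
from collections import deque


def respondent_driven_sample(network, seeds, target_size, coupons=3, seed=0):
    """Simulate recruitment in which participants recruit their peers.

    network maps each person to the peers they could recruit.
    coupons caps how many peers one participant may recruit
    (an assumed limit). Recruitment stops at target_size or when
    the referral chains die out.
    """
    rng = random.Random(seed)
    recruited = list(seeds)
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(recruited) < target_size:
        recruiter = queue.popleft()
        candidates = [p for p in network.get(recruiter, []) if p not in seen]
        rng.shuffle(candidates)
        for peer in candidates[:coupons]:
            if len(recruited) >= target_size:
                break
            seen.add(peer)
            recruited.append(peer)
            queue.append(peer)  # the new participant may recruit in turn
    return recruited


# Hypothetical network of 20 people, each knowing two others.
network = {i: [(i + 1) % 20, (i + 2) % 20] for i in range(20)}
sample = respondent_driven_sample(network, seeds=[0], target_size=10)
print(sample)
```

Because recruitment follows social ties, this approach can reach populations for which no sampling frame exists, although the resulting sample depends on the structure of the network and on the choice of initial participants.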

International Systems

There are a number of CDC-supported risk factor information systems used internationally, including the Global School-Based Health Survey (GSHS), the Global Adult Tobacco Survey (GATS), and the Global Youth Tobacco Survey (GYTS). However, the focus of this section will be on three separate international efforts to provide prevalence information on major health risk factors, particularly in developing countries where the determinants of premature mortality are divergent for children vs. young adults vs. older adults, acknowledging the ongoing effects of poverty and infectious disease (see Table 18.3) [62].

Table 18.3 Some international risk factor surveillance efforts

Global Burden of Disease, Injuries, and Risk Factors Study

The Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) was commissioned in the early 1990s by the World Bank as the Global Burden of Disease Study, and is now led by the Institute for Health Metrics and Evaluation (IHME) at the University of Washington, in collaboration with Harvard University, Imperial College London, Johns Hopkins University, University of Queensland, University of Tokyo, and the World Health Organization (WHO) [63]. The GBD collects information on 291 diseases and injuries, 67 risk factors, and 1,160 disease sequelae, across 21 regions on all continents except Antarctica, and 20 age groups—using DALYs as a common metric to account for premature mortality as well as the prevalence, duration, and severity of premature morbidity and injury. The GBD was designed to support rapid implementation of a cost-effective data collection system in developing countries, and has been adapted to meet a diverse set of cultural, demographic, and linguistic contexts; categories of risk factors cover a range of public health problems, including: communicable disease, newborn and maternal health, nutrition, non-communicable diseases, and injuries [64].

The IHME hosts innovative applications to encourage the dissemination and use of GBD data: GHDx, a web-based data query system; and GBD Visualizations, a web-based data presentation and analysis tool. GHDx provides links to datasets for international and US risk factor information systems, and allows users to directly query GBD data by country and topic [65]. GBD Visualizations uses a wide variety of interactive charts (Fig. 18.1) to help the user understand and communicate data [66]. The GBD has been used by governments and non-governmental organizations to inform priorities for research, development, policies, and funding [67].

Fig. 18.1 Examples from GBD Visualizations (Source: The Institute for Health Metrics and Evaluation at the University of Washington, 2013. Used with permission)

STEPS

The WHO STEPwise approach to Surveillance (STEPS) is a multi-component survey designed to assess health and nutritional status in WHO member countries. STEPS has standardized questions and protocols to support monitoring trends over time, as well as across-country comparisons [68].

STEPS is a three “step” assessment: (1) Questionnaire; (2) Physical Measurements; (3) Lab tests. The first step is required; the second and third steps are subject to the availability of local resources. The questionnaire includes: a required set of questions related to important risk factors (socio-economic conditions, tobacco and alcohol use, and nutritional status and physical inactivity); an “expanded” set of recommended questions (socio-cultural factors, hypertension, and diabetes topics); and an “optional” set of questions (covering mental health, intentional and unintentional injury and violence, and oral health). The physical measurements include: required measurements (height, weight, waist circumference, and blood pressure); “expanded” measurements (hip circumference and heart rate); and “optional” measurements (skin fold thickness and a physical fitness assessment). The laboratory tests include: required tests (fasting blood sugar and total cholesterol); “expanded” tests (fasting HDL and triglycerides); and “optional” tests (oral glucose tolerance, urine exam, and salivary nicotine metabolites).
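The step-and-tier structure described above lends itself to a simple nested data structure. The sketch below transcribes the items named in the text and assembles the set a site would collect; the function name build_protocol and its parameters are hypothetical and are not part of any WHO tool.

```python
# Content of each STEPS step by tier, transcribed from the description above.
STEPS = {
    "Questionnaire": {
        "required": ["socio-economic conditions", "tobacco use", "alcohol use",
                     "nutritional status", "physical inactivity"],
        "expanded": ["socio-cultural factors", "hypertension", "diabetes"],
        "optional": ["mental health", "injury and violence", "oral health"],
    },
    "Physical measurements": {
        "required": ["height", "weight", "waist circumference", "blood pressure"],
        "expanded": ["hip circumference", "heart rate"],
        "optional": ["skin fold thickness", "physical fitness assessment"],
    },
    "Lab tests": {
        "required": ["fasting blood sugar", "total cholesterol"],
        "expanded": ["fasting HDL", "triglycerides"],
        "optional": ["oral glucose tolerance", "urine exam",
                     "salivary nicotine metabolites"],
    },
}

def build_protocol(extra_steps=(), extra_tiers=()):
    """Return the items a site would collect, keyed by step (hypothetical helper).

    The questionnaire is always included; the other steps are added only when
    local resources allow. Within an included step, the required tier is always
    collected, and the "expanded" and "optional" tiers are added on request.
    """
    unknown = set(extra_steps) - set(STEPS)
    if unknown:
        raise ValueError(f"unknown step(s): {sorted(unknown)}")
    steps = [s for s in STEPS if s == "Questionnaire" or s in extra_steps]
    tiers = ["required"] + [t for t in ("expanded", "optional") if t in extra_tiers]
    return {s: [item for t in tiers for item in STEPS[s][t]] for s in steps}

if __name__ == "__main__":
    print(build_protocol())                                   # minimum: questionnaire only
    print(build_protocol(["Lab tests"], ["expanded"])["Lab tests"])
```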

STEPS data are disseminated via the WHO Global InfoBase, which is a data warehouse that collects, stores and displays health information on chronic diseases and their risk factors for all WHO member states [69].

MEASURE Demographic and Health Surveys (DHS)

The MEASURE Demographic and Health Surveys (DHS) is a project to collect and disseminate nationally representative data on health and population in developing countries. The project, which has conducted 230 surveys in more than 80 countries since 1984, is primarily funded by the United States Agency for International Development (USAID); donors and host countries provide additional funding [70].

There are two main types of DHS Surveys: Standard and Interim. The Standard DHS Surveys are conducted approximately every 5 years, and typically sample between 5,000 and 30,000 households. The Interim DHS Surveys focus on key performance monitoring indicators but may not include data for all impact evaluation measures (such as mortality rates). Interim surveys are conducted between cycles of the Standard survey, and the sample size is typically smaller.

The core questionnaires collect basic demographic and health information. Questions and methods vary among countries: most surveys include women of reproductive age (15–49) and men aged 15–59, whereas in some countries only women are interviewed. Other required questionnaires focus on marriage, fertility, family planning, reproductive health, child health, and HIV/AIDS; some optional questionnaires have focused on domestic violence and maternal mortality [71]. DHS datasets are available on the MEASURE DHS website.

Opportunities in Information Technology

There are numerous efforts underway that use information technology to support the dissemination and use of risk factor data, and there are emerging opportunities for other innovative uses of information technology to augment the capture of risk factor data, and the identification and evaluation of new risk factors.

Integrated Data Dissemination

Examples from the BRFSS, YRBSS/Youth Online, WHO STEPS/Global InfoBase, and GBD/IHME systems have been previously identified for their innovative use of informatics principles to assist in the dissemination and analysis of risk factor data. Presented here are additional noteworthy examples of consolidation or integration of risk factor data from multiple sources.

CDC WONDER

CDC’s Wide-ranging OnLine Data for Epidemiologic Research (WONDER) system is a menu-driven, web-based system that provides access to risk factor information on births, deaths, cancer, HIV/AIDS, tuberculosis, census, and other data that have been collected from other surveillance activities. Users may generate tables, maps, and other data extracts, as well as access relevant publications electronically. An application programming interface (API) has been developed to support automated web service data queries using XML.

CDC WISQARS

CDC’s Web-based Injury Statistics Query and Reporting System (WISQARS) is a menu-driven, web-based system that provides access to information on risk factors for unintentional and violence-related injury in the United States. Fatal and nonfatal injury, violent death, and cost of injury data have been consolidated from several different information systems, including the National Vital Statistics System (NVSS), the National Electronic Injury Surveillance System-All Injury Program (NEISS-AIP), and the National Violent Death Reporting System (NVDRS). Users may generate tables, maps, summary reports, and other data extracts by filtering on a number of variables, including intent of injury, mechanism, affected body region, injury type, geographic location, sex, race/ethnicity, and age.

Health Data Interactive

CDC/NCHS Health Data Interactive is a web-based system that provides users with access to summarized data on a number of different health topics, including risk factors and disease prevention such as cholesterol level, hypertension, overweight/obesity, physical activity, smoking, and vaccinations for influenza and pneumonia. The data are presented as tabular summaries, and the user may filter the tables by several variables, including age, gender, race/ethnicity, and geographic location. Data may also be downloaded directly for external use.

Vital Stats

CDC/NCHS Vital Stats is a web-based system that provides users with access to summarized vital statistics risk factor data for deaths, births, and perinatal mortality. The data are presented as tabular summaries, which may be filtered by several variables; users may also generate graphs and maps, or download files for external analysis.

NCHS Data Linkage Activities

NCHS has an ongoing effort to more fully explore risk factors by linking its population-based health surveys (such as NHIS and NHANES) to other important data sources, such as air monitoring data from the Environmental Protection Agency (EPA) and death certificate records from the National Death Index (NDI). Although no user-driven query system is available, NCHS provides access to public-use and restricted-use data sets for analysis.

Health Indicators Warehouse

The Health Indicators Warehouse is a collaborative effort of agencies of the US Department of Health and Human Services to consolidate access to national, state, and community risk factors and other health indicators. The user may initiate searches by a specific topic (e.g., demographics, disease, disabilities, specific health risk factors), geography, or initiative (e.g., Healthy People 2020, County Health Rankings). The user may also select directly from more than 1,000 specific indicators (such as “Cigarette Smoking: Adults” or “Cholesterol Level: Adults”). The user may access the definition and rationale for the indicators, information about the data source, links to evidence-based interventions, as well as data summarized as tables, graphs, or maps (where appropriate). The warehouse also includes an API to support automated web service data queries using REST and SOAP services.

Emerging Opportunities

Web-based surveys, geographic information systems, and electronic health records are technologies that may have a future role in the capture of risk factor data, the identification of new risk factors, the evaluation of the effectiveness of intervention strategies, and the augmentation of existing data.

Web-Based Surveys

The use of electronic devices that are not connected to the Internet to capture risk factor survey data (interviewer-driven CAPI for NHIS and NHANES, and participant use of handheld computing devices for NHBS) has been previously described in this chapter, and noted for its benefits to the efficiency of data capture and to data validity. However, web-based surveys are not yet widely used to capture risk factor data.

Web-based surveys have a number of benefits over conventional paper or in-person methods, including: electronic data capture; interactivity (including error checking and skip patterns); and rapid updating of survey content to address emerging needs. However, web-based surveys may not be appropriate where Internet connectivity is unavailable, a physical examination or laboratory testing (e.g., NHANES) is required, or the identity of the responder must be confirmed. There are other concerns regarding response rate and the validity of responses in web-based surveys [72].
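The interactivity described above (error checking and skip patterns applied at the time of data entry) can be sketched in a few lines. The questions, identifiers, and allowed ranges below are hypothetical and are not drawn from any of the surveys in this chapter; the responses dictionary stands in for what a participant would type into a web form.

```python
# Hypothetical questionnaire: each item has either a set of allowed choices or a
# numeric range, and may carry a skip rule that depends on earlier answers.
QUESTIONS = [
    {"id": "ever_smoked", "choices": {"yes", "no"}},
    {"id": "cigs_per_day", "range": (0, 100),
     "show_if": lambda answers: answers.get("ever_smoked") == "yes"},
    {"id": "age", "range": (12, 120)},
]

def administer(responses):
    """Apply skip patterns and edit checks to raw responses.

    Returns (answers, errors): validated answers keyed by question id, and the
    ids of questions whose responses were missing or failed validation.
    """
    answers, errors = {}, []
    for q in QUESTIONS:
        show_if = q.get("show_if")
        if show_if is not None and not show_if(answers):
            continue                                  # skip pattern: question not shown
        raw = responses.get(q["id"])
        if "choices" in q:
            if raw in q["choices"]:
                answers[q["id"]] = raw
            else:
                errors.append(q["id"])
            continue
        low, high = q["range"]
        try:
            value = int(raw)
        except (TypeError, ValueError):
            errors.append(q["id"])                    # missing or non-numeric entry
            continue
        if low <= value <= high:
            answers[q["id"]] = value
        else:
            errors.append(q["id"])                    # out-of-range entry
    return answers, errors

if __name__ == "__main__":
    print(administer({"ever_smoked": "no", "cigs_per_day": "20", "age": "17"}))
    print(administer({"ever_smoked": "yes", "cigs_per_day": "500", "age": "17"}))
```

In a live survey the flagged items would be returned to the respondent for correction, which is the main advantage over applying edit checks during analysis.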

The YRBSS has traditionally administered surveys with paper-and-pencil, with forms being collected and stored for electronic scanning in bulk, and edit checks applied during analysis. In a study comparing administration of the YRBS survey as paper-and-pencil vs. web-based mode, results indicated that prevalence estimates from paper-and-pencil and web-based surveys are generally equivalent [73]. Although this has the potential to streamline data collection, and enforce data validation at the time of the survey, additional study is required to determine the effect of technology-specific issues such as screen size and resolution before web-based surveys can be used in unmonitored settings.
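One simple way to judge whether two administration modes give similar prevalence estimates is to examine the difference between the two proportions and its confidence interval. The sketch below is not the method of the cited study: it uses a basic Wald interval, ignores the complex sampling design that a real analysis would need to account for, and the counts are hypothetical.

```python
import math

def prevalence_difference(x1, n1, x2, n2, z=1.96):
    """Difference in prevalence (mode 1 minus mode 2) with a Wald confidence interval.

    x1, x2: respondents reporting the behavior; n1, n2: respondents per mode.
    z=1.96 gives an approximate 95% interval under simple random sampling.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return diff, diff - z * se, diff + z * se

if __name__ == "__main__":
    # Hypothetical counts: 200 of 1,000 paper respondents and 190 of 1,000 web
    # respondents report a given behavior.
    diff, low, high = prevalence_difference(200, 1000, 190, 1000)
    print(f"difference = {diff:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

An interval that spans zero is consistent with equivalent estimates, although a formal equivalence claim would also require a margin specified in advance.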

The effect of web-based surveys on response rates appears to be mixed. Generally, web-based survey response rates are lower than with paper-based surveys [74, 75]. However, in the specific case of assessing the risk behaviors in a college population, the response rates did not differ and students were more likely to answer socially-threatening items on a web-based survey [76]. Further study is required to determine whether this effect on response rate is specific to participant age, or the subject matter, or is an effect that will extinguish over time as the aging demographic becomes more technology-savvy.

Geographic Information Systems

Geographic information system (GIS) technology is a well-established tool for public health communication and analysis. The use of maps to present risk factor surveillance data has been highlighted in selected systems in this chapter, although the presentation of results at small geographic levels may be limited by the specificity of the geographic data collected (if any) or the representativeness of the smaller corresponding sample size.

GIS may also be a valuable tool for identifying and evaluating risk factors for disease (particularly those related to environmental exposures), and targeting interventions or public health policy. For example, GIS is commonly used to assess risk for lead exposure, and to evaluate screening programs. Lead screening programs have typically targeted high-risk populations by risk markers such as older housing and poverty. Detailed capture of geographic information as part of household surveillance can further refine targeted screening and validate risk-factor-based prediction rules [77], while also identifying unexpected clusters and potential new sources [78]; policies to remediate lead hazards can then be implemented and their outcomes evaluated [79]. GIS was used in another study to establish that living in a residence with more nearby traffic increased the risk of childhood asthma; this has potential implications for targeting asthma screening and education programs, as well as issues of vehicular emissions and urban planning [80].
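A proximity-based exposure measure of the kind used in such analyses can be sketched with a great-circle distance calculation. The example below is an illustration only and does not reproduce the method of the cited studies: the coordinates, traffic counts, and 150 m radius are hypothetical, and a full GIS would typically work with road geometries rather than isolated points.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def traffic_within(residence, road_points, radius_m=150):
    """Sum the daily vehicle counts of road points within radius_m of a residence.

    residence: (lat, lon); road_points: iterable of (lat, lon, vehicles_per_day).
    """
    lat, lon = residence
    return sum(count for rlat, rlon, count in road_points
               if haversine_m(lat, lon, rlat, rlon) <= radius_m)

if __name__ == "__main__":
    home = (47.6500, -122.3000)               # hypothetical residence
    roads = [(47.6505, -122.3000, 12_000),    # roughly 56 m north of the residence
             (47.6600, -122.3000, 40_000)]    # roughly 1.1 km north of the residence
    print(traffic_within(home, roads))
```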

Electronic Health Records

With the increasing adoption of electronic health records (EHRs) [81], and the collection of specific clinical quality measures (CQMs) that support the “meaningful use” of EHRs, there is a new opportunity to conduct surveillance of risk factors in populations. Although the future use of risk factor data from EHRs is not well understood, the potential availability of these data may facilitate determination of prevalence rates, and help evaluate the outcomes of individually-targeted interventions for specific risk factors, such as tobacco use and cessation [82]. In addition, the use of data mining and analytic techniques on EHR data has the potential to permit inferences about new risk factors that have not previously been identified [83].

In an example from the University of Wisconsin, EHR data were linked with community-level data to describe asthma and diabetes prevalence and health care quality, for individual patients and the community at large, suggesting potential future use in assessing health status and outcomes [84]. There are a number of current limitations, including few instances of direct access to EHR data for public health use, and concerns about the quality and representativeness of EHR data; however, once available, the sheer volume of clinical data may allow for selective sampling and may make risk factor estimates reliable for smaller geographic levels than is possible using traditional survey methods [85]. Further investigation will be needed to determine the reliability and validity of objective physical measures (such as height and weight), as well as the degree of standardization of responses about risk behaviors (such as smoking and exercise) across many EHR vendors.

There are two BRFSS demonstration projects underway to evaluate the potential use of EHRs to conduct behavioral risk factor surveillance. In the first project, consenting patients will be surveyed and their responses will be linked to their respective electronic health records, to create an anonymized data set containing patient survey data. Researchers will then compare individual survey responses to the corresponding EHR data to evaluate their validity and reliability for monitoring population health. The second project will use simulated patient data to test analytic tools that summarize self-reported data collected from web-based surveys and compare them statistically with EHR data. These demonstration projects are expected to be completed by the end of 2013 [86].
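Comparing a patient's survey response with the corresponding EHR entry is, in statistical terms, an agreement problem. The source does not state which statistics the demonstration projects use; the sketch below shows one common choice, Cohen's kappa, which corrects observed agreement for the agreement expected by chance. The paired smoking-status data are hypothetical.

```python
from collections import Counter

def cohens_kappa(pairs):
    """Cohen's kappa for paired categorical ratings.

    pairs: list of (survey_value, ehr_value) tuples, one per linked patient.
    """
    n = len(pairs)
    if n == 0:
        raise ValueError("at least one pair is required")
    observed = sum(1 for s, e in pairs if s == e) / n
    survey_counts = Counter(s for s, _ in pairs)
    ehr_counts = Counter(e for _, e in pairs)
    # Chance agreement: probability both sources give the same category by chance.
    expected = sum(survey_counts[c] * ehr_counts[c]
                   for c in set(survey_counts) | set(ehr_counts)) / (n * n)
    if expected == 1.0:
        return 1.0  # only one category was ever used, so agreement is trivially complete
    return (observed - expected) / (1 - expected)

if __name__ == "__main__":
    # Hypothetical linked records: (self-reported smoker?, smoker per EHR?)
    linked = ([("yes", "yes")] * 40 + [("yes", "no")] * 10
              + [("no", "yes")] * 5 + [("no", "no")] * 45)
    print(round(cohens_kappa(linked), 3))
```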

Conclusion

Risk factor information systems are a relatively new tool used in the prevention of premature injury, disability, disease and death. They are used on a national and international scale, as well as at the community level and in special populations. Recent efforts in data dissemination (e.g., Health Indicators Warehouse) and presentation (e.g., GBD Visualizations) should facilitate the analysis, understanding, and use of risk factor data. Innovative use of geographic information systems has been effective in identifying risk factors in communities, and in evaluating outcomes of disease programs, and the use of clinical data from electronic health records may increase the efficiency with which interventions can be targeted and evaluated.

Review Questions

  1. Explain how risk factor information systems complement vital statistics systems and primary scientific research. What has driven the need for risk factor information systems in the last century?

  2. What are the basic components of a risk factor information system? Why do the methods of data collection vary for different risk factor information systems?

  3. What are some similarities and differences among the following behavioral risk factor surveillance systems: the National Health Interview Survey (NHIS), the National Health and Nutrition Examination Survey (NHANES), and the Behavioral Risk Factor Surveillance System (BRFSS)?

  4. Why are separate risk factor information systems needed for special populations?

  5. What are some similarities and differences among prominent international risk factor information systems?

  6. Explain how innovations in information technology may affect the use of risk factor data and knowledge. How may electronic health records and other information technologies augment the collection of risk factor data?