Problem statement

A central goal of behavioral medicine is to support better health through behavior change and maintenance. Human behavior is highly dynamic, multi-factorial in origin, and affected by interactions between individuals and contextual factors (Kelder et al., 2015), thus making behavior change a complex scientific problem. Matching this level of complexity requires robust strategies for organizing and effectively curating scientific knowledge to enable aggregation and comparison of findings across research studies (Chorpita et al., 2011; Weisz et al., 2014). At present, effective knowledge accumulation is stymied by the lack of shared terms and labels. For example, target behaviors such as taking medications, “healthy” eating, and engaging in physical activity are measured in a variety of different ways, which may or may not all correspond to the same core phenomena (e.g., self-reported physical activity is often only weakly correlated with objective physical activity measurements). This problem of the mixing of terms and labels is common in the behavioral literature. As an example, the same label might be used for different constructs (e.g., objective and self-report physical activity are both labeled as physical activity, but arguably represent different constructs). On the other hand, different labels might be used for what could feasibly be the same construct (e.g., self efficacy vs. perceived behavioral control vs. locus of control). This lack of common terms and shared definitions for interventions, mechanisms of action, outcome measures, target individuals and context, renders the aggregation of knowledge across behavioral science difficult. We believe that such an aggregation is necessary to properly study the complexity of human behavior.

This lack of common terms is evident when examining the proliferation of theories and concepts in the literature. For example, a multidisciplinary literature review of theories of behavior change, with strict inclusion criteria in relation to theory and to behavior, identified 83 theories with a total of 1725 component constructs (Davis et al., 2015). The theories included in this review consisted of a mean of 21 constructs (range 5–91), thus suggesting a wide range of hypothesized constructs and inter-relationships. These theories tend to be overlapping and underspecified: often they share constructs with other theories, use different names for the same constructs, measure the same constructs using differing items, and inadequately define constructs and relationships. Based on the construct labels and definitions, it appeared that researchers were using different terms to refer to the same phenomenon, and vice versa. Only three of the theories in this review set out to be integrative or unifying, a point that will be returned to later.

These inconsistencies restrict the potential for advancing theory evaluation and development, in behavioral science more generally, as well as its application in behavioral medicine. They limit the precise specification of theoretical constructs, how they are measured, and the relationships between behavior change techniques and their mechanisms of action (one form of theoretical construct; see Table 1 for key terms used in this article). This in turn limits our capacity to efficiently integrate and summarize available evidence into evidence-based theories and to apply those theories to design and disseminate interventions to change health-related behaviors.

Table 1 Terminology overview

Many fields in science have confronted these problems. One strategy with considerable potential, used in several fields to support efficient knowledge accumulation, is what information science calls an “ontology.” Conceptually, an ontology is a systematic method for carefully articulating the inter-relationships between classes of carefully defined “things” or phenomena we care about (e.g., intervention components, theoretical constructs) (Weber, 2012). Ontologies provide a mechanism to support efficient knowledge accumulation into “knowledge bases,” which are databases of data tagged as belonging to ontology classes. By far the most widely used and successful example of an ontology in science is the Gene Ontology, which began in 1998 (The Gene Ontology Consortium, 2015a, b). The Gene Ontology provides a standardized means of defining the classes (and their inter-relationships) in the domain of gene products and their associated biological processes, cellular components, and molecular functions. Using the Gene Ontology, investigators can annotate new or existing data and add it to the knowledge base (e.g., Gene product X has molecular function Y) in a standardized, computer readable format. To date, the Gene Ontology has been used to annotate the data from more than 100,000 peer reviewed scientific publications. This aggregated knowledge base would not be possible without the Gene Ontology and provides the foundation from which investigators can query the current state of knowledge (e.g., find all the gene products in the mouse genome involved in signal transduction), examine the data for relationships that are only possible because the Gene Ontology allows for integration of many different datasets (e.g., look for similarities in genes that change in their expression as people age), and even develop novel hypotheses (e.g., use the existing data and the computability of the ontology to predict the likelihood that a particular gene is involved in a disease) (The Gene Ontology Consortium, 2015a, b).

The example of the Gene Ontology highlights the potential of ontologies to move behavioral science forward from a domain in which each laboratory or school of thought is siloed and its data are incompatible with others, to one in which the existing evidence is integrated (therefore both larger and richer), searchable, and amenable to computer-based reasoning that can discover new relationships, develop novel hypotheses, and expose gaps in the evidence. Work has begun to specify an ontology of behavior change interventions, including linked ontologies of characteristics such as behavior change techniques, modes of delivery and target behaviors, which will be introduced later in the article.

While, conceptually, there is great promise for ontologies in behavioral medicine, at present the behavioral medicine community is largely unaware of ontologies. The purpose of this paper is therefore three-fold: (1) to introduce the behavioral medicine community to ontologies, their potential utility, and the steps needed to create an ontology; (2) to draw attention to current research that is beginning to populate ontologies related to behavior change interventions, including specification of behaviors, behavior change techniques, mechanisms of action, and links between behavior change techniques and mechanisms of action; and (3) to suggest next steps to support the development and use of ontologies for organizing knowledge about behavior and behavior change. Table 1 contains a guide to terms used in this article.

What is an ontology?

In philosophy, the term ‘ontology’ refers to the branch of metaphysics that studies the kinds of things that can exist and their relations to each other. In Information Science the meaning of ontology is similar but more specific and practical: ontologies are information structures that define abstract classes and the relations between those classes. This is in contrast to specific operational definitions, such as a measure of an individual’s self-efficacy to quit smoking or a description of the setting where the study took place. Such operational definitions are typically contained in a knowledge base. The section How can ontologies support efficient knowledge accumulation? below describes the reciprocal relationship between ontologies and knowledge bases.

Controlled vocabularies are collections of preferred terms that are used to promote consistent description and retrieval of data, and may include definitions for both abstract classes (e.g., a definition of a general “theoretical construct”) and more specific classes (e.g., a definition of self-efficacy as a particular theoretical construct). Controlled vocabularies often provide listings of synonyms and antonyms for the defined terms (Arp et al., 2015). The development of controlled vocabularies is usually consensus based and generally requires a group of individuals to review and refine definitions of the terms included to reflect changes in the domain over time. Once developed, controlled vocabularies can be used to annotate or “tag” information. For example, the Medical Subject Headings (MeSH) terms used in PubMed form a controlled vocabulary that allows annotation of the biomedical scientific literature (U.S. National Library of Medicine, 1999).

The terms in a controlled vocabulary are usually structured as a hierarchical taxonomy, which specifies “parent–child” relationships between higher level and lower level classes. For example, MeSH terms are a controlled vocabulary organized into a taxonomy. The representation of the construct self-efficacy within MeSH demonstrates a hierarchical taxonomy: MeSH\psychiatry and psychology\behavior and behavior mechanisms\personality\personality development\self-concept\self-efficacy (Lowe & Barnett, 1994).
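As a minimal sketch of what “hierarchical taxonomy” means computationally, the MeSH path quoted above can be stored as a mapping from each class to its single parent. The function names are hypothetical and chosen only for illustration; this is not how MeSH itself is distributed or queried.

```python
# Each class has exactly one parent (a taxonomy); the root's parent is None.
# The chain below reproduces the MeSH path quoted in the text.
PARENT = {
    "MeSH": None,
    "psychiatry and psychology": "MeSH",
    "behavior and behavior mechanisms": "psychiatry and psychology",
    "personality": "behavior and behavior mechanisms",
    "personality development": "personality",
    "self-concept": "personality development",
    "self-efficacy": "self-concept",
}


def path_to_root(term, parent=PARENT):
    """Return the list of classes from the root down to `term`."""
    chain = []
    while term is not None:
        chain.append(term)
        term = parent[term]
    return list(reversed(chain))


def is_descendant(term, ancestor, parent=PARENT):
    """True if `ancestor` sits strictly above `term` in the hierarchy."""
    return ancestor in path_to_root(term, parent)[:-1]


# Prints the backslash-separated path in the same style as the text.
print("\\".join(path_to_root("self-efficacy")))
```

Because every class has one parent, a single dictionary lookup per level is enough to answer questions such as whether self-efficacy falls under personality.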

Ontologies build on the logic of taxonomies but they are more flexible as they allow more than one type of relationship. For example, in a taxonomy, the parent–child relationship could be expressed as “self-efficacy is_an expressed belief” but in ontologies other relationships could also exist (e.g., self-efficacy is_measured_by self-report). This highlights a critical distinction between a taxonomy and an ontology: taxonomies define single parent–child relationships, whereas ontologies allow individual classes to have relations to more than one other class, and these relations may be either the parent–child (is_a) relationship or a variety of other types of relations (such as is_measured_by and is_part_of).
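The distinction can be illustrated with a small sketch in which an ontology is held as a set of typed (subject, relation, object) triples rather than a single parent pointer. The first two triples come from the examples in the text; the is_part_of triple and all function names are hypothetical, added only to show a third relation type.

```python
from collections import defaultdict

# (subject, relation, object) triples.
TRIPLES = [
    ("self-efficacy", "is_a", "expressed belief"),        # from the text
    ("self-efficacy", "is_measured_by", "self-report"),   # from the text
    ("self-efficacy", "is_part_of", "hypothetical theory X"),  # illustrative only
]


def build_index(triples):
    """Index triples as subject -> relation -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def related(index, subject, relation):
    """Objects linked to `subject` by `relation` (empty list if none)."""
    return sorted(index.get(subject, {}).get(relation, set()))


def relation_types(index, subject):
    """All relation types asserted for `subject`."""
    return sorted(index.get(subject, {}))


index = build_index(TRIPLES)
print(related(index, "self-efficacy", "is_measured_by"))
print(relation_types(index, "self-efficacy"))
```

A taxonomy is the special case in which only is_a triples are present and each subject has exactly one of them.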

Ideally, ontologies are codified into a computer readable format, enabling computers to “understand” the link between the various classes as well as the language used to define a class. This, for example, enables various Natural Language Processing approaches to use the ontology (along with its list of synonyms and antonyms) to detect instances of classes in academic papers. A commonly used ontology format is the Web Ontology Language (OWL) (McGuinness & Van Harmelen, 2004) and a system commonly used for encoding an ontology in OWL is Protégé, an open-source ontology editor (Gennari et al., 2003). Once developed, ontologies can be shared via publicly available portals such as http://bioportal.bioontology.org.

Ontologies can include different levels of abstraction. For example, “Basic Formal Ontology” is an “upper level ontology” describing abstract classes that hold across all reality (i.e., they are not specific to one scientific domain like biology). To accomplish this, Basic Formal Ontology first divides reality into two very broad classes: things that continue or persist in time (continuants) and entities that occur or happen (occurrents), which could be thought of less formally as “things” and “processes” (Arp et al., 2015). An ontology of behavior change interventions would exist below the higher-level Basic Formal Ontology. Within an ontology, the linkages between high-level abstractions (e.g., continuants and occurrents) and specified sub-domains (e.g., behavior) are formalized. This formalization enables knowledge integration across scientific domains (a point we return to in section What are the next steps in ontologies related to behavior change interventions?).

How are ontologies created?

Ontology development is an incremental and iterative process that can be divided into roughly five steps. We illustrate these with examples related to behavior change interventions.

  1. Decide on the scope of the ontology.

  2. Develop a controlled vocabulary of classes and their properties.

  3. Develop a taxonomy that defines the parent–child relationships between classes.

  4. Expand on the single parent–child relationships described in the taxonomy to define all relevant relationships between the various classes, including the parent–child relationships described by the taxonomy.

  5. Codify the ontology in a computer readable format.

Step 1 in ontology development is to determine the scope of the proposed ontology, or stated simply, to determine what the ontology will describe and what it will not. The scope of the ontology determines the scope of the underlying controlled vocabulary. The process of scoping is often consensus-based and iterative. For example, an ongoing effort by Larsen, Michie, West, and colleagues to create an ontology of behaviors (hereafter, the behavior ontology) (Larsen et al., 2015), was designed to be interdisciplinary and to tie in with behavioral theory efforts. It started with an examination of top journals as determined by disciplinary rankings and interviews in ten disciplines (behavioral medicine, nursing, psychology, sociology, education, communication, management information systems, marketing, management, and economics) to ensure wide coverage of behaviors (criterion 1), scoped to include all articles in the selected years (5–10 years, with an eye to balancing the influence of each discipline, because some disciplines published a large set of articles monthly whereas others published a few articles quarterly) (criterion 2) that contained at least one construct and one behavior to ensure applicability to behavioral theories (criterion 3). The behavior had to be the dependent variable of the study to ensure that the behavior was of practical interest (criterion 4). Behaviors were defined as only those variables that are observable (criterion 5), non-autonomic (criterion 6), did not require a rating beyond whether it occurred or not (criterion 7), were naturally occurring (criterion 8), at the individual level (criterion 9), and that did not act as proxies for other variables (criterion 10).

Step 2 in the development of an ontology is to create a controlled vocabulary. There are several complementary ongoing efforts focused on developing controlled vocabularies related to behavior change, including the Behavior Ontology and the Behavior Change Techniques (BCT) Taxonomy v1. The need for such an approach to standardize language and concepts is evident from the very rapid uptake of a taxonomy of behavior change techniques (see Michie et al., 2015b). Within a few years, this work has received >7500 citations, including >400 citations within 2 years of publication of the 2013 paper reporting the taxonomy BCTTv1 (Michie et al., 2013, 2015b).

An important step when developing a controlled vocabulary is to carefully define synonymous and antonymous classes and, ideally, to build on previous controlled vocabularies/ontologies whenever possible. For the behavior ontology, the existing controlled vocabulary (and hierarchical structure, discussed next) from the International Classification of Functioning, Disability, and Health (ICF) (World Health Organization; WHO, 2013), endorsed by all WHO member States in 2001, was adopted. Each class (including mid-level classes) was defined, using the U.S. National Cancer Institute (NCI) Thesaurus for definitions when possible to enable cross-ontology links in the future, and using the Oxford English Dictionary when the NCI Thesaurus did not contain the term in question. The behaviors were fitted into the existing hierarchy when possible. The hierarchy was extended when needed, and definitions were examined in an iterative fashion.

When a suitable controlled vocabulary upon which to build does not exist, a logical strategy for developing one is via consensus. The BCT Taxonomy v1 is a good example of the sort of consensus-based work required to develop a controlled vocabulary. It was developed over 3 years with 400 behavior change experts from 12 countries, overseen by an International Advisory Board of 41 experts (Michie et al., 2014). Comprising 93 BCTs with clear and distinct labels, definitions, and examples, it provides a set of terms that can be used to better describe behavior change interventions across research studies. For example, one class of BCTs is labeled “goals and planning”. Within goals and planning there is the specific class of techniques labeled “action planning”, which is defined as: “prompt detailed planning of performance of the behavior (must include at least one of context, frequency, duration, and intensity). Context may be environmental (physical or social) or internal (physical, emotional, or cognitive). Within action plans the synonym concept of implementation intentions is included.”

Step 3 is the creation of the hierarchical structure for the controlled vocabularies. For the behavior ontology, ICF’s classes of “D: Activities and participation” and (to a lesser degree) “E: Environmental factors” were used. When possible, the ICF hierarchy was maintained, but as the goals of the ICF taxonomy and the behavior ontology were not identical, extensions were created. A part of the draft ontology is shown in Fig. 1. The numbering scheme follows ICF, and additions to ICF are shown with a period followed by lower-case letters (for example, D5702.d Protective actions).

Fig. 1 Behavior ontology’s taxonomy (partial view)

For the BCT Taxonomy v1, there was not a hierarchical structure to build on in the literature. The BCT Taxonomy v1 was hierarchically organized into 16 classes according to how they were conceptualized by users, to promote ease of use (Cane et al., 2015). In hierarchical form, the BCT Taxonomy v1 has at least three levels. For example, the class named “Goals and planning” includes nine BCTs, one of which is “Goal setting (behavior).” This, in turn has been deconstructed into lower-level components such as prompting goal-setting and setting a goal within an appropriate time-frame (Lorencatto et al., 2015).

Step 4 is the establishment of relevant inter-relationships between key classes. In the behavior ontology, three types of non-hierarchical relationships are specified: (1) reverse_of, (2) behavior_links, and (3) dimension_of. Reverse_of addresses cases where two behaviors, often in different parts of the hierarchy, are opposites. For example, “D469.a Physical activity” was coded as the reverse_of “D5701.b1 Resting.” Behavior_links covers behaviors that also often exist in different parts of the hierarchy but could be considered synonyms in certain circumstances, such as “D110.a Watching TV” and “D469.b Sedentary activity.” The third relationship type, dimension_of, was reserved for behaviors that might be indistinguishable from a higher class behavior to a casual observer, but with important differences for some researchers. For example, the behavior “D166. Reading” is important in itself for some researchers as it leads to thinking and learning. For others, the material being read is important, such as the reading of religious/spiritual text for religious scholars, “D910.b2 Reading spiritual text.”
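The three relationship types can be sketched as typed triples over the example codes given above. Two points in this sketch are assumptions rather than details taken from the behavior ontology: that reverse_of and behavior_links hold in both directions, and that dimension_of points from the more specific behavior to the higher class. The function names are hypothetical.

```python
# Assumption: these two relation types are symmetric; dimension_of is directed.
SYMMETRIC = {"reverse_of", "behavior_links"}

# Example pairs taken from the text, with an assumed direction for dimension_of.
ASSERTED = [
    ("D469.a Physical activity", "reverse_of", "D5701.b1 Resting"),
    ("D110.a Watching TV", "behavior_links", "D469.b Sedentary activity"),
    ("D910.b2 Reading spiritual text", "dimension_of", "D166. Reading"),
]


def expand(asserted, symmetric=SYMMETRIC):
    """Return the asserted triples plus the mirror image of symmetric ones."""
    triples = set(asserted)
    for subject, relation, obj in asserted:
        if relation in symmetric:
            triples.add((obj, relation, subject))
    return triples


def lookup(triples, subject, relation):
    """Objects linked to `subject` by `relation`."""
    return sorted(o for s, r, o in triples if s == subject and r == relation)


triples = expand(ASSERTED)
print(lookup(triples, "D5701.b1 Resting", "reverse_of"))
```

Storing only one direction and deriving the other keeps the curated list short while still letting a query start from either behavior.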

Step 5 involves codification of the ontology into a computer readable format such as the Web Ontology Language (OWL). This last step then enables utilization of the ontology. To date, no ontologies directly relevant to behavior change interventions have been encoded in OWL or other formats. However, the Nursing Care Coordination Ontology at the NIH Bioportal addresses the coordination, people, places, and problems of delivering care and thus might also be a logical starting point for work on an ontology of context for behavior change interventions.
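To give a sense of what codification involves, the sketch below writes a two-class hierarchy as OWL class declarations in Turtle syntax using only the Python standard library. The namespace http://example.org/behavior-ontology# and the helper names are placeholders invented for this illustration; in practice an editor such as Protégé would normally produce the file, and labels containing quotation marks would need escaping that this sketch omits.

```python
# Standard OWL and RDFS namespaces; the default prefix is a placeholder.
PREFIXES = """\
@prefix : <http://example.org/behavior-ontology#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
"""


def local_name(label):
    """Turn a human-readable label into a CamelCase local name."""
    return "".join(word.capitalize() for word in label.replace("-", " ").split())


def class_to_turtle(label, parent=None):
    """Declare one OWL class, optionally as a subclass of `parent`."""
    lines = [f":{local_name(label)} a owl:Class ;"]
    if parent is not None:
        lines.append(f"    rdfs:subClassOf :{local_name(parent)} ;")
    lines.append(f'    rdfs:label "{label}" .')
    return "\n".join(lines)


def to_turtle(parent_of):
    """Serialize a {class label: parent label or None} mapping."""
    blocks = [class_to_turtle(c, p) for c, p in parent_of.items()]
    return PREFIXES + "\n" + "\n\n".join(blocks) + "\n"


doc = to_turtle({"expressed belief": None, "self-efficacy": "expressed belief"})
print(doc)
```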

How can ontologies support efficient knowledge accumulation?

As the Gene Ontology example illustrates, there are great opportunities for more efficient knowledge accumulation when ontologies and knowledge bases are used in tandem. We highlight two opportunities particularly relevant for behavioral science: (1) improving class definitions and comparability of operational definitions and (2) facilitating automatic meta-analysis.

Improving class definitions and comparability of operational definitions

In section How are ontologies created?, we outlined expert-consensus strategies for generating a controlled vocabulary. However, even with well-conducted expert-consensus work, there will be relevant construct labels that are not included.

A set of tools in computer and information sciences that could be useful for improving class definitions and comparability of concrete operational definitions fits under the rubric of “Natural Language Processing.” Natural Language Processing (NLP) involves techniques that grew out of Psychology and Linguistics, such as Pennebaker’s classic work exploring how different words are associated with a variety of psychological phenomena (2001) and also work from information science (e.g., Deerwester et al., 1990; Wong et al., 2012). Natural Language Processing often involves the quantification of words, phrases, and syntax of language to create meaningful clusters that can facilitate comparison.

Larsen and colleagues have been using Natural Language Processing techniques to better specify abstract classes and their inter-relationships. For example, Larsen and Bong (2016) apply Natural Language Processing to the words, phrases, and syntax of specific items used in self-report scales in order to examine whether two measures with the same construct label actually belong in different classes, an issue classically labeled the jingle fallacy (Thorndike, 1904). As a complement, these same techniques enable the examination of whether two measures with different construct labels actually belong in the same class, classically labeled the jangle fallacy (Kelley, 1927).

For example, take three items, one from each of three constructs in the http://theorizeit.org knowledge base, each item purporting to measure a specific latent construct and each reported by a different paper. A researcher would generally be expected to find these constructs in the literature through searches for the construct’s author-given label. The three items are as follows (author-given labels in parentheses):

  1. “I would find the system useful in my job” (Usefulness).

  2. “Using the system would improve my job performance” (Performance Expectancy).

  3. “In general, information that is useful in your purchase decision is important” (Usefulness).

Given standard literature search approaches, a researcher looking for research on the usefulness construct would likely retrieve the papers containing sample items 1 and 3 because of the names given their constructs. With recent tools, such as the Construct Identity Detector (CID; Larsen & Bong, 2016), all item pair similarities may be examined empirically without the need for respondent data. These similarity scores are often restricted to the 0–1 range, with 0 indicating no similarity and 1 indicating perfect similarity. Items 1 and 2, labeled usefulness and performance expectancy, turn out to have a high similarity score (0.76), suggesting a possibility that they are measuring the same thing (if so, an instance of the jangle fallacy) in spite of using different labels, whereas item 3, labeled usefulness, has a low similarity score with both item 1 and item 2 (0.08 and 0.04, respectively), suggesting that the two constructs labeled usefulness are unlikely to be measuring the same phenomenon (an instance of the jingle fallacy). Additional theoretical and empirical work is required to specify more clearly the degree of “similarity”, or lack thereof, at which items should be treated as measuring the same concept vs. different concepts. On the theoretical side, this will be partially dictated by the concept itself, as more general concepts (e.g., motivation) will likely have many plausible variations that are encapsulated in the concept, whereas more focused concepts (e.g., performance expectancy) might be specified well enough to enable more thoughtful use of empirical “similarity” scores such as the one described above. Assuming the discussion is focused on more specified concepts, this problem is analogous to that faced when attempting to determine construct membership based on survey data. This question remains unsettled as it will depend on which items are admitted into a principal components analysis (Kolenikov & Angeles, 2009), as well as the validity and desired reliability of the survey measures (Litwin, 1995). While it has taken a century to gather the current level of understanding of item belongingness, faster work will be required for CID measures. In the meantime, being able to detect high-similarity items is important when creating ontologies as these are the items most likely to belong to the same class, and can therefore be presented to the researchers first to save effort.
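The general idea of scoring item pairs without respondent data can be sketched with a plain bag-of-words cosine similarity, which also falls in the 0–1 range. This is a deliberately simple stand-in and not the CID algorithm, whose internals are not described here; its scores are not comparable to the CID values reported above, and the function names are hypothetical.

```python
import math
import re
from collections import Counter

# The three example items quoted in the text.
ITEMS = {
    1: "I would find the system useful in my job",
    2: "Using the system would improve my job performance",
    3: "In general, information that is useful in your purchase decision is important",
}


def bag_of_words(text):
    """Lower-case the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between the word-count vectors of two strings."""
    va, vb = bag_of_words(a), bag_of_words(b)
    dot = sum(count * vb[word] for word, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


for i, j in [(1, 2), (1, 3), (2, 3)]:
    print(f"items {i} and {j}: {cosine(ITEMS[i], ITEMS[j]):.2f}")
```

Even this crude measure ranks items 1 and 2 as the most similar pair, in line with the qualitative pattern described above. A simple word-overlap score cannot capture synonymy, however, which is one reason purpose-built tools rely on richer language models.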

As these examples illustrate, Natural Language Processing supports comparison of operational definitions to examine similarities and differences. While helpful for the operational definitions themselves, it is also valuable for refining the controlled vocabulary for an ontology, because it identifies operational definitions that share a label but differ in content (e.g., the two usefulness items) as well as operational definitions that are labeled as different classes but are actually the same (i.e., the first usefulness item and the performance expectancy item).

Facilitating automated meta-analysis

Natural Language Processing techniques can also be used to extract other relevant information from studies to enable meta-analytic work and, indeed, enable comparison of different plausible views of how to organize and structure knowledge. For example, Staunton et al. (2014) examined the relationship between perceived control, motivation, and oral self-care. Natural Language Processing algorithms can be developed that extract information and link the operational definitions (e.g., measurement items) used in Staunton et al. (2014) to the classes defined in an ontology (e.g., classes from the behavior ontology). As specific items are used, Natural Language Processing can be used to link those items to different labeling structures that these operational definitions could plausibly represent. For example, these same items could be organized into a hierarchical structure that is implied by the theory of planned behavior (using the label perceived behavioral control). Using the strategies described in section Improving class definitions and comparability of operational definitions, Staunton’s construct labeled perceived control could be linked to a higher order class.

An ontology is conceptually able to accommodate any subtle differences in definitions between concepts from different theoretical traditions while still enabling a higher order class to be specified that would acknowledge the similarities between the concepts. For example, previous research has been conducted to define classes of mechanisms of action, derived from synthesizing constructs from theories of behavior change (Michie et al., 2005; Cane et al., 2012). The result, the Theoretical Domains Framework, comprises classes (termed domains in the framework) of factors that mediate how a behavior change intervention produces a desired behavioral outcome (Cane et al., 2012). Using this classification strategy of mechanisms of action, the construct of perceived control could be linked to the higher class, beliefs about capabilities. The construct labeled oral self-care (i.e., follow-up flossing) could be linked to the behavior ontology, as an instance under the flossing class (e.g., Behavior/Activities and participation/Self-care/Caring for body parts/Caring for teeth/Oral care/flossing), therefore providing additional knowledge that Staunton’s paper was (in part) also about self-care. The full correlation table of estimated effect sizes of the relationships between classes could be extracted using Natural Language Processing techniques and incorporated into a knowledge base that has extracted similar details from other studies. For example, Staunton provides 21 correlations via inclusion of other variables like age, gender, and baseline flossing. In this way, seemingly disparate operational definitions, if carefully compared with Natural Language Processing, linked to higher level abstract classes within an ontology, and accompanied by extracted effect size information, could be used to support meta-analysis. More details about this process are delineated in Wong et al. (2012) and Li and Larsen (2011).
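A rough sketch of such a knowledge base follows: each extracted correlation keeps its author-given labels, and a separate mapping links labels to ontology classes so that effects can be retrieved by class rather than by label. All study names, correlations, and sample sizes are invented placeholders, and the mapping of perceived behavioral control to beliefs about capabilities is an assumption made for illustration.

```python
from dataclasses import dataclass

# Author-given label -> ontology class.
LABEL_TO_CLASS = {
    "perceived control": "beliefs about capabilities",             # from the text
    "perceived behavioral control": "beliefs about capabilities",  # assumed mapping
    "oral self-care": "flossing",                                  # from the text
    "flossing": "flossing",
}


@dataclass(frozen=True)
class Effect:
    study: str
    label_x: str
    label_y: str
    r: float  # correlation between the two measures
    n: int    # sample size


# Entirely hypothetical records, for illustration only.
KNOWLEDGE_BASE = [
    Effect("hypothetical study A", "perceived control", "oral self-care", 0.30, 120),
    Effect("hypothetical study B", "perceived behavioral control", "flossing", 0.25, 200),
    Effect("hypothetical study C", "age", "flossing", 0.05, 150),
]


def effects_between(kb, class_x, class_y, mapping=LABEL_TO_CLASS):
    """Select effects whose two labels map onto the requested classes, in either order."""
    wanted = {class_x, class_y}
    return [
        e for e in kb
        if {mapping.get(e.label_x), mapping.get(e.label_y)} == wanted
    ]


hits = effects_between(KNOWLEDGE_BASE, "beliefs about capabilities", "flossing")
print([e.study for e in hits])
```

A label-based search for perceived control would find only the first record; querying by class also retrieves the second, which uses a different label for what the mapping treats as the same class.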

These tools and the resulting knowledge base, when complemented with an ontology, facilitate automated meta-analysis. The typical meta-analysis operates according to a “top-down” approach, starting with particular research questions and constructs in mind. Next, a literature search takes place to locate relevant findings. While comprehensive, existing approaches for summarizing evidence are far from efficient, particularly for the purpose of large-scale exploratory research. It is feasible to take a “bottom-up” approach via the use of the Natural Language Processing algorithms just described. For example, an emerging strategy, called metaBUS (www.metabus.org) (Bosco et al., 2015), takes a “bottom-up” approach by curating virtually all obtainable meta-analytic information (e.g., the extracted information described in the previous example), independently of particular research questions (i.e., not just the hypothesized construct relationships are gathered but all measured construct information), thus allowing users to rapidly locate and summarize existing evidence. Beginning with journals in the area of applied psychology, the metaBUS database has grown to approximately 800,000 effect size estimates.

Essential components of tools like metaBUS include not only extraction of relevant information and tools to run statistics (i.e., linkage to R), but also the linkage of this knowledge base to an ontology. For example, metaBUS currently is using the ISA taxonomy, which includes nearly 5000 hierarchical classes (Bosco et al., 2015). This ontological structure facilitates empirical reviews for questions at various levels of the ontological classes. To illustrate, colleagues working on metaBUS provided an example automated meta-analysis that examined the relationship between education and turnover intention (the intention to quit one’s job), one of tens of thousands of plausible relationships to examine within metaBUS. Using the ISA taxonomy allowed the grouping of labels that were deemed synonyms for each target class (e.g., for education, other labels included: degree, educational attainment, qualifications, etc.). This yielded 133 effects from 119 separate samples with data relevant to the relationship between education and turnover intention, which were combined into a meta-analytic effect size estimate of r = .035 (k = 133; 95% CI .015, .055). This type of work would be further supported via robust ontologies that can support specification of classes and their relevant inter-relationships. This preliminary example provides a very concrete understanding of how ontologies, when complemented with current advancements in technologies such as Natural Language Processing, can facilitate more efficient knowledge accumulation.
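Once effects have been grouped by ontology class, pooling them is a short computation. The sketch below uses a common textbook approach, fixed-effect pooling of correlations through Fisher's z transformation with weights of n − 3; it is not a description of the procedure metaBUS uses, and the input values are hypothetical.

```python
import math


def pool_correlations(effects, z_crit=1.96):
    """Fixed-effect pooling of (r, n) pairs via Fisher's z transformation.

    Returns (pooled r, lower CI bound, upper CI bound). Each study is
    weighted by n - 3, the inverse of the sampling variance of Fisher's z.
    """
    weights = [n - 3 for _, n in effects]
    zs = [math.atanh(r) for r, _ in effects]
    total = sum(weights)
    z_bar = sum(w * z for w, z in zip(weights, zs)) / total
    se = 1.0 / math.sqrt(total)
    return (
        math.tanh(z_bar),
        math.tanh(z_bar - z_crit * se),
        math.tanh(z_bar + z_crit * se),
    )


# Hypothetical (correlation, sample size) pairs for one pair of classes.
effects = [(0.10, 150), (-0.02, 300), (0.05, 90)]
pooled, lower, upper = pool_correlations(effects)
print(f"pooled r = {pooled:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

A fixed-effect model assumes that all studies estimate one common correlation; when studies differ in populations or measures, a random-effects model would usually be more appropriate.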

Why is now the time for ontologies related to behavior change interventions?

There are several movements in science that make this an important time for behavioral medicine as well as behavioral and social sciences to actively utilize ontologies.

Large-scale research trends that set the stage for ontologies related to behavior

As scientific communities advance by taking full advantage of networking and information technologies, the need to organize the communities’ knowledge into machine-translatable formats through practicable ontologies has grown (President’s Council of Advisors on Science and Technology (PCAST), 2010). The international Human Genome project, for example, was only made possible by enforcing the use of common terms (i.e., the Gene Ontology) and data structures in order to accommodate distributed contributions from laboratories around the world. Based on the success of that project, the National Institutes of Health (NIH) continues to support the structured use of ontologies as it fosters contributions through open science initiatives, such as the database of Genotypes and Phenotypes (dbGaP) and its Big Data to Knowledge (BD2K) initiatives (Hesse et al., 2015). The National Patient Centered Outcome Research Network (PCORNET) is an example of an architecture designed to support the utilization of common variables across comparative effectiveness trials. The ontologies from PCORNET are being leveraged for the Million Patient Cohort underlying the NIH’s Precision Medicine Initiative. Internationally, the Consortium of European Social Science Data Archives (CESSDA) is an example of an effort to make social science data broadly available across countries within the European Union.

There are a number of reasons why ontologies are becoming critical within these large scientific trends. As our metaBUS example illustrates, common knowledge structures facilitate more efficient meta-analysis and scientific syntheses, which in areas such as medicine can be translated more efficiently into clinical guidelines (Institute of Medicine [IOM], 2008). When levels of evidence are too low for drawing conclusions, ontologies can ease the path toward gathering new evidence relative to the replicability and generalizability of findings (Nosek et al., 2015). From a translational perspective, ontologies can be used to create more efficient information systems that can influence the ways in which scientists work together. The Open Geospatial Consortium (2016) was instrumental in creating the standards in language and data needed to build harmonized geographic information systems. The results of that work led to development of extensible frameworks that now underlie commercial applications, such as Google Earth®, as well as experiments in public health geographic data platforms, such as The Community Commons. On the biomedical side, efforts have been underway to create semantically interoperable information systems to guide clinical care. Data collected through these data systems can, in turn, be used to improve care and facilitate discovery through what some have referred to as the “learning healthcare system.” Unfortunately, behavior and behavioral interventions will not be included within these systems unless the community can organize its knowledge through accepted ontologies (IOM, 2011).

How can ontologies support use of emerging data sources?

The need to access and analyze data from multiple sources creates the demand for ontologies. This is especially true for “Big Data,” which are unstructured by definition. Big data, “… whose size forces us to look beyond the tried-and-true methods that are prevalent at that time” (Jacobs, 2009, p. 36), are permeating practice in virtually every aspect of society, and certainly challenge our methods. Rather than being allowed to overwhelm, these data must be organized and accessed through ontologies. To accomplish this organization, we need to build a matrix of links between the classes in the ontology and the data. These links are necessary to theoretically inform intervention development (Michie & Johnston, 2012). A preliminary attempt at linking BCTs with theoretical constructs (Michie et al., 2008) suggested the potential usefulness of such a scheme, as it has been used widely to inform intervention development (e.g., French et al., 2012). This work is being further developed with more advanced methods to link BCTs to their mechanisms of action. Advancing theory in the context of big data requires careful development, refinement and integration of disparate theories and models of behavior (Cobb et al., 2011).

Access to data can occur via data portals, which represent a common approach to combining and sharing research data. The largest and most relevant data portal for behavioral science specifically is likely the Inter-University Consortium for Political and Social Research (ICPSR, 2016). ICPSR maintains over 65,000 research-relevant files such as surveys and their results. The online ICPSR Social Science Variables Database searches over four million variables and enables searches for studies containing any pair of variables. ICPSR provides access to multiple data sets that contain common data elements that are quite relevant for behavioral researchers, measuring both observable behaviors (e.g., smoking, physical activity, diet/nutrition) and associated psychological mediators and moderators such as perceptions of disease risk, depression, anxiety, and self-efficacy to change behavior. The federal government also provides population-level health data (many found in ICPSR) with these types of Common Data Elements (CDEs) in surveys like the Behavioral Risk Factor Surveillance System (BRFSS; http://www.cdc.gov/brfss/) and the Health Information National Trends Survey (HINTS; http://hints.cancer.gov/). There are emerging efforts across data producers to use the same CDEs across surveys where possible to increase comparability and thus support ontology development through iterative testing of inter-relationships between CDEs and the predictive power of the CDEs with relevant outcomes. Another portal is the Inter-Nomological Network (Larsen, 2016), which, like metaBUS, employed research assistants to develop taxonomies and contains constructs, definitions, measurement items, and citations from sources in nine disciplines. The federal government has also created access to myriad types of data (including behavioral data) through portals like data.gov and healthdata.gov. These resources (i.e., ICPSR, INN, metaBUS, healthdata.gov) provide a solid foundation from which to define the relevant knowledge base for targeted ontologies.

How can “data harmonization” efforts complement ontology efforts?

Data harmonization refers to creating consistent and unambiguous data that can be combined or merged together (sometimes referred to as semantic interoperability; Hunter, 2003). This process typically focuses on merging data via the use of common data elements. These elements consist of a name or descriptor, which is typically the underlying construct label that is being assessed, a precisely defined question or data collection method (e.g., “On average how many cigarettes do you smoke every day?”) and a specified format (e.g., “a positive integer between 0 and 250”). At the very least, data harmonization requires that independent datasets use common data elements (e.g., operational definitions) and the same construct label for these elements. Ontologies that provide shared definitions of classes and relevant constructs and that provide class relationships would facilitate data harmonization efforts. Even if individual investigators do not share or merge their data with others, having clear, shared definitions of the classes improves understandability of data. Once data are based on an ontology, the shared meanings will allow researchers to understand more precisely what is meant by a term.
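As a minimal sketch of what a common data element might look like in software, the Python fragment below encodes the three parts named above (a name, a precisely defined question, and a specified format) and checks a response against the format. The class name, field names, and the label "cigarettes per day" are assumptions made for illustration; only the question wording and the 0 to 250 range come from the example in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CommonDataElement:
    """Illustrative common data element: construct label, question, and format."""
    name: str       # construct label (hypothetical wording)
    question: str   # precisely defined question or data collection method
    minimum: int    # lower bound of the permitted integer format
    maximum: int    # upper bound of the permitted integer format

    def validate(self, raw):
        """Return the response as an int if it fits the format, else raise ValueError."""
        if isinstance(raw, bool) or not isinstance(raw, (int, str)):
            raise ValueError(f"{self.name}: expected an integer, got {raw!r}")
        value = int(raw)  # raises ValueError for strings such as "12.5" or "many"
        if not self.minimum <= value <= self.maximum:
            raise ValueError(
                f"{self.name}: {value} is outside {self.minimum}-{self.maximum}"
            )
        return value


CIGARETTES_PER_DAY = CommonDataElement(
    name="cigarettes per day",
    question="On average how many cigarettes do you smoke every day?",
    minimum=0,
    maximum=250,
)


def share_element(element_a, element_b):
    """Two datasets can be merged on a variable only if they use the same element."""
    return element_a == element_b


print(CIGARETTES_PER_DAY.validate("20"))
```

Because the dataclass is frozen and compared field by field, two datasets count as harmonized on this variable only when the label, the question, and the format all match, which mirrors the minimum requirement stated above.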

There are many ongoing efforts related to data harmonization that are complementary to ontology work. For example, the NCI’s “crowd-sourced” Grid-Enabled Measures (GEM) portal (Moser et al., 2011) was developed to support the standardization of construct names, definitions, and specification of appropriate measures for each construct via the use of a wiki-based platform that supports community editing and voting. The site also provides links to datasets and a team collaboration area and thus has interests complementary to those of an ontology (e.g., a controlled vocabulary). A more expert-driven strategy is illustrated by the psychometric instruments carefully developed for the NIH PROMIS project to cover patient-reported outcomes in the three domains of physical, mental, and social health. Each instrument bank was created based on a careful review of thousands of questionnaire items. A major advantage of PROMIS is that many of the instrument banks are being translated into other languages such as Spanish (Paz et al., 2013) and Chinese (Liu et al., 2013), which may help in defining common international controlled vocabularies. Further, PROMIS was developed from legacy measures, and it is possible to cross-walk scores if researchers prefer to use the legacy instruments.

The PROsetta Stone initiative (2016) is a data harmonization project focused on developing methods to link PROMIS measures with other measures. These links allow researchers to “cross-walk” scores across different legacy measures, with PROMIS serving as the common metric, even when no data were collected with PROMIS measures. This has already been done with measures including those related to anxiety, depression, and quality of life (Schalet et al., 2014). These types of methods allow researchers to integrate data already collected and thus provide another strategy, beyond the Natural Language Processing algorithms discussed above, for facilitating more rigorous automated meta-analysis.

Both GEM and PROMIS were designed to support better data harmonization, particularly of self-report constructs and measures. These efforts, along with collaborations with existing data harmonization efforts like the National Database for Autism Research (Hall et al., 2012) and calls for the re-use of existing measures and for better data harmonization (see: https://www.nlm.nih.gov/cde/), can be tapped for research on ontologies (U.S. National Library of Medicine, 2012). In particular, the provision of relevant constructs, plausible definitions, relevant measurement tools, and even datasets provides the “raw material” for ontologies using the Natural Language Processing techniques described earlier. While these initiatives were not designed with ontologies in mind, a well-developed ontology tied into resources such as ICPSR, metaBUS, GEM, INN, and PROMIS could take advantage of their content and structure to advance ontologies related to behavior change and greatly improve theoretical and data harmonization.

The “bottom-up” community-driven approach (e.g., GEM) to data harmonization and the “top-down” expert-driven approach (e.g., PROMIS) each have strengths and weaknesses. The former allows a wide range of researchers and clinicians to provide information regarding definitions and inter-relationships between constructs (to support the creation of an ontology) and ratings of the “best” measures of those constructs (to promote their common use). With contributions from stakeholders over time, the community can achieve consensus through a process similar to that seen with Wikipedia. The downside is that the process is time-consuming and requires concerted effort by stakeholders to provide feedback. Expert-driven efforts provide constructs (and related measures) that take the guesswork out of deciding what is the “best” measure and how the underlying construct is defined, and thus promote harmonization. However, these types of efforts may not reflect the needs of different stakeholders (e.g., clinicians typically need short measures while researchers want the most reliable and valid measures regardless of length) for their specific assessment efforts. What the two approaches have in common is the overall goal of data harmonization.

What are the next steps in ontologies related to behavior change interventions?

In this section, we present some early work developing an ontology of behavior change interventions that combines and extends ontological efforts already discussed (e.g., behavior change techniques and behavior) and links them via mechanisms of action delineated in the Theoretical Domains Framework (Cane et al., 2012). We then discuss strategies for linking these efforts to the work of the broader scientific community, make recommendations for researchers to more actively engage with and use ontologies in their work and propose a funding agenda for developing and applying ontologies.

Defining the inter-relationships between the developing ontologies

In order to fulfill the potential of emerging ontologies (i.e., BCTs, mechanisms of action, and behavior), an essential task is to define the inter-relationships between them. West and Michie (2016) have proposed classes and inter-relationships that comprise an ontology to link BCTs, mechanisms of action, behavior, and context as depicted in Fig. 2. The goal of this ontology is to provide a means of answering the question, “What works to change what behaviors, for whom, in what situations, how and why?” (West & Michie, 2016). As depicted in Fig. 2, this ontology links together interventions, mechanisms of action, target behaviors, context, and usage, each of which could be represented by its own ontology or controlled vocabularies. For example, the Behavior Change Intervention Ontology includes the BCT Taxonomy v1 controlled vocabulary, a controlled vocabulary of modes of intervention delivery (a reliable version produced in Michie’s lab), and an elaboration of the relationships between these two vocabularies. A controlled vocabulary is also being developed for an ontology of mechanisms of action, comprising 107 mechanisms at 3 levels, with 18 unique within-theory relationship types specified (see Michie et al., 2015a). As noted above, the goal of this set of related ontologies is to provide a means of structuring evidence about behavior change in a causal framework. This set of related ontologies can be seen as an extension of the framework proposed by Hardeman et al. (2005) for modeling how behavioral interventions might influence health, which describes causal links between behavioral determinants, behaviors, physiology and health.

Fig. 2 West and Michie (2016) behaviour change interventions ontology

As depicted in Fig. 2, this ontology asserts that characteristics of interventions (i.e., their content and delivery) are related to behavior through designated mechanisms of action. These relationships are moderated by intervention usage and context. All of these relationships for a given intervention can be thought of as an “intervention-behavior complex” that is associated with an effect. A smoking cessation intervention provides an example. Providing ultrasound feedback on carotid artery stenosis (intervention) may increase the likelihood of a smoker making a quit attempt (behavior) by raising perceived susceptibility to and anxiety about the health risks from smoking (mechanism of action). Intervention usage (uptake and engagement) may be affected by the intervention content or mode of delivery and moderate the impact of the intervention on the mechanisms of action. For example, some smokers may be reluctant to subject themselves to ultrasound feedback, leading to low uptake, or may disengage emotionally from it when they hear unwanted information. Similarly, context (target population and setting) may influence usage as well as moderate the impact of the message on the mechanisms of action and the impact of the mechanisms of action on the behavior. For example, heavily nicotine-dependent smokers (context) may be less likely to expose themselves to the feedback (usage). If they are exposed to the intervention, these heavily nicotine-dependent smokers (context) may ignore the feedback (usage), thus minimizing its impact on their anxiety levels (mechanism of action). In this example, increased anxiety is the mechanism of action for fostering reduced smoking, and context diminishes usage, which in turn moderates the relationship between the intervention and the mechanism of action. As this example illustrates, the intervention effect in a given population or for a given individual may vary as a function of the key moderating variables of usage and context. Table 2 defines each of the classes in this ontology structure.
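One way to picture the “intervention-behavior complex” is as a small data structure in which the causal path (intervention, mechanisms of action, behavior) is kept separate from the moderators (context and usage). The Python sketch below is an illustration under our own assumptions, not a rendering of the West and Michie (2016) ontology: the class and field names are hypothetical, and values the smoking example does not state are marked as unspecified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intervention:
    content: str    # what is delivered (e.g., one or more BCTs)
    delivery: str   # mode of delivery


@dataclass(frozen=True)
class Context:
    population: str  # target population
    setting: str


@dataclass(frozen=True)
class Usage:
    uptake: str
    engagement: str


@dataclass(frozen=True)
class InterventionBehaviorComplex:
    """Hypothetical container linking an intervention to a behavior."""
    intervention: Intervention
    mechanisms_of_action: tuple
    behavior: str
    context: Context
    usage: Usage

    def causal_path(self):
        """Labels along the path: intervention -> mechanism(s) of action -> behavior."""
        return [self.intervention.content, *self.mechanisms_of_action, self.behavior]

    def moderators(self):
        """Classes that moderate the links along the causal path."""
        return {"context": self.context, "usage": self.usage}


# The smoking cessation example from the text; "unspecified" marks details
# that the example does not provide.
smoking_example = InterventionBehaviorComplex(
    intervention=Intervention(
        content="ultrasound feedback on carotid artery stenosis",
        delivery="unspecified",
    ),
    mechanisms_of_action=(
        "perceived susceptibility to the health risks from smoking",
        "anxiety about the health risks from smoking",
    ),
    behavior="quit attempt",
    context=Context(population="heavily nicotine-dependent smokers", setting="unspecified"),
    usage=Usage(uptake="low", engagement="feedback ignored"),
)

print(" -> ".join(smoking_example.causal_path()))
```

Keeping the moderators outside the causal path makes it possible to record the same intervention and behavior under different contexts and usage patterns, each potentially associated with a different effect.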

Table 2 Key classes in the Behaviour Change Intervention Ontology (West & Michie, 2016)

Defining these classes and inter-relationships enables a strategy for making linkages between the other ontologies. The ontology also specifies other classes that need to be taken into account when attempting to empirically study the relationship between BCTs and hypothesized mechanisms of action. This is important in that the effectiveness of behavior change interventions depends on understanding links between BCTs and mechanisms of action. Current research (Michie et al., 2015b) is developing and testing a methodology for making these links between the BCT Taxonomy v1 and mechanisms of action (Cane et al., 2012) and the construct mapping work from an 83-theory review (Davis et al., 2015; Michie et al., 2015b). This research is identifying hypothesized links in the published literature (>300 articles coded by BCT and mechanism of action) and investigating consensus amongst 98 behavior change experts. This is another piece in the jigsaw of the larger program of research to develop methods for specifying the empirical relationships between all the components in the ontology of behavior change interventions.

Ontologies such as the one above provide structures in which well specified constructs and relationships can be introduced to develop unifying theories of behavior change. Developing theories within an ontological framework enables the assessment of the extent to which a theory is unifying and enables empirical evaluation using common language and concepts.

With this causal pathway described, it will be important to remain mindful of the purpose of the ontology and the purpose of the knowledge base. As described by Kelly and Moore (2012), evidence-based practices build on the tandem use of rationalism and empiricism. Rationalism includes abstract classes and their inter-relationships and includes disciplines like mathematics or logic. These rationalist arguments are the a priori specifications of these meaningful classes. For example, “all bachelors are men” is a class-structured relationship (in Kantian terms used by Kelly and Moore, an analytic a priori relationship) defining a relationship that is true by definition because the definition of a bachelor is an unmarried man. To deny that all bachelors are men contradicts the definition of bachelors, and thus the statement requires no empirical testing. Of course, one might question the definition of bachelors and suggest that perhaps an inclusion of women should take place (e.g., refactoring the definition from being about unmarried men to unmarried people). Again though, the definition establishes what is true by definition about the abstract concept of being a bachelor. Empiricism, on the other hand, involves observation and “matters of fact.” Observation is required for establishing relationships such as a causal relationship because there is no inherent relationship that is defined via the abstract definitions. It is essential to acknowledge this distinction, and thus the logical and essential role for ontological structuring of abstract concepts, because it is often the desire of a research endeavor to not study the specific instances within a single study but instead to generalize what was learned to a higher level abstract concept. For example, a researcher may be interested in examining whether social support is a useful mechanism of action for improving medication adherence. Social support must be operationally defined in a specific instance to translate the abstract concept into an observable phenomenon (e.g., as an intervention or as a measurement of social support). Overall, ontologies provide an organizing tool for rationalist information (e.g., defining abstract terms and their inter-relationships) and the knowledge base is the repository of observation (empiricism).

Based on the distinction between rationalism and empiricism, relationships specified between classes within ontologies should be, by definition, true, and not require empirical testing. For example, the relationships described in the Behavior Ontology are definitionally true based on Larsen et al.’s (2015) specifications, derived from ICF. This is in contrast to the constructs and operational definitions used in specific studies. For these studies, the existence or absence of a relationship is definable in the knowledge base (e.g., does a change in self-efficacy cause a change in physical activity). This is important because of the impact of moderating factors that might influence when a given relationship is or is not observed (e.g., context and usage as defined by West and Michie). For example, the majority of studies focused on the Theory of Reasoned Action and Theory of Planned Behavior might find evidence that the constructs of attitude and behavioral intention are related. However, some work shows that this relationship disappears in certain contexts and/or when other constructs are introduced (e.g., Venkatesh et al., 2003). As such, the relationship between attitude and behavioral intentions must remain defined in the knowledge base because it is a conditional relationship.
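The division of labor described here can be illustrated with a short Python sketch in which definitional links live in the ontology and observed, context-dependent relationships live in the knowledge base. Everything in it is hypothetical: the data structures and function names are our own, the class links reuse the bachelor example given earlier, and the two findings are placeholders rather than results from any cited study.

```python
from dataclasses import dataclass

# Ontology: links that hold by definition and need no empirical test.
IS_A = {
    "bachelor": "unmarried man",
    "unmarried man": "man",
}


def is_a(term, ancestor):
    """Follow definitional links upward; True if `ancestor` is reached."""
    while term is not None:
        if term == ancestor:
            return True
        term = IS_A.get(term)
    return False


# Knowledge base: observed relationships, each tied to the context of a study.
@dataclass(frozen=True)
class Finding:
    predictor: str
    outcome: str
    supported: bool
    context: str
    source: str


# Placeholder findings for illustration; not results from any real study.
KNOWLEDGE_BASE = [
    Finding("attitude", "behavioral intention", True, "context A", "hypothetical study 1"),
    Finding("attitude", "behavioral intention", False, "context B", "hypothetical study 2"),
]


def evidence(predictor, outcome):
    """Collect the findings recorded for one predictor-outcome pair."""
    return [f for f in KNOWLEDGE_BASE if (f.predictor, f.outcome) == (predictor, outcome)]


def is_conditional(predictor, outcome):
    """A relationship is conditional if studies disagree about whether it holds."""
    return len({f.supported for f in evidence(predictor, outcome)}) > 1


print(is_a("bachelor", "man"))
print(is_conditional("attitude", "behavioral intention"))
```

The query over the ontology never consults data, whereas the query over the knowledge base can change whenever a new finding is added, which is why a conditional relationship such as the one between attitude and behavioral intention belongs in the knowledge base.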

Aligning ontologies across scientific domains

As progress is made on ontologies related to behavior change interventions, it is essential that these ontologies not be developed in silos. In particular, there is already considerable work in a variety of other scientific domains as diverse as plant science (Walls et al., 2012), medications (Hanna et al., 2013), and cognitive science (Poldrack, 2016). As described earlier, the Gene Ontology (The Gene Ontology Consortium, 2015a, b) is a great success story for ontology work as it enables easy query of the current state of knowledge, examination of the data for relationships using many datasets, and development of novel hypotheses. The Gene Ontology and other ontologies highlight the fundamental importance of ensuring that ontologies within the behavioral sciences be linked to these other successful ontologies to ensure behavior is taken into account within a broader understanding of health (IOM, 2011).

Bridging ontologies from different scientific domains will be essential for many emerging targets, such as the USA’s Precision Medicine Initiative (National Research Council, 2011), which focuses on the development of prevention and treatment strategies that take individual differences into account. Ontologies, when carefully designed within each domain and linked to other higher order ontologies (e.g., the Basic Formal Ontology), can start to enable the sort of massive querying of the knowledge base that is currently taking place for genes. Given that behavior has been estimated to explain 40 % of variations in health outcomes (McGinnis et al., 2002), it is of critical importance that ontologies related to behavior change can be linked to other ontologies to achieve targets such as precision medicine.

A key next step for behavioral scientists interested in developing ontologies is to ensure they are carefully building on ontologies from other domains. For example, the ICF discussed earlier for the behavior ontology, provides a classification of health and health-related domains with a focus on body, individual, and societal perspectives, which might be a useful “bridge” between domains. This is no small task and will require careful collaboration between scientific domains but, as implied by initiatives like precision medicine, the pay-off for improving overall health is potentially enormous.

How can one contribute to ontologies for behavior and behavior change?

As summarized above, there is already ongoing work focused on the development of the components of ontologies for behavior change interventions. The BCT Taxonomy v1, including the controlled vocabulary and hierarchical structure, will be reviewed in the future and refined into BCT Taxonomy v2 when there is a sufficient basis for this. Researchers and practitioners utilizing v1 for intervention development and evaluation and evidence synthesis are encouraged to provide feedback to inform the future development of v2 by, for example, suggesting additional BCTs and adaptations for specific groups and cultures, providing reliability data, sharing translations, and reporting terms that cause confusion. A portal has been established to gather feedback, and communications have been sent to professional and scientific organizations and networks to encourage feedback (BCT Taxonomy v1 Team, 2016). An international consortium will review this feedback and, when the time is judged to be right, v2 will be developed and released. The aim is to maximize the possibility for the scientific community to work together effectively and efficiently with shared classes and terms and to move forward collaboratively to improve and share knowledge.

In addition to efforts to further refine BCT Taxonomy v1, there is ongoing work to link the BCTs in BCT Taxonomy v1 to their mechanisms of action. Led by UK and US researchers and guided by an extensive international advisory board, a methodology for achieving a consensus matrix of links between BCTs and mechanisms of action is being developed (Connell et al., 2015). The methodology involves triangulating findings from a literature review of about 300 published articles coded by BCTs and mechanisms of action with the findings from a consensus study involving 98 international experts in behavior change. The literature-based study will consider how the links have been reported in published intervention evaluations. The consensus exercise elicits expert views. A final study will consider clusters of BCTs and their associations with theories as a whole rather than with individual mechanisms of action. Funds are being sought by behavioral, computer and information scientists to further populate the components of the Behavior Change Interventions Ontology outlined in Fig. 2, which will include setting up an international consortium to steer this work and promote engagement of the scientific and user community. For the success of such efforts, it is important that the relationships are derived from careful analysis of findings in the knowledge base, and that a broad representation of the scientific community be actively involved and willing to update the ontologies as new evidence emerges.

Beyond these active projects, there are other ways to contribute to the refinement of ontologies related to behavior change interventions. For example, we suggest authors take advantage of the controlled vocabularies, taxonomies, and (eventually) ontologies created whenever possible and actively engage in and use the resources focused on data harmonization (e.g., GEM, PROMIS), as these are very complementary to any ontological work. When individual researchers find that these tools are not fit for their purposes, feedback should be provided to the developers of these resources [e.g., send feedback to the BCT Taxonomy Team or actively engage in the GEM community (BCTTv1 Team, 2016; NCI, 2016)]. A second suggestion is for all behavioral medicine manuscripts to include and report (perhaps as a table) construct names and synonyms when applicable, full definitions of constructs, relevant citations, and the full descriptions of items used to measure those constructs in their study. If there are space limits, these can be reported in an online appendix or other portal such as within Open Science Framework (2016). By making item-level correlations available, older findings that would otherwise be incompatible with new findings can be integrated, which is especially important when considering the conversion of results between reflective and formative constructs (Larsen & Bong, 2016). This information is very valuable for supporting the type of Natural Language Processing work described above, thus enabling refinement of the controlled vocabulary, identification of constructs labeled differently but likely measuring the same thing, and, ultimately, automated meta-analysis.

A proposed funding agenda

Needless to say, creating functional and long-lasting ontologies for behavior change interventions will take resources. Government funding agencies have contributed successfully to the development of ontologies in other disciplines. For example, in the United States, NIH launched the Biomedical Information Science and Technology Initiative (BISTI) in 2000 with a goal of making optimal use of computer science and technology to address problems in biology and medicine. BISTI was instrumental in contributing to the development of the Gene Ontology discussed earlier. The World Health Organization took a more top-down funding approach in its support of the International Statistical Classification of Diseases and Related Health Problems (ICD). The ICD taxonomic structure has since been incorporated into the Systematized Nomenclature of Medicine (SNOMED), which guides development of interoperable electronic health record systems using a private funding model. Both have changed the science and practice of medicine.

The time is right, we believe, to extend these types of funding efforts into the realm of behavioral medicine. A first priority might be to extend the BISTI funding model to support an international consortium of scientists to produce a functioning ontology with input and buy-in from the larger behavioral medicine community. An ontology that does not guide integrative science, or that does not fit within the emerging informatics structures underlying contemporary medicine, will miss the mark. The next opportunity for funding would be to enable the development of informatics tools to incorporate the ontology into a functional set of tools. Both the US’s National Science Foundation and the NIH have small business innovation research (SBIR) programs that could be targeted toward these goals, as do many nongovernmental foundations. In the UK, the Medical Research Council, Engineering and Physical Sciences Research Council and the Wellcome Trust have suitable funding streams for this work. The timing is propitious, given the explosion of patient-facing smartphone apps, wireless devices, and wearables that will be generating a cacophony of behavioral change data unless efforts are made to harmonize the conceptual meaning behind their embedded interventions.

Our discussion of Natural Language Processing methods, as well as the jingle and jangle fallacies, brings to the foreground another important recommendation for funding. Ontologies are never perfect to begin with, but can be improved through empirical application, for example within behavioral medicine, and through the application of computational techniques to find misaligned constructs, detect changes in usage over time, and find new terms through pattern analysis. We encourage basic science funding for the empirical application, computational tools and psychometric methods that will sharpen our community’s refinement of ontologies in the future.

In all of these proposed efforts, we recognize that the composition of any organizing body’s advisory boards and technical evaluation panels will leave a lasting imprint on the ontologies that emerge through funding efforts. In anticipation of that likely outcome, we encourage funding agencies and professional societies to follow best practice in governance to ensure proper management of their respective scientific communities’ vested interests. The US-based National Institutes of Health, for example, has strict guidelines in place through the Federal Advisory Committee Act (or FACA, see P.L. 92-463; 5 U.S.C. App.) to govern the formulation of transparent and representative advisory boards to ensure that norms of technical rigor and equitable representation are in place for scientific endeavors. Likewise professional societies and cross-disciplinary consortia follow a set of transparent bylaws for selecting advisory board members both to ensure compliance with articles of incorporation (if applicable) and to engender trust among members. International societies will operate both through treaty and through the mutual consent of sovereign nations to protect the interests of their multinational members. We anticipate that as the use of common ontologies grows in the field of behavioral medicine, the need to ensure adequate representation of consortial interests through transparent governing processes will also grow. Understanding how to formulate these governing principles in a way that is self-correcting and reliable should be a high priority among participating organizations.

Conclusions

The goals of this paper were to introduce ontologies to the behavioral medicine community, to provide some examples of ongoing efforts in the creation of ontologies for behavior change interventions, and to outline uses of ontologies and next steps for advancing ontologies. As described in the paper, ontologies provide unifying frameworks that can support the synthesis of empirical work conducted by researchers and encoded in a knowledge base, to facilitate more effective knowledge accumulation. We noted that ontologies have been used to accelerate scientific progress in other disciplines, and showed that they hold similar promise in behavioral science. If we ignore these trends we do so at our peril, recognizing that the consequence could be a slowed discovery engine in behavioral medicine and a deluge of unintegrated data from sensors and software. Based on this, we strongly urge the behavioral medicine community to engage with using and developing ontologies for behavior change interventions, drawing on insights from ontologies developed in other relevant scientific domains. Finally, we urge funding and other agencies to consider more active methods to support the training and use of ontologies within the behavioral sciences, both to aid scientists in their use and to support the creation of these ontologies that would enable linkage across scientific domains.