1 Introduction

International organizations (IOs) govern an incredible range of issues and take a wide variety of forms. Today, for example, there are many bodies involved in the governance of climate change and other environmental issues (Bulkeley et al., 2014; Rowan, 2021). The United Nations (UN) and UN Environment play central coordinating roles and provide secretarial services for a variety of treaty-based regimes. The Group on Earth Observations (GEO), by contrast, performs only the single function of coordinating data collection on environmental problems. The International Renewable Energy Agency (IRENA) also has a specific mandate—promoting clean energy adoption—but a range of other bodies, like the Major Economies Forum (MEF), offer flexible platforms to address multiple topics. Some of these institutions operate quite independently of states, but others are more tightly controlled. IRENA has a large secretariat based in Abu Dhabi; the MEF has none. The UN includes all states, GEO has about 100 members, but the MEF has only 17. Within a single issue-area, then, there are considerable differences in institutional design. And, in many other issue areas—global finance, security, human rights, and global health—the institutions underpinning international cooperation exhibit similar diversity.

Scholars of International Relations (IR) have analyzed the drivers and impacts of this important institutional variation for some time. And the field has, traditionally, conceptualized IOs broadly to account for the wide variety of organizations that exist. From this starting point, a range of theoretical arguments have then been advanced. Some apply to all IOs (Abbott et al., 2016; Gray, 2020), some home in on particular regions or sub-types (Kahler, 2000; Volgy et al., 2009), and still others are concerned with the inner workings of individual institutions (Kleine, 2013a). Theories can be formulated at each level, and data must match the level of argumentation to evaluate these in a compelling way. At each level, however, the Correlates of War (COW) IO dataset has played a crucial role. This dataset contains information on roughly 500 IOs created between 1815 and 2014, providing information on their dates of activity and patterns of state participation (Wallace & Singer, 1970; Pevehouse et al., 2004; Pevehouse et al., 2020). It has been used in a variety of studies to evaluate arguments about the entire population of IOs—their birth, life, death, membership, and impacts (Russett & Oneal, 2001; Mansfield & Pevehouse, 2006; Hafner-Burton et al., 2008; Eilstrup-Sangiovanni, 2020). It has also served as the empirical basis for many others that explore variation within samples of IOs to gather more fine-grained data—for example, on their independence, openness, authority, and size (Haftel & Thompson, 2006; Tallberg et al., 2013; Gray, 2018; Hooghe et al., 2019).

Though diverse in focus, these studies are united in that they each use the COW dataset as a “definitive” list of all the IOs that exist. Less frequently noted, we observe, is the fact that the dataset actually employs a highly restrictive definition—one that does not match the concept of an IO embedded in many theoretical arguments or the way the term is commonly used by states and practitioners. In practice, the dataset offers an index of highly legalized, or “formal,” IOs, not IOs as such. This means that many of the organizations states rely on to pursue international cooperation—particularly, less legalized “informal” IOs, but also so-called “emanations”—are not accounted for. Many inferences that scholars make about IOs, whether they acknowledge this or not, are therefore necessarily conditional on legal formality when they use the COW dataset in this way. In some cases, this may not be problematic. But legal formality is a high bar to clear in international politics and this means that the “universe” of IOs is being truncated at a demanding and often theoretically relevant point. As such, there is a risk that findings about IOs—their growth and decline, how various design features trade off against one another, or the causal impacts of IOs on state behaviour—are actually shaped by the sample of IOs being used in a way that has so far gone overlooked.

Our claim in this article is that this “mismatch” between the broader concept of an IO explicitly and implicitly embraced by much of the discipline and the narrower one embedded in the most commonly used dataset has systematic and substantial implications for research. Our aim is to illuminate these implications—when and how this mismatch shapes our understanding of IOs—to contribute to theoretical debate, and to encourage researchers to think carefully about the fit between concepts and measures. We show that operationalizing the concept of an IO in a way that reduces the mismatch we identify can change the substantive results of analyses. Sometimes these changes are minor and simply add nuance to earlier understandings. In others, though, the changes are more significant and can even lead to conflicting conclusions about causal dynamics, compelling us to revise our theoretical expectations. In many cases, we argue, these issues can be solved in the future through more careful specification of theories and by using data that align with the concepts being employed in a study. For all, it will be essential to devote more attention to these issues moving forward.

It should be said, before moving on, that we are not the only scholars to observe COW’s shortcomings. For example, Tana Johnson (2014) uses the Yearbook of International Organizations, instead of COW, to analyze institutional emanations. Julia Gray’s analysis of zombie IOs supplements the economic IOs from COW with additional regional economic IOs where her theoretical argument should apply (Gray, 2018). Liesbet Hooghe and her coauthors subset the COW dataset to identify the most institutionalized IOs, the domain where their arguments about scale, community, delegation, and authority are most relevant (Hooghe et al., 2019). Thus, many scholars recognize that COW is not adequate for all research questions. Yet it remains unclear how exactly COW’s limitations may impact empirical research.

We begin, in the next section, by reviewing how the concept of an IO has developed in the literature, demonstrating that COW’s definition of an IO is narrow and out of step with how IR scholars, international lawyers, and state practitioners have generally thought about these bodies. Certainly, the COW project’s definition is consistent with its own research program and the dataset has had a valuable impact on research practices (Hafner-Burton et al., 2008; Gartzke & Schneider, 2013). Without doubt, as well, it accurately measures a crucial dimension of organized international cooperation. However, the understanding of IOs that prevails in the rest of IR (and beyond) is typically much broader.

There are a number of aspects to this, which we highlight below. But, in this paper, we focus primarily on the implications of omitting informal IOs. Informal organizations are nearly identical to the types of formal bodies found in the COW dataset in that both have states as members and exhibit some level of institutionalization. However, informal international organizations are constituted by non-binding, or “informal,” international agreements, and therefore their underlying legal nature contrasts with formal IOs, which are founded by treaties (Vabulas & Snidal, 2013; Roger, 2020; Vabulas & Snidal, 2020). Informal bodies have attracted growing attention in IR in recent years, as scholars have begun analyzing a wide range of informal and “low cost” varieties of governance, from public–private partnerships to transgovernmental networks of public officials, with many of these debates playing out in The Review of International Organizations (Slaughter, 2004; Kleine, 2013a; 2013b; Koremenos, 2013; Libman & Obydenkova, 2013; Stone, 2013; Vabulas & Snidal, 2013; Andonova, 2017; Abbott & Faude, 2020; Roger, 2020; Carlson & Koremenos, 2021; Martin, 2021; Westerwinter et al., 2021; Westerwinter, 2021). This interest partly reflects the fact that such arrangements appear to have grown increasingly common and have become more deeply involved in the governance of many pressing issues, particularly since the end of the Cold War. But, while many of these institutions are genuinely novel and have only burgeoned in number relatively recently, IR scholars and practitioners have actually long acknowledged their existence. Indeed, despite their omission from COW, they have been part of the concept of an IO used throughout IR and other cognate fields, like international law, since the early 20th century (Potter, 1935; Cox & Jacobson, 1973; Klabbers, 2001; Archer, 2015; Klabbers, 2015). This is a fact that has important empirical and theoretical ramifications for the study of these crucial institutions.

Conceptualization is a key task in any field of research, and refining settled ways of operationalizing ideas can sometimes spur major advances. Pamela Paxton (2000) has shown how adjusting measures of democracy to consider female suffrage undermines longstanding certainties about the onset and waves of democratization. Nils Gleditsch and his coauthors’ refinements to the concept of armed conflict have spawned intense debate about the incidence and severity of collective violence over time (Gleditsch et al., 2002). These kinds of re-evaluations are especially likely to occur when concepts and measures are mismatched, when the omitted dimension of a concept is substantively large, and when relevant phenomena are driven by distinct dynamics, implying that there is meaningful causal heterogeneity. Restated for IOs, if there are many informal organizations and these differ from formal ones in theoretically relevant ways, then existing findings may be more fragile than previously recognized. In the third section of the paper, we demonstrate that this is indeed the case. New datasets reveal, for instance, that informal IOs presently constitute about 40% of all international organizations—a large and growing share of the total IO population that can no longer be overlooked by scholars of IR (Roger, 2020; Vabulas & Snidal, 2020). Many recent studies have also demonstrated that informal IOs have unique functional properties and domestic implications for members (Farrell & Newman, 2019; Sauer, 2019; Roger, 2020). And, finally, we show that states can be members of a similar number of formal IOs, but have radically different informal membership profiles. There are, then, strong reasons to suspect that omitting informal IOs may be analytically consequential for certain theoretical arguments.

In the final section, we show how reducing the concept–measure mismatch can indeed meaningfully alter findings about IOs. We employ a new panel dataset of membership in informal organizations, which complements and significantly expands the existing COW dataset, to replicate three exemplary studies of IOs and illustrate the implications of the concept–measure mismatch (Greenhill, 2010; Mansfield & Pevehouse, 2006; Bernauer et al., 2010). We show how key statistical relationships within each study change when we operationalize the concept of an IO differently. In some cases, findings generalize further than previously thought. In one reanalysis, for instance, we show that both formal and informal IOs can have powerful socializing effects on states, prompting them to better human rights practices. And, in fact, informal IOs appear to have even larger impacts than formal ones. In this case, what is true of formal IOs is largely true—perhaps, even, more true—of informal bodies. In other cases, however, alternative ways of operationalizing membership lead to qualitatively different conclusions. In another reanalysis, we find that including informal IOs shifts our understanding of the relative importance of domestic and international drivers of participation in global governance. Across all three studies, we demonstrate how different measures of IOs can either extend, inflect, or challenge an existing finding. All three outcomes are substantively important for IR research, as they illuminate the hidden assumptions in arguments and prompt us to rethink the underlying causal dynamics at work.

In our conclusion, we draw the strands of our analysis together and look forward. Our analysis shows how the concepts embedded in data can shape research answers quite profoundly. Accordingly, we recommend several ways scholars can act on the insights we offer to improve how we consume and produce IO research. The most straightforward implication is that applied IO research must match theoretical concepts and empirical measures. Rather than using data “off the shelf,” or giving priority to one IO measure, scholars should ensure that the set of organizations included in an analysis—whether formal bodies, informal bodies, or both—aligns with the scope conditions implied by the theory one wishes to test. Depending on the argument being advanced, we believe, it may even be important to consider other varieties of governance as well, like emanations or public-private partnerships. Equally important, we argue that taking additional forms of cooperation into account can advance theoretical debates. Given their growing importance in world politics, for instance, informal IOs deserve much greater attention than they have received thus far. But studying informal bodies alongside the formal institutions scholars have traditionally focused on can also provide analytical leverage over more general questions. Within each of the studies we examine there has been some degree of contestation over the precise causal mechanisms at work. Some have viewed the legal nature of institutions as playing an all-important role, while others see things differently and focus on aspects of organized cooperation that may be more common across all types of IOs. We do not resolve these debates definitively here. Nonetheless, we do demonstrate that examining variation across subtypes of IOs can sharpen inquiry, reveal the dynamics that underpin global governance, and open doors to new theoretical terrain.

2 Background: The concept-measure mismatch

Today, there are hundreds of IOs that differ from one another in a remarkable number of ways. Indeed, when scholars analyze IOs, they have an enormous menu of options to choose from. In practice, however, the discipline has devoted considerable attention to a narrow range of more specific IOs that are not necessarily representative of the total population (Dixon, 1977; Hafner-Burton et al., 2008). Qualitative studies and textbooks have focused overwhelmingly on a few large and highly legalized bodies created in the aftermath of World War II, such as the United Nations (UN), the International Monetary Fund (IMF), the North Atlantic Treaty Organization (NATO), and the European Economic Community (EEC). These are important institutions, without doubt. However, they constitute a small and highly particular subset of all the IOs that exist. As a result, it is not always clear how well findings derived from these travel to the broader population. Quantitative studies often analyze a much larger number of bodies, and for this reason would seem to allay any question of representativeness. Such analyses have become much more prominent, especially since the 1990s. Yet quantitative work on IOs overwhelmingly draws on the COW dataset, which indexes a subset of IOs with a particular suite of design characteristics—not IOs as such. Given its importance to our argument, it is useful to consider this in some detail.

The first version of the COW dataset was developed by Michael Wallace and J. David Singer in the 1970s (Wallace & Singer, 1970). They created the dataset to test hypotheses about the impact of IOs on the incidence of war. More recently, their efforts have been carried forward by others, and COW data have become pervasive in empirical IO research (Pevehouse et al., 2004). The dataset has proven to be particularly popular because it contains country-year membership data for a large number of IOs, making it especially convenient for scholars seeking to assess their drivers, dynamics, and impacts statistically. However, a careful reading of the literature on IOs reveals that the concept embedded in the dataset does not align with what many mean by the term “international organization.”

The authors of the COW dataset make clear that they are primarily interested in measuring “formal” IOs, which they define as bodies that: (a) are created by states, (b) possess a secretariat or exhibit other evidence of institutionalization, and (c) are constituted by an “internationally recognized treaty” (Pevehouse et al., 2004). There is certainly some rationale for this. First, nearly all of the widely studied IOs created after World War II meet these criteria. As a result, formal bodies often come to mind most immediately when we use the term. Second, there are lots of formal IOs. Many other less prominent bodies have also taken this form and until the 1970s virtually all did. It is no surprise, then, that Wallace and Singer focused on these institutions.

Yet formal IOs are hardly the only variety. In fact, political scientists have long acknowledged that there are many different kinds of IOs, as have international lawyers. Perhaps most importantly, practitioners and states have also not regarded formal IOs as the only variety, as we discuss further below. Thus, the “true” population of IOs that both scholars and policymakers seem to have had in mind when they study and speak about these institutions has been broader (Clinton & Sridhar, 2017).

There are several ways this is so. Many—the creators of the COW dataset included—have acknowledged that there are a large number of bodies that clearly “count” as IOs but which are not created by states per se (Pevehouse et al., 2020). Some are constituted under the authority of other IOs, violating the first criterion (a) for inclusion in the COW dataset. Examples include the United Nations Development Programme, the World Food Programme, and the European Banking Authority. Some scholars have concluded that such bodies (often referred to as “emanations”) represent the vast majority of active IOs (Shanks et al., 1996; Johnson, 2014). Yet, given the centrality of the COW dataset and the fact that these institutions do not meet its inclusion criteria, emanations are rarely included in empirical analyses. One might argue that such organizational “progeny” are not independent from their “parent” institutions. If so, their inclusion in a dataset might result in some form of double counting. But the actual independence of these bodies varies, as does their membership, and when this is so, excluding them on the basis of their lineage becomes more tenuous. Even so, we set aside emanations in this article. Our argument may apply to them as well, but data on state membership in such bodies is scant and our theoretical understanding of them is still underdeveloped. We know that they can be more independent, but it is still not clear when this is the case and we do not have comparable measures of this phenomenon.

We focus, instead, on another kind of “exception” that has been increasingly discussed in the literature and that we have developed a much more complete empirical picture of: informal IOs. As noted in the introduction, many organizations are constituted by states and exceed a basic threshold of institutionalization, yet are created without legally binding international agreements. Such institutions are directly analogous to formal IOs in that they host regular meetings of state officials, have internal committee structures and decision-making procedures, and possess unique corporate identities that make them instantly recognizable as organizations. Many even have secretariats that perform nearly identical functions. But, unlike formal IOs, they have been created through informal agreements, memorandums of understanding, or “soft law”—instruments that deliberately avoid establishing obligations under international law (Lipson, 1991; Abbott & Snidal, 2000). Thus, these IOs satisfy COW’s first two criteria—they are independent organizations, created by states—but violate the third (c) because they are not established using an “internationally recognized treaty.” These bodies are excluded from the COW dataset; yet, as with emanations, they have historically been part of the core concept of an IO.

In recent years, analytical interest in informal organizations has grown tremendously, especially following the pioneering work of Felicity Vabulas and Duncan Snidal (Vabulas & Snidal, 2013; Andonova, 2017; Westerwinter, 2021; Abbott & Faude, 2020; Roger, 2020; Vabulas & Snidal, 2020; Westerwinter et al., 2021). Their research has rigorously conceptualized and called attention to these important IOs. But, in fact, a longer intellectual lineage recognizes that IOs may be constituted with non-binding agreements. Pitman Potter, arguably the most influential theorist of IOs in the first half of the twentieth century, was among the first to acknowledge this. His efforts to define and classify IOs, which shape our thinking to this day, explicitly downplayed their legal nature. In 1935, in a pathbreaking article, Potter (1935, 218) wrote,

“it is not the legal elements in the situation, the mutual rights and obligations of the parties, much less the text expressing that legal element, that constitutes the organization, but the union of states, partly juristic but also largely practical in nature. It is not what the organization is constitutionally and legally [...] but what it actually does, that determines its real nature and significance.”

For Potter, what matters most is the “organization-hood” of the institutions states establish. It is the unique “union of states” that compels us to regard a body as part of the “universe” of IOs, not its legal status. And Potter was not alone. In The Anatomy of Influence, another landmark study, Harold Jacobson and Robert Cox explicitly acknowledge that while many organizations are constituted formally by treaties, there are exceptions. “Given the nature of the international system,” they wrote, in 1973, “the creation of an international organization requires concrete action by states. Usually, although not always, such actions are consecrated in an international treaty” (Cox & Jacobson, 1973, 5, emphasis added). These sorts of statements litter the writings of scholars, from the 1930s to the present day. Clive Archer reviews 10 definitions advanced between 1958 and 1995, and only Wallace and Singer define IOs solely as institutions created by binding agreement. He concludes, on this basis, that IOs are essentially “continuous structures,” created by states, that may be established either by treaty or a non-binding “constituent document” (Archer, 2015, 33). Finally, one more recent study of formal IOs by Thomas Volgy and his co-authors recognizes that such bodies are only part of a much broader category that includes “non-formal” international organizations as well (Volgy et al., 2009, 13–14).

Thus, when we look across a broad set of IR studies, it is clear that legal formality has not traditionally been considered an essential part of what it means to be an IO. This should not be surprising in view of thinking in adjacent fields, such as international law. Jan Klabbers, author of the most widely used textbook on international institutional law, observes, “[while] it is the case that most international organizations are set up on the basis of a treaty, this is not invariably the case.” Indeed, he says, when creating an IO, “states have the choice between using a legally binding instrument and a non-legally binding instrument” (Klabbers, 2015, 144). Other widely used texts make analogous points (White, 2005; Seyersted, 2008). And no less an authority than the International Law Commission has stated that, while “[most] international organizations are established by a treaty,” they are also “sometimes established without a treaty” (United Nations, 2009, 27).

The simplest reason behind this is the fact that states have not exclusively emphasized the legal nature of international organizations. Prominent bodies like the Asia Pacific Economic Cooperation, the General Agreement on Tariffs and Trade, the Nordic Council, and the Organization for Security and Cooperation in Europe, among many others, have all been regarded as such and yet none was originally established by an international treaty (Kahler, 2000; Klabbers, 2001). National legislation on the privileges and immunities granted to IOs—one of the few places where states explain what “counts” as an international organization under domestic law—also reveals that states rarely define them exclusively on the basis of their legal status. In Canada, for instance, the Foreign Missions and International Organizations Act of 1991 defines an IO as an “organization, whether or not established by treaty, of which two or more states are members, and includes an intergovernmental conference in which two or more states participate” (Government of Canada, 1991, c. 41, sec. 2(1), emphasis added). Although it is impossible to review all such definitions here, similar ones can be found in the legislation of most OECD states (Reinisch, 2013). Few restrict IOs to formally constituted bodies.

Thus, states have regularly accepted that at least some IOs are informal, as have international lawyers. Most of those who have sought to rigorously define these institutions in the field of IR have acknowledged the same. This all attests to the fact that the concept of an IO prevailing in the discipline is much wider than the one we find in the most commonly used dataset of IOs. The COW authors deliberately choose to exclude these bodies. We hasten to add, though, that this should not be considered an error on their part. Indeed, the authors of the COW dataset are admirably clear about their intention to measure an extremely important type of IO. They are well aware that other varieties exist, informal ones included (Pevehouse et al., 2020, 3, 6–7). Nonetheless, many studies use the COW dataset as if it offers a definitive list of all IOs, aiming to make more general claims about them, or using it to test theories that in fact have a wide scope. Indeed, the COW dataset has been used in a remarkable variety of studies to evaluate theories about the life-cycle, membership patterns, and impacts of IOs. Many others sample from the COW dataset to develop finer analyses, focusing on specific types of IOs or particular design features. In doing so, many of these studies assume, either explicitly or implicitly, that they can generalize about IOs using this data (Hafner-Burton et al., 2008, 175). However, given this mismatch, it is important to reflect upon whether—or when—it is safe to do so.

3 The issue: When mismatches matter

Certainly, there are conditions when it may be safe to generalize from a narrow subset of formal bodies to the broader population of international organizations. This would be so, first, if the number of excluded observations—informal organizations—was relatively small. If this were the case, the subset of formal organizations would comprise virtually the entire population of IOs and omitting informals would probably not affect empirical findings substantially. However, if informal IOs represented a large share of all IOs, then their omission would be more fraught. This need not be true under all conditions, though. If the two types of institutions were identical in key respects, a study based on only one would not be greatly affected by the absence of the other. However, if there were important differences between them, then conclusions drawn from a narrower sample may rest on unsteady foundations.

Until recently, scholars have rarely evaluated whether these conditions are satisfied in practice. Mostly, researchers appear to have based their thinking on intuition and anecdotal observations. Cox, Jacobson, and Klabbers have each claimed that informal IOs are likely small in number (Cox & Jacobson, 1973, 5). And, in at least a few instances, scholars also seem to think that informal bodies may not be all that different from their formal counterparts. Klabbers (2001, 410) has claimed that “in terms of input, output and outlook” informal bodies can be “indistinguishable from a hypothetical [formal] organization that would do the same job.” David Zaring (2019, 9) has said, similarly, that international governance “can look much the same whether it is conducted through treaty, custom, or other, less formal, mechanisms.” If Klabbers, Zaring, and others are right, then the concept–measure mismatch we have identified in the literature may not be all that consequential.

Yet the evidence supporting statements that informal IOs are rare, or that they simply mirror formal IOs and are therefore inconsequential—and potentially ignorable in statistical analyses—is relatively thin. Indeed, as our understanding of informal institutions has improved, what we have learned undermines such assertions. A recent update of the original database developed by Vabulas and Snidal (2020) and another published by Roger (2020), which employs a somewhat broader definition of an informal IO, reveal that the number of informal organizations is actually much larger than previously thought. Combining Roger’s data with the COW database, we show in Fig. 1 that there has been a tremendous increase in the number of informal IOs relative to formal ones during the post-war era. While formal IOs were the dominant institutional form in the early post-war period, they have been losing institutional share to informal IOs. Informals began to grow quickly in number in the 1980s, just as the number of formal bodies began to stagnate and even decline. As a result, by 2010, informals comprised roughly 40% of active IOs. Today, their numbers are larger still. While it may have been safe to ignore informal IOs in the 1970s, when the first version of the COW dataset was created, this omission is likely to be far more consequential today.
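The share calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the records below are invented placeholders, not entries from COW or from Roger's data, and the names `IO`, `is_active`, and `informal_share` are hypothetical helpers introduced here. It shows how a list of formal IOs and a list of informal IOs, each with start and end years, could be pooled to obtain the informal share of active IOs in a given year.

```python
from typing import NamedTuple, Optional


class IO(NamedTuple):
    name: str
    formal: bool            # True if founded by treaty, False if informal
    start: int              # first year of activity
    end: Optional[int]      # last year of activity, None if still active


def is_active(io: IO, year: int) -> bool:
    """An IO counts as active from its start year through its end year."""
    return io.start <= year and (io.end is None or year <= io.end)


def informal_share(ios: list[IO], year: int) -> Optional[float]:
    """Share of active IOs in `year` that are informal (None if none active)."""
    active = [io for io in ios if is_active(io, year)]
    if not active:
        return None
    informal = sum(1 for io in active if not io.formal)
    return informal / len(active)


# Invented toy records (assumption: purely illustrative, not real IOs).
TOY_IOS = [
    IO("Formal A", True, 1950, None),
    IO("Formal B", True, 1960, 1999),
    IO("Formal C", True, 1970, None),
    IO("Informal X", False, 1985, None),
    IO("Informal Y", False, 1995, None),
]

for year in (1980, 1990, 2010):
    print(year, informal_share(TOY_IOS, year))
```

In this toy example the informal share rises both because informal bodies are founded and because one formal body dissolves, which mirrors the two trends the paragraph describes.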

Fig. 1 Formal and informal IOs are increasing over time. The growth of informal IOs is roughly constant from the 1980s, but the number of formal IOs peaks in 1998.

A careful reading of recent work also demonstrates that formal and informal bodies can differ in important ways that may be relevant to various causal arguments about IOs. Certainly, they share crucial features. As organizations, for instance, they both provide venues for state officials to gather, bargain, and exchange information. Both may also have secretariats that perform nearly identical functions. For some analyses, these similarities may be what matter most. However, informal bodies are also thought to possess functional properties that make them better or worse at addressing specific cooperation problems. And, for other analyses, these can be highly relevant and cannot be assumed away. For example, Vabulas and Snidal have argued that informal organizations are faster, more flexible, and can offer states greater confidentiality (Vabulas & Snidal, 2013; Sauer, 2019; Carlson & Koremenos, 2021). Formal organizations are diverse, but can, on balance, facilitate cooperation between larger groups of states, achieve greater scale and scope, and operate more independently. Accordingly, current work suggests these organizations are created for different reasons. Informal IOs are established when states face problems that are rapidly changing and politically sensitive; formal organizations are chosen when there is greater scope for opportunism. Even more recently, others have shown that informal IOs have different domestic implications, and that when powerful states face significant domestic constraints they rely on informality to bypass political opposition (Roger, 2020; Vabulas & Snidal, 2020).

A number of scholars have also demonstrated that informal IOs can have unique impacts on state behaviour. Uwe Puetter (2006) and Tom Sauer (2019) claim that informal bodies offer social environments that are more conducive to deliberation and consensus-building. By doing away with diplomatic protocol and shielding negotiations from the public eye, informal organizations help officials to engage in free-flowing dialogue and build trusting relationships. This can, in turn, make them effective environments for persuasion, socialization, and policy learning. Kal Raustiala (2002) and Chris Brummer (2014) have advanced related arguments, explaining how participation in informal bodies can promote adherence to international rules, since officials within them can foster closer interpersonal relationships and this generates stronger reputational effects. Finally, Henry Farrell, Abraham Newman and their coauthors argue that access to informal IOs and reliance upon soft law can strengthen domestic actors by providing political resources and building technical capacity (Bach & Newman, 2010; Newman & Posner, 2018; Farrell & Newman, 2019).

Taken together, these studies imply that, although formal and informal IOs share important characteristics, there may be significant differences in the composition of state membership across the two subtypes. If states use formal and informal organizations to achieve different ends, then one would expect to see unique political dynamics unfolding across subtypes. And if, as a result, one type of state prefers to govern through formal bodies while another type prefers informal ones, then measuring participation by counting only formal IOs could severely undercount membership and lead to misspecification of arguments when such data are used. Accordingly, we would risk misunderstanding a significant component of international organization.

To illustrate how these compositional effects manifest, consider counts of IO membership from the dataset we utilize in this article (discussed in more detail below). Figure 2 plots state membership in formal and informal IOs at four different time periods from 1955 to 2010. Membership across each subtype of IO is positively correlated, as states that participate the most in one subtype of IO are also often the top participants in the other subtype of IO. Nonetheless, there are substantial differences in the membership portfolios of individual states. For example, by 2010, many states had joined over 50 formal organizations, but had fewer than 15 informal memberships; meanwhile, others were affiliated with relatively similar numbers of formal and informal IOs. While most states are joining more informal IOs during this time period, the gap between the most and the least active states grows steadily. Ultimately, the composition of IO memberships—which IOs states join and who states partner with—can differ across IOs in myriad ways that these scatter plots can only hint at. These counts do not indicate whether states select different partners across subtypes of IOs or provide more qualitative information about membership decisions. Nonetheless, Fig. 2 emphasizes the basic point of the concept–measure mismatch: namely, assuming behaviour associated with formal bodies is representative of behaviour across all IOs is questionable and may lead us to different conclusions than we might reach if we considered a fuller population of IOs.

Fig. 2

State membership in formal and informal IOs increases over time and correlates positively. By 2010, there are large differences in IO portfolios

4 Empirical analysis: How the mismatch matters for research on IOs

How exactly does this mismatch shape analyses of world politics? We study this by re-examining several prominent studies that have employed the COW dataset, observing how their results change when we operationalize the concept of an IO differently. To do so, we utilize a new dataset of state participation in informal IOs. This builds on an already-published but more limited version (Roger, 2020). In the basic dataset, informal IOs are defined as international institutions, created by states, that are constituted by non-binding international agreements and that meet a minimum threshold of institutionalization, defined as possession of a distinct corporate identity, regular meetings, and evidence of committee structures or common decision-making procedures.Footnote 6 Each element of this definition, which mirrors the one underlying the COW dataset, is operationalized using several specific indicators, which were then used to sift through hundreds of “candidate” organizations. These indicators, explained further in the Appendix and in Roger (2020), determine whether an agreement is considered “non-binding,” whether meetings are “regular,” what counts as evidence of “committee structures,” and so on. For candidate organizations, the basic dataset relied primarily on the same source material as COW, namely the Yearbook of International Organizations. As such, the final dataset can be readily combined with the COW dataset to provide a more complete picture of the universe of IOs.Footnote 7 The elementary units in both are the same (they are all IOs) and, at bottom, differ only in their levels of legal formality.

The first version of the informal IO dataset only contained information on the years in which each organization was established and the original members, or “founders,” of each body. This naturally limited its usefulness for scholars. Here, we have vastly expanded its value by adding yearly membership data for 244 (94%) of the 260 informal bodies it contains. This was done for all states and in accordance with COW’s coding rules (Pevehouse et al., 2004; Pevehouse et al., 2020). Thus, the two datasets are now fully compatible at the state-year and IO-year levels. Scholars can move between IO- and state-level measures of membership across formal, informal, and a summary of “total” IOs as appropriate for their research. Note, though, that a number of informal IOs that appear in the COW dataset have been removed to create a separate measure of formal IO memberships that we use in the reanalyses.
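The move between IO-level records and state-level counts of formal, informal, and total memberships can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the record layout, the sample rows in `MEMBERSHIPS`, and the function name `count_memberships` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical membership records: (state, year, IO name, subtype).
# Names and values are invented for illustration only.
MEMBERSHIPS = [
    ("A", 2010, "IO-1", "formal"),
    ("A", 2010, "IO-2", "formal"),
    ("A", 2010, "IO-3", "informal"),
    ("B", 2010, "IO-1", "formal"),
    ("B", 2010, "IO-3", "informal"),
    ("B", 2010, "IO-4", "informal"),
]


def count_memberships(records):
    """Return {(state, year): {"formal": n, "informal": n, "total": n}}."""
    counts = defaultdict(lambda: {"formal": 0, "informal": 0, "total": 0})
    for state, year, _io, subtype in records:
        counts[(state, year)][subtype] += 1   # subtype-specific count
        counts[(state, year)]["total"] += 1   # summary count across subtypes
    return dict(counts)


COUNTS = count_memberships(MEMBERSHIPS)
```

In this toy example, state A holds two formal and one informal membership in 2010, while state B holds one formal and two informal memberships, so the two states have the same total but different portfolios.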

We assess the generalizability of existing findings with an eye to understanding whether and how subtypes of IOs generate distinct effects in world politics. Depending on the type of argument a study makes, there are two ways that data may affect findings. First, some studies explicitly intend to apply to all IOs or, in view of their theoretical claims, can plausibly be interpreted that way, but are actually evaluated using only the COW dataset of formal organizations. Our examination suggests this constitutes a large share of quantitative studies, though certainly not all. Here, the key question is whether conclusions do in fact extend to the full population of IOs as intended. If the findings hold using a more encompassing measure of IOs, then the original inferences are strengthened. However, if key findings change, then more restrictive scope conditions may be necessary. In the extreme, the causal claims may need to be rethought. Second, for studies that restrict their arguments to formal IOs, there remains an interesting theoretical question of whether their findings actually travel further than intended, which is a question of scope conditions. A reanalysis may support the original inference or lead us to conclude that the relationship actually extends across subtypes. In the latter case, the original scope conditions are too conservative. For each type of argument, whether about all IOs or about formal IOs in particular, it is possible that conclusions may change when evaluated using new data.Footnote 8

Our replication-based review of these arguments allows us to probe the generalizability and scope conditions of existing findings by showing how different operationalizations have very precise implications for individual studies. In our reanalyses, we are primarily interested in the stability of regression coefficients when relationships are estimated using different measures of IO membership. If changes are small, this would suggest that the concept–measure mismatch we have highlighted may be a more peripheral concern. On the other hand, if an alternative operationalization leads to large changes in coefficients or substantive shifts in our interpretation of results, this would indicate that the mismatch has real implications. Overall, it should be emphasized, our intention is not to show that these studies are “wrong.” It is, instead, to probe their empirical and theoretical limits, with an eye to identifying areas where this disjuncture between concepts and measures of IOs does seem to shape the answers we get, and where there may be a need for tighter argumentation, theoretical innovation, or refinement of research practices.

We selected studies, first, to maximize the breadth of coverage. Each article advances a unique argument about IOs; namely, concerning their role as forums for socialization, as tools of states, and as mechanisms for signalling cooperative intentions—three of the major roles identified in the broader IO literature. Second, all three studies make use of the COW dataset but, crucially, appear to make claims about IOs in general and include language providing reasonable grounds for thinking that the arguments could or should apply to a broader range of cases. There are two rationales here. First, the papers are somewhat ambiguous about the scope of their claims and none explicitly limits its arguments to formal IOs alone. Indeed, the term “formal IO,” or a reasonable equivalent, is not used in any; each is concerned with “international organizations,” with no qualifying adjective. Interestingly, as well, when the term is defined, the authors have used definitions that are not aligned with the one underlying the COW dataset, making no reference to legal formality, and which are known to be more inclusive. Given this, readers might well embrace a quite expansive interpretation of the claims being made, particularly in view of the wider concept of an IO that has historically been embraced by the field, as demonstrated above. Perhaps, of course, formality was implied. But, second, and equally importantly, none of the articles develops its argument solely with reference to the legal nature of the institutions they investigate. When we disentangle the theoretical claims being advanced, there are many statements implying that the theory could well apply to all IOs, in principle, or which can lead to conflicting interpretations of the role formality should play. Thus, in our discussion of each paper, we introduce the arguments that are made and explain why it is plausible to include informal IOs within the scope of each analysis.

Beyond these considerations, finally, we selected studies where changes could be clearly identified. Some studies sub-divide IOs based on their level of institutionalization such that they cannot be directly compared with our dataset at present (Boehmer et al., 2004; Hooghe et al., 2019). And, within this set of studies, we chose work that employs strong research designs, is widely cited, has been published in leading journals, and whose findings could be replicated. As a result, we believe these studies represent best practices in the field and should constitute “tough cases” for our argument. If key findings change here, this is likely to occur in a variety of other studies as well.

This is indeed what we find, although to varying degrees and in different ways in each of the three studies. Overall, the reanalyses provide strong evidence that the measures being used in scholarship on IOs can matter a great deal, indicating the importance of carefully matching concepts and measures in this area of study. The cases also illustrate our broader point about the importance of considering informal IOs in world politics. Each of the findings we revisit could arise from different underlying mechanisms that have been debated in the field up to the present day, particularly whether formal legal commitments or other mechanisms linked to state membership in an organization lead IOs to have the effects they do. While we do not aim to resolve these debates definitively, we show how greater attention to the legal design of institutions can help refine and advance theoretical inquiry.

4.1 Case 1. Human rights and the socializing role of IOs

Do IOs act as forums for socialization? Brian Greenhill investigates this by studying how the levels of human rights protection afforded by a state’s counterparts in IOs influence its own human rights practices. This study builds on a well-developed constructivist literature that explores how states’ interests can be reformulated “through [processes] of interaction with other states, whereby states copy, or learn from, the forms of behavior exhibited by others” (Greenhill, 2010, 129). Greenhill argues that IOs provide useful venues for such interactions and that when actors adhering to different norms are brought together on a regular basis this should lead to a level of normative convergence that would not be expected otherwise. Specifically, membership in IOs should improve human rights practices when the other IO members that a state interacts with have strong respect for human rights.

To evaluate this argument, Greenhill measures the average human rights protection afforded by members of an international organization—what he refers to as an IO’s membership “context”—then aggregates this across a state’s portfolio of IO memberships. This measure represents the average level of respect for human rights among the members of an IO averaged over each body that a country is a member of (see Eq. 1).

Greenhill derives the IO context measure from the COW dataset, such that the theory is only tested for formal IOs. It is possible, of course, that he only ever intended the argument to apply to formal IOs, though Greenhill never uses the term itself. Nonetheless, there are grounds for readers to think the argument applies more broadly, and that if it really was his intention to limit his argument to formal IOs, then those scope conditions might be too restrictive. Crucially, in our view, he theorizes that “this type of socialization effect can be thought to take place in any [IO] that provides some sort of venue for interstate communication” (Greenhill, 2010, 130, emphasis in original). As stated, then, Greenhill’s argument appears to be about all IOs. It does not establish scope conditions that limit the argument to formal institutions, and it is plausible to think there are good reasons not to restrict the scope of the argument: formal organizations constitute only a fraction of the venues in which state officials communicate with their foreign counterparts. Informal organizations offer equally plausible, perhaps even superior, forums for socialization, as some researchers have suggested (Checkel, 2001; Raustiala, 2002; Puetter, 2006; Sauer, 2019).Footnote 9 Within informal organizations, government officials may have greater privacy, which may promote franker discussion, and ultimately facilitate more effective persuasion. If these reasons hold, then estimates of the socializing effects of IOs may even be stronger when we use a more inclusive measure of membership, all else being equal.

Is this the case? We re-examine how IOs act as forums for socialization by replicating Greenhill’s study using new measures of states’ IO context across IO subtypes. We construct separate measures of states’ IO context for formal IOs only, informal IOs only, and total IOs. Each is constructed as follows:

$$ \text{IO Context}^{d}_{it} = \frac{{\sum}_{j=1}^{J} (\overline{\text{HR}}_{j t} | \text{IO}^{d}_{ijt} = 1)}{\# \text{IOs}^{d}_{it}} $$
(1)

where d ∈ {Greenhill, Formal IOs, Informal IOs, Total IOs} indexes the different IO datasets, i indexes states, j ∈ J indexes IOs, t indexes years, and \(\overline {\text {HR}}\) denotes the average Cingranelli and Richards (CIRI) human rights score of states in IO j. The final measure is the average human rights score of the member states involved in all the IOs that state i is a member of in each year. Formal and informal IO context measures are accordingly on the same scale, facilitating interpretation. The measures are highly correlated, although informal IO context scores tend to be higher than formal ones (see Fig. APP-1).
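The construction in Eq. 1 can be illustrated with a short sketch for a single year. This is only an illustration under stated assumptions: the scores in `HR`, the rosters in `MEMBERS`, and the function name `io_context` are hypothetical, and the sketch includes state i's own score in each IO average because the text here does not specify whether it is excluded.

```python
from statistics import mean

# Hypothetical inputs for one year t (illustrative values only).
# HR[state]: human rights score of each state in that year.
HR = {"A": 8, "B": 4, "C": 2}
# MEMBERS[io]: set of member states of each IO in the chosen dataset d.
MEMBERS = {
    "IO-1": {"A", "B"},
    "IO-2": {"A", "B", "C"},
    "IO-3": {"B", "C"},
}


def io_context(state, members, hr):
    """Average, over the IOs a state belongs to, of each IO's mean HR score."""
    io_means = [
        mean(hr[s] for s in roster)   # mean score among the members of IO j
        for roster in members.values()
        if state in roster            # keep only IOs where the state is a member
    ]
    if not io_means:
        return None                   # no memberships, so the measure is undefined
    return sum(io_means) / len(io_means)  # divide by the number of IOs joined
```

Passing only formal rosters, only informal rosters, or the union of both to `io_context` would yield the formal, informal, and total variants of the measure on the same scale.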

Greenhill estimates an ordered probit model with lagged independent variables and a lagged dependent variable. Accordingly, we estimate versions of the same econometric model:

$$ Prob(\text{CIRI}_{it}{}={}l) = Prob(\kappa_{l-1} < \beta \text{IO Context}^{d}_{it-1} {}+{}\delta \text{CIRI}_{it-1} + \boldsymbol{\gamma} \mathbf{C}_{it-1} + \epsilon_{it} < \kappa_{l}) $$
(2)

where d ∈ {Greenhill, Formal IOs, Informal IOs, Total IOs} indexes the IO context measures from different datasets, i indexes states, t indexes years, l ∈ L indexes the eight levels of CIRI physical integrity rights scores, and C is a vector of country-level covariates. We are interested, above all, in the stability of β, the coefficient on our measures of states’ human rights context across the different samples of IOs.
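To make the structure of Eq. 2 concrete, the sketch below computes the category probabilities that an ordered probit implies for a given linear predictor and set of cutpoints κ. It is a generic illustration, not the estimation code used in the reanalysis: the function names and the example cutpoints are hypothetical, and the error term is assumed to be standard normal.

```python
import math


def normal_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ordered_probit_probs(linear_predictor, cutpoints):
    """Probability of each ordered category given x'b and sorted cutpoints.

    With K cutpoints there are K + 1 categories. The outermost thresholds
    are treated as minus and plus infinity.
    """
    kappas = [-math.inf] + list(cutpoints) + [math.inf]
    return [
        # P(kappa_lo < x'b + e < kappa_hi) = Phi(kappa_hi - x'b) - Phi(kappa_lo - x'b)
        normal_cdf(hi - linear_predictor) - normal_cdf(lo - linear_predictor)
        for lo, hi in zip(kappas[:-1], kappas[1:])
    ]


# Hypothetical example: three cutpoints, hence four ordered categories.
LOW = ordered_probit_probs(0.0, [-1.0, 0.0, 1.0])
HIGH = ordered_probit_probs(0.5, [-1.0, 0.0, 1.0])
```

A larger linear predictor, for instance from a positive β and a higher IO context score, shifts probability mass toward the higher categories, which is the sense in which a positive coefficient indicates better expected human rights scores.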

Table 1 reports our reanalysis of Greenhill’s main model.Footnote 10 In model 1.1, we present the baseline estimate from Greenhill’s paper, using his own IO context variable and the full set of controls from his replication files. In model 1.2, we show results using a refined formal IO-only context variable that removes a number of informal organizations that appear in the COW dataset. We find the same relationship: better human rights practices among fellow IO members are associated with improvements in human rights domestically. Turning to informal IOs in model 1.3, we find the same positive and statistically significant relationship. Substantively, a one-standard-deviation change in informal IO context is associated with a one-seventh standard deviation change in human rights practices, while an equivalent change in formal IO context is only associated with a one-ninth standard deviation change in human rights practices. These are actually relatively small differences substantively, and the confidence intervals for the standardized coefficients overlap across measurement specifications. Finally, in model 1.4, we find the same positive relationship for total IO context.

Table 1 Exact replications of Greenhill (2010: table 2, model 1)

In our view, the results of our reanalysis broadly support Greenhill’s central claim about international organizations as forums for socialization. But they also modify his ideas, as his argument generalizes to all IOs. In the context of our analysis exercise, we interpret this as a “minor change” scenario, where an existing finding is reinforced or travels further than originally intended when using a wider measure of IO membership. Across all of the measures we employ, the key coefficients are positive, suggesting that interacting in all types of IOs can lead to improvements in human rights practices. Informal IOs have the strongest socializing effect, although the difference between them and formal ones is minor. This nevertheless shows that formality may be causally relevant, at least at the margin, but scholars will have to specify and investigate how this is so more precisely. For now, the main point is that operationalizing IO membership differently can lead us to extend an existing finding to a wider set of IOs. As our next reanalysis shows, however, this need not always be the case: findings are sometimes inflected as well.

4.2 Case 2. Democratization and the drivers of membership in IOs

Why do states join IOs? Many studies have argued that states become members in order to coordinate with others and solve problems that they could not on their own. This sort of argument is largely functional in nature. However, in an important article Edward Mansfield and Jon Pevehouse (2006) argued that IOs can provide important, purely domestic benefits too. Specifically, their study drew attention to how membership can help democratizing states cement liberal reforms. They claim that IOs do this, first, by transmitting information about leaders’ behaviour and “sounding an alarm” if reforms fall short or are rolled back. Second, IOs may be able to impose conditions on those seeking membership, driving behavioural convergence around liberal practices. And, finally, IO membership may raise the costs of reform-reversals, since such moves may be punished through sanctions, suspensions, or expulsions that can disrupt the stream of benefits that flow from international cooperation. These attributes of IOs are attractive to democratizing policymakers, and therefore Mansfield and Pevehouse argue that democratizing states can be expected to join more IOs.

Should this theory apply to formal organizations only? Mansfield and Pevehouse are unclear on this point, as they never discuss formality or law in their text and use a definition of IOs that is more expansive than the one in the COW dataset, making no reference to legal formality (Mansfield and Pevehouse, 2006, 138; Shanks et al., 1996, 593). It is plausible to think that this is simply assumed. More specific language might have helped to clarify their intentions, a point we return to later on. But when we consider the authors’ causal mechanisms, only some of these are obviously better suited to formal IOs. Defecting from a binding agreement should have greater reputational costs compared to a non-binding agreement. This matches research in international law on the causal role that such commitments generally play. However, informal IOs may be equally capable of providing important support for democratizing states. Informal bodies can, for instance, enact policy surveillance, peer review, and membership conditionality (Brummer, 2014). Indeed, they often do this quite well. Several prominent election monitoring bodies that raise alarm bells for backsliding are informal in nature, including the Commonwealth Secretariat and the Organization for Security and Cooperation in Europe. Paul Poast and Johannes Urpelainen (2018) have also argued that the “real” work that IOs do to facilitate democratic consolidation is through the more mundane task of capacity-building and training officials instead of through explicit commitments and conditionality. Again, informal bodies can do this quite effectively, since capacity-building is often a critical instrument in their governance toolbox (Raustiala, 2002; Slaughter, 2004; Bach and Newman, 2010). To this point, many IOs designed explicitly to support new democracies are informal as well, including the Community of Democracies and the Community of Democratic Choice.

As in our discussion of the socialization case study, we can consider what the scope conditions should be for this argument. If we read Mansfield and Pevehouse as arguing that only formal IOs can support democratizing states, then the scope conditions are clear. However, the existing literature advances alternative mechanisms to explain why democratizing states seek IO membership, and many of these mechanisms could be activated by informal IOs. As a result, it is an interesting theoretical and empirical question whether the scope conditions should bound the findings to formal IOs only, or include informals as well. Adding informal IOs to our analysis can actually help to evaluate these arguments, since it heightens the contrast between mechanisms linked to formal, explicit commitments and those linked to capacity-building.

Mansfield and Pevehouse predict the number of IOs a state joins by estimating a linear regression model with panel-corrected standard errors. We replicate their analysis using the authors’ original independent variables and the new data on IO membership described in the previous section. Specifically, we estimate versions of the following econometric model:

$$ \begin{aligned} {\Delta} \# \text{IOs}^{d}_{it} = \beta_{1}\text{Democratization}_{it} + \beta_{2}\text{Autocratization}_{it} + \beta_{3}\text{StableDemocracy}_{it}\\ + \boldsymbol{\gamma} \mathbf{C}_{it} + \boldsymbol{\delta} \mathbf{Z}_{t} + \alpha_{r} + \epsilon_{it} \end{aligned} $$
(3)

where d ∈ {Mansfield & Pevehouse, Formal IOs, Informal IOs, Total IOs} indexes the IO measures from different datasets, i indexes states, t indexes years, \(\alpha_{r}\) are fixed effects for geographical regions, C is a vector of country-level covariates, and Z is a vector of time-varying, system-level covariates. Regime changes (democratization and autocratization) are measured as shifts in regime type relative to five years prior, using changes in countries’ Polity scores. Stable autocracy is the left-out regime category. We are particularly interested in the stability of \(\beta_{1}\), the coefficient on democratization estimated using different IO datasets.

Table 2 presents the results. In model 2.1, we replicate the original finding and recover the same positive, statistically significant effect of democratization on IO membership. We find a similar substantive effect in model 2.2 using our revised measure of formal IOs only, and the same positive and statistically significant relationship for total IO membership in model 2.4. However, model 2.3 suggests a much smaller effect for informal IOs, and one that is not statistically significant: democratization has no apparent effect on states’ decisions to join informal bodies. Comparing across models raises the issue of interpreting effect sizes. Democratization is associated with joining 0.314 [0.118, 0.510] formal IOs, while democratizing states join only 0.032 [-0.058, 0.122] informal IOs. The lower limit of the confidence interval for formal IO membership slightly overlaps with the upper limit of the confidence interval for informal IO membership. In further analyses using standardized coefficients, these are statistically distinguishable only at the 85.6% level (see appendix figure APP-3).
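The extent of the overlap between the two reported intervals can be checked with simple arithmetic. The sketch below uses only the interval bounds quoted in the text; the helper name `interval_overlap` is hypothetical.

```python
def interval_overlap(a, b):
    """Length of the overlap between two closed intervals (0.0 if disjoint)."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return max(0.0, high - low)


# Interval bounds as reported in the text for the democratization coefficient.
FORMAL_CI = (0.118, 0.510)
INFORMAL_CI = (-0.058, 0.122)

# Rounded to three decimals to match the precision of the reported bounds.
OVERLAP = round(interval_overlap(FORMAL_CI, INFORMAL_CI), 3)
```

The intervals share only the narrow band between 0.118 and 0.122. Overlap of this kind is a rough visual guide rather than a formal test, which is why the comparison of standardized coefficients reported in the appendix is the more informative check.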

Table 2 Exact replications of Mansfield & Pevehouse (2006: table 2, model 1.1)

These qualifications aside, the overall results of the analysis reshape our interpretation of the causal mechanisms linking democratization and IO membership. Our findings suggest that democratization has a net positive effect on membership, supporting the original finding. However, we also find that this effect appears to differ across IOs. Crucially, this appears to be driven by the fact that democratizing states reach out to formal, not informal, IOs. This, in turn, suggests that the legal nature of these institutions and their ability to facilitate credible commitments matters to states, not solely their ability to build capacity, as Poast and Urpelainen theorize. Accordingly, this finding helps to illustrate our concerns about how different ways of measuring IOs can affect results, and it emphasizes the need for greater specificity about the set of cases that a causal mechanism is intended to apply to. Ultimately, this analysis offers a mixed case for the literature since the overall effect of IOs points in the same direction, but is substantially lower due to heterogeneity across subtypes. It shows that this is driven by the effect of just one type of IO, not all IOs, and reveals how exploring differences across institutions can contribute to current theoretical debates. Here, the legality of institutions appears to matter, not merely the fact of being international organizations.

4.3 Case 3. Domestic or international drivers of global governance?

In the previous case, results proved to be different in theoretically consequential ways, but the original finding was broadly upheld. In the final study that we revisit, accounting for informal organizations changes our perspective more substantially. Thomas Bernauer, Anna Kalbhenn, Vally Koubi, and Gabriele Spilker assess whether state ratification of international environmental agreements (IEAs) is better explained by “domestic” or “international” factors (Bernauer et al., 2010). Domestic factors are those linked to states’ domestic political economies, such as their institutions and wealth. International factors refer to states’ relationships with other actors, such as regional dynamics, globalization, or peer effects. Bernauer et al. operationalize these latter linkages with several indicators, but mainly focus on the extent of states’ pre-existing IO memberships. The study argues that IO membership supports broader global governance behaviour (such as the ratification of IEAs) because “more extensive membership in international organizations motivates states to behave more co-operatively when it comes to forms of international cooperation that lie outside the scope of specific international organizations they have joined at some prior time” (Bernauer et al., 2010, 514). They argue, further, that “membership in international organizations signals a general willingness of states to behave co-operatively in international matters, which states may also carry over to other very particular issue areas such as environmental policy” (Bernauer et al., 2010, 515).

The argument, as stated, applies to IOs as such. The term “formal IO” or a reasonable equivalent does not appear in the text. Although it is possible, again, that the authors intended the argument to apply to only formal IOs, this is not stated explicitly and based on their own reasoning it is plausible to think that it extends further. The paper nevertheless operationalizes state membership in IOs using the COW dataset and finds that greater IO membership is associated with more and quicker IEA ratifications. Crucially for the argument being made, it also finds that the effect of IO membership outweighs the effects of domestic variables. Everything hangs on this result. Yet, a positive effect of IO membership could actually be consistent with either a signalling mechanism or a socialization mechanism, as the sections quoted above illustrate. And this matters, in turn, for the types of IOs we think the argument should apply to. If membership primarily signals greater cooperativeness, then legal commitments could lead formal IOs to have larger effects than informal IOs, as in the study by Mansfield and Pevehouse. But, if membership mainly works by socializing states to more cooperative preferences, then both formal and informal memberships should have positive effects. Informal IOs might even be more efficient mechanisms, as our consideration of Greenhill suggests. New data can help disentangle the scope of these two mechanisms, since formal and informal IOs differ in their legality but both serve as plausible forums for socialization. If only formal memberships predict ratification, then we should draw the more qualified inference that membership in more legalized institutions, not international factors per se, drives governance patterns, perhaps because greater formality screens “sincere” co-operators. As with the two previous case studies, we can evaluate differences across formal and informal IOs as a way to consider the appropriate scope conditions for existing arguments.

To what extent, then, does this finding rest on the particular way that the study operationalizes IO membership, and what implications do revisions have for our understanding of theoretical mechanisms? Bernauer et al. estimate a logistic regression model treating a state’s binary IEA ratification as grouped duration data with the time interval set to one year. The unit of observation is the country-treaty pair in each year, allowing the inclusion of treaty-specific and country-specific covariates. The study models time dependence using cubic time polynomials. Here, we estimate versions of the following econometric model:

$$ \begin{aligned} \Pr(\text{Ratification}_{ijt} = 1 \mid \text{Ratification}_{ijt-1} \neq 1) = \operatorname{logit}^{-1} (\beta \, \#\text{IOs}^{d}_{it} \\ + \boldsymbol{\gamma} \mathbf{C}_{it} + \boldsymbol{\zeta} \mathbf{X}_{ijt} + \boldsymbol{\theta} \mathbf{W}_{jt} + \boldsymbol{\delta} \mathbf{Z}_{t} + \alpha_{r} + \epsilon_{ijt}) \end{aligned} \tag{4} $$

where d ∈ {Bernauer et al., Formal IOs, Informal IOs, Total IOs} indexes the IO measures from different datasets, i indexes states, j indexes IEAs, t indexes years, α_r are fixed effects for geographical regions, C is a vector of country-level covariates, X is a vector of country-treaty-level covariates, W is a vector of treaty-year covariates, and Z is a vector of time-varying, system-level covariates. As earlier, we are interested in how stable β, the coefficient on the count of IO memberships, remains across datasets.
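To make the data structure behind this model concrete, the following is a minimal sketch of how a single country-treaty pair can be expanded into yearly at-risk observations with cubic time polynomials, and how a conditional ratification probability follows from the inverse logit. The function names, the reduced set of terms in the linear predictor, and all coefficient values are illustrative assumptions; they are not taken from Bernauer et al. or from the estimates reported here.

```python
import math


def inv_logit(x):
    """Inverse logit link: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def expand_spell(open_year, ratify_year, last_year):
    """Expand one country-treaty pair into yearly at-risk observations.

    The pair enters the risk set when the treaty opens and leaves it after
    the year of ratification. If ratify_year is None (or falls after
    last_year), the pair is censored at last_year.
    """
    end = last_year if ratify_year is None else min(ratify_year, last_year)
    rows = []
    for year in range(open_year, end + 1):
        t = year - open_year  # years at risk so far
        rows.append({
            "year": year,
            "t": t, "t2": t ** 2, "t3": t ** 3,  # cubic time polynomial
            "ratified": int(year == ratify_year),
        })
    return rows


def ratification_prob(row, n_ios, coefs):
    """Probability of ratifying in this year, given no earlier ratification.

    Only the IO count and the time polynomial enter this simplified
    predictor; the covariate vectors and region effects are omitted.
    """
    eta = (coefs["intercept"] + coefs["beta"] * n_ios
           + coefs["t"] * row["t"]
           + coefs["t2"] * row["t2"]
           + coefs["t3"] * row["t3"])
    return inv_logit(eta)


# Hypothetical values for illustration only; not estimates from any study.
coefs = {"intercept": -3.0, "beta": 0.02, "t": 0.1, "t2": -0.01, "t3": 0.0002}
rows = expand_spell(open_year=1992, ratify_year=1995, last_year=2000)
probs = [ratification_prob(r, n_ios=40, coefs=coefs) for r in rows]
```

Because observations stop once ratification occurs, each row contributes a probability that is conditional on the pair not yet having ratified, which is what the left-hand side of the model expresses.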

In Table 3, we begin by replicating Bernauer et al.’s main finding using their published dataset in model 3.1. We recover the same statistically significant positive effect of membership on IEA ratification. In model 3.2, we use revised formal IO membership data and match the original finding. The results begin to diverge in models 3.3 and 3.4. First, in model 3.3, we find that membership in informal institutions has no discernible effect on IEA ratification, even though the study’s theoretical argument could be expected to generalize to this domain. Second, in model 3.4, while we continue to find that states joining more IOs also ratify more IEAs, the substantive effect of membership is reduced. This attenuation is critical for the study’s conclusions, since the authors rely on the large coefficient in model 3.1 to stress the importance of international over domestic factors as drivers of global governance. The effect of membership is now weaker than that of regime type and comparable to that of many other variables in the model. To compare across models, we standardize the coefficients and find that their confidence intervals overlap across measures. The original IO measure, as well as the formal and total IO measures, have very similar positive standardized coefficients and confidence intervals, though the standardized coefficient for informal IOs is nearly zero and its confidence interval includes zero. Nonetheless, the upper limit of the standardized informal IOs confidence interval partially overlaps with the lower limit of the standardized formal IOs confidence interval. The two coefficients are statistically distinguishable only at the 92.4% confidence level (see appendix figure APP-3).
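The comparison of standardized coefficients can be illustrated with a minimal sketch. It rescales a coefficient by the standard deviation of its regressor and then asks at what confidence level two estimates can be told apart, using a normal approximation and treating the two estimates as independent. Both the independence assumption and the function names are ours for illustration; the text does not state how the 92.4% figure was computed, and estimates drawn from the same sample will generally be correlated.

```python
import math
from statistics import NormalDist


def standardize_coef(coef, se, sd_x):
    """Rescale a coefficient and its standard error to a one-SD change in x."""
    return coef * sd_x, se * sd_x


def distinguishable_level(b1, se1, b2, se2):
    """Highest two-sided confidence level at which b1 and b2 differ.

    Assumes independent, approximately normal estimates (a simplification).
    """
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_two_sided = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return 1.0 - p_two_sided


# Hypothetical standardized estimates; not values from Table 3.
level = distinguishable_level(b1=0.3, se1=0.1, b2=0.0, se2=0.1)
```

A returned level below 0.95 means that the two coefficients cannot be distinguished at the conventional 95% threshold, even if one of them is individually significant and the other is not.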

Table 3 Exact replications of Bernauer et al. (2010: table 3, model 2)

Thus, a key implication from the Bernauer et al. study is not robust to this alternative specification of the independent variable. This leads, ultimately, to a very different conclusion than the one that Bernauer et al. reach. “International factors” do not seem to have uniform effects on IEA ratification and, overall, appear to be weaker than domestic ones. Evidently, not all international factors are equal, as membership in formal IOs is much more closely associated with IEA ratification than membership in informal ones. In terms of the two possible mechanisms, this would suggest that legalization plays the bigger role in their argument and that socialization processes have a more limited impact, if any, in this case. However, this finding raises another question of endogeneity. On the evidence, it is equally possible that certain “types” of states prefer “harder” forms of international cooperation, which manifests in a tendency to govern transboundary problems using both binding agreements and formal organizations. Another type of state may be more likely to prefer less institutionalization, leading it to address problems using “softer” policy instruments that are not fully captured by the binding IEAs investigated. Given that not all international factors matter and that formal varieties of cooperation go together, this possibility cannot be excluded. And, in fact, some support for this conclusion exists in a recent study by David Hagebölling, where shared informal IO membership better predicts a country dyad’s likelihood of signing a non-binding strategic partnership than shared formal IO membership, which does not have a statistically significant effect (Hagebölling, 2019).
All of this suggests that the key variable in Bernauer et al.’s study may actually be capturing the effect of domestic preferences over formality rather than the international impacts of membership, thereby locating the drivers of participation in global governance more squarely in domestic than international politics.

More would need to be done to test this possibility systematically. The key take-away, here, is that this case illustrates our main argument: that findings in the field can be profoundly shaped by the concepts embedded in our measures, and that taking informal IOs into account can help advance theoretical debates. Indeed, in scenarios where major changes to statistical results arise, this may lead us toward very different conclusions about international organization.

5 Conclusion

Informal organizations have increased in number and prominence. As a result, a growing body of research focuses on understanding them, shining a light on a relatively neglected variety of cross-border cooperation. Thus far, however, scholars have only rarely, if at all, considered what their earlier omission has meant for the study of IOs more generally. Our argument here is that this has been significant. Informal IOs have long been recognized as a type of IO, yet, along with emanations, they have been left out of the most prominent dataset. While this omission may have been justified at the time the COW dataset was created, this concept–measure mismatch has taken on increased significance, as informal IOs and reliance on the COW dataset have both grown. As informal bodies proliferate, the way IOs are operationalized in many studies has become detached from the real population of interest. This leaves us, ultimately, with a skewed view of world politics. Certainly, not every study is affected. But it is as if scholars aimed to generalize about IOs by only observing those with universal membership or only those dealing with economic issues. We risk impoverishing our understanding of IOs as a result.

Table 4 summarizes our conclusions about the concept–measure mismatch from the studies we examined. We find that formal and informal IOs have different drivers of membership, just as the effect of extensive participation in formal IOs differs from that of informal IOs for predicting state ratification of environmental treaties. However, this does not suggest that informal IOs do not matter. Indeed, we also show that informal IOs have a socializing effect that is the same as, or perhaps even stronger than, that of formal IOs. There is no single net effect of informal IOs on world politics. Instead, substantive theories of IOs (for example, whether IOs are forums for socialization or tools for commitments) are implicated differently by considering subtypes of IOs. Ultimately, taking informal IOs into account can reshape our understanding of IOs and shed light on enduring theoretical debates.

Table 4 Reanalysis findings and implications

While it is impossible to determine the number of studies whose results are shaped by these issues, there are reasons to think this is only the tip of the iceberg. Given the prominence and rigour of the studies we examine, our conclusions are likely to extend to other published results. Many other unpublished studies could also be affected. Mismeasuring IO membership can inflate effect sizes, as in the democratization replication, but it could also attenuate effect sizes. Because the COW data truncate IO membership, some studies may have found it harder to detect effects of IOs, leading scholars to find no effects where hypotheses may have been valid for informal or total IO membership. It is, of course, impossible to guess how many studies may have been affected in this way, given that null findings are almost never published, but the potential impact of such an effect could be quite large.

The issue is also likely to extend to a range of other studies that focus on narrower subsets of IOs. The arguments we reanalyze treat all IOs alike, essentially pooling information across different types. But, recent research on IOs uses these aggregate measures less frequently than in the past, choosing to focus instead on specific features, such as their legal authority or the access they grant to transnational actors (Tallberg et al., 2013; Hooghe et al., 2019). The same issue can affect these studies, though, since they typically use the COW dataset to generate their lists of IOs, drawing random samples from it or focusing on specific subsets. Their analyses are, therefore, just as likely to be conditioned by legal formality as the ones we have focused on.

What should be done moving forward? First, we believe readers should be conscious of the generalizability of published findings on IOs. Too often, scholars refer to “international organizations” rather loosely, leaving the actual domain of an analysis and the degree to which an argument can generalize ambiguous. It is possible that a causal process applies to all IOs, but it is also possible that it varies across subtypes of IOs, or even other forms of governance entirely. Short of reanalysing studies, as we have done here, consumers of IO research will have to think carefully about the degree to which existing theory suggests the legal formality distinction matters. For an argument like Greenhill’s, what matters is that an IO provides opportunities for socialization and formality appears secondary. For other arguments, formality dominates. For some theoretical arguments, we may have strong priors that mechanisms are conditional on formality. But for many others, effects could hold across types. In such cases, only a careful reanalysis will provide definitive insights.

With respect to future research, the onus falls on producers. We think much can be accomplished by being more specific about terminology and scope conditions. If a scholar argues that a study is about “IOs” as such, they must recognize that the total population is vast, including formal and informal IOs, but also, we would add, many emanations as well. If this is not the population that an argument is intended to generalize to, then appropriate qualifications should be used to express an argument’s intended scope. Encouragingly, a growing number of studies define their population precisely (Hooghe et al., 2019), but this practice should become more widespread. Then, for the set of IOs that a theory is intended to apply to, it is imperative to use data that matches the concepts being employed. If a theory applies to all types, we suggest that researchers adopt a broader measure of IOs—both formal and informal. If not, then a scholar should state the level of uncertainty surrounding their claims. In other cases, of course, a theory may only be expected to apply to formal or informal IOs. If so, then focusing empirically on one type in an analysis is justified. However, it is also possible to test the validity of this claim, perhaps using IOs of the other type as a placebo. As we have demonstrated here, though, such an analysis might also reveal that causal claims travel further than one initially realized.
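As a rough illustration of the measurement advice above, the sketch below builds formal, informal, and total membership counts per state-year from a list of membership records, so that a model can be re-estimated with each measure in turn (for instance, using the other type as a placebo). The record layout, the type labels, and the function name are hypothetical and do not reflect the format of the COW data or of the dataset introduced here.

```python
from collections import Counter


def membership_counts(records):
    """Count IO memberships per (state, year), split by IO type.

    Each record is a tuple: (state, year, io_name, io_type), where io_type
    is assumed to be either "formal" or "informal".
    """
    formal, informal = Counter(), Counter()
    for state, year, io_name, io_type in records:
        if io_type == "formal":
            formal[(state, year)] += 1
        elif io_type == "informal":
            informal[(state, year)] += 1
        else:
            raise ValueError(f"unknown IO type for {io_name}: {io_type}")
    keys = set(formal) | set(informal)
    return {
        key: {
            "formal": formal[key],
            "informal": informal[key],
            "total": formal[key] + informal[key],
        }
        for key in keys
    }


# Made-up records for illustration only.
records = [
    ("A", 2000, "IO-1", "formal"),
    ("A", 2000, "IO-2", "formal"),
    ("A", 2000, "IO-3", "informal"),
    ("B", 2000, "IO-3", "informal"),
]
counts = membership_counts(records)
```

Keeping the three counts side by side makes it straightforward to check whether a coefficient estimated with one measure survives substitution of another.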

Ultimately, one of the most valuable aspects of accounting for informality is the leverage that this provides for theoretical debates. Informal institutions have become increasingly important over time, regulating a wide array of issue areas. They should be studied for this reason alone. However, in each of the cases we have explored, there has also been some debate about which underlying causal mechanisms prevail. Many of these debates turn on whether an institution’s legal nature plays the key causal role, or whether other non-legal mechanisms, such as socialization and capacity-building, are at work. While we have not set out to resolve these questions definitively here, each reanalysis demonstrates how future research can address these questions more precisely using data on different types of IOs. Future researchers can leverage variation in institutional design, and the new dataset that we provide here, to address these long-standing questions, thereby advancing research on IOs and world politics. And, equally valuably, this may highlight new puzzles for researchers to explore.