Introduction

Balancing behavior against threats—in a nutshell, the tendency of states to try and close existing power gaps with threatening neighbors through the formation of coalitions or the enhancement of their own strength—has for decades enjoyed special prominence in the field of international relations (IR), mostly because of its centrality to the realist branch of IR theory. Successful balancing operations (such as the Sino-American détente of the 1970s) have, at times, had far-reaching consequences for global power shifts. However, this is not the only reason for the amount of attention that this phenomenon commands. Equally interesting are the many cases in which large-scale rebalancing did not occur despite massive preceding shifts in the balance of power, most notably the end of the Cold War. Accordingly, the field has accumulated an extensive range of theoretical explanations for when and how such behavior is expected to take place—see Waltz (1979, 2000), Walt (1985, 1987), Rosecrance (1995) and Schweller (2004) for just a selection of the most prominent arguments.

This paper aims to contribute to these debates by conducting a systematic large-N study of cases where states have been exposed to an outside stimulus to engage in balancing behavior, but have subsequently embarked on sometimes drastically different strategies. For theoretical and conceptual reasons explained further below, I focus on states that faced one clearly defined foreign source of threat from a superior power. I attempt to gauge the extent to which discrepancies in state behavior were caused by variances in the level of the external threat as well as by factors at the domestic level.

This article is structured as follows: The “Theoretical approach” section reviews the concept of balancing within the realist school in IR, introduces specific theoretical frameworks and formulates a set of contending hypotheses. The “Case selection” section then introduces the selection criteria used to identify states that would be expected to engage in balancing, followed by the cases used for this study. The “Operationalization and methodology” section describes the set of indicators and regression models used to test the hypotheses, while the outcomes of these tests are presented in the “Results” section. The final “Conclusion and outlook” section attempts to link the results of this paper to other existing findings and offers some suggestions on how this work might be expanded for use in subsequent research.

Theoretical approach

Contemporary realist theory in IR, arguably the field’s most prominent and hotly contested branch of thought, was initially conceived of as a completely structuralist and systemic approach. It is most closely associated with the work of Waltz (1979), who established it around the tenets of the primacy of self-preservation and the anarchic nature of the international system, which are assumed to provide strong incentives for states to respond to outside threats in a predictable fashion. State behavior is assumed to be strongly shaped by the presence of immediate existential threats—states that are weaker than their competitors will always be faced with their own possible demise as a sovereign entity (ibid.: 118), especially if the respective relationships are tense. Accordingly, states must strive to compensate for these power gaps by resorting to a rather limited repertoire of viable strategies.

Any action taken to alleviate a power gap is described as “balancing” behavior since it aims to (re-)establish a balance of power. These strategies are further divided into “internal” and “external” modes of balancing (ibid.: 168), depending on whether a state primarily seeks to augment its own capabilities or to find allies against a common threat. If the power gap is so pronounced that neither of these strategies can overcome it, alternatives may be found in “bandwagoning” or “appeasement” (ibid.: 126)—both of which essentially amount to sacrificing a measure of independence and paying tribute to a superior state. Finally, “buck passing” refers to any strategy of staying on the sidelines and avoiding any sort of engagement with a superior state on the assumption that others will take care of the threat and bear the associated costs instead.Footnote 1

In its initial formulation, structural realism assumes that the choice between these strategies is ultimately determined by the specific distribution of material capabilities (and, hence, power) between individual states. One widely adopted revision to this model is the replacement of the concept of a “balance of power” as the strategic aim of rational states with that of a “balance of threat,” acknowledging that threats are only actualized when a power shortfall co-occurs with an assumed hostile intent on the part of the superior side (Walt 1987: 21–26, 265f.). This was subsequently bolstered by empirical observations made at the end of the Cold War, when a very substantial power shift did not trigger any major alterations in existing alliance patterns.

Since the relative extent of power (or threatening power) is deemed to be the only relevant variable, other characteristics are dismissed, leading to an abstract conceptualization of states as “functionally alike units” that will respond to similar circumstances with similar behavior. However, this expectation is precisely where realism sometimes clashes with observed reality, thus creating empirical puzzles and supplying a starting point for scholars willing to relax some of realism’s assumptions while still holding on to its main tenets (Rose 1998). Specifically, the apparent failure of states to engage in balancing behavior against rising neighbors (Schweller 2004) resulted in renewed attention toward the domestic determinants of state behavior, and the two most plausible explanations for balancing failures are both connected to this level.

First, as mentioned above, states may disregard power imbalances because they do not actually feel threatened by them (Walt 1985). Since presumed hostility is strongly connected to a state’s policies rather than just its capabilities, its source can be traced back to the domestic level. Second, state leaders may be more concerned about preserving their individual power than the security of their nation, thus expending their limited resources on defusing domestic rather than foreign threats (Schweller 2004). Unit-level variables such as domestic institutions and power configurations may thus have an impact on the selection of foreign policy strategies (and vice versa). This proposition is characteristic of a number of revisions to the realist model, most of which are usually subsumed under the label “neoclassical” realism, which has in recent years emerged as the most active branch of realist thought in IR (Rathbun 2008; Rose 1998).

But how does this domestic influence play out? Apart from case-specific policy analyses, several more expansive and abstract theoretical frameworks that deal with this question have been proposed. Mastanduno et al. (1989) introduced a general framework including a set of testable hypotheses. This theoretical approach shapes much of the reasoning presented below, and some of the hypotheses that I ultimately derive are very similar. Hence, this paper can be understood as an attempt to apply these theoretical assumptions to balancing behavior by means of a quantitative study—thereby expanding upon the existing body of qualitative research and case studies.Footnote 2 The authors conceptualize “states” primarily as the ruling elite and administrative apparatus of a given country and as distinct from civil society and its various interest groups. To conduct politics at the international level, the state needs to command resources created at the domestic level, particularly the military costs associated with the overarching objective of national survival. Accordingly, states need to manage their societies’ resources by engaging in mobilization (economic expansion through investments) or extraction (through means ranging from taxation to expropriation). Costly interactions at the international level may necessitate extraction, but doing so always risks breeding domestic discontent.

Building on this work and some subsequent extensions and studies (Christensen 1996; Zakaria 1998), Taliaferro (2006) proposed a “resource extractive” model of the state that conceptualizes unit-level features like state institutions and the predominance of ideological trends like nationalism and statism as the key intervening variables between systemic impulses and strategic choices related to internal balancing measures. This addresses the puzzle of why some states, such as Meiji-era Japan, were able to withstand external threats through internal balancing, while others like China failed to adopt them and saw their power wither as a result. In his conclusion, Taliaferro stresses the point that balancing behavior cannot be understood without analyzing a state’s domestic institutions and power balance, since they can preclude the adoption of strategies that would be rational from a national point of view but detrimental to the interests of domestic actors. Similarly, Schweller (2006: 62–68) explores cases of underbalancing and suggests that the cohesion of elites and society at large is crucial for formulating effective balancing policies against rising threats.

While these theoretical considerations contain many specific predictions that cannot be covered in their entirety by this study, they nevertheless inform it by shedding light on the influence of unit-level institutions. In what follows, I will focus on two variables that capture key features of these institutions: strength and inclusiveness. The former is a straightforward adaptation of the models outlined above, centering on the fact that states may simply be unable to marshal the necessary resources for an effective threat response. In turn, the overall strength of a state’s institutions depends on many factors, like the scope of government, its centralization, administrative efficiency and technology. This aspect of “state strength” should not be confused with “power” measured at the national level, as a significant share of national resources may effectively lie outside of the grasp of the state.

Likewise, a state’s democratic or authoritarian character will shape how the preferences of social actors impact elite decision making and thus the selection of policies and strategies: democratically elected leaders have to retain the favor of a majority of voters, while autocrats need to cultivate a coalition of domestic interests that is powerful enough to defeat any would-be challengers. The latter usually include factions that may be small in number but in control of the nation’s means of production (business elites or bureaucrats, depending on the economic system) and destruction (the military), respectively.Footnote 3

As mentioned above, states seeking to reduce power gaps with their neighbors face a basic choice between expanding their own capabilities, finding allies against a mutual threat, or combining both. Below, I introduce each of these strategies separately while trying to devise a set of concrete, testable hypotheses.

Internal balancing strategies

When addressing internal balancing strategies, it is first necessary to identify the determinants of state power within the theoretical context and to explain how they may be used in different strategies. According to realism, a state’s power arises from its material capabilities, which are in turn divided into industrial resources, military strength, population size, geography and technology (Waltz 1979: 129–131, 2000). For the purpose of this analysis, we can discount factors that either cannot be altered meaningfully through state action (geography) or that can only be shaped in the long term and are thus less relevant for addressing immediate threats (demographics). The others, however, allow a state to expand specific aspects of its power, often involving trade-offs. For example, it may engage in a short-term military buildup, extracting resources from its economic base in the process and likely touching upon the interests of influential social actors through redistribution. The practice of internal balancing can thus be observed by focusing on how many resources a state devotes to its military establishment. I propose two slightly different variations here, each coupled with a different resource: manpower and capital. Accordingly, states may pursue internal balancing by engaging in societal militarization or economic mobilization.Footnote 4

Societal militarization is defined here as expanding the size of the military’s personnel. This involves shifting potential workers to a generally nonproductive task, thus bolstering the military sector at the expense of the civilian economy. Since high militarization levels are usually achieved by drafting recruits, this also takes a heavy toll on the citizenry. However, this strategy can usually be implemented rather quickly and therefore constitutes an effective response to immediate threats. Strong states with well-developed bureaucracies should be better equipped to organize such actions and to support a larger standing army.

Economic mobilization means shifting financial resources from the civilian or public economy to the military sector. Domestic actors affiliated with the military-industrial complex—high-ranking officers, the armaments industry, national security bureaucrats and their political allies—can be expected to benefit at the expense of the citizenry at large. Since the military’s backing is often a key element in keeping authoritarian elites in power, the latter should be more inclined to support this sector regardless of the extent of external threat faced. Strong states should also find it easier to conduct such transfers because they already have direct access to a larger share of national wealth.

External balancing strategies

External balancing is employed when trying to win allies for the common cause of facing down a shared threat. According to one of realism’s central tenets, weaker states are expected to build a coalition against a stronger rival—particularly if the latter is increasing its power so quickly that it might soon become dominant within a region (Waltz 1979: 118f.; Mearsheimer 2002: 155–157). Sometimes, help can also be enlisted from extraregional states that are not yet themselves threatened by a rising power, but who are nevertheless wary of facing a more direct challenge further down the road if it establishes regional dominance and sets its sights elsewhere, thus engaging in “offshore balancing” (Mearsheimer 2002: 264–266).

Compared to the multiple different facets of self-strengthening, there is much less variety in the actions that a state can undertake in the pursuit of external balancing, this being a straightforward quest to establish alliances and pacts to aid each other against a clearly designated enemy. Nonaggression pacts with other neighbors can also help, inasmuch as they defuse less relevant threats and allow a state to focus on the more direct and important ones. The expected relationship between domestic institutions and external balancing behavior is harder to verify since these agreements are generally much less likely to impact resource distribution at the domestic level.Footnote 5 Previous research on this question has generally found that the regime type of any single state is not by itself a strong predictor of its alliance behavior. However, two states that share the same type of domestic institutions (or, according to Werner and Lemke (1997), a wider range of them) are more likely to establish a stable partnership (Lai and Reiter 2000). This should at least leave autocracies with more alliance opportunities if for no other reason than the fact that regimes of this type had persistently outnumbered democracies up until the crest of the “Third Wave” in 1990.Footnote 6

While a government’s freedom of action in crafting foreign alliances could possibly be seen as a facet of state strength, it cannot be directly subsumed under the definition of the term given above. Here, I am going to assume that a state’s access to a society’s resources has no direct relationship with its ability to engage in alliance building. However, a lack of domestic strength could very well become the primary motivation for doing so: if a state’s power disadvantage stems from its lack of control over its own resources, looking for help abroad may appear more attractive than trying to extend its domestic reach, which could provoke a backlash (Mastanduno et al. 1989). Hence, I will assume that weaker states should establish more alliances simply because they have a greater need for them, although their own lack of attractiveness as a partner may of course be a countervailing factor.

Both strategies outlined above have their advantages and disadvantages for states seeking to restore the balance of power in their region. On the one hand, enlisting the help of allies can be used to address a power gap both very quickly and at less overall cost if the newfound partners contribute their own resources. On the other hand, the international system’s persistent anarchy and self-help logic cannot be completely overcome by alliances of convenience. The choice that states face is also not an absolute either-or proposition, and they are confronted with powerful motivations for hedging their bets by combining both—for example, expanding one’s own abilities could make a state a more attractive alliance partner to others.

Hypotheses

In order to assess the utility of these frameworks, I have devised nine working hypotheses grouped into three sets to account for the different balancing strategies outlined above. Within each set, the three hypotheses deal with the expected influence of the nature of a state’s regime, the strength of its domestic institutions and the magnitude of the power gap between the state and its competitor, respectively. In other words, the first two always deal with unit-level factors, while the last one addresses systemic impulses.

H1a

The more authoritarian a state is, the more it will tend to resort to societal militarization.

H1b

The stronger a state is at the domestic level, the more it will tend to resort to societal militarization.

H1c

The weaker a state is relative to its competitors, the more it will tend to resort to societal militarization.

H2a

The more authoritarian a state is, the more it will tend to resort to economic mobilization.

H2b

The stronger a state is at the domestic level, the more it will tend to resort to economic mobilization.

H2c

The weaker a state is relative to its competitors, the more it will tend to resort to economic mobilization.

H3a

The more authoritarian a state is, the more it will tend to establish external alliances.

H3b

The weaker a state is at the domestic level, the more it will tend to establish external alliances.

H3c

The weaker a state is relative to its competitors, the more it will tend to establish external alliances.

This setup allows for a direct comparison of the relative influence of each of the three theoretically plausible impulses for balancing. Certainly, there are many other relevant factors that have in the past been used to successfully explain specific strategic choices of states, such as their levels of armaments expenditure or their alliance commitments.Footnote 7 However, for the purposes of this paper, a limited selection that can consistently be applied to all dependent variables of interest is more useful since the main intention here is not to maximize the explanatory power of any single model but rather to compare the relevance of the independent variables.

Case selection

Since the emphasis of this study is on state behavior in a highly threatening environment (in other words, in circumstances that would be especially amenable to realist assumptions), the criteria for selecting cases were quite narrowly formulated in order to come up with a small sample that could be considered a “most likely” design as far as systemic balancing incentives are concerned. Another key aim was to focus on relationships where the “balance of threat” (rather than the more general “balance of power”) is presumed to provide the dominant systemic impulse for state behavior since it is more plausible as a motive (Walt 1985, 1987). However, “threat” is a concept that is even more difficult to define in quantitative terms than “power” due to its association with subjective perceptions. Hence, I have attempted to hold this factor as constant as possible by devising selection criteria that should help isolate cases facing threats that are similar in nature but different in scope.

Consequently, I propose to use the presence of an external threat to the territorial integrity or even existence of the state as the main criterion. More specifically, I have attempted to isolate cases in which one state laid a formal claim to a part or all of another state’s territory. This choice was made for the following reasons: First, territorial claims are likely to be seen as threats independent of the characteristics of specific administrations or elites raising them, because they are usually rooted in long-standing national historical narratives and legacies [Murphy 2005; see also Hensel (1996) on territorial conflicts as drivers of interstate rivalry]. Second, they can be relatively clearly observed since such claims are usually openly stated. Third, countervailing territorial claims are very hard to resolve peacefully due to the ease of mobilizing nationalist sentiments against would-be compromisers (Walter 2003). Additionally, territorial conflicts have been shown to be a very important trigger of heightened interstate rivalry and, ultimately, conflict (Vasquez and Leskiw 2001). In turn, involvement in long-lasting conflicts makes it necessary to formulate long-term strategic responses that should be easy to observe. Fourth, threats arising from territorial conflicts are much harder to meet by employing alternatives to balancing strategies (like buck passing or appeasement). Since this paper focuses on balancing alone, ruling out unaccounted alternatives is appropriate because they would otherwise potentially skew the results.

However, this selection rule is not by itself sufficient to identify significant external threats. There are currently plenty of formally unresolved territorial conflicts in the world in which both sides refrain from aggressively pursuing their claims, even among allies like Canada and the USA. These conflicts are so low-profile that they are almost never reported on in the media, and many citizens (and even elites) may not even be aware of them, rendering them mostly irrelevant.

In order to establish which territorial disputes are seen as a threat to a nation’s territorial integrity, it is helpful to draw upon another data source: the list of dyadic militarized interstate disputes (MIDs) maintained by the Correlates of War (CoW) project (Ghosn et al. 2004). This database contains a large number of dyadic disputes, ranging in intensity from mere threats to full-scale wars. From this list, we can identify a sample of dyads that were engaged in militarized territorial disputes in the post-WWII era and whittle it further down to only serious conflicts by choosing those which saw at least a “clash” between both sides resulting in a hundred or more fatalities. This relatively high level of intensity is chosen in order to only include conflicts that were the result of systematic, large-scale state action instead of the agency of individual soldiers or officers.

Next, the remaining dyadic relationships are investigated to identify which clashes occurred during the time period when both sides were already involved in a territorial dispute (in other words, disputes that only arose as a result of the observed military action are not counted). We are thus left with a number of cases where these disputes were significant enough to trigger a conflict between both sides’ organized militaries, usually as a result of permanently high border militarization. Table 1 contains the cases identified through this process, ordered alphabetically by the name(s) of the disputed territory. It also gives the names of both claimants, the duration of the conflict and the state estimated to be the weaker one. A conflict’s inception is measured by the year in which one side first staked an official claim to the other’s territory, while its end is marked either by the peaceful resolution of the conflict or by other circumstances.
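The selection steps described above can be read as a simple filter over dispute records. The following is a minimal sketch under stated assumptions: the record layout and field names are hypothetical and do not correspond to actual column names in the CoW data, the cutoff year 1946 is one possible reading of "post-WWII era," and requiring the claim to predate the clash by at least one year is an assumed way of excluding disputes that only arose from the military action itself.

```python
from dataclasses import dataclass


# Hypothetical record layout; field names are illustrative only and are
# not the variable names used in the CoW MID data.
@dataclass
class Clash:
    dyad: tuple        # (state_a, state_b)
    year: int          # year of the militarized clash
    territorial: bool  # dispute concerns territory
    fatalities: int    # combined fatalities of both sides
    claim_start: int   # year the formal territorial claim was first raised


def is_selected(c, min_fatalities=100, first_year=1946):
    """Apply the selection rules from the text to a single clash."""
    return (
        c.territorial
        and c.year >= first_year           # assumed post-WWII cutoff
        and c.fatalities >= min_fatalities
        # Assumption: the claim must predate the clash, so disputes that
        # only arose from the military action itself are excluded.
        and c.claim_start < c.year
    )


def select_dyads(clashes):
    """Return the sorted list of dyads with at least one qualifying clash."""
    return sorted({c.dyad for c in clashes if is_selected(c)})
```

In this sketch a dyad enters the sample as soon as one of its clashes passes every test; the end date of a claim is not modeled.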

Table 1 Cases selected for analysis.

Finally, identifying the weaker side in each conflict is crucial, because it is these states which have the strongest incentive to engage in balancing. The process for estimating the weaker side is straightforward: For every year of an ongoing conflict, both states’ comprehensive national strength scores, taken from the CoW’s National Material Capabilities Database (Singer 1987), are compared with each other. The country with the lower score is then designated as the weaker side for that year. In most cases, the weaker side remains the same throughout the conflict, except for some dyads in which both countries are very close to each other in terms of comprehensive strength. For these cases, Table 1 lists the country that was weaker for the longer time, together with the number of years for which this was the case.
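The yearly comparison and the "weaker for the longer time" rule can be illustrated with a short sketch. The function names and the dictionary-based input format are assumptions made for illustration; the text does not say how exact ties in the scores are handled, so the tie rule below is arbitrary.

```python
from collections import Counter


def weaker_by_year(cinc_a, cinc_b, name_a, name_b):
    """Map each year present in both series to the state with the lower score.

    cinc_a and cinc_b are dicts of {year: capability score}. Assumption:
    an exact tie is assigned to name_b, since the text gives no tie rule.
    """
    years = sorted(set(cinc_a) & set(cinc_b))
    return {y: (name_a if cinc_a[y] < cinc_b[y] else name_b) for y in years}


def overall_weaker(yearly):
    """Return (state, number_of_years) for the state that was weaker longer."""
    state, n_years = Counter(yearly.values()).most_common(1)[0]
    return state, n_years
```

For a dyad in which the designation flips over time, `overall_weaker` returns the state reported in Table 1 under this reading, along with the count of years.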

The list of cases includes many of the most destructive conflicts that arose during the latter half of the twentieth century and covers many well-documented instances of balancing behavior.Footnote 8 The only “false positive” may be the Falklands dispute, in which the weaker side (Argentina) arguably did not have to fear a direct threat from its opponent (the UK), because it was not in control of the disputed territory, had a reliable security guarantee from the USA, and faced an opponent that was not prone to waging aggressive preventive wars. All in all, the selection method yields a total of 18 dyadic disputes and 688 country years for further analysis.

Operationalization and methodology

To address the hypotheses outlined above, I assembled a dataset containing the following information for each country involved in a dispute and—as far as possible—each year of its duration:

  • Overall national strength, as measured in the composite index of national capability (CINC) variable in the CoW national capabilities dataset cited above. This variable is an average of six subindicators and is expressed as a nation’s share of the world total, thus ranging between 0 and 1 in value.Footnote 9 Since CINC data are currently only available up to 2007, this year is the upper boundary for the period covered by the dataset.

  • The democratic or authoritarian character of a state’s political institutions. For this, I used the polity2 variable from the Polity IV dataset.Footnote 10 The variable ranges between -10 (highly authoritarian) and 10 (highly democratic).

  • The overall strength of a state’s government. For this, I used state final consumption expenditure as a percentage of GDP as provided by the World Bank (going back to 1960) for all countries in this sample except North Korea. Due to the regression parameters described below, this year effectively forms the lower boundary for the covered period. The resulting value ranges between 0 and 100.

  • Each state’s economic mobilization, as measured in military expenditure as a share of (nominal) GDP. Military expenditure in USD was obtained from the CoW national capabilities dataset; the corresponding GDP figures in USD were obtained from the World Bank (again, only covering the period from 1960 and excluding North Korea). These values range between 0 and 1.

  • Militarization, as measured in the personnel strength of each state’s standing military as a share of total population. Both of these figures were obtained from the CoW national capabilities dataset. These values range between 0 and 1.

  • Alliance policy, a concept that is highly complex and hard to measure quantitatively. As an approximation, I obtained data on formal alliances from the Alliance Treaty Obligations and Provisions project (ATOP)Footnote 11 and subsequently calculated the sum of the capabilities of a nation’s alliance partners in order to account for the strength which they could lend. Alliances were included based on the following criteria:

    • All bilateral defense pacts.

    • Multilateral defense pacts that did not include the other state in the dyad as well.

    • Bi- and multilateral nonaggression and neutrality pacts with adjacent states, provided that they did not include the opponent as well.

      These alliances are considered to be in effect for any year in which they covered a six-month period or more. Multiple parallel agreements with the same partner(s) were not counted separately. Summing the capabilities of a nation’s alliance partners required an additional rule, especially for multilateral agreements: all partners that also had similar agreements with the dyadic opponent were considered neutral and removed for all years in which the latter were in effect. For example, Egypt was no longer considered an Arab League ally of Syria or Jordan after it signed its separate nonaggression pact with Israel in 1979. After these computations, the overall sum of the strength of a nation’s alliance partners ranges between 0 and 1.

  • Finally, since the framework presented above is not exhaustive, I included several control variables. One such factor was the overall dissimilarity between the regimes of both states within a dyad, because, as noted above, previous studies have shown that this factor tends to have an exacerbating impact on conflicts (Souva 2004).Footnote 12 Accordingly, threat perceptions may depend on how different the regimes of both states are and thus form an additional balancing incentive. To check for this, “regime dissimilarity” was operationalized as the absolute value of the difference between the polity2 scores of both nations in a dyad.

    Second, mobilization as measured in human and capital resources devoted to the military is likely to be especially high in times when a threat has been actualized, i.e., when the nation is at war. In order to avoid distortions, I included a dummy variable indicating whether a weaker state was at war during a year (1 if this was the case, 0 otherwise). For this indicator, I resorted to the CoW database, considering all instances of militarized interstate disputes at the “war” level.

    It would also have been desirable to check for the effects of economic interdependence as measured in terms of bilateral trade, another factor that is usually expected to lower mutual threat perceptions (Lee and Mitchell 2012). Unfortunately, data for many of the dyads within this sample proved very difficult to obtain, even after checking multiple standard sources (CoW, the World Bank and Gleditsch’s (2002) database). The last source, which was overall the most comprehensive one, would have reduced the sample to 168 country years when applying listwise exclusion of cases, also eliminating many dyads completely. Hence, this variable was left out of the final analysis.
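The aggregation rule for the alliance indicator in the list above (six-month threshold, no double counting of partners, removal of partners that are neutral with respect to the opponent) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used to build the dataset: the input format, the function names and the choice to count an agreement's duration in whole months are all hypothetical.

```python
def in_effect(months_active):
    """An agreement counts for a year if it covered six months or more."""
    return months_active >= 6


def allied_strength(agreements, opponent_partners, cinc):
    """Sum partner capabilities for one state in one year.

    agreements: list of (partners, months_active) tuples for the state,
        where partners is an iterable of state names (hypothetical format)
    opponent_partners: states holding a comparable agreement with the
        dyadic opponent in that year; these are treated as neutral
    cinc: dict mapping state name to its capability share in that year
    """
    partners = set()
    for members, months in agreements:
        if in_effect(months):
            # A set ensures that parallel agreements with the same
            # partner are not counted separately.
            partners.update(members)
    partners -= set(opponent_partners)  # remove neutral partners
    return sum(cinc.get(p, 0.0) for p in partners)
```

Because each partner's capability is a share of the world total and each partner is counted at most once, the resulting sum stays between 0 and 1, in line with the range stated above.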

The method of analysis is as follows: First, the comprehensive national strength estimates for each pair of countries are compared for each year of their dispute. The state with the lower score is designated as the weaker one for that year, and the ratio between both claimants’ scores is stored in an additional variable for later use as a measurement of the “power differential.” Power gaps are thus expressed as the score of the stronger state divided by that of the weaker, which captures relative power differences.

Second, the model proceeds to select the indicators for the dependent and other independent variables for each weaker state and year. These values are then used as the input for a series of regression analyses that estimate the degree of correlation between the independent and dependent variables. This setup allows for a systematic analysis of the relative influence of both domestic institutions and system-level impulses for state behavior.

The main restrictions on the model’s scope are its two econometric variables (state strength and the GDP figures necessary to calculate the share of military expenditure), since the World Bank database does not cover the period prior to 1960. Additionally, the North Korean government unfortunately does not supply this kind of data, which means that this case is eliminated from the analysis across all models.Footnote 13 The remaining dyads show a great deal of variance, with power imbalances ranging from quasi-parity (Israel–Syria and Morocco–Algeria) to a pronounced dominance of one side (China–Vietnam). Similarly, the sample also comprises both very weak and very strong states, domestic institutions ranging from totalitarian dictatorships to inclusive democracies, and levels of mobilization ranging from the outright absence of any armed forces to full-scale efforts.Footnote 14

In order to explore the relationships between these factors, they were entered into multivariate linear regression models with a listwise exclusion of missing data (any case in which data for a single variable are missing is excluded from the analysis altogether). I account for fixed country-level effects through the inclusion of dummy variables for each weaker state, restricting the analysis to within-case variation.Footnote 15 The independent variables were lagged by one year (meaning that a state’s environment in year n is supposed to explain its behavior in year n + 1), since balancing policies take time to craft and implement and the effects of domestic constraints and external pressures are therefore likely to be somewhat delayed.
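The three data-handling steps described here (one-year lag, listwise exclusion, restriction to within-case variation) can be sketched as follows. This is a simplified illustration under stated assumptions: it handles a single predictor, and it removes country effects by demeaning each variable within a state rather than by adding explicit dummy variables, which yields the same slope estimate. All function and column names are hypothetical.

```python
from collections import defaultdict


def lag_predictors(rows, predictors, outcome):
    """Pair each state's outcome in year n + 1 with its predictors from year n."""
    by_key = {(r["state"], r["year"]): r for r in rows}
    lagged = []
    for (state, year), current in by_key.items():
        previous = by_key.get((state, year - 1))
        if previous is None:
            continue  # no prior-year observation to lag from
        record = {"state": state, "year": year, outcome: current.get(outcome)}
        for name in predictors:
            record[name] = previous.get(name)
        lagged.append(record)
    return lagged


def listwise_complete(rows, columns):
    """Drop every row in which any of the listed columns is missing (None)."""
    return [r for r in rows if all(r.get(c) is not None for c in columns)]


def demean_within_state(rows, columns):
    """Subtract each state's own mean, leaving only within-case variation."""
    groups = defaultdict(list)
    for r in rows:
        groups[r["state"]].append(r)
    demeaned = []
    for members in groups.values():
        means = {c: sum(m[c] for m in members) / len(members) for c in columns}
        for m in members:
            adjusted = dict(m)
            for c in columns:
                adjusted[c] = m[c] - means[c]
            demeaned.append(adjusted)
    return demeaned


def within_slope(rows, x, y):
    """OLS slope of y on x after removing state-level means (one predictor only)."""
    demeaned = demean_within_state(rows, [x, y])
    sxx = sum(r[x] ** 2 for r in demeaned)
    sxy = sum(r[x] * r[y] for r in demeaned)
    return sxy / sxx
```

A typical call chain would be `within_slope(listwise_complete(lag_predictors(rows, ["x"], "y"), ["x", "y"]), "x", "y")`. The models in this paper use several predictors at once, which would require a full multivariate solver rather than this single-slope shortcut.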

Variables other than the polity indicator, its derivative measuring regime difference and the dummies were log transformed to bring them in line with model assumptions. For the first two models, these transformations were sufficient to satisfy the assumptions regarding heteroskedasticity, skewness and kurtosis, while the third model additionally uses robust standard errors to correct for remaining violations. All models also pass tests for multicollinearity between the independent variables, which was checked by estimating variance inflation factors (VIF).
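The log transformation and the multicollinearity check can be illustrated with a short sketch. It is deliberately simplified: the VIF is computed only for the two-predictor case, where it reduces to 1 / (1 - r²) with r the Pearson correlation between the predictors, whereas the general case regresses each predictor on all others. The function names are hypothetical, and the source does not state how zero values were treated before taking logs, so the sketch simply rejects non-positive inputs.

```python
import math


def log_transform(values):
    """Natural log of each value; non-positive inputs are rejected."""
    if any(v <= 0 for v in values):
        raise ValueError("log transform requires strictly positive values")
    return [math.log(v) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(xs, ys):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r_squared = pearson_r(xs, ys) ** 2
    if r_squared >= 1.0:
        return math.inf  # perfectly collinear predictors
    return 1.0 / (1.0 - r_squared)
```

A VIF of 1 indicates uncorrelated predictors, and the value grows without bound as the two predictors approach perfect collinearity.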

Results

Table 2 reports the regression output for all three models, which allows the influence of each factor on specific dimensions of balancing to be compared; coefficients and significance estimates for the country dummies are omitted. As can be seen at first glance, the models that address internal balancing (1 and 2) perform much better at predicting the dependent variable than the one that focuses on external balancing.

Table 2 Regression results.

In model 1, democracy, state strength and the power gap are all shown to be highly significant predictors of militarization. In the first two cases, the effect is also as expected in the relevant hypotheses. However, the estimate for the effect of the power gap is a negative one, indicating that the bigger a power gap gets, the lower a nation’s militarization. Since this model only considers within-case variance, this finding is based on individual state reactions to shifting conditions, not long-term structural differences in the magnitude of threats faced by various states. The most readily apparent interpretation would seem to be a causal connection that goes further than what the relevant hypothesis stated—in this case, that states might be able to effectively reduce power gaps through militarization, meaning that the former decreases as the latter goes up. However, since the size of a nation’s army is just one of the six equally weighted components of the CINC score, even a dramatic expansion would only have a correspondingly diluted effect on the power differential. It is also curious that a similar effect does not turn up in the next model, although shifts in military expenditures and their effect on the power gap would seem to be subject to the same logic.

Model 2 also fares quite well in explaining the variance in military expenditure as a share of GDP. Here, no apparent effect of domestic political regimes can be found, although both state strength and power gap are highly significant predictors. In the latter case, the effect is also in line with expectations, since the estimate is for a linear positive relationship between wider power gaps and higher military spending.

Finally, model 3 does not find many significant effects that would explain the extent of external balancing, which is at least in part due to the difficulty of capturing this concept in a time in which states were moving from official alliances to other cooperation mechanisms. Unsurprisingly, this model’s explanatory power is much more reliant on country dummies than the two others, while just one explanatory variable is highly significant—and here, too, state strength is the most powerful predictor. This confirms the hypothesis that weak domestic regimes are the most active in making up for their shortcomings through external partnerships (Table 3).

Table 3 Summary of hypothesis tests

Conclusion and outlook

The model developed in this paper yields some conclusions about the sometimes tenuous connections between balancing incentives (as derived from dangerous power shortfalls compared with threatening neighbors) and observable outcomes. By themselves, these incentives are not sufficient for explaining a meaningful share of the variance between different countries’ practice of internal or external balancing. Domestic institutions, on the other hand, seem to be highly relevant factors for both internal and external balancing, and actually fare better than systemic incentives when it comes to explaining outcomes. Irrespective of whether they are merely intervening factors that distort objective measurements of outside threats (in Rose’s (1998) words, acting as a “transmission belt” linking threats and responses) or primary causal factors in themselves, these features should not be ignored in the analysis of balancing behavior. These results lend further support to a criticism that has been frequently leveled against the purely structuralist interpretations of realist thinking and are in line with the theoretical expectations and existing case studies of its neoclassical variant.

At the same time, these models have some noticeable shortcomings and leave some questions open, both of which offer substantial reasons to continue research in this direction. Most importantly, variance in external balancing could not be sufficiently accounted for even when employing both domestic and systemic explanatory factors. This is most likely due to problems with the indicator used to measure alliances, as it misses less formal bilateral relationships that are nevertheless important for balancing purposes. Additionally, the increasing prevalence of multilateral collective security schemes over classic bilateral alliances may contribute to this result: long-standing multilateral alliances are themselves institutions of considerable influence, and there is substantial evidence that they are resilient even in the face of massive shifts in the balance of power (NATO being the most prominent example). Still, this model could be improved by employing more inclusive indicators such as direct bilateral military assistance or other subsidies, if sufficient data can be obtained.

Second, all models are highly reliant on country effects in deriving their explanatory power.Footnote 16 While this was, again, especially pronounced in the case of the model dealing with alliance policy, it also affected those which explored internal balancing efforts. This emphasizes the difficulty of finding truly general principles underlying the behavior of highly complex units like nation states, and the importance of paying attention to the idiosyncrasies of cases. Individual case studies have already been advanced to explore high-profile balancing failures or other deviant cases, and these results suggest that there are many other examples of unexpected behavior even when bolstering balance-of-threat theory with domestic explanatory factors.

Third, the method employed here to identify cases faced with a high threat level should also offer some opportunities for future research. However, the very concept of what constitutes a “threat” is intrinsically subjective in nature and thus lends itself to qualitative rather than quantitative analysis. Nevertheless, the rule employed here did result in the identification of many bilateral relationships that are generally considered to be among the world’s most serious interstate conflicts—which suggests that the focus on territorial disputes, coupled with power gaps, can serve as an effective part of future quantitative explorations of the concept of threat.