1 Introduction

The recent spurt in empirical research on the causes and consequences of corruption has outpaced the related growth of theoretical research on its determinants. Researchers have studied multiple causes and effects of corruption with varying samples, variables, time periods and measures (Dimant and Tosato 2018; Lambsdorff 2006; Seldadyo and de Haan 2006; Svensson 2005). Despite this large body of research, one relevant aspect, namely the effectiveness of government enforcement in controlling corrupt activity, has received relatively more theoretical attention but relatively little empirical scrutiny. This paper attempts to fill that gap between theory and evidence by determining the effects of law enforcement and the institutional setting in shaping corruption. More specifically, the work examines the relative effectiveness of various types of law enforcement (police, judges, prosecutors), judicial efficiency (measured by conviction rates), and broader measures of related institutional quality (e.g., rule of law, regulatory quality; see La Porta et al. 1999; Rose-Ackerman 1999) in combating corruption.

The main idea is that while law enforcement may not necessarily be effective in curbing corruption, the quality of the institutional setting is a necessary condition to reduce corruption. This occurs both because law enforcement might drain (crowd out) public resources from other useful public investments, and because the institutional setting is the contingent framework within which enforcement operates (hence institutions are the “gums” within which the “teeth” are set). In this sense, law enforcement (police and prosecution) may be described as the “direct or visible tool” to control corruption, while better institutional quality can be seen as “the framework” within which one can activate those tools. We hence expect to find a stronger and more robust negative effect of institutional quality on corruption and a weaker or even uncertain effect of law enforcement on corruption.

Some nations might have a higher share of police, judges or other enforcement officials and yet a low overall quality of governance, for example when the enforcement officials are themselves corrupt or the delegation of duties (chain of command) is not clear. Differences in institutional setting and relevance of law enforcement are quite pronounced across nations with different social, economic, political, geographic and historical compositions, all of which can potentially bear upon corruption (Goel and Nelson 2010; Treisman 2000).

The theoretical literature has considered different related scenarios, focusing on optimal punishments to deter corruption, stylized games between corrupt officials and enforcers (who themselves might be corrupt), etc. However, the empirical literature has failed to keep up because some of these dimensions are not readily quantifiable (e.g., behavioral aspects of interactions between bribe takers, bribe givers and enforcers), while in other cases comparable data over time or across jurisdictions are not available (e.g., enforcement employment across countries). In fact, in a recent review of the empirical literature, Dimant and Tosato (2018) list some two dozen categories of influence on cross-country corruption, but enforcement is not among them. A similar omission is noted in an earlier well-cited literature summary by Treisman (2007).

This paper uses data on a large sample of nations to examine the relative effectiveness of law enforcement (police, judicial and prosecutorial), judicial efficacy (conviction rates), and related institutional quality (law and order, regulatory quality, rule of law). In general, most of the corruption literature, barring a few country-specific studies (e.g., Goel and Nelson 2011; Goel and Rich 1989), uses aggregate indices such as the rule of law to capture enforcement. This study, on the other hand, is able to consider direct measures of enforcement (i.e., police, judges, and prosecutors) across a range of nations and compare their efficacy in reducing corruption vis-à-vis aggregate indices of government quality/enforcement (see La Porta et al. 1999). To draw clear comparisons, we use dental analogies, terming law enforcement employment as teeth, prosecution rates (judicial efficiency) as bite,Footnote 1 and institutional quality as gums. Just as healthy gums house teeth that enable an effective bite, good institutions empower enforcement agencies to provide effective and credible deterrence.

Placing the analysis in the literature on determinants of cross-country corruption, the results show that piecemeal efforts to combat corruption by increasing enforcement employment would not be effective; rather, comprehensive improvements in institutional quality through strengthening the rule of law or regulatory quality yield greater results. These findings are robust across indices of corruption that capture somewhat different aspects and have useful implications for the design of government policies to combat corruption. Since the institutional setting, more than other aspects of the economy, characterizes capital accumulation and development, these results are particularly relevant for policy implementation in developing countries.

The rest of the paper is organized as follows: Sect. 2 provides the theory and discusses the extant literature; Sect. 3 describes the data and estimation; Sect. 4 reports the results; and concluding remarks are given in the final section.

2 Theory and literature

Building on the seminal works of Becker, including Becker (1968) and Becker and Stigler (1974), a significant body of theoretical work has studied the efficacy and the nature of the means to curb corruption, in economics as well as in other social sciences (see, for example, Anechiarico and Jacobs 1994). Whereas Becker (1968) started off more generally by focusing on “rational” criminals who trade off the costs and benefits of their actions, over time scholars have attempted to understand criminal choices by focusing on other features: nonmonetary penalties (Bac and Bag 2006), tax and legalization schemes (Burlando and Motta 2016), and optimal penalties (Friehe and Miceli 2016; Polinsky and Shavell 2001). In the context of Becker, enforcement variables mainly capture the direct costs of corrupt actions, although there may also be indirect costs, such as a loss in reputation for engaging in corrupt acts or due to the deterrence effect.Footnote 2

Another vein of theoretical work analyzes the markets for corrupt activity (Shleifer and Vishny 1993, 2002), with a focus on the monopolistic powers of enforcers (Garoupa and Klerman 2010) and their compensation (Becker and Stigler 1974; Mookherjee and Png 1995). Furthermore, the possibility of enforcement bodies being themselves corrupt has been recognized (Priks 2011).

More recently, the literature has highlighted the central role of institutions in shaping corruption and crime. La Porta et al. (1999) underline the importance of the institutional setting in determining the main aggregate outcomes, a role that becomes even more fundamental in developing nations in which corruption and crime may strongly undermine growth and development (Bardhan 1997; Banerjee 1997). The institutional setting may even alter the nature of corruption by shifting the relative bargaining power between corrupt bureaucrats and private agents (Capasso and Santoro 2018; also see Shi and Temzelides 2004). And yet, because of the multidimensional nature of corruption and bribery, the theoretical body of work on the issue remains confined to stylized models with limited avenues for empirical verification and direct policy application.

Hence one can reasonably argue that the lag in empirical research focusing on how and why enforcement may hinder corruption is not due to a lack of recognition but to a lack of comparable data across countries.

A few studies have indeed looked into the linkages between law enforcement and corruption, but these studies focus on individual nations. For example, using different measures of corruption, Goel and Nelson (2011) consider the effects of police, judicial and corrections employment on corruption across states in the United States (also see Goel and Nelson 1998). They found that only judicial employment may significantly shape corruption, while police employment increased perceived corruption. In contrast, Alt and Lassen (2014) found that in the U.S. more prosecutor resources lead to more convictions for corruption crimes, while finding more limited evidence for the deterrent effect of increased prosecutions. From a different perspective, and using cross-national survey information, Goel et al. (2016) were instead able to detect corruption within government occupations, including (general) government officials, customs officers, and police officers. They found police corruption to be qualitatively different from corruption in other government occupations.

Hence, although theory and common sense both point to a strong effect of law enforcement and the institutional setting on corruption, the related empirical evidence is quite thin. The objective of this paper is to fill this gap and to highlight and measure how different features of enforcement may curb corruption. With regard to the title of the paper, institutions are taken to be the “gums” that house the “teeth” or enforcement (police, judges, prosecutors) which make the “bite” or convictions possible.

To further explain the possible effect of enforcement on corruption, one could think about individual dimensions of enforcement in the context of a classic hold-up problem in corrupt markets. While individual enforcement agents (police, judges, prosecutors) might hold up corrupt officials, there is a possibility for the accused to dodge the system by bribing their way out, circumventing the system by changing jurisdictions, etc. However, comprehensive improvements in law enforcement with institutional change do not allow for such dodging or arbitrage. Along another dimension, we term the presence of police, judges and prosecutors and institutions as latent enforcement (capturing the probability of detection), while prosecution rates are active enforcement (capturing the probability of punishment).

Our main idea hinges on the general theory and on a presumption: while law enforcement may not necessarily be effective in curbing corruption, a good institutional setting is the only necessary condition to reduce corruption. The reason is twofold. On the one hand, law enforcement drains resources from other effective public goods expenditure, for example education and public infrastructure; on the other hand, law enforcement may per se nurture corruption when it becomes too intrusive, for example by increasing police corruption and hampering business. Hence, we expect that latent enforcement has an ambiguous effect on corruption. More concisely, we test the following hypotheses:

H1

The effects of latent enforcement (proxied by police, judicial, and prosecutorial employment) hinge on whether the presence of enforcement employees acts as a deterrent (deterrence effect) or if enforcement employees are either corrupt or bypassed (complicit effect).

H2

Actual enforcement measured by conviction rates has a negative effect on corruption, ceteris paribus.

H3

Better enforcement institutions reduce avenues to bypass individual dimensions of enforcement and thus are likely to be effective in combating corruption.

Next, we turn to a discussion of the data and estimation to test these hypotheses and to address the aspects alluded to in the title of the paper.

3 Data and estimation

3.1 Data

The data consist of a cross-section of over 80 countries averaged over the period 2000–2015. While some of the variables used in this study are available for more countries and more years, our sample is limited by the availability of the enforcement and prosecution variables. Table 1 provides details on variable definitions and summary statistics.

Table 1 Variable definitions, sources and summary statistics

The main dependent variable is corruption. Given the issues with adequately capturing the level of corrupt activity, we employ two alternate measures. First, to measure corruption we rely on the PRS Group's International Country Risk Guide index of corruption [Corruption (ICRG)] based on expert ratings on a scale from 0 to 6, which we rescaled with higher numbers denoting more corruption. This index of corruption measures the assessment of corruption in the political system and is concerned primarily “with actual or potential corruption in the form of excessive patronage, nepotism, job reservations, ‘favor-for-favors’, secret party funding, and suspiciously close ties between politics and business”. According to this index, Zimbabwe is the most corrupt country and Finland is the least corrupt.

Of course, measuring actual corrupt activity is extremely difficult; thus, as a robustness check, we consider another widely used measure of corruption from Transparency International. The corruption perceptions index [Corruption (TI)] measures the extent of corruption in the public sector and is based on business perceptions of country experts on a scale from 0 to 10 with higher numbers denoting more corruption. The correlation between these two measures of corruption [Corruption (ICRG) and Corruption (TI)] is positive and quite high (0.95).
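As a rough illustration of the rescaling and the correlation check described above, the following sketch flips two indices so that higher values denote more corruption and then computes their Pearson correlation. The reversal rule (scale maximum minus the raw score), the function names, and the four scores are assumptions made purely for illustration; the text does not state the exact transformation, and the numbers are not taken from the sample.

```python
def reverse_scale(score, scale_max):
    """Flip an index so higher values mean MORE corruption.

    Assumed rule: scale_max - score (the paper does not spell out its rule).
    """
    return scale_max - score


def pearson(a, b):
    """Plain Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


# Made-up raw scores for four hypothetical countries (higher = cleaner).
icrg_raw = [5.5, 4.0, 2.5, 1.0]   # 0-6 scale
ti_raw = [9.0, 6.5, 4.0, 2.0]     # 0-10 scale

icrg = [reverse_scale(s, 6.0) for s in icrg_raw]
ti = [reverse_scale(s, 10.0) for s in ti_raw]

r = pearson(icrg, ti)
print(round(r, 3))
```

Reversing both scales leaves the correlation between the two measures unchanged, so a high positive correlation is informative regardless of which direction the indices are reported in.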

The main explanatory variables relate to measures of enforcement, prosecution and of institutional quality. While the empirical literature on corruption has extensively employed some of these, in particular those reflecting institutional quality, we uniquely employ more direct measures of enforcement and conviction rates in the present work. The list of main regressors is the following:

  • Measures of enforcement (teeth)Footnote 3: POLICE, JUDGES, PROSECUTORS, ALLenforce

  • Measures of prosecution (bite): ConvictionRT

  • Measures of institutional quality related to enforcement/prosecution (gums): LawOrder, RegQUAL, RuleLAW

To test our hypotheses we consider several measures of latent enforcement, including broad (i.e. the gums) and narrow (i.e. the teeth) measures. Narrow measures of latent enforcement include police, judicial and prosecutorial employment, as well as their aggregation (ALLenforce) per capita from the European Institute for Crime Prevention and Control. The presence of these should deter corruption, although government officials engaged in enforcement might themselves be corrupt (Mookherjee and Png 1995; Priks 2011). According to our sample, nations with the highest number of POLICE, JUDGES, PROSECUTORS, ALLenforce include Bahrain, Slovenia, Colombia, and Mauritius, respectively; and the nations with the lowest number of POLICE, JUDGES, PROSECUTORS, ALLenforce include Syria, Ethiopia, Ethiopia and Zimbabwe, and Syria, respectively.

To measure actual enforcement, or the “bite”, we use conviction rates (ConvictionRT) calculated as the percentage of adult persons convicted per suspected offenders for all offences collected from the European Institute for Crime Prevention and Control (see UNODC 2010). The country with the greatest number of convictions is Mauritius and the country with the least is Colombia. Using these variables we are able to discern the effectiveness of the “bite” versus the “teeth” of corruption enforcement.

Finally, we include three measures of enforcement-related institutional quality or “gums”. As a broad measure of enforcement we include an index for the rule of law and another to capture law and order. The index of rule of law (RuleLAW) is from the Worldwide Governance Indicators project and is based on a scale from − 2.5 to + 2.5, with higher numbers denoting stronger rule of law. This index “captures perceptions of the extent to which agents have confidence in and abide by the rules of society, and in particular the quality of contract enforcement, property rights, the police, and the courts, as well as the likelihood of crime and violence.” We also consider the law and order index (LawOrder) from The PRS Group's International Country Risk Guide based on a scale from 0 to 6 with higher numbers denoting stronger law and order. This measure captures the “strength and impartiality of the legal system” and “an assessment of popular observance of the law”. Additionally, we include an overall index capturing the quality of regulation (RegQUAL). Excessive regulation and red tape give bureaucrats monopoly power to extract bribes, thus the quality of the regulation is important for shaping the incentives for corruption.

As expected, the pairwise correlations between the three enforcement variables are positive, albeit lower in magnitude than those between the institutional quality measures (Table 2).

Table 2 Correlation matrix of key variables

The data for the variables come from established international sources that have routinely been used in the literature. The availability of enforcement statistics from the UNODC (2010) enables us to add novelty to this research, albeit with the limitation of a cross-sectional analysis and with a set sample of nations.

3.2 Estimation

The following general model encompasses the above assumptions and discussions:

$$ \text{Corruption}_{i}^{j} = f\left( \text{Enforcement}_{i}^{k}, \text{Prosecution}_{i}, \text{Institutional quality}_{i}^{m}, \text{Controls}_{i}^{Z} \right) \tag{1} $$

where i = 1, 2, 3,…; j = Corruption (ICRG), Corruption (TI); k = POLICE, JUDGES, PROSECUTORS, ALLenforce; m = LawOrder, RuleLAW, RegQUAL; Z = GDP, DEM, ETHNIC, Protestant.

To empirically test our hypotheses we operationalize Eq. (1) by constructing a linear regression model and estimating the model parameters using OLS with robust standard errors. Furthermore, we account for geographic considerations by including regional dummy variables. The OLS estimation is supplemented with 2SLS estimation to account for the possible bi-directional causality between corruption and enforcement.
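One way to write the linear version of Eq. (1) is sketched here. The coefficient symbols, the regional dummies $\text{Region}_{ir}$ and the error term $\varepsilon_i$ are notational assumptions introduced for illustration, and the individual models reported in the tables enter the enforcement, prosecution and institutional measures in subsets rather than all at once.

$$ \text{Corruption}_{i}^{j} = \beta_{0} + \beta_{1}\,\text{Enforcement}_{i}^{k} + \beta_{2}\,\text{Prosecution}_{i} + \beta_{3}\,\text{Institutional quality}_{i}^{m} + \sum_{z \in Z} \gamma_{z}\,\text{Controls}_{i}^{z} + \sum_{r} \delta_{r}\,\text{Region}_{ir} + \varepsilon_{i} $$

In this notation, H1 leaves the sign of $\beta_{1}$ ambiguous (deterrence versus complicit effect), H2 corresponds to $\beta_{2} < 0$, and H3 corresponds to $\beta_{3} < 0$.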

The overriding goal is to test the hypotheses posed above and to evaluate the relative effectiveness of gums, teeth and bite in curbing corrupt behavior.

To complete our empirical model we control for other economic, political, and cultural factors that impact corruption. To do this we rely on the extant literature to determine the relevant variables (see, e.g., Aidt 2003; Lambsdorff 2006; Seldadyo and de Haan 2006; Svensson 2005; Treisman 2007). To account for economic factors we include real GDP per capita (GDP), where greater prosperity typically means more resources devoted to curbing corruption. Greater economic prosperity also increases the opportunity cost of breaking the law. Democratic countries, measured by the degree of democracy (DEM), give citizens the power to voice their discontent and remove corrupt politicians from office.

Following Paldam (2002), we account for cultural aspects that influence corruption by considering the composition of ethnicities within a country (ETHNIC).Footnote 4 Greater ethnic diversity is generally taken to increase corruption, as bribes provide trust or confidence among diverse ethnic groups. We also account for religion by including the percent of the population that identifies as Protestant (Protestant), which has been shown to reduce the likelihood of corruption (Lambsdorff 2006; Treisman 2000).

While the corruption-determinants literature has considered a multitude of influences (Dimant and Tosato 2018; Lambsdorff 2006; Seldadyo and de Haan 2006), we use the ones that have been consistently shown to be significant in a cross-country context, while adding some new enforcement variables.

For diagnostic tests we report tests for heteroscedasticity and non-normality of the errors using Cameron and Trivedi’s (1990) decomposition information matrix (IM) under the null hypothesis that the errors are homoscedastic and normally distributed. To test for model misspecifications due to non-linearities we report the Ramsey regression equation specification error test (RESET) under the null hypothesis that the model is correctly specified. Finally, to check for problems with multicollinearity we report the mean variance inflation factor (VIF) for the independent variables in each model. Models with a VIF that exceeds 10 can be problematic. Next, we turn to the results.

4 Results

4.1 Baseline results

The baseline results are in Models 3.1–3.4 in Table 3. The R-squared exceeds 0.74 in all models confirming that our model explains more than 70% of the variation in corruption. According to the diagnostic tests the errors are mostly free from multicollinearity and appear to be homoscedastic and normally distributed with some minor deviations.

Table 3 Effectiveness of enforcement (“teeth”) in curbing corruption: baseline models

Related to the “teeth” measure of enforcement, the coefficients on our four measures POLICE, JUDGES, PROSECUTORS, and ALLenforce are all insignificant, except for POLICE. Curiously, the coefficient on POLICE is positive and statistically significant, which might be due to a reaction to an increase in corruption or to these agents of the government being corrupt themselves (Goel and Nelson 2011; Goel et al. 2016); we deal with this simultaneity in Sect. 4.2. Thus, merely showing enforcement “teeth” appears to be ineffective at curbing corruption, and, in some cases, may encourage corruption as enforcement employees might themselves be corrupt (Goel et al. 2016).

These results are supported using an alternate measure of corruption [Corruption (TI)] shown in Models 3.5–3.8 in Table 3.

Turning to the control variables, more prosperity reduces corruption across all models, whereas democracy is mixed in its effects on reducing corruption. In terms of cultural influences on corruption, more ethnic fractionalization in a nation has a mostly positive, albeit insignificant, effect on corruption. On the other hand, the Protestant work ethic significantly reduces corruption across all models (see Treisman 2007). Overall, among controls, greater prosperity and a greater share of Protestant population are effective at reducing corruption. Whereas the effect of economic prosperity can be seen as tied to better governance in wealthier nations, the influences of Protestant population can be viewed in the context of social factors affecting corruption.

The results in Table 4 use ConvictionRT as a measure for the “bite”, along with three institutional measures capturing overall quality in prosecution and enforcement that account for the “gums”, i.e. RegQUAL, LawOrder, and RuleLAW. The presence of enforcement employment by itself would not prove an effective deterrent if convictions are delayed or conviction rates are low. Moreover, for the convictions machinery to work efficiently, there needs to be a well laid out institutional framework.

Table 4 Effectiveness of convictions (“bite”) and institutions (“gums”) in curbing corruption: baseline models

According to the baseline models, Models 4.1–4.4, the coefficient on ConvictionRT is insignificant, whereas the coefficients on the general measures of enforcement are negative and highly statistically significant. However, the effectiveness of enforcement varies across the three measures. For instance, for a 1% increase in RegQUAL, LawOrder, and RuleLAW, corruption decreases by 0.267%, 0.873%, and 0.424%, respectively.

These results are supported using an alternate measure of corruption in Models 4.5–4.8. The results for the control variables are in general agreement with those reported in Table 3, except that GDP is insignificant in Models 4.2 and 4.8, and democracy is significant in six of the eight models.

In sum, the results support hypothesis H3 regarding the relative superiority of good institutions over piecemeal enforcement in deterring corruption. Hypothesis H2 finds some support in terms of the negative sign, but lacks statistical significance.Footnote 5 Finally, the statistical support for hypothesis H1, allowing for the possibility of a deterrence effect and a complicit effect, is low, likely reflecting the presence of both influences somewhat cancelling each other out.

To check the sensitivity of our main findings we carry out two robustness checks in the following two sections: first, we account for possible simultaneity using instrumental variables (Sect. 4.2); second, we consider the possible nonlinearities in enforcement (Sect. 4.3).

4.2 Accounting for possible simultaneity 1

Conceivably, increases in corruption could prompt law makers to ramp up enforcement in terms of employing more resources devoted to enforcement and higher conviction rates.Footnote 6 To account for this possible simultaneity we employ instrumental variables and re-estimate the models in Tables 3 and 4 using two-stage least squares (2SLS) and report the results in Tables 5 and 6, respectively.
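The logic of the two-stage procedure can be sketched with a single endogenous regressor and a single instrument. This is a minimal illustration on simulated data, not the estimation reported in the tables: the variable names, the data-generating process (an unobserved factor `u` that moves both enforcement and corruption), and the true coefficient of -0.5 are all assumptions chosen for the example. Standard errors from a hand-rolled second stage would not be valid, so none are computed.

```python
import random


def ols_simple(x, y):
    """Bivariate OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def two_stage_least_squares(z, x, y):
    """2SLS with one instrument z for one endogenous regressor x."""
    # Stage 1: project the endogenous regressor on the instrument.
    a1, b1 = ols_simple(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the fitted values.
    return ols_simple(x_hat, y)


random.seed(0)
n = 20000
z = [random.gauss(0, 1) for _ in range(n)]   # hypothetical instrument
u = [random.gauss(0, 1) for _ in range(n)]   # unobserved common factor
# Enforcement depends on the instrument and on the unobserved factor.
enforcement = [0.8 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
# Corruption: true effect of enforcement is -0.5; u also raises corruption.
corruption = [1.0 - 0.5 * xi + 2.0 * ui + random.gauss(0, 1)
              for xi, ui in zip(enforcement, u)]

_, b_ols = ols_simple(enforcement, corruption)
_, b_iv = two_stage_least_squares(z, enforcement, corruption)
print("OLS slope:", round(b_ols, 3), "2SLS slope:", round(b_iv, 3))
```

In this simulated setting the OLS slope is pushed upward (it even turns positive), much like the positive POLICE coefficient in the baseline, while the 2SLS slope stays close to the assumed true value because the instrument is unrelated to the unobserved factor.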

Table 5 Effectiveness of enforcement (“teeth”) in curbing corruption: accounting for possible two-way causality
Table 6 Effectiveness of convictions (“bite”) and institutions (“gums”) in curbing corruption: accounting for possible two-way causality

To instrument latent and actual enforcement we use the total number of rail lines per capita (RAIL) and a dummy variable (Sovereign) that captures those countries that were sovereign before the International Criminal Police Organization (Interpol) formation talks were initiated in 1914 (https://www.interpol.int/About-INTERPOL/History). Countries with an extensive rail system have a more developed infrastructure and more resources devoted to enforcement or more effective enforcement.Footnote 7 Furthermore, countries that were sovereign prior to the formation of Interpol likely had some sort of law enforcement apparatus in place, which Interpol reinforced over time. In this way our Sovereign variable is meant to capture an institutional legacy in law enforcement that would likely not affect current corruption directly.

To instrument enforcing institutions we employ readily available “internal” instruments by using the lagged value (average from 1990 to 1999) of the endogenous variables as well as Sovereign. Using lagged values of the endogenous institutional variables is attractive in that they are typically highly correlated with the contemporaneous endogenous variable. For example, the correlation between RegQUAL, LawOrder, and RuleLAW and their lagged values is 0.95, 0.96, and 0.86, respectively. While the exclusion restriction is more difficult to satisfy, we argue that lagged values of the endogenous variables impact corruption through their impact on contemporaneous institutional variables; see Bazzi and Clemens (2013) for a detailed discussion of using lagged endogenous variables as instruments in growth regressions.Footnote 8

To check if the instrument is relevant (i.e. correlated with the endogenous variable) we report two tests: the Kleibergen-Paap rk LM and the Kleibergen-Paap rk Wald F tests. The Kleibergen-Paap rk LM tests if the instruments are correlated with the endogenous variable and the Kleibergen-Paap rk Wald F tests if the instruments are only weakly correlated with the endogenous variable. A rejection of the null hypothesis indicates that the instrument is relevant. In addition, we report the first-stage results (Tables 5a and 6a) and reduced form results (Tables 5b and 6b) in the “Appendix”.

Although the coefficients on latent enforcement (i.e. POLICE, JUDGES, PROSECUTORS) are all negative, they are insignificant. These are in line with the baseline findings in Models 3.1–3.8. The lack of significance might be due to an absence of coordination across regulatory bodies and/or bureaucratic red tape in deployment and processing. Unfortunately, we do not have hard data to control for these aspects.

The results for the control variables show that GDP, Protestant, and in some cases, democracy are effective at combatting corruption, whereas fractionalization is insignificant in all cases. The result for GDP is consistent with better governance in wealthier nations and with economic prosperity increasing the opportunity cost of illegal acts.

The results in Table 6 again show that the general measures of enforcement (RegQUAL, LawOrder, and RuleLAW) significantly reduce corruption, while the coefficient on ConvictionRT is negative, albeit insignificant. These results confirm those in Table 4. Thus, strong gums prove to be most effective in curbing corruption. The relative effectiveness of these institutional quality measures stems from the fact that, as opposed to individual dimensions, overall improvements in institutions capture both the quantity and quality (including coordination, governance, red tape, etc.) of enforcement.

While the control variables are largely consistent with those reported in Table 4, there are some interesting deviations. For instance, democracy is positive and significant in some cases (Models 6.4, 6.6, and 6.8), and Protestant is insignificant when conviction rates are controlled for. Overall, the baseline results are robust after accounting for simultaneity.

The diagnostic tests reported at the bottom of Tables 5 and 6 reveal mixed results related to the strength of the instrument choice. That is, the insignificance of the Kleibergen-Paap LM test in 6 of the 8 models and the first-stage and reduced-form results in Tables 5a and 5b in the “Appendix” suggest potentially weak instruments. In Table 6, however, the Kleibergen-Paap LM test is statistically significant in 6 of the 8 models, and these results are consistent with the first-stage and reduced-form results in Tables 6a and 6b in the “Appendix”. Given the mixed results, we turn to a robustness check that considers an alternate set of instruments.

4.2.1 Robustness check 1: accounting for possible simultaneity 2

Because finding good instruments is the bane of empirical research, we check the robustness of the 2SLS results using an alternate set of instruments. Following Bergh and Nilsson (2014) and Berggren and Nilsson (2015) we consider information on enforcement and institutions in neighboring countries. In particular, we employ the spatial lagged value of each of the endogenous variables as instruments. To construct the spatial lag of each variable we use the inverse geographic distance to define neighbors. For 134 countries we construct the spatial weight matrix W as a 134 × 134 row-standardized weight matrix, where the ijth element in W is defined as \( w_{ij} = \frac{1}{{d_{ij} }} \) with \( d_{ij} \) measuring (in km) the great-circle distance (based on country centroids) between country i and country j. Accordingly, countries closer in geographical distance receive a higher weight. This setup conforms to Tobler’s law of geography that states “everything is related to everything else, but near things are more related than distant things.”

Using this weight matrix we then construct the spatial lag as the weighted average of neighboring values of the endogenous variable. Moreover, each spatial lag is constructed using the temporal lag of the endogenous variable to further mitigate concerns with endogeneity due to reverse causality. Specifically, the institutional variables RegQUAL, LawOrder, and RuleLAW are averaged over the period 1991–2000. The years of latent enforcement (POLICE, JUDGES, and PROSECUTORS) and actual enforcement (ConvictionRT) vary, but they are mostly clustered in the early-to-mid part of the 2000s. Using these spatial lag variables as instruments we re-estimate the models in Tables 5 and 6 and report the results in Tables 7 and 8.
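The construction of the row-standardized inverse-distance matrix and the resulting spatial lag can be sketched as follows. This is an illustrative sketch under stated assumptions: the haversine formula with a mean Earth radius of 6371 km stands in for the great-circle distance, the diagonal of W is set to zero (a country is not treated as its own neighbor, which the text does not state explicitly), and the three centroids and values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in km; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def inverse_distance_weights(centroids):
    """Row-standardized W with w_ij = 1/d_ij off the diagonal, 0 on it.

    Assumes all centroids are distinct (otherwise d_ij would be zero).
    """
    n = len(centroids)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i][j] = 1.0 / great_circle_km(*centroids[i], *centroids[j])
        row_sum = sum(W[i])
        W[i] = [w / row_sum for w in W[i]]  # each row now sums to one
    return W


def spatial_lag(W, values):
    """Weighted average of the other countries' values, one per country."""
    return [sum(w * v for w, v in zip(row, values)) for row in W]


# Hypothetical (lat, lon) centroids on the equator and made-up values.
centroids = [(0.0, 0.0), (0.0, 1.0), (0.0, 10.0)]
values = [1.0, 2.0, 4.0]

W = inverse_distance_weights(centroids)
lags = spatial_lag(W, values)
print([round(v, 4) for v in lags])
```

For the first hypothetical country, the neighbor one degree away receives ten times the weight of the neighbor ten degrees away, so its spatial lag is (10 × 2 + 1 × 4) / 11, illustrating how nearer countries dominate the instrument.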

Table 7 Effectiveness of enforcement (“teeth”) in curbing corruption: accounting for possible two-way causality
Table 8 Effectiveness of convictions (“bite”) and institutions (“gums”) in curbing corruption: accounting for possible two-way causality

The results in Table 7 confirm those in Table 5: latent enforcement has no significant effect on corruption. Moreover, the Kleibergen-Paap LM test rejects the null hypothesis that the instruments are uncorrelated with the endogenous variable in six of the eight models (see also the first-stage and reduced-form results in Tables 7a and 7b in the “Appendix”). The control variables are also consistent, except that ethnic fractionalization is now positive and statistically significant in Models 7.1–7.4.

Table 8 reports the re-estimation of the models in Table 6. Here again, the results are consistent with those in Table 6, with the exception that LawOrder, albeit negative, is insignificant at conventional levels. Further, the null hypothesis of the Kleibergen-Paap LM test is rejected in four of the eight models, and the first-stage and reduced-form results in Tables 8a and 8b confirm that the instruments are relevant at least for RegQUAL and RuleLaw. Overall, these results mostly confirm the previous baseline (Tables 3 and 4) and IV (Tables 5 and 6) results, and thus serve to validate our main finding that institutions, rather than latent or actual enforcement, are what matter most for corruption reduction.

4.2.2 Robustness check 2: accounting for possible simultaneity3

As noted in this study and elsewhere in the literature, the multidimensional nature of corruption makes it particularly challenging to identify good instruments. In the context of our chosen instruments, RAIL and Sovereign, one could argue that rail networks and perhaps even governance and institutional quality might be positively related to a nation’s economic prosperity. To address these potential concerns, we reran the models reported in Tables 5 and 6, dropping GDP as a regressor. The corresponding results are reported in Tables 5c and 6c in the “Appendix”.Footnote 9 Both the Kleibergen-Paap rk LM test and the Kleibergen-Paap rk Wald F test are consistent with the results reported in Tables 5 and 6.

The main findings about teeth, gums and bite are supported—institutions prove effective in combating corruption, but law enforcement and punishment do not.

4.3 Robustness check 3: nonlinear effects

The assumption thus far has been that enforcement has a linear, or constant, effect on corruption. However, it is conceivable that enforcement has a diminishing effect on corruption and, thus, that the relationship is nonlinear. Indeed, the significance of the Ramsey RESET test statistic hints at possible omitted nonlinearities. To check whether the effect of enforcement on corruption is nonlinear, we include a quadratic term, re-estimate Models 3.1–3.8 in Table 3 and Models 4.1–4.8 in Table 4, and report the results in Tables 9 and 10, respectively.

To facilitate interpretation and mitigate the effects of multicollinearity we center each enforcement variable by subtracting the mean from each observation.
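In equation form, the centered quadratic specification can be sketched as follows. The notation is an assumption introduced here for illustration and is not taken from the tables: \( E_i \) denotes an enforcement measure for country i, \( \bar{E} \) its sample mean, \( X_i \) the vector of control variables, and \( CORR_i \) the corruption index.

\[ CORR_i = \beta_0 + \beta_1 (E_i - \bar{E}) + \beta_2 (E_i - \bar{E})^2 + X_i'\gamma + \varepsilon_i \]

\[ \frac{\partial CORR_i}{\partial E_i} = \beta_1 + 2\beta_2 (E_i - \bar{E}) \]

With centering, \( \beta_1 \) is the marginal effect of enforcement evaluated at the sample mean, and the sign of \( \beta_2 \) indicates whether that marginal effect rises or falls as enforcement increases.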

Overall, the results support the baseline results; however, there are some interesting differences. In Table 9, the coefficient on the linear POLICE term is positive and significant, while the quadratic term is negative and significant in Model 9.1, which supports a diminishing effect on corruption.Footnote 10 The remaining enforcement variables and their corresponding quadratic terms lack any statistical significance. These results are largely confirmed with the alternate measure of corruption, although the coefficient on the quadratic POLICE term is insignificant (Model 9.5).

Table 10 reports the results after re-estimating the models in Table 4 with the quadratic term included. In contrast to the results in Table 9, and consistent with enforcement having a diminishing effect on corruption, both the linear and quadratic terms are negative and statistically significant for three of the four measures (i.e., RegQUAL, LawOrder, and RuleLaw).

Table 9 Effectiveness of enforcement (“teeth”) in curbing corruption: nonlinear effects
Table 10 Effectiveness of convictions (“bite”) and institutions (“gums”) in curbing corruption: nonlinear effects

Finally, the coefficient on the linear term for ConvictionRT is positive and insignificant, and the coefficient on the quadratic term is negative and significant, consistent with a diminishing effect. However, this result is not robust across the two measures of corruption.Footnote 11 The control variables are largely consistent with the baseline models. Consequently, these results verify H2, with the caveat that there exist nonlinearities in the impact of certain enforcement dimensions on corruption. The concluding section follows.

5 Conclusions

Whereas the theoretical literature on effective enforcement against corruption/bribery has considered many dimensions, empirical research has failed to keep up with theory (see Dimant and Tosato 2018; Treisman 2007), as some dimensions are either not quantifiable (e.g., relative risk attitudes of bribe takers and bribe givers, optimal punishment or optimal compensation) or the corresponding data are not available outside of surveys for specific nations (e.g., monopolistic powers of regulators, timing of bribes). Yet empirical verification of theories is important, especially if policies are to be framed based on their recommendations.

Within this spectrum, the present paper adds a somewhat new dimension to the substantial body of research on factors driving cross-national corruption by examining the effectiveness of enforcement in reducing corruption. The main novelty lies in comparing the relative influences of latent enforcement (measured via police, judicial, and prosecutorial employment, as well as the sum of the three, ALLenforce) and actual enforcement (conviction rates). Although the aim was to consider the efficacy of specific dimensions of enforcement, it turns out that aggregate measures, or comprehensive enforcement, are what pay dividends when it comes to corruption control.

In particular, the results show that piecemeal efforts to combat corruption by increasing enforcement employment would not be effective; rather, comprehensive improvements in institutional quality, through strengthening the rule of law or regulatory quality, bear greater results. These findings are robust across indices of corruption that capture somewhat different aspects. Quantitatively, law and order shows the largest impact on corruption, followed by rule of law, with regulatory quality showing the smallest impact.Footnote 12 Furthermore, enforcement has a diminishing effect on corruption; additional resources must therefore be allocated toward enforcement for it to remain as effective at combating corruption. In terms of the title of the paper, when it comes to corruption control, strong gums (institutions) are more effective than showing teeth (enforcement employment) or the bite (conviction rates).Footnote 13

The insignificance of the enforcement measures might be related to the fact that increases in any one dimension (say, the police force), without an accompanying change in related dimensions (judges, prosecutors) and an improvement in institutions (rule of law), would not prove to be effective corruption deterrents. For example, even if all police personnel were honest and zealous in their fight against corruption, they would not effectively curb corruption if there were not enough judges or prosecution rates were low (both of which would delay or reduce expected punishments for corruption), or if institutions were weak (leading to inconsistent or uncertain apprehension and/or punishment). Improvements in institutional quality, by contrast, encompass broader dimensions that would enhance expected punishments (or at least the chances of apprehension). On the flip side, institutional change is quite gradual (and perhaps even costlier than changing individual enforcement employment) and might be less politically expedient.

The main policy lesson from this study is that comprehensive improvements in enforcement, involving better institutions related to law and order, are more effective in combating corruption than a focus on individual dimensions of enforcement. However, institutions change gradually, and such change is not readily evident. Thus, politicians facing public outcry over corruption scandals, or seeking to show greater resolve in fighting corruption during election years, might find it easier to increase enforcement employment. While such moves may be politically expedient, our results show that these endeavors are unlikely to provide real results unless institutional quality also improves. Further, the gradual pace of institutional change, and the concurrent resource investments that go with it, would prove especially challenging for developing nations looking to control corruption in a relatively short period.