1 Introduction

Our aim in this paper is to determine the order of magnitude of the average cartel overcharge, based on an extended version of a database of cartels that was used by Connor (2010).Footnote 1 This database contains overcharge estimates (OEs) that were obtained from a survey of several studies of cartels as well as three types of variables: the first group (Y) consists of variables that describe the cartel episode (e.g., duration, scope, geography, etc.). The second group (Z) consists of factors that are posterior to the cartel episode (e.g., estimation method or publication source). The third group (W) consists of a single dummy variable that indicates whether the cartel was “found or pleaded guilty”. While Y and W are likely related to the true overcharge, Z captures potential estimation biases.

The raw OE data are themselves potentially biased and the variable W is likely endogenous. Hence, a naive OLS regression of the OEs on Y, W, and Z should be avoided. To verify whether the OEs are biased or not, we perform a meta-regression analysis, in the spirit of Connor and Bolotova (2006), who show that part of the variability of the OEs is indeed due to the bias factors.

We use a Kullback–Leibler divergence to compare the probability of an OE’s being larger than some value θ conditional on (Y, W) to the same probability conditional on (Y, W, Z). The two conditional probabilities are quite close for \( \uptheta \in [0,65\,\% ] \) but diverge sharply for θ > 65 %. This divergence is caused by the fact that the joint distribution of the variables that are involved in the probit models that are specified for the probability of (OE > θ) becomes degenerate as θ exceeds a certain threshold. Next, we regress the logarithm of OE on Y, W, and Z on increasing subsamples of type (0, θ]. The results allow us to identify the range \( OE \in (0,49\,\% ] \) as the most reliable for our meta-analysis. Thus, our final results are derived from a Heckit regression that infers bias-corrected OEs for the whole sample by using unbiased estimates of coefficients obtained from the subsample \( OE \in (0,49\,\% ] \).

Applying the methodology described above, we find mean and median bias-corrected OEs of 16.47 and 16.17 % for the subsample \( OE \in (0,49\,\% ] \), of 16.68 and 16.17 % for the subsample of effective cartels (with strictly positive OEs), and of 15.47 and 16.01 % for the whole sample. These numbers are significantly lower than the means and medians of the raw OE data. Moreover, the comparison of bias-corrected mean and median OEs reveals a fairly homogenous behavior of cartels across different types, geographical locations, and time (antitrust regime) periods.

The paper is organized as follows: in Sect. 2, we describe the context and the literature that surround our research. Section 3 presents the raw OE and discusses data problems. Section 4 illustrates the danger of converting Lerner indices into OE while ignoring the competitive mark-ups. Sections 5 and 6 present the methodology that we use to detect the presence of bias in the OE data. Section 7 presents the determinants of cartel overcharge that are unveiled by our meta-analysis. Section 8 presents the steps of our bias-correction methodology and the summary statistics for the bias-corrected OE. Section 9 presents an analysis of variance of the OE bias. Section 10 concludes. The “Appendix” contains the summary statistics of the database.

2 The Context

The United States Sentencing Guidelines (USSG) recommends a base fine of 10 % of the affected volume of commerce for a firm that is convicted of cartel activity, plus another 10 % for the harms “inflicted upon consumers who are unable or for other reasons do not buy the product at the higher price”. This yields a fine of 20 %, subject to further adjustments for aggravating and mitigating factors. The observed total financial fines generally fall in a range from 15 to 80 % of affected sales. Moreover, there is a possibility of incarceration for the individuals involved in the collusion. For fiscal years 2010–2014 (5 years), US antitrust prosecutions resulted in over US$5.14 billion in criminal fines and penalties, including the largest cartel fine in history—$1.14 billion for the liquid crystal display (LCD) panel cartel—and more than 295 years of jail time.

In the European Union, the determination of fines accounts for the severity of the damages that are inflicted upon consumers, suppliers, and clients as well as aggravating and mitigating factors. The basic fine is set within a range of 0–30 % of affected commerce plus 15–25 % as an additional dissuasive measure. However, the total fine must not exceed 10 % of the “worldwide group turnover in the financial year preceding the decision.”Footnote 2 For fiscal years 2010–2014 (5 years), European antitrust prosecutions resulted in over €8.93 billion in criminal fines and penalties. This amount includes the highest fine in history: €1.47 billion for the TV and computer monitor tubes cartel.

Connor and Lande (2008) examined a large number of OE studies and found an average in the range 31–49 % and a median in the range 22–25 %. Based on this, they concluded that “the current Sentencing Commission presumption that cartels overcharge on average by 10 % is much too low”. A similar study that was conducted by Connor (2010) concludes that “…penalty guidelines aimed at optimally deterring cartels ought to be increased”.

Combe and Monnier (2011, 2013) performed an analysis of 64 cartels that were prosecuted by the European Commission and concluded that “fines imposed against cartels by the European Commission are overall sub optimal.” In criticizing the Canadian Competition Bureau approach, Kearney (2009) wrote “The assumption of an average overcharge of 10 percent also has been put into question by economic survey evidence which suggests that the median long-run overcharge is much greater than 10 percent.”

However, Cohen and Scheffman (1989) argued that an increase of 1 % of a price above its natural competition level usually results in a reduction of sales of more than 1 %. Based on this, they concluded with respect to the USSG that “at least in price-fixing cases involving a large volume of commerce, ten percent is almost certainly too high”. Adler and Laing (1997, 1999) and Denger (2003) also judged that the fines imposed by the US Department of Justice are “astronomical” or “excessive” (Connor and Bolotova 2006, p. 1112).

Allain et al. (2011) develop a dynamic model of cartel stability and find that the cartel-level fines imposed by the European Commission in the 64 cartels analysed by Combe and Monnier are on average above the proper deterrence level.Footnote 3 Considering a more recent database at the firm level, Allain et al. (2015) conclude that the majority of firm-level fines imposed by the European Commission over the period 2005–2012 are above the deterrence level.

Hence, there is disagreement among specialists about the magnitude of cartel overcharges and thus, about optimal fines. Our paper contributes to this debate by providing an econometric method that appropriately deals with the limitations of the Connor database.

In Boyer and Kotchoni (2011, 2012), we conducted a meta-analysis by introducing three refinements with respect to Connor and Bolotova (2006): first, we removed all cartels with OE larger than or equal to 50 % as well as zero OE from the sample at the estimation stage. Second, we used a K-means analysis to separate the sample into four “homogenous” clusters. And third, we regressed the log of OE on the Y, W, and Z variables (which were defined above) while assuming that the coefficients of Z vary across the clusters. An inverse Mills ratio was used to control for sample selection biases. Our results confirmed that the OEs are biased as they depend on the Z variables. We obtained an average bias-corrected OE of 18.89 % for the subsample of effective cartels and of 17.52 % for all cartels.Footnote 4

Criticisms of our previous analyses centered on the following issues: the trimming of the sample at 50 % was not well motivated; the regressors used for the meta-analysis include the indicators of the clusters identified in a prior K-means analysis on the same data; the Heckit procedure assumed that the same latent variable drives the occurrence of zero OEs and OEs above 50 %; and the variable that indicates whether the cartel members pleaded or were found guilty is likely endogenous.

The present article takes those critiques into account: we design an empirical framework in which the trimming bound is justified on statistical grounds; we rely on the Kullback–Leibler divergence in lieu of a K-means analysis to underscore the presence of bias in the raw OE data and bring other data problems to light; we properly take into account the endogeneity of the “guilty” variable; we apply the Heckit procedure to effective cartels only, so that only the right-hand truncation of the data needs to be controlled; and the zero raw OE are included unaltered in the sample that is used to predict the bias-corrected OE of all cartels.

3 The Connor Database

As mentioned earlier, the database that is used for our study is an extended version of the one that is used in Connor (2010). The raw sample consists of 1178 cartel episodes, from which 59 are discarded because of missing information. This leaves us with a sample of 1119 cartels, with OEs ranging from 0 to 1800 %. The mean OE is 45.5 % on the whole sample and 49 % for the subsample of strictly positive OEs. The mean is 20.5 % for the cartels with OEs that lie strictly between 0 and 49 %, which represents 69.9 % of the sample. OEs that are larger than 49 % represent 22.9 % of the sample, and the average OE for this subsample is 136.2 %.
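The subsample figures quoted above can be cross-checked against the full-sample mean with a short calculation. The sketch below is only an arithmetic consistency check on the reported numbers; it assumes that the part of the sample not covered by the two positive-OE subsamples consists of zero OEs, and the variable names are illustrative.

```python
# Shares and means as reported in the text (means in percent).
share_mid, mean_mid = 0.699, 20.5     # OEs strictly between 0 and 49 %
share_high, mean_high = 0.229, 136.2  # OEs larger than 49 %

# Assumption: the rest of the sample consists of zero OEs.
share_zero = 1.0 - share_mid - share_high

# Mean over the whole sample (zero OEs add nothing to the weighted sum).
mean_all = share_mid * mean_mid + share_high * mean_high

# Mean over strictly positive OEs only.
mean_positive = mean_all / (share_mid + share_high)

print(f"implied share of zero OEs: {share_zero:.1%}")
print(f"implied mean, whole sample: {mean_all:.2f} %")
print(f"implied mean, positive OEs: {mean_positive:.2f} %")
```

Under this assumption, both implied means agree with the reported 45.5 % and 49 % up to rounding.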

However, the sample means of 45.5 and 49 % are influenced by a small number of outliers. Roughly 1 % of the OEs are larger than 400 %; and when the 5 % largest observations are left out of the sample, the average OE drops from 49 to 32 %. These outliers should be treated carefully when using econometric methods that are sensitive to their presence (e.g., OLS regressions). The skewness of the distribution (Fig. 1)Footnote 5 implies a significant difference between the means and medians.

Fig. 1 Overcharge Estimates: Distribution skewed to the right. Note: Overcharges larger than 400 % (1 % of the sample) are not shown on this figure

It should be emphasized that the overcharge data consist of estimates that were previously published by different experts and researchers. Therefore, they are potentially subject to model errors, estimation errors, endogeneity bias, and sample selection.Footnote 6

The raw overcharge data are quite heterogenous across regions, scope (domestic versus international), and time periods (Table 1). This clearly raises aggregation problems. Indeed, the average overcharge that is obtained for the whole sample is meaningful only if the conditions that determine the but-for price are the same across time and markets. As noted by Levenstein and Suslow (2003), “The reported price increases vary widely by industry and by source.”

Table 1 Means and medians of raw OEs per location and types of cartels
Table 2 Summary statistics of the Connor database
Table 3 Average values of the explanatory variables on selected ranges of OEs

The following variables are listed in the Connor database:

\( {\hat{\uptheta }} \) :

the overcharge estimate (OE), which is summarized in Table 1

Y1 :

Duration, discretized: 1 if duration is < 5 years; 2 if duration is from 6 to 10 years; 3 if duration is from 11 to 15 years; and 4 if duration is 16 years or more

Y2 :

Scope: equals 1 if domestic and 0 if international

Y3 :

Bid rigging: equals 1 if Yes and 0 if No

Y4 :

Geographic market: five dummy variables for US, EU, ASIA, ROW including Latin America, and WORLD cartels that cannot be associated with a primary region

Y5 :

Antitrust law regime in the US: six dummy variables for P1 (1770–1890); P2 (1891–1919); P3 (1920–1945); P4 (1946–1973); P5 (1974–1990); and P6 (1991–2004)

W:

Found guilty or pleaded guilty: equals 1 if Yes and 0 if No

Z1 :

Overcharge estimation method: dummy variables for Price before conspiracy (PBEFOR); Price war (PWAR); Price after conspiracy (PAFTER); Yardstick (YARDST); Cost based (COST); Econometric modeling (ECON); Historical case study with no method specified (HISTOR); Legal decisions (LEGAL); and other unspecified methods (OTHER)

Z2 :

Type of publication: dummy variables for Peer reviewed journal (JOURNAL); Chapters in a book (EDBOOK); Monograph or book (MONOGR); Government report (GOVREP); Court or antitrust authority source (COURT); Newspapers (NEWSPAPER); Working paper (WORKP); and Speech or conference (SPEECH)

The Y variables describe the alleged cartel episode and are therefore objectively related to the true overcharge. The period dummy variables (Y5) are used in Connor and Bolotova (2006) to capture the effect of US antitrust law regimes over time. These dummy variables are closely related to eras that are identified and studied at length by Kovacic and Shapiro (2000). The early time periods (P1, P2, and P3) are likely to be more important for the US than for the rest of the world. This argues for interacting those time periods with the US geographical market dummy in our regressions.

The Z variables describe circumstances that are posterior to the occurrence of the cartel episode. They are subjective and may therefore generate an overcharge estimation bias. Regarding the estimation methods (Z1), the traditional “yardstick” involves a cross-section comparison of firms, products, or markets. The “before-and-after” and the “price war” methods might be considered as the time series version of the “yardstick”. The “cost-based” and the “econometric” methods represent more sophisticated measurement efforts at implementing either version of the yardstick method.

The variable W (Guilty or not) is alone in its category. It is potentially related to the true overcharge while open to subjectivity: a guilty plea or judgement is not a foolproof indicator of guilt, but an entity that chooses to plead guilty has likely been involved in an effective price-fixing conspiracy. This argues for treating W as a distinct category.

Our study uses the Y, W, and Z variables described above to explain the OE.Footnote 7 Table 2 presents summary statistics of all variables. Additional summary statistics are presented in the “Appendix”.

4 The Proper Characterization of the But-for Price

Let \( \tilde{p} \) be the price that is imposed by a cartel; and let p be the but-for price: the price that would prevail absent the cartel. The cartel overcharge—expressed as a percentage of the but-for price—is given by \( \updelta = \left( {{\tilde{\text{p}}} - {\text{p}}} \right) / {\text{p}} \). While the cartel price \( \tilde{p} \) is observed, the but-for price p needs to be estimated.

An important cause of potential bias in the OE resides in the potential difficulties that are raised by the proper characterization of the but-for environment. Indeed, the observed time series of prices are the result of several causes. For instance, an inelastic demand may grant a firm significant market power that translates into high mark-ups. Product differentiation can cause a previously pure-and-perfect competitive market to behave as a monopolistically competitive one. However, oligopolistic markets have margins over marginal cost (MC) that can be significantly larger than zero. As noted by Morrison (1990), “The empirical results suggest that mark-ups in most US manufacturing firms have increased over time, and tend to be countercyclical.” Hall (1988, Table 4) claims that the ratio of price to marginal cost is in the range of 2 to 4 in US industries.

Table 4 Probit estimation results for selected ranges of OEs

The proper but-for price is equal to the marginal cost plus a margin. In pure and perfect competition, this margin is low and even close to zero. The accurate assessment of this margin is quite important when converting Lerner indices into cartel overcharges.Footnote 8 The Lerner index of market power is defined as:

$$ {\text{L}} = \frac{{{\text{p}} - {\text{c}}}}{\text{p}}, $$
(1)

where p is the market price and c is the marginal cost (MC). If the condition that would prevail in the absence of a cartel is pure and perfect competition, the but-for price is given by p = c so that L = 0. The corresponding overcharge in the cartelized market is given by \( \updelta = \frac{{{\tilde{\text{p}}} - {\text{p}}}}{\text{p}} \) and the Lerner index is \( {\text{L}} = \frac{{{\tilde{\text{p}}} - {\text{c}}}}{{{\tilde{\text{p}}}}} \). The Lerner index is converted into the overcharge via the formula \( \updelta = \frac{\text{L}}{{1 - {\text{L}}}} \).

In general, the competitive but-for price is equal to c plus a margin m: p = c + m. Likewise, the Lerner index in the cartelized market should be calculated as \( {\text{L}} \equiv \frac{{{\tilde{\text{m}}}}}{{{\text{c}} + {\tilde{\text{m}}}}} \), where \( \tilde{m} \) is the inflated mark-up due to the cartel. Therefore, the true cartel overcharge is given by \( \updelta \equiv \frac{{{\tilde{\text{m}}} - {\text{m}}}}{{{\text{c}} + {\text{m}}}} \). However, the overcharge that would be inferred from the Lerner index that (wrongly) assumes pure and perfect competition as the benchmark is:

$$ {\tilde{\updelta }} \equiv \frac{\text{L}}{{1 - {\text{L}}}} = \frac{{{\tilde{\text{m}}}}}{\text{c}} =\updelta + \frac{\text{m}}{\text{c}}\left( {\updelta + 1} \right). $$
(2)

Equation (2) illustrates the danger of converting Lerner indices into OE by ignoring the existence of competitive mark-ups in the but-for price.Footnote 9 If the true overcharge is δ = 10 % and \( \frac{m}{c} = 20\,\% \), the estimate \( \tilde{\updelta } \) is 32 %, more than three times the true value, so that the estimation bias is 22 percentage points. Note that the bias is increasing in both the true δ and \( \frac{m}{c} \). The other overcharge estimation methods are not necessarily exempt from biases either.Footnote 10
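The conversion in Eq. (2) can be checked numerically. The following sketch reproduces the example with δ = 10 % and m/c = 20 %; the cost level c = 100 and the function names are illustrative assumptions, not part of the original analysis.

```python
def lerner_index(cartel_markup: float, cost: float) -> float:
    """Lerner index in the cartelized market: L = m~ / (c + m~)."""
    return cartel_markup / (cost + cartel_markup)


def naive_overcharge(lerner: float) -> float:
    """Overcharge inferred from L when perfect competition is (wrongly) assumed."""
    return lerner / (1.0 - lerner)


def true_overcharge(cartel_markup: float, competitive_markup: float, cost: float) -> float:
    """True overcharge relative to the but-for price c + m."""
    return (cartel_markup - competitive_markup) / (cost + competitive_markup)


# Illustrative numbers: c = 100, m = 20 (so m/c = 20 %), true overcharge of 10 %.
cost = 100.0
competitive_markup = 20.0
but_for_price = cost + competitive_markup   # p = c + m
cartel_price = 1.10 * but_for_price         # 10 % above the but-for price
cartel_markup = cartel_price - cost         # inflated mark-up m~

delta = true_overcharge(cartel_markup, competitive_markup, cost)
delta_tilde = naive_overcharge(lerner_index(cartel_markup, cost))

# Right-hand side of Eq. (2): delta + (m / c) * (delta + 1).
rhs = delta + (competitive_markup / cost) * (delta + 1.0)

print(f"true overcharge:  {delta:.2%}")
print(f"naive overcharge: {delta_tilde:.2%}")
print(f"Eq. (2) RHS:      {rhs:.2%}")
```

The naive conversion returns 32 % where the true overcharge is 10 %, and the two sides of Eq. (2) coincide.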

5 A Formal Assessment of the Quality of the OE Data

We consider assessing the effectiveness profile of cartels from the OE data available to us, where the effectiveness profile is defined as the probability of the true overcharge’s being strictly larger than a given threshold θ, for θ ≥ 0:

$$ \Pr \left( {\uptheta_{i} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) = \Phi \left( {{\text{a}} + {\text{Y}}_{\text{i}} {\text{b}} + {\text{W}}_{\text{i}} {\text{c}}} \right), $$
(3)

where θi is the cartel overcharge and Yi contains variables that determine the overcharge.

Unfortunately, we do not observe θi. Instead, we observe an estimate \( {\hat{\uptheta }}_{\text{i}} \), which is potentially influenced by some bias factors Zi. Thus, we can reasonably assume that:

$$ \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) = \Phi \left( {{\tilde{\text{a}}} + {\text{Y}}_{\text{i}} {\tilde{\text{b}}} + {\text{W}}_{\text{i}} {\tilde{\text{c}}} + {\text{Z}}_{\text{i}} {\tilde{\text{d}}}} \right). $$
(4)

If \( {\hat{\uptheta }}_{\text{i}} \) were unaffected by the bias factors Zi, then it would be an unbiased estimator of θi, and \( {\tilde{\text{d}}} \) should be equal to zero in Eq. (4). We would then have:

$$ \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) = \Pr \left( {{\hat{\uptheta }}_{i} > \theta |{\text{Y}}_{\text{i}} , {\text{W}}_{\text{i}} , {\text{Z}}_{\text{i}} } \right) \cong \Pr \left( {\uptheta_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right). $$

Otherwise, the term \( \Pr \left( {{\hat{\uptheta }}_{i} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) can be quite different from both \( \Pr \left( {\uptheta_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \). Thus, the impact of Zi on OE can be detected by examining the following Kullback–Leibler divergence:

$$ \Delta \left(\uptheta \right) = \frac{1}{\text{n}}\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{n}} \Pr \left( {{\hat{\uptheta }}_{i} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right)\left[ {\log \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) - \log \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right)} \right]. $$
(5)

The Kullback–Leibler divergence (1951) tells how dissimilar two distributions are. It is strictly positive if the two distributions are different and equals zero if and only if they coincide.

Indeed, consider two probability density functions f(x) and g(x). Jensen’s inequality implies:

$$ \log {\text{E}}_{\text{g}} \left( {\frac{\text{f}}{\text{g}}} \right) \ge {\text{E}}_{\text{g}} \log \left( {\frac{\text{f}}{\text{g}}} \right), $$

where Eg is the expectation with respect to g(x). However, the LHS of the previous inequality is equal to zero since:

$$ \log {\text{E}}_{\text{g}} \left( {\frac{\text{f}}{\text{g}}} \right) = \log \int {\frac{{{\text{f}}\left( {\text{x}} \right)}}{{g\left( {\text{x}} \right)}}{\text{g}}\left( {\text{x}} \right){\text{dx}}} = \log \int {\text{f}}\left( {\text{x}} \right){\text{dx}} = \log \left( 1 \right) = 0. $$

Therefore:

$$ 0 \ge {\text{E}}_{\text{g}} \log \left( {\frac{\text{f}}{\text{g}}} \right) = \int \log \left( {\frac{{{\text{f}}\left( {\text{x}} \right)}}{{{\text{g}}\left( {\text{x}} \right)}}} \right){\text{g}}\left( {\text{x}} \right){\text{dx}} . $$

Inverting the fraction inside the log yields:

$$ \int \log \left( {\frac{{{\text{g}}\left( {\text{x}} \right)}}{{{\text{f}}\left( {\text{x}} \right)}}} \right){\text{g}}\left( {\text{x}} \right){\text{dx}} = \int \left[ {\log {\text{g}}\left( {\text{x}} \right) - \log {\text{f}}\left( {\text{x}} \right)} \right]{\text{g}}\left( {\text{x}} \right){\text{dx}} \ge 0. $$

Finally, letting \( {\text{g}}\left( {\text{x}} \right) \equiv \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \), \( {\text{f}}\left( {\text{x}} \right) \equiv \Pr \left( {{\hat{\uptheta }}_{i} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \), and replacing the integral by a discrete summation leads to the expression provided for ∆(θ) in Eq. (5).
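The non-negativity property derived above is easy to verify numerically in the discrete case. The sketch below uses two made-up three-point distributions g and f (illustrative values only) and assumes that f is strictly positive wherever g is.

```python
import math


def kl_divergence(g, f):
    """Discrete Kullback-Leibler divergence: sum over x of g(x) * [log g(x) - log f(x)]."""
    if len(g) != len(f):
        raise ValueError("g and f must have the same support")
    # Terms with g(x) = 0 contribute nothing to the sum.
    return sum(gx * (math.log(gx) - math.log(fx)) for gx, fx in zip(g, f) if gx > 0.0)


# Two hypothetical distributions on the same three-point support.
g = [0.5, 0.3, 0.2]
f = [0.4, 0.4, 0.2]

kl_gf = kl_divergence(g, f)  # strictly positive because g and f differ
kl_gg = kl_divergence(g, g)  # zero because the two distributions coincide

print(f"KL(g, f) = {kl_gf:.5f}")
print(f"KL(g, g) = {kl_gg:.5f}")
```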

Any discrepancy between \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) is attributable to Zi. Figure 2 plots the values of the Kullback–Leibler divergence ∆(θ) on the y-axis against θ on the x-axis. All conditional probabilities are estimated by Probit.

Fig. 2 Detecting the impact of the bias factors via the Kullback–Leibler divergence. Note: The Kullback–Leibler divergence, ∆(θ), is on the y-axis and θ on the x-axis. The Probit models that predict \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) are estimated by using a step of 5 % for θ. The large jump that is observed at θ = 65 % is caused either by huge biases in the OEs or by other data problems that involve the regressors

We note that the predicted values of \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) agree to a large extent when θ lies between 0 and 65 %. However, the two conditional probabilities diverge dramatically as soon as θ > 65 %. This suggests that either the OEs that lie above 65 % are heavily biased or there is an issue with some of the dummy variable regressors that are used in the Probit models.

In an attempt to understand the shape of Fig. 2, we examine the averages of the explanatory variables on selected ranges of OE (see Table 3). We see for instance that there is no cartel from the ROW with an overcharge above 60 %. There are no US cartels for the period P1 (the USxP1 row) with more than a 50 % overcharge. Likewise, the number of OEs that are collected from historical case studies and from speeches falls to zero after 70 %. The latter fact deserves attention as the jump observed in Fig. 2 occurs between θ = 65 % and θ = 70 %.

Some subtle data problems may remain unnoticed. For instance, the sample averages of HISTOR and SPEECH are the same for all headers of Table 3. A careful examination of the data shows that 217 of the 219 OEs that lie above 50 % are obtained via estimation methods other than “historical case studies” and released through publication media other than “speeches” (i.e., 217 of 219 cartels satisfy HISTOR = 0 and SPEECH = 0). In particular, there is no cartel such that HISTOR = 1 and SPEECH = 1. This kind of data problem will eventually translate into multi-collinearities.

To support the claim that biases in OEs are attributable to the Z variables, it is necessary to assess the extent to which our results are affected by the data problems that are identified above. For a robustness check, we repeat the exercise of Fig. 2 after some data transformations. First, we remove the dummy variable ROW, thereby assuming that the reference group for the geographical market is “WORLD + ROW”. Second, we merge the interaction variables USxP1 and USxP2. Third, we merge the estimation methods HISTOR and OTHER. Finally, we merge the publication sources NEWSPAPER and SPEECH. Table 4 shows the estimated Probit models for the binary variables defined by the headers of Table 3 after applying the data transformations. There is no visible identification problem based on the magnitudes of the estimated coefficients.
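The four data transformations described above can be sketched as a simple recoding of the dummy variables of one observation. The source column names (ROW, USxP1, USxP2, HISTOR, OTHER, NEWSPAPER, SPEECH) follow the text; the names of the merged columns and the example observation are hypothetical.

```python
def merge_dummies(row: dict) -> dict:
    """Apply the four transformations to the dummy variables of one cartel episode."""
    out = dict(row)
    # 1. Drop ROW, so that the geographic reference group becomes WORLD + ROW.
    out.pop("ROW", None)
    # 2. Merge the interaction variables USxP1 and USxP2 (hypothetical merged name).
    out["USxP1P2"] = max(out.pop("USxP1"), out.pop("USxP2"))
    # 3. Merge the estimation methods HISTOR and OTHER (hypothetical merged name).
    out["HISTOR_OTHER"] = max(out.pop("HISTOR"), out.pop("OTHER"))
    # 4. Merge the publication sources NEWSPAPER and SPEECH (hypothetical merged name).
    out["NEWS_SPEECH"] = max(out.pop("NEWSPAPER"), out.pop("SPEECH"))
    return out


# Made-up observation: a ROW cartel documented by a historical case study in a speech.
example = {
    "ROW": 1, "USxP1": 0, "USxP2": 0,
    "HISTOR": 1, "OTHER": 0,
    "NEWSPAPER": 0, "SPEECH": 1,
}
merged = merge_dummies(example)
print(merged)
```

Taking the maximum of two 0/1 dummies gives a dummy that equals 1 whenever either of the original categories applies, which is what merging two categories means here.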

Figure 3 shows the curve of the Kullback–Leibler divergence based on the transformed data. This curve has the same shape as in Fig. 2. However, the data transformations have moved the jump from 65 % in Fig. 2 to around θ = 95 % in Fig. 3. This suggests that the jumps seen in both figures are caused by the fact that the joint distribution of the variables involved in the probit models of \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) becomes degenerate as the cursor θ is moved above certain levels.

Fig. 3 The Kullback–Leibler divergence after correcting identification issues. Note: ∆(θ) is on the y-axis and θ on the x-axis. The Probit models that predict \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) are estimated by using a step of 5 % for θ. The data transformations have moved the jump from θ = 65 % in Fig. 2 to θ = 95 % in Fig. 3

In an effort to understand better the nature of the jumps that are seen in Figs. 2 and 3, we examine separately the two components of the Kullback–Leibler divergence given by the following, where \( \Delta \left(\uptheta \right) = \Delta_{0} \left(\uptheta \right) - \Delta_{1} \left(\uptheta \right): \)

$$ \Delta_{0} \left(\uptheta \right) = \frac{1}{\text{n}}\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{n}} \Pr \left( {{\hat{\uptheta }}_{i} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right)\log \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{i} ,{\text{W}}_{\text{i}} } \right), $$
(6)
$$ \Delta_{1} \left(\uptheta \right) = \frac{1}{\text{n}}\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{n}} \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right)\log \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right). $$
(7)
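Given the fitted probabilities from the two Probit models at one threshold θ, the components in Eqs. (6) and (7) reduce to simple sample averages. The sketch below assumes that the fitted probabilities are already available and lie strictly between 0 and 1; the four values in each list are made up for illustration and are not taken from the database.

```python
import math


def kl_components(p_without_z, p_with_z):
    """Return (Delta_0, Delta_1, Delta) as defined in Eqs. (6), (7), and (5)."""
    if len(p_without_z) != len(p_with_z):
        raise ValueError("both models must be evaluated on the same sample")
    n = len(p_without_z)
    # Eq. (6): average of P(.|Y,W) * log P(.|Y,W).
    delta0 = sum(p * math.log(p) for p in p_without_z) / n
    # Eq. (7): average of P(.|Y,W) * log P(.|Y,W,Z).
    delta1 = sum(p * math.log(q) for p, q in zip(p_without_z, p_with_z)) / n
    # Eq. (5): Delta = Delta_0 - Delta_1.
    return delta0, delta1, delta0 - delta1


# Hypothetical fitted probabilities for four cartels at a single threshold.
p_yw = [0.60, 0.45, 0.30, 0.70]   # Probit on (Y, W)
p_ywz = [0.55, 0.50, 0.20, 0.75]  # Probit on (Y, W, Z)

delta0, delta1, delta = kl_components(p_yw, p_ywz)
print(f"Delta_0 = {delta0:.5f}, Delta_1 = {delta1:.5f}, Delta = {delta:.5f}")
```

Repeating this calculation over a grid of thresholds, with the two Probit models re-estimated at each θ, would trace out curves of the kind plotted in the figures.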

Figure 4 shows the curves of \( \Delta_{0} \left(\uptheta \right) \) and \( \Delta_{1} \left(\uptheta \right) \) based on the transformed data. We see that \( \Delta_{1} \left(\uptheta \right) \) diverges abruptly from \( \Delta_{0} \left(\uptheta \right) \) at θ = 95 %, which is where the large jump occurs in Fig. 3. The fact that the curve of \( \Delta_{0} \left(\uptheta \right) \) is smooth everywhere indicates that only the Z variables are causing the jump.

Fig. 4 Decomposition of ∆(θ) into two components. Note: Any significant improvement in the model fit induced by Zi will translate into a visible difference between \( \Delta_{0} \left(\uptheta \right) \) and \( \Delta_{1} \left(\uptheta \right) \). The difference between \( \Delta_{0} \left(\uptheta \right) \) and \( \Delta_{1} \left(\uptheta \right) \) becomes visible around θ = 20 % and increases slowly up to θ = 95 %. However, \( \Delta_{1} \left(\uptheta \right) \) diverges suddenly from \( \Delta_{0} \left(\uptheta \right) \) after θ = 95 %. The jump seen in Fig. 3 occurs at θ = 95 %

Note that \( \Delta_{0} \left(\uptheta \right) \) is the upper bound of \( \Delta_{1} \left(\uptheta \right) \), which is approached as the probabilities \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) approach \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) uniformly over the sample. Any significant improvement in the model fit induced by Zi will translate into a visible divergence between \( \Delta_{0} \left(\uptheta \right) \) and \( \Delta_{1} \left(\uptheta \right) \). We see in Fig. 3 that the Kullback–Leibler divergence increases slowly as one moves from θ = 0 % to θ = 95 %.

In summary, the magnitude of the overcharge estimation bias is increasing with the raw OE, but the large jumps seen in Figs. 2 and 3 are caused by data problems. This suggests we should treat the subsample of cartels with OE > 65 % (14.29 % of the sample) with caution. Note that we are not claiming that the OEs that lie above 65 % are all biased upward, nor that the OEs that lie below 65 % are all exempt from bias. Indeed, the Kullback–Leibler divergence is not necessarily robust to positive biases that leave the proportion of OEs larger than θ unaltered.

The fact that the probabilities \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) agree closely at θ = 0 means that Zi does not affect the proportion of zero OEs, but it might influence the size of positive OEs. The next step of our analysis deals with the latter aspect: the effect of Zi on the conditional mean \( {\text{E}}\left( {{\hat{\uptheta }}_{\text{i}} |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \).

6 A Meta-analysis of the Cartel OEs

Meta-analyses are used in experimental fields to summarize the findings of studies on a particular topic. They may also be used to verify if the conditions of experiments impact their results. Hunter and Schmidt (2004) write: “[…] In our view, the purpose of meta-analysis is to estimate what the results would have been had all the studies been conducted without methodological limitations and flaws”. The meta-analysis conducted here is consistent with this statement. Our goal is to understand what causes the bias in the raw OEs, unveil the determinants of cartel overcharge, and predict bias-corrected OEs.

It is reasonable to expect that the true overcharge depends on the conspiracy period, the duration of the cartel, the characteristics of the firms involved in the collusion, and similar factors. However, we do not observe the true overcharge. Instead, we observe an estimate of it that is equal to the actual overcharge plus a bias. This bias can be positive, null, or negative. Hence in addition to the factors that affect the true overcharge, we can expect the OE to be sensitive to the subjective factors that may cause the bias: the estimation method or the publication source, both of which are “posterior” to the occurrence of the conspiracy. Formally, the bias is defined as the influence of factors that affect the reported OEs, but not the true overcharges.

If the true overcharge θi were observable, our objective would reduce to understanding what causes the bias in \( {\hat{\uptheta }}_{\text{i}} \). To this end, we would simply regress \( \log {\hat{\uptheta }}_{\text{i}} - \log\uptheta_{\text{i}} \) on Zi. The bias-correction of \( {\hat{\uptheta }}_{\text{i}} \) and the prediction of θi then become minor issues. As θi is not available to us, we specify a model where \( \log {\hat{\uptheta }}_{\text{i}} \) is on the LHS and the potential determinants of the true overcharge and of the estimation bias are on the RHS. In this approach, endogeneity issues regarding the determinants of the true overcharge need to be addressed.

We specify a log-linear meta-analysis model on truncated subsamples of type \( {\text{OE}} \in (0,\uptheta] \), where θ varies from 25 to 70 % by steps of 1 %.Footnote 11 The model that is estimated is:

$$ \log {\hat{\uptheta }}_{\text{i}} =\upbeta_{0} + {\text{Y}}_{\text{i}}\upbeta_{1} + {\text{W}}_{\text{i}}\upbeta_{2} + {\text{Z}}_{\text{i}}\upbeta_{3} + \widehat{\text{imr}}_{\text{i}}\upbeta_{4} + {\text{e}}_{\text{i}} , $$
(8)

where \( {\hat{\uptheta }}_{\text{i}} \in \left( {0,\uptheta} \right] \) and \( \widehat{\text{imr}}_{\text{i}} \) is the inverse Mills ratio (IMR) that is associated with a preliminary Probit model for the indicator of \( {\hat{\uptheta }}_{\text{i}} \in \left( {0,\uptheta} \right] \) (i.e., the selection variable). Here, we employ the Heckit procedure, in which an IMR that is estimated in a first-step Probit is included as an extra regressor in the second-step estimation in order to control for selection biases that would arise from the right truncation of the sample at θ. Details on the Heckit procedure can be found in Heckman (1979).
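To make the role of the IMR concrete, the following sketch evaluates the standard inverse Mills ratio for a selected observation, φ(x)/Φ(x), where x is the fitted index of the first-step Probit. The function name and the assumption that the Probit index has already been estimated are ours; the first-step estimation itself is not shown.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def inverse_mills_ratio(probit_index):
    """Return phi(x) / Phi(x) for a fitted Probit index x.

    This is the usual Heckit correction term for observations that are
    selected into the estimation sample. For extremely negative x the
    denominator underflows, so such values would need special handling.
    """
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)

# The ratio shrinks as selection becomes more likely (larger index).
imr_at_zero = inverse_mills_ratio(0.0)
imr_at_one = inverse_mills_ratio(1.0)
```

In a second step, this value would enter the regression as the extra regressor whose coefficient is β4 in Eq. (8).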

We prefer the Heckit to a right-censored Tobit because the former is less restrictive than the latter. Moreover, the Tobit model assumes that the regressand (here, OE) has not been measured with systematic biases, while our goal is specifically to estimate and remove the potential bias that may contaminate the OEs. We focus on modeling the log of OE because the log distribution is more symmetric, as is shown by Fig. 5.

Fig. 5 Logarithm of positive OEs (92.8 % of the sample)

The Heckit procedure requires that some regressors be included in the first-step Probit from which the IMR is estimated and excluded from the second-step regression. Such regressors are called exclusion variables, and they ensure the identification of the parameters that are estimated in the second step. An ideal exclusion variable is a determinant of the probability of the OE’s belonging to the range (0, θ] that has no direct influence on the OE. Unfortunately, none of the regressors available to us is eligible for this role on theoretical grounds.

To circumvent this difficulty, we consider shrinking the information set that is represented by the bias factors (Z) in the second-step estimation. More precisely, we use Y and W along with the Z variables as regressors in the first-step Probit. In the second-step regression, the estimation methods “historical case studies”, “legal decisions” and “Other” are merged into a single category. The publication types “journals” and “working papers” are merged under one group, “book chapters” and “monographs” under a second group, and “Newspapers and Speech” under a third group. All other categories are kept unchanged. This shrinkage of the information set acts as an exclusion restriction.

The variable Wi—which indicates whether the cartel case has been resolved with a guilty plea or decision—is likely endogenous. Indeed, the decision as to whether to plead guilty in a cartel prosecution is potentially related to the existence and size of the overcharge: a guilty plea may be suggestive that a firm has been involved in an effective price-fixing collusion. A similar argument holds if the firm was found guilty.

We consider estimating two second-step regressions: one in which Wi is included, and another one in which it is instrumented. The first regression is predictive, while the second is more structural. The structural regression is aimed at estimating the coefficients of the regressors without bias while the predictive model provides point forecasts of the regressand.

In the structural approach, our instrumental regression consists of a Probit model where the probability of a guilty plea is conditioned by the Y variables. The endogeneity problem is addressed by replacing Wi by the Probit prediction of \( \Pr ({\text{W}}_{\text{i}} = 1|{\text{Y}}_{\text{i}} ) \). Further exclusion variables are in principle needed in the structural approach while none is available to us. Fortunately, singularity is avoided because the probability of a guilty plea is nonlinear in the included instruments.

The estimated coefficients that are obtained from Eq. (8) are used to predict bias-corrected OEs conditional on \( {\hat{\uptheta }}_{\text{i}} \in \left( {0,\uptheta} \right] \):

$$ {\hat{\uptheta }}_{{{\text{bc}}1,{\text{i}}}} \left(\uptheta \right) = \exp \left( {{\hat{\upbeta }}_{0} + {\text{Y}}_{\text{i}} {\hat{\upbeta }}_{1} + {\text{W}}_{\text{i}} {\hat{\upbeta }}_{2} + \widehat{\text{imr}}_{\text{i}} {\hat{\upbeta }}_{4} + \frac{{{\hat{\sigma }}_{\text{e}}^{2} }}{2}} \right),\quad {\hat{\uptheta }}_{\text{i}} \in \left( {0,\uptheta} \right]. $$
(9)

We also compute bias-corrected estimates unconditionallyFootnote 12 by removing the IMR:

$$ {\hat{\uptheta }}_{{{\text{bc}}2,i}} \left(\uptheta \right) = \exp \left( {{\hat{\upbeta }}_{0} + {\text{Y}}_{\text{i}} {\hat{\upbeta }}_{1} + {\text{W}}_{\text{i}} {\hat{\upbeta }}_{2} + \frac{{{\hat{\sigma }}_{\text{e}}^{2} }}{2}} \right),\quad {\hat{\uptheta }}_{\text{i}} > 0. $$
(10)

The Probit from which the IMR is inferred is estimated with the subsample of cartels with strictly positive OEs (1038 cartels). The prediction \( {\hat{\uptheta }}_{{{\text{bc}}1,{\text{i}}}} \left(\uptheta \right) \) is computed for cartels with OE lying in the range (0, θ], while \( {\hat{\uptheta }}_{{{\text{bc}}2,{\text{i}}}} \left(\uptheta \right) \) is computed for the subsample of successful cartels (\( {\hat{\uptheta }} > 0 \)).
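Equations (9) and (10) can be sketched as a single function, shown below under the assumption that the coefficients and the residual variance have already been estimated from Eq. (8). The function and argument names are hypothetical. The Z term is deliberately left out of the prediction, which is what removes the estimated bias, and the term σ²/2 is the usual log-normal retransformation.

```python
from math import exp

def predict_bias_corrected_oe(beta0, y, beta1, w, beta2, sigma2_e,
                              imr=None, beta4=0.0):
    """Bias-corrected OE prediction from a log-linear model.

    With an IMR value supplied, this follows Eq. (9) (conditional on the
    truncated subsample); with imr=None it follows Eq. (10).
    y and beta1 are sequences of equal length; w is the 0/1 guilty dummy.
    """
    linear = beta0 + sum(yj * bj for yj, bj in zip(y, beta1)) + w * beta2
    if imr is not None:
        linear += imr * beta4
    return exp(linear + sigma2_e / 2.0)

# Illustrative numbers only (not estimates from the paper).
unconditional = predict_bias_corrected_oe(1.0, [2.0], [0.5], 1, 0.25, 0.5)
conditional = predict_bias_corrected_oe(1.0, [2.0], [0.5], 1, 0.25, 0.5,
                                        imr=0.8, beta4=-0.5)
```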

Given that the subsample used for estimation is truncated from the right, the Heckit should predict a larger average bias-corrected OE on the whole sample than in the subsample; that is:

$$ {\bar{\uptheta }}_{{{\text{bc}}1}} \left(\uptheta \right) = \frac{1}{{{\text{n}}\left(\uptheta \right)}}\mathop \sum \limits_{{{\hat{\uptheta }}_{i} \in \left[ {0,\uptheta} \right]}} {\hat{\uptheta }}_{{{\text{bc}}1,i}} \left(\uptheta \right) \le \frac{1}{\text{n}}\mathop \sum \limits_{{{\text{i}} = 1}}^{\text{n}} {\hat{\uptheta }}_{{{\text{bc}}2,i}} \left(\uptheta \right) = {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right), $$
(11)

where n(θ) is the size of the subsample that is used for estimation. Furthermore, \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) should be a non decreasing function of θ over the valid range of overcharges, while \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) should be approximately flat. Violation of either of these rules is suggestive of the presence of biases in the OE data that our procedure failed to correct completely. Figures 6 and 7 plot \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) and \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) on the y-axis against θ on the x-axis.
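The checks described above lend themselves to a small diagnostic routine. The sketch below assumes that the two averages have already been computed on an increasing grid of thresholds; the function name and the flatness tolerance `flat_tol` are hypothetical choices, not values taken from the paper.

```python
def check_heckit_diagnostics(bc1_means, bc2_means, flat_tol=0.5):
    """Check the ordering, monotonicity and flatness rules.

    bc1_means[k], bc2_means[k]: average bias-corrected OEs at the k-th
    truncation threshold, with thresholds sorted in increasing order.
    flat_tol: largest spread of bc2_means still regarded as "flat".
    """
    return {
        # Inequality (11): subsample average below whole-sample average.
        "bc1_below_bc2": all(a <= b for a, b in zip(bc1_means, bc2_means)),
        # The subsample average should not decrease with the threshold.
        "bc1_nondecreasing": all(x <= y
                                 for x, y in zip(bc1_means, bc1_means[1:])),
        # The whole-sample average should be approximately constant.
        "bc2_flat": max(bc2_means) - min(bc2_means) <= flat_tol,
    }

# Illustrative values only.
ok = check_heckit_diagnostics([10.0, 11.0, 12.0], [15.0, 15.2, 15.1])
bad = check_heckit_diagnostics([10.0, 9.0], [15.0, 17.0])
```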

Fig. 6 Average bias-corrected OE predicted for the subsample \( {\hat{\uptheta }}_{\text{i}} \in \left[ {0,\uptheta} \right] \) by the predictive model. Note: Dash-dotted line: average bias-corrected OE \( {\bar{\uptheta }}_{{{\text{bc}}1}} \left(\uptheta \right) \) on the subsample used for estimation. Solid line: average bias-corrected OE \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) on the whole sample

Fig. 7 Average bias-corrected OE predicted for the subsample \( {\hat{\uptheta }}_{\text{i}} \in \left[ {0,\uptheta} \right] \) by the structural model. Note: Dash-dotted line: average bias-corrected OE \( {\bar{\uptheta }}_{{{\text{bc}}1}} \left(\uptheta \right) \) on the subsample used for estimation. Solid line: average bias-corrected OE \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) on the whole sample

Figure 6 shows the case where W is used as a regressor (the predictive approach), while Fig. 7 is for the case where W is instrumented (the structural approach). In both figures, \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) is overall decreasing for θ ≤ 49 % and weakly increasing for θ > 49 %. If the bias that contaminates the raw OEs had been completely removed, the curve of \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) should be flat at least for the second half of the support of θ. This curve is flatter in Fig. 7 than in Fig. 6, which suggests that part of the problem is due to the endogeneity of W.

In our empirical framework, the potential sample selection bias that arises from the right truncation of the OEs is controlled by the IMR, whilst the original bias that potentially contaminates the raw OEs is corrected by Z. The fact that \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) is increasing on θ > 49 % does not mean that the correction for sample selection is ineffective. Instead, it means that Z is less effective at capturing the initial bias that contaminates the OEs as the truncation threshold increases beyond 49 %. This suggests that the least distorted Heckit model is the one estimated with θ = 49 %.

As expected, \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) is overall increasing in θ. In Fig. 6, however, the curves of \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) and \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \) are close and intertwined on the domain θ > 49 %. This indicates that the Heckit procedure successfully corrects the sample selection bias. In Fig. 7, \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) remains strictly below \( {\bar{\uptheta }}_{{{\text{bc}}2}} \left(\uptheta \right) \). Furthermore, the curve of \( {\bar{\uptheta }}_{\text{bc1}} \left(\uptheta \right) \) is smooth and monotonic to a greater extent in Fig. 7 than in Fig. 6. This again is suggestive that the endogeneity correction matters.

7 The Determinants of Cartel Overcharge

The results of Sect. 5 led us to restrict the analysis of Sect. 6 to cartels with OEs that are lower than 65 %. The results of Sect. 6 further led us to restrict the analysis to cartels with OEs that are lower than 49 %. The current section presents the estimation results when the sample is truncated at 49 %. The Appendix presents summary statistics for the subsamples of alleged cartels with OE = 0, 0 < OE ≤ 49 %, and OE > 49 %. Table 5 shows the Probit estimation results for the probability that a cartel is successful at raising its price above the competitive equilibrium level.Footnote 13 This Probit is trustworthy given our previous finding that \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} } \right) \) and \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > \theta |{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \) agree to a great extent at θ = 0.

Table 5 Probit model for the probability that a cartel is successful: \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > 0|{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \)

There is a positive link between the duration of a cartel and its probability of being successful. Domestic cartels tend to be less successful than are international cartels, while bid-rigging tends to be more successful than other forms of cartels. Cartels that are resolved with a guilty plea have a higher probability of being successful than do the other cartels. The geographical location and time period seem to have no effect on the probability that a cartel is successful.

A higher proportion of zero OEs is obtained via historical case studies with no method specified than in studies that specify an estimation method. The proportion of strictly positive OEs that is obtained via legal decisions is higher than for other estimation methods. Finally, working papers contain a significantly higher proportion of strictly positive OEs than do other publication sources. The coefficients of the dummy variables P1, NEWSPAPER, and SPEECH have not converged. However, the estimation results are qualitatively similar when these dummy variables are removed.

Of all successful cartels, which ones are able to overcharge by more than 49 %? In Table 6, we attempt to provide an answer by estimating a Probit model for \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > 49|{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \). The sample used for the estimation is restricted to the cartels with strictly positive OEs. We note that domestic cartels and bid rigging cases are associated with a lower probability that an OE is more than 49 %. The “Guilty” dummy variable (W) does not seem to be an important determinant of the probability of \( {\hat{\uptheta }}_{\text{i}} > 49 \), which contrasts with what is found in Table 5 for \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > 0|{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \). Hence, pleading guilty is suggestive that the cartel has been effective, but not that the overcharge is above 49 %.

Table 6 Probit model for \( \Pr \left( {{\hat{\uptheta }}_{\text{i}} > 49|{\text{Y}}_{\text{i}} ,{\text{W}}_{\text{i}} ,{\text{Z}}_{\text{i}} } \right) \)

The probability of overcharging by more than 49 % is higher for US cartels (particularly, US cartels of period P3) and ASIA cartels compared to the reference group WORLD. We also note that the Z-variables are significantly correlated with the indicator of \( {\hat{\uptheta }}_{\text{i}} > 49 \). Indeed, OEs that are obtained via econometric methods and those that are published in working papers and academic journals have relatively lower probabilities of being more than 49 %.

Table 7 presents the estimated coefficients of the meta-regressions. The dependent variable in these regressions is the log of OE. Therefore, if the coefficient of a RHS dummy variable is βi, the average OE for the group where the dummy variable takes the value 1 is exp(βi) times the average OE of the reference group. This represents a percentage increase or decrease of exp(βi) − 1. As the OE is already expressed in percentage, we will report the results in terms of a factor of increase or decrease in order to avoid confusion.

Table 7 Meta-analysis of cartel OEs

The predictive regression suggests that the ability of a cartel to charge a higher price increases with its duration. On average, the overcharge increases by a factor of exp(0.06) [=1.06] per quinquennium. The overcharges of domestic cartels are on average exp(−0.3) [=0.74] times those of international cartels. Cartels that are resolved with a guilty plea overcharge by a factor of exp(0.27) [=1.31] relative to the other cartels. The overcharges of EU cartels are exp(−0.23) [=0.79] times those of cartels at other geographical locations. OEs that are estimated via a “Price before”, “price war”, or “yardstick” method are on average larger than those that are obtained through the use of other estimation methods. Also, the OEs that are published in monographs, edited books, newspapers, and speeches are on average higher than those that are published in other media.
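The factors quoted in this paragraph follow directly from exponentiating the reported coefficients. The short sketch below reproduces that arithmetic; the dictionary labels are ours, and the coefficient values are the ones cited in the text.

```python
from math import exp

# Coefficients as quoted in the text for the predictive regression.
coefficients = {
    "duration (per five years)": 0.06,
    "domestic": -0.30,
    "guilty": 0.27,
    "EU": -0.23,
}

# A coefficient b on a dummy multiplies the average OE by exp(b),
# i.e. a relative change of exp(b) - 1.
factors = {name: exp(b) for name, b in coefficients.items()}

for name, factor in factors.items():
    print(f"{name}: factor {factor:.2f}, change {100 * (factor - 1):+.0f} %")
```

For instance, the domestic coefficient of −0.3 corresponds to overcharges that are about 26 % lower than those of the reference group.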

In the structural model, the coefficient of the “Guilty” dummy variable is not significant. This suggests that a guilty plea has no causal effect on overcharges even though it has a predictive effect. In retrospect, this result is quite intuitive: a firm adopts a guilty plea strategy because it has been involved in a successful cartel (i.e., “positive overcharge” causes “guilty plea”) and not the converse.

The R-square of the log-linear regression is slightly higher for the predictive model (0.09) than for the structural model (0.08). This is not surprising as the structural model is aimed at achieving an unbiased estimation of the parameters used to bias-correct the raw OE while the predictive model delivers the best fit of the OE in terms of in-sample mean square error.

Connor and Bolotova (2006) performed a meta-analysis of cartel OEs in which they modelled the OE as a linear function of Y, W, and Z:

$$ {\hat{\uptheta }}_{\text{i}} =\upbeta_{0} + {\text{Y}}_{\text{i}}\upbeta_{1} + {\text{W}}_{\text{i}}\upbeta_{2} + {\text{Z}}_{\text{i}}\upbeta_{3} +\upvarepsilon_{\text{i}} . $$
(12)

They estimated different restrictions of the full model. For their full model (column [7] of Table 6 in their paper), they found that the OE is positively related to the duration, but does not depend on whether the firm is “guilty” or not; it is lower for domestic cartels and for cartels that have operated in the EU; and it is neither higher nor lower for bid-rigging cases, contrary to what is claimed by Cohen and Scheffman (1989). Further, they found that the size of overcharges has declined over time. Connor and Bolotova attributed the latter result to the increased severity of antitrust regulation.

Interestingly, they also found that the Z variables have significant impacts on OE. For example, they found that the “yardstick” method produces estimates that are at least 10 % higher than the “after the conspiracy” method. For the publication sources, they found that “government reports” and “court reports” produce estimates that are respectively 22 % lower and 15 % higher than “monograph or book”. The fact that the Z variables show significant effects in the regression suggests that the raw OE are indeed biased.

Our results differ in significant ways from theirs: first, we find that cartels that pleaded guilty have higher overcharges on average, but this effect is not causal. Second, we find that more recent cartels (periods P5 and P6) are not really different from cartels of previous periods with regard to average pricing behavior. The antitrust law regimes have no impact on the probability of an overcharge’s being positive. However, they have an impact on the distribution of positive overcharges. More precisely, US cartels of the period P3 have a higher probability of overcharging by more than 49 % than do other cartels in other periods.

8 Bias-correcting the OE

The coefficients that are estimated from the predictive regression (Table 7) are possibly distorted by the endogeneity of “Guilty”. Therefore, these coefficients should not be used to predict bias-corrected OE. The coefficients that are estimated from the structural regression (Table 7) are expected to be unbiased and may therefore be used to predict bias-corrected OE. However, some information can still be gleaned from the residuals of the former regression as they are potentially correlated with “Guilty”.

Our ultimate objective, which is to obtain good predictions of average bias-corrected OE, requires that we first remove the causal effect of the Z-variables from the raw OE. Once this step is completed, a predictive regression of the cleaned OE onto Y can be used to estimate bias-corrected conditional means of OE. This strategy, which combines the strengths of the structural and reduced-form approaches, is presented below.

To begin, we estimate the second-step regression by excluding the “Guilty” dummy variable (W). The estimated equation is:

$$ \log {\hat{\uptheta }}_{\text{i}} = {\hat{\upbeta }}_{0} + {\text{Y}}_{\text{i}} {\hat{\upbeta }}_{1} + {\text{Z}}_{\text{i}} {\hat{\upbeta }}_{3} + \widehat{\text{imr}}_{\text{i}} {\hat{\upbeta }}_{4} + {\text{e}}_{\text{i}} . $$
(13)

The exclusion of W is justified by our previous finding that this variable has no causal effect on the overcharge. Next, we infer bias-corrected OE as:

$$ \log {\hat{\uptheta }}_{{{\text{bc}},{\text{i}}}} = \log {\hat{\uptheta }}_{\text{i}} - {\text{Z}}_{\text{i}} {\hat{\upbeta }}_{3} . $$
(14)

Finally, \( \log {\hat{\uptheta }}_{{{\text{bc}},{\text{i}}}} \) is regressed on Yi, Wi, and \( \widehat{\text{imr}}_{\text{i}} \):

$$ \log {\hat{\uptheta }}_{{{\text{bc}},{\text{i}}}} = {\hat{\upbeta }}_{0} + {\text{Y}}_{\text{i}} {\hat{\upbeta }}_{1} + {\text{W}}_{\text{i}} {\hat{\upbeta }}_{2} + \widehat{\text{imr}}_{\text{i}} {\hat{\upbeta }}_{4} + {\text{e}}_{\text{i}} . $$
(15)

The variable W is included in regression (15) for the purpose of exploiting its predictive power. Equation (13) is a causal regression while Eq. (15) is a predictive regression. Table 8 shows the estimated coefficients for Eq. (15).

Table 8 Predictive model of bias-corrected OEs

We see that the “Guilty” dummy variable has significant predictive power for bias-corrected OEs. This predictive power was missed by the structural regression in Table 7. The negative coefficient of the “domestic” dummy variable (−0.28, significant) is stronger than in the predictive model (−0.20, non significant) but weaker than in the structural model (−0.37, significant). Cartels that operate in the EU have lower overcharges than do those from other geographical markets.

The bias-corrected OEs of effective cartels are inferred by estimating Eq. (15) and using Eq. (10). First, one estimates Eq. (13) using the logarithm of the raw OE as the regressand. Second, one uses Eq. (14) to obtain the bias-corrected OE for each cartel. Third, one estimates Eq. (15) using the log of the bias-corrected OE as the regressand. Finally, one uses Eq. (10) to predict the bias-corrected overcharge conditional on Y and W.
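The four steps above can be sketched as follows. This is a minimal illustration, not the estimation code behind the paper's tables: it assumes that the IMR has already been obtained from the first-step Probit, it uses a small hand-written least-squares solver, and all function names and the toy data are hypothetical.

```python
from math import exp

def ols(X, y):
    """Least-squares coefficients via the normal equations (small problems)."""
    k = len(X[0])
    xtx = [[sum(r[a] * r[b] for r in X) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(k)]
    aug = [xtx[a] + [xty[a]] for a in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [v - f * p for v, p in zip(aug[r], aug[col])]
    return [aug[a][k] / aug[a][a] for a in range(k)]

def bias_corrected_predictions(log_oe, Y, W, Z, imr):
    n, ky, kz = len(log_oe), len(Y[0]), len(Z[0])
    # Step 1, Eq. (13): regress log OE on a constant, Y, Z and the IMR.
    X13 = [[1.0] + Y[i] + Z[i] + [imr[i]] for i in range(n)]
    beta3 = ols(X13, log_oe)[1 + ky:1 + ky + kz]
    # Step 2, Eq. (14): remove the estimated contribution of Z.
    log_bc = [log_oe[i] - sum(z * b for z, b in zip(Z[i], beta3))
              for i in range(n)]
    # Step 3, Eq. (15): regress the cleaned log OE on a constant, Y, W, IMR.
    X15 = [[1.0] + Y[i] + [W[i]] + [imr[i]] for i in range(n)]
    b15 = ols(X15, log_bc)
    resid = [log_bc[i] - sum(x * b for x, b in zip(X15[i], b15))
             for i in range(n)]
    sigma2 = sum(e * e for e in resid) / (n - len(b15))
    # Step 4, Eq. (10): predict without the IMR term.
    preds = [exp(sum(x * b for x, b in zip(X15[i][:-1], b15[:-1]))
                 + sigma2 / 2.0) for i in range(n)]
    return beta3, b15, preds

# Toy data generated without noise, so the known coefficients are recovered.
Y = [[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]]
Z = [[1.0], [0.0], [1.0], [0.0], [0.0], [1.0]]
W = [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
imr = [0.5, 0.2, 0.9, 0.1, 0.7, 0.3]
log_oe = [1.0 + 0.5 * Y[i][0] + 0.3 * Z[i][0] + 0.2 * imr[i] for i in range(6)]
beta3, b15, preds = bias_corrected_predictions(log_oe, Y, W, Z, imr)
```

In this toy setting the Z coefficient of 0.3 is recovered in the first step and stripped out in the second, so the final predictions depend on Y and W only.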

Our previous analysis suggested that the raw OE data permit us to distinguish an effective cartel from an ineffective one. Therefore, the initial 0 % OEs are assumed to be free of bias and are left unchanged.

Table 9 replicates Table 1 with bias-corrected OEs as input. For the subsample with initial estimates that lie in the range (0, 49 %], we find a mean overcharge of 16.47 % with a median of 16.17 %. For the subsample of strictly positive OEs, the mean is 16.68 % while the median is 16.17 %. For the whole sample (including the zeros), the predicted mean is 15.47 % while the median is 16.01 %.

Table 9 Means and medians of bias-corrected OEs per location and types of cartels

For US cartels, we find a mean bias-corrected OE of 15.69 % (with a median of 15.04 %) for the subsample that is used for estimation and 14.36 % (with a median of 14.48 %) for the whole sample. For EU cartels, the corresponding figures are 14.05 % (13.67 %) and 13.51 % (14.08 %). Moreover, we find a mean bias-corrected OE of 17.71 % (with a median of 18.66 %) for international cartels and 12.93 % (with a median of 13.68 %) for domestic cartels. Finally, we find that post-1973 cartels achieved higher bias-corrected mean overcharges (16.07 %) than did pre-1973 cartels (13.97 %). Overall, the mean and median bias-corrected OEs that are shown in Table 9 suggest a more homogeneous behaviour of cartels than did the means and medians of raw OEs that were shown in Table 1.

Table 10 presents the mean bias-corrected OE for different categories of cartels according to whether they are domestic or international, in bid-rigging cases or not, and/or were found or pleaded guilty or not. Table 11 presents the median bias-corrected OE for the same subgroups. The differences between the raw OEs (Table 1 and the left-hand side of Tables 10 and 11) and the bias-corrected OEs (Table 9 and the right-hand side of Tables 10 and 11) are quite striking. In several cases, the bias-corrected OE is less than half the raw OE. Our results support the idea that cartels are overall more similar than the raw OE data might suggest.

Table 10 Raw versus bias-corrected mean OE
Table 11 Raw versus bias-corrected median OE

9 Analysis of Variance for the OE Bias

In this section, we attempt to understand which of the Z variables cause the most bias in the raw OE. For that purpose, we define the bias as the difference between the log of the raw OE and the log of the bias-corrected OE:

$$ {\hat{\Delta }}_{\text{i}} = \log {\hat{\uptheta }}_{\text{i}} - {\hat{\omega }}_{\text{i}} , $$

where \( {\hat{\omega }}_{\text{i}} \) is the fitted value of Eq. (15). We regress \( \hat{\Delta }_{\text{i}} \) on a constant and all Z dummy variables, keeping “econometric method” and “government report” as reference groups.

We run separate regressions for effective cartels, cartels with OE ≤ 49 %, and cartels with OE > 49 % (see Table 12). The percentage of explained variance is 8 % for all effective cartels, 5 % for cartels with OE ≤ 49 %, and 16 % for cartels with OE > 49 %. Thus, the R-square roughly triples as we move from the subsample with OE ≤ 49 % to the one with OE > 49 %. This supports our previous finding that the raw OEs that lie above 49 % are substantially more biased than the remainder of the sample.

Table 12 Determinants of the bias contaminating the raw OE

We perform a Chow test for the stability of the coefficients across the two subsamples. The test statistic is given by:

$$ F = \frac{{\left( {1194.55 - 543.29 - 110.28} \right)/16}}{{\left( {543.29 + 110.28} \right)/\left( {1038 - 2 \times 16} \right)}} = 52.04 $$

Under the null hypothesis that the coefficients are the same for the two subsamples, the F statistic follows a Fisher distribution with (16, 1006) degrees of freedom. At the 1 % significance level, the critical value of the Fisher distribution with (16, 1006) degrees of freedom is 1.9832. Hence, the null hypothesis is overwhelmingly rejected.
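The test statistic above can be reproduced with a few lines of arithmetic. The sketch below reads the three numbers in the formula as the residual sums of squares of the pooled regression and of the two subsample regressions, which is the standard form of the Chow statistic; the function and argument names are ours.

```python
def chow_f_statistic(ssr_pooled, ssr_group1, ssr_group2, n_params, n_obs):
    """Chow F statistic for equal coefficients across two subsamples."""
    numerator = (ssr_pooled - ssr_group1 - ssr_group2) / n_params
    denominator = (ssr_group1 + ssr_group2) / (n_obs - 2 * n_params)
    return numerator / denominator

# Values taken from the formula in the text: 16 coefficients, 1038 cartels.
f_stat = chow_f_statistic(1194.55, 543.29, 110.28, n_params=16, n_obs=1038)
print(round(f_stat, 2))  # 52.04
```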

On the subsample of all effective cartels, all estimation methods (except HISTOR, LEGAL, and OTHER) are biased upward relative to the ECON method; the YARDST and PWAR methods seem to be the most important sources of positive bias for that subsample; OEs that were published in government reports (GOVREP) are biased upward relative to those that were published in academic journals, court decisions, and working papers.

For the subsample of cartels with OE > 49 %, OTHER is less biased than is ECON, while YARDST is more biased; all of the other methods entail a bias similar to that of the ECON method; OEs that were published in government reports are biased upward relative to those that were published in all other media.

For the subsample of cartels with OE ≤ 49 %, PBEFOR, PWAR, and YARDST are more biased than is ECON; OEs that were published in government reports are less biased relative to those that were published in all other media.

YARDST is the only variable whose effect is significant and of the same sign for both subsamples. For all other variables, the effect is either significant for one subsample only or of opposite signs for the two subsamples.

10 Conclusion

Our study identifies the mean and median overcharges of cartels by performing a meta-analysis on an extended version of the database that is used in Connor (2010). Each observation in the sample is a potentially biased overcharge estimate (OE) that is obtained from a previous study.

Three groups of variables describe the observations: the first group consists of variables (Y) that explain the true overcharge. The second group (Z) consists of factors that capture potential estimation biases. The third group contains a single variable (W) that indicates whether the alleged cartel pleaded or was found guilty. This last variable is found to be endogenously related to the true overcharge. Our study bias-corrects the raw OEs by cleaning them from the contribution of the Z variables.

In order to assess the quality of the data, we used a Kullback–Leibler divergence to compare the probability that an OE is larger than θ conditional on (Y, W) to the same probability conditional on (Y, W, Z). The divergence between the two probabilities increases slowly on the range \( \uptheta \in \left[ {0,65\,\% } \right] \), but a large jump occurs at θ = 65 %. Although the results suggest the presence of bias in the raw OEs, this jump appears to be driven by other data problems.

We pursued our empirical investigations by estimating Heckit models for the log of OE on subsamples of type (0, θ]. If the OE data were unbiased, the average bias-corrected OE for the subsample of cartels with \( OE \in \left( {0,\uptheta} \right] \) should be increasing in θ and lower than the average bias-corrected OE inferred for all successful cartels. Also, the curve of the average bias-corrected OE of all successful cartels should be flat in the upper range of θ. We find the latter condition to be violated. The curve of the average bias-corrected OE of all successful cartels is decreasing on θ ≤ 49 % and increasing on θ > 49 %.

Acting on these results, we estimate our final meta-analysis model on the subsample of \( OE \in \left( {0,49\,\% } \right] \). We employ a Heckit procedure to infer the bias-corrected OEs of all effective cartels (those with strictly positive OEs). The raw OEs that are equal to zero are included back unaltered in the sample that is used for prediction. Our meta-analysis delivers mean and median bias-corrected OEs of 16.47 and 16.17 % for the subsample with initial estimates lying in the range (0, 49 %], of 16.68 and 16.17 % for the subsample of effective cartels, and of 15.47 and 16.01 % for the whole sample.

Our results have significant implications for antitrust policy. Indeed, a major element in the prosecution of cartels is their capacity to exert upward pressures on prices. Becker (1968) and Landes (1983) examined the link between the cartel overcharge and the fine in a static game framework. Both authors concur that the optimal fine is equal to the illegal profit of the cartel divided by the probability of detection.
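The static rule just described amounts to a one-line computation, sketched below. The function name and the numbers in the example are purely illustrative and are not taken from any cartel case or from the studies cited.

```python
def becker_landes_fine(illegal_profit, detection_probability):
    """Static optimal fine: illegal profit divided by detection probability."""
    if not 0.0 < detection_probability <= 1.0:
        raise ValueError("detection probability must lie in (0, 1]")
    return illegal_profit / detection_probability

# Hypothetical example: a profit of 10 (in any currency unit) and a
# one-in-four chance of detection imply a fine of 40.
example_fine = becker_landes_fine(10.0, 0.25)
```

The lower the probability of detection, the larger the multiple of the illegal profit that the rule prescribes.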

Allain et al. (2011) argued that the Becker–Landes rule must be interpreted with caution in a dynamic framework. They show that the optimal fine can be computed as either the annual illegal profit divided by the annual probability of detection, or the cumulative illegal profit over the lifetime of the cartel divided by its lifetime probability of detection.

Allain et al. (2015) and Harrington (2014) considered infinitely repeated games where a threat of deviation by a cartel member exists. Antitrust authorities may make the deviation profitable by granting partial or full leniency to the whistleblower. In such dynamic games, these authors show that the amount of the optimal fine is much lower than the correctly interpreted Becker–Landes rule suggests. Katsoulacos and Ulph (2013) conducted an analysis that accounts for the timing of antitrust authorities’ decisions and found that the optimal cartel fine is approximately 75 % of the amount implied by the conventional formula.Footnote 14

The mean and median bias-corrected OEs that are obtained from our analysis have little to say about any specific cartel case where an overcharge is used as a measure of antitrust damage. The true overcharge in a given case depends on the specific set of facts with regard to the challenged conduct, the structure of the industry to which the cartel belongs (e.g., sector, concentration, elasticity of demand), etc. In addition to the previous factors, the estimated overcharge depends on the availability and quality of the data, the method that was used to estimate the but-for price, etc. Hence, the analysis that is conducted in this paper could be improved if more data on cartels become available.