1 Introduction

Many scholars believe that there is a trial penalty for those who decide to invoke their Sixth Amendment right to a trial (Rubinstein and White 1978; Brereton and Casper 1982; Holmes et al. 1992; Dixon 1995; Johnson 2003; Ulmer and Bradley 2006). Jury sentencing supports the existence of a trial penalty if the sentence length is more severe than in comparable cases under guilty pleas and bench trials (King and Noble 2004). Some studies find evidence of a trial penalty (Ulmer et al. 2010; Ulmer and Bradley 2006; King and Noble 2005; Johnson 2003). On the other hand, Breen (2011) finds the exact opposite: in military courts, judges impose tougher sentences on defendants than juries do.

Using data on murder cases in 33 large urban counties, this article finds evidence of a substantial trial penalty. The article uses Tobit regressions with a random effects design to control for idiosyncratic errors (i.e., unobserved differences) by county and state as well as to control for the lower bound of zero on sentence lengths. The findings in this article show that some defendants might face differential treatment under alternative trial settings. For example, defendants convicted by juries face an 11-year trial penalty on their sentences over plea bargains. Further, we find that judges might treat defendants with prior convictions and gang affiliations more fairly than juries, since juries are more likely to be swayed emotionally in court. These findings could be particularly useful to defense attorneys who could use them to make better choices between plea bargains and jury and bench trials.

Plea bargaining, the practice of pleading guilty to a less severe charge, is a very popular outcome in the criminal justice system. The research on plea bargaining commonly estimates that 90 % of all convictions in the criminal courts are the result of guilty pleas (Alschuler 1981). A study in 1962 on 132 state county courts showed that 70 % of cases were decided by plea bargaining. The same study also found 73 % of defendants in US district courts in 1967 chose guilty pleas (Landes 1971). For the data analyzed in this paper, the percentage of convictions that result from plea bargains is 54 %.

One major concern with plea bargaining is that innocent defendants might plead guilty. This fear often sparks a heated debate (Alschuler 1981). The issue is further exacerbated if one considers risk averse agents. These agents are more likely to accept plea bargains because the risk of being convicted and facing a larger sentence is not worth the gamble, even if the probability of conviction is small. Zeisel (1980) supports this incentive scheme by showing that the sentences of New York City defendants convicted at trial were 136 % more severe than the plea bargains the prosecutors had proposed to the same defendants. Defendants have incentives to plead guilty, even when innocent, under certain circumstances because a prosecutor might offer a deal that reduces the potential sentence of the trial to such a degree that it is almost too good to pass up (Bar-Gill and Ayal 2004).

Proponents of plea bargaining cite the need for this system as a way to successfully navigate through the slew of trials the courts face. The guilty-plea system has grown as a product of circumstances, not by choice. Today there is an administrative crisis in criminal courts largely due to the increasing volume of crime in recent decades, the regulation of human activities that were formerly beyond the scope of criminal law, and the substantially increased length of the average felony trial (Alschuler 1976). This process unambiguously lightens the workload of judges, prosecutors, and defense lawyers, but that does not necessarily mean the practice is acceptable. After all, the United States Constitution provides the right to a trial by jury, and some believe that a plea bargain is a mechanism that eliminates this right because it increases the costs of the trial (Lynch 2003). The other widely cited issue is the preservation of resources such as time, money, and even the effort by those involved in the judicial process. However, while saving resources is a top priority for the courts, this justification of plea bargaining raises difficult constitutional problems (Grossman and Katz 1983). Footnote 1

The remainder of the article is organized as follows: Section 2 presents the case of Bordenkircher v. Hayes. Section 3 provides the motivation behind the importance of analyzing plea bargaining. Section 4 describes the data. Section 5 addresses the question of whether there is a trial penalty. Section 6 explores the conviction outcomes of defendants under the three alternative trial settings, and Section 7 provides concluding remarks and possible extensions for future research.

2 Bordenkircher v. Hayes

In the 1920s, the legal profession mostly opposed plea bargaining (Pound 1980). However, the United States started to depend more on plea bargains and, consequently, attitudes began to change. As early as the 1970s, the legal profession was united in defending plea bargaining because of the cost concerns that had arisen in the preceding decades (Alschuler 1976).

The famous case of Bordenkircher v. Hayes established the precedent for plea bargaining. The defendant, Paul Lewis Hayes, was charged with forgery. This offense carried a two- to ten-year prison sentence. The prosecutor offered a plea bargain of five years in exchange for Mr. Hayes pleading guilty. More importantly, the prosecutor also stated he would indict Mr. Hayes under the Kentucky Habitual Criminal Act if he did not accept the plea bargain. This indictment was possible because Mr. Hayes had two prior felony convictions. If found guilty, Mr. Hayes would serve life in prison under this indictment rather than the usual sentence of two to ten years. Mr. Hayes did not accept the plea bargain. As warned, the prosecutor followed through with his promise of indictment, and Mr. Hayes was found guilty at trial (Lynch 2003).

Mr. Hayes appealed the lower court’s decision, arguing that the prosecutor violated the Fourteenth Amendment’s due process clause by carrying out a threat made during plea negotiations to punish him for simply invoking his right to trial. The government admitted that the only reason the indictment was threatened was to deter Mr. Hayes from exercising his right to trial. However, they maintained they did nothing improper since the indictment was supported by the evidence. Ultimately the case reached the United States Supreme Court for final resolution. In a landmark 5–4 ruling, the Supreme Court approved the lower court’s handling of the case and supported Mr. Hayes’s sentence of life imprisonment (Lynch 2003). Justice Potter Stewart wrote that “threatening a stiffer sentence is permissible and part of any legitimate system which tolerates and encourages the negotiation of pleas. These threats do not violate the Fourteenth Amendment’s Due Process Clause because the defendant had the opportunity to avoid the risk of being convicted by accepting guilt in the plea bargain.”

3 Motivation

Defendants have a decision to make. They can decide to accept a plea bargain or they can reject it and go to trial. Plea bargaining often results in a lesser charge. Therefore, the defendants must consider whether they prefer the expected jail sentence if convicted by trial to the plea bargain, and must make their decisions accordingly. Formally, the defendant’s decision to accept a plea bargain can be modeled as:

$$ p*0+\left(1-p\right)*\mathrm{Max}\;\mathrm{Years}\le \mathrm{V}\left(\mathrm{Plea}\right) $$
(1)

where V(Plea) is the plea bargain’s value to the defendant, p is the probability of being acquitted, and (1-p) is the probability of being found guilty. Equation 1 says that a risk neutral defendant should compare the expected sentence at trial with the value of the plea. This model assumes defendants care only about obtaining the smaller expected sentence time. Therefore, the model predicts that defendants who truly believe they are innocent will not accept the plea bargain. Such defendants will have a p close to one and consequently a very small value for the left-hand side of Eq. 1, such that the expected value of standing trial is almost certainly smaller than the value from plea bargaining because the prosecutor faces time, budget, and federal guideline constraints in sentencing.

At least, that is how the system is designed to work. Sometimes prosecutors can offer deals that are almost “too good to pass up.” Clearly, if prosecutors can offer any deal to defendants, even those who assert their innocence might accept guilty pleas if the costs are low, especially if they are risk averse. In the past, the courts have made some attempts to limit the power of the prosecutors. Federal judges have made prosecutors adhere to the federal sentencing guidelines. These guidelines limit the maximum sentence reduction to 25 % of the original sentence. Similar guidelines exist in the state courts. However, United States v. Booker changed these guidelines from binding to advisory. As a result, many deals have reductions much greater than 25 % of the sentence required from a conviction.

The plea bargaining system also affects the prosecutors. Prosecutors are more likely to offer a plea bargain to a defendant when they have less evidence against the defendant or need the defendant to testify in another (usually larger) trial.

4 Data

The data in this study come from the US Department of Justice’s Bureau of Justice Statistics. The data cover murder cases in 1988 for 33 large urban counties in the United States. While the data are cross-sectional, there is variation across counties.

Table 1 provides the descriptive statistics for the data set. In total there are 3144 observations. The statistics show that 40 % of defendants accept plea bargains (1250). Of the remaining defendants, 25.5 % of the total are found guilty by a jury, only 8.5 % are found guilty by a judge, and the remaining 26 % are found not guilty. Of the total defendants, 59 % are black, 38 % are white, 18 % are Hispanic, and 1.2 % are Asian. In addition, only a small number of defendants have an affiliation with a gang (4.5 %) or have a history of mental disorders (4.5 %). However, almost half (48.6 %) of the defendants have at least one prior conviction. Women comprise only 11 % of the data, and 12 % of all defendants are unemployed. The majority of the cases face Class 1 charges as the primary and major charge (93 %). Class 1 charges are defined here as the crimes with the most severe punishments, such as first, second, and third degree murders as well as first degree manslaughter. Class 2 charges include all of the other charges, from accessory to murder to armed violence and everything in between. Footnote 2
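The outcome shares in this paragraph can be reconciled with the 54 % figure quoted in the introduction, which is expressed as a share of convictions rather than of all defendants. The short sketch below is only an illustrative check on the rounded percentages reported in the text; the dictionary keys and the function name are hypothetical labels and not variable names from the data set.

```python
# Outcome shares as reported in the text (percent of all defendants).
# The key names are illustrative labels, not names from the data set.
outcome_shares = {
    "plea": 40.0,
    "guilty_jury": 25.5,
    "guilty_judge": 8.5,
    "not_guilty": 26.0,
}


def share_of_convictions(shares, outcome):
    """Return one outcome as a percentage of all convictions (acquittals excluded)."""
    convicted = sum(value for key, value in shares.items() if key != "not_guilty")
    return 100.0 * shares[outcome] / convicted


plea_share = share_of_convictions(outcome_shares, "plea")
print(f"Plea bargains as a share of convictions: {plea_share:.1f} %")
```

Because the inputs are rounded percentages, the result should be read as approximately 54 % rather than as an exact figure.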

Table 1 Summary statistics

The analysis also divides the data by the demographic composition of each alternative trial setting. Figure 1 below illustrates the racial composition of each of the three settings: (1) bench trial, (2) jury trial, and (3) plea bargain. Black defendants make up a larger portion of bench trials while white defendants and Hispanic defendants make up a smaller portion of bench trials.

Fig. 1 Demographic composition of jury trial, bench trial, and plea bargain

The primary variable of interest in these data is the length of a sentence, as recorded by the number of years. This variable, in conjunction with data on acquittals, can help inform lawyers on what the expected sentence might be in a given type of trial. For instance, acquittals comprise about 25 % of the outcomes, and the mean sentence conditional on guilt by trial is 19 years. These numbers can be inserted into Eq. 1 as 0.25*0 + 0.75*19, which gives an expected sentence of 14.25 years from going to trial, whereas the plea bargain is only 8 years. Thus, because the expected punishment by trial exceeds the plea bargain, defendants are likely to accept plea bargains. In this sense, the plea bargain relies on the trial penalty to persuade defendants to choose it over the right to trial.
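The calculation in this paragraph can be written out as a small sketch. It assumes a risk neutral defendant who simply compares the expected trial sentence with the plea offer in years, as the text describes; the function and argument names are hypothetical and are introduced only for illustration.

```python
def expected_trial_sentence(p_acquit, years_if_convicted):
    """Expected sentence from going to trial: p * 0 + (1 - p) * years."""
    if not 0.0 <= p_acquit <= 1.0:
        raise ValueError("p_acquit must lie between 0 and 1")
    return p_acquit * 0.0 + (1.0 - p_acquit) * years_if_convicted


def prefers_plea(p_acquit, years_if_convicted, plea_years):
    """Risk neutral rule: accept the plea if it is shorter than the expected trial sentence."""
    return plea_years < expected_trial_sentence(p_acquit, years_if_convicted)


# Numbers quoted in the text: 25 % acquittals, 19 years if convicted, 8-year plea.
expected = expected_trial_sentence(0.25, 19.0)
print(expected, prefers_plea(0.25, 19.0, 8.0))
```

Under the same rule, a defendant who expects acquittal with high probability, for example 0.9, would reject the same 8-year offer, since the expected trial sentence falls to about 1.9 years.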

5 Sentencing under alternative legal options

Many scholars contend that there is a penalty for those who decide to invoke their Sixth Amendment right to a trial (Rubinstein and White 1978; Brereton and Casper 1982; Holmes et al. 1992; Dixon 1995; Johnson 2003; Ulmer and Bradley 2006). King and Noble (2004) suggest that sentencing by a jury supports the trial penalty if the sentence length is more severe than in comparable cases with guilty pleas or in bench trials. Figure 2 gives the data on sentence lengths that are contingent on guilt.

Fig. 2 Sentence length (in years) by trial setting

Figure 2 displays the average sentence length in number of years under each sentencing option. Breaking down the sentencing data into categories of alternative trial settings serves as a first step towards analyzing the trial penalty. As Fig. 2 illustrates, defendants found guilty through trials serve sentences that are on average roughly 1.7 times as long as those of defendants who accept plea bargains. This is expected because the plea involves a lesser charge. However, this finding also illustrates the magnitude of the trial penalty. Under plea bargains, defendants on average face a sentence of 11.1 years. Defendants found guilty by trial, by contrast, face an average sentence of 18.92 years, or a 70 % penalty for opting to go to trial.

Figure 2 also splits the trials into jury trials and bench trials (judge). The figure shows that defendants who are found guilty by juries face an average sentence of 22.3 years. In contrast, defendants found guilty by judges face a sentence of only 11.25 years. Thus, the data show that there is no trial penalty when facing a judge, but there is a substantial trial penalty when facing a jury.
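The raw gaps implied by these averages can be computed directly. The sketch below uses only the mean sentence lengths quoted in the text for Fig. 2; the dictionary keys and the function name are illustrative, and the comparison is unadjusted for defendant characteristics.

```python
# Mean sentence lengths in years, as quoted in the text for Fig. 2.
mean_sentence_years = {"plea": 11.1, "trial": 18.92, "jury": 22.3, "judge": 11.25}


def penalty_over_plea(setting, means=mean_sentence_years):
    """Return (extra years, percent increase) relative to the mean plea sentence."""
    extra = means[setting] - means["plea"]
    return extra, 100.0 * extra / means["plea"]


for setting in ("trial", "jury", "judge"):
    extra_years, percent = penalty_over_plea(setting)
    print(f"{setting}: {extra_years:+.2f} years ({percent:+.0f} %)")
```

These are simple differences in means, so they should not be read as estimates of the trial penalty once defendant and case characteristics are held constant.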

There are costs and benefits associated with a bench trial. For instance, a defendant might choose a bench trial if he or she desires a quicker resolution, believes a jury might be swayed emotionally, or faces doubts that a jury can handle an extremely complex legal rule. On the other hand, there are certain risks. In a jury trial, the prosecution must convince the entire jury of the defendant’s guilt. There is only the judge to convince in a bench trial, which makes it a risky proposition. However, the judge will follow the legal rules and is not likely to be swayed by popular opinion. This neutrality might work for some defendants but against others, particularly in the context of crimes that involve emotional or political contexts.

Figure 2 presents the first stage in the analysis of the trial penalty under alternative trial settings. No further analysis would be needed if individuals were randomly assigned to each option. However, this is not the case. Therefore, the analysis is extended with a Tobit regression:

$$ S_{it}^{\ast} = \delta C_{it} + x_{it}^{\prime}\beta + \alpha_{i} + \varepsilon_{it} $$
(2)

where α_i ~ N(0, σ_α²) and ε_it ~ N(0, σ_ε²), and the regressor vector x_it includes an intercept. For left censoring at zero, we observe the variable S_it:

$$ S_{it} = \begin{cases} S_{it}^{\ast} & \text{if } S_{it}^{\ast} \ge 0 \\ 0 & \text{if } S_{it}^{\ast} \le 0 \end{cases} $$

where S is the measure of sentence length, C is the vector of interest that contains five conviction dummies, X is a matrix composed of control variables purported to influence sentence lengths, α is the idiosyncratic error (i.e., the unobserved differences between cities), ε is the error term, and β and δ are the coefficients to be estimated.
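To make the censoring mechanism concrete, the following is a minimal sketch of the log-likelihood of a Tobit model that is left censored at zero. It is a simplification and not the estimation code behind the tables: it pools all observations and omits the random effect α, it folds the conviction dummies into a single regressor list, and the toy data and parameter values are invented purely for illustration.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def tobit_loglik(beta, sigma, X, y):
    """Log-likelihood of a pooled Tobit left censored at zero (no random effect).

    beta  : coefficients, one per column of X (X should include an intercept column)
    sigma : standard deviation of the error term, must be positive
    X     : list of regressor rows
    y     : observed sentence lengths; zeros are treated as censored
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    total = 0.0
    for row, y_i in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, row))
        if y_i > 0:
            # Uncensored observation: normal density of the residual.
            total += math.log(norm_pdf((y_i - xb) / sigma) / sigma)
        else:
            # Censored observation: probability that the latent sentence is at or below zero.
            total += math.log(norm_cdf(-xb / sigma))
    return total


# Hypothetical toy data: an intercept and a jury-conviction dummy.
X_toy = [[1, 0], [1, 1], [1, 0], [1, 1]]
y_toy = [10.0, 22.0, 0.0, 20.0]
print(tobit_loglik([10.0, 11.0], 5.0, X_toy, y_toy))
```

In practice the parameters would be chosen to maximize this function, and a random effects version would additionally integrate over the distribution of α for each county.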

The C comprises three judicial options with five alternative measures of conviction: (1) guilty by plea, (2) guilty by trial, (2a) guilty by judge, (2b) guilty by jury, and (3) not guilty. In other words, a defendant can stand trial or accept a plea bargain. If they stand trial, they either face a judge or a jury. The jury or judge then deliberates and finds a defendant either guilty or not guilty. This vector allows for a comparative analysis of conviction options. In order to test for the existence and, more importantly, the magnitude of the trial penalty, n-1 conviction dummies are included in the regression. Because these categories are exhaustive, each one of the dummy variables is compared to the omitted category in the analysis. The X is a set of control dummy variables purported to influence sentence lengths, such as characteristics based on gender, race, prior convictions, drug dealer, mental disorders, gang affiliations, use of alcohol, and unemployment. The following is a description of the covariates used in the model.
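The n-1 coding described above can be sketched as follows. The example assumes the specification in which trial convictions are split by jury and judge, with guilty pleas as the omitted reference category; the label guiltyplea is a hypothetical name chosen to parallel the variable names guiltyjury, guiltyjudge, and notguilty.

```python
# Exhaustive outcome categories; "guiltyplea" is a hypothetical label for the plea outcome.
CONVICTION_CATEGORIES = ["guiltyplea", "guiltyjury", "guiltyjudge", "notguilty"]


def conviction_dummies(outcome, omitted="guiltyplea", categories=CONVICTION_CATEGORIES):
    """Encode one outcome as n - 1 zero/one dummies, leaving out the reference category."""
    if outcome not in categories:
        raise ValueError(f"unknown outcome: {outcome}")
    if omitted not in categories:
        raise ValueError(f"unknown reference category: {omitted}")
    return {name: int(name == outcome) for name in categories if name != omitted}


print(conviction_dummies("guiltyjury"))
print(conviction_dummies("guiltyplea"))  # reference category: every dummy is zero
```

With this coding, the coefficient on each included dummy is read as a difference in sentence length relative to defendants who plead guilty.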

There are five demographic variables: white, black, Hispanic, Asian, and female. The four race and ethnicity variables are dummies that take a value of one if the defendant is white, black, Hispanic, or Asian, respectively, and zero otherwise. Female is a dummy variable that gives a value of one if female and zero if male. Because there might be discrimination based on sex and race in the courtroom, prosecutors might discriminate when proposing plea bargains. Further, individuals might face discrimination depending on the trial type: jury or bench. In addition, there are other control variables that provide background information on defendants: mental and priors. Mental is a dummy with a value of one if the defendant has evidence of a mental illness and zero otherwise. If a defendant has a mental illness, his or her defense attorney might more aggressively seek plea bargains out of fear of the trial process. These defendants may be more likely to plead guilty by insanity and never face trial. Footnote 3

Priors is a dummy with a value of one if the defendant has any prior convictions and zero otherwise. Priors includes previous counts of violence, drug charges, convictions, probation, and incarcerations. One possibility is that defendants with a prior record will be judged more severely in court than others. Therefore, these defendants are more likely to plead guilty, and juries and judges might react differently to a defendant’s criminal history.

In order for there to be a trial penalty, sentence lengths have to be longer, ceteris paribus, for those found guilty under jury trials than by bench trials and plea bargains. In Table 2, a list of dummies is provided to gauge the comparative trial framework. This list is exhaustive with n-1 categories included in each specification. Therefore, the coefficients compare the listed category to the omitted category. Columns (1) and (2) compare defendants convicted by trial to those convicted by plea; the coefficient in column (1) is 8.04. This coefficient indicates that defendants found guilty by trial face on average 8.04 more years than defendants who plea bargain. Similarly, the coefficient of −65.02 on the variable notguilty shows that defendants who plead guilty face a 65-year sentence premium over acquittal. The trial option is split into two subcategories and examined more closely in columns (3) and (4) of Table 2.

Table 2 Sentencing under three alternative trial settings (Tobit)

Columns (3) and (4) show that when the guilty trial variable is split into (i) guiltyjury and (ii) guiltyjudge, the effects of the trial penalty become clear. Conviction by a jury results in a sentence premium of 11.45 years over guilty pleas, but conviction by a judge is associated with no premium (or discount) over guilty pleas. One reason there may not be any premium or discount in sentencing between bench trials and plea bargains is that judges have the authority to adjust the severity of the sentence based on their expectation that the defendant is guilty.

Table 2 provides many results that support the existence of a trial penalty. For instance, defendants with prior convictions, Class 1 charges, or a history of mental disorders face longer sentences. Prior convictions can serve as a signal of future innocence or guilt. This signal can affect the probability a jury finds a defendant guilty, and it can affect the expected punishment by judges. Class 1 charges are for the most heinous charges. Thus, sentences are much longer. On the other hand, women face shorter sentences on average than their male counterparts. This difference could reflect either that women typically do not commit as many heinous crimes, or that women are treated differently in courts. As columns (2) and (4) illustrate, the inclusion or exclusion of a defendant’s criminal history, severity of crime, and other expected crime predictors does not affect the results in the analysis.

One problem with a Tobit regression is that it does not control for individual level idiosyncrasies in the data (i.e., differences between courts). These differences are typically unobservable and can reflect that courts are located in different cities with different popular views and jury compositions. In an attempt to control for these differences, a Tobit model with random effects is specified in Table 3. While a fixed effects estimation of Tobit models is not possible because of the incidental parameters issue (Neyman and Scott 1948), a random effects model is appropriate. Moreover, a Hausman test that compares random and fixed effects models estimated by ordinary least squares (OLS) indicates that the individual specific effect is indeed uncorrelated with the other explanatory variables. Footnote 4 Therefore, if random effects modeling is appropriate for OLS, then this provides evidence that random effects modeling is also appropriate for a Tobit. In particular, a random effects Tobit captures some of the variation within each of the 33 large urban counties.

Table 3 Sentencing under three alternative trial settings (Tobit with random effects)

Overall, the results are robust to the inclusion of random effects in Table 3 both in statistical significance and economic magnitude. While there is a substantial trial penalty for trial by jury, bench trials do not have the same penalty. Consistent with the previous results, prior convictions, a history of mental disorders, and Class 1 charges are all positively related to the sentence’s length. Women also face shorter sentences on average.

6 Who finds whom guilty?

One additional question is whether defendants are treated differently under alternative trials. One possibility is that judges and jurors decide to treat a defendant differently. For instance, defendants with prior convictions or with mental disorders might be discriminated against by jurors, and judges might be more willing to look only at the facts in the case.

In order to address this comparative question, we use a probit model:

$$ P\left(G_{i}=1\right)=\theta\left(\alpha + X^{\prime}\beta\right) $$
(3)

where G is a vector of criminal trial convictions due to (i) plea bargains and (ii) guilty by trial; θ(·) is the standard normal cumulative distribution function, as is standard in a probit model; α is the intercept; β is a vector of the coefficients; and X is a vector of the controls that represent exogenous decision factors purported to influence an individual’s choice between pleading guilty or going to trial.

This model analyzes the probability a defendant will plea bargain based on observable characteristics. The dependent variable, G, is a binary variable, with a value of one if the defendant accepts a guilty plea to a lesser charge, and zero otherwise. It also analyzes the probability that a defendant will be found guilty in a jury or bench trial.
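As an illustration of how Eq. 3 turns covariates into a probability, the sketch below evaluates the probit index for a single defendant, taking θ(·) to be the standard normal cumulative distribution function. The coefficient values are hypothetical and are not estimates from Table 4; the function names are likewise introduced only for this example.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probit_probability(alpha, beta, x):
    """P(G = 1 | x) for one defendant under a probit model with intercept alpha."""
    index = alpha + sum(b * x_k for b, x_k in zip(beta, x))
    return norm_cdf(index)


# Hypothetical coefficients on two dummies (mental disorder, Class 1 charge).
alpha_hat, beta_hat = 0.0, [0.5, -0.4]
baseline = probit_probability(alpha_hat, beta_hat, [0, 0])
with_mental = probit_probability(alpha_hat, beta_hat, [1, 0])
print(baseline, with_mental - baseline)
```

The difference between the two probabilities shows why probit coefficients are not themselves changes in probability: the effect of switching a dummy from zero to one depends on where the index sits on the normal curve.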

The goal in selecting variables for the model is to choose a set of covariates that accurately characterize defendants. In particular, it is interesting to analyze variables that carry negative connotations and might therefore prejudice a court. A defendant might indeed be innocent but still face hardship because the court is judging him or her before the trial begins. A similar motivation applies to the differential treatment between juries and judges in columns 3 and 4 of Table 3. Common examples are priors, mental disorders, and possibly race or sex. By including these variables, the model should capture this effect.

Table 4 reports the probit estimates under alternative legal options. Column (1) shows that defendants who have mental disorders are more likely to take plea bargains, and this result is statistically significant. One plausible explanation for this result is that defendants with mental disorders are more likely to get better deals from the prosecutor if they agree to receive help for their mental disorders.

Table 4 Probit estimates of guilty convictions under three alternative trial settings

The results also suggest that defendants facing the most heinous crimes, that is, Class 1, are less likely to be involved in plea bargains. This result could reflect that defendants facing Class 1 charges are less likely to accept plea bargains, or prosecutors are not as likely to give plea bargains to these defendants. The likely explanation is some combination of the two possibilities. Thus, the defendants who face Class 1 charges are more likely to be found guilty by trial (column 2), and the results in columns 3 and 4 show that they are more likely to be found guilty by a jury rather than by a judge.

A city’s per capita income is positively correlated with plea bargains, and it is statistically significant. One explanation is that wealthier defendants can hire better defense attorneys to face the prosecutor. Knowing this information, the district attorney might offer a bigger plea bargain, and the defendants might have a harder time refusing a better looking plea bargain.

Analyzing trial convictions in column 2 indicates that defendants with prior convictions are more likely to be found guilty by trial. Looking deeper into the type of trial, that is, jury or bench, the results show that these defendants are more likely to be found guilty by juries, and there is no relation between defendants with prior convictions and bench trials. This finding might indicate that defendants with priors face differential treatment from juries and judges.

A defendant’s gender can also matter in the courtroom. The results in columns 3 and 4 show that women receive differential treatment by juries and judges; women are more likely to be found guilty by a judge and less likely to be found guilty by a jury. The results also show that defendants with gang affiliations receive differential treatment from juries and judges. While gang affiliation has no statistical relation to convictions by juries, defendants with gang affiliations are less likely to be found guilty by judges.

Table 5 summarizes the results of the outcomes under the three alternative trial settings by various demographics and predictors. A positive sign indicates that defendants with a given characteristic are more likely to be found guilty, and a negative sign indicates that they are less likely to be found guilty. A zero indicates the variable has no statistical relation. This summary could help defense attorneys advise their clients. For instance, the results indicate that women are less likely to be found guilty by a jury and more likely to be found guilty by a judge. They also face shorter sentences when found guilty. Therefore, women might want to seek a traditional jury system. However, clients with prior convictions and gang affiliations are viewed more harshly by a jury system and might want to consider a bench trial. Race does not seem to matter as much as other defendant characteristics, although Hispanics are less likely to be found guilty by juries. However, as mentioned previously, caution should be used when choosing a bench trial, since a prosecutor must only convince one judge rather than 12 jurors in a traditional jury system.
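The mapping from regression output to the signs in such a summary can be sketched as a simple rule. The significance threshold of 0.10 is an assumption made here for illustration, since the text does not state which level underlies Table 5, and the coefficient and p-value pairs below are invented placeholders rather than estimates from Table 4.

```python
def summary_sign(coefficient, p_value, threshold=0.10):
    """Return '+', '-', or '0' for one estimate; the threshold is an assumed cutoff."""
    if p_value >= threshold:
        return "0"  # no statistical relation at the assumed level
    return "+" if coefficient > 0 else "-"


# Placeholder (coefficient, p-value) pairs for a jury-conviction column; not real estimates.
hypothetical_jury_estimates = {
    "female": (-0.30, 0.02),
    "priors": (0.25, 0.01),
    "gang": (0.05, 0.60),
}

for name, (coef, p_val) in hypothetical_jury_estimates.items():
    print(name, summary_sign(coef, p_val))
```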

Table 5 A defense attorney’s guide: Summary of effects

7 Concluding remarks

Many scholars believe that there is a trial penalty from invoking the Sixth Amendment right to a trial. This trial penalty is present if sentence lengths are systematically longer on average than similar sentences by plea bargains or bench trials. This study makes two key contributions to the literature on empirical legal analysis.

First, the Tobit estimates show that a trial penalty exists for jury trials relative to both plea bargains and bench trials. Specifically, a comparative analysis of the three alternative trial settings estimates the trial penalty between jury trials and guilty pleas to be about 11 years, and no trial penalty is found between bench trials and plea bargains.

Second, when analyzing the question of who is found guilty and by whom, the empirical results show that defendants face discrimination in court, or they are at least treated differently under alternative trial settings. In particular, females are more likely to be found guilty by bench trials and less likely to be found guilty by juries. Defendants with priors are also more likely to be found guilty by juries, and defendants with gang affiliations are less likely to be found guilty in bench trials. Defendants facing Class 1 charges are less likely to accept plea bargains and more likely to be found guilty through trial.

These findings have considerable legal and policy implications. The existence of a trial penalty for jury trials and the absence of a penalty for bench trials indicate that the bench trial setting should be revisited as an alternative to jury trials, particularly when a defendant might face discrimination in a traditional jury system. However, judges are unlikely to be truly impartial, given the economic postulate that incentives matter. Therefore, future research would benefit by comparing bench and jury trials and their incentives in more detail. Further, the findings in this study relate only to murders. Because murder sentencing can involve many years in prison or even the death penalty, the findings in this study could be different from a similar analysis of other crimes.