1 Introduction

It has been well documented that a tax audit triggers taxpayer responses in future periods and that there are both direct effects on the audited taxpayer and indirect effects on those not audited.Footnote 1 The empirical evidence on the direction of the response is somewhat mixed, and a possible cause for this is that taxpayers face noisy information settings. For instance, does the taxpayer know true tax liability and enforcement efforts prior to filing? In practice, tax systems are complex, and tax filers may keep imperfect records of their income and charitable contributions, making true tax liability uncertain. It seems reasonable to expect a link exists between responses to prior audits and the taxpayer’s perceptions regarding enforcement effort and her tax liabilities. For instance, with liability uncertainty, an audit penalty can arise due to an error stemming from tax code complexity or imperfect record keeping rather than a deliberate attempt to evade. However, the general nature of the linkage is not well understood and the present paper seeks to address this gap. In this study we conduct a large-scale laboratory experiment to investigate the behavioral dynamics pertaining to information acquisition and tax evasion in a setting where tax liability is uncertain and the tax agency makes available an information service that, when acquired, reduces liability uncertainty.

Three factors motivate our investigation. First, existing tax systems, especially in the USA, are widely perceived to be complex and to require considerable effort on the part of taxpayers (Slemrod 2007), leading some to suggest that non-compliance is higher due to complexity (Forest and Sheffrin 2002; Krause 2000). Second, individuals make tax reporting decisions repeatedly and in this dynamic setting it is likely that the current period decision is influenced by prior outcomes of the individual and her cohort.Footnote 2 Third, the tax reporting decision is not made independently of the fiscal regime, which can include such factors as trust in the government, social norms, and the responsiveness of the tax agency to the needs of tax filers.

In response to these factors, many tax agencies such as the US Internal Revenue Service (IRS) are exploring the use of complementary tax reporting instruments including the provision of information and filing assistance to taxpayers.Footnote 3 Such services can, for example, take the form of walk-in sites, advice over the telephone and online support documentation (e.g., FAQs and targeted information articles). This augmented paradigm recognizes that tax administrators have a role as facilitators in accurate tax reporting and it opens the possibility that the enforcement and service approaches to enhancing tax reporting can be synergistic. A recent survey suggests that it is important for the IRS to provide assistance services and that taxpayers at least say they use existing services when they need to resolve a tax issue (IRS Oversight Board 2014). The service paradigm for tax administration fits squarely with the perspective that emphasizes the role social norms play in tax compliance (Feld and Frey 2002), and links directly to the behavioral issues that arise in understanding the dynamic interaction between taxpayers and the tax authority. While these service programs may improve the image of the tax authority, the actual effect on tax reporting accuracy is an open question.

Our study bridges two experimental literatures on tax compliance, in particular one concerned with behavioral responses to audits and a second that examines the effects of information services in a setting with uncertain tax liability.Footnote 4 In doing so, we make three contributions to the literature. First, we provide an examination of the effects of information services on tax reporting that vary in terms of both service quality and cost. Second, we examine incentives to acquire information and estimate the willingness-to-pay (WTP) for information services. Third, we test the effects of past audits on tax reporting behavior and information service acquisition.

Recent experimental evidence suggests that taxpayers respond to tax agency-provided information services covering tax liability questions by increasing compliance (Alm et al. 2010; Beck et al. 1996; Vossler and McKee 2017). Our experimental design is similar to Vossler and McKee (2017), who examine a fully revealing service and an imprecise service that reduces by 50% the range of possible true liability amounts. Along with a fully revealing service, we instead examine two services that can reveal two possible liability amounts, each with equal chance of being truthful. This, for instance, characterizes a situation where the taxpayer receives conflicting signals from tax professionals or online information sources. This is implemented as either a “simultaneous” (both amounts revealed upon request) or a “sequential” (up to two information requests and each will provide a different opinion) information service. We show, theoretically, that incentives for acquiring information differ between the two settings, leading to different tax reporting. Moreover, in the context of a loosely related experimental setting, Boyce et al. (2016) study sequential information acquisition and find there is a behavioral tendency to stop searching when the information already obtained is favorable even when this information may be incorrect.

By varying the cost of information services, which has not been done in prior work, we are able not only to demonstrate that these services have value to taxpayers but also to characterize the distribution of WTP for them. This investigation is motivated by a theoretical model, which implies that expected reporting costs decrease when services are acquired, and moreover that cost differences vary based on the quality (precision) of the service. Although the cost in the experiment is monetary, this is intended to capture possible non-monetary costs such as time and effort. For instance, taxpayers may have to wait extended periods of time to reach tax agency representatives by telephone or invest time and effort in filling out tax calculation worksheets.

Prior experimental work that explores behavioral dynamics has often but not universally found that the effect of an audit is to decrease subsequent compliance (Alm et al. 2009; Gemmell and Ratto 2012; Kastlunger et al. 2009; Kirchler 2007; Maciejovsky et al. 2007; Mittone 2006). Aside from the three information service treatments mentioned above, we include in the design comparative uncertainty (no information service) and certainty (no liability uncertainty) baseline settings. As prior work exploring behavioral dynamics has induced certain liability, our design provides insight into whether prior findings extend to this setting, and further what role information services potentially play. Given the oft-noted complexity of the present tax system, tax underreporting can arise either from intent to evade or from errors due to tax liability uncertainty.Footnote 5 It follows that taxpayer information services, which have the potential of promoting a more efficient tax system through these dynamic effects, will be accessed in different ways depending on the taxpayer’s motives. Further, our econometric model and experimental design allow us to distinguish between the effects of penalizing versus non-penalizing audits, using as counterfactuals participants who engaged in similar (compliant or noncompliant) behavior. With the exception of Gemmell and Ratto (2012), prior analyses have instead focused on the aggregate effect of being audited in the past.

Our research utilizes controlled experiments with human decision makers and salient financial incentives in order to test the effects of audits on taxpayer reporting and information acquisition. Within the laboratory, we induce the true tax liability (which is not known with certainty to participants) and then identify the effects of information services (to resolve all or some of the uncertainty) by exogenously varying the setting across groups of players. Experimental data are especially useful here since the laboratory setting allows for control of institutional features (such as the enforcement process) and addresses the problem of being unable to observe the actions of each individual taxpayer. Tastes for evasion are imperfectly observable, and in the field it is difficult to identify counterfactuals, e.g., tax evaders not selected for audit. Further, audits are imperfect in the field and may not correctly reveal the compliance status of those audited. True tax liability is explicitly induced in the laboratory setting, and therefore, the exact amount of evasion is known even for those not audited. Thus, we can compare subsequent reporting behavior for those audited to those not audited for “like” individuals—e.g., those that engaged in evasion. Further, whereas service programs have been introduced in the field, there is not a full spectrum of such programs in existence; such field data as may exist are incomplete.

Our results suggest that, in the presence of uncertain tax liability, audit outcomes impact both the tax reporting and information acquisition decisions. Similar to prior tax compliance studies exploring dynamics, in our certain tax liability treatment the behavioral response to an audit is to increase evasion. This effect is found among compliers as well as non-compliers. We duplicate these results in the uncertain liability setting. However, the behavioral response in the case of a non-penalizing audit is roughly twice as large relative to the certainty setting. Although this interpretation is speculative, this differential response may result from the audit outcome being the consequence of unintended reporting errors rather than deliberate compliance (evasion). Under an information service paradigm, these reactions to an audit are completely mitigated. This finding holds regardless of actual information acquisition, as it is robust across the subsets of respondents who always, sometimes or never request the available service. Immediately following a penalizing audit, information acquisition rates do surprisingly decrease. Given that the overall “intent-to-treat” effect of information services is to decrease evasion, one implication of this finding is that tax authorities should target information services to those previously selected for audit.

2 Experimental design

2.1 Decision setting

Our experimental setting implements fundamental elements of a voluntary tax reporting system. Participants earn income by performing a task and self-report their tax liability to an authority. Final tax liability is a function of earned income, the tax rate, and tax credits claimed. There is a random audit process that performs without error; if the individual has evaded taxes, both the unpaid taxes and a fine are collected.Footnote 6 The audit rate and penalty are public knowledge.

A participant’s earnings for a decision period are her income, minus the taxes she reported, and, if applicable, any penalties. Income is denominated in “lab dollars,” and the overall earnings for the experiment are the sum of the lab dollars earned over all decision periods multiplied by a common (and known) lab dollar-to-US dollar exchange rate. In each period, participants earn income based upon their performance in a simple computerized task, in which they are required to sort numbers into the correct order. Those who finish the task the fastest earn the highest income of 1500 lab dollars for the period, those who finish in the middle of pack earn 1250 lab dollars, and the slowest earn 1000 lab dollars.Footnote 7

After earning income, participants are presented with a screen that informs them of their earnings, the distribution of earnings for others in the experiment, and the tax policy parameters (tax rate, audit probability, and penalty rate).Footnote 8 In each period, the participants decide whether to request a liability information service (if one is available) and how much to claim in tax credit. Although other institutional details are embedded in the design (e.g., tax rate, taxable income, etc.), and in particular the tax form, the participant can only manipulate her tax liability through her credit reporting choice. As there are penalties for tax underreporting, if audited, and foregone earnings associated with over-paying taxes, there is value to resolving any uncertainty regarding the tax credit. The expected tax credit is calculated according to the formula 1000–0.5 \(\times \) (earned income), such that the expected credits equal 500, 375, and 250 for the three income categories {1000, 1250, 1500}. The credit is large relative to the initial tax liability so that the credit decision is financially salient. One important feature of our design is that, aside from penalties, the expected earnings (income minus tax payment) is 1000 lab dollars across all income levels.
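The credit formula and the constant expected earnings can be checked with a few lines of arithmetic. The following is a minimal sketch in Python, assuming the 50% tax rate used in the design; the function names are illustrative and are not taken from the experiment software.

```python
TAX_RATE = 0.5  # tax rate applied to earned income in the design


def expected_credit(income):
    """Expected tax credit in lab dollars: 1000 - 0.5 * earned income."""
    return 1000 - 0.5 * income


def expected_earnings(income):
    """Income minus the expected tax payment, ignoring penalties and information costs."""
    expected_tax = TAX_RATE * income - expected_credit(income)
    return income - expected_tax


for income in (1000, 1250, 1500):
    print(income, expected_credit(income), expected_earnings(income))
```

Because the credit falls by 0.5 lab dollars for every lab dollar of income while the tax rises by the same amount, the two effects offset and expected earnings do not depend on the income category.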

The “true” credit amount is a random draw from a uniform distribution whose support is plus or minus 100% of the expected credit, i.e., from zero to twice the expected credit. The true credit is independent across decision periods and individuals. In treatments with liability uncertainty, the participant’s true credit remains unknown prior to filing unless she acquires information. Given this design, uncertainty (and, hence, the value of resolving it) increases with the expected credit or, equivalently, decreases with income. In the uncertain liability treatments, prior to making a credit choice or acquiring information (if possible), each participant sees the supports of the uniform distribution that correspond to her income. If an information service is available, participants can acquire the information, possibly at a cost, with the click of a button.Footnote 9 While the information acquisition costs (when present) in the experimental setting are monetary, in actuality the cost of obtaining information is more likely to be in the form of time and effort.
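To make the support of the credit distribution concrete, the sketch below computes the bounds for each income category and draws one true credit. It is an illustration only: the function names and the seed are hypothetical, and the actual experiment software may generate draws differently.

```python
import random


def credit_support(income):
    """Bounds of the uniform credit distribution: zero to twice the expected credit."""
    expected = 1000 - 0.5 * income
    return 0.0, 2.0 * expected


def draw_true_credit(income, rng):
    """One draw of the true credit, independent across periods and individuals."""
    low, high = credit_support(income)
    return rng.uniform(low, high)


rng = random.Random(2024)  # arbitrary seed, used only to make the sketch reproducible
for income in (1000, 1250, 1500):
    print(income, credit_support(income), round(draw_true_credit(income, rng), 2))
```

The width of the support shrinks from 1000 to 500 lab dollars as income rises from 1000 to 1500, which is why the low-income category faces the most uncertainty.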

In all sessions, the tax rate is fixed at 50% of earned income, the audit probability is fixed at 30%, and the penalty rate is fixed at 400% and applied to any over-reported credit revealed by an audit. These values are known by the participants.Footnote 10 The penalty rate is consistent with penalties imposed by the IRS for intentional tax evasion. Enforcement effort is held constant since the effects of enforcement efforts have been widely investigated and we only need this effort to be salient in the current setting to give value to the information that resolves tax liability uncertainty. Table 1 summarizes the key parameters of the experiment.

Table 1 Experiment parameters

Participants are able to revise their credit decision prior to filing their return; the tax form updates their tax liability as the claimed credit is revised. Thus, they can observe the potential changes in their reported tax liability for each reporting strategy they investigate. A timer at the bottom of the tax form counts down the remaining time. The participants are allowed 90 s to file, and the counter begins to flash when 15 s remain. Failing to file on time results in a penalty; the level is such that allowing time to run out is a strictly dominated strategy. If a liability information service is available, it can be requested at any time and does not change the total amount of time for a period.

Audits are completely random and independent of whether other persons are audited or the individual’s reported tax liability. The audit process is static in that only the current period tax return is scrutinized and there is no possibility of penalties for (yet undiscovered) past non-compliance nor does a violation lead to a higher future audit probability. The random audit selection process is illustrated by the use of a virtual bingo cage that appears on the computer screen. A box with blue and white bingo balls appears on the screen following the tax filing, and the proportion of blue balls equals the audit probability. The balls begin to bounce around in the box, and after a brief interval a door opens at the top of the box. If a blue ball exits, the participant is audited; a white ball signifies no audit.

When an audit occurs, the true value of the credit is used to determine taxes owed. The individual’s declarations are examined. If the individual has underreported her tax liability, she must make up the difference as well as pay a fine. If an individual has over-reported her tax liability, no overpayments are returned to the individual.Footnote 11 Tax revenues and any penalties paid are not redistributed to the participants in order to ensure that the participants focus on the individual income disclosure decision and not on any public good provision decision. After the tax return is filed and an audit (if any) is determined, the participant is shown one final screen that summarizes everything that happened during the period. After two practice periods to allow subjects to gain familiarity with the interface, the process just described is repeated for a total of 20 paid periods. To minimize potential end-of-game effects, subjects are simply told that the experiment lasts “several” rounds, and the number of periods is not disclosed in advance.
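The payoff rules for a single period can be summarized in a short function. This is a sketch under stated assumptions rather than the experiment's implementation: it takes the 400% penalty rate to cover both the unpaid tax and the fine on any over-claimed credit, it deducts any information cost from period earnings, and it omits the late-filing penalty. All names are hypothetical.

```python
TAX_RATE = 0.5      # tax rate on earned income
PENALTY_RATE = 4.0  # assumed to cover unpaid taxes plus the fine


def period_earnings(income, claimed_credit, true_credit, audited, info_cost=0.0):
    """Lab-dollar earnings for one period under the rules described above."""
    reported_tax = TAX_RATE * income - claimed_credit
    # Only over-claimed credit is penalized; under-claiming is never refunded.
    over_claim = max(claimed_credit - true_credit, 0.0)
    penalty = PENALTY_RATE * over_claim if audited else 0.0
    return income - reported_tax - penalty - info_cost


# A low-income participant with a true credit of 500 who claims 600:
print(period_earnings(1000, 600, 500, audited=False))  # evasion goes undetected
print(period_earnings(1000, 600, 500, audited=True))   # over-claim of 100 is penalized
```

Claiming less than the true credit never raises earnings in this sketch, which mirrors the rule that overpayments are not returned after an audit.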

2.2 Experiment treatments

With the exception of the variation in earned income across subjects, we employ a between-subjects design. The main treatment variables are the presence/absence of a liability information service, the quality of the service if provided, and the cost of obtaining the information. These factors are held constant throughout a session. There are five basic treatments, as shown in Table 2. The first, T1, imposes certain tax liability, which we use as a baseline for comparison against the uncertain liability treatments. In T1 participants are automatically given information on their true credit and there is no notion of uncertain liability or an information service. In the second treatment (T2), the individual’s tax credit is uncertain and there is no information service available. This establishes a second baseline for comparison. In the remaining three treatments, there is an information service available. The status quo in the information service treatments, i.e., if the information service is not utilized, is identical to the uncertainty baseline.

Table 2 Experiment treatments (number of participants)

The “perfect” information service reveals the true credit with certainty (T3). Under the other two information service types, the service is imperfect in the sense that up to two possible credit amounts can be provided and each amount has a 50% chance of being correct. Under the “simultaneous” information service treatment (T4) the agency simultaneously provides two credit amounts (two opinions), one of which is the true value. With the “sequential” information service (T5), the participant can make up to two information requests and each will provide a different opinion. If two requests are made, then the simultaneous and sequential services reveal the same information. However, the sequential information treatment leaves the possibility that only one credit amount is delivered, in which case it still has the same 50% chance of being true.

To assess the value of information services, we vary the cost to acquire information in the information service treatments (see Table 2). The three cost levels are 0, 50, and 100 lab dollars for the perfect and the simultaneous information settings. For the sequential setting, these costs are halved and assessed separately for the two sources.
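The cost structure across the three information service treatments can be written as a small lookup. The sketch below reflects one reading of the design, namely that each sequential request costs half of the session's cost level; the class, field, and function names are illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    label: str
    max_requests: int   # number of information requests allowed per period
    cost_share: float   # share of the session cost level charged per request


SERVICES = {
    "T3": Service("perfect", 1, 1.0),
    "T4": Service("simultaneous", 1, 1.0),
    "T5": Service("sequential", 2, 0.5),
}
COST_LEVELS = (0, 50, 100)  # lab dollars, fixed within a session


def total_information_cost(treatment, cost_level, requests):
    """Lab dollars paid for a given number of information requests in one period."""
    service = SERVICES[treatment]
    if not 0 <= requests <= service.max_requests:
        raise ValueError("invalid number of requests for this service")
    return requests * service.cost_share * cost_level


print(total_information_cost("T5", 50, 1), total_information_cost("T5", 50, 2))
```

Under this reading, acquiring both sequential opinions costs the same as one simultaneous request, so the two services differ only in the option to stop after the first opinion.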

2.3 Participants and procedures

The experiments were conducted at dedicated experimental economics laboratories at two public universities. The University of Tennessee (“Lab 1”) is a large, research university (Carnegie Classification R1) located in a moderate-sized city. Appalachian State University (“Lab 2”) is a regional public university (Classification M1) in a rural setting. At both laboratories the same software and experimental protocols were used and the laboratories are nearly identical in terms of layout. The participant pools included students and non-students (university staff, mostly). Students and non-students participated in different sessions. To reflect different opportunity costs, the lone difference across sessions was the exchange rate (750-1 for students; 375-1 for non-students). Recruiting was conducted using the Online Recruiting System for Experimental Economics (ORSEE) developed by Greiner (2004). Databases of potential participants were built using announcements sent via email to university students and staff. Registered individuals were contacted, via email, and were permitted to participate in only one tax experiment. Only recruited participants were allowed to participate, and no participant had prior experience in this experimental setting. Methods adhere to all guidelines concerning the ethical treatment of human participants. Earnings averaged $25 for student participants and $45 for non-students. Sessions lasted between 60 and 90 minutes. Overall, there were 38 sessions and 730 participants (463 students and 267 non-students).

The experiment session proceeded in the following fashion. Each participant sits at a computer located in a cubicle and is not allowed to communicate with other participants. The instructions are conveyed by a series of computer screens that the participants read at their own pace, with a printed summary sheet provided and read aloud by the experimenter. (Appendix A provides representative screenshots from the experiment, and Appendix B provides instructions from one of the treatments.) Clarification questions are addressed after the participants have completed the instructions and two practice periods. The participants are informed that all decisions will be private; the experimenter is unable to observe the decisions, and the experimenter does not move about the room once the session starts to emphasize the fact that the experimenter is not observing the participants’ compliance decisions. This reduces, to the extent possible, peer and experimenter effects that could affect the decisions of the participants. All actions that participants take are made on their computer. After the 20 paid rounds, participants fill out a brief questionnaire, which collects basic demographics including information on tax reporting experience. Payments are made privately, in cash, at the end of the session.

3 Theoretical framework

3.1 Basic theory model

To derive predictions to inform our experimental design and data analysis, we adapt the model of Vossler and McKee (2017). This approach derives from the classic “economics of crime” model pioneered by Becker (1968) and applied to tax evasion by Allingham and Sandmo (1972) and Yitzhaki (1974). In our experiment setting, there are two reporting amounts on the tax form: income, I, and a tax credit, C. Reported tax liability is defined by a tax rate, t, multiplied by income, less the credit claimed: \(tI-C\). For simplicity, it is assumed that income is known with certainty by both the tax agency and the taxpayer (i.e., this is “matched” income). For the credit, it is assumed that the tax agency does not have matching documentation such that the amount of the credit is not known prior to an audit. The true tax liability is further uncertain to the taxpayer, which could be due to factors such as complexity over how to determine the true amount and inaccurate recordkeeping.Footnote 12 From the taxpayer’s perspective, the actual credit is a random variable x with density \(f\left( x \right) \) over the interval [a, b]. To motivate compliance, the tax authority undertakes random audits with probability p. If an audit occurs, the taxpayer faces a penalty rate \(\beta \) applied to any over-reported credit revealed by the audit. This includes payment of additional taxes owed plus some additional fine such that \(\beta >1\).

The optimization problem faced by a risk-neutral taxpayer is to choose a reporting amount C that minimizes tax reporting costs:

$$\begin{aligned} {\min }_C\; tI-C+p\beta \left\{ {\mathop \smallint \nolimits _a^C \left( {C-x} \right) \cdot f\left( x \right) \mathrm{{d}}x} \right\} \end{aligned}$$
(1)

Assuming that the tax credit is uniformly distributed over the interval [a, b], and an interior solution, the optimal credit report is:

$$\begin{aligned} C^{*}=a+\frac{b-a}{p\beta }. \end{aligned}$$
(2)

An interior solution requires that the marginal expected cost of the audit is greater than the marginal benefit of claiming an additional dollar of credit, i.e., \(p\beta >1\). With our experiment parameters, \(p=0.3\), \(\beta =4\), and \(a=0\), Eq. (2) characterizes optimal reporting with \(C^{*}=\frac{5}{6}b\). The expected true tax credit is \(\frac{1}{2}b\) such that the optimal report is higher than the expected value; i.e., it is optimal to underreport liability on average.
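The closed-form report in Eq. (2) can be cross-checked against a direct grid search over the objective in Eq. (1). The following sketch assumes a uniform credit on [a, b] and a risk-neutral taxpayer, as in the text; the function names are illustrative.

```python
def optimal_report(a, b, p, beta):
    """Closed-form report from Eq. (2); valid only for an interior solution."""
    if p * beta <= 1:
        raise ValueError("an interior solution requires p * beta > 1")
    return a + (b - a) / (p * beta)


def expected_cost(report, income, a, b, p, beta, t=0.5):
    """Objective in Eq. (1) for a uniform credit on [a, b], with a <= report <= b."""
    expected_over_claim = (report - a) ** 2 / (2 * (b - a))
    return t * income - report + p * beta * expected_over_claim


# Low-income group: income 1000, credit uniform on [0, 1000].
closed_form = optimal_report(0, 1000, 0.3, 4)
grid_best = min(range(0, 1001), key=lambda c: expected_cost(c, 1000, 0, 1000, 0.3, 4))
print(round(closed_form, 2), grid_best)
```

The grid minimizer lands on the integer nearest to five-sixths of the upper bound, in line with the closed form.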

3.1.1 The effects of better liability information on tax reporting

The tax liability information services we explore meet the conditions of what Vossler and McKee (2017) define as a helpful tax liability information service. Following their Proposition 1, when such a service is acquired this leads to tax reporting that is closer to the truth. In our setting, as over-reporting the credit is optimal under uncertainty, this means that the information service incentivizes the taxpayer to report a lower credit.

With the perfect information service, when acquired, it is optimal to report the truth. This is because the expected cost of over-claiming an additional dollar of credit is \(p\beta =1.2\), which is more costly than the one dollar of tax payment avoided. This is of course also the prediction for our certainty baseline treatment.

In the simultaneous information setting, as well as the sequential setting where both signals are acquired, the service reveals two possible credit amounts, each with an equal chance of being true. It is optimal to report one of the two amounts as long as \(p\beta >1\). Consider the case where the taxpayer believes her credit is either \(c_1 \) or \(c_2 \), with \(c_2 >c_1 \). If she reports the smaller amount, she simply receives the benefits of the credit \(c_1 \) regardless of audit, as the audit would never reveal underreported taxes. If she instead reports \(c_2 \), then with probability 1/2 the true credit is \(c_1 \) and an audit penalizes the over-claim \(c_2 -c_1 \), so the expected net benefit of reporting \(c_2 \) is \(c_2 -\frac{1}{2}p\beta \left( {c_2 -c_1 } \right) \). This amount is higher than \(c_1 \) when \(p\beta <2\), which holds in our experiment, and thus \(C^{*}=c_2 \) is the optimal reporting decision. To see that it is not optimal to report an amount between \(c_1 \) and \(c_2 \), consider a marginal increase from \(c_1 \). The marginal cost savings from doing so is \(1-\frac{1}{2}p\beta \), which is greater than zero given our parameters. The higher of two random draws from a uniform distribution, on average, will be higher than the expected value of the distribution. It follows that when \(p\beta <2\), the optimal decision is to over-claim the credit, on average. In particular, \(\hbox {E}\left[ {C^{*}} \right] =\frac{a+2b}{3}\). As this is lower than the optimal report in the absence of information, i.e., \(\frac{a+2b}{3}<\frac{\frac{1}{3}a+\frac{5}{3}b}{2}\), the information service still incentivizes reporting a credit that is closer to the true amount.
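The two-signal argument can be illustrated numerically. The sketch below assumes a risk-neutral taxpayer who holds two candidate credit amounts, each true with probability 1/2, and uses the experiment's values of p and beta; the signal values 200 and 700 are hypothetical, as are the function names.

```python
def net_benefit(report, c1, c2, p=0.3, beta=4.0):
    """Credit claimed minus the expected penalty when the truth is c1 or c2 with equal odds."""
    expected_over_claim = 0.5 * max(report - c1, 0.0) + 0.5 * max(report - c2, 0.0)
    return report - p * beta * expected_over_claim


# Hypothetical pair of opinions for the low-income group (support [0, 1000]).
c1, c2 = 200, 700
best = max(range(0, 1001), key=lambda r: net_benefit(r, c1, c2))
print(best, net_benefit(c1, c1, c2), net_benefit(c2, c1, c2))

# Average of the larger of two uniform draws on [0, 1], approximated on a midpoint grid.
n = 200
points = [(k + 0.5) / n for k in range(n)]
mean_max = sum(max(x, y) for x in points for y in points) / n ** 2
print(round(mean_max, 3))
```

The grid search picks the larger opinion, and the grid average of the maximum is close to two-thirds of the support, matching the expected report stated in the text when a = 0.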

A slight variation of the above is the sequential information setting when only one signal is chosen. Let this amount be denoted by \(c_0 \). Then, there is now a 1/2 probability that this is the true amount and 1/2 probability that the true amount is anything within the original possible credit interval. Let \(I[C\ge c_0 ]\) denote an indicator that equals 1 when the reported credit is at least as large as the information signal and equals 0 otherwise. Then, the cost minimization problem becomes:

$$\begin{aligned} {\min }_C \;tI-C+\frac{1}{2}p\beta \left\{ {\mathop \smallint \nolimits _a^C \left( {C-x} \right) \cdot f\left( x \right) \mathrm{{d}}x} \right\} +\frac{1}{2}p\beta \left( {C-c_0 } \right) \cdot I\left[ {C\ge c_0 } \right] \end{aligned}$$
(3)

The first-order necessary condition, assuming again a uniform distribution for x, can be written as:

$$\begin{aligned} C^{*}=a+\frac{2\left( {b-a} \right) }{p\beta }-\left( {b-a} \right) I\left[ {C^{*}\ge c_0 } \right] \end{aligned}$$
(4)

Note that it is never optimal for \(C^{*}<c_0 \). Intuitively, the information signal results in a kink in the marginal cost of enforcement function and reporting less than the signal means there is a range of higher reporting amounts for which, with probability 1/2, an audit would reveal no penalty. Consider a case where the taxpayer reports less than \(c_0 \). Then, the marginal benefit of reporting more is one dollar, the tax savings. The marginal cost of enforcement is \(\frac{1}{2}p\beta \frac{C-a}{b-a}\). As \(C\le b\) and \(1>\frac{1}{2}p\beta \) given our experiment parameters it follows that \(1>\frac{1}{2}p\beta \frac{C-a}{b-a}\). Whether the solution is then characterized by Eq. (4) or \(C^{*}=c_0 \) depends on the information signal. Intuitively, as the marginal enforcement cost increases with the level of reporting, the additional penalty incurred by reporting more than \(c_{0}\) eventually dominates the tax savings of doing so. With our experiment parameters, when \(c_{0}\ge \frac{2}{3}b\), which occurs with probability 1/3, it is optimal to report \(c_{0}\). In this case, the expected credit reported is \(\frac{5}{6}b\), which is identical to the solution of Eq. (2) based on no information signal. Otherwise, when \(c_{0}<\frac{2}{3}b\), which occurs with probability 2/3, \(C^{*}=a+\frac{2\left( {b-a} \right) }{p\beta }-\left( {b-a} \right) \). With our parameters, \(C^{*}=\frac{2}{3}b\). Then, it follows that \(\hbox {E}\left[ {C^{*}} \right] =\frac{1}{3}\cdot \left( {\frac{5}{6}b} \right) +\frac{2}{3} \cdot \left( {\frac{2}{3}b} \right) =\frac{13}{18}b\). Therefore, on average, providing one possible signal of the true credit lowers the reported credit relative to the no-information case (\(\frac{13}{18}b<\frac{5}{6}b\)) and thus increases reported tax liability.

We can now rank-order the uncertain liability settings in terms of the expected credit reported, \(\hbox {E}\left[ {C^{*}} \right] \), with \(a=0\): no liability information service (\(\frac{5}{6}b\)); information service reveals a possible true amount, correct 50% of the time (\(\frac{13}{18}b\)); service reveals two possible true credit amounts, each with 50% chance of being correct (\(\frac{2}{3}b\)); service reveals actual amount with certainty (\(\frac{1}{2}b\)).
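The one-signal case can be reproduced numerically. The following sketch assumes, as the probabilities quoted in the text imply, that the single signal is itself uniformly distributed over the credit interval, and it restricts attention to parameter values with p times beta between 1 and 2. Function names are illustrative.

```python
def optimal_report_one_signal(c0, a, b, p=0.3, beta=4.0):
    """Report after observing one signal c0: the larger of c0 and the interior solution."""
    if not 1 < p * beta < 2:
        raise ValueError("this sketch assumes 1 < p * beta < 2")
    # Eq. (4) with the indicator equal to one.
    interior = a + 2 * (b - a) / (p * beta) - (b - a)
    return max(c0, interior)


def expected_report_one_signal(a, b, cells=3000):
    """Average report over a midpoint grid of signals on [a, b]."""
    width = (b - a) / cells
    midpoints = (a + (k + 0.5) * width for k in range(cells))
    return sum(optimal_report_one_signal(c0, a, b) for c0 in midpoints) / cells


b = 1000
print(round(expected_report_one_signal(0, b), 2), round(13 * b / 18, 2))
```

Because the report is piecewise linear in the signal, the midpoint average recovers the analytical value of 13/18 of the upper bound up to rounding error.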

3.1.2 Willingness-to-pay for better liability information

That the information service changes optimal reporting implies that the service has value. If acquiring the service is costless, it should be obtained as long as the expected reporting cost with the service is less than the cost without. When a fee is charged to acquire the service, as in some cases of our experiment, the difference in expected cost (with and without the service) needs to be equal to or greater than the cost of the service. The theory predictions above can be used to determine expected cost differences and hence reveal how high the service cost can be such that the taxpayer is indifferent between acquiring the service or not.

Note that with the low-, middle-, and high-income groups, we have that \(b=\left\{ {1000,750,500} \right\} \). Without information the expected cost is \(\frac{1}{2}I-\frac{5}{12}b\). With perfect information this amount becomes \(\frac{1}{2}I-\frac{1}{2}b\), such that the maximum WTP for the service is \(\frac{1}{12}b\). Across the three groups this implies a maximum WTP of 83.33, 62.50, and 41.67, respectively. With the case of two information signals obtained, expected cost is \(\frac{1}{2}I-\frac{7}{15}b\). The maximum WTP is \(\frac{1}{20}b\), which for the income groups implies amounts of 50, 37.50, and 25.
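The expected cost expressions and the implied maximum WTP can be recomputed with exact fractions. This is a minimal sketch that assumes a = 0, risk neutrality, and the experiment's value of p times beta; it expresses the expected credit net of expected penalties as a share of b, and all names are illustrative.

```python
from fractions import Fraction

P_BETA = Fraction(3, 10) * 4  # audit probability times penalty rate = 6/5
UPPER_BOUNDS = {"low": 1000, "middle": 750, "high": 500}

# Expected credit claimed minus expected penalty, as a share of b.
report_none = 1 / P_BETA                                   # Eq. (2) with a = 0: 5/6
net_none = report_none - P_BETA * report_none ** 2 / 2      # 5/12
net_two = Fraction(2, 3) - P_BETA / 2 * Fraction(1, 3)      # E[max] minus half-chance penalty on E[gap]
net_perfect = Fraction(1, 2)                               # truthful report, no penalty


def max_wtp(net_with_service, b):
    """Largest fee at which acquiring the service does not raise expected cost."""
    return (net_with_service - net_none) * b


for group, b in UPPER_BOUNDS.items():
    print(group, float(max_wtp(net_perfect, b)), float(max_wtp(net_two, b)))
```

The term tI/2 is common to every case and cancels when costs are differenced, so only the net credit shares matter for WTP.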

In the sequential information setting, an interesting question arises as to whether, conditional on obtaining the first signal, it makes sense to obtain the second. On average, the total value of both signals exceeds that for just one, and so if the second is costless it should be obtained. Nevertheless, cases can arise where the second signal has sufficiently low value, and it will thus be optimal not to purchase it unless the cost is near zero. To see this, recall that with two signals it is optimal to report a credit equal to the higher of the two signals. In the limit as the first signal approaches the upper bound of the credit distribution, the probability that it is the higher of the two values approaches 1, and the value of the second draw approaches zero.
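The limiting argument can be made concrete by computing the value of the second signal as a function of the first. The sketch below rests on assumptions that go beyond the text: a = 0, risk neutrality, the experiment's p and beta, and a second opinion that is an independent uniform draw on [0, b] whichever opinion is true. The closed-form expressions and function names are derived for this illustration only.

```python
P_BETA = 0.3 * 4.0  # audit probability times penalty rate


def benefit_one_signal(s, b):
    """Expected credit net of penalties when stopping after a first signal s."""
    report = max(s, 2 * b / P_BETA - b)
    penalty_if_signal_false = 0.5 * P_BETA * report ** 2 / (2 * b)
    penalty_if_signal_true = 0.5 * P_BETA * (report - s)
    return report - penalty_if_signal_false - penalty_if_signal_true


def benefit_two_signals(s, b):
    """Expected net credit from also buying the second signal and reporting the larger one."""
    expected_max = (b ** 2 + s ** 2) / (2 * b)
    expected_gap = (s ** 2 + (b - s) ** 2) / (2 * b)
    return expected_max - 0.5 * P_BETA * expected_gap


def value_of_second_signal(s, b):
    """Gross value of the second signal, before subtracting its cost."""
    return benefit_two_signals(s, b) - benefit_one_signal(s, b)


for s in (0, 500, 900, 1000):
    print(s, round(value_of_second_signal(s, 1000), 2))
```

Under these assumptions the value declines as the first signal rises and vanishes at the upper bound, which is the pattern described in the paragraph above.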

In the experimental design, the service cost is 0, 50, or 100. With a cost of zero it is optimal to always acquire the information regardless of the information setting or income group. With a cost of 50, both the low- and middle-income groups should acquire perfect information. The low-income group should also acquire the two information signals in the simultaneous information setting. In the sequential setting, it will also be beneficial to plan on acquiring both signals. As discussed above, however, if the first draw reveals a sufficiently high credit, cases will arise when the value of the second signal is insufficient to justify additional acquisition costs. With a cost of 100, it is never optimal to acquire information, regardless of the type of service or income group.
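The acquisition predictions in this paragraph amount to comparing each group's maximum WTP with the service fee. The sketch below tabulates that comparison under the assumption that a taxpayer who is exactly indifferent acquires the service; the names are hypothetical, and the two-signal case reflects the ex ante plan rather than the conditional decision after seeing a first signal.

```python
from fractions import Fraction

# Upper bound b of the credit distribution for each income group (from the text).
B_BY_GROUP = {"low": 1000, "middle": 750, "high": 500}

# Maximum WTP as a share of b, as derived in the text.
WTP_SHARE = {"perfect": Fraction(1, 12), "two_signals": Fraction(1, 20)}

SERVICE_COSTS = (0, 50, 100)


def should_acquire(b, service, cost):
    """True when max WTP covers the fee (ties resolved in favor of acquiring)."""
    return WTP_SHARE[service] * b >= cost


def acquirers(service, cost):
    """Income groups predicted to acquire `service` at the given `cost`."""
    return [g for g, b in B_BY_GROUP.items() if should_acquire(b, service, cost)]


for cost in SERVICE_COSTS:
    for service in WTP_SHARE:
        print(cost, service, acquirers(service, cost))
```

Note that the low-income group's maximum WTP for two signals equals the fee of 50 exactly, so the prediction for that cell rests on the tie-breaking assumption.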

Overall, those services that provide more precise information are both more likely to be acquired and further have the largest effect on reporting. Thus, the theory predicts that the unconditional rank-ordering of treatments in terms of expected reporting is similar to the conditional one. The lone exception is that, since theory predicts not everyone will purchase the perfect service if available, we expect credit reporting to be higher in T3 than in our baseline certainty treatment (T1).

3.1.3 Standard economic hypotheses

Based on the theory analysis, the main testable hypotheses from the experiment are presented below. As we are interested in possible behavioral dynamics, we further include as testable hypotheses how the audit process interacts with the reporting and information acquisition decisions. As audits are completely random and audit outcomes are independent of a taxpayer’s audit history, based on the standard theory model past audits should have no effect.

Hypothesis 1:

The reported tax credit decreases when an information service is available.

Hypothesis 2:

The reported tax credit decreases as the quality of an available information service increases.

Hypothesis 3:

An audit has no effect on future tax reporting decisions.

Hypothesis 4:

The propensity to acquire an information service increases with service quality.

Hypothesis 5:

The propensity to acquire an information service decreases with the cost of the service.

Hypothesis 6:

An audit has no effect on the propensity to acquire an information service in the future.

3.2 Insights from behavioral economics

Previous experimental research has identified possible behavioral responses to simple random audit enforcement mechanisms. Several researchers have found that compliance falls following an audit, and Mittone (2006) labels this behavior as the “bomb crater effect.” One explanation for this is that subjects behave as if the audit probability immediately following a period in which they were audited is significantly lower (i.e., they succumb to the “gambler’s fallacy”) and thus perceive the cost of evasion to be low. Another behavioral response to the audit process is known as “loss repair” (Andreoni et al. 1998; Maciejovsky et al. 2007). Loss repair is the notion that the penalties incurred from an audit might induce subjects to “want to evade more in the future in an attempt to ‘get back’ at the tax agency” (Andreoni et al. 1998, p. 844). Therefore, subjects experiencing penalizing audits may try to recover their losses by engaging in tax evasion in future filings (Alm and McKee 2006).

Uncertain tax liability and information acquisition are unique to our experimental design, but to the extent the above behavioral drivers exist, one would expect differential effects with regard to how information services alter reporting, as well as their perceived value to taxpayers. To see this, with uncertain or certain liability we have a corner solution of maximal evasion if \(p\beta <1\). Thus, in our experiment, anything more than a 5% reduction in the perceived audit probability motivated by the gambler’s fallacy is enough to incentivize full evasion. The value of information services in this case is zero, and the expected change in reporting following information acquisition is likewise zero. Therefore, a finding that the frequency of information acquisition is lower in the period immediately following an audit is a result consistent with the gambler’s fallacy, as is a finding that the effects of available information services are also lower.

If loss repair is an important behavioral driver, we expect a behavioral response in the case of a penalizing audit only. This behavioral response would be qualitatively similar to the case of the gambler’s fallacy. Tying this to the theoretical model, we can think of adding a benefit (i.e., a negative cost) term to the objective function that increases with the amount of reported tax credit (evasion). This perceived benefit thus weakens the effects of enforcement and, in turn, decreases the effect of an acquired information service on tax reporting as well as the perceived value of acquiring one. Motivated by the discussion on behavioral dynamics, we put forth two hypotheses as alternatives to Hypothesis 3 and Hypothesis 6:

Hypothesis 3 (alternative):

The effect of an audit is to increase tax credit reporting.

Hypothesis 6 (alternative):

The effect of an audit is to decrease the propensity to acquire a liability information service.

4 Results

Table 3 describes the experiment data. Across all treatments and decision rounds, the mean tax credit reported is 518 lab dollars. To put this in perspective, the mean actual credit is 363, suggesting that evasion is non-trivial, amounting to over 40%. In the information treatments, the uptake of information is 58%. There is considerable heterogeneity in uptake, with 17% of participants never requesting the information, and 27% acquiring information in every period. In terms of the participant sample, it favors females (57%) and persons who are not full-time students (67%). The mean age is 30 years, 27% have a college degree, 36% reported being fully employed, and slightly more than half (54%) filed their own tax return the prior year, without the aid of tax preparation software.
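As a check on the aggregate evasion figure, and assuming the rate is measured as the excess of the mean reported credit over the mean actual credit relative to the latter (one natural reading of the text), the calculation is

\[ \frac{518 - 363}{363} = \frac{155}{363} \approx 0.427, \]

or roughly 43%.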

Table 3 Data description
Table 4 Tax credit reporting models

4.1 Tax credit reporting

Table 4 presents linear regression models using the panel data generated from all treatments, where the reported tax credit is the dependent variable. Model 1 is a parsimonious specification that simply estimates the mean reported tax credit separately for the five treatments. For ease of interpretation, a full set of treatment indicators is included, dropping the model intercept.Footnote 13 To account for possible differences in participant characteristics across treatments, Model 2 adds the demographic variables listed in Table 3. So that coefficients can be directly compared across models, we subtract the sample means from the demographic variables, which does not alter their coefficients.

Model 3 allows for behavioral dynamics, by allowing the reported tax credit to vary across the four possible audit outcomes in the prior period, separately for each treatment. The indicator Not Compliant controls for whether the participant reported a credit that exceeded the actual credit in the prior period, whereas the indicator Compliant equals 1 if the participant instead reported a credit equal to or less than the actual credit in the prior period. The two interactions, Not Compliant \(\times \) Audited and Compliant \(\times \) Audited measure differential effects due to the audit process. As audit selection is random, this allows for identification of the causal effect of being audited. Also controlled for in the specification is the expected income from the earning task (demeaned). Model 4 is the same as Model 3, but with demographic variables included. To control for possible heteroscedasticity and autocorrelation of unknown form in the regressions, throughout our analysis we use robust standard errors with clustering at the participant level. Further, heteroscedasticity and autocorrelation robust t and F statistics are used when evaluating hypotheses.
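The four prior-period outcomes in Model 3 can be encoded as indicator variables. The following is a sketch of one such coding, not the authors' estimation code; the function name, dictionary keys, and example values are hypothetical.

```python
def prior_period_indicators(reported_prev, actual_prev, audited_prev):
    """Return the four prior-period status dummies for one participant-round.

    Compliance follows the definition in the text: a report above the actual
    credit is non-compliant; a report at or below it is compliant.
    """
    not_compliant = int(reported_prev > actual_prev)
    compliant = 1 - not_compliant
    audited = int(bool(audited_prev))
    return {
        "not_compliant": not_compliant,
        "compliant": compliant,
        "not_compliant_x_audited": not_compliant * audited,
        "compliant_x_audited": compliant * audited,
    }


# Hypothetical prior-round outcomes: (reported credit, actual credit, audited?)
examples = [(600, 400, True), (600, 400, False), (300, 400, True), (400, 400, False)]
for reported, actual, audited in examples:
    print(prior_period_indicators(reported, actual, audited))
```

Because Not Compliant and Compliant are mutually exclusive and exhaustive, each interaction with Audited picks out the additional effect of an audit within that subgroup.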

As reported in Model 1, the mean reported tax credit exceeds 500 lab dollars in all treatments. Relative to the expected actual credit of 363, there is statistically significant evasion in all treatments.Footnote 14 According to the theory model, individuals in the certainty baseline should report truthfully, but this is not observed in the data. Further, the reported credit is statistically higher in the perfect information treatment, and statistically lower in the imprecise information treatments relative to theory. Deviations from theory point predictions are not unexpected. Indeed, as found by Kirchler and Wahl (2010), decisions in an experimental tax compliance game are likely to be significantly correlated with behavioral considerations not captured by a basic theory model such as ours, for instance, the perceived fairness of the tax system, moral obligations, and the ability to rationalize evasion.

Theory predicts lower credit reporting in the information service treatments relative to the uncertainty baseline, and this is true statistically in all three cases. This lends support to Hypothesis 1. As information uptake is less than 60%, this represents rather substantial “intent-to-treat” effects. In fact, mean evasion is over 60% in the uncertainty baseline and less than 20% in the information treatments. However, in contrast to Hypothesis 2, credit reporting for the three information service treatments is statistically equal. Although theory predicts the taxpayer will report the highest of the two possible actual amounts in the imprecise service treatments, this strategy does not appear to have been intuitive. Instead, participants tended to report an amount between the two signals, thus leading to lower credit reporting than expected and possibly explaining the observed invariance to service quality.

It is important to note that there are no differences in the reported credit between the certainty baseline and the information service treatments. Theory clearly predicts lower reporting in the former case. One possible explanation is a reciprocity effect—if a taxpayer requests a service she may feel compelled to use the information to more accurately report her taxes. There is no notion of an information service in the certainty baseline, and so such an interaction with the tax agency is absent.

Turning to Model 3, there is substantial evidence of behavioral dynamics. For the certainty treatment, within the subgroup of non-compliers, being audited in the prior round triggers a behavioral response in the form of a higher reported credit. Thus, with our design we have replicated the common bomb crater effect from the literature: the effect of a prior audit is to increase evasion. This effect is not completely explained by loss repair, however, as this finding further holds among compliers. The magnitudes of the effects are meaningful for both subgroups. Interestingly, in the uncertain liability baseline we also find statistically significant increases in tax credit reporting following an audit for both complier and non-complier groups. In the case of compliers, the audit effect is about twice as large relative to the certainty case and is statistically different.

In contrast, there are no significant behavioral responses to prior audits in any of the information service treatments. Out of curiosity, we estimated the model on subgroups representing those who never, sometimes, or always requested the information. The null effects persist for all three subgroups, suggesting that the mere presence of an available information service mitigates the behavioral response to a prior audit. It is not clear why this result arises, but it does suggest that the bomb crater effect is not entirely general. With uncertain liability, there are two probabilities that taxpayers cannot precisely control: the audit probability and the probability of being penalized if audited. It is possible that with an available service, regardless of whether it is acquired, the second probability becomes focal as this is something that the taxpayer can now (partially) control. If this is true, this may dampen any behavioral effect that stems from misperceptions of the audit probability.

In all treatments, persons who were non-compliant in the prior round report a much higher credit than those who were compliant. This reflects persistence in decision strategies. Also, as expected, those with higher incomes report a lower credit (consistent with their actual credit being lower).

Comparing models with and without including demographic variables does not alter any conclusions. Being female [Model 2: −73.96 (16.61); Model 4: −55.38 (12.33)] and an increase in age [−1.76 (1.05); −2.34 (0.77)] have at least a marginally significant and negative effect on tax credit reporting, whereas participating at Lab 2 [60.97 (17.99); 50.76 (13.34)] increases reporting. The location effect is not unexpected, given that the differences in the student bodies and surrounding populations are unlikely to be fully captured by the demographic variables we control for. We summarize our main results below as they relate to Hypotheses 1–3:

Result 1:

The availability of tax liability services decreases the reported tax credit.

Result 2:

Tax credit reporting does not vary with service quality.

Result 3:

In the certain and uncertain liability baseline treatments, an audit in the prior period increases the reported tax credit. This effect is absent in the information service treatments.

4.2 Information service acquisition and willingness-to-pay

Table 5 reports four linear probability models of the information acquisition decision that contain the same regressors as in the parallel reported tax credit models. We add to this, in Models 3 and 4, the (demeaned) cost of acquiring the available information service. For the sequential information treatment (T5), we code outcomes the same regardless of whether one or both information signals are obtained. Only one of the two available sources was requested 11% of the time.

Table 5 Information service acquisition models

To complement the information acquisition models, we estimate parallel models to explain the variation in maximum WTP for information services. Using established methods from the non-market valuation literature for binary choice WTP data (e.g., Cameron and James 1987; Wooldridge 2010, pp. 780–781), a WTP function is identified based on random assignment of different service costs across subjects. In brief, let \(\mathit{WTP}_{it}\) denote subject i’s willingness-to-pay for the available information service in period t. \(\mathit{WTP}_{it}\) is not directly observed, but instead can be treated as a censored dependent variable. When information is acquired, this implies that \(\mathit{WTP}_{it} \ge \mathit{Service\ Cost}_{it}\); i.e., the cost is the lower bound of WTP. Otherwise, when the participant forgoes the information, this provides the signal \(\mathit{WTP}_{it} < \mathit{Service\ Cost}_{it}\); i.e., the cost identifies the upper bound of WTP.Footnote 15 We assume that \(\mathit{WTP}_{it}\) is a linear function of covariates and a mean-zero error term, which is assumed to be normally distributed with standard deviation \(\sigma \). This gives rise to what is commonly referred to as an interval regression model. With a linear conditional mean function, assuming the error term has a normal distribution is analogous to assuming a normal distribution for \(\mathit{WTP}_{it}\). Interpretation of coefficients from the interval regression model is analogous to that from an ordinary linear regression model. Table 6 reports the estimated WTP regressions.
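The likelihood behind this interval regression can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the estimation code used for Table 6: the conditional mean is a single constant `mu` rather than a function of covariates, `sigma` is held fixed, and the data are hypothetical.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its CDF


def log_likelihood(mu, sigma, observations):
    """Log-likelihood of binary-choice WTP data with WTP ~ Normal(mu, sigma).

    `observations` is a list of (cost, acquired) pairs. Acquiring implies
    WTP >= cost; declining implies WTP < cost.
    """
    total = 0.0
    for cost, acquired in observations:
        p_below = STD_NORMAL.cdf((cost - mu) / sigma)  # Pr(WTP < cost)
        total += log(1.0 - p_below) if acquired else log(p_below)
    return total


# Hypothetical data: (service cost, acquired?) -- not the experimental data.
data = [(0, True), (0, True), (50, True), (50, False), (100, False), (100, True)]

# Crude grid search over the mean for a fixed sigma, for illustration only.
sigma = 60.0
best_mu = max(range(0, 151, 5), key=lambda m: log_likelihood(m, sigma, data))
print(best_mu, round(log_likelihood(best_mu, sigma, data), 3))
```

In practice both the mean parameters and \(\sigma \) would be estimated jointly by maximum likelihood; the grid search here only shows how the randomly assigned costs bound WTP from below or above.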

Table 6 Willingness-to-pay for information services

Model 1 in Table 5 shows that there is some variation in mean information acquisition rates across the types of services, which range from 53% in the perfect information treatment to 62% for the sequential information service. The acquisition rates are statistically equal, except when comparing the perfect and sequential services. This difference also holds when comparing the subgroup for which the service cost is zero (89% for T5 vs. 81% for T3). Thus, this result does not appear to stem from the fact that, for the sequential service treatment, purchasing just the one signal costs half as much. As highlighted in Model 1 in Table 6, the mean WTP for the three services is similar, averaging around 55 lab dollars. These estimates are statistically equal. The observed difference in the acquisition decision between perfect and sequential service treatments is largely washed out when comparing WTP, as those who only acquired one information source in the sequential treatment have revealed a lower WTP for the service.

The comparison of acquisition rates (or WTP) is surprising, as the perfect information service has the highest value theoretically, in the sense that it leads to the relatively highest reduction in expected reporting costs (assuming optimal decision making). A similar, unanticipated information quality effect is also found in Vossler and McKee (2017), although in that study the less precise information service provided a narrower range that contained the true liability, rather than two possible credit amounts.

Turning to Model 3 in Table 5, there are some behavioral effects of a past audit. In particular, within non-compliers, being audited decreases information service acquisition in the next period for both the simultaneous and sequential information services. This may be construed as evidence supporting the bomb crater effect, perhaps due to loss repair, albeit in the context of information acquisition rather than tax reporting. Based on the WTP regressions, non-compliers who were audited in the prior period have 13 and 8 lab dollar reductions in WTP, respectively, for the simultaneous and sequential services. Of course, the information provided by these services is imprecise, and so this behavioral reaction may stem from participants who intended to report truthfully (at least in expectation) only to find themselves out of compliance. With the perfect information service, the participant knows for sure whether her reporting is truthful or not, and hence there is no blame to place on the service if an audit reveals evasion.

Although we report a negative behavioral reaction to a prior (penalizing) audit, the acquisition rates for the imprecise service treatments are stable across the 20 decision periods, with a rate of 60% in round 1 and 57% in round 20. What explains this discrepancy is that, as the experiment progresses, more participants are in the complier group. As indicated in the models, the information acquisition rate is much higher for compliers than for non-compliers. (In contrast, for the certain and uncertain liability baselines, the fraction of compliers holds steady over the course of the experiment.) There is an increasing trend in perfect information uptake, however, with a rate of 45% in round 1 and 56% in round 20.

For the perfect and simultaneous information service treatments, there are positive and significant effects of earned income on information acquisition. Recall that expected earnings are roughly equal across income levels. Thus, we did not expect to observe any income effect. In fact, since liability uncertainty in our design is decreasing with income, theory suggests that the value of better liability information should be lowest for those in the highest income group. One possible explanation is that participants are motivated by relative (pre-tax) earnings. When a participant earns a low income, she may be compelled to “keep up with the Joneses” by not paying for information. Finally, as expected, information uptake is decreasing in service cost. Based on the estimates, information acquisition is about 26 percentage points lower when the service cost increases by 50 lab dollars. This price-responsiveness is less pronounced than what is predicted by theory. Indeed, although theory predicts that no individual should purchase a service when it costs 100 lab dollars, in actuality the uptake rate is 35%.

Comparing models with and without including demographic variables does not alter any conclusions. We do find that being female [Model 2: −0.06 (0.04); Model 4: −0.07 (0.03)] or participating at Lab 2 [−0.09 (0.04); −0.11 (0.03)] has at least a marginally significant and negative effect on service requests, whereas having a college degree [0.14 (0.04); 0.09 (0.03)] or being a full-time student [0.11 (0.04); 0.06 (0.03)] increases acquisition. Interestingly, females evaded to a lesser extent but were less likely to request information. This may be driven by an underlying taste for being compliant, and this for the most part does not necessitate resolving liability uncertainty (i.e., participants can simply choose to underreport the credit in expectation). In terms of the location effect, those at Lab 2 reported a higher credit but requested information less often. Thus, there could be an underlying taste for evasion that is relatively higher for this group. We summarize our main results below as they relate to Hypotheses 4–6:

Result 4:

The propensity to acquire a liability information service does not increase with service quality.

Result 5:

The propensity to acquire a liability information service decreases with its cost.

Result 6:

For the sequential and simultaneous (i.e., imprecise) information services, the effect of a penalizing audit is to decrease the propensity to acquire the service.

4.2.1 Supplemental analysis

We estimated some additional regressions to gain insight and report these models in Appendix C. First, as an alternative dependent variable in the reporting models we use instead the level of tax non-compliance (i.e., Reported tax credit minus Actual tax credit). This leads to the same statistical equalities and differences across treatments, and similar stories in terms of behavioral dynamics observed. Second, as an alternative to the linear probability models for the information acquisition decision, we estimate probit models. The marginal effects (evaluated at the mean of the data) are very similar to those from the linear probability models. Third, we re-estimate the credit reporting and information acquisition models using instead either the first ten decision periods or the last ten decision periods. In the credit reporting models, the differences between the information service and the non-information service treatments persist, but weaken slightly in magnitude. The observed behavioral responses to an audit in the certainty and uncertainty baselines become more pronounced with repetition. In the information acquisition models, the behavioral responses to an audit tend to get stronger over time.

The mean credit reporting levels for the two non-information service treatments are reasonably stable with repetition, which seems at odds with the behavioral responses to audits we uncovered.Footnote 16 That is, if the response to a past audit in these two treatments is to increase the reported credit, should we not observe that the reported tax credit increases over time? As it is not possible to add further lags of the four status dummies we include for each treatment in Models 3 and 4 due to perfect collinearity, we instead estimate some exploratory dynamic models that allow the reported credit to depend on whether an audit occurred last period, two periods ago, and in both of the prior periods. In brief, the models continue to suggest no behavioral response to audits in the information service treatments. For the non-information service treatments, rebound effects are observed. There continues to be higher reporting in the period following an audit, but this effect is offset by a decrease in reporting two periods following an audit. Similar rebound effects have been identified in related research (Kirchler 2007; Maciejovsky et al. 2007; Mittone 2006).
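The exploratory dynamic models condition on audits in each of the two preceding periods. One possible coding of those regressors is sketched below; the function name, the variable names, and the use of a product term for "both prior periods" are assumptions made for illustration, not the specification reported in Appendix C.

```python
def audit_lag_indicators(audited):
    """Build lagged audit dummies from one participant's audit history.

    `audited` is a list of 0/1 flags, one per decision period. Rows start in
    period 3 (index 2), the first period with two observed lags.
    """
    rows = []
    for t in range(2, len(audited)):
        lag1, lag2 = audited[t - 1], audited[t - 2]
        rows.append({
            "period": t + 1,
            "audit_lag1": lag1,          # audited last period
            "audit_lag2": lag2,          # audited two periods ago
            "audit_both": lag1 * lag2,   # audited in both prior periods
        })
    return rows


# Hypothetical audit history for one participant over five periods.
for row in audit_lag_indicators([1, 0, 1, 1, 0]):
    print(row)
```

With this coding, a positive coefficient on the first lag combined with a negative coefficient on the second lag would correspond to the rebound pattern described above.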

5 Discussion

This study uses an economics laboratory experiment to study behavioral dynamics in settings with and without available taxpayer information services, in particular services that decrease uncertainty over tax liability. Through a random audit selection mechanism that is held constant over the course of the repeated experiment, the experimental design allows us to identify the effects of past penalizing and non-penalizing audits. Consistent with prior experiments, we do find evidence of a “bomb crater effect” (i.e., that a prior audit motivates increased evasion) in our treatment with certain tax liability. As this occurs following both penalizing and non-penalizing audits, one possible explanation for the behavioral response is misperceptions regarding random audit probabilities (i.e., a gambler’s fallacy). This effect is robust to the case of uncertain liability we examine, thus extending the bomb crater effect to settings where compliance status is partially determined by decision errors (and not simply intended evasion). Importantly, we find that making available an information service mitigates this behavioral response; indeed, the response is absent even among those who never obtain the service. However, consistent with the bomb crater effect, those who experienced penalizing audits are less likely to acquire the information service, at least in the case of services that only partially resolve uncertainty. This may be attributable to loss repair behavior, in the sense that taxpayers who wish to recover losses from a penalizing audit should have little desire to be informed of their true tax liabilities.

With a complex tax system, taxpayers are predicted to respond positively to the provision of information services that reduce the cost of computing true tax liabilities. The results reported here demonstrate that making available such services serves to decrease tax evasion. Further, we find that decreasing the net value of information—by increasing the cost of obtaining it—decreases taxpayer service requests. We examined two types of liability information services: one that resolves uncertainty completely and imprecise services that can provide two different signals of the true tax liability. Although theory would predict that the perfect information service would both be more valuable and lead to lower evasion, on average the perfect and imprecise services performed about the same in these dimensions. Nevertheless, as mentioned above, we did find differences with respect to the effect of past penalizing audits on information service acquisition.

As a potential policy option, our findings suggest that an information program that targets those audited in the prior year, regardless of the audit outcome, could be an effective method to increase tax compliance. These individuals are less likely to acquire information on their own, but providing them with such information decreases their evasion considerably. Although we only consider a random audit setting, a dynamic audit policy that targets past offenders may provide an alternative, albeit less customer-friendly, approach. Our experiment does not incorporate the cost to the tax agency of providing information services; however, the improved tax reporting behavior suggests there is potential for a positive return to providing this service. The response of participants to the cost of acquiring information was predictable. While the information acquisition costs in the experimental setting were monetary, we would expect a similar response to higher costs even if they were in the form of higher transaction costs, such as waiting time to receive assistance.