1 Introduction

Ecosystem services from agricultural lands provide many benefits to society, but individual landowners are typically not compensated for supplying them in their day-to-day operations. Under a payment for ecosystem services (PES) program, landowners receive financial compensation from the government or another entity for generating conservation improvements on their land (Wunder 2005). PES schemes are market-based mechanisms that rely on incentives to induce behavioral change (Jack et al. 2008). They are potentially more efficient than regulatory, command-and-control approaches because they help to focus conservation efforts in areas where the cost of achieving conservation is relatively low. However, because participation is voluntary, their success depends on correctly designing the institution and incentives to induce buyer and seller participation.

We focus in this paper on habitat exchanges, a PES mechanism in which buyers and sellers trade quantifiable, third-party verified units of conservation, called credits. Landowners generate credits by implementing practices that produce measurable conservation outcomes to maintain or enhance habitat. Entities that impact the landscape through development can purchase these credits to meet compensatory mitigation requirements placed on them by regulators (Federal Register 2016). Buyers (for example energy companies in need of conservation credits) and sellers (generally private landowners engaged in extensive agricultural activities such as ranching) come to agreement over price and quantity for conservation credits in a two-sided market setting. This is in contrast to a publicly funded conservation program, in which a government agent seeks to maximize the amount of conservation achieved subject to a budget constraint.Footnote 1

Habitat exchanges are similar to transferable pollution rights markets: a regulatory agency requires an energy developer to purchase off-site mitigation credits in exchange for the right to disturb the landscape. The theoretical gains available from implementing a market for tradable discharge permits (conservation credits in our case) over a regulatory, command-and-control approach are well-established (Montgomery 1972; Krupnick et al. 1983; McGartland and Oates 1985). However, market design (delivery mechanisms and contract features) affects the gains realized in practice (Lyon 1982; Hahn 1989). Implementation details must be appropriate to the local context and conducive to landowner participation for habitat exchanges to be successful (Christensen et al. 2011; Hanley et al. 2012; Torres et al. 2013; Hansen et al. 2018). The present study consequently incorporates institutional detail specific to habitat exchanges and the nature of conservation in the western U.S. For example, the trading institution most likely to prevail in habitat exchanges is private negotiation (in which buyer and seller negotiate individually over the price of conservation units) rather than auction, as stakeholders (including landowners) have indicated a preference for private negotiation.Footnote 2 This limits the generalizability of the results, at least on first examination. However, as noted by Cason et al. (2003) in a similar context, the specific details make the results more relevant and useful to the application at hand.

Habitat exchanges are mostly at an early developmental stage. Whether they succeed in producing habitat quality enhancements that improve welfare depends largely on whether the institutional design induces buyer and seller participation. Market participants likely face three important risks that affect habitat exchange outcomes. First is matching risk: the risk of failing to find a willing trading partner in the private negotiation setting. Second is inventory loss risk: the risk to sellers of failing to sell—or being forced to accept discounted prices on—units already produced in an advance production setting. Both have been studied extensively in agricultural markets (Menkhaus et al. 2003a, 2007; Nagler et al. 2015). The second has also been studied in the context of water quality trading (Jones and Vossler 2014). One additional type of risk that exists in habitat exchanges is the post-production risk of credit failure—the risk that conservation credits fail to maintain habitat quality over their contract life due to events outside the control of landowners. If habitat exchanges fail to address these risks, particularly failure risk, they will not attract sufficient buyers and sellers to constitute a viable option for compensatory mitigation, resulting in a failed market.

We implement laboratory market experiments to assess the impact of three features on habitat exchange outcomes: delivery method (which affects the presence of inventory loss risk), failure risk, and potential reimbursement to landowners for failed credits. Transactions are privately negotiated (buyers and sellers negotiate over price in pairs) rather than via auction, reflecting the trading institution likely to prevail in habitat exchanges. The next section presents relevant background on habitat exchange institutions and some theoretical considerations. We follow this with the experimental design and description of the analysis. We then discuss results and conclude with implications for habitat exchange implementation. Our main findings support the necessity of addressing credit failure risk, given the habitat exchange institutions likely to prevail.

2 Policy Background

In 1995, the U.S. Army Corps of Engineers authorized the establishment of wetland mitigation banks, through which developers could offset unavoidable damage to wetlands and other aquatic resources by enhancing and protecting other land in perpetuity (Federal Register 1995). More recently, the U.S. Fish and Wildlife Service (USFWS) authorized conservation banks to offset impacts to species listed as threatened or endangered under the Endangered Species Act (USFWS 2003). In both cases, a developer in need of compensatory mitigation pays a mitigation or conservation bank for credits at a mutually agreed-upon price.

Environmental non-governmental organizations (including Environmental Defense Fund, Environmental Incentives, and Willamette Partnership), landowners, and industry partners developed the habitat exchange concept in response to perceived shortcomings of the mitigation/conservation banking model (EDF 2017). They sought to develop a model of habitat improvement that would increase the scientific rigor associated with quantifying ecological benefit, streamline the regulatory approval process, and facilitate landowner participation in environmental markets both by allowing for term leases rather than just perpetual contracts and by obviating the need for significant upfront capital investment.

The Bureau of Land Management and USFWS have both recognized habitat exchanges as a valid mechanism through which energy companies can meet their compensatory mitigation requirements (BLM 2016; Federal Register 2016). Currently, the western U.S. states of Colorado, Montana, Nevada, and Wyoming are developing habitat exchanges for greater sage-grouse (Centrocercus urophasianus), though few exchanges have executed trades so far.Footnote 3 To date, the major emphasis in habitat exchange development has been specifying the landowner eligibility criteria and ecological quantification metrics needed to assess conservation and designing protocols to ensure conservation credits developed through exchanges meet USFWS requirements on durability, transparency, and accountability (Federal Register 2016; WCE 2017). Developing the market structure through which credit trading will ultimately occur has received less attention. Yet market design is crucial to program success, as the quantity of habitat created and traded depends on interactions between market design and the risks faced by conservation providers in several key ways.

First, habitat exchange transactions are likely to be bilaterally negotiated between interested buyers and sellers (through private negotiation) rather than auction-based (a more commonly modeled conservation market structure), especially if regulatory agencies require that compensatory mitigation be proximate to the disturbance. This market institution creates the potential for a phenomenon termed matching risk. Although matching risk has not been addressed in the literature on conservation program design, it has been studied in the context of conventional agricultural markets. Private negotiation creates the risk that participants cannot find a willing trading partner at the time they desire to trade, since negotiating privately makes it difficult for buyers and sellers to find one another (Menkhaus et al. 2007), especially in geographically constrained or thin markets. In habitat exchanges this risk may limit the quantity of conservation produced and traded.

Second, federal and state agencies with authority over compensatory mitigation generally require that only verified conservation can be traded through habitat exchanges (BLM 2016; Federal Register 2016; SGI 2017; USFWS 2003). As a consequence, sellers must generate conservation before they have found a buyer or negotiated a price. This feature is in contrast to traditional Natural Resources Conservation Service conservation programs through which landowners are paid for practices (for example, cheatgrass removal) regardless of whether such practices yield measurable habitat improvements (Hansen et al. 2017). The resulting exposure is comparable to the risk agricultural producers face in advance production markets, where sellers incur production costs prior to sale and cannot recover costs on unsold or discounted production. This risk is called inventory loss risk (Menkhaus et al. 2003a).

Inventory loss risk hinders seller bargaining power. Sellers also cut back on production when they risk unsold inventory. The risk of not matching with a willing buyer to trade all units produced (present in the private negotiation trading institution) compounds potential seller losses from unsold inventory. Buyers are consequently able to purchase units for reduced prices (Menkhaus et al. 2003b). As a result, privately negotiated prices and quantity traded under an advance production delivery method are lower than under a production-to-demand delivery method, in which sellers only initiate production after a sales contract is in place (Menkhaus et al. 2003a, b). Jones and Vossler (2014) similarly find that the upfront investment required to produce water quality abatement credits when regulators require sellers to commit to production levels before finding a buyer and negotiating price reduces production.

In traditional agricultural markets, this inventory loss risk stems from the possibility that existing inventory might not be sold. In habitat markets, inventory loss risk is exacerbated by the fact that conservation practices do not always result in verifiable conservation. This is particularly relevant for greater sage-grouse habitat located in semi-arid locations in the western U.S., as new vegetation is typically difficult to establish and grow in this region. Invasive species such as cheatgrass may flourish, reducing establishment and growth of desired forage species. For this reason as well, sellers may reduce production and trade at reduced prices (to recover at least some production costs).Footnote 4

Third, federal and state agencies generally require that conservation traded through habitat exchanges be monitored and maintained for the life of the credit (BLM 2016; Federal Register 2016; SGI 2017; USFWS 2003). This durability requirement also places credit failure risk (in addition to matching and inventory loss risks) on sellers. Failure risk is the post-production risk that verified conservation credits fail to maintain habitat quality over their contract life due to events outside the control of landowners. These conservation credits are anticipated to be traded in contracts between 20 and 50 years in length, with periodic monitoring protocols in place to ensure satisfactory maintenance of the credits for the duration of the contract. When credits fail before the end of their contract (perhaps due to climate conditions or wildfire), sellers bear the cost of failed credits. This failure risk is likely to reduce significantly the quantity of conservation credits supplied to the market, and credit price is likely to be higher. A relatively simple risk mitigation strategy—requiring conservation credit buyers (those seeking to offset development impacts) to reimburse sellers the costs they incur to produce failed credits—may mitigate the influence of failure risk, thereby positively affecting market outcomes associated with conservation credits supplied and traded.

Little data exists on market outcomes (quantity traded, price, overall earnings, distribution of earnings between buyers and sellers) for conservation markets. Mitigation and conservation bank pricing information is proprietary. Credit trading through banks also tends to be thin, as regulators often require compensatory mitigation offsets to be located close to the disturbance (Hansen et al. 2017). Once habitat exchanges are operational, they are likely to be just as thin as banks, with similar limitations on access to trading data. This lack of data makes traditional econometric analyses impossible. Thus, to achieve our research objective, we design and conduct laboratory market experiments to test three propositions:

  • \(P_{1}\): Delivery method (whether advance production or production-to-demand in which sellers only initiate production after a sales contract is in place) affects market outcomes.

  • \(P_{2}\): The post-production risk of credit failure affects market outcomes.

  • \(P_{3}\): A private party risk mitigation strategy of seller cost reimbursement by buyers affects market outcomes.

3 Experimental Design and Laboratory Procedures

Trades in the laboratory market were negotiated between buyer–seller pairs submitting bids and offers over a computer network. Each experimental market session consisted of four buyers and four sellers. Buyers and sellers were randomly matched into four buyer–seller pairs to negotiate trades. They traded a generic commodity, or “unit,” in a currency called “tokens,” based on the seller production costs and buyer redemption values for each successive unit provided to them (Table 1). To motivate preference revelation, participant earnings were converted to dollars and paid out in cash at the end of each session (Friedman and Sunder 1994)Footnote 5 at an exchange rate of 100 tokens for $1.00.

Table 1 Per-unit buyer redemption values and seller production costs (tokens)

Experimental sessions followed a standard procedure. All participants were paid a show-up fee of $15.Footnote 6 Participants were randomly assigned to be either buyers or sellers at the start of each session and retained their roles throughout the session. Participants knew only their own role as a buyer or seller and not the role or identity of other participants. Participants were spaced and positioned at computer stations to ensure that participants’ roles, randomly matched trading partners, and decisions remained confidential.

Instructions (see “Appendix A”) outlining the basic market design, trading mechanics, and how earnings were calculated and paid out were presented to participants. Figure 1 illustrates the timeline of a trading period. Prior to trading, buyers were shown redemption values for the eight units they could purchase and sellers were shown production costs for the eight units they could produce and sell. Participants saw only their own value or cost schedule. In the advance production treatments, sellers made a decision about how many units to produce while buyers waited. In the production-to-demand treatments, the credits produced equaled the credits traded.

Fig. 1 Trading period timeline with market delivery and risk treatments

One or more practice periods were conducted, depending on participants’ comfort with the computer program and willingness to transition to the actual experiment. The actual experiment began when participants reported being comfortable with the mechanics of trading. The actual experiment used different production cost and redemption value schedules than the practice periods. Each market session continued for a minimum of 20 trading periods with a random stop invoked after period 20.Footnote 7

Each trading period was divided into three 1-min bargaining rounds. Buyers and sellers were randomly matched by the computer program into four trading pairs. Units were traded one at a time, starting with the first units on participants’ redemption value and production cost schedules. Each pair bilaterally bargained over price via the computer. At the end of each 1-min round, the computer re-matched participants into bargaining pairs through stranger matching (Menkhaus et al. 2007) and trading resumed.Footnote 8 A trading screen also showed values and costs for each unit as well as current bids and offers. As trading proceeded, buyers and sellers knew only their own bids or offers and those of their current trading partner. Participants were matched three times to negotiate trades on one set of up to eight units during each trading period. Paired trading over three rounds mimics matching risk in thin or geographically constrained private negotiation markets (Menkhaus et al. 2007). If a participant was matched in a later round with a participant who had already traded their units, they did not have the opportunity to trade units in that round. At the end of each trading period, a re-cap screen privately displayed each participant’s earnings from the current period and cumulative earnings so far for the session. It was at this point that participants learned whether some of the units they had traded in the current period had failed.
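The re-matching procedure described above can be sketched as follows. This is a minimal illustration, assuming simple random re-matching of four buyers and four sellers in each of the three bargaining rounds; whether repeat partners were excluded is not stated in the text, so the sketch does not exclude them. The function names and participant labels are hypothetical and do not come from the experimental software.

```python
import random


def match_round(buyers, sellers, rng):
    """Randomly pair each buyer with exactly one seller for one bargaining round."""
    shuffled = sellers[:]  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return list(zip(buyers, shuffled))


def match_period(buyers, sellers, rounds=3, seed=None):
    """Return one list of buyer-seller pairs per bargaining round in a trading period."""
    rng = random.Random(seed)
    return [match_round(buyers, sellers, rng) for _ in range(rounds)]


# Hypothetical participant labels for one eight-person session.
buyers = ["B1", "B2", "B3", "B4"]
sellers = ["S1", "S2", "S3", "S4"]

for round_number, pairs in enumerate(match_period(buyers, sellers, seed=0), start=1):
    print(round_number, pairs)
```

Because each participant holds one schedule of up to eight units across all three rounds, a partner drawn in a later round may already have exhausted their units, which is how the design generates matching risk.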

We construct six treatments in this experimental market setting to test the stated propositions using a between-subjects design (Charness et al. 2012):

  1. Production-to-demand market without failure risk (PTDBase). Only matching risk is present.

  2. Advance production market without failure risk (APBase). Adds inventory loss risk to PTDBase.

  3. Production-to-demand market with failure risk (PTDRisk). Adds failure risk to PTDBase.

  4. Advance production market with failure risk (APRisk). Adds failure risk to APBase.

  5. Production-to-demand market with failure risk and seller cost reimbursement (PTDRiskSCReimb). Adds seller cost reimbursement (where buyers reimburse sellers for the costs of producing failed credits) to PTDRisk.

  6. Advance production market with failure risk and seller cost reimbursement (APRiskSCReimb). Adds seller cost reimbursement to APRisk.

Earnings in PTDBase and APBase are calculated as follows. Profit for the buyers is given by the following:

$$\pi = R\left( q_{t} \right) - pq_{t} ,$$
(1)

where \(R\) is the value or revenue generated from purchasing units of conservation and \(q_{t}\) is units traded. The individual buyer’s demand curve for conservation is \(MR = R'(q_{t})\).

Profit for sellers is given by the following:

$$\pi = pq_{t} - C\left( q_{p} \right),$$
(2)

where \(p\) is the agreed-upon transaction price for conservation units, \(q_{t}\) is units traded, and \(C(q_{p})\) is the total cost of producing \(q_{p}\) conservation units. In production-to-demand markets (denoted with superscript \(pd\)), \(q_{t}^{pd} = q_{p}^{pd}\), and so the supply curve is \(MC^{pd} = C'(q_{t}^{pd})\).

In advance production markets (denoted with superscript \(ap\)), \(q_{t}^{ap} \le q_{p}^{ap}\), since the quantity of conservation produced is decided prior to trading and sellers may not sell all units they produce. The individual seller’s supply curve for conservation in advance production markets is thus \(MC^{ap} = C'(q_{p}^{ap})\). This inventory loss risk has been found to significantly decrease the quantity that sellers are willing to supply (Menkhaus et al. 2003a, b).

The first order market-clearing conditions require the following:

$${\text{MR}} = {\text{p}} = {\text{MC}}.$$
(3)

Given that \(q_{t}^{ap} \le q_{p}^{ap}\) and that marginal cost is nondecreasing in quantity, \(C'(q_{t}^{ap}) \le C'(q_{p}^{ap})\), so agents behave as if \(MC^{pd} \le MC^{ap}\). We thus expect \(q_{t}^{ap} \le q_{t}^{pd}\) at equilibrium.

In the absence of market risks, the buyer and seller schedules result in a competitive market equilibrium price of 80 tokens and an equilibrium quantity of 20 units (five trades per buyer/seller). This suggests buyer and seller earnings at equilibrium of 150 tokens and total earnings (sum of buyer and seller earnings) of 1200 tokens per trading period (Table 2). Buyer redemption values, seller production costs, and the resulting equilibrium values were selected to be consistent with past market experiment research rather than to replicate realistic values for habitat exchanges, which are generally not publicly available. Results reported below thus indicate generalized market outcomes rather than outcomes with values specific to habitat exchanges.
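The equilibrium earnings quoted above can be checked with a short calculation. The sketch below is illustrative only: Table 1 holds the actual schedules, and the values used here are assumptions. The first and fourth redemption values (130 and 100) and the first four production costs (30, 40, 50, 60) are implied by figures quoted elsewhere in this section; the remaining entries assume 10-token steps, which happens to reproduce the reported earnings.

```python
# Assumed first five entries of each schedule (tokens); see Table 1 for actual values.
REDEMPTION_VALUES = [130, 120, 110, 100, 90]  # buyer value of units 1-5
PRODUCTION_COSTS = [30, 40, 50, 60, 70]       # seller cost of units 1-5

EQUILIBRIUM_PRICE = 80       # tokens, as reported in the text
TRADES_PER_PARTICIPANT = 5   # as reported in the text
BUYERS = SELLERS = 4         # participants per session

# Earnings are the per-unit surplus summed over the units traded at the equilibrium price.
buyer_earnings = sum(v - EQUILIBRIUM_PRICE for v in REDEMPTION_VALUES[:TRADES_PER_PARTICIPANT])
seller_earnings = sum(EQUILIBRIUM_PRICE - c for c in PRODUCTION_COSTS[:TRADES_PER_PARTICIPANT])

market_quantity = BUYERS * TRADES_PER_PARTICIPANT
total_earnings = BUYERS * buyer_earnings + SELLERS * seller_earnings

print(buyer_earnings, seller_earnings, market_quantity, total_earnings)
```

Under these assumed schedules, each buyer and each seller earns 150 tokens on five trades, giving 20 units traded and 1200 tokens of total earnings per period, in line with the figures reported for the risk-free benchmark.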

Table 2 Competitive equilibria across treatments

In the failure risk treatments PTDRisk and APRisk, the seller incurs the cost to produce a unit (before negotiating price in the advance production market or after negotiating the price in the production-to-demand market), but faces the risk that the unit will not maintain habitat quality for its contract life. We represent this risk as \(\eta_{t}\). A seller’s expected value of their supply schedule is now:

$$MC_{t}^{\prime } = MC_{t} + \eta_{t} .$$
(4)

When a unit fails, the seller cannot sell the unit and receives no revenue yet still incurs its production cost. We employ an average 25% unit failure rate (FR): in every period, 50% of sellers lose 50% of credits produced. Davies et al. (2011) conclude that restoration efforts in sagebrush communities with severe (rather than moderate) disturbances are more likely to fail. More aggressive, higher-cost habitat restoration activities are thus more likely to fail than less expensive ones, for which the land is already closer to meeting standards before restoration investment. We therefore model the units requiring more effort (and so higher production costs) as more likely to fail: the highest-cost units produced are deemed to be the failed units. The number of units that fail is rounded up. (A seller who trades three units and is randomly selected for half of those units to fail loses two units.)

Failure risk shifts the expected value of the supply schedule inward. (The demand schedule remains unchanged.) The decrease in supply is not a linear transformation of the original supply. The premium required to induce supply in the presence of failure risk is equal to \(\eta_{t}\), which is decreasing in the number of units traded. For example, if a seller trades only one unit without failure risk, \(MC_{1} = 30\). With failure risk, the seller must invest more than \(MC_{1}\) into production of the first unit to be assured of producing a non-failing unit:

$$MC_{1}^{\prime } = MC_{1} \left( {\frac{1}{1 - FR}} \right).$$
(5)

If \(MC_{1}\) is 30 and the failure rate is 50% (which it is for the first unit because of the rounding-up condition), the effective marginal cost of producing a non-failing unit is \(30/(1 - 0.50) = 60\). Thus \(MC_{1}^{\prime } = 60\) and \(\eta_{1} = 30\). On the other hand, if a seller trades three units, each of the last two units has a 50% chance of failing but the first unit will not fail in any event. In this situation, the total cost of producing three working units is \(30 + \frac{1}{1 - 0.5}(40 + 50) = 210\) tokens.Footnote 9 The same logic yields a total cost of 290 for a seller who trades four units, so the effective marginal cost for producing four working units is \(290 - 210 = 80\). Thus \(\eta_{4} = MC_{4}^{\prime } - MC_{4} = 20\). Incorporating \(\eta_{t}\) into the augmented expected value of the supply schedule results in a predicted equilibrium quantity traded of 16 and a predicted competitive equilibrium price of 100, generating a total surplus of 700 tokens (55 for each buyer and 120 for each seller) (Table 2).
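The arithmetic in this example can be reproduced with a short sketch. It assumes one reading of the failure rule that matches the figures in the text: for a seller trading n units, the highest-cost half of those units (rounded up) are exposed to failure, each exposed unit fails with probability 0.5, and securing a working exposed unit therefore costs its production cost divided by (1 − 0.5). The cost list contains only the first four unit costs implied by the text, and the function names are illustrative.

```python
import math

COSTS = [30, 40, 50, 60]   # first four unit production costs implied by the text (tokens)
SHARE_EXPOSED = 0.5        # share of a selected seller's units that fail (rounded up)
FAILURE_PROBABILITY = 0.5  # assumed chance that an exposed unit actually fails


def cost_of_working_units(n, costs=COSTS):
    """Total expected cost of securing n non-failing units."""
    exposed_count = math.ceil(n * SHARE_EXPOSED)  # highest-cost units are exposed
    safe = costs[: n - exposed_count]
    exposed = costs[n - exposed_count : n]
    return sum(safe) + sum(exposed) / (1 - FAILURE_PROBABILITY)


def effective_marginal_cost(n, costs=COSTS):
    """Extra cost of securing the n-th working unit."""
    return cost_of_working_units(n, costs) - cost_of_working_units(n - 1, costs)


def risk_premium(n, costs=COSTS):
    """eta_n: effective marginal cost minus the risk-free marginal cost of unit n."""
    return effective_marginal_cost(n, costs) - costs[n - 1]


print(cost_of_working_units(3), cost_of_working_units(4))  # 210.0 290.0
print(effective_marginal_cost(4), risk_premium(4))         # 80.0 20.0
```

The rounding-up rule is what makes the adjusted supply schedule a nonlinear transformation of the original: the number of exposed units jumps only at odd quantities, so the premium varies from unit to unit.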

In the reimbursement treatments PTDRiskSCReimb and APRiskSCReimb, the buyer receives no value from a unit that fails to maintain habitat quality for its contract life and also reimburses the seller for the costs incurred to produce the unit. Because a buyer may have purchased the failed unit anywhere in their demand schedule, the risk to the buyer, \(\gamma_{t}\), is not equal to \(\eta_{t}\). A buyer’s expected value of the demand schedule is now:

$$MR_{t}^{\prime } = MR_{t} - \gamma_{t} .$$
(6)

Although the seller’s highest cost units (those associated with the lowest profits) are the ones at risk to fail, buyers are at risk of having to reimburse sellers for the cost of a unit (and losing the revenue generated by that lost unit) anywhere on the buyer’s redemption schedule. Since the risk of failure emanates from the supply side, any unit that a buyer receives from a seller has a failure risk. For example, assume that a buyer trades only one unit. Without failure risk, a buyer’s value for the first unit purchased is 130: \(MR_{1} = 130\). With failure risk, there is a 25% chance this first unit will fail, leaving the buyer with no revenue from the unit and with the obligation to reimburse the seller for the cost of production: \(MR_{1}^{\prime } = 130 - 0.25(130 + 30) = 90\). Thus \(\gamma_{1} = 40\). When four units are traded, \(\gamma_{t}\) changes. Here the buyer faces the risk that any of the four units sold by the seller will fail: thus \(MR_{4}^{\prime } = 100 - 0.25(100 + 45) = 63.75\), where 45 is the average production cost of the seller’s first four units. Thus \(\gamma_{4} = 36.25\). Regardless of how many units are traded, \(\eta_{t}\) is lower than \(\gamma_{t}\). The market equilibrium price with seller cost reimbursement is 70 tokens and the equilibrium quantity traded is 16 units. Given the differences in these risks, seller cost reimbursement treatments generate a surplus of 660 tokens (80 for each buyer and 85 for each seller), which is less than the surplus of 700 from the failure risk treatments (Table 2).
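The two buyer-side figures in this example follow from one formula, sketched below. The function and variable names are illustrative; the inputs (redemption values of 130 and 100, unit costs of 30 through 60, and the 25% average failure rate) are the ones quoted in the text.

```python
FAILURE_RATE = 0.25                # average unit failure rate (FR)
SELLER_COSTS = [30, 40, 50, 60]    # first four unit production costs implied by the text


def average_cost(units, costs=SELLER_COSTS):
    """Average production cost of the seller's first `units` units."""
    return sum(costs[:units]) / units


def buyer_risk(redemption_value, units):
    """gamma_t: expected loss of the unit's value plus the expected reimbursement."""
    return FAILURE_RATE * (redemption_value + average_cost(units))


def adjusted_value(redemption_value, units):
    """MR'_t = MR_t - gamma_t."""
    return redemption_value - buyer_risk(redemption_value, units)


print(buyer_risk(130, 1), adjusted_value(130, 1))  # 40.0 90.0
print(buyer_risk(100, 4), adjusted_value(100, 4))  # 36.25 63.75
```

Because the reimbursement term uses the average cost of all units the seller has produced, the buyer's exposure depends on the seller's position on the cost schedule as well as on the buyer's own redemption value.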

4 Data Analysis

We compare treatments graphically and empirically using the asymptotic convergence model first employed in economics experiments by Noussair et al. (1995). Convergence analysis weights later periods more heavily than earlier periods to account for participant learning. Within the convergence specification,

$$Z_{it} = B_{0} \left[ {\frac{t - 1}{t}} \right] + B_{1} \left( {\frac{1}{t}} \right) + \mathop \sum \limits_{j = 1}^{{\left( {i - 1} \right)}} \alpha_{j} D_{j} \left[ {\frac{t - 1}{t}} \right] + \mathop \sum \limits_{{\left( {j = 1} \right)}}^{{\left( {i - 1} \right)}} \varGamma_{j} D_{j} \left( {\frac{1}{t}} \right) + u_{it} .$$
(7)

\(Z_{it}\) is the dependent variable of interest (quantity of conservation credits or units traded, units produced, average price, total earnings, buyer earnings, or seller earnings) for treatment cross-section i and trading period t. \(B_{0}\) is the asymptotic convergence value of the dependent variable for the baseline treatment, production-to-demand with no failure risk (PTDBase). \(B_{0}\) is weighted more heavily in later periods. \(B_{1}\) is the starting value of the dependent variable for the baseline treatment. The coefficients \(\alpha_{j}\) and \(\varGamma_{j}\) are, respectively, adjustments to the asymptote and starting level for each treatment relative to the baseline treatment. In the analysis these coefficients indicate the mean difference in the parameter of interest relative to the baseline treatment. The dummy variable \(D_{j}\) captures potential differences across treatments (equal to zero for the baseline treatment and one for the jth compared treatment), and \(u_{it}\) is an error term.
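The way the specification shifts weight from the starting value to the asymptote can be illustrated with a short sketch. It evaluates only the deterministic part of Eq. (7) for a single treatment and is not the estimation procedure. The function names are illustrative, and the starting value used in the example is hypothetical; the asymptote of 17.14 is the converged PTDBase quantity reported in the results.

```python
def convergence_weights(t):
    """Weights on the asymptote and on the starting value in trading period t (t >= 1)."""
    if t < 1:
        raise ValueError("trading periods are numbered from 1")
    return (t - 1) / t, 1 / t


def fitted_value(t, b0, b1, alpha=0.0, gamma=0.0):
    """Deterministic part of the convergence model for one treatment.

    b0, b1: baseline asymptote and starting value.
    alpha, gamma: treatment adjustments (zero for the baseline treatment).
    """
    w_asymptote, w_start = convergence_weights(t)
    return (b0 + alpha) * w_asymptote + (b1 + gamma) * w_start


B0 = 17.14  # converged quantity traded in PTDBase, from the results section
B1 = 15.0   # hypothetical starting value, for illustration only

for period in (1, 2, 5, 20):
    print(period, round(fitted_value(period, B0, B1), 2))
```

In period 1 the fitted value equals the starting value, and by period 20 the asymptote carries a weight of 0.95, which is why later periods dominate the estimated convergence levels.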

Panel datasets collected over multiple trading periods may be serially correlated and heteroskedastic. Data may also be contemporaneously correlated between treatment cross sections because the same unit values or costs are used across subjects. Panel-corrected standard error estimates and the Prais–Winsten transformation (assuming a common AR(1) coefficient across groups) were used to adjust for these statistical issues (STATA n.d.).

Adjusted treatment coefficients for market outcomes estimated in the convergence model are tested for statistical differences between treatments using a t test. Formal hypothesis testing on convergence analysis parameters is only valid if the residuals for each variable of interest are normally distributed. Using a Shapiro–Wilk test at a 95% confidence level, we fail to reject the null hypothesis that residuals are normally distributed for each variable of interest.

5 Results

A total of 248 subjects recruited at the University of Wyoming participated in 31 market sessions (Table 3).Footnote 10 Each market session generated 20 periods of data, for a total of 620 trading period averages across all market outcomes.Footnote 11 Participants earned an average of $40.54 for a 2-h time commitment.

Table 3 Summary of experimental treatments

To describe converged market outcomes over time, estimated converged asymptotes (\(B_{0}\)) are reported for the baseline treatment with additional adjustment coefficients (\(\alpha_{j}\)) for each test treatment (j) in Table 4.Footnote 12 (Treatment averages and their standard deviations are reported in “Appendix B”).

Table 4 Estimated baseline (PTDBase) asymptotes (\(B_{0}\)), and treatment adjustment coefficients (\(\alpha_{j}\)) (standard errors), [convergence levels (\(B_{0}\)  +  \(\alpha_{j}\))], and market efficiency (converged total earnings/predicted total earnings) for market outcomes

5.1 Impacts of Failure Risk and Seller Cost Reimbursement on Quantity of Conservation Traded

The risk of matching buyers and sellers at different points in trading schedules has been found to reduce quantity traded in privately negotiated markets relative to the predicted competitive equilibrium (Menkhaus et al. 2003a). Our results are consistent with this finding; the converged value for quantity traded in PTDBase, the treatment without inventory loss risk or failure risk, is at 17.14 units (Table 4, column 1), below the competitive equilibrium of 20. Quantity traded in PTDBase is, however, significantly higher than in any other treatment. With the addition of inventory loss risk, quantity traded converges to a significantly lower number of units traded in APBase (13.96 units) than it does in PTDBase. This result of fewer units traded in advance production markets is also consistent with previous research (Menkhaus et al. 2003a).

The four failure risk treatments are most relevant to habitat exchange market design. Fewer units are traded relative to PTDBase and APBase when sellers bear the risk of unit failure (APRisk converges at 13.33 units and PTDRisk converges at 11.17). However, when the risk of unit failure is transferred to the buyers, quantity traded increases substantially (APRiskSCReimb converges at 16.20 units and PTDRiskSCReimb converges at 15.28 units), even to the point of surpassing quantity traded in APBase (13.96). This is remarkable, as the predicted equilibrium quantity is the same (16) for both the failure and seller cost reimbursement treatments. The seller cost reimbursement mechanism is designed to protect sellers from credit failure risk, but it also acts to reduce sellers’ perceived risk of inventory loss, encouraging them to increase quantity. By reducing the risk borne by sellers, the seller cost reimbursement treatments encourage more conservation trades overall.

Delivery method creates some notable differences between treatments in quantity traded. Failure risk reduces quantity traded substantially in the production-to-demand treatments (Fig. 2, left side); in the absence of inventory loss risk, the effect of failure risk is significant. (Quantity traded in PTDRisk is about six units below PTDBase; seller cost reimbursement in PTDRiskSCReimb raises quantity traded to about two units below PTDBase.) The addition of failure risk has a less dramatic effect on quantity traded in the advance production treatments, however (Fig. 2, right side). (Quantity traded in APRisk is less than one unit below APBase; quantity traded in APRiskSCReimb is higher than in APBase, by about two units.) The most likely reason for this modest decrease is that sellers in APBase already face inventory loss risk, which is a sufficiently strong motivator to decrease quantity that adding failure risk in APRisk has a relatively small further effect. Sellers who bear inventory loss risk under advance production may choose to sell units at a loss rather than not sell the units. Reimbursement thus helps mitigate inventory loss risk as well as failure risk.

Fig. 2 Quantity of conservation credits traded

Statistical tests show significant differences between all treatments for quantities traded. These results suggest delivery method, failure risk, and seller cost reimbursement significantly affect quantities traded and provide support for our propositions.
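The unit differences quoted in this subsection can be reproduced directly from the converged quantities. The following is a minimal Python sketch of that arithmetic, using only the values reported in the text; the dictionary and function names are illustrative assumptions, not part of the experimental software.

```python
# Converged quantities traded (units), as reported in Section 5.1.
CONVERGED_QUANTITY = {
    "PTDBase": 17.14,
    "APBase": 13.96,
    "PTDRisk": 11.17,
    "APRisk": 13.33,
    "PTDRiskSCReimb": 15.28,
    "APRiskSCReimb": 16.20,
}

COMPETITIVE_EQUILIBRIUM = 20  # predicted quantity for the base treatments
FAILURE_EQUILIBRIUM = 16      # predicted quantity for failure and reimbursement treatments


def gap(treatment: str, reference: str) -> float:
    """Difference in converged quantity between a treatment and a reference treatment."""
    return round(CONVERGED_QUANTITY[treatment] - CONVERGED_QUANTITY[reference], 2)


# Compare each risk treatment with the base treatment sharing its delivery method.
for treatment, reference in [
    ("PTDRisk", "PTDBase"),
    ("PTDRiskSCReimb", "PTDBase"),
    ("APRisk", "APBase"),
    ("APRiskSCReimb", "APBase"),
]:
    print(f"{treatment} vs {reference}: {gap(treatment, reference):+.2f} units")
```

A negative value means fewer units were traded than in the reference treatment. Only APRiskSCReimb lies above its base treatment.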

5.2 Impacts on Conservation Credit Prices

Prices are generally higher for the failure risk treatments relative to the comparable base treatments. This change is due to the constrained supply of conservation credits available in the market (as predicted) and reflects the need for sellers to receive higher prices to cover the risks they bear. Prices nevertheless remain below the predicted price of 100 in the failure treatments (Fig. 3). Price is highest in PTDRisk (92.67) and much lower in APRisk (81.22), reflecting the reduced bargaining power that sellers have when they bear both inventory loss risk and failure risk (Table 4, column 2).Footnote 13 Inventory loss risk associated with advance production allows buyers to exert downward pressure on prices, which are lower in APBase (74.80) than in PTDBase (80.94) and lower in APRisk than in PTDRisk (as noted above).

Fig. 3 Conservation credit price

Prices in the seller cost reimbursement treatments generally come very close to the predicted equilibrium of 70 tokens (Fig. 3); converged prices are 68.13 for PTDRiskSCReimb and 69.31 for APRiskSCReimb. As expected, when buyers reimburse sellers for failed units, demand shifts downward relative to the comparable base treatments. But sellers also produce and trade more units. The net effect is lower prices in the presence of seller cost reimbursement relative to the failure risk treatments. These results support our proposition; delivery method and failure risk do affect exchange outcomes.

5.3 Earnings and Efficiency Changes from Failure Risk and Risk-Sharing

Total earnings, or total surplus, are a measure of market efficiency. Total earnings (the sum of all buyer and seller earnings within a market session) are noticeably lower in the four treatments with failure risk than they are in PTDBase (1074.06 tokens) and APBase (1001.20 tokens) (Table 4, column 3). The difference is due to the loss in production costs (borne by sellers in the failure treatments and by buyers in the reimbursement treatments) on failed units. Total earnings for the risk treatments PTDRiskSCReimb (545.76 tokens), PTDRisk (510.16 tokens), and APRisk (488.06 tokens) are not statistically significantly different from one another. APRiskSCReimb (647.60 tokens) does, however, result in higher total earnings than the other three treatments with failure risk. We return to this result below.

Total earnings in PTDBase and APBase are 90% and 83%, respectively, of the 1200-token surplus predicted for the base treatments. In PTDBase, the lower total earnings are due to matching risk. In APBase, the lower earnings are due to matching risk and inventory loss risk (Menkhaus et al. 2003a).

Failure risk reduces predicted quantity traded well below the equilibrium predicted for the comparable base treatments, from 20 to 16 units; as such the predicted total surplus is greatly reduced, from 1200 to 700 tokens. It is appropriate to compare experimental results for the failure risk treatments to the lower prediction, as this represents the maximum surplus available given the risky environment. Total earnings in PTDRisk and APRisk are only 73% and 70% of the maximum surplus available in the failure risk treatments (Fig. 4). These efficiency percentages are lower than those for the comparable base treatments; buyers and sellers have more difficulty realizing surplus in the presence of failure risk, even controlling for the lower maximum surplus available. Overall market efficiency is greatly reduced in the presence of failure risk.

Fig. 4 Market efficiency as a percentage of predicted equilibria

Seller cost reimbursement was predicted to decrease total earnings relative to the treatments in which sellers bore the risk of credit failure. However, our experimental results do not support this theoretical prediction. Introducing seller cost reimbursement increases quantity traded, expands total earnings, and improves market efficiency overall compared to the presence of credit failure risk alone. The efficiency gains are substantial: total earnings rise to 545.76 tokens in PTDRiskSCReimb, reaching 83% of the predicted equilibrium of 660 tokens, and to 647.60 tokens (or 98%) in APRiskSCReimb (Fig. 4). In short, seller cost reimbursement improves efficiency by 10–28 percentage points relative to the PTDRisk and APRisk treatments. Seller cost reimbursement appears to mitigate both inventory loss and failure risk effects in APRiskSCReimb significantly, as total earnings are 15 percentage points closer to the predicted equilibrium than they are in PTDRiskSCReimb. This result is surprising, given that PTDRiskSCReimb appears at first blush to possess less risk overall compared to APRiskSCReimb.
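The efficiency percentages in this subsection are ratios of converged total earnings to the predicted surplus for the relevant environment (1200 tokens for the base treatments, 700 for the failure risk treatments, 660 for the reimbursement treatments). A minimal Python sketch of that calculation follows, using only values reported in the text; the names are illustrative assumptions.

```python
# Converged total earnings (tokens), as reported in Section 5.3.
TOTAL_EARNINGS = {
    "PTDBase": 1074.06,
    "APBase": 1001.20,
    "PTDRisk": 510.16,
    "APRisk": 488.06,
    "PTDRiskSCReimb": 545.76,
    "APRiskSCReimb": 647.60,
}

# Predicted total surplus (tokens) that each treatment is compared against.
PREDICTED_SURPLUS = {
    "PTDBase": 1200.0,
    "APBase": 1200.0,
    "PTDRisk": 700.0,
    "APRisk": 700.0,
    "PTDRiskSCReimb": 660.0,
    "APRiskSCReimb": 660.0,
}


def efficiency_pct(treatment: str) -> float:
    """Converged total earnings as a percentage of predicted surplus."""
    return 100.0 * TOTAL_EARNINGS[treatment] / PREDICTED_SURPLUS[treatment]


def point_gain(treatment: str, reference: str) -> float:
    """Efficiency difference between two treatments, in percentage points."""
    return efficiency_pct(treatment) - efficiency_pct(reference)


for name in TOTAL_EARNINGS:
    print(f"{name}: {efficiency_pct(name):.1f}% of predicted surplus")

print(f"PTD gain from reimbursement: {point_gain('PTDRiskSCReimb', 'PTDRisk'):.1f} points")
print(f"AP gain from reimbursement: {point_gain('APRiskSCReimb', 'APRisk'):.1f} points")
```

Because each treatment is scaled by its own predicted surplus, the comparison controls for the smaller maximum surplus available once failure risk is introduced.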

To explain the APRiskSCReimb results, we turn to our experimental design: recall that the credits lost to failure risk are those with the highest production costs. A seller in APRiskSCReimb knows that the highest production costs incurred on failed credits will be reimbursed, and at the same time seeks to maximize profit on credits sold to make up for any inventory loss. Thus, such a seller achieves earnings equal to or, as in our results, higher than those in PTDRiskSCReimb by trading with buyers at volumes and prices slightly higher than in PTDRiskSCReimb. These results point to the overall benefits of seller cost reimbursement and support our final proposition.

5.4 Impacts of Failure Risk and Risk-Sharing on Agent Incentives to Participate

As expected, seller earnings are highest in PTDBase (138.96 tokens) and APBase (107.26 tokens) (Fig. 5; Table 4, column 4). Once failure risk is introduced, sellers supply far fewer units to the market. In PTDRisk, absent inventory loss risk, sellers are able to receive significantly higher prices than in APRisk, mitigating some of the loss in earnings associated with fewer trades. Sellers on average receive 69.52 tokens in PTDRisk, about half of what they receive in PTDBase. Seller earnings are reduced even more dramatically in APRisk, where sellers receive only about one-third of their APBase earnings (37.65 tokens). This reduction again stems from reduced supply, and it is compounded by lower prices due to the reduced bargaining power of sellers in the advance production market. These dramatic changes in seller earnings point to the potential impact failure risk may have on the incentive for landowners to participate in these types of markets. Seller cost reimbursement, however, greatly improves seller outcomes in the advance production market when failure risk is present: earnings improve from 37.65 tokens in APRisk to 84.16 tokens in APRiskSCReimb.

Fig. 5 Average buyer and seller earnings (tokens)

Buyer earnings are highest in APBase and only somewhat lower in PTDBase (converging at 143.40 and 129.24 tokens, respectively) (Fig. 5; Table 4, column 5). This mirrors earlier findings that advance production (relative to production-to-demand) improves buyer earnings (Menkhaus et al. 2003a). Our result that buyer earnings are higher in APRisk than in PTDRisk (84.31 and 57.45, respectively) is also consistent with these earlier findings. But with the addition of seller cost reimbursement, advance production no longer significantly improves buyer earnings over the comparable production-to-demand treatment, PTDRiskSCReimb. Further, buyer earnings are not harmed when buyers reimburse sellers for failed units. (Buyer earnings are not significantly different in APRiskSCReimb and APRisk, and not significantly different in PTDRiskSCReimb and PTDRisk.) In general, mitigating failure risk for sellers through seller cost reimbursement helps sellers yet does not hurt buyers. Seller cost reimbursement has the added benefit, valued by regulators, of increasing the quantity of conservation provided and traded.
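The earnings comparisons in this subsection reduce to simple ratios of converged average earnings. The Python sketch below reproduces them from the values quoted in the text; the names are illustrative assumptions, and treatments whose earnings are not reported in this section are omitted rather than guessed.

```python
# Converged average earnings per agent (tokens), as quoted in Section 5.4.
SELLER_EARNINGS = {
    "PTDBase": 138.96,
    "APBase": 107.26,
    "PTDRisk": 69.52,
    "APRisk": 37.65,
    "APRiskSCReimb": 84.16,
}
BUYER_EARNINGS = {
    "PTDBase": 129.24,
    "APBase": 143.40,
    "PTDRisk": 57.45,
    "APRisk": 84.31,
}


def share_retained(earnings: dict, treatment: str, baseline: str) -> float:
    """Earnings in a treatment as a fraction of earnings in a baseline treatment."""
    return earnings[treatment] / earnings[baseline]


# Sellers: fraction of base-treatment earnings kept once failure risk is added.
print(f"Sellers, PTDRisk/PTDBase: {share_retained(SELLER_EARNINGS, 'PTDRisk', 'PTDBase'):.2f}")
print(f"Sellers, APRisk/APBase: {share_retained(SELLER_EARNINGS, 'APRisk', 'APBase'):.2f}")

# Sellers: effect of reimbursement in the advance production market.
print(f"Sellers, APRiskSCReimb/APRisk: {share_retained(SELLER_EARNINGS, 'APRiskSCReimb', 'APRisk'):.2f}")

# Buyers: the same comparison under failure risk, by delivery method.
print(f"Buyers, PTDRisk/PTDBase: {share_retained(BUYER_EARNINGS, 'PTDRisk', 'PTDBase'):.2f}")
print(f"Buyers, APRisk/APBase: {share_retained(BUYER_EARNINGS, 'APRisk', 'APBase'):.2f}")
```

These ratios describe converged averages only. Statements about statistical significance in the text rest on the estimated model behind Table 4, not on this arithmetic.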

6 Conclusion

Seller earnings must be sufficiently high to attract landowner participation if habitat exchanges are to be successful. The expected value of conservation credits must exceed habitat restoration costs and the opportunity cost of putting the associated land to agricultural purpose. However, generating, verifying, and maintaining high-quality habitat can be a risky proposition—and therefore expensive—in the semi-arid conditions that persist in much of the western U.S. Finding ways to reduce seller exposure to some of these risks could help increase conservation credit transactions in habitat exchanges. This study uses laboratory experiments to examine three principal risks present in habitat exchanges: (1) matching risk inherent in the private negotiation trading institution likely to prevail in habitat exchanges; (2) inventory loss risk stemming from the requirement that sellers produce and verify credits before finding a buyer and negotiating price; and (3) the post-production risk that credits fail to maintain habitat quality during their contract life. We find several key results relevant to the establishment of successful habitat exchanges.

First, requiring buyers to reimburse sellers for credits that fail post-production improves market outcomes, even for buyers, because sellers produce and trade more units when this post-production risk is mitigated. Seller cost reimbursement has some potential to improve welfare and overall market efficiency, which in turn increases buyer earnings through the higher quantities traded. This is particularly true if landowners also face the inventory loss risk associated with advance production, though the same result holds even if the inventory loss risk associated with advance production is absent.

Second, experiment outcomes in the advance production treatment with seller cost reimbursement were closer to predicted outcomes than were those in the other risk treatments. The seller cost reimbursement mechanism is designed to protect sellers from post-production failure risk, but it also appears to reduce sellers’ perceived risk of inventory loss under advance production, encouraging them to increase quantity. This result is unexpected and points to the necessity of performing such experiments rather than relying solely on theoretical calculations to inform market design. It is also an important finding, because regulators prefer that conservation credits traded through habitat exchanges be verified, making inventory loss risk an unavoidable feature of habitat exchanges. This finding should also be of interest to regulators and environmental NGOs interested in increasing habitat. Inventory loss creates a cost to sellers, but unsold conservation credits still have ecological value on the landscape.

Although the results clearly indicate the benefits of having buyers reimburse sellers for failure risk, none of the habitat exchanges currently in development contemplates seller cost reimbursement. Broadly speaking, energy companies prefer to fulfill their compensatory mitigation requirements upfront rather than having a long-term relationship with a conservation credit seller (Hansen et al. 2015). Thus, it is unlikely that companies would reimburse landowners without a regulatory incentive to do so. Yet our results suggest that, in certain instances, buyers could be better off if they are required to reimburse landowners for failed credits.

It is also interesting to note that most western U.S. states developing sage-grouse habitat exchanges have established seed funds that will be used to reimburse sellers when management practices undertaken do not result in habitat improvements. This study demonstrates that expanded use of these seed funds to address post-production credit failure risk in addition to credit production risk could be beneficial for increasing market volume and establishing a successful institution. If states or federal entities are unable to sustain some type of fund for reimbursing landowners (sellers), our results suggest an alternate strategy that might have merit would be to require energy companies (buyers) to reimburse landowners. This type of risk sharing is not uncommon in other arrangements familiar to landowners. For example, agricultural land lease arrangements often allow for both landlords and tenants to share in risks (Bastian and Olson 1991).

Even without seller cost reimbursement or some other mechanism in place to assist sellers with failure risk, results show the significant impact that failure risk can have on these markets. Our failure rate was chosen to elicit a response rather than to replicate current habitat restoration rates. Future treatments could explore whether different failure rates affect market outcomes linearly, as would be consistent with risk-neutral agents under expected utility theory, or nonlinearly, as would be consistent with risk-averse agents under expected utility or prospect theory.
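One way to picture the distinction between linear and nonlinear responses to the failure rate is to compare how the value of a single credit to a seller changes with that rate under two sets of preferences. The Python sketch below is purely illustrative. The payoff structure (the seller pays the production cost up front and receives the price only if the credit survives), the constant absolute risk aversion (CARA) utility function, and all parameter values are assumptions chosen for exposition, not features of the experiment.

```python
import math


def expected_value(price: float, cost: float, fail_prob: float) -> float:
    """Risk-neutral value of one credit: linear in the failure probability."""
    return (1.0 - fail_prob) * price - cost


def cara_certainty_equivalent(price: float, cost: float, fail_prob: float,
                              risk_aversion: float) -> float:
    """Certainty equivalent of the same gamble under CARA utility u(x) = -exp(-a*x)."""
    survive = math.exp(-risk_aversion * (price - cost))  # payoff is price - cost
    fail = math.exp(-risk_aversion * (-cost))            # payoff is -cost
    expected_exp = (1.0 - fail_prob) * survive + fail_prob * fail
    return -math.log(expected_exp) / risk_aversion


# Hypothetical parameters, not taken from the experimental schedules.
PRICE, COST, RISK_AVERSION = 90.0, 40.0, 0.02
FAIL_PROBS = [0.0, 0.1, 0.2, 0.3, 0.4]

ev = [expected_value(PRICE, COST, f) for f in FAIL_PROBS]
ce = [cara_certainty_equivalent(PRICE, COST, f, RISK_AVERSION) for f in FAIL_PROBS]

for f, v, c in zip(FAIL_PROBS, ev, ce):
    print(f"failure rate {f:.1f}: risk-neutral value {v:6.2f}, CARA certainty equivalent {c:6.2f}")
```

Under these assumptions the risk-neutral value falls by the same amount for each equal step in the failure rate, while the risk-averse certainty equivalent falls fastest for the first step. Treatments that vary the failure rate could in principle distinguish between such patterns.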

Although Nagler et al. (2013) found no treatment effect differences between sessions conducted with students and those conducted with agricultural professionals, their study did not consider failure risk. Future treatments could include agricultural producers and other business people as participants, to identify whether the anomalies identified in this study also occur with agricultural professionals.

Demand for compensatory mitigation is driven by regulatory requirements and may consequently be relatively inelastic. Regulators, however, generally allow developers to choose from among several mitigation programs, including conservation banks, in-lieu fee programs, and proponent-funded reclamation. The availability of close substitutes for compensatory mitigation programs (or where alternatives do not currently exist, the threat of close substitutes) serves to keep habitat exchange demand more elastic than it would otherwise be. These conditions suggest the potential for different supply and demand conditions than exist in our laboratory setting. Future research could alter the schedules used here to test the impact of differing supply and demand conditions on market outcomes given failure risk. While the magnitudes of outcomes might be altered compared to our results, we expect our findings regarding the importance of potential reimbursement for landowners would be relatively robust across a potential range of supply and demand conditions. Overall, we believe that without some mechanism to mitigate failure risk, habitat exchanges, at the very least, could result in much less conservation than hoped for, and, in the extreme, habitat exchanges could fail to attract sufficient landowners to supply credits, ultimately dooming them to fail.