Introduction

Today, there is an extensive and growing body of empirical evidence that ecosystems can undergo sudden changes and flip from one stable state to another (Steffen et al. 2004). Such shifts are due to positive feedbacks in the systems, which also imply that a new state can become highly robust; sometimes the change may even be irreversible (Carpenter 2003; Scheffer et al. 2001).Footnote 1

The importance of this feature for optimal management of ecosystems is increasingly being recognized (e.g., Dasgupta and Mäler 2003; Brock and Starrett 2003; Mäler et al. 2003; Crépin and Lindahl 2009). However, as far as we understand, little or no effort has been made to analyze the implications of this feature for the provision of public goods, such as environmental improvement.

Consider for example the following scenario. Suppose there has been a shift in an ecosystem to a degraded state and that the new state is very robust. In fact, the degradation is reversible only with a certain probability. Simultaneously, there is a discussion whether or not efforts should be made to try to restore this ecosystem. To elicit people’s preferences toward such a potential improvement one may rely on a stated preference method (Freeman 2003) such as the contingent valuation method (CVM), which is widely used for eliciting willingness to pay for public goods. Within such a setting, the validity of the responses will very much depend on how well people understand the nature of this type of uncertainty, i.e. potential irreversibility.

Given the growing empirical evidence that such problems can be expected to appear more frequently (see footnote 1), we believe it is important to analyze how people respond to this type of uncertainty, not least for the future design and interpretation of CVM studies involving uncertain provision of public goods.

In light of this, the overall aim of this paper is to analyze people’s preferences for environmental protection with respect to program success. More specifically, how do people respond to questions about environmental protection when prospects for success are uncertain?

It is important to note, though, that the decision problem described is more general and can also arise for other reasons. The success of an environmental protection program can, for example, hinge on the efficiency of policy implementation in terms of responses from landowners and industry, or on the understanding of the causes of the environmental problem.

Our case study is the highly vulnerable and disturbed ecosystem of the Baltic Sea. Recent research shows that it is uncertain whether a healthy state of the Baltic Sea can be recovered; there is a risk that the ecosystem cannot be restored, regardless of measures taken (Swedish Environmental Advisory Council 2005).

We follow the stated preference approach and design a questionnaire where randomly selected individuals are asked whether they would be willing to pay something for a program with the purpose of improving environmental quality where the program is characterized by success uncertainty, meaning that the program will have the intended effect only with a certain probability.

We use two procedures for introducing this type of uncertainty: a between-sample and a within-sample design. In the former case, each respondent is asked whether they would contribute something for a given probability of success. Different respondents are randomly assigned different probabilities, and we then compare responses across five levels of uncertainty. In the latter case, each respondent is asked whether they would contribute something for each of five different probabilities of success.

Our design resembles tests of scope sensitivity of the CVM. These tests try to find out whether CVM results show sensitivity to variations in quantity and quality of the good being valued. Scope sensitivity has been the focus of much debate.Footnote 2 Ever since the distinguished NOAA panel (Arrow et al. 1993) recommended, inter alia, scope tests, numerous such tests have been conducted, using both between-sample and within-sample approaches, with mixed results.Footnote 3 However, testing sensitivity to scope in relation to the probability of program success is a relatively unexplored field.

However, although the approach is similar, this is not a CVM study. We will not be concerned with estimating total willingness to pay for a potential increase in water quality. Instead, to fit the purpose of the paper, the main focus will be given to analyzing the effect of uncertainty on people’s responses.

We proceed as follows. The next section describes the ecological background and our data. In “Empirical strategy,” we present the empirical model used. The results are presented in “Results,” and a discussion and some concluding remarks are given in “Discussion and concluding remarks”.

Empirical background

The problem

There is empirical evidence that ecosystems can be degraded to the point where it is uncertain whether a healthy state can be recovered at all. One such ecosystem is the Baltic Sea. With the moderate age of 10,000–15,000 years, the Baltic Sea is the youngest sea on the planet. Its brackish water creates unsuitable or at least stressful conditions for most marine species. The resulting relatively low biodiversity makes the Baltic Sea ecosystems extra vulnerable. At the same time, the catchment area of the Baltic Sea is home to 85 million people, and pollutants and nutrients from land-based activities, such as sewage treatment and industrial and municipal waste, put considerable stress on this ecosystem (Jansson and Velner 2005).

According to recent findings, the Baltic Sea has at least two stable ecological states. One state (the former state) is associated with clear water, submerged vegetation and preferred fish species. Due to overloads of nutrients, the amount of dissolved oxygen in the water has decreased, and because the water turnover is in the order of 20 years (due to the low inflow of water from the North Sea), the high level of phosphorus and nitrogen stays within the system. As a result, there has been a shift in the ecosystem to another steady state, a eutrophic state associated with toxic algae blooms, turbid water, oxygen deficiency and less preferred fish species.Footnote 4

Today, the Baltic Sea is one of the most threatened marine ecosystems on the planet; no less than 88% of the biotopes found in the Baltic Sea are listed as endangered (Swedish Environmental Protection Agency 2009).

The ministers of environment within the Helsinki Commission (HELCOM) agreed in 1988 on an action program to reduce the loads of nutrients by half by 1995 (Swedish Cabinet Bill 1990/91:90). This goal has not been achieved, and additional efforts have been suggested and to some extent carried out (Swedish Environmental Protection Agency 2009).

However, recent research shows that it is uncertain whether the change in the ecosystem is reversible; there is a risk that the Baltic Sea cannot be restored regardless of measures taken (Swedish Environmental Advisory Council 2005).

Data

A mail survey was designed to collect data on people’s responses when asked whether they would be willing to pay something (i.e. an amount >0) for a hypothetical abatement program with the purpose of improving the marine water quality of the Stockholm Archipelago, a part of the Baltic Sea. The respondents were informed that the program would only be successful with a certain probability. The questionnaire was sent to a total of 4,500 randomly selected adult inhabitants in the county where the archipelago is situated (Stockholm County) and in one adjacent county (Uppsala County). The overall response rate was about 57%.

The data were collected at the end of the summers 1998 and 1999. In 1998, when data for the probabilities 0.5, 0.75 and 0.9 were collected, the temperature was below season average, rainfall above, and there were low to moderate levels of algal blooms. In 1999, when data for the probabilities 0.1, 0.25 and 0.5 were collected, little rainfall, high temperatures and high levels of algal blooms characterized the summer.Footnote 5

We are especially interested in analyzing responses with respect to the probability of a successful program, and the sampled individuals were randomly grouped into six sub-samples. Five of these were used for a between-sample design, where each individual faced one of the following five success probabilities: {0.1, 0.25, 0.5, 0.75, 0.9}. The remaining sub-sample was used for a within-sample design, where each individual faced all five success probabilities and was asked to give an answer to each of them.

Both designs included a description of the abatement program. The program involved measures in the agricultural and municipal sectors that with some X-percent probability would result in a water quality improvement by a 1 m increase in the average water transparency. If launched, the program would entail price increases for products produced by these sectors, including increases in municipal water tariffs. The ongoing deterioration of the water quality would continue if the program turns out not to be successful. The scenarios for the between-sample and within-sample designs are found in the Appendix.

The main question to be analyzed was formulated as follows. Would you accept or not accept to pay something in terms of increased expenses in order to make it possible to carry out this abatement program? Three mutually exclusive response alternatives followed: I would definitely accept, I would probably accept and I would not accept. These alternatives are abbreviated by definitely, probably and no below.

In the between-sample design, the respondent was also asked to specify the maximum amount in SEK he or she would be willing to pay per month, and we will also make use of this information in our analysis. Some descriptive statistics are given in Table 1.Footnote 6

Table 1 Descriptive statistics

The respondents are between 16 and 78 years old, with an average age of about 44 years. About 55% are women, 7% are residents in the archipelago, 19% report that they or someone in their family owns a cottage in the archipelago, and 13% live in Uppsala County. The average personal monthly income (including unemployment benefits, child support, student loans etc. and after tax) is about SEK 12,200. About 46% answered definitely, 41% answered probably and about 12% answered no to the question.

Empirical strategy

For the main question, each respondent faces three alternatives. The discrete choice of each individual partly depends on unobservable factors specific to the individual. Given the three ordered response alternatives, where no, probably and definitely are coded 0, 1 and 2, respectively, we use an ordered discrete choice model (Zavoina and McKelvey 1975) to analyze the data. The model is built around a latent regression, where the underlying response model is given by Eq. 1.

$$ y_{i}^{*} = \beta^{\prime} x_{i} + \varepsilon_{i} $$
(1)

Note that \( y_{i}^{*} \) is not observable, but what we do observe from respondents’ answers is \( y_{i} \), where

$$ \begin{gathered} y_{i} = 0\quad \text{if}\; y_{i}^{*} \le 0; \\ y_{i} = 1\quad \text{if}\; 0 < y_{i}^{*} \le \mu ; \\ y_{i} = 2\quad \text{if}\; \mu < y_{i}^{*} . \end{gathered} $$
(2)

The parameter μ is an unknown threshold parameter to be estimated along with the coefficient vector β. In this model, a positive (negative) coefficient means that the probability of answering definitely increases (decreases) and the probability of answering no decreases (increases). The error terms \( \varepsilon_{i} \) are assumed to be normally distributed with mean 0 and variance 1. Thus, we have that

$$ \begin{gathered} \Pr \left( y_{i} = 0 \right) = \Pr \left( \beta^{\prime} x_{i} + \varepsilon_{i} \le 0 \right) = 1 - \Phi \left( \beta^{\prime} x_{i} \right); \\ \Pr \left( y_{i} = 1 \right) = \Pr \left( 0 < \beta^{\prime} x_{i} + \varepsilon_{i} \le \mu \right) = \Phi \left( \mu - \beta^{\prime} x_{i} \right) - \Phi \left( - \beta^{\prime} x_{i} \right); \\ \Pr \left( y_{i} = 2 \right) = \Pr \left( \mu < \beta^{\prime} x_{i} + \varepsilon_{i} \right) = 1 - \Phi \left( \mu - \beta^{\prime} x_{i} \right), \end{gathered} $$
(3)

where Φ is the standard normal cumulative distribution function. The probability that a person falls into any one of these categories depends on a vector of variables \( x_{i} \). Estimation is done by maximum likelihood.
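As a minimal sketch of Eq. 3, the following Python snippet evaluates the three response probabilities for a given value of the linear index β′x and the threshold μ. The function names and the numerical values are assumptions chosen for illustration; they are not estimates from our data.

```python
import math


def std_normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index: float, mu: float) -> tuple[float, float, float]:
    """Return (Pr(no), Pr(probably), Pr(definitely)) following Eq. 3.

    index is the linear predictor beta'x for one respondent;
    mu is the (positive) threshold separating probably from definitely.
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    p_no = 1.0 - std_normal_cdf(index)
    p_probably = std_normal_cdf(mu - index) - std_normal_cdf(-index)
    p_definitely = 1.0 - std_normal_cdf(mu - index)
    return p_no, p_probably, p_definitely


# Hypothetical index and threshold, for illustration only.
print(ordered_probit_probs(index=1.2, mu=1.5))
```

Raising the index shifts probability mass from no toward definitely, which is why a positive coefficient is read as increasing the probability of answering definitely.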

Due to the panel nature of the within-sample design (each respondent answers five questions), this design is analyzed by a random effects ordered probit model, which is built around the latent regression

$$ y_{ip}^{*} = \beta^{\prime} x_{ip} + \varepsilon_{ip} + u_{i} $$
(4)

where p denotes success probability. The random disturbance characterizing the i:th individual, \( u_{i} \), is constant across probabilities and is assumed to be normally distributed with mean 0 and variance \( \sigma^{2} \). The unique error term \( \varepsilon_{ip} \) is normally distributed with mean 0 and variance 1.
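To illustrate how the individual effect enters Eq. 4, the sketch below computes one respondent's likelihood contribution by integrating the product of the response probabilities over the random effect. The simple midpoint rule, the function names and all numerical values are assumptions for illustration; estimation software often uses Gauss-Hermite quadrature for this integral instead.

```python
import math


def phi_cdf(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def category_prob(y: int, index: float, mu: float) -> float:
    """Pr(y | index) for y in {0, 1, 2} with thresholds 0 and mu."""
    if y == 0:
        return phi_cdf(-index)
    if y == 1:
        return phi_cdf(mu - index) - phi_cdf(-index)
    if y == 2:
        return 1.0 - phi_cdf(mu - index)
    raise ValueError("y must be 0, 1 or 2")


def re_likelihood(ys, indices, mu, sigma, n_grid=2001):
    """Likelihood of one respondent's answers with u ~ N(0, sigma^2) integrated out.

    ys: observed answers (0, 1 or 2), one per success probability.
    indices: the corresponding linear predictors beta'x_ip.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    lower, upper = -8.0 * sigma, 8.0 * sigma
    step = (upper - lower) / n_grid
    total = 0.0
    for k in range(n_grid):
        u = lower + (k + 0.5) * step  # midpoint of the k-th cell
        density = math.exp(-0.5 * (u / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        joint = 1.0
        for y, idx in zip(ys, indices):
            # Conditional on u, the answers are independent.
            joint *= category_prob(y, idx + u, mu)
        total += joint * density * step
    return total


# Hypothetical respondent answering all five questions.
print(re_likelihood(ys=[0, 1, 1, 2, 2],
                    indices=[-0.8, -0.3, 0.2, 0.7, 1.0],
                    mu=1.5, sigma=0.9))
```

Because the same draw of the individual effect appears in all five answers, the answers of one respondent are correlated, which is what distinguishes this likelihood from the pooled ordered probit.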

To test the influence of the uncertainty on willingness to pay (WTP), we use a simple linear regression model.

The variables included in \( x_{i} \) (and \( x_{ip} \)) are, besides dummies for the probabilities of reversibility (denoted d10, d25, d50, d75 and d90), income, age, a variable measuring the respondents’ assessment of the importance of clean and clear water in the archipelago on a scale from 0 (no importance) to 100 (crucial importance) (WQA), and dummy variables for female, residency in Uppsala County (U. County) and visit to the archipelago during the summer. Note that by using a dummy variable for visit we also include most residents and those who own a cottage in the archipelago.

Conventional economic theory suggests that more of a desired market good leads to more consumer utility. As a result, it is logical to assume that consumers should be sensitive to changes in the size and scope of environmental goods and services. Based on this, we test the following hypothesis, which is also the main hypothesis.

Hypothesis 1

For both designs, we expect a positive and significant relationship between the probability of program success and the probability to answer definitely.

If increased water quality is a normal good (has positive income elasticity of demand), we also expect the following to be true.

Hypothesis 2

For both designs, we expect a positive and significant relationship between income and the probability to answer definitely.

For a consistency check, we test the following.

Hypothesis 3

For both designs, we expect a positive and significant relationship between the importance of water quality and the probability to answer definitely.

Finally, we expect value to diminish with distance.

Hypothesis 4

For both designs, we expect a positive and significant relationship between visit and the probability to answer definitely, and a negative and significant relationship between U. County and the probability to answer definitely.

Results

Between-sample

Before analyzing the data for the between-sample design more thoroughly, some structural differences associated with this design have to be addressed. Table 2 reveals that there are such differences in the data, especially with respect to income, visit and actual responses.

Table 2 Testing for structural differences, within-sample design excluded

In 1998 (lower temperatures, more rainfall, low to moderate levels of algae blooms and higher probabilities of reversibility), there are on average more people who answer no and fewer people who answer definitely, average income is lower and there are fewer visits. The response pattern is very surprising considering that the probabilities of reversibility are higher; we return to this below.

Fortunately, the data set is rich enough to let us account for these structural differences by analyzing the two years separately. Although we cannot compare all sub-samples, we can compare the responses for those who faced a success probability of 0.5 with those who faced a probability of 0.1. Similarly, we can compare the responses for those who faced a success probability of 0.5 with those who faced a probability of 0.9. From now on, we refer to these grouped samples as 1050 and 5090, respectively. Table 3 reports the results.

Table 3 Ordered probit estimates

Analyzing the results for sample 5090, we find that the dummy d75 is not significant. The dummy d90 is significant at the 10% level. However, the coefficient does not have the expected sign. The variables visit, income and WQA are all statistically significant and have expected positive signs, meaning that the probability to answer definitely increases with income and is higher for a person who made a visit to the archipelago during the summer and who has a high assessment of water quality. The variable U. County is also significant but has a positive sign, which is not what we expected. Age has a negative sign and is the most influential variable, looking at the marginal effects.Footnote 7

For sample 1050, the reversibility dummies, in this case d10 and d25, are not significant. In fact, only income is significant for this sample. If a restriction of zero slopes is valid, imposing it does not lead to a large reduction in the log-likelihood function (Log L), which in turn produces a small likelihood ratio index, \( \text{LRI} = 1 - \ln L_{\text{with predictors}} / \ln L_{\text{intercept only}} \) (McFadden 1974). The small value of the LRI for sample 1050 therefore casts some doubt on the model for this sample. Indeed, the null hypothesis that all slopes are equal to zero cannot be rejected by the chi-squared test for this sample.
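The likelihood ratio index and the accompanying chi-squared test can be sketched as follows. The log-likelihood values and the number of slope coefficients used here are hypothetical and are not taken from Table 3; the helper names are likewise assumptions. The p-value is computed with a series expansion of the incomplete gamma function so that the sketch needs only the standard library.

```python
import math


def likelihood_ratio_index(loglik_full: float, loglik_null: float) -> float:
    """McFadden's LRI: 1 - lnL(with predictors) / lnL(intercept only)."""
    return 1.0 - loglik_full / loglik_null


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared(df) variable at x."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    for n in range(1, 10000):
        term *= z / (a + n)
        total += term
        if term < total * 1e-16:
            break
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def lr_test(loglik_full: float, loglik_null: float, n_slopes: int):
    """Likelihood ratio statistic and p-value for H0: all slopes equal zero."""
    stat = 2.0 * (loglik_full - loglik_null)
    return stat, chi2_sf(stat, n_slopes)


# Hypothetical log-likelihoods, for illustration only.
ll_null, ll_full, k = -250.0, -246.0, 6
print(likelihood_ratio_index(ll_full, ll_null))
print(lr_test(ll_full, ll_null, k))
```

A full model that barely improves on the intercept-only log-likelihood gives both an LRI close to zero and a large p-value, which is the pattern described for sample 1050.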

To summarize, most significant variables (all but U. County and d90 for 5090) show expected signs. However, based on the results obtained so far we have to reject the main hypothesis as our results indicate that the degree of reversibility makes no difference for response behavior. Is this really the case or are there any other circumstances causing this result?

It has been argued and demonstrated that failures to show sensitivity to scope can occur for psychological reasons and still be compatible with economic fundamentals (Heberlein et al. 2005). We will analyze two potential reasons.

It is important to realize that the respondents express behavioral intentions and that these could be biased in several ways. First, since respondents’ answers are expressed intentions rather than actual behavior, there could be a hypothetical bias. In CVM studies, a common concern is that the budget restriction is not taken into enough account by respondents, which could imply that the stated willingness to pay is in fact much higher than what consumers would actually pay. In this setting, it could have the consequence that people are more prone to answer definitely when they in fact are not so certain and could thereby disregard relevant information such as probability of reversibility.

One approach aiming at reducing hypothetical bias in CVM studies is to collect information about how certain respondents were about their answers to a willingness to pay question (see Champ et al. 1997; Champ and Bishop 2001; Blumenschein et al. 2008). Although we cannot be certain that such an exercise would reduce a potential hypothetical bias in our study, the fact that our respondents could answer definitely, probably or no enables us to test the consequences of pooling probably and no answers. This is based on the argument that only those who answer definitely would accept the scenario in a real-world situation. This means that the proportion of answers interpreted as no (no/probably) increases from 17 to 62% for sample 5090 and from 12 to 53% for sample 1050.
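The pooling step can be sketched in a few lines of Python. The coding no = 0, probably = 1 and definitely = 2 follows the text; the function names and the example answers are hypothetical.

```python
def pool_to_binary(response: int) -> int:
    """Map the ordered answer to a binary one: definitely -> 1, otherwise 0."""
    if response not in (0, 1, 2):
        raise ValueError("response must be 0 (no), 1 (probably) or 2 (definitely)")
    return 1 if response == 2 else 0


def share_counted_as_no(responses) -> float:
    """Share of answers treated as no after pooling no and probably."""
    pooled = [pool_to_binary(r) for r in responses]
    return 1.0 - sum(pooled) / len(pooled)


# Hypothetical answers, for illustration only.
answers = [2, 1, 0, 2, 1, 1, 2, 0]
print(share_counted_as_no(answers))
```

The recoded variable is then the dependent variable in a binary probit, in place of the three-category variable used in the ordered probit.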

However, the estimation results in Table 4 show that reversibility remains insignificant. Visit, income and WQA are still significant and positive for sub-sample 5090. U. County, however, is no longer significant. For sample 1050, there are no significant variables.

Table 4 Binary probit estimates

Answers can also be affected by knowledge and experience with the good. Heberlein et al. (2005) analyzed four CVM studies with respect to scope using both traditional methods as well as methods from psychological theory. They found that responses are more likely to be valid when respondents have knowledge about and experience with the good.

There is experimental evidence that people respond differently to experience-based and description-based risk (Weber et al. 2004). In particular, rare events get higher weights under description-based uncertainty than under experience-based uncertainty. If an event has recently been experienced, that event tends to get higher weight than under description-based risk (Weber 2006). This might explain some of the differences between the two sub-samples.

In this study, the good (increased water quality) is complex, and perhaps only people familiar with the good give valid answers in the sense that they take the degree of reversibility into account. We do not have a “knowledge parameter”, but there are data on people who made at least one visit to the archipelago during the summer. Visitors are likely to have more experience of the problems of eutrophication than non-visitors and might thereby be able to make a more informed judgment. Tables 5 and 6 show the estimation results for these two groups.Footnote 8

Table 5 Ordered probit estimates, sample 5090
Table 6 Ordered probit estimates, sample 1050

This exercise demonstrates that people with some experience of the good respond differently from those with less experience. The most striking difference concerns water quality assessment. Water quality assessment matters more for people with experience of the good in the year with high levels of algae bloom. The reversibility dummies remain insignificant for both groups, though.

Although this is not a CVM study, using willingness to pay as a dependent variable could provide richer results.

Table 7 shows that for sample 5090, the reversibility dummies are insignificant. Moreover, neither separating the respondents into groups nor trying to reduce a potential hypothetical bias makes any difference for response behavior with respect to reversibility.

Table 7 OLS regression estimates, sample 5090

As is demonstrated in Table 8, although the dummy d25 has the expected sign, it is barely significant.Footnote 9 Moreover, we would have expected d10 to be even more strongly significant. We also tried to correct for a hypothetical bias, but that did not improve the results. The F statistic and the adjusted R-squared are both extremely low. Overall, we can conclude that there does not seem to be much systematic response in the data.

Table 8 OLS regression estimates, sample 1050

Instead of a probability effect, we consistently find a strong year effect for the between-sample design. For the year when sample 1050 was collected, although the probabilities are in the lower range, the share of respondents who answer definitely is significantly higher, and the share who answer no is significantly lower (see Table 2), than for sample 5090. Recall that sample 1050 was collected when there were higher temperatures, less rainfall and higher levels of algae blooms. For this sample, we also found few significant explanatory variables. For sample 5090, on the other hand, people are more cautious and take more factors into consideration when making their responses. To see whether the year effect can be explained solely by the percentages and/or whether the explanation is to be found in other variables such as current conditions (weather and levels of algae blooms), we run an ordered probit for the sub-sample where the success probability is 50% and include a dummy for sample 1050 (d1050); see Table 9 below.

Table 9 Ordered probit estimates, sub-sample 50

Table 9 reveals that the year dummy is not significant (or even close to being significant). Thus, current conditions do not seem to be the main cause of the observed differences.

To further evaluate this reverse probability effect, we analyze each percentage sample separately which means that we control for the year effect. Table 10 shows that there is indeed something resembling an internal reverse effect. For sample 5090, the share of people who answer definitely decreases as the probability of program success increases (and vice versa for those who answered no). However, for sample 1050, this effect is not equally distinct.

Table 10 Overall responses for each sub-sample (percentage)

Can prospect theory explain these results? According to prospect theory (Kahneman and Tversky 1979), people are risk averse in the “good domain”, i.e. they prefer a good outcome for sure over lotteries, but risk seeking in the “bad domain”, where they prefer lotteries over a bad outcome for sure (loss averse). However, even if prospect theory were a better description of how our subjects respond to uncertainty than expected utility theory, we would still expect people to be sensitive to probabilities.

The results on the reverse probability effect are puzzling but are similar to other results that have been explained by regret theory (Loomes and Sugden 1982). According to regret theory, people rejoice if positively surprised and experience regret when negatively surprised, and they anticipate these feelings when making decisions, where feelings of regret are given a higher weight. Perhaps the regret of a negative surprise (program failure) under a 90% probability of success is given such a high weight that people are more prone to answer no in this case.

Before we discuss our results further, it is worthwhile to also analyze the within-sample design.

Within-sample

From the results obtained so far one could be tempted to conclude that the degree of reversibility does not have an expected effect on people’s responses. However, a completely different picture appears when the data for the within-sample design are analyzed; see Table 11 for estimation results.Footnote 10

Table 11 Random effects ordered probit estimates

Reversibility is now strongly significant, both statistically and economically. The dummy variables for the different probabilities of success also have the “right” signs, meaning that the probability of answering definitely increases for percentage rates above 50% and decreases for percentage rates below 50%.

However, the marginal effects show that the probabilities are not weighted equally, contrary to what expected utility theory predicts. According to prospect theory (Kahneman and Tversky 1979), lower probabilities tend to be over-weighted and higher probabilities tend to be under-weighted, where the latter effect is more pronounced. This is also the pattern we observe for the within-sample design. People give a higher weight to the probability of 0.1 compared to 0.25 than they do to 0.9 compared to 0.75.
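To make the weighting pattern concrete, the sketch below evaluates a one-parameter probability weighting function at the five success probabilities used in the survey. The functional form and the default curvature of 0.61 are taken from the later cumulative version of prospect theory (Tversky and Kahneman 1992); they are assumptions for illustration only and are not estimated from our data.

```python
def tk_weight(p: float, gamma: float = 0.61) -> float:
    """Inverse-S probability weighting: p^g / (p^g + (1 - p)^g)^(1/g).

    gamma = 1 reproduces linear weighting, as under expected utility theory;
    gamma < 1 over-weights small and under-weights large probabilities.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    numerator = p ** gamma
    denominator = (p ** gamma + (1.0 - p) ** gamma) ** (1.0 / gamma)
    return numerator / denominator


# Survey probabilities; the weights depend on the assumed gamma.
for p in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(p, round(tk_weight(p), 3))
```

With this illustrative curvature, the weight attached to 0.1 lies above 0.1 and the weight attached to 0.9 lies below 0.9, with the larger distortion at the high end, which is the qualitative pattern described above.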

We can also note that of the other variables, only water quality is significant; the reversibility dummies clearly dominate all other variables.

For the within-sample design, we found that the probability coefficients are more consistent with prospect theory than expected utility theory (see discussion above). List (2004) found that prospect theory suitably organizes behavior among inexperienced consumers whereas expected utility theory is a more adequate description for experienced consumers. Moreover, we already established that experienced respondents (visitors) tend to answer differently than inexperienced respondents (non-visitors) for the between-sample design. We therefore separate our respondents into these two groups also for the within-sample. Although the effect is not so strong, we also find something similar to the result by List (2004). The over-weighting of 0.1 is more pronounced for visitors than non-visitors. See Table 12.

We also try to correct for a potential hypothetical bias by pooling no and probably answers to see if this alters the results. Now also income is significant. See Table 13.

Not surprisingly, analyzing the responses for each probability separately shows that the share of respondents who answer definitely consistently increases when the probability of success increases (and vice versa for the share who answer no). See Table 14.

Table 12 Random effects ordered probit estimates for visitors and non-visitors
Table 13 Random effects binary probit estimates
Table 14 Overall responses for each probability, sub-sample 1090

Discussion and concluding remarks

The aim of this paper was to analyze how people respond to uncertainty with respect to program success of environmental protection in a typical CVM setting. We want again to emphasize though that this is not a CVM study, and we have not been interested in estimating total willingness to pay. Instead, our study was motivated by the empirical observation that such decision problems may arise more frequently in the future. For example, many ecosystems have been degraded to the point where it is uncertain whether a healthy state can be recovered regardless of the amount of resources devoted to the purpose.

Our results are mixed. In the ordered probit estimates, people do not respond to the probabilities of success in the between-sample design; we only found one significant dummy, and it had the opposite sign to that expected. Our attempt to reduce the hypothetical bias did not help us understand this behavior. However, when each respondent faced all probabilities, we found that the probabilities dominated all other decision variables and that they had the expected sign. It is perhaps not so surprising that people respond differently depending on design, but we find the magnitude of this difference quite striking.

It has been shown earlier that people have a poor understanding of numerical differences in magnitude and that there are circumstances where people have problems interpreting information, here about uncertainty, if no reference point is given (Kahneman et al. 1999). This also seems to be the case here. People do not effectively process probabilistic information but can, when asked directly and holding other conditions constant, make consistent comparisons across cases. These kinds of results have been found earlier in CVM studies regarding reductions in health risks; stated willingness to pay is inadequately sensitive to both levels of and changes in probabilities (see for example Hammitt and Graham 1999 and references therein). But these types of results are typical for complex probabilities, meaning that the base level of risk is very small, as are the changes in risk. Neither our probability levels nor the changes in them are very complex.

However, the good, increased water quality, as well as the problem description may still be complex enough to cause insensitivity to the magnitude of uncertainty in a between-sample design. This is also supported by the fact that visitors, who have more recent experience with the good, respond differently to the question than non-visitors; we find this effect for both designs. Perhaps most respondents view the water quality issue as so important that it consistently tends to overrule the uncertainty factor unless it is explicitly made clear to them (as in the within-sample design) that several different probabilities might be possible. In fact, among the very few respondents who commented on the low probabilities in sample 1050 there were opinions such as: “one should always give it (the abatement program) a try” (respondent #333) and “The odds are bad… but something has to be done, hasn’t it?” (respondent #1005).

For the between-sample design, we were puzzled by the strong year effect observed. More people are willing to accept to pay something although the probabilities of success were lower for this year. Do people respond more to current conditions than to probabilities? Although it is not completely picked up in the ordered probit estimates (there is one exception, d90 for sample 5090), we found some evidence of a reverse probability effect.

Can warm glow motives explain the observed behavior? Insensitivity to scope is sometimes attributed to warm glow motives (Andreoni 1988, 1990). Such motives exist when a person contributes to a public good because the act of contributing in itself provides some benefit to the individual. This cannot be the explanation in our case, because the within-sample design shows that respondents are sensitive to the probabilities of success. Moreover, if warm glow were driving the responses, we would not find the strong year effect for the between-sample design.

So, where do we go from here? Individuals are asked to take on more and more decision responsibility. Simultaneously, as a consequence of technical, institutional and fast environmental changes, unpredictability and uncertainty of outcomes have increased. How people respond to uncertainty therefore remains an important issue, also for environmental protection. People do not always respond exclusively according to a specific theory, whether it is expected utility theory, prospect theory or something else. Problem descriptions, experience and current conditions may also affect responses. We show that this is true also for a “simple” probability description in a typical CVM setting. The variety of these biases suggests that it may be hard to overcome them by methodological adjustments only. For future survey work approaching similar issues, we therefore recommend the use of detailed follow-up questions to give respondents opportunities to explain their responses. Without such additional information, valuation responses might be so difficult to interpret that conclusions giving policy recommendations are impossible to arrive at. This suggests that the design of such follow-up questions is a crucial area for future research. Such research should also consider that the optimal framing of follow-up questions might vary among survey methods. The fact that budget limitations in practice often preclude the use of face-to-face interviews suggests that there is a great need for suitable follow-up questions also in mail and web questionnaire settings.