In response to consumers’ growing consciousness of environmental and societal issues, many companies have made corporate social responsibility (CSR) a strategic imperative. Consequently, companies actively seek to link their products to ethical attributes that reflect a person’s conscience (Irwin and Naylor 2009) through tactics such as eco-labels, advocacy advertising, or cause-related marketing (CM). Managerial motivations include improving the competitive position, building brand equity, or directly driving sales. Such objectives appear reasonable given that 73% of global consumers report they would switch to brands with ethical strengths at comparable price and quality levels (Edelman 2012).

In line with such reports, prior research suggests ethical strengths might compensate for lower brand familiarity (Arora and Henderson 2007) or increase quality perceptions (Chernev and Blair 2015). However, scientific evidence on the impact of ethical attributes is based on measures of purchase intent or choice decisions from small assortments with few alternatives. This does not necessarily match actual market settings. Rather, consumers are often confronted with many more alternatives, which can require screening processes prior to actual decision making. To investigate this, we assess the role of ethical attributes across consumers’ decision-making journey with a particular emphasis on the hitherto neglected consideration stage. Marketing managers perceive a position within consideration sets as a critical marketing objective (McKinsey 2017), as it increases their odds of being chosen for purchases, e.g., from 1-in-40 to 1-in-4 (Hauser 2014). We build on well-established two-stage consideration-then-choice decision models (Bettman and Park 1980; Hauser and Wernerfelt 1990), which emphasize that decision rules can differ across stages. Consequently, ethical attributes may exhibit differential weighting across stages. As a result, findings of prior research may not generalize to market settings where consideration set formation plays a role.

When faced with a multitude of competing products, each characterized by many attributes, consumers must screen this vast amount of information. In the course of screening, they often focus on the most important, salient, and easy-to-evaluate attributes (Payne et al. 1993). Evaluations of moral values have been shown to be especially complex (e.g., Baron and Spranca 1997; Irwin and Spira 1997), making ethical strengths less suitable for screening than attributes related to self-centered motives with which consumers have more experience (e.g., price or brand reputation). Yet ethical strengths might tip the scales in the choice phase once consumers have reduced the number of options to a smaller subset and have the capacity to perform more comprehensive, compensatory trade-offs. We test these conjectures with four empirical studies.

Our research intends to make several contributions. Theoretically, we show that ethical strengths drive decision making in the choice stage of the decision-making journey but do not serve as screening criteria in the consideration phase. This builds on and extends prior research on the intricacy of normative moral judgments (e.g., Kahneman and Knetsch 1992) and differences between intentions and behavior (Batson et al. 2002). We find this effect is moderated by brand familiarity, such that well-known brands can more readily drive market share by leveraging their ethical strengths in large assortments than unknown brands can. This puts unknown brands at a disadvantage when intending to capitalize on ethical strengths. However, our findings also show that increasing the emotional intensity of ethical benefits can raise their importance during screening, suggesting that marketers can find ways to compensate for the lack of effects. We also find asymmetric effects of ethical strengths versus weaknesses on consumer screening of larger assortments. Brands gain little from ethical strengths in the consideration phase, but risk being screened out when associated with ethical weaknesses.

Theory

Prior research

Related research has compared (1) ethical attributes to egoistic attributes and (2) different evaluation modes in ethical judgments for different response formats (e.g., including vs. excluding decision heuristics). Both streams offer implications for the differential impacts of ethical attributes for choosing versus screening phases in the decision-making journey.

Impure altruism and the value of ethical attributes

Ethical product attributes resonate with a person’s conscience and moral beliefs (Irwin and Naylor 2009) and can have positive (e.g., fair trade) or negative (e.g., child labor) valence (Luchs et al. 2010). Information about them is often available at the point of sale and can take on many forms, such as eco-labels, CSR awards, or CM campaigns (Creyer and Ross 1996; Menon and Kahn 2003). It can also vary in many aspects; for example, it may pertain directly to product functionality or be linked to peripheral attributes (Gershoff and Frels 2015). To isolate the effects of conscience and moral beliefs from effects of egoistic attributes, we focus on ethical attributes that derive their attractiveness from altruistic benefits and exclude those carrying egoistic benefits, e.g., associated with natural ingredients (see also Ehrich and Irwin 2005).

Trade-offs of ethical and egoistic product attributes in choice decisions

Existing empirical evidence has examined the effect of ethical product attributes on consumer purchases via two main approaches: testing a broad range of moderators of the main effect or directly examining trade-offs between ethical and egoistic attributes (see Table 1 for an overview). Evidence on the relevance of ethical attributes for consumer decisions is mixed, likely due to several factors.

First, many investigations do not force consumers into trade-offs between ethical and egoistic attributes. Studies of choice tasks without such trade-offs (e.g., Andrews et al. 2014) or with other types of dependent variables such as purchase intentions (e.g., Chernev and Blair 2015) indicate that ethical attributes drive consumer preferences. While these studies are instructive regarding the moderators of the impact of ethical attributes on preferences, assessments of the absolute effect might be complicated by social desirability biases.

Second, studies including trade-offs provide mixed conclusions about the relevance of ethical attributes. For example, Arora and Henderson (2007) argue ethical strengths can offset low brand familiarity, while Auger et al. (2010) find that only brands with a strong reputation (typically correlated with higher brand familiarity) appear trustworthy enough to conduct effective ethical campaigns. Barone et al. (2000) show that ethical strengths might not compensate for lower quality or higher prices, yet Chernev and Blair (2015) suggest that moral attributes enhance quality perceptions. These diverging effects might stem from different manipulations of the focal trade-offs: While Barone et al. (2000) directly manipulate product quality ratings, Arora and Henderson (2007) ask subjects to infer quality from brand slogans and product pictures, so they might underestimate the actual costs of the ethical brand.

Third, studies including trade-offs only address decision making from small assortments of up to four alternatives (e.g., Henderson and Arora 2010). Some studies test more conceptual alternatives in a conjoint design (e.g., Pracejus and Olsen 2004), yet the actual number of alternatives per choice task remains the same. As a consequence, prior research has been limited to decisions where consideration set formation is unlikely to play a role since the effort to thoroughly inspect all relevant pieces of information is relatively low. These results are therefore uninformative about the role of ethical attributes in earlier phases of the decision-making journey, where consumers often screen many more alternatives.

Table 1 Prior research on the effects of ethical product attributes in the consumer decision-making journey

Response mode sensitivity of ethical attributes

By definition, any decision involves conflicts and trade-offs among attributes (Hogarth 1987). Consumers thus use various coping strategies and decision rules, according to accuracy versus effort considerations. Yet ethical values, which tend to be emotion-laden, further increase decision difficulty (Irwin 1994; Irwin and Baron 2001), and consumers often struggle to assess the value of ethical public goods (Kahneman and Knetsch 1992). Accordingly, ethical judgments can depend on the response format, with objectively identical questions in different response formats resulting in different judgments. Most relevant to this investigation, Irwin and Naylor (2009) find that an excluding versus including response mode induces consumers to place more weight on ethical attributes, suggesting compatibility between exclusion and ethics. Similarly, Ehrich and Irwin (2005) find that people do not seek and ask for ethical information when it is not explicitly presented during their evaluation of individual options. Taken together, these findings suggest that ethical attributes can play a differential role in the compensatory choice phase compared with the heuristic screening phase of the decision-making journey.

Hypotheses

When making choices from small assortments, consumers can easily process all products and their attributes, and ethical product strengths can compensate for other inferior aspects (e.g., lower quality, higher price). Yet this compensatory decision making can become too complex and effortful in large assortments (Payne et al. 1993) that confront consumers with a larger number of alternatives (Hauser et al. 2010). To form consideration sets, consumers have been shown to apply simplifying screening strategies (Hauser and Wernerfelt 1990; Shugan 1980). Specifically, two-stage decision models of consideration-then-choice (Hauser et al. 2010; Roberts and Lattin 1997) suggest that consumers construct smaller and more manageable consideration sets first, then assess them in detail to make final decisions (Häubl and Trifts 2000). Products not included in consideration sets are by definition excluded from the choice stage and cannot compete with the focal ones.

Several decision heuristics for consideration set formation have been studied (Gigerenzer and Goldstein 1996), many of which are non-compensatory in nature (Payne et al. 1993). According to prior research, consumers evaluate up to four product attributes on average during screening (Ding et al. 2011; Hauser et al. 2010). Due to limited processing capacity, instructive, easily comparable, and quickly assessable attributes are particularly suitable criteria (Bettman et al. 1998). In line with this, less comparable attributes with partial information or different dimensions have been shown to have a limited effect on decision making (Gourville and Soman 2005; Kivetz and Simonson 2000). Similarly, ethical strengths, which are not manifest in all products and typically pertain to different dimensions, may play a limited role when screening. Forming rules with simple cut-off levels or exclusion restrictions is also more difficult for ethical strengths (e.g., exclude all without CM campaign) than for traditional, self-related attributes (e.g., exclude all above a certain price level).
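The two-stage logic described above can be illustrated with a minimal sketch. It assumes a conjunctive screening rule on price and brand familiarity followed by a weighted additive (compensatory) choice; the products, cut-off, and weights are hypothetical and chosen only to show how an unfamiliar brand with an ethical strength could win a purely compensatory comparison yet never reach the choice stage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: float     # egoistic attribute that is easy to compare
    familiar: bool   # whether the brand is well known to the consumer
    taste: float     # hypothetical 0-1 rating of a consumption attribute
    has_cm: bool     # ethical strength: cause-related marketing campaign


def screen(products, max_price):
    """Stage 1: conjunctive screening; every cut-off must be met, no trade-offs."""
    return [p for p in products if p.price <= max_price and p.familiar]


def utility(p, w_price=-0.5, w_taste=2.0, w_cm=0.6):
    """Stage 2: weighted additive utility; the ethical strength can offset weaknesses."""
    return w_price * p.price + w_taste * p.taste + w_cm * p.has_cm


def choose(products, max_price=3.5):
    """Screen first, then pick the highest-utility product among those considered."""
    considered = screen(products, max_price)
    return max(considered, key=utility) if considered else None


# Hypothetical assortment (illustrative values only, not study data).
assortment = [
    Product("A", 3.0, True, 0.70, False),
    Product("B", 3.2, True, 0.65, True),
    Product("C", 2.8, False, 0.80, True),
]

print("Two-stage choice:", choose(assortment).name)
print("Purely compensatory choice:", max(assortment, key=utility).name)
```

In this illustration the unfamiliar product C has the highest additive utility but is removed during screening, so the ethical strength only matters among the products that survive stage 1.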

Even if information about ethical strengths were available for all alternatives, evaluating them is not intuitive. Predicting self-related consumption utility can already be challenging for consumers (Kahneman and Snell 1992); predicting other-related, environmental, or societal benefits is likely even more difficult. As ethical attributes cannot be assigned to a continuous scale, their comparisons may create pervasive tensions and decision conflict or can even result in moral dilemmas (Kahneman and Knetsch 1992). Further, feelings of warm glow resulting from ethical purchases (Andreoni 1990) are difficult to assess in terms of their relative subjective value (Green and Peloza 2011). Finally, the heterogeneity across ethical attributes might cause skepticism about their value, e.g., consumers may doubt firms’ motives for CM campaigns (Nyilasy et al. 2014). Taken together, ethical strengths are less likely to come to mind during screening, and instead consumers are likely to rely on more familiar screening criteria reflecting self-related consumption motives that support quick and intuitive comparisons (Markman and Medin 1995).

Hence, we expect differential impacts of ethical strengths for consideration set formation versus choice. While ethical strengths may affect choices from among small assortments, they are less practical for minimizing cognitive effort or making quick decisions based on non-compensatory screening heuristics in large assortments. Note that these conjectures are particularly relevant for those products that would not enter consideration sets anyway for other reasons. Conversely, products that are more likely to enter consideration sets irrespective of ethical strengths are also more likely to benefit from ethical strengths due to their impact on the choice stage of the decision-making journey.

Brand familiarity can play an important role in this context. Specifically, strong brands with high levels of familiarity enter consumer consideration sets more readily than less-known ones (Erdem and Swait 2004). Amongst other things, habitual purchases of the same brand offer a convenient decision heuristic (Hoyer 1984). Irrespective of ethical strengths, familiar alternatives are consequently more likely than less-known ones to enter consideration sets derived from large assortments. Brand familiarity also could mitigate some of the complexity surrounding evaluations of ethical strengths. Amongst other things, successful brands achieve positive spillover effects across their brand portfolios (Balachander and Ghose 2003) and extend brand image to new product categories (Völckner and Sattler 2006). Similarly, brand reputation can lend credibility to partnering non-profit organizations in CM or serve as a normative reinforcement of the importance of ethical strengths, making them simpler to evaluate (Basil and Herr 2006).

For these reasons, we expect that larger assortments with a higher relevance of screening are likely to lead to a lower impact of ethical strengths than smaller assortments. However, this effect is likely to be smaller for well-known brands which are more likely to be part of consumers’ consideration sets due to their higher levels of familiarity.

  • H1a: Ethical attributes have less relative influence on evaluations of unfamiliar brands in larger assortments than in smaller assortments.

  • H1b: The difference in the role of ethical attributes across assortment sizes becomes smaller for higher levels of brand familiarity.

The negativity bias (Rozin and Royzman 2001) posits that negative information is more powerful than positive information (negative potency) and dominates overall evaluations (negative dominance). For example, Mizerski (1982) shows a disproportionately high weighting of negative product information, prompting much stronger attributions about product performance than positive information. Similarly, we expect consumers to be particularly sensitive to ethical information in screening when negative evidence is available. Such information about ethical misconduct is often readily available to consumers via mainstream media, social media channels, or independent third-party services with CSR-related information and online corporate CSR profiles.

While positive ethical attributes are less likely to be useful when screening, negative ethical conduct can be employed to rule out normatively unacceptable options (negative potency). For consumers, it is easier to identify what is normatively wrong than what is sufficiently right to qualify for inclusion in consideration sets. While the positive value of ethical attributes is hard to evaluate due to a lack of anchors and absolute points of comparison (Kahneman and Knetsch 1992), any negative ethical performance that does not require evaluations on a continuous scale can be interpreted as a binary “no-go” screening criterion. Further, emotions are crucial for the choice of decision heuristics (De Martino et al. 2006). If the decision-making trade-offs are not too difficult, readily available affective impressions can be efficient screening criteria, especially for complex decision tasks (Finucane et al. 2000). Since affective responses to ethical misconduct are thought to be particularly intense (Creyer and Ross 1996), we expect negative ethical information to be more relevant for screening than positive information. While in certain settings consumers strive to circumvent negative emotions and willfully ignore negative ethical information (Ehrich and Irwin 2005), consumers are in need of convenient and easily justifiable exclusion criteria in non-compensatory screening. Negative ethical attributes are likely to be particularly suitable for this purpose. Note, consumers may screen based on multiple attributes. However, since screening criteria often do not compensate, we expect a stronger impact of negative than positive ethical information on average.

  • H2: Ethical attributes have a stronger impact on consideration set formation when the ethical attributes contain negative compared to positive or neutral information.

According to H1a, low familiarity brands achieve lower demand benefits from ethical strengths when competing in large assortments. The lower their capability of entering consideration sets based on egoistic attributes, the lower the benefits will be. When other ways of entering consideration sets are not available, increasing the relevance of ethical attributes during screening is of particular interest. To accomplish this, marketing managers might leverage the emotional undertones of consumers’ conscience (Irwin and Naylor 2009) and moral values (Peloza et al. 2013). As we argued earlier, emotions influence the choice of decision heuristics and strong affective responses can serve as efficient screening criteria, especially for complex decision tasks. While ethical values can be hard to evaluate cognitively, ethical purchases are especially prone to induce affective responses like warm glow (Arora and Henderson 2007), empathy (Lee et al. 2014), moral satisfaction (Kahneman and Knetsch 1992), or reduced consumption guilt (Strahilevitz and Myers 1998).

Research on donation behavior suggests that emphasizing individual donation recipients increases the willingness to make monetary contributions. Amongst other things, emotional contagion theory (Hatfield et al. 1992) suggests that pictures make the recipient more vivid and drive emotional involvement of consumers (De Houwer and Hermans 1994; Kogut and Ritov 2005a) and thus raise donations (Kogut and Ritov 2005b). Similarly, emphasizing the personal helping role of consumers (Robinson et al. 2012) and highlighting how their actions redress injustice (White et al. 2012) increase affective responses to ethically positive claims and products. This emphasis on tangible consequences is likely to drive feelings of guilt associated with not considering support for the cause at hand. Such emotional reactions are more spontaneous and quicker than higher order cognitive deliberations (Shiv and Fedorikhin 1999). This makes ethical attributes, which elicit strong affective responses, particularly relevant during screening processes where deliberation time is scarce. Taken together, these considerations suggest that ethical attributes with increased emotional intensity are more salient, easier to process, and elicit higher interest. We thus expect:

  • H3: Ethical strengths have a stronger impact on consideration set formation when the ethical information is presented with high emotional intensity.

Study 1: The importance of ethical strengths in small and large assortments

Study 1 tests the differential role of ethical strengths along the purchase decision funnel by comparing choice shares for small (i.e., choice) and large (i.e., consideration-then-choice) assortments, as well as the moderating role of brand familiarity. As preliminary evidence of consideration set formation, we further assess self-reports of experimental subjects about which options they considered prior to their choice.

Method

We recruited 1,471 participants (51% female, mean age = 47 years) from a representative consumer panel in Germany and randomly assigned them to one of six experimental conditions in a 2 (assortment size: small [three products] vs. large [twelve products]) × 3 (brand: known vs. unknown featuring a CM campaign vs. no-CM control) between-subjects design. The CM campaign served as the ethical attribute and promoted the donation of 40 cents per purchase for food support to socially disadvantaged people.

To ensure sufficient CM effect sizes, we chose a hedonic product, frozen pizza. This is likely to be particularly susceptible to CM since consumers have been shown to seek to compensate for feelings of guilt from indulgent, but unhealthy choices (Strahilevitz 1999). We manipulate brand familiarity by comparing a well-known brand in the local market (Wagner) with an unknown foreign brand not available in the market of interest (Finizza). We selected these brands as a conservative test of H1b since potential country-of-origin effects suggest Finizza as an Italian brand might attract more attention than other unknown brands. Product descriptions, brand slogans, and pictures all were taken from actual advertisements and products. Depending on the experimental condition, a CM promotion was shown for Finizza, Wagner, or neither, similar to actual market settings with a single or very few products featuring CM campaigns. For the CM conditions, we replaced one product claim with a CM claim for the focal product, keeping the total amount of information per alternative constant. All self-centered attributes are non-comparable (e.g., light, fresh, crisp), and all attributes indicate positive valences only, so no brand seemed clearly superior or inferior, and the diagnosticity of the CM and self-serving attributes was similar (see Web Appendix A). We randomly rotated the order of product displays across participants to avoid primacy or recency effects.

We assigned more subjects to the twelve-product than the three-product experimental group (1,135 vs. 336 subjects) to ensure sufficient demand for the alternatives of interest. After gathering demographics and product category knowledge (those who had not bought pizza in the past 6 months were excluded, to ensure minimum product category knowledge), we asked participants for their preferences for different frozen pizza products, which they could indicate by choosing from three/twelve alternatives, as in an actual purchase decision. Brand choice serves as the primary dependent variable. Since we do not expect two-stage consideration-then-choice decision making for the small assortment, we asked only participants in the large assortment condition to indicate all brands they took into consideration (see Footnote 1).

Results

As we show in Fig. 1, in the no-CM control groups, we observe a larger choice share for the well-known Wagner brand than the unknown Finizza brand, in both the small (43.2% vs. 6.3%, χ²(1) = 17.75, p < .01) and large (26.3% vs. 2.1%, χ²(1) = 46.61, p < .01) assortment, suggesting our experimental manipulation of brand familiarity was effective. In line with H1a, we find a strong increase in choice shares when the unknown brand features CM, relative to the no-CM control condition, in the small assortment (6.3% to 17.3%, χ²(1) = 6.40, p < .05; see Web Appendix G for Cohen’s d effect sizes), but the impact is much smaller for the large assortment (2.1% to 2.6%, χ²(1) = .17, p = .68). In contrast, this difference between the impact of CM across the two assortment sizes is much smaller for the well-known brand. Specifically, the relative market share increases for the well-known Wagner brand are similar for both the small (43.2% to 54.8%, χ²(1) = 3.01, p < .10) and the large (26.3% to 34.0%, χ²(1) = 5.25, p < .05) assortment. Hence, CM seems effective for well-known and unknown brands in small assortments, when decision making is limited to a single choice stage. However, and in line with H1b, for large assortments with a higher relevance of consideration set formation, CM has a stronger impact for well-known than for unknown brands. In conjunction with the findings on the smaller assortment choices, this suggests unknown brands have greater difficulty driving consideration likelihood than subsequent choice probability.

Fig. 1 Choice shares as a function of brand familiarity, CM, and assortment size (Study 1)

To test the apparent interaction empirically, we ran a logistic regression with choice of the CM promoted brand as the dependent variable and brand familiarity, assortment size, and their interaction as independent variables (focusing on the conditions with CM). Both brand familiarity (b_{brand familiarity} = 1.76, z = 5.60, p < .01) and assortment size (b_{assortment size} = −2.06, z = −5.50, p < .01) have effects in the expected direction. Most importantly, we find a positive interaction between brand familiarity and assortment size (b_{brand familiarity × assortment size} = 1.20, z = 2.60, p < .01), suggesting the negative impact of assortment size on CM choice is smaller when brand familiarity is high instead of low.
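To make the interaction concrete, the reported coefficients can be combined as in the following minimal sketch. It assumes 0/1 dummy coding (0 = unknown brand or small assortment, 1 = well-known brand or large assortment), which the text does not state, and it leaves out the intercept and the familiarity main effect because only the assortment-size effect is being compared. The function and constant names are illustrative.

```python
import math

# Coefficients as reported in the text for the CM conditions.
B_ASSORTMENT = -2.06    # assortment size (assumed coding: small = 0, large = 1)
B_INTERACTION = 1.20    # brand familiarity x assortment size


def assortment_effect(familiar: int) -> float:
    """Log-odds change from small to large assortment for a given familiarity.

    `familiar` is 0 (unknown brand) or 1 (well-known brand); the 0/1 coding
    is an assumption made for this sketch.
    """
    return B_ASSORTMENT + B_INTERACTION * familiar


for familiar, label in [(0, "unknown brand"), (1, "well-known brand")]:
    b = assortment_effect(familiar)
    print(f"{label}: log-odds change = {b:.2f}, odds ratio = {math.exp(b):.2f}")
```

Under this coding, moving to the large assortment multiplies the odds of choosing the CM brand by roughly 0.13 for the unknown brand but only by roughly 0.42 for the well-known brand, which is the sense in which the negative effect of assortment size is smaller at high familiarity.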

We compare the number of participants who take the respective brands into consideration for both Finizza and Wagner with and without CM. We find no significant differences in shares for the unknown Finizza brand (control 4% vs. CM 5%, χ²(1) = .59, p = .44) or for the well-known Wagner brand (control 71% vs. CM 72%, χ²(1) = .10, p = .75). Thus, it appears that ethical strengths such as CM do not serve as a relevant screening criterion and cannot compensate for other self-centered consumption motives during the consideration phase.

Discussion

Study 1 supports H1a and H1b, such that the impact of ethical strengths is contingent on brand familiarity and assortment size. In the small assortment, relative to the baseline choice share in the no-CM control group, the unknown brand gains more than 1.5 times its initial share (relative increase of 174%), whereas the well-known brand gains only about one quarter (27%). These findings are in line with prior research that suggests unknown brands gain relatively more from ethical strengths in small assortments (Arora and Henderson 2007). Yet many markets feature assortments with more than twelve alternatives, so consumers are likely to apply a two-stage consideration-then-choice decision process. Our findings on larger assortments suggest the need to account for the entire decision funnel to understand the impact of ethical strengths, which do not seem to help brands enter consideration sets. If brands cannot enter consideration sets by other means, they appear to need to leverage more traditional marketing instruments and build brand familiarity to better benefit from ethics.
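The relative gains quoted above follow directly from the choice shares reported in the Results. The short sketch below recomputes them for all four brand-by-assortment cells; the helper name is illustrative, and because the inputs are the rounded published shares, the output can differ from the reported figures by about one percentage point.

```python
def relative_increase(baseline: float, treated: float) -> float:
    """Relative change in choice share, in percent of the baseline share."""
    return (treated - baseline) / baseline * 100.0


# Choice shares (%) as reported in the Results: (no-CM control, with CM).
shares = {
    ("Finizza", "small"): (6.3, 17.3),
    ("Wagner", "small"): (43.2, 54.8),
    ("Finizza", "large"): (2.1, 2.6),
    ("Wagner", "large"): (26.3, 34.0),
}

for (brand, assortment), (control, cm) in shares.items():
    gain = relative_increase(control, cm)
    print(f"{brand:8s} {assortment:5s}: {control:5.1f}% -> {cm:5.1f}% ({gain:+.1f}%)")
```

Relative gains on a very small baseline, such as the unknown brand in the large assortment, should be read with caution: the underlying difference there (2.1% vs. 2.6%) was not statistically significant.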

According to Barone et al. (2000), CM effects are most pronounced in situations of interbrand homogeneity but less effective when consumers face trade-offs. With regard to larger assortments, Study 1 suggests differences in brand familiarity are relevant to achieve the full demand benefits of ethical strengths. Specifically, when few brands have higher levels of familiarity, these are more likely to be part of consideration sets. Since being part of the consideration set appears to be a prerequisite for CM to impact demand, these brands can benefit from CM. Consequently, well-known brands profit from higher awareness both directly and indirectly, as brand familiarity makes ethical strengths more effective. Less-known brands, on the other hand, face a dual disadvantage in large assortments if they position via CM: Their lower visibility makes them less attractive partners for non-profit organizations, and they need to invest in brand building to reap the demand-side rewards of their efforts.

Our findings offer direct evidence of the impact of ethical strengths on consumer choice, but they are limited in several ways. The implications for consideration set formation remain indirect and assume different decision heuristics and two-stage consideration-then-choice processes for large assortments. Further, despite the random ordering of products, comparative relationships between the options in the larger assortment may have had an impact. The relatively small effect of CM on consideration set formation also might have been a function of the peculiarities of the product category. Also, brand familiarity could be confounded by country-of-origin effects in ways other than anticipated. For example, Wagner may have appealed more to the German participants than Finizza. We tested a CM donation as an ethical strength, but this tactic might not be sufficiently linked to firm processes and core value propositions to make it relevant for screening. Finally, we expected participants to assume that each brand conducting CM would feature corresponding information and that each alternative not conducting CM would lack such indications. While this mimics actual market settings, we did not make this explicit and participants may have been uncertain about actual levels of CM whenever information was absent. This could have made the ethical attribute less diagnostic, which could have impacted our observations. To address these concerns, we performed additional studies.

Study 2: The importance of ethical strengths in the consideration stage

Study 2 assesses the impact of donation- and production-related ethical strengths relative to self-centered attributes on decision rules of consideration set formation. We directly test the application of simplifying decision heuristics (Hauser and Wernerfelt 1990) by empirically inferring the nature of consumers’ decision rules (Hauser 2014) based on a set of consideration tasks.

Method

We recruited 62 students (53% female) from a German university. Experimental subjects performed a series of consideration tasks in four product categories: running shoes, mineral water, skin care, and chocolate. All of these products were taken from large assortments in the actual marketplace and differ on dimensions that are relevant to this investigation (i.e., food vs. non-food, hedonic vs. utilitarian value, consumables vs. durables). We assessed decision heuristics by applying the disjunction of conjunctions (DOC) method (Hauser et al. 2010), which has been shown to be particularly effective for inferring non-compensatory decision rules (Bremer et al. 2017). In the DOC approach, respondents provide independent consideration judgments about a set of hypothetical products by indicating whether they would consider each individual option presented to them. Similar to conjoint analysis, hypothetical products are constructed based on a fractional factorial design of attribute space by optimizing D-efficiency (Street et al. 2005). Using these consideration data, a linear program is employed to identify the heuristics that led to the observed consideration decisions of each respondent. We tested five product attributes with two to six levels each, depending on the product category (see Web Appendices B and C): brand (five–six levels), price (four levels), two consumption-related characteristics (flavor and color, each with three–five levels), and two levels of CSR and CM as ethical strengths. To account for attribute differences, we estimated separate DOC models for each product category. To control for fatigue or memory effects, we randomly rotated the order of presentation of the categories and products.

The diagnosticity of the ethical attributes is equivalent to that of the other attributes, in that we included information about desirable levels of ethical conduct (CSR) or donations linked to each purchase (CM) or else explicitly indicated that no donations were linked to purchases or no information on CSR was available. The participants evaluated 20 products for each category (80 total) and sequentially indicated, for each option, whether they would consider purchasing it, i.e., they effectively specified their consideration set relative to the presented products (Hauser et al. 2010). The average time to complete the task was 15 minutes.

We applied the linear program from Hauser et al. (2010) to these data by programming a pattern recognition search algorithm to infer the consideration heuristics responsible for the observed consideration set. Inferring non-compensatory decision heuristics is a more complex combinatorial task than estimating part-worth utilities as in traditional choice-based conjoint, because we must identify decision heuristics and corresponding parameters. The linear optimization is based on model fit, sample shrinkage, and cognitive simplicity; that is, it relies almost exclusively on individual-level data, and sample-level information is relevant only for ties between attributes. Due to this and unlike hierarchical Bayesian estimations of choice-based conjoint data, DOC does not require large sample sizes to converge (Bremer et al. 2017). We estimated the following DOC decision rules: (subset) conjunctive, disjunctive, and lexicographic (Hogarth and Karelaia 2005; Payne et al. 1988; Tversky 1972; see Hauser et al. 2010 for a detailed description of parameter estimation).
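For illustration only, the following minimal sketch shows how a DOC-type rule maps product profiles to consideration decisions. It is not the estimation procedure of Hauser et al. (2010); the attribute names, levels, and the rule itself are hypothetical and chosen purely to show the logic of "any conjunction suffices, and every aspect within a conjunction must hold".

```python
from typing import Dict, FrozenSet, List, Tuple

# A product profile maps attribute names to levels (all names are hypothetical).
Profile = Dict[str, str]
# An aspect is one (attribute, level) pair; a conjunction is a set of aspects
# that must all hold; a DOC rule is a list of conjunctions, any of which suffices.
Aspect = Tuple[str, str]
Conjunction = FrozenSet[Aspect]


def satisfies(profile: Profile, conjunction: Conjunction) -> bool:
    """True if the profile has every aspect in the conjunction."""
    return all(profile.get(attr) == level for attr, level in conjunction)


def considered(profile: Profile, rule: List[Conjunction]) -> bool:
    """True if the profile satisfies at least one conjunction of the rule."""
    return any(satisfies(profile, c) for c in rule)


# Hypothetical rule: consider a product if it is (brand A and low price)
# or if it carries a CM donation.
rule = [
    frozenset({("brand", "A"), ("price", "low")}),
    frozenset({("cm", "yes")}),
]

profiles = [
    {"brand": "A", "price": "low", "cm": "no"},
    {"brand": "A", "price": "high", "cm": "no"},
    {"brand": "B", "price": "high", "cm": "yes"},
]

decisions = [considered(p, rule) for p in profiles]
print(decisions)  # [True, False, True]
```

In this representation, a rule with a single conjunction behaves like a conjunctive rule, and a rule whose conjunctions each contain one aspect behaves like a disjunctive rule.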

Results

We start by examining the average consideration set sizes. The values between 5 and 7 (running shoes 5.52, mineral water 6.43, skin care 6.57, chocolate 6.93) are in line with prior findings about consideration set formation (e.g., Chakravarti and Janiszewski 2003). Consumers’ limited processing capacity for each consideration task (e.g., Bettman et al. 1998) also suggests it is plausible that running shoes, containing attributes which are more difficult to evaluate, would result in smaller consideration sets. In line with prior research (Gilbride and Allenby 2004), we obtained decision heuristics containing zero to three screening aspects. Across all product categories, brand is the most often included factor (either positive must-have or negative no-go screening criterion), with values ranging from 57% (chocolate) to 75% (running shoes), affirming the findings of Study 1 that brand familiarity is critical for heuristic consideration set formation. The greater screening relevance of brands for running shoes also seems reasonable, considering their product complexity, uncertainty about quality, and importance for social signaling. Price is the second most frequently applied screening criterion, with values from 40% (skin care) to 56% (chocolate). The two consumption-related attributes show variation by product category and type of characteristic, between 20% and 47% (except chocolate, for which 3% of screening was based on consistency [melting vs. crispy]). With a few exceptions, we observe more no-go than must-have screening rules per category and attribute (see Table 2).

Table 2 Share of screening rules that contain each attribute (Study 2)

We do not find any exclusion screening rules based on ethical attributes. Both CM and CSR entailed only beneficial information, i.e., participants did not regard the absence of such benefits as an exclusion criterion. The shares of inclusion screening rules attributed to CSR, from 2% (mineral water) to 9% (running shoes), and CM, from 3% (running shoes, skin care) to 13% (mineral water), are smaller than for all other attributes (except for chocolate consistency). A series of chi-square tests in each product category, comparing the number of participants who screened based on ethical attributes versus each of the other product attributes, reveals statistically significant differences for all but two pairs (p < .05, except for chocolate consistency, see Web Appendix D).

Discussion

Study 2 provides direct evidence based on screening rules that CM and CSR both are less important than traditional product attributes when it comes to screening in the consideration phase. Because consumers can process only a limited number of attributes during screening, ethical attributes will likely be overshadowed in larger assortments, irrespective of the comparative characteristics of individual assortments. This finding is robust across a diverse set of attributes and two types of ethical strengths, rather than being driven by any peculiarities of individual categories (e.g., varying category fit with ethical attributes). Furthermore, Study 2 relies on a consideration task for individual products, which cannot be a function of the comparative characteristics of individual choice sets. Whereas in Study 1 the limited number of familiar brands may have made brand-based screening more likely, the estimated decision rules in Study 2 represent more generic decision-making approaches of individuals for each product category. Our findings also represent a conservative estimate of the greater importance of self-related attributes relative to ethical attributes because we indicated in each case whether a CM donation or CSR strength was present or not.

However, both Studies 1 and 2 examined ethical strengths (positive valence) exclusively. We limited our manipulations to positive valence because brands typically advertise their ethical strengths, whereas consumers require access to other sources to obtain information on ethical misconduct. However, such information is increasingly available and negative ethical attributes are likely more important for product screening than positive ones, following H2. Furthermore, in Study 2 we estimated only one screening rule per respondent. We then inferred the heuristic rules from repeated consideration decisions for one alternative at a time and across multiple attribute combinations. However, under actual market conditions individual consumers may switch between different screening heuristics. These types of transitions could not be accommodated by the DOC approach. To corroborate and extend our findings, we therefore ran an additional study with another approach to investigate the relevance of ethical attributes during screening.

Study 3: The relative importance of ethical valence in the consideration stage

Study 3 assesses the moderating effect of valence on screening alternatives with ethical attributes. Further, we control for different types of ethical attributes, as prior research suggests varying impacts (Creyer and Ross 1996) and Study 2 suggests that consumers assign more weight to CSR than to CM. Specifically, we compare CM (only positive information) with two types of realistic CSR information: CSR awards (positive information) and CSR ratings (positive and negative information).

Method

We randomly assigned 304 participants (50% female, mean age = 49 years), recruited from a representative consumer panel in Germany, to one of three experimental conditions: CSR traffic light, CSR award, and CM. The CSR award and CM campaign may have only positive or neutral valence (no vs. CSR award; no vs. CM campaign); the CSR traffic light on the other hand can indicate socially irresponsible behavior (irresponsible vs. neutral vs. responsible CSR). We tested sunscreen, chocolate, and water as product categories. As in Study 1, each category included twelve alternatives, showing descriptions of the brand, price, and two brand claims. We again excluded participants who lacked minimum category knowledge.

Based on the findings of Study 1 that brand familiarity moderates the impact of ethical strengths on consideration set formation, we asked for familiarity of all brands in all categories prior to providing information about the ethical attributes (7-point Likert scale; Campbell and Keller 2003). Next, we showed a web page that informed participants about the CSR traffic light, the CSR award, or the CM campaign, depending on the experimental condition (see Web Appendix E). In the CSR traffic light condition, we informed participants about the criteria used to determine the CSR score (e.g., treatment of employees, environmental friendliness) and highlighted that traffic lights could indicate irresponsible, neutral, or responsible CSR conduct. In the CSR award condition, we provided similar information about the determinants of CSR and noted that only brands with ethical strengths would be eligible for an award, though not receiving an award does not indicate ethical misconduct. In the CM condition, we explained that some brands link each item sold with donations and others do not.

To mimic actual purchase decisions, we asked participants to indicate brands for which they wanted ethical information, similar to clicking on a pop-up box in an actual online decision context. We then asked them to indicate which attributes (CSR/CM, price, brand, product claim, or other) they would use as screening criteria in an actual purchase decision. The fraction of respondents who used ethical attributes as screening criteria serves as our primary dependent variable. To assess potential trade-offs across screening attributes due to time or cognitive resource constraints, we also asked them to distribute 100 points across all attributes, indicating the likelihood of each respective attribute to be part of their screening procedure, serving as a secondary dependent variable. We repeated this procedure three times per participant, for all three product categories, randomly rotating the order of categories and product displays. Finally, we asked participants to indicate their CSR attitudes (7-point Likert scale, items “companies should act in a responsible way regarding the environment”, “make every effort to reduce pollution from their products”, “watch the recyclability of their products”, and “treat their employees in a socially responsible way”).

Results

Figure 2 presents the share of participants reporting that the ethical attribute would serve as a screening criterion for their consideration set formation, across products and experimental conditions. In line with Study 2, we detect no significant differences across product categories. However, we find significant differences for the three types of ethical attributes. In particular, the CSR traffic light emerges as a self-reported screening criterion for 49% of participants in the water and sunscreen categories, statistically higher than the shares in the CSR award (water and sunscreen both with 49% vs. 34%, χ2(1) = 4.72, p < .05) or CM conditions (water: 49% vs. 31%, χ2(1) = 7.03, p < .01; sunscreen: 49% vs. 35%, χ2(1) = 4.25, p < .05). For chocolate, the effect is in the expected direction but not statistically significant at conventional alpha levels, with 42% screening via ethical attributes in the CSR traffic light group, versus 35% in the CM (χ2(1) = 1.15, p = .28) and CSR award groups (χ2(1) = 1.07, p = .30). Finally, the share of people who employ ethical screening is not statistically different between the CSR award and CM groups (sunscreen: 34% vs. 35%, χ2(1) = .01, p = .92; chocolate 35% vs. 35%, χ2 (1) = .00, p = .96; water 34% vs. 31%, χ2 (1) = .25, p = .62), i.e., differences in the relationships or costs of the core value proposition related to CM and CSR do not appear to drive the consideration rules.
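The pairwise comparisons above are chi-square tests with one degree of freedom on two proportions. The sketch below shows the arithmetic for a 2 × 2 table without continuity correction. The counts are hypothetical, because group sizes per condition are not reported in this section, so the output is not meant to reproduce any of the statistics above.

```python
import math


def chi_square_2x2(yes_a: int, n_a: int, yes_b: int, n_b: int):
    """Pearson chi-square (df = 1, no continuity correction) for two proportions."""
    a, b = yes_a, n_a - yes_a
    c, d = yes_b, n_b - yes_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For df = 1, the upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 50 of 100 vs. 35 of 100 participants screening on ethics.
chi2, p = chi_square_2x2(50, 100, 35, 100)
print(round(chi2, 2), round(p, 3))
```

The function assumes that both outcomes occur at least once across the two groups; otherwise the denominator is zero.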

Fig. 2 The relevance of ethical attributes for screening (Study 3).

Note: Superscripts indicate significant differences between experimental groups (1, 2 at p < .05, 3 at p < .10)

To investigate these results in more detail and take individual-level heterogeneity in decision-making styles and individual preferences for ethical attributes into account, we ran a logistic regression with clustered standard errors and included the observations across all three product categories as repeated measures. We coded CM as the reference category and use two dummy variables for the CSR traffic light and CSR award groups. To control for brand familiarity, we included average familiarity with each of the twelve products per category and its standard deviation; higher variance should favor heuristic consideration rules based on cost–benefit trade-offs (Bettman et al. 1998). Confirming H2, we find a significant effect of the CSR traffic light (bCSR traffic light = 2.58, z = 2.03, p < .05), but not for the CSR awards (bCSR award = .31, z = .27, p = .79). Potential negative ethical information is thus more likely to influence consideration set formation than just positive and neutral ethical information. As the impact of brand familiarity, its standard deviation, and their interactions with our experimental manipulations are not significant, we find no significant indications linking brand familiarity to differences between positive and negative ethical information.

Asking respondents to report their hypothetical consideration rules may have inflated the values for the screening likelihood based on ethical attributes, which are higher than in our prior studies (31% to 49%). Yet we have no theoretical reason to suspect that these inflated screening likelihoods affect the relative screening probability across the three conditions. To investigate this, we replicated the previous analysis for the constant sum task, requiring participants to make trade-offs across product attributes, as in Studies 1 and 2. The average allocation of 15.5 points to the ethical attributes is in line with our previous findings, again indicating the low relative importance of ethical attributes for screening. We also find a similar pattern of relative differences across experimental groups (Fig. 2). For water, participants allocated 13.1 and 14.3 points to CM and CSR awards but 17.7 points to the CSR traffic light (CM: t = 2.20, p < .05; CSR awards: t = 1.75, p < .10). For sunscreen, we also find a significant difference between CM (12.9) and CSR traffic light (18.5, t = 2.54, p < .05), though we detect a marginally higher screening probability for CSR awards (17.1) than CM (t = 1.86, p < .10). This might reflect the higher personal relevance of this product category, in terms of health consequences, e.g., CSR awards could signal higher product quality. Finally, we do not detect any significant differences across the ethical conditions for chocolate (CSR traffic light vs. CM: t = 1.45, p = .15; CSR traffic light vs. CSR award: t = .55, p = .58; CSR award vs. CM: t = .83, p = .41).

Another regression model, with clustered standard errors, analogous to our prior analysis, produces an identical cross-category pattern of results. In particular, only the CSR traffic light effect is significant (bCSR traffic light = 4.51, z = 2.41, p < .05); hence, CSR traffic lights are 4.51 percentage points more likely to function as a screening criterion than CM when consumers must make trade-offs across product attributes. We do not find any significant effect of CSR awards relative to the CM reference category (bCSR award = 2.48, z = 1.33, p = .18). Thus, the sunscreen result does not appear to generalize to the other categories. Similarly, we do not find significant effects of brand familiarity measures or their interactions with the experimental conditions (all p > .42).

Discussion

Study 3 replicates our findings that ethical strengths are rarely used as screening criteria and establishes a relevant boundary condition. Across all product categories and two dependent measures, we find that negative ethical information is more important for screening than the presence or absence of positive ethical information. The repeated measures logistic regressions further suggest that CM screening effectiveness does not differ from that of CSR when it is limited to only positive and neutral information, like CSR awards. Valence thus appears to drive the pattern of results, such that ethical information with negative valence increases the relative importance of ethical attributes for screening. It seems that negative CSR information has the potential to diminish consideration likelihood for both well-known and less-known brands.

Study 4: Increasing the importance of ethical strengths for consideration

Study 4 examines a marketing intervention to increase the low importance of ethical strengths in the consideration phase. Specifically, we test whether higher emotional intensity makes ethical strengths both more salient and easier to process and thereby more relevant for screening. To manipulate this, the ethical attribute relates to a specific person using an image to increase consumers’ identification with the recipient and stresses how the consumer can help. In addition, to shed light on the psychological process that inhibits ethical screening, we test the impact of salience and diagnosticity. Study 1 mimicked actual market settings, so CM information was only available for the focal brand, not all alternatives. In Study 3, we observed greater importance of ethical attributes, especially in the sunscreen category, perhaps driven by the higher levels of salience and diagnosticity. In Study 4, we therefore test whether salience and diagnosticity might be responsible for the observed differences or whether the effects actually stem from the incompatibility of ethical benefits with non-compensatory screening. We thus display ethical attributes for all alternatives and repeatedly remind respondents about them. Finally, Study 4 tests another response format, in which consumers are confronted with a large assortment and select all alternatives they would consider.

Method

Recruited from a representative consumer panel in Germany, 2,112 participants (49% female, mean age = 44 years) were randomly assigned to one of seven experimental conditions in a 2 (emotional intensity of ethical strength: low vs. high) × 3 (diagnosticity: competing brands without ethical information vs. average ethical value of competing brands vs. below-average ethical value of competing brands) between-subjects design, together with a control group that did not receive any ethical information about the focal and competing brands. We used the product categories from Study 3 (sunscreen, water, and chocolate) and again surveyed demographics, previous purchase behavior, and brand familiarity. As a cover story, we explained that we were interested in product preferences in three categories. We then showed participants a fictitious supermarket shelf, with twelve products for each category, and asked them to indicate which products they would consider purchasing (see Web Appendix F). As in Studies 1–3, available information included the brand, a product image, a brand slogan, a price, additional product characteristics, and, depending on the experimental condition, CSR. We randomized the order of product categories across participants. To address potential order effects, we created two sets of shelves for each category with randomized product positions and then randomly assigned participants to one of these sets. The results do not differ across sets, so we collapsed the resulting data. For each product category, we selected one unknown brand as the focal brand, and chose brand names not indicative of the country of origin (e.g., Aveeno for sunscreen, Feodora for chocolate).

Our primary dependent variable is whether the focal, unknown brand enters the respondent’s consideration set, depending on the experimental condition. In the control condition, none of the twelve products featured any information on product ethicality. In the low emotional intensity condition, the focal product provided a textual description and example of its ethical engagement (e.g., “above-average CSR through drinking water projects for families in Tanzania”). In addition, the high emotional intensity condition displayed a picture of the potential recipient that also addressed the consumer (e.g., “help families like Tajo’s”, see Web Appendix F).

The diagnosticity conditions differed according to the CSR information provided about the 11 competing brands. In the non-diagnostic condition, no other brand provided any CSR information. In the two diagnostic conditions, all competing brands displayed either average or below-average CSR. Following all consideration tasks, we asked participants to indicate which attributes (CSR, price, brand, product claim, or product characteristics) served as screening criteria and to distribute 100 points across all attributes based on their relevance for screening. Next, they stated their degree of certainty about the country of origin of each focal unknown brand and two randomly chosen other brands per category, then rated the product category as hedonic or utilitarian (7-point Likert scales; Strahilevitz and Myers 1998). Finally, as a control check, we asked participants to identify all products with above-average CSR, before continuing with an unrelated final task.

Results

The manipulation checks for brand familiarity and country of origin for the focal products indicate scores significantly below the scale midpoints of 4. That is, t-tests for both brand familiarity (MSunscreen = 1.52, t(2,111) = −90.18, p < .01; MChocolate = 3.39, t(2,111) = −12.45, p < .01; MWater = 1.54, t(2,111) = −86.83, p < .01) and country of origin (MSunscreen = 2.23, t(2,111) = −51.84, p < .01; MChocolate = 2.93, t(2,111) = −26.28, p < .01; MWater = 2.28, t(2,111) = −50.65, p < .01) confirm our manipulation and suggest country-of-origin inferences are unlikely to have an impact.
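These manipulation checks are one-sample t-tests against the scale midpoint. As a rough plausibility check, the sketch below shows the test statistic and the standard deviation implied by a reported mean and t value. Standard deviations are not reported in this section, so the implied value is a back-of-the-envelope figure derived from rounded inputs, and the function names are introduced here for illustration.

```python
import math


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> float:
    """t statistic for testing a sample mean against a fixed value mu0."""
    return (mean - mu0) / (sd / math.sqrt(n))


def implied_sd(mean: float, t: float, n: int, mu0: float) -> float:
    """Standard deviation implied by a reported mean and t statistic."""
    return (mean - mu0) * math.sqrt(n) / t


n = 2112          # participants, so df = n - 1 = 2,111
midpoint = 4.0    # midpoint of the 7-point scale

# Reported brand familiarity for the sunscreen focal brand: M = 1.52, t = -90.18.
sd_sunscreen = implied_sd(1.52, -90.18, n, midpoint)
print(round(sd_sunscreen, 2))

# Round trip: the implied SD reproduces the reported t statistic.
t_check = one_sample_t(1.52, sd_sunscreen, n, midpoint)
print(round(t_check, 2))
```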

Figure 3 presents a descriptive overview of the consideration shares, consolidated across product categories for the focal brands. The non-significant differences between the control and low emotional intensity groups for the non-diagnostic conditions (9.7% vs. 9.8%, χ2(1) = .01, p = .91) confirm our findings from Studies 1–3, such that ethical strengths are not relevant for screening. Likewise, the non-significant differences between non-diagnosticity and average diagnosticity conditions when emotional intensity is low (9.7% vs. 12.1%, χ2(1) = 2.66, p > .10) suggest our results are not driven by the diagnosticity of the ethical attribute across alternatives.

Fig. 3 Share of subjects including the focal product in their consideration set in percent (Study 4).

Note: Participants selected the products they would consider for purchase. Results display the share of subjects including the focal product in their consideration set. Non-diagnostic: competing brands without ethical information; diagnostic (below-average): competing brands with below average ethical value; diagnostic (average): competing brands with average ethical value. Superscripts indicate statistically significant differences within each diagnosticity group (1,2 at p < .05; 3,4 at p < .10)

In contrast, the consideration shares of the focal product are significantly higher in the high versus low emotional intensity condition for the non-diagnostic treatment (16.8% vs. 9.7%, χ2(1) = 20.23, p < .01). This confirms H3, i.e., high emotional intensity offers an effective lever for marketers to overcome the lack of relevance of ethical strengths in screening and makes ethical strengths both more relevant and easier to process during screening considerations.

Additional negative ethical information for all competing brands increases consideration likelihood for the focal brand when emotional intensity is low (9.7% vs. 13.3%, χ2(1) = 5.78, p < .05). In line with Study 3, the negative valence of ethical attribute information for the competing brands appears to increase the importance of ethics for screening (H2). High emotional intensity also benefits the focal brand when all competing brands feature below-average ethical scores (16.0% vs. 13.3%, χ2(1) = 2.76, p < .10).

When all brands in the market have average levels of CSR, we find no significant differences between high and low emotional intensity (12.4% vs. 12.1%, χ2(1) = .04, p = .83) or between low emotional intensity and the control (12.1% vs. 9.8%, χ2(1) = 2.28, p = .13). Only the difference between high emotional intensity and the control is significant at p < .10 (12.4% vs. 9.8%, χ2(1) = 2.97). Taken together, this suggests two effects of diagnosticity: Diagnosticity itself does not appear to increase the importance of ethical strengths overall and average levels appear to mitigate the perceived difference between average and above-average CSR so that the focal brand cannot create a competitive advantage via its above-average CSR.

To investigate these relationships in more detail, we ran a random intercept regression model, considering intra-class correlations across all three product types as repeated measures. The choice of the focal brand provides our primary dependent variable, and we limited the analysis to participants in the CSR groups, excluding the control condition (n = 1,810). When estimating a model with main effects only we find a significant effect of high emotional intensity (bemotional intensity = .38, z = 2.55, p < .05), but diagnosticity is non-significant (bnon-diagnostic = .14, z = .73, p = .46; bnegative diagnosticity = .28, z = 1.53, p = .13). Adding the interactions of emotional intensity and diagnosticity reveals a positive moderation when comparing the non-diagnostic to average diagnostic group but no such interaction effects comparing negative diagnosticity with average diagnosticity (bemotional intensity × non-diagnostic = .80, z = 2.17, p < .05; bemotional intensity × negative diagnosticity = .26, z = .71, p = .48; all other main effects p > .28). Thus, high emotional intensity seems most effective under non-diagnostic conditions, reflecting real-world settings in which information about ethical attributes tends to be available for only a few products.
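To make the size of these coefficients easier to read, they can be converted to odds ratios, assuming the model uses a logistic link (the dependent variable is binary, but the link function is an assumption here). The dictionary keys below are shorthand labels introduced for this sketch. Note that the first coefficient comes from the main-effects model and the second from the model with interactions, and that in a random intercept model these ratios are conditional on the respondent.

```python
import math

# Coefficients reported for the Study 4 models (log-odds scale, assuming a
# logistic link; the labels are shorthand introduced here).
coefficients = {
    "emotional_intensity": 0.38,
    "emotional_intensity_x_non_diagnostic": 0.80,
}

# exp(b) gives the multiplicative change in the odds of considering the focal brand.
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}
for name, ratio in odds_ratios.items():
    print(f"{name}: odds ratio = {ratio:.2f}")
```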

Discussion

In line with our previous studies, Study 4 affirms that ethical strengths do not seem to influence consideration set formation; the consideration shares of the focal brand remain unchanged between the control and low emotional intensity, non-diagnostic groups. As in Study 3, ethical weaknesses serve as a boundary condition. For all three products, the presence of negative CSR information increases the importance of CSR overall, irrespective of emotional intensity, as indicated by the significantly greater number of participants who selected the focal product, compared with the non-diagnostic condition. Negative CSR information therefore emerges as a more important screening criterion than the simple presence or absence of CSR.

Study 4 further suggests that increasing the salience of ethical attributes by providing average ethical information about all alternatives does not suffice to make it relevant for screening. Note that this result contrasts with the findings of Kivetz and Simonson (2000) and Slovic and MacPhillamy (1974), according to which consumers overweight egoistic attributes with information available for all alternatives due to their higher salience and diagnosticity. For the ethical attributes we study, the consideration of focal brands does not differ even when respondents receive repeated reminders of ethical attributes through ethical information for each brand. Making ethical strengths relevant for screening instead requires raising their subjective importance and ease of processing, by increasing their emotional intensity.

Due to the objectives for this study, only an unknown focal brand achieved a higher value on the ethical attribute. If only a single less-established brand follows a certain course of action, it may have appeared less relevant to participants superficially screening alternatives. We therefore conducted a follow-up study with 616 German panel participants, in which both the unknown brand and another well-known brand (Volvic for water, Nivea for sunscreen, and Ritter Sport for chocolate) exhibit ethical strengths. With this design, we could investigate whether a well-known competitor increases the salience and weight of ethical strengths for the product category overall, such that the unknown brand benefits from a spillover effect. We also controlled for the spatial distance between the known and unknown brands on the shelf. Then, we randomly assigned participants to one of four conditions in a 2 (unknown brand with vs. without CM campaign) × 2 (unknown brand positioned next to or far from the known brand featuring CM) design; the well-known brand ran a CM promotion in all conditions. The rest of the design mimicked that of Study 4, except that in the chocolate consideration task, we replaced the less-known brand Feodora with Bensdorp, because Feodora’s consideration share was high (16%) even without ethical benefits in Study 4, suggesting that it might have appeared more familiar to participants than we anticipated. The random intercept repeated measures logistic regression of the impact of the shelf position of the unknown brand on consideration shares did not yield significant results (p > .10), so we collapsed all position conditions for the analysis.

For the focal unknown brand with and without CM, we do not observe significant differences in consideration shares in the chocolate category (7.4% vs. 5.6%, χ2(1) = .80, p = .37). For water, we find a directional effect but no significant difference according to conventional alpha levels (7.7% vs. 4.6%, χ2(1) = 2.54, p = .11). For sunscreen, the unknown brand benefits from CM (9.3% vs. 4.9%, χ2(1) = 4.41, p < .05). A random effects logistic regression across all three categories produces a marginal CM effect (bCM = .65, z = 1.78, p < .10). In contrast with a lone unknown brand, with or without CM, these indications imply that unknown brands might benefit more from CM if they are following the lead of a well-known brand that also promotes CM. However, these effects primarily reflect the differences observed in a single product category. Also, the weak findings may be due to the increased competition of two instead of a single brand with the same ethical strengths. However, the relatively low effect of a well-known brand with a similar ethical profile is consistent with the conjecture that salience is not the major driver of the relevance of ethical attributes during screening. Emotional intensity appears to play a more important role.

Increasing emotional intensity, using text and images, seems an effective means to increase the relevance of ethical strengths in consideration set formation. We observe the highest effect in the non-diagnostic condition, which mimics actual market settings most closely. In relative terms, moving from low to high emotional intensity raises consideration likelihood by 73% (from 9.7% to 16.8%). This appears to be a practically important effect.
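The 73% figure follows from the consideration shares reported for the non-diagnostic condition in the Results (9.7% under low and 16.8% under high emotional intensity). A short worked calculation, with variable names chosen here for illustration:

```python
# Consideration shares (%) reported for the non-diagnostic condition in Study 4.
low, high = 9.7, 16.8

absolute_gain = high - low                 # difference in percentage points
relative_gain = (high - low) / low * 100   # percent of the low-intensity baseline

print(f"{absolute_gain:.1f} percentage points, {relative_gain:.0f}% relative increase")
```

The distinction matters for interpretation: the absolute gain is about 7 percentage points, whereas the relative gain is expressed against the low-intensity baseline.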

General discussion

This paper makes several contributions. Four studies provide consistent evidence across five product categories and several brands on the differential role of ethical attributes in the consideration versus choice stages of the consumer journey. The empirical evidence indicates that ethical strengths can drive decision making in the choice phase, but consumers do not actively seek ethical strengths in their consideration set formation process when applying decision heuristics, as ethical values are not consistently available and are inherently hard to compare with other ethical and egoistic attributes. Conversely, negative ethical information is already influential in the consideration stage, as ethical misconduct represents a convenient exclusion criterion. Finally, managers can attenuate the lack of relevance of ethical strengths in screening by increasing the emotional intensity of the ethical claim.

Theoretical implications

This study extends prior theorizing on the value of ethical attributes for consumer purchase decisions (Arora and Henderson 2007; Chernev and Blair 2015) and ethical judgments in different evaluation modes (Baron and Spranca 1997; Ehrich and Irwin 2005; Irwin 1994; Irwin and Baron 2001; Irwin and Naylor 2009). We identify important boundary conditions related to consideration set formation and large assortments, revealing effects that differ from those of singular evaluations or small assortment decisions.

In particular, our results suggest that it is critical to take the full decision-making journey into account. Studying only choices from a small assortment can lead to an overestimation of the impact of ethical attributes, especially for unknown brands. Given that many actual markets may feature even more products than the twelve alternatives we tested, taking consideration set formation into account appears important.

While ethical strengths are often advocated as product differentiation tactics, their competitive advantage appears to diminish in large competitive sets. This might appear to be in contrast to findings on objectively irrelevant attributes, which have been shown to be capable of driving demand when only a single alternative features such an attribute (Carpenter et al. 1994). Yet these attributes typically allow consumers to develop a naïve theory and assume potential self-related benefits. In our studies, the differentiation refers to altruistic motives, which appear less suitable for differentiation in large assortments than are self-related benefits.

We also contribute to research that evaluates how ethical attributes are weighted across different response formats. Different self-serving motivations might reduce the weight of ethical attributes in decision making, if consumers try to avoid the costs of being moral (Batson et al. 1999), willfully ignore (Ehrich and Irwin 2005) or forget negative ethical information (Zane et al. 2016), or ignore questionable CSR practices if the product is very desirable (Paharia et al. 2013). Our results suggest a more conciliatory view: Even when consumers have other-serving motivations, their decision heuristics may be incompatible with evaluations of ethical attributes. Ethical attributes reflect a wide range of criteria, are difficult to compare, and do not feature uniformly available information across products, so consumers are unlikely to rely on them to screen many alternatives. Screening requires quick, intuitive decisions—contradictory to the complex evaluations required by ethical benefits. Actual market settings can thus demand too much effort from consumers to allow them to take ethical information systematically into account.

Our findings also distinguish between positive and negative ethical attributes. The results on positive ethical conduct resonate with research that suggests the value of public goods or ethical attributes is hard to evaluate and translate into monetary values (Kahneman and Knetsch 1992). Yet this difficulty does not hold true for negative ethical attributes. They rather provide easy-to-justify heuristics and function as simple exclusion criteria for screening, especially if they come from a reliable third-party source and offer comparability across product alternatives.

In addition, studies on single vs. joint evaluations of ethical courses of action suggest that differences in response mode can result in reversed relative preferences between two options (Kahneman and Ritov 1994). We extend these findings by studying joint evaluations of many more alternatives, in which multiple heuristic screening processes might take place. In such settings, ethical strengths lose importance and can have very little influence.

Finally, prior research on self-centered attributes has suggested emotional responses often drive screening when individuals make complex choices under time pressure (Finucane et al. 2000). According to our findings, this extends to emotions triggered by other-related information. Compassion may play a strong role in moral conduct and donation interest. Evoking such considerations in product screening appears to require vivid representations of ethical attributes with high emotional intensity.

Practical implications

Our findings suggest traditional market research may be misleading when studying the impact of ethical attributes. In particular, ethical attributes can have different effects on consideration and choice, such that simple attitudinal judgments, purchase intent measures, or measures of decision making from a limited set of alternatives may not match consumer behavior in the actual marketplace whenever screening is relevant. When testing the impact of ethical attributes, methods such as the DOC approach from Study 2 are likely to provide valuable additional information on non-compensatory decision making.

Also, depending on brand familiarity, marketers have different prospects when seeking additional market share with ethical benefits. According to our findings, well-known brands are more likely to increase market share through their ethical strengths in small as well as large assortments. Less-known brands are more likely to require other means to gain entry into consideration sets, since ethical benefits alone do not appear to drive consideration. Often, less-known brands follow the lead of better-known competitors. In large assortments, doing so can result in better prospects than pioneering ethical strengths. Shelf positioning beyond the established competitive space also might drive consideration. For example, organic food or fair-trade corners in supermarkets emphasize ethical considerations. However, the impact we have observed is rather small.

For unknown brands, it might be more promising to nudge consumers to consider ethical attributes during screening by increasing their emotional intensity. As in Study 4, using pictures and a personal description of the CSR beneficiary appears feasible even for smaller, unknown brands with limited resources. This could help less-known brands to overcome their dual disadvantage of being less attractive to non-profit partners and requiring more effort to ensure consideration. Note that this positive effect of emotional intensity is unlikely to be limited to unknown brands, as it is likely to impact both the consideration and the choice phase (see Kogut and Ritov 2005a for related findings on the choice phase).

As self-centered attributes are more common screening criteria, less-known brands also could combine ethical attributes with self-centered benefits, for example, better taste due to the use of organic ingredients or enhanced skin care from creams without microplastics (Peloza et al. 2013). Such strategies might increase the visibility and consideration probability of ethical strengths. When less-known brands manage to enter consumers’ consideration sets, the relative market share gain can exceed even that of better-known competitors. Both less-known and well-known brands may be interested in increasing consideration likelihood when they target new customer segments. Well-known brands likely appeal to these new segments with ethical attributes and tactics similar to those used in segments in which they are already well established. However, unless their brands are already part of the consideration set of this new segment, such efforts can produce lower effects than expected.

Our results offer two main insights for policy makers and third-party services providing CSR profiles for consumers. First, a lack of salience and ease of interpretation can result in consumers not relying on ethical attributes for screening. When ethical information is more salient and provides an objective, trustworthy measure of ethical performance, it is less likely to disappear in large assortments. Several countries have started to test objective ratings in related areas such as nutrition (WHO 2017), which could be applied to ethical attributes as well. Mobile apps and in-store information provide additional means to deliver such information to consumers. Investments by policy makers in rating instruments are likely to reduce the number of companies that refrain from ethical objectives due to a lack of economic incentives.

Second, we study decision making within specific product categories. Yet consumers make important decisions across domains (e.g., consumption vs. donation, self-centered career choices vs. social engagement, supporting environmental vs. humanitarian causes). For these choices, the alternative courses of action are even more complex to evaluate than the ones we have studied. Public policy makers seeking to drive socially desirable behavior thus would be well-advised to make ethical benefits easy to understand, e.g., by stressing specific examples.

Limitations and further research

There are, of course, limitations to this research. A main limitation is the data. We took care to present realistic alternatives, employing actual and fictitious brands and different measures of consideration set formation, yet all of the investigated choices are hypothetical. The minimal impact of ethical strengths on consideration set formation suggests that social desirability did not bias the results. However, actual product choices can evoke different effects of emotion-laden attributes (Shiv and Fedorikhin 1999). For example, in an actual purchase decision for chocolate, ethical strengths may provide a vehicle for rationalizing impulsive, short-term hedonic benefits, which are more salient in actual shopping environments than in a hypothetical laboratory setting.

In line with prior research, we assessed ethical strengths tied to altruistic values (Irwin and Naylor 2009). We expect directionally similar but weaker differences between consideration and choice phases if the ethical strengths are also associated with self-serving benefits. For example, in clothing, organic cotton may appear to have both ecological and wearing benefits. Further research could investigate this. Prior research also distinguishes ethical attributes according to their link to integral or peripheral physical product elements (Gershoff and Frels 2015). Further research might extend this reasoning to address central versus peripheral brand image dimensions and build upon our findings on brand familiarity. For example, Kirmani et al. (2017) show that an underdog positioning can attenuate lower competence associations.

We studied assortment size in terms of the number of alternatives, not multiple facings of individual options. Assigned shelf space could influence the visibility of ethical strengths in decision making and it is unclear whether our assortment size findings transfer to such cases. Our study is also limited in terms of the cultures it covered since all of our participants originated from a German population. For Germany, average orientations towards ethical purchasing have been reported (Edelman 2012). Other cultures can assign higher but also lower importance to ethical attributes (Winterich and Barone 2011). While the absolute interest may differ, the relative differences between consideration and choice appear less likely to vary across cultures. However, empirical cross-cultural research on consideration set formation and ethical attributes would be valuable to understand the impact of cultural differences in more detail.

For markets where consideration set formation plays a role, this research has revealed challenges leveraging ethical strengths if brands lack other types of screening benefits (e.g., low prices or high brand familiarity). Our findings indicate that raising the emotional intensity by displaying recipient images can compensate for this lack of impact. Evidently, any victim-related image contains additional information compared to a text-only description. Further research may test other operationalizations of emotional intensity. For example, the World Wildlife Fund highlights irreversible negative consequences to motivate donations for rain forest protection: “we cut off something that doesn’t grow again.” Firms may use similar tactics by highlighting the negative consequences of not supporting the promoted cause or stressing the responsibility of consumers by emphasizing continuous negative ethical consequences when support is lacking. Such operationalizations would compare different text content containing similar information to disentangle the effect of additional information and emotional intensity.

We purposefully focused mainly on low-priced, everyday consumer goods where ethical attributes are prevalent in everyday shopping contexts. Previous research on self-interested versus other-related moral reasoning has also studied more expensive consumer durables (Ehrich and Irwin 2005; Irwin and Naylor 2009). Studying such categories may produce different effects and shed further light on product category effects related to self-related versus other-related reasoning. Another avenue for further research is the moderating role of the hedonic nature of the product category. Prior research with small assortments indicates a stronger influence of ethical attributes on preferences for hedonic relative to utilitarian products, helping to offset anticipated consumption guilt (Strahilevitz and Myers 1998). While we did not intend to replicate these effects, Study 4 does not reveal any differences across utilitarian and hedonic product types. Similar empirical extensions of studies with smaller assortments to larger assortment problems will likely provide many new insights on the effects of ethical attributes.

We hope this article motivates research in these and related directions as the consumer decision-making journey appears an important perspective for understanding the impact of ethical product attributes.