Introduction

As researchers have established the managerial relevance of corporate social responsibility (e.g., Aguinis and Glavas 2012; Porter and Kramer 2011), the study of the factors that drive corporate social performance has become a key concern in business ethics (Brower and Mahajan 2013; Crilly 2011; Ioannou and Serafeim 2012). Corporate social performance (CSP) can be defined as the measurement of organizational outcomes in the environmental, social, and governance (ESG) domains with respect to multiple stakeholders, such as employees, local communities, or shareholders (Chen and Delmas 2011; Clarkson 1995; Orlitzky et al. 2003). Scholars have theorized multiple drivers of CSP at industry, country, and national business system levels (Aguilera et al. 2007; Campbell 2007; Matten and Moon 2008), and prior empirical research attempted to clarify whether and how these macrolevel drivers may interact with specific firm-level drivers of CSP (Brower and Mahajan 2013; Crilly 2011; Udayasankar 2008).

Missing from this research stream are two important considerations. First, although prior research has established that CSP drivers operate at the firm, industry, country, or national business systems (NBS) levels of analysis, surprisingly little is known about the relative influence of these factors (Aguilera et al. 2007) as well as the influence of time as a potential driver of CSP. These omissions are detrimental to knowledge about how CSP can become more strategic and, thus, more conducive to higher corporate financial performance (Orlitzky et al. 2011; Porter and Kramer 2006) as well as other important organizational outcomes across multiple levels and over time (Aguinis and Glavas 2012). At the same time, these knowledge gaps limit managers’ understanding of CSP priorities (Smith 2003), especially in corporations that operate across multiple industries, countries, or national business systems.

Second, despite the centrality of a stakeholder perspective in the theoretical analysis of CSP (Carroll 1991; Clarkson 1995; Freeman et al. 2007), prior empirical examinations of CSP drivers rarely made distinctions between the different stakeholder groups. Rather, empirical investigations focused on CSP breadth (Brower and Mahajan 2013), examined only aggregate proxies of CSP (e.g., Surroca et al. 2010), or did not separately study the different stakeholder foci in CSP (e.g., Ioannou and Serafeim 2012). As a result, our current knowledge of the relative influence of the factors that drive CSP across different stakeholder dimensions remains rather limited.

This study starts addressing these two important limitations. In regard to the first limitation, our analysis can be considered exploratory as there is at present no theory that would explain or specify the extent to which the different drivers affect CSP (let alone its stakeholder dimensions). This means that, like seminal studies in strategic management that took a similar analytic approach with a different outcome variable (e.g., Rumelt 1991), our study focuses on the magnitude of effect sizes. In general, such a descriptive focus on the magnitude of effect sizes has been recommended as methodological best practice (e.g., Cumming 2012; Hunter 1997; Kline 2004; Orlitzky 2012; Schmidt 1992). However, most researchers currently eschew such an emphasis on effect size magnitude in favor of the binary outcomes of null-hypothesis significance tests (Schmidt 1996; Ziliak and McCloskey 2008).

In regard to the second limitation, we rely in this study on disaggregated, stakeholder-focused measures of CSP not only to be consistent with prior theorizing (e.g., Clarkson 1995; Hillman and Keim 2001), but also to make the findings more operationally meaningful for managerial practice, as prior studies have shown that managers and employees perceive CSP mainly through a stakeholder lens (El-Akremi et al. 2015; Lucea 2010; Turker 2009). More specifically, the present study presents analyses for these six stakeholder groups separately: customers, local communities, shareholders, suppliers, the natural environment, and employees.

In shedding light on the relative importance of CSP drivers across multiple levels of analysis and for multiple stakeholder groups through the application of various methods, this study contributes to the literature in three major ways. First, this study advances stakeholder theory by showing that the relative influence of CSP determinants varies according to the stakeholder group considered. Our findings show that the firm level accounts for a lot of variability in CSP focused on local communities, the natural environment, and employees, whereas macrolevel drivers seem more important for shareholder-focused CSP. Second, we address recurrent calls for multilevel analyses (e.g., Aguilera et al. 2007; Aguinis and Glavas 2012) in comparing and contrasting the relative importance of three levels of possible sources of organizations’ variance in CSP: (a) national business systems (country); (b) industry; and (c) firm. We also add to this perspective by considering time (year) as a fourth level, a dimension that has often been neglected in prior CSP analyses (Griffin and Mahon 1997). Overall, our findings point to the primacy of firm-level CSP drivers, but also demonstrate the importance of higher levels of analysis by showing that national and supranational factors may, to some extent, affect specific stakeholder components of CSP. Finally, we compare and contrast the findings of three analytic techniques (analysis of variance, variance components analysis, and hierarchical linear modeling). As far as we know, this is the first multilevel analysis of CSP to compare the effect sizes calculated by each of these techniques across levels of analysis. So, similar to Hough’s (2006) study design for return on assets, this study adopts a multimethod perspective.

In the following section, we describe the theoretical background of this study and present three hypotheses regarding the importance of levels of analysis for CSP. To develop these hypotheses, we draw on insights from economics, management, and comparative sociology as well as the empirical and theoretical CSP literature. We then test our hypotheses using the SiRi dataset, which allows for the disaggregation of CSP by stakeholder domain. The third section introduces our methods as well as our sample, measures, and sources of data. The fourth section of the paper presents our analyses and results. Finally, we discuss the implications of our findings for theory and practice as well as the limitations of the study and potential future research directions.

A Multilevel Perspective on Corporate Social Performance

Macrolevel: Country and National Business Systems Factors

Variations in organizational adoption and implementation of CSP can be explained by a wide range of factors operating at different levels of analysis. First, social macrostructures have often been emphasized as key determinants of CSP (e.g., Aguilera et al. 2007; Matten and Moon 2008). These macrolevel institutions, or national business systems (NBS), capture institutional nation-state differences in firms’ macroenvironments (Morgan 2007), which in turn have been found to affect firm decisions, for example in the automobile industry (Biggart and Guillén 1999). Applied to CSP, distinct national policy frameworks that encourage social and/or environmental initiatives may affect organizations’ decisions (Spence 2007; Tantalo and Willi 2012). Accordingly, the country level and the NBS level have both been theorized as likely to explain part of the CSP variation across firms.

Mesolevel: Industry Factors

Industry forces have also been proposed as constraints on, or enablers of, CSP (Baird et al. 2012; Hull and Rothenberg 2008; Orlitzky and Shen 2013). For example, in industries that experience economic downturns, discretionary CSP expenditures may be one of the first corporate spending cuts (Campbell 2007). Conversely, in highly unionized industry environments, labor unions may put a lot of pressure on companies to increase the level of CSP exhibited toward workers and insist on the enforcement of “fair trade” standards, which may create trade barriers in a quest to protect blue-collar workers’ jobs from possibly less expensive imports (Ederington and Minier 2003; McWilliams et al. 2002). Hence, mesolevel industry factors are also expected to influence CSP variance across firms.

Microlevel: Firm-level Factors

Several authors also theorized that CSP may primarily be determined by firm-level actions and variables (e.g., McWilliams and Siegel 2001; Udayasankar 2008). Often, organization-level factors constrain firms’ spending on CSP. For example, organizational efforts to increase CSP may increase transaction and other costs (King 2007), such as organizational expenditures associated with identifying partners or stakeholders to be targeted by CSP, negotiating with these partners or stakeholders, and monitoring and enforcing compliance (e.g., monitoring of suppliers’ compliance with sustainability programs or workplace safety). So, because of the inherent costliness of genuine CSP (Friedman 1970; Orlitzky 2013; Windsor 2001), an important precondition for high CSP is the availability of slack resources (McGuire et al. 1988; Orlitzky et al. 2003; Waddock and Graves 1997). In addition, CSP may be constrained by customers’ reluctance to pay a premium for a firm’s socially responsible products (Bhattacharya and Sen 2004; Frazier 2007) or investors’ unwillingness to punish irresponsible companies or reward responsible ones (Rivoli 2003).

The Relative Influence of Factors Across Levels

Although prior theory clearly suggests that each of these three levels of analysis (NBS/country, industry, and firm) matters, no empirical evidence has provided numerical estimates of these factors’ relative weights and simultaneous impact. In addition, the time dimension has often been neglected in prior studies of CSP (Griffin and Mahon 1997). Rather than constituting a time-invariant outcome of deterministic firm-, industry-, or country-level influences, CSP may instead represent a highly time-contingent or transient decision process (e.g., Wang and Choi 2013). For example, recessions may severely limit the level of funding available for discretionary social and environmental initiatives as well as determine the strategic benefits of CSP (Campbell 2007; Lee et al. 2013; McWilliams and Siegel 2001). Coding for year captures this time dimension in the same way as coding for firms captures all the different firm-level variables that may affect CSP across all years of our time period (Rumelt 1991, p. 173). Hence, our overarching research question is aimed at addressing these multilevel issues. Our overall analytic approach, which is summarized in Fig. 1, can be stated as follows:

Fig. 1 Overview of study

Research Question

To what extent do (1) country-level, (2) industry-level, (3) corporate-level factors, and (4) time account for the variability of different types of corporate social performance?

Hence, the focus of the present study is on the relative empirical importance of factors at different levels of analysis—a question that cannot be answered from the conventionally sizeless and binary hypothesis-testing perspective in the social sciences (Kline 2004; Orlitzky 2012; Schmidt 1996; Ziliak and McCloskey 2008). The research question implies that our perspective is not only descriptive, but also exploratory because so far no theory has emerged that would specify the magnitude of effect sizes with respect to the different sources of variability in CSP. Nonetheless, with theoretical guidance from the extant literature, our study also goes beyond this descriptive perspective by testing three more specific hypotheses as well as a methodological proposition.

Hypothesis Development

Macrolevel forces undoubtedly influence CSP (Aguilera et al. 2007; Ioannou and Serafeim 2012) by shaping the regulations and legal standards that exist in each domain of CSP and hence homogenizing the playing field within which firms compete through CSP. However, these macroforces are not necessarily homogeneous across CSP stakeholder domains because the institutional norms regulating what is regarded as appropriate behavior vis-à-vis each stakeholder group vary nationally and globally (Hall and Soskice 2001; Jackson and Deeg 2008). We argue here that the effects from these higher levels of analysis will pale in comparison to the interfirm variance in CSP that can be attributed to the firm level, except in the domain of corporate governance where national legal factors tend to have a prominent influence (Jensen and Meckling 1976; La Porta et al. 2000). Accordingly, we propose that the firm level is the main determinant of almost all stakeholder dimensions of CSP, except for shareholders (i.e., corporate governance practices). We now theorize further the reasons behind these differences.

The Predominance of Firm-Level Factors

Several arguments suggest that firm-level factors may explain a greater proportion of interfirm variance in CSP relative to other levels. First, CSP is increasingly used strategically (McWilliams et al. 2006; Orlitzky et al. 2011) to differentiate the firm from its competitors. In such a strategic approach, firm-specific cost-benefit analyses can be assumed to take center stage (Mackey et al. 2007; McWilliams and Siegel 2001). Let’s first consider the cost side. Implementing CSP is costly because, from an economic perspective, high CSP reflects a firm voluntarily internalizing its externalities (Lyon and Maxwell 2008). Externalities, defined as the results of market transactions that are not themselves embodied in such transactions (Coase 1960; Crouch 2006), are typically not fully reflected in prices and so lead to a divergence of private and social costs (Dahlman 1979; Pigou 1962). In other words, CSP refers to nonmarket actions by which firms take “ownership of the externalities they generate” (Crouch 2006, p. 1534). In addition to the direct costs associated with CSP (see also Orlitzky 2013), companies that are committed to CSP also incur transaction costs (Macher and Richman 2008), which apply not only to social/environmental partnerships, but also to the adoption of any CSP initiative more broadly (King 2007). These additional costs incurred by organizations high in CSP explain why there is often no short-term strategic incentive for individual companies to increase CSP (Campbell 2006), especially if intense market competition prevents such an organizational focus on social and environmental concerns (Doane 2005; Reich 2008; Shleifer 2004; Vogel 2005).

At the same time, economic rents are often appropriable from CSP because, in the long run, transaction costs and uncertainty can sometimes be reduced by increasing CSP (Hosmer 1995; Jones 1995). When rents are appropriable by particular organizations with particular attributes, firm-level factors are expected to account for most CSP variability. This study is based on the assumption that profit-seeking economic actors will only make costly decisions if economic rents are anticipated as a result of those expenditures (Alvarez and Barney 2004; Coase 1937; Mackey et al. 2007; Schoemaker 1990).

Second, arguably even more important in explaining variance of CSP at the firm level is the fact that, in order to be effective, CSP must be embedded in particular organizational cultures (Swanson 2014). In creating the proper firm-specific conditions for meaningful, value-attuned CSP, the business executive’s mindset becomes highly important: a mindset of value discovery transcends legal, economic, or social pressures and is able to “engage employees in the quest for social responsibility” (Swanson 2014, p. 123). In turn, an organization’s culture and climate, emphasizing either compliance or values (Collier and Esteban 2007; Duarte 2010), exerts a powerful influence over commitment and engagement of the entire workforce (Slack et al. 2015). Other research suggests that organizational culture is causally ambiguous, socially complex (Dierickx and Cool 1989), path-dependent (Barney 1991), and often difficult for rivals to imitate, particularly if it forms part of an organization’s unique identity orientation in defining its relationships with stakeholders (Brickson 2007). Therefore, only if social and environmental initiatives converge around a set of highly standardized, institutionalized, and therefore relatively homogeneous practices (Orlitzky 2013) can we expect higher-level institutional drivers to outweigh the influence of idiosyncratic firm-level influences of CSP. Based on this theoretical reasoning, our overall expectation is that, for most aspects of CSP, firm-level factors account for most CSP variability because they either constitute firm-level economic constraints, predetermine the strategic/economic opportunities that can be anticipated from CSP, or are deeply, holistically, and intangibly embedded in each individual firm’s unique DNA.

H1

Most of the interfirm variability in CSP overall as well as its disaggregated stakeholder-oriented components is attributable to the micro (firm) level.

The Prevalence of Macrolevel Factors for Shareholder-Oriented CSP

Nonetheless, for the specific case of shareholder-oriented CSP, we expect variations in CSP to be explained more by macrolevel factors than firm-level factors. Three lines of argument back this hypothesis. First, in essence, good corporate governance involves organizations satisfying shareholders’ informational, ownership, and other interests that have not been sanctioned by regulations or the law yet (Macey 2008). For example, business executives and directors may decide to forgo pay in favor of organizational reinvestment of earnings, stock buybacks, or large dividend payments to shareholders. Often, prioritizing shareholders over other stakeholders (i.e., a strong focus on shareholder-oriented CSP, or corporate governance reforms) is triggered not so much by firm-specific events, but by higher-level motivations of stock market revitalization or the withdrawal of the state from economic activity (i.e., by a countrywide move away from corporatism) (Meyer and Höllerer 2010). This is unsurprising as an enhanced focus on shareholders—or firm owners—is not necessarily in the best interest of entrenched managers, who make the operational decisions about shareholder-oriented CSP. In contrast, improvements of many other aspects of CSP often have a direct influence on the attitudes and perceptions of customers or employees (Bhattacharya et al. 2009; El-Akremi et al. 2015), which can be expected to enhance corporate reputations and thus performance (Fombrun 2005). That is, when executives of large corporations can be assumed to be firmly entrenched, only macrolevel forces can be expected to affect differences in shareholder-oriented CSP. National governments and NBS may not only regulate or constrain, but also enable corporate actions that prioritize the firm owners’ property rights (see also Campbell and Lindberg 1990; Davis 2005).

Second, financial theory suggests that differences across countries largely account for differences in corporate actions in relation to shareholder management (Jensen and Meckling 1976; La Porta et al. 2000). This literature stresses the importance for shareholder management of macrolevel factors, such as laws protecting shareholders from expropriation, as well as the effectiveness of the enforcement of these laws across countries (for an overview, see La Porta et al. 2000). For instance, Doidge et al. (2007) found that country characteristics accounted for much more variance in firm-level governance ratings (ranging from 39 % to 73 %) than observable firm characteristics (ranging from 4 % to 22 %). If this logic of macrolevel dominance extends not only to core governance practices, but also more broadly to firms’ management of their relationships with shareholders in the extrafinancial domain, we should expect macrolevel factors, and in particular country-level factors, to explain relatively more variation in shareholder-related CSP than in other CSP dimensions.

Third and finally, this expectation is also in line with managers’ instrumental considerations. Although effective governance may ultimately lead to more satisfied shareholders, it is costly. For example, the organizational experience with the Sarbanes–Oxley Act (2002) has shown that (a) governmental regulations may become necessary because opportunistic managers are exceedingly reluctant to implement voluntary governance reforms that benefit shareholders and (b) the costs of good corporate governance can be very high (Chhaochharia and Grinstein 2007; Zhang 2007). Other evidence indicates that these costs are unlikely to be counterbalanced by improved organizational performance. More specifically, meta-analytic evidence indicates that shareholder-oriented initiatives that are generally considered good corporate governance are unlikely to increase stock prices or internal efficiency (Dalton et al. 1998). In this context, we can assume that if large net economic benefits existed for shareholder-oriented CSP (which is focused on practices of good governance), market signals rather than government regulation would already have led to more substantial governance reforms. However, the lack of financial impact of many well-known attributes of “good” corporate governance (Dalton et al. 1998, 2003), combined with the anticipation of high managerial costs associated with good governance (e.g., transfer of organizational funds and power from managers to owners), leads to the expectation of macrolevel (i.e., NBS or country-level) forces being the primary driver of this type of CSP.

H2

For shareholder initiatives, most of the variance in CSP is attributable to the macrolevel (i.e., country and/or NBS factors).

A Multistakeholder Perspective: Accounting for the Stakeholder-Centric Logic of CSP

A key theoretical and empirical motivator of this study is a comparison of variance decomposition models of CSP centered on a generic and broad responsibility toward society (see, e.g., Höllerer 2013) to other models that reflect a stakeholder-centric logic embedded in theorizing by Freeman (e.g., Freeman 1984; Freeman et al. 2007), Aguinis and Glavas (2013), Barnett (2007), Jones (1995), Mitchell et al. (1997), and many other scholars. Clarkson (1995) may have captured our reasoning regarding CSP best when he suggested:

Performance is what counts. Performance can be measured and evaluated. Whether a corporation and its management are motivated by enlightened self-interest, common sense, or high standards of ethical behavior cannot be determined by the empirical methodologies available today. (Clarkson 1995, p. 105)

Furthermore, evaluations of CSP may be based on proxies of stakeholder satisfaction because direct, valid measures are very difficult and expensive to obtain (Orlitzky and Swanson 2012). Practically, such a focus on primary stakeholders is necessary because each firm faces its own unique set of nonmarket challenges (Clarkson 1995). Empirically, this focus is necessary because overall ratings of CSP often do not seem to pass the most basic measurement hurdle of forming a coherent or robust construct (Orlitzky 2013), not even within the same organization (Strike et al. 2006). Although the construct validity of aggregate measures of CSP seems questionable (e.g., Chatterji and Levine 2006; Entine 2003; Porter and Kramer 2006), researchers have continued to use them (e.g., Ioannou and Serafeim 2012; Surroca et al. 2010). A more charitable interpretation of the lack of a coherent CSP aggregate is the observation that there is no equivalent aggregate of corporate financial performance, either. In fact, the evidence suggests that different dimensions of financial performance are in tension with each other (Meyer and Gupta 1994), which is one of the reasons why, for example, return on assets, return on equity, market share, or Tobin’s q are not aggregated to capture a corporation’s financial performance in one overall number. Based on this reasoning and previous empirical research (Mattingly and Berman 2006), we expect large differences between variance decompositions for the stakeholder dimensions and those for the aggregate scores of CSP.

H3

The proportions of variance accounted for in decomposition models that examine specific stakeholder dimensions of CSP are expected to differ significantly from those shown in models of aggregate CSP.

A Multimethod Perspective: Three Different Approaches to Variance Decomposition

Three different analytic techniques of variance decomposition have been vigorously debated in the field of strategic management (see, e.g., Brush and Bromiley 1997; Crossland and Hambrick 2007; Hough 2006; McGahan and Porter 2005; Misangyi et al. 2006; Ruefli and Wiggins 2003): analysis of variance (ANOVA), variance components analysis (VCA), and hierarchical linear modeling (HLM). To estimate the sources of the interfirm variability in CSP, we will compare the results of all three methods, which are depicted in Fig. 1 as horizontal arrows crossing the different levels of analysis. Because the analytic approaches of ANOVA, VCA, and HLM rely on very different statistical assumptions and estimation techniques, we do not expect the estimates of interfirm CSP variance explained to converge across the three methods. Because this expectation rests on statistical theory rather than on organization theory or a theory of business ethics, we frame it as the following methodological proposition: Estimates of the variability in CSP accounted for by the different levels and the different stakeholders are significantly different across the three different methodological approaches of ANOVA, VCA, and HLM.

Method

Sample

Our sample of 2060 corporations (with an average firm size of 91,716 full-time employees) was drawn from the database compiled by Sustainable Investment Research International, or SiRi (now known as Sustainalytics), which is one of the world’s largest firms specializing in research on CSP. At the time of the study, the SiRi dataset was compiled by a network of social rating agencies comprising ten independent research institutions such as KLD, coordinated from the SiRi headquarters in the Netherlands and Canada. SiRi aimed to assess the CSP of all the largest stock-listed companies worldwide, aggregating information from various sources, such as company documents and interviews, media reports, trade unions, NGOs, and other contacts with stakeholders and managers. Similar samples drawn from the SiRi database have also been used in research published in other prestigious academic journals (e.g., Ioannou and Serafeim 2012; Surroca et al. 2010).

Our study covered the 5-year period of 2003–2007. In other words, we obtained 10,300 firm-year observations in total from large public companies headquartered in 21 different countries: Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Ireland, Italy, Japan, Netherlands, Norway, Portugal, South Korea, Spain, Sweden, Switzerland, United Kingdom, and USA. Table 5 in the Appendix presents descriptive statistics about the number of firms and their size (in terms of full-time employees) within each country as well as NBS cluster of countries. Our sample is highly representative of the population to which we aim to generalize our findings, namely the set of very large, public multinational companies attracting media attention for their financial, social, and environmental performance.

Dependent Variables

We relied on the SiRi database to measure CSP, our dependent variable. SiRi assigns a rating between 0 and 100, which represents the extent of each firm’s overall, aggregate CSP with respect to customers, local communities, shareholders (i.e., extent of responsible corporate governance), suppliers, the natural environment, and employees, respectively. For the empirical comparisons necessitated by our research question, we also used the overall CSP score for each firm (see also Surroca et al. 2010 for more details and the suitability of this dataset more generally). Table 1 presents the means and standard deviations for each stakeholder dimension of CSP. SiRi offers a truly international cross-industry dataset capturing companies’ CSP with satisfactory measurement properties. Table 6 in the Appendix provides the main measurement components for each of the six stakeholder dimensions.

Table 1 Descriptive statistics and correlations

Sources of Variation

National Business Systems (NBS)

To code the NBS where our sample companies are headquartered, we relied on the five institutional country clusters of NBS identified by Amable (2003). Amable’s varieties of capitalism framework is grounded in institutional economics and political science and supported by considerable empirical evidence (Jackson and Deeg 2008; Morgan 2007; Tempel and Walgenbach 2007). Amable’s (2003) extensive and detailed statistical analysis of a large set of macroeconomic indicators supported the following five clusters of NBS: Market-Based Capitalism, Coordinated Market Economies, Social-Democratic Economies, Mixed Market Economies (i.e., Amable’s “Mediterranean Varieties of Capitalism”), and Asian Collectivist Economies. This clustering of countries has been shown to be robust for Organization for Economic Cooperation and Development (OECD) countries (Amable 2003, pp. 171–181). Amable’s varieties of capitalism were particularly suitable for this study for three main reasons. First, all companies in our sample were headquartered in OECD countries and could therefore be categorized according to Amable’s framework without any need for additional assumptions about institutional classification. Second, Amable’s model is the most applicable macroinstitutional typology of NBS because it most closely corresponds to the timeframe of this study: Amable’s (2003) data analysis ended in 2002, and our CSP dataset covers the years 2003–2007. Finally, Amable (2003, pp. 181–213) established the predictive validity of his NBS framework by examining the empirical impact of his NBS clusters on other theoretically related variables, such as partisan politics and specialization of scientific, technological, and industrial activity (Amable 2003, pp. 181–209). After a thorough review of the NBS literature, we concluded that Amable’s varieties of capitalism model represented a rigorous and empirically validated typology of NBS clusters (see also Jackson and Deeg 2008; Morgan 2007).

Industry Sectors

The ten broad sectors of the Global Industry Classification Standard (GICS) were used in our coding of industry: energy, materials, industrials, consumer discretionary, consumer staples, health care, financials, information technology, telecommunication services, and utilities. Statistically significant Chi-square statistics summarizing the cross-tabulation of companies by industry and NBS suggested the importance of including industry sectors in our analytic models.

Analytic Models

In general, our analytic model can be formally expressed as

$$\eta_{ijkt} = \mu + \alpha_{i} + \beta_{j} + \gamma_{k} + \delta_{t} + \varepsilon_{ijkt},$$

where $\eta_{ijkt}$ is CSP in NBS $i$, in industry $j$, in firm $k$, during year $t$. The dependent variable $\eta_{ijkt}$ is a linear combination of the grand mean $\mu$, an NBS effect ($\alpha_{i}$), an industry effect ($\beta_{j}$), a firm effect ($\gamma_{k}$), a year effect ($\delta_{t}$), and an error term ($\varepsilon_{ijkt}$). To estimate the sources of variance in CSP, three different analytic techniques were used: analysis of variance (ANOVA), variance components analysis (VCA), and hierarchical linear modeling (HLM). Each of these data analysis tools comes with its own set of advantages and disadvantages, as briefly discussed below.
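To make the additive structure of this model concrete, the following minimal sketch computes the population variance shares implied by a set of component standard deviations and draws firm-year scores from the model. It is an illustration only and not part of our analyses: the standard deviations, the sample dimensions, the balanced panel, and the function names `variance_shares` and `simulate_csp` are all hypothetical assumptions, and the shares are additive only if the components are mutually independent.

```python
import numpy as np

def variance_shares(sd):
    """Population share of total variance for each component.

    sd maps a component name to an assumed standard deviation. The shares
    add up like this only when the components are mutually independent.
    """
    var = {name: s ** 2 for name, s in sd.items()}
    total = sum(var.values())
    return {name: v / total for name, v in var.items()}

def simulate_csp(sd, n_nbs=5, n_ind=10, n_firms=400, n_years=5, mu=50.0, seed=0):
    """Draw scores from eta = mu + alpha_i + beta_j + gamma_k + delta_t + eps."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, sd["nbs"], n_nbs)        # NBS effects
    beta = rng.normal(0.0, sd["industry"], n_ind)    # industry effects
    gamma = rng.normal(0.0, sd["firm"], n_firms)     # firm effects
    delta = rng.normal(0.0, sd["year"], n_years)     # year effects
    # Each firm belongs to exactly one NBS and one industry (nesting).
    firm_nbs = rng.integers(0, n_nbs, n_firms)
    firm_ind = rng.integers(0, n_ind, n_firms)
    firm = np.repeat(np.arange(n_firms), n_years)    # one row per firm-year
    year = np.tile(np.arange(n_years), n_firms)
    eps = rng.normal(0.0, sd["error"], firm.size)
    eta = (mu + alpha[firm_nbs[firm]] + beta[firm_ind[firm]]
           + gamma[firm] + delta[year] + eps)
    return eta, firm, year, firm_nbs[firm], firm_ind[firm]

# Hypothetical standard deviations, chosen only for illustration.
SD = {"nbs": 2.0, "industry": 3.0, "firm": 8.0, "year": 1.0, "error": 4.0}
shares = variance_shares(SD)
eta, firm, year, nbs, industry = simulate_csp(SD)
```

With only a handful of NBS clusters and years, the shares recovered from any single simulated sample can differ noticeably from these population values, which is one reason why the choice of estimation technique matters.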

ANOVA

The first analytic technique used in our study was simultaneous ANOVA, which relies on an ordinary least squares (OLS) algorithm. Several strategy researchers consider this technique to be superior to sequential ANOVA in the context of components-of-variance analysis (e.g., Crossland and Hambrick 2007; McGahan and Porter 2002). Our OLS calculation of variance components in the unbalanced dataset (see Footnote 5; see also Searle et al. 1992) followed Marchenko’s (2006) specific methodological advice for Stata™ software analyses. In our analyses, firms were conceptualized as nested within NBS. Effects were entered in the following sequence: year, industry, NBS, and firm.

VCA

VCA addresses the major weakness of ANOVA, namely that ANOVA results are affected by the entry order of categories (Bowman and Helfat 2001; Brush et al. 1999), a weakness that applies even to simultaneous ANOVA (Crossland and Hambrick 2007, p. 780). The statistical assumptions behind random-effects VCA are described in further detail in Searle et al. (1992) and Rabe-Hesketh and Skrondal (2012). In all random-effects VCA, an important precondition for computability is that the underlying probability distribution of the data is assumed to be normal. The two methods of VCA estimation are maximum likelihood estimation (MLE) and restricted maximum likelihood estimation (REMLE). The difference is that MLE provides estimates of fixed effects, whereas REMLE does not. In other words, REMLE overcomes the weakness of MLE, which disregards the degrees of freedom used for estimating fixed effects and therefore yields variance estimates that are neither minimum variance nor unbiased (in contrast to ANOVA estimates). For unbalanced data (like ours), statistical experts consider both MLE and REMLE VCA to be superior to the ANOVA method (Searle et al. 1992, p. 254). However, because of the tendency of VCA to produce unstable results (Brush and Bromiley 1997), many strategy researchers still prefer ANOVA to VCA (Misangyi et al. 2006, p. 573). In line with our methodological proposition, we decided to report both.
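
The degrees-of-freedom point can be illustrated with the simplest possible case. The sketch below is not the estimation used in this study; it assumes an intercept-only model (one fixed effect, the mean) with independent normal errors, for which the maximum likelihood and restricted maximum likelihood estimates of the residual variance have well-known closed forms. The function name and the scores are hypothetical.

```python
from statistics import mean


def ml_and_reml_variance(values):
    """Return (ML, REML) residual-variance estimates for y_i = mu + e_i."""
    n = len(values)
    centre = mean(values)
    ss = sum((y - centre) ** 2 for y in values)
    ml = ss / n          # ML ignores the one df spent on estimating mu
    reml = ss / (n - 1)  # REML adjusts for that df and is unbiased here
    return ml, reml


# Hypothetical CSP scores, for illustration only
scores = [52.0, 47.5, 61.0, 58.5, 49.0]
ml, reml = ml_and_reml_variance(scores)
print(f"ML: {ml:.3f}  REML: {reml:.3f}")
```

With more fixed effects the same logic applies: the ML estimate is too small by a factor that depends on how many degrees of freedom the fixed effects consume, and the gap shrinks as the sample grows.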

HLM

Most recently, strategy researchers argued that hierarchical linear modeling (HLM), also known as multilevel modeling, was the best method for examining multilevel effects (Hough 2006; Misangyi et al. 2006). Specifically, HLM is generally considered superior to VCA and ANOVA because HLM (a) permits complex error structures and can thus model dependence between levels of analysis, (b) has greater statistical power than the other two methods, and (c) addresses the problem of collinearity between corporations and industries (Hough 2006; see Footnote 6). In other words, HLM specifies within-unit factors most accurately in longitudinal datasets (Rabe-Hesketh and Skrondal 2012). HLM, however, is not the only method we reported because, first, ANOVA and VCA have a much longer methodological tradition than HLM in the business literature and, second and most important, our methodological proposition aims to compare the results of our multimethod calculations. This helps us determine whether the findings across the three different techniques are commensurate. In sum, we believe that a descriptive study like ours, focused on estimating the proportion of CSP variance accounted for by the various levels of influence, can benefit from this methodological pluralism.

To test our hypotheses, we adapted Crossland and Hambrick’s (2007) conservative significance testing procedure (see also Bobko 2001). First, the partial R² values were converted into partial r correlation coefficients (by taking the square root). Then, the r values were compared via Steiger’s (1980) Z, which is a simplified version of Dunn and Clark’s (1969, 1971) test proposed for overlapping samples (Kleinbaum et al. 1988). We concluded that H1 or H2 was supported if the relevant difference between the hypothesized dominant (or highest) and second-highest category in each variance decomposition model was statistically significant at an alpha probability (p) level of 0.01 or lower. H3 and the methodological proposition involved multiple comparisons between parts of our correlation matrices or entire matrices and, thus, were assessed via overall χ² tests of difference or fit. For the assessment of H3 and the methodological proposition, we followed the statistical procedures described by Steiger (1980).
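
The two-step comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: it uses the form in which Steiger's (1980) Z for two dependent correlations sharing one variable is commonly stated (Fisher z-transformed correlations with a covariance term based on their average). The partial R² values, the correlation `r_kh` between the two predictors, and the sample size `n` are hypothetical, since the text does not report them.

```python
import math


def steiger_z(r_jk, r_jh, r_kh, n):
    """Z statistic for H0: rho_jk == rho_jh, both correlations sharing variable j."""
    rbar = (r_jk + r_jh) / 2.0
    psi = r_kh * (1 - 2 * rbar**2) - 0.5 * rbar**2 * (1 - 2 * rbar**2 - r_kh**2)
    c = psi / (1 - rbar**2) ** 2  # estimated covariance of the two Fisher z values
    diff = math.atanh(r_jk) - math.atanh(r_jh)  # Fisher z-transformed difference
    return diff * math.sqrt(n - 3) / math.sqrt(2 - 2 * c)


def two_sided_p(z):
    """Two-sided p value under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical inputs: partial R^2 of the dominant and second-highest levels
r_firm = math.sqrt(0.60)    # step 1: partial R^2 -> partial r
r_second = math.sqrt(0.20)
z = steiger_z(r_firm, r_second, r_kh=0.10, n=300)
p = two_sided_p(z)
print(f"Z = {z:.2f}, significant at 0.01: {p < 0.01}")
```

Under the decision rule in the text, a hypothesis about the dominant level would count as supported only when this p value is 0.01 or lower.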

Results

Table 1 reports means, standard deviations, and bivariate correlations for the year 2005 (unless stated otherwise), the midpoint of our study timeframe. We estimated the reliability of the dependent variables by calculating the internal consistency (i.e., Cronbach’s alpha) of the SiRi measurement items used for each CSP stakeholder dimension. The items per dimension ranged from 16 (for customer initiatives) to 37 items (for employee CSP). The calculated alpha reliability estimates of 0.77, 0.81, 0.83, 0.85, 0.76, and 0.86 for customer, community, shareholder, employee, supplier, and environmental CSP, respectively, were satisfactory. In addition, the statistically significant positive correlations between the six stakeholder dimensions can, at a minimum, be interpreted as generally satisfactory coefficients of generalizability (Orlitzky and Benjamin 2001; Traub 1994). Consistent with other studies (Chatterji et al. 2009; Sharfman 1996), the significantly positive correlations in the lower right-hand corner of Table 1 can be interpreted as indicative of the concurrent validity of the CSP proxies.
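
For readers unfamiliar with the reliability estimate used here, the standard formula for Cronbach's alpha can be sketched in a few lines. The item scores below are hypothetical and far smaller than the 16 to 37 SiRi items per dimension; the function name is ours, not the authors'.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha; `items` holds one list of scores per item,
    each list covering the same firms in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-firm scale totals
    item_var_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


# Hypothetical ratings of five firms on three items of one CSP dimension
items = [
    [3, 4, 2, 5, 4],
    [2, 4, 3, 5, 5],
    [3, 5, 2, 4, 4],
]
alpha = cronbach_alpha(items)
print(f"alpha = {alpha:.2f}")
```

Alpha rises when items covary strongly relative to their individual variances, which is why it is read as a measure of internal consistency.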

In the reporting of results, we will first discuss the descriptive findings with respect to our overarching research question, which focused on effect size magnitudes (i.e., proportion of variance explained by multiple levels of influence). Then, we will summarize the results of our hypothesis tests.

Analysis of Variance (ANOVA) and Variance Components Analysis (VCA)

Table 2, which presents the ANOVA and VCA results, indicates that the firm level explained between 36 and 75 % of the variance in our SiRi dataset. Particularly for overall CSP, the firm level tended to explain a very large proportion of variance, if not the largest variance, of all dependent variables considered in this study. In general, NBS and industry membership explained a much smaller proportion of variance in the dependent variables—with only one exception to this general rule: NBS were clearly the second most important determinant of CSP protecting shareholder interests (i.e., good corporate governance), explaining between 21 and 27 % of variance, depending on the specific statistical method used. In contrast, for socially responsible supply-chain initiatives, year-to-year changes appeared to be the second most important antecedent, explaining between 22 and 30 % of variance of supplier-focused CSP. Other than that, annual changes (between 2003 and 2007) accounted for only a negligible fraction of CSP variance.

Table 2 ANOVA and variance components analysis

To check the robustness of these conclusions about the predominant attribution of CSP variation to between-firm differences, we repeated our calculations for each firm headquarters’ country location (instead of NBS), leaving all other data points unchanged. Table 3 shows that overall the firm level remained the predominant factor—with one exception: when either MLE VCA or REMLE VCA was applied, country became the most important antecedent of shareholder-focused CSP (i.e., scores reflecting good corporate governance), explaining 41–43 % of its variance. The comparison of Tables 2 and 3 indicates that country generally explained more variance than NBS. Country location of firm headquarters also seems to have been quite important for employee-focused CSP, explaining about 22 % of its variance in the context of MLE or REMLE VCA. Industry membership and temporal change were found to be relatively unimportant—with supplier-oriented CSP the only exception again (for year-to-year changes).

Table 3 ANOVA and variance components analysis (robustness check: countries rather than national business systems)

Hierarchical Linear Modeling (HLM)

In our HLM specification, we explicitly modeled the interaction, or covariance, between firms and industries. That is, this covariance across two different levels of analysis was not assumed to be zero as in VCA. Many researchers regard REMLE as superior to MLE because, for balanced data, REMLE produces solutions identical to ANOVA, which has optimal minimum variance properties (Searle et al. 1992, p. 255). Therefore, Table 4 only presents the REMLE results (see Footnote 7). Our REMLE HLM findings suggest that, again, variance in CSP (across the six stakeholder dimensions) is mostly attributable to between-firm differences rather than NBS or industry variation. In general, the firm effects exceeded NBS effects and industry effects by factors of 2:1–67:1, and 3:1–47:1, respectively. Even larger than these differences in the six stakeholder dimensions were the differences between firm-level effects and NBS effects, and between firm-level effects and industry effects, for overall CSP. Substituting country for NBS effects reduced the proportion of CSP variance explained by firm effects by 13 % on average. Country effects, on the other hand, accounted for 7–39 % of variance in the dependent variables. In fact, the latter percentage exceeded variance attributed to firm effects by 9 percentage points in the case of shareholder-focused CSP (i.e., 39 vs. 30 % for variance in good corporate governance as rated by SiRi). Country effects, which also seemed to be quite important for employee-focused CSP (21 %), were consistently more important than industry effects. Country-level effects also were generally more important than temporal effects (with the exception of supplier-oriented and customer-focused CSP). In the HLM, firm effects generally exceeded country effects by factors of 2:1–7:1, with shareholder-focused CSP as the only exception, as already noted in the interpretation of the VCA results.

Table 4 Hierarchical linear model (HLM) using restricted maximum likelihood estimation (REMLE)

Hypothesis Tests

Out of 56 possible tests of H1 (14 tests for ANOVA results plus 28 tests for VCA plus 14 tests for HLM), 53 came out statistically significant at probability level p < 0.01, consistently supporting the hypothesis that most of the variance in CSP overall and its stakeholder-oriented components is attributable to the firm level. That is, whenever the firm level explained most variance, it explained significantly more variance than the second-highest component, which differed from one stakeholder group to another (see Tables 2, 3, 4). Notable exceptions were both VCAs shown in Table 3 and one of the HLMs shown in Table 4, where the country level of influence, in line with our second hypothesis, accounted for the greatest proportion in shareholder-oriented CSP (at p < 0.01).

Overall, H2, which predicted that, for shareholder initiatives, most of the variance in CSP could be attributed to the country level (national business systems), did not receive the same level of empirical support as H1. Out of eight possible tests (two tests for the ANOVA tables plus four tests for VCA plus two tests for HLM), the country level explained (at p < 0.01) the highest percentage of CSP variability in three models: the two country-level specifications of VCA (see Table 3) and the country-level specification of HLM (see Table 4). H2 was not supported, however, when an NBS specification of each variance decomposition model was used. That is, in the other five specifications, the firm level remained the predominant factor explaining specific stakeholder dimensions of CSP variance (see discussion of H1).

H3 proposed that the proportions of variance accounted for in decomposition models based on a stakeholder logic differed significantly from those shown in models of aggregate CSP. The critical χ² value (α = 0.01) was 13.28 for the ANOVA and VCA comparisons, and 11.34 for the HLM comparisons. All calculated χ² statistics used for the comparisons to test H3 were statistically significant (p < 0.01) for all stakeholder groups, except for two: the model comparison between environmental CSP and aggregate CSP in the MLE VCA (with country as highest level) had a χ² of only 12.33; and the model comparison between community-oriented CSP and aggregate CSP in the HLM (with NBS as the highest level) had a χ² of only 9.97. These findings imply that the disaggregation of CSP is important for all stakeholder groups except for the natural environment and local communities (at least sometimes). In other words, for most stakeholder groups, capturing CSP only at the aggregate level will not allow for a proper decomposition of its variance.

Finally, the methodological comparisons of matrices implied by the final, methodological proposition all came out statistically significant. The smallest χ² was 179.78, for the comparison between the country-based REMLE VCA and the corresponding country-based HLM, whereas the critical χ² value for this comparison was only 34.81 (α = 0.01; df = 18). All other calculated χ² statistics for all other comparisons of matrices (after converting r into z scores) exceeded their critical thresholds as well. This means that the method used for decomposing variance of a focal variable does make a substantive difference; the methods of ANOVA versus VCA versus HLM are not interchangeable. This finding is consistent with the conclusions of the methods literature on variance decomposition in strategic management. However, to the best of our knowledge, this methodological difference has never been demonstrated empirically or presented in as much statistical detail as in this study.
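
The critical values quoted in this section can be reproduced with a short standard-library sketch. The df = 18 case is stated in the text; the degrees of freedom behind 13.28 and 11.34 are not stated, and df = 4 and df = 3 are assumed here only because they reproduce those thresholds at α = 0.01. The function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2-1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{j=1}^{(df-1)/2} (x/2)^(j-1/2) / Gamma(j+1/2)
    term = math.sqrt(half) / math.gamma(1.5)
    total = 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= half / (j - 0.5)
        total += term
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def chi2_critical(alpha, df):
    """Critical value c with P(X > c) = alpha, found by bisection."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


for df in (3, 4, 18):  # df = 3 and 4 are assumptions; df = 18 is from the text
    print(df, round(chi2_critical(0.01, df), 2))
```

A calculated statistic is significant at the 0.01 level when it exceeds the corresponding critical value, which is the comparison applied to 179.78 against the df = 18 threshold.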

Discussion

To the extent that the SiRi measures reflect CSP accurately, this study suggests that firm-level drivers account for the greatest proportion of not only CSP overall, but also most of its stakeholder-specific dimensions. Our findings indicate that firm-level factors are especially important for CSP targeted at local communities, the natural environment, and employees, whereas broader institutional influences (country-level effects or NBS) sometimes seem to be more important for shareholder-oriented CSP. The three multilevel analyses show that, in support of H1, the firm level accounts for the largest proportion of variance in CSP across five of the six stakeholder dimensions. More specifically, in the ANOVA and VCA results, the firm level consistently seems to have the largest impact on social and environmental initiatives, explaining between 36 and 75 % of variance of CSP stakeholder dimensions. These firm effects seem to be particularly large for overall CSP and for CSP targeted at local communities, the natural environment, and employees. The relatively large influence of firm-level factors suggests that, given particular corporate attributes, firms are able to choose particular CSP initiatives proactively and strategically. Instead of higher-level environmental forces structuring corporate decisions, there seems to be considerable leeway for economic agency independent of these higher-level factors (Child 1997; Oliver 1991); otherwise, the firm level would explain only a much smaller proportion of the CSP variance. These results are also consistent with arguments in favor of the alignment of strategy with CSP (Orlitzky et al. 2011; Porter and Kramer 2011).

Interestingly, the one exception to this general finding about the importance of firm-level effects is shareholder-focused CSP, in which country effects predominated in the VCA and HLM analyses. Our findings suggest that good (or poor) corporate governance is systemic, much more so than the other stakeholder dimensions of CSP. The national-level drivers of shareholder-focused CSP (“good corporate governance”) clearly warrant further theorizing and empirical study. One possible interpretation of these findings is that when a country has strong institutions in place to protect a particular stakeholder group, firm-level agency (for better or worse) will become more limited (see Footnote 8). Furthermore, our findings also underscore the appropriateness of the decision made by many researchers to exclude shareholder-oriented CSP from overall CSP deliberately (e.g., Chin et al. 2013; Di Giuli and Kostovetsky 2014; Hillman and Keim 2001; Surroca et al. 2010). From the viewpoint of stakeholder theory, this common practice of omitting analyses of shareholder concerns is puzzling (Freeman 1984, 1994; Mackey 2005); however, based on the findings of this study, it seems empirically justified after all because the antecedents of shareholder-oriented CSP seem to come from very different levels than those of all other CSP dimensions.

Furthermore, the findings allude to the persistence of differences in nation-states and/or business systems—despite the forces of globalization (Gond et al. 2011; Matten and Moon 2008). If there were a global convergence toward an Anglo-American model of shareholder capitalism, for instance, countries’ lack of variability would be reflected in small NBS or country effects. However, because NBS or country effects explained up to 43 % of the variability in CSP (in VCA and HLM), this study provides some evidence supporting the view that NBS still diverge. Only with respect to CSP targeted at suppliers, local communities, and employees did NBS seem to explain only a negligible amount of variance in CSP. As a final observation about the macro (institutional) level, the fact that country effects were generally found to be larger than NBS effects indicates that, contrary to the arguments by Amable (2003), clusters of institutional similarities may be a level of conceptual abstraction that is indeed a bit too high.

Overall, a firm’s main industry sector seems to have only a minor effect on CSP across the six stakeholder dimensions. Industry effects were largest for environmental CSP (varying between 5 and 13 %, depending on the analytic technique used) and CSP shown toward local communities (varying between 5 and 10 %). In contrast, industry differences seem to matter the least for shareholder-focused CSP and customer-focused CSP. This suggests that industry self-regulation is not observed to be a driver of changes in governance- or customer-related aspects of CSP (see also King and Lenox 2000, on the (in)effectiveness of industry self-regulation in the environmental arena). Generally, industry differences do not seem a major explanation of the variability in CSP and most of its stakeholder dimensions.

The fact that, for different aspects of CSP, we find at least some variation in the relative importance of NBS, country, industry, and firm effects indicates the usefulness of disaggregating CSP into its various stakeholder group dimensions. The findings summarized in Tables 2, 3, and 4 consistently show that differences in sources of CSP would be overlooked if CSP were only calculated at an aggregate level, rather than for each stakeholder dimension separately. In support of theorizing by Clarkson (1995) and our introduction of H3, CSP may best be discussed as corporate stakeholder performance. In other words, different corporate stakeholder responsibilities seem to require different explanations at different levels of analysis and should therefore be theorized as distinct outcomes. In this context, it should also be noted that the reliability of overall CSP (0.73) was lower than the reliability coefficients of the stakeholder components of CSP. Lower reliability coefficients are synonymous with larger measurement errors and more noise (Orlitzky 2013; Orlitzky and Swanson 2012). Thus, the breakdown of CSP into its stakeholder components may also be beneficial for the interpretation of empirical results. This result is aligned with psychometric findings about employees’ perceptions of CSR, showing the relevance of a multistakeholder perspective to CSR assessment (El-Akremi et al. 2015; Turker 2009). Our study suggests the construct of corporate stakeholder performance may be a helpful complement to CSP.

Practical Implications for Managers and Policy-Makers

For effective organizational and public policy decisions about CSP, managers need to know the main sources of interfirm differences in CSP. The results of this study imply that by far the most important source of corporate differences in CSP lies in firm-specific assets, resources, and mindsets rather than in headquarters location, membership in an industry, or year-to-year adjustments to external pressures. Put differently, firms within a given industry differ from one another a great deal more than industries or countries do in terms of CSP. So, in some ways, our findings challenge the importance of institutional and industry-level drivers of CSP that has been highlighted in previous theory and empirical research (e.g., Campbell 2006, 2007; Ioannou and Serafeim 2012). Viewed from an empirical decomposition-of-variance perspective, firm-level microfactors should be emphasized more in managerial decision-making than these meso- and macrolevel influences.

In a practical context, estimating the relative importance of these drivers can provide guidance on the levels of analysis that may be most instrumental for making effective (including strategic) decisions about CSP. Because industry-level effects are generally small, managers making decisions about CSP initiatives, such as CSR or sustainability directors, may spend their time more productively on creating highly integrated, firm-specific configurations of CSP activities rather than mimicking broad industry standards or trends in CSP (see also Orlitzky 2013). The same reasoning applies to emulating countrywide trends in CSP because our findings suggest (with the caveats mentioned in the next section) that these nation-state forces, in general, do not seem to be the main levers for improving CSP. The one exception that public policy-makers ought to keep in mind is the greater importance of country effects for corporate governance than for any other dimension of CSP. Thus, our findings allude to the possibility that government may ultimately be the most appropriate regulator of shareholder-related CSP, prioritizing the firm owners’ interests over managerial self-interest and entrenchment.

Limitations and Future Research

Like all studies, this one has several weaknesses, which could be addressed in future research. First, the findings are limited to our particular dataset (SiRi), whose usefulness and measurement advantages, however, have also been highlighted in a study by Surroca et al. (2010). In the future, researchers could analyze other CSP datasets to validate our findings. For example, cross-validation is required with a sample of smaller, privately held firms because, as Table 1 shows, the average organizational size in our SiRi sample was large (approximately 92,000 employees per firm). More important, though, may be the possibility that the type of CSP measure used in this study, even when decomposed into its stakeholder components, is affected by significant biases, conflicts of interest, or validity concerns (e.g., Carroll 2000; Chelli and Gendron 2013; Graafland et al. 2004; Igalens and Gond 2005; Liston-Heyes and Ceton 2009; Orlitzky 2013), so that alternatives (see, e.g., Chen and Delmas 2011; Orlitzky and Swanson 2012; Turker 2009) should be explored in the future (see Footnote 9).

Furthermore, it is important to remember that, in general, studies like this one are unable to answer questions about the ultimate drivers of CSP. Future studies could, for example, measure specific CEO or top management team characteristics and use several different lagged designs (with greater attention to intertemporal effects, i.e., year-to-year variation δ in the analytic model) to clarify causal effects. Although HLM is able to examine causal relationships (Hough 2006), ANOVA and VCA are purely descriptive (McGahan and Porter 2005; Rumelt 1991). In terms of causality, many higher-level effects are very likely to be driven by managerial actions (McGahan and Porter 2005, pp. 875–876; Ruefli and Wiggins 2003, pp. 864–865). Conversely, firm effects cannot unambiguously be attributed to managerial actions.

Finally, future HLM research could transcend the limitation of our study to categorical effects (dummy variable coding) and investigate the impact of specific continuous variables on CSP. Such a focus on continuous variables could determine, across different levels of analysis, what specific variables cause firms to increase or decrease their CSP. Other researchers (e.g., Ioannou and Serafeim 2012) have already taken an important step in this direction, albeit not from the perspective of corporate stakeholder performance. As indicated by our own supplementary analyses (see Footnote 10), the most important antecedents of CSP at the firm level may include companies’ international scope, firm size, and intangible assets.

Conclusion

This study adopted a multilevel, multistakeholder, and multimethod approach to examine and unpack the relative influence of CSP drivers at different levels of analysis. More specifically, it applied to the CSP arena three statistical modeling techniques, which have been widely used in the strategic management literature to analyze the sources of variability in financial performance. Comparing and contrasting the empirical importance of (a) institutional/country-level (macro) effects, (b) industry (meso) effects, (c) firm-level (micro) effects, and (d) time effects by drawing on three distinct statistical techniques, we generally establish the primacy of firm-level factors. Macrolevel factors seem to exert a primary influence on only one of our investigated CSP dimensions, namely shareholder-oriented CSP—and even then only in the context of one specific technique of variance decomposition analysis. In addition, we showed that the relative impact of the determinants of CSP varied greatly depending on the stakeholder initiative considered. Specifically, firm-level drivers seemed to be the most important determinants of CSP for local communities, the natural environment, and employees as well as a firm’s overall CSP. Thus, future cross-cultural research of CSP ought to examine not only aggregate CSP, but also distinct stakeholder groups. Overall, the findings of our study can be interpreted as preliminary evidence that the choice of an organization’s most effective level of CSP is highly firm-specific, stakeholder-specific, and probably closely intertwined with the firm’s strategy, identity, and culture.