1 Introduction

Mirroring the recognition of important issues such as working conditions, human rights and the environment, corporate social responsibility (CSR) has developed considerably in recent years, both in the academic literature (e.g., Margolis & Walsh, 2003; McWilliams & Siegel, 2001) and in the business world (Ioannou & Serafeim, 2012). Corporate social performance (CSP) refers to a “business organization's configuration of principles of social responsibility, processes of social responsiveness, and policies, programs and observable outcomes as they relate to the company's societal relationships” (Sahut et al. 2016). Accordingly, CSP represents a firm’s performance in CSR matters (Hill et al., 2007; McWilliams & Siegel, 2001). In the economic approach, corporate governance (CG) places particular emphasis on maximizing a company’s financial performance (Shleifer & Vishny, 1997). However, in a broader view, CG can be seen as “the design of institutions which induce or force management to internalize the welfare of stakeholders” (Tirole, 2001, p. 4). The literature has mainly been interested in the link between CSR and a firm’s financial performance (McWilliams & Siegel, 2000; Rowley & Berman, 2000). The main conclusion is that the relationship is at best uncertain (e.g., Peloza, 2009). Understanding why some firms have a higher CSP therefore remains a topical and important issue (Cruz et al., 2019).

CSR and CG are two intertwined concepts in that: (1) CG is a pillar of CSR; (2) CSR is a dimension of CG; and (3) CG and CSR are part of a continuum (Jamali et al., 2008). Accordingly, beyond the two traditional functions of monitoring and provision of resources (cf. Hillman & Dalziel, 2003), a third function is assigned to the board of directors (BoD): increasing the firm’s sustainable behavior (Hill & Jones, 1992). Consequently, the BoD is accountable for a firm’s CSR policy (via its choices and decision-making) and ultimately for any firm performance (FP) outcomes (Rao & Tilt, 2016; Shaukat et al., 2016). How to enhance CSP is therefore a recurring question (Cruz et al., 2019; Sahut et al., 2019). Appointing women on corporate boards (WOCB) is one of the solutions considered in the literature (Byron & Post, 2016). The underlying assumption is that the experience and values of female directors are likely to influence a company’s CSR policy, reputation and, ultimately, FP and CSP (Adams et al., 2015; Cook & Glass, 2018).

Although there is an abundance of literature (both theoretical and empirical) on how WOCB influence numerous areas of an organization, which in turn has an impact on firm performance (FP) (Kirsch, 2018; Terjesen et al., 2009), it must be noted that only a limited number of studies have examined the relationship between WOCB and non-financial performance and, in particular, CSP (Rao & Tilt, 2016). This therefore constitutes an under-researched area (Francoeur et al., 2019).

The existing empirical literature examining the effect of WOCB on CSP has yielded mixed results. Recent works by Dang et al. (2021), Francoeur et al. (2019) and the meta-analyses conducted by Rao and Tilt (2016) and Byron and Post (2016) document a positive relationship between WOCB and CSP, whereas other studies have found a negative (e.g., Husted & de Sousa-Filho, 2019) or null relationship (e.g., Boulouta, 2013; Manita et al., 2018). These contrasting results create some confusion about the effect of WOCB on CSP.

Some existing studies and meta-analyses fail to consider endogeneity issues in the WOCB–CSP relationship. Because board composition is unlikely to result from exogenous variation, and more likely reflects firm- or self-selection according to firms' operating and contracting environment (Adams & Ferreira, 2007; Coles et al., 2008; Harris & Raviv, 2006), we argue that it is necessary to take endogeneity problems into account when examining the effects of WOCB on financial or non-financial performance (Adams, 2016). Two main sources may bias the effect of board composition on performance (Boulouta, 2013; Francoeur et al., 2019): omitted or unobserved firm characteristics, and reverse causality.

This study makes several contributions to the CSR and WOCB literatures. First, it makes a theoretical contribution. While the current contrasting empirical results can probably be explained by differences in national institutional systems (Grosvold & Brammer, 2011), time windows (Dang et al., 2021), CSP measures or estimation methods (Byron & Post, 2016), we argue that the existing literature does not consider the fact that the WOCB–CSP relationship may not be linear. Theoretically, token and critical mass theories (Kanter, 1977a, 1977b) posit that women’s ability to influence any firm outcome (e.g., job satisfaction, turnover or FP) depends fundamentally on their number, i.e., high (low) female representation is likely (unlikely) to have a positive effect. Likewise, many authors have documented non-linearity in the factors influencing performance, whether financial or social in nature (Kalaitzoglou et al., 2020). Many empirical studies have confirmed this theoretical viewpoint. For example, Torchia et al. (2011) find that a critical mass of WOCB significantly enhances a company’s innovation. Joecks et al. (2013) observe a U-shaped relationship between WOCB and FP. Strydom et al. (2017) confirm the U-shaped relationship between WOCB and earnings quality. Accordingly, this research pursues this line of inquiry within the framework of the WOCB–CSP relationship. Consequently, we contribute to the CSR and WOCB literatures by re-examining the WOCB–CSP relationship through the prism of a non-linear relationship, a perspective not yet significantly supported by the theoretical and empirical literature (Byron & Post, 2016; Rao & Tilt, 2016).

Second, this paper makes an empirical contribution to the CSR and WOCB literatures by using an approach that is novel compared to existing studies: the semiparametric approach. The latter is perceived as being more flexible (Florackis & Ozkan, 2009; Florackis et al., 2015; Hamadi & Heinen, 2015; Trinh et al., 2018) as it does not impose any pre-specified parametric form on the relationship under investigation, enabling non-linearities in the data to be captured more effectively. The semiparametric approach provides a relatively complete overview of the WOCB–CSP relationship. Engle et al. (1986) emphasize that a nonparametric specification is particularly appropriate for cases where the relationship under examination is highly nonlinear, as seems to be the case for the WOCB–CSP relationship. Finally, unlike ordinary least squares (OLS), the semiparametric approach is insensitive to the presence of outliers, thus enabling more robust conclusions regarding the WOCB–CSP relationship. Specifically, following Trinh et al. (2018), this study employs generalized additive models (GAMs), a type of semiparametric regression model (Wood, 2006). GAMs have become very popular in the fields of medicine, biology and ecology (Marra & Radice, 2010) because they extend traditional generalized linear models by allowing the effect of the variable of interest to be non-linear (McCullagh & Nelder, 1989). In a nutshell, the semiparametric approach allows us to capture the possibly complex non-linear relationship between WOCB and CSP, thereby enabling us to suggest another (possibly complementary) line of explanation for the current mixed empirical findings (Florackis et al., 2015; Hamadi & Heinen, 2015). We thus contribute to the CSR and WOCB empirical literatures since, to the best of our knowledge, no study has done this.

Finally, this study adds to the existing literature by specifically addressing endogeneity issues in the WOCB–CSP relationship. A common empirical strategy to deal with endogeneity in panel data models is the instrumental variable (IV) approach (e.g., Wooldridge, 2010). Indeed, this method can yield consistent parameter estimates and can be used in any kind of analysis in which endogeneity is suspected. A large body of empirical research suggests that certain CG structures drive improved performance (Wintoki et al., 2012; Zhou et al., 2014). Endogeneity is likely to be present in the relationship between WOCB and firm outcomes, as confirmed by, for example, Adams et al. (2009) and Đặng et al. (2020). To tackle this issue, consistent with Marra and Radice (2011), we use a two-stage generalized additive model (2SGAM), which is a kind of generalization of two-stage least squares (2SLS). Marra and Radice’s (2011) approach builds on the two-stage procedure first suggested by Hausman (1978, 1983) and uses reliable smoothing approaches available in the GAM literature. Via a simulation study, Marra and Radice (2011) show that 2SGAM can: (a) provide unbiased parameter estimates; (b) handle endogeneity through instrumental variables (e.g., Angrist et al., 1996); and (c) accurately fit simple quadratic relationships or more complex ones (e.g., cubic, quartic, etc.). Consequently, we make an econometric contribution to the WOCB literature by specifically taking into account endogeneity issues through the 2SGAM suggested by Marra and Radice (2011). Unlike the existing literature, the choice of instrumental variable is made on theoretical grounds and is adequately tested for validity and relevance. To the best of our knowledge, no study has used this modus operandi (Kirsch, 2018; Terjesen et al., 2009), in particular for the WOCB–CSP relationship (Byron & Post, 2016; Rao & Tilt, 2016).

The purpose of this paper is twofold. First, the relationship between WOCB and CSP is re-examined for companies ranked between 501 and 1000 in the Fortune 1000 ranking over the period 2004 to 2018. This brings new insight compared to the existing empirical literature. Indeed, this study relies on a sample of 3016 firm-year observations compared to 820 and 1542 for Boulouta (2013) and Francoeur et al. (2019), respectively. Likewise, the study window is longer than that of the studies cited earlier: 2004 to 2018, compared to 1999 to 2003 and 2007 to 2013, respectively. Second, to account for the non-linearities between WOCB and CSP as well as the endogeneity issues associated with this relation, we use a 2SGAM.

The rest of the paper is organized as follows. Section 2 presents the theoretical frameworks and the hypotheses developed. Section 3 outlines the research design. The results and concluding remarks are presented in Sect. 4 and 5, respectively.

2 Theory and hypotheses

2.1 Theoretical frameworks

The foundations of WOCB stem primarily from agency theory (AT) and resource dependence theory (RDT) (Kirsch, 2018; Terjesen et al., 2009). According to Hillman and Dalziel (2003), the BoD contributes to two important functions for organizations: monitoring management on behalf of shareholders and providing resources. These two theoretical frameworks underpin the extent to which board composition influences CSR and, in fine, CSP (Byron & Post, 2016).

When ownership and control are separated, the prime responsibility of the BoD is to monitor managers on behalf of shareholders (Hillman & Dalziel, 2003), as managers may be tempted to pursue their own agenda to the detriment of shareholders (Fama & Jensen, 1983; Jensen & Meckling, 1976). This creates “agency costs” (Jensen & Meckling, 1976). Overall, monitoring by the BoD can reduce these agency costs, which, in turn, improves FP (e.g., Fama, 1980). Because female directors are more likely to be independent directors than their male counterparts (e.g., Kesner, 1988), they tend to increase board independence. As such, WOCB will raise the board’s monitoring effectiveness (Carter et al., 2003, 2010) by being better prepared for meetings (Huse & Solberg, 2006), by raising more questions (Carter et al., 2003; Konrad et al., 2008) or by bringing a fresh perspective to complex issues (Francoeur et al., 2008), which helps to correct informational biases in the corporate strategy (Dewatripont et al., 1999). All this contributes to improving CG and reducing agency costs. Indeed, Adams and Ferreira (2009) find that WOCB significantly increase the rate of directors’ attendance at board meetings and are more likely to fire CEOs responsible for poor financial and economic performance. In summary, from the AT perspective, the higher degree of monitoring ensured by female directors can result in greater board effectiveness, leading to better FP.

Provision of resources to the firm is the second important function of the BoD (Hillman & Dalziel, 2003). The theoretical underpinning of this function is derived from resource dependence theory (Pfeffer & Salancik, 1978), which, in essence, views the BoD as an essential means of linking the firm to its environment and the external resources upon which the organization relies. Pfeffer and Salancik (1978) propose that board linkages may bring four benefits: (1) advice and counsel, (2) legitimacy, (3) communication channels (between external organizations and the firm), and (4) commitments or support from important elements outside the firm. Within this framework, provision of resources is linked to FP (Hillman & Dalziel, 2003) via the reduction of: the organization's vulnerability vis-à-vis external contingencies (Pfeffer & Salancik, 1978); the firm’s uncertainties (Pfeffer, 1972); or transaction costs (Williamson, 1984). As such, female directors bring resources to the board that male directors are unable to provide (Hillman et al., 2007). They bring fresh creativity and new ideas for innovation, which help to solve problems (Robinson & Dechant, 1997). They also bring legitimacy within the organization (e.g., among employees) and outside it, particularly with key stakeholders (e.g., customers or investors; Hillman et al., 2007). In summary, from the RDT perspective, female directors provide critical resources via advice and counsel, legitimacy, and linkages to external entities that ultimately increase FP (Brammer et al., 2007; Hillman et al., 2007).

In a nutshell, depending on the theoretical framework adopted, WOCB can significantly improve FP by reducing agency costs through closer monitoring of management and by providing critical resources such as advice and counsel or legitimacy.

2.2 Hypotheses

In essence, WOCB can significantly influence CSR and CSP in three ways (Kirsch, 2018). First, in line with RDT, female directors tend to have different levels of education and professional backgrounds than male directors, allowing them to consider more fully the issues brought before the board (Dang et al., 2014; Hillman et al., 2002; Singh et al., 2008). As such, the presence of WOCB may increase a firm’s sensitivity to CSP and reporting policies (Bear et al., 2010; Nielsen & Huse, 2010). Second, female directors possess certain psychological traits (e.g., interpersonal sensitivity and concern about others’ welfare; Eagly et al., 2003) that make them better able to heed the claims of stakeholders such as employees and community members (Adams & Funk, 2012). Since female directors are more socially oriented than men and heed the needs of others, they are more likely to promote CSR (Burgess & Tharenou, 2002; Nielsen & Huse, 2010). Finally, in line with AT, gender differences regarding values, risk behaviour or management style may enhance the monitoring of gender-diverse boards (Adams & Ferreira, 2009; Adams & Funk, 2012), including through the appointment of female directors to the BoD or CSR committee (Endrikat et al., 2021).

Taken together or separately, these factors suggest that WOCB are more likely to favour all actions, behaviour or policies promoting a firm’s CSR, and thus its CSP. Accordingly, following Cook and Glass (2018), we propose the following:

Hypothesis 1

All else being equal, firms with all-male boards will have weaker CSP compared with firms with WOCB.

Originally formulated by Kanter (1977a), token theory defines “tokens” as being members of a social group who are significantly under-represented in a work environment. Applied to the boardroom, women are seen as “tokens” when they are insignificantly represented among the directors. In extreme cases, “solos” refer to individuals who are the sole representative of a particular demographic group (e.g., gender and race). Kanter (1977a) highlights three perceptual phenomena. The first is visibility, i.e., every deed and action of tokens (solos) is watched. This constant pressure makes female directors less efficient in their duties (Adams & Ferreira, 2009). Second, increased scrutiny and pressure to perform can often lead tokens (solos) to feel reluctant or restrained in their behaviour. Indeed, Kanter (1977a) points out that fear of retaliation or of being seen as a “troublemaker” can restrict tokens (solos) from making a difference within the organization. The third is social isolation, where men exclude women by putting up barriers in informal networks and meetings. This leads women to feel polarized (Walls et al., 2012). Consequently, women are stereotyped and assimilated as representatives of their gender (Bratton, 2005). These constraints are likely to impede female directors in fully contributing to their organizations. Indeed, as tokens (solos), they may not be listened to or taken seriously, thereby hindering their contribution to and performance within the BoD (Cook & Glass, 2018). According to Kanter (1977a), “two … is not always a large enough number to overcome the problems of tokenism and develop supportive alliances”. Accordingly, drawing on token theory (Kanter, 1977a), token or solo female directors are unable or have limited ability to influence corporate decisions such as CSR and, in fine, CSP. Therefore, we assert the following:

Hypothesis 2

All else being equal, firms with a token or solo female director will have a very low or zero CSP.

Critical mass theory (CMT) (Kanter, 1977a) argues that a critical mass of women is necessary to significantly influence a company’s culture and policy, and to make a change, not as tokens/solos, but as an influential body. This raises the question of what the “right” number of WOCB is (Konrad et al., 2008). These authors argue that when there is only one female director, she will experience the biases and limits associated with the token/solo status. Having three female directors will normalize the situation as there is a shift from gender to talent, thereby reducing out-group bias toward women (Konrad et al., 2008). With at least three female directors on the board, women feel more comfortable expressing their point of view or concerns, less eager to prove themselves and more confident in their abilities (Konrad et al., 2008; Torchia et al., 2011). Their numerical presence is able to influence a male-dominated group (Okhuysen & Eisenhardt, 2002). In a more balanced configuration, numerical minorities are more likely to be seen as individuals and not as representatives of their gender. They are therefore more able to exert an influence on the company’s outcomes (Bear et al., 2010; Konrad et al., 2008). Finally, when a critical mass of women is reached, they can significantly interact and influence the company’s outcomes (Konrad et al., 2008; Torchia et al., 2011).

CMT receives some support. For example, Torchia et al. (2011) find that three or more female directors on the BoD increase a company’s innovation. Likewise, based on a sample of German firms, Joecks et al. (2013) found that the relationship between WOCB and FP is not linear: below a critical mass of 30% of WOCB, the relationship between WOCB and FP is negative. Beyond this threshold, the effect of WOCB is positive and significant. Moreover, in their analysis of the Fortune 500 (from 2001 to 2010), Cook and Glass (2018) show that a critical mass of two or three female directors significantly influences CSP compared to firms with sole or token female directors. Consequently, we propose the following:

Hypothesis 3

All else being equal, firms with a critical mass of female directors (i.e., at least three female directors) will have a stronger CSP.

Based on gender differences (Hypothesis 1), token theory (Hypothesis 2) and CMT (Hypothesis 3), and as a follow-up to Cook and Glass’s (2018) study, this research seeks to advance the field by showing that the relationship between WOCB and CSP is non-linear, contrary to what the existing empirical literature assumes (Boulouta, 2013; Francoeur et al., 2019). Accordingly, we consider the following:

Hypothesis 4

All else being equal, the relationship between WOCB and CSP is non-linear.

3 Research design

3.1 Sample

The initial sample included all companies listed on the 2018 Fortune 1000 list, over the period between 2004 and 2018. This list—compiled by Fortune magazine—ranks the 1000 largest U.S. public and private corporations by their revenues. This list has already been used in the literature (e.g., Bonet et al., 2020; Marquis & Tilcsik, 2016). Specifically, we focus on companies ranked from 501 to 1000 to provide fresh and complementary data to enrich existing studies (e.g., Boulouta, 2013; Francoeur et al., 2019).

Following existing literature (e.g., Cheng et al., 2021; Dang et al., 2021), we apply two main filters to our data. First, firms operating in financial sectors (SIC codes 6000–6999) and utility sectors (SIC codes 4900–4999) were excluded due to their specific features (disclosure requirements and accounting considerations). Second, missing observations are removed. Accordingly, the final sample consisted of an unbalanced panel data set of 384 firms and 3016 firm-year observations (see Footnote 1).

In comparison, the sample in the work of Boulouta (2013) included 126 firms and 594 firm-year observations from S&P 500 (for the period between 1999 and 2003) and Francoeur et al. (2019) assessed 325 firms and 1632 firm-year observations from Fortune 500 (for the period between 2007 and 2013).

Table 1 presents the sample’s distribution by year and by industry, according to Campbell’s (1996) industry classification. From panel A of Table 1, it is apparent that firms in the present sample are not evenly distributed, insofar as the years 2015 to 2018 account for approximately 47% of the sample, while the 2004 to 2007 period represents only about 12% of the sample. Using the Wilcoxon–Mann–Whitney test, a significant difference between these two samples at the 1% level (z = 30.843) is found. Furthermore, it appears that none of the years included 384 firms. This occurs because many firms have gone through an M&A transaction. Consequently, Thomson Reuters Eikon no longer provides information on CSP. Similarly, depending on the IPO date, not all information is available throughout the study period.

Table 1 Sample characteristics

Finally, the sectoral breakdown of companies is not evenly distributed, as services (20.32%), capital goods (19.36%), consumer durables (17.31%) and basic industries together account for approximately 71% of the sample. In contrast, utilities (2.65%), construction (3.02%) and food/tobacco (3.02%) represent barely 9% of the sample.

3.2 Variables

3.2.1 Dependent variable

Firms’ CSR data come from the Refinitiv database, formerly known as Asset4. Refinitiv is one of the world's largest providers of financial, governance and CSR data. The database covers more than 10,000 companies (more than 80% of the global market capitalization) and 450 different CSR measures (see Footnote 2). Based on published information (e.g., corporate filings, news or media), Refinitiv assesses firms’ CSR performance on three criteria: (a) environment (E) in three categories: resource use, emissions, and innovation; (b) social (S) commitments in four areas: workplace, human rights, community, and product responsibility; and (c) governance (G) in three dimensions: management, shareholders, and corporate social responsibility strategy. Each sub-category contains several themes related to ESG. According to Refinitiv, the ESG scores “are based on relative performance of ESG factors within the company’s sector (for E and S) and country of incorporation (for G)” (Refinitiv 2020; see Footnote 3). The overall ESG scores range from 0 (minimum score) to 100 (maximum score). Consequently, the Refinitiv ESG scores reflect both a firm's CSR policy and its performance in this regard (Albuquerque et al., 2020). Refinitiv ESG scores, or their previous version Asset4, are common in the finance and CSR literatures (e.g., Bae et al., 2021; Dyck et al., 2019; Ferrell et al., 2016). Refinitiv’s reliability has not been challenged either in the academic literature or by business users (Cheng et al., 2014) because Refinitiv ESG scores are less sensitive to selection bias and are relevant regarding variability and distribution. Consequently, Refinitiv ESG scores are as good as or better than those of other providers, such as Bloomberg or KLD (Kinder Lydenberg Domini & Co.; for more detail, see Habermann & Fischer, 2021; Desender & Epure, 2015; Dorfleitner et al., 2015).

In a nutshell, based on the aforementioned literature, we are confident regarding the quality and the reliability of our CSP measure based on Refinitiv ESG scores.

3.2.2 Independent variable

Following Adams and Ferreira (2009) or Đặng et al. (2020), among others, WOCB was measured through the percentage of WOCB calculated as the number of female directors divided by the total number of directors.

3.2.3 Control variables

In their literature review, Margolis and Walsh (2001) note that a firm’s size and risk are the most commonly used control variables in the empirical literature. Accordingly, any study should at the very least contain these control variables (e.g., Graves & Waddock, 1994; Ullmann, 1985; Waddock & Graves, 1997). Specifically, this study measures firm size (FSize) as the natural logarithm of total assets (in millions of US dollars). Following McWilliams and Siegel (2000), we expect a positive relationship between firm size and CSP. In keeping with Waddock and Graves (1997), firm risk (Leverage) was measured using the ratio of long-term debt to total assets. As the existing literature documents a negative link between a firm’s indebtedness and its CSP, we similarly expect a negative relationship. As R&D expenditures positively and significantly influence FP (Chauvin & Hirschey, 1993), McWilliams and Siegel (2000) and Hull and Rothenberg (2008) emphasize the importance of taking R&D into account in any CSP-FP study to avoid misspecification. This study measured R&D intensity (R&D) through the ratio of R&D expenses to sales (e.g., Biga-Diambeidou et al., 2021). Following McWilliams and Siegel (2000), we expected R&D intensity to be positively related to CSP. In addition, to prevent problems related to missing R&D values (Koh & Reeb, 2015), a dummy variable (Miss) was created, equal to 1 if R&D expenditure is unavailable in the Thomson Reuters Eikon database, and 0 otherwise.

Because CSP is generally associated with FP (Hillman & Keim, 2001; Waddock & Graves, 1997), the latter is often considered when investigating the CSP-FP relationship. Operationally, we used return on assets (ROA), expressed as net income divided by total assets (Griffin & Mahon, 1997; see Footnote 4), and Tobin’s Q (Q), calculated as the market value of equity plus the book value of debt divided by the book value of total assets (Bebchuk & Cohen, 2005; Gompers et al., 2003; see Footnote 5). Following Griffin and Mahon (1997), a positive relationship between CSP and FP was expected. We also controlled for board independence (BIndep), measured as the proportion of outside (non-executive) directors on the board, as Freeman (1984) argues that this may influence a firm’s awareness of CSR and of key stakeholders’ legitimate interests. Accordingly, a positive relationship between board independence and CSP was expected. Finally, firm age (FAge), measured as the natural logarithm of the number of years since the firm’s creation (Anderson & Reeb, 2003), was included, as Harjoto and Jo (2011) and Withisuphakorn and Jiraporn (2016) found that a firm’s maturity plays a significant role vis-à-vis its commitment to CSR. The direction of the relationship is, however, uncertain.

All variables except the dummy variables are winsorized at the 1% and 99% levels to reduce the potentially spurious effects of outliers. Finally, all the variables came from the Thomson Reuters Eikon database. An overview of the variables used in the analysis and their definitions is provided in Table 2.
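The variable definitions above can be illustrated with a minimal Python sketch. It is not the code used in this study: the input field names (`female_directors`, `rd_expense`, etc.) are hypothetical, and setting missing R&D to zero alongside the Miss dummy is an assumption made here for illustration only.

```python
import numpy as np

def winsorize(x, lower=1.0, upper=99.0):
    """Clip values to the [lower, upper] percentile range; NaNs stay NaN."""
    x = np.asarray(x, dtype=float)
    lo, hi = np.nanpercentile(x, [lower, upper])
    return np.clip(x, lo, hi)

def build_variables(raw):
    """Build the regressors from a dict of equal-length arrays.

    All keys of `raw` are hypothetical names, not Eikon field codes.
    """
    rd = np.asarray(raw["rd_expense"], dtype=float)
    miss = np.isnan(rd).astype(float)             # Miss = 1 if R&D is unavailable
    rd_filled = np.where(np.isnan(rd), 0.0, rd)   # assumption: missing R&D set to 0
    out = {
        "WOCB": 100.0 * raw["female_directors"] / raw["board_size"],
        "FSize": np.log(raw["total_assets"]),
        "Leverage": raw["long_term_debt"] / raw["total_assets"],
        "RD": rd_filled / raw["sales"],
        "ROA": raw["net_income"] / raw["total_assets"],
        "Q": (raw["market_equity"] + raw["book_debt"]) / raw["total_assets"],
        "BIndep": raw["outside_directors"] / raw["board_size"],
        "FAge": np.log(raw["firm_age_years"]),
    }
    # Winsorize every continuous variable at the 1% and 99% levels.
    out = {name: winsorize(values) for name, values in out.items()}
    out["Miss"] = miss                            # dummy variables are not winsorized
    return out
```

Winsorizing after the ratios are formed, rather than winsorizing the raw accounting items, is one possible ordering; the text does not specify which was used.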

Table 2 Definition of variables

3.3 Methodology

3.3.1 Parametric approach

The following equation was first estimated:

$$ \begin{aligned} CSP_{it} & = \alpha_{i} + \delta_{t} + \beta_{1} WOCB_{it} + \beta_{2} FSize_{it} + \beta_{3} ROA_{it} + \beta_{4} Q_{it} + \beta_{5} Leverage_{it} \\ & \quad + \beta_{6} R\& D_{it} + \beta_{7} Miss_{it} + \beta_{8} BIndep_{it} + \beta_{9} FAge_{it} + \mu_{it} \\ \end{aligned} $$
(1)

where i denotes companies in the sample, and t refers to time period. All other variables are described in Table 2.

Firm fixed-effects (FE) are introduced in order to control for unobserved time-invariant firm heterogeneity. By definition, these FEs can be correlated with the variables included in the right-hand side of Eq. (1). The correlated random effect (CRE) approach as defined by Chamberlain (1984) enables us to take this issue into account by replacing FE by a linear combination of time-averaged regressors. Time-invariant unobserved heterogeneity is then controlled as with FE but without encountering the incidental parameter problem that affects classical FE model estimation. Moreover, the CRE approach enables us to measure the effect of time-invariant explanatory variables, which FE modeling does not allow (Wooldridge, 2010).
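To illustrate the CRE idea, the following minimal sketch uses simulated data with a single regressor. The data-generating process, sample sizes and variable names are assumptions made for illustration, not the specification of Eq. (1). It shows that adding the firm-level time average of the regressor to a pooled regression recovers the within (FE) slope, whereas pooled OLS without this term is biased when the firm effect is correlated with the regressor.

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_years, beta = 200, 8, 0.5

firm = np.repeat(np.arange(n_firms), n_years)      # firm index of each firm-year
alpha = rng.normal(size=n_firms)                   # unobserved firm effect
x = alpha[firm] + rng.normal(size=firm.size)       # regressor correlated with alpha
y = alpha[firm] + beta * x + rng.normal(size=firm.size)

def firm_mean(v):
    """Firm-level time average, broadcast back to every firm-year row."""
    return (np.bincount(firm, weights=v) / np.bincount(firm))[firm]

def ols(X, target):
    """Least-squares coefficients of target on the columns of X."""
    return np.linalg.lstsq(X, target, rcond=None)[0]

ones = np.ones_like(x)
b_pooled = ols(np.column_stack([ones, x]), y)[1]                 # ignores alpha
b_cre = ols(np.column_stack([ones, x, firm_mean(x)]), y)[1]      # CRE (Mundlak) device
b_fe = ols((x - firm_mean(x))[:, None], y - firm_mean(y))[0]     # within estimator

print(f"pooled: {b_pooled:.3f}  CRE: {b_cre:.3f}  FE: {b_fe:.3f}")
```

In this simple linear setting the CRE and FE slopes coincide numerically. The practical appeal of the CRE form is that it remains an ordinary regression with additional covariates, so time-invariant regressors can be kept in the model.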

3.3.2 Endogeneity issues

Establishing a causal relationship between WOCB and CSP may be challenging (Sila et al., 2016; Yang et al., 2019). Indeed, according to Adams et al. (2010), there are no strong theoretical arguments or empirical evidence for assuming that the board structure is fundamentally exogenous (see for example Hermalin & Weisbach, 1988, 1998). As a result, endogeneity issues must be taken into account when considering the relationship between corporate outcomes and WOCB (Adams, 2016). In essence, endogeneity arises when the variable of interest is correlated with the residuals (e.g., Lu et al., 2018). The two main sources of endogeneity that can bias our estimates of how WOCB affect CSP are omitted/unobserved factors and reverse causality (Adams, 2016; Sila et al., 2016).

Wooldridge (2010) argues that, in any economic model, the omission of key variables can cause omitted variable bias. This significantly influences the residuals, thus creating endogeneity issues (Adams, 2016). For instance, corporate culture and director ability are characteristics that influence CSP but are omitted by the literature because they are difficult to observe or measure (e.g., Boulouta, 2013; Yang et al., 2019). Panel data analysis and fixed effects may mitigate the detrimental effect of omitted/unobserved factors under certain assumptions (Wooldridge, 2010). However, this may not be sufficient because of a second source of endogeneity, reverse causality.

Another source of endogeneity in the WOCB–CSP relationship is reverse causality: the match between boards and female directors is likely to be a function of firm and individual characteristics (Adams, 2016). Specifically, WOCB may affect CSP, but it is also possible that more socially responsible firms are more likely to appoint female directors (Boulouta, 2013). In both cases, we can observe a positive relationship between WOCB and CSP. Consequently, the direction of causality could go both ways (Adams, 2016).

Consequently, an instrumental variable (IV) approach must be employed in order to obtain consistent parameter estimates in Eq. (1). Assuming that appropriate IVs can be found for WOCB, several methods can be employed to correctly quantify the impact that WOCB has on CSP, 2SLS being the most common. According to Adams (2016), finding suitable exogenous instruments can be challenging, since they must be correlated with the endogenous variable (WOCB) but uncorrelated with the error term in Eq. (1) (e.g., Đặng et al., 2020). Following Campbell and Mínguez-Vera (2008) and Ben-Amar et al. (2017), we hypothesize that the representation of WOCB can be instrumented by board size. Indeed, these studies suggest that the more members there are on the board, the higher the likelihood that more women will be appointed. To assess the strength or weakness of our instrument, we compute the F test and Cragg and Donald’s (1993) statistic and compare them to the critical values suggested by Staiger and Stock (1994).

2SLS estimation of Eq. (1) proceeds in two stages. The first stage involves fitting a regression of WOCB on Bsize, control variables involved in Eq. (1), and individual and time fixed effects. Here too, the CRE approach is implemented by estimating an augmented version of the regression model, including time-averaged regressors, by OLS. In the second stage, Eq. (1) where WOCB is replaced by its estimated first-stage value is estimated following the CRE approach. Estimation by 2SLS then produces a consistent estimate of the impact of WOCB on CSP for an appropriate choice of the instrumental variable. Various tests can be used to assess the validity of this choice (see, among others, Andrews et al., 2019).
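The two stages described above can be illustrated with a minimal sketch on simulated data. This is not the estimation code used in this study: the function names, the simulated variables and the coefficient values are hypothetical, and the CRE terms (time averages and time dummies) are assumed to be already included in the `controls` matrix.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(y, x_endog, z, controls):
    """2SLS with one endogenous regressor and one instrument.

    `controls` must already contain a constant column.
    Returns (second-stage coefficients, first-stage F statistic for z).
    """
    n = len(y)
    # Stage 1: regress the endogenous regressor on instrument + controls.
    Z = np.column_stack([z, controls])
    x_hat = Z @ ols(Z, x_endog)
    rss_unrestricted = np.sum((x_endog - x_hat) ** 2)
    # Restricted first stage (instrument excluded), used for the F statistic.
    rss_restricted = np.sum((x_endog - controls @ ols(controls, x_endog)) ** 2)
    f_stat = (rss_restricted - rss_unrestricted) / (rss_unrestricted / (n - Z.shape[1]))
    # Stage 2: regress the outcome on the fitted regressor + controls.
    beta = ols(np.column_stack([x_hat, controls]), y)
    return beta, f_stat

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 5000
u = rng.normal(size=n)                                   # unobserved confounder
z = rng.normal(size=n)                                   # instrument (plays the role of Bsize)
w = rng.normal(size=n)                                   # one exogenous control
x = 0.8 * z + 0.5 * w + u + rng.normal(size=n)           # endogenous regressor (role of WOCB)
y = 2.0 * x + 1.0 * w - 1.5 * u + rng.normal(size=n)     # outcome (role of CSP)
controls = np.column_stack([np.ones(n), w])

beta_iv, f_stat = two_stage_least_squares(y, x, z, controls)
beta_ols = ols(np.column_stack([x, controls]), y)
print("2SLS slope:", beta_iv[0], "OLS slope:", beta_ols[0], "first-stage F:", f_stat)
```

In this simulation the OLS slope is biased by the confounder while the 2SLS slope is close to the true value of 2. Note that standard errors taken from a naive second-stage OLS are not valid for 2SLS; dedicated IV routines correct them.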

3.3.3 Semiparametric approach

The presence of non-linearity in the WOCB–CSP relationship can be detected by considering a more general specification than Eq. (1) such as:

$$ \begin{aligned} CSP_{it} & = \alpha_{i} + \delta_{t} + s_{WOCB} \left( {WOCB_{it} } \right) + \beta_{2} FSize_{it} + \beta_{3} ROA_{it} + \beta_{4} Q_{it} \\ & \quad + \beta_{5} Leverage_{it} + \beta_{6} R\& D_{it} + \beta_{7} Miss_{it} + \beta_{8} BIndep_{it} + \beta_{9} FAge_{it} + \mu_{it} \\ \end{aligned} $$
(2)

where \(s_{WOCB} \left( {WOCB_{it} } \right)\) is an unknown smooth function of \(WOCB_{it}\).

Here too, we use the CRE approach to deal with all the issues involved in the presence of firm FEs in Eq. (2). After replacing these effects by a linear combination of time-averaged explanatory variables, Eq. (2) can be viewed as a generalized additive model (GAM) (Wood, 2020) where the link function is identity and the dependent variable is normally distributed. Estimates of unknown parameters and smoothed functions can then be recovered using GAM estimation tools (see “Appendix” for technical details).
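For concreteness, the CRE substitution can be sketched as follows. This is the standard Mundlak-type device discussed in Wooldridge (2010); the symbols \(X_{it}\), \(\bar{X}_{i}\), \(\pi\), \(\alpha_{0}\), \(a_{i}\) and \(T_{i}\) are notational assumptions introduced here for illustration and may differ from those in the “Appendix”. Let \(X_{it}\) collect the time-varying explanatory variables and \(T_{i}\) be the number of periods observed for firm \(i\):

$$ \alpha_{i} = \alpha_{0} + \bar{X}_{i}^{\prime } \pi + a_{i} ,\quad \bar{X}_{i} = \frac{1}{{T_{i} }}\sum\limits_{t = 1}^{{T_{i} }} {X_{it} } $$

where \(a_{i}\) is assumed to be uncorrelated with the explanatory variables. Substituting this expression into Eq. (2) replaces the firm effect by the time averages \(\bar{X}_{i}\) and leaves \(a_{i} + \mu_{it}\) as a composite error term, so that the resulting model contains only observed regressors and one smooth term.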

By representing the unknown function \(s_{WOCB} \left( {WOCB_{it} } \right)\) using a reduced rank spline smoother, the problem of estimating the parameters and function involved in the augmented version of Eq. (2) is reduced to estimating the same parameters plus those of the finite expansion of the unknown function in the chosen spline basis. Introducing a penalty on the derivatives of the unknown function (for instance, the second derivative) during estimation then makes it possible to avoid a very wiggly estimate (Wood, 2020). Moreover, as the unknown function is expressed as a linear combination of known spline basis terms, estimates of its derivatives can be recovered by differentiating the estimated function, i.e. by taking linear combinations of the derivatives of the known spline basis terms (see “Appendix” for more details).
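The idea of a basis expansion with a penalty, and of recovering the derivative from the derivatives of the basis terms, can be sketched as follows. This is a deliberately simplified stand-in, not the smoother used in this study: it assumes a cubic truncated-power basis with a ridge penalty on the knot coefficients rather than the reduced rank smoothers and derivative penalties of Wood (2020), and the function names, knots, penalty value and simulated data are all hypothetical.

```python
import numpy as np

def tp_basis(x, knots, deriv=False):
    """Cubic truncated-power basis [1, x, x^2, x^3, (x-k)_+^3, ...] or its first derivative."""
    x = np.asarray(x, dtype=float)
    if deriv:
        poly = [np.zeros_like(x), np.ones_like(x), 2 * x, 3 * x**2]
        trunc = [3 * np.clip(x - k, 0.0, None) ** 2 for k in knots]
    else:
        poly = [np.ones_like(x), x, x**2, x**3]
        trunc = [np.clip(x - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(poly + trunc)

def fit_penalized_spline(x, y, knots, lam):
    """Penalized least squares: only the knot coefficients are shrunk (assumed ridge penalty)."""
    B = tp_basis(x, knots)
    penalty = np.diag([0.0] * 4 + [1.0] * len(knots))
    return np.linalg.solve(B.T @ B + lam * penalty, B.T @ y)

# Simulated example with a known smooth function (illustrative assumption).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=3000)
y = np.sin(3.0 * x) + rng.normal(scale=0.05, size=x.size)
knots = np.linspace(0.1, 0.9, 9)
coef = fit_penalized_spline(x, y, knots, lam=1.0)

grid = np.linspace(0.1, 0.9, 50)
s_hat = tp_basis(grid, knots) @ coef                # estimated function
ds_hat = tp_basis(grid, knots, deriv=True) @ coef   # estimated first derivative
```

The same coefficient vector is used for the function and for its derivative; only the basis matrix changes, which is what makes derivative estimates inexpensive once the function has been fitted.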

To account for the potential endogeneity of WOCB, we use a two-stage generalized additive model (2SGAM) approach, which is a generalization of the two-stage approach introduced by Hausman (1978, 1983) as a means of directly testing the endogeneity hypothesis for a class of linear models (Marra & Radice, 2011). The two stages are the following:

  1. Obtain consistent estimates of parameters and unknown smooth functions by fitting the following reduced-form equation through a GAM method:

    $$ \begin{aligned} WOCB_{it} & = \gamma_{i} + \theta_{t} + s_{Bsize} \left( {Bsize_{it} } \right) + \varphi_{2} FSize_{it} + \varphi_{3} ROA_{it} + \varphi_{4} Q_{it} + \varphi_{5} Leverage_{it} \\ & \quad + \varphi_{6} R\& D_{it} + \varphi_{7} Miss_{it} + \varphi_{8} BIndep_{it} + \varphi_{9} FAge_{it} + \xi_{it} \\ \end{aligned} $$

    Then calculate the corresponding estimated errors \(\hat{\xi }_{it}\).

  2. Fit a GAM defined by:

    $$ \begin{aligned} CSP_{it} & = \alpha_{i} + \delta_{t} + s_{WOCB} \left( {WOCB_{it} } \right) + \beta_{2} FSize_{it} + \beta_{3} ROA_{it} + \beta_{4} Q_{it} + \beta_{5} Leverage_{it} \\ & \quad + \beta_{6} R\& D_{it} + \beta_{7} Miss_{it} + \beta_{8} BIndep_{it} + \beta_{9} FAge_{it} + s_{\xi } \left( {\hat{\xi }_{it} } \right) + \mu_{it} \\ \end{aligned} $$

The first stage is a generalization of the 2SLS first stage. The second stage differs from the 2SLS second stage, as WOCB is not replaced by its first-stage estimated value. Instead, the first-stage estimated residual is added to the model to be estimated, in order to control for all potential sources of endogeneity of WOCB, such as unobserved confounders correlated with both WOCB and CSP. The unknown function \(s_{\xi } \left( {\hat{\xi }_{it} } \right)\) is thus added to recover the residual amount of nonlinearity needed to clear up the endogeneity of WOCB. The latter can be tested by considering the null hypothesis \(s_{\xi } \left( {\hat{\xi }_{it} } \right) = 0\), that is, the joint nullity of all parameters involved in the finite expansion of \(s_{\xi } \left( {\hat{\xi }_{it} } \right)\) in the chosen spline basis.
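The logic of the two stages (first-stage residual added as an extra smooth term, then a joint test on its coefficients) can be sketched on simulated data. This is only an illustration under strong simplifying assumptions: unpenalized cubic polynomial bases replace the penalized spline smoothers of a GAM, there are no control variables or CRE terms, and every name and numerical value is hypothetical.

```python
import numpy as np

def poly_basis(v, degree=3, intercept=True):
    """Polynomial basis in v, used here as a crude stand-in for a spline basis."""
    start = 0 if intercept else 1
    return np.column_stack([v**d for d in range(start, degree + 1)])

def ols_fit(X, y):
    """Return least-squares coefficients and residuals."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef, y - X @ coef

def two_stage_control_function(y, x, z):
    # Stage 1: flexible regression of the endogenous variable on the instrument.
    _, xi_hat = ols_fit(poly_basis(z), x)
    # Stage 2: outcome on a flexible function of x plus a flexible function of xi_hat.
    X_restricted = poly_basis(x)
    X_full = np.column_stack([X_restricted, poly_basis(xi_hat, intercept=False)])
    coef_naive, res_r = ols_fit(X_restricted, y)
    coef_cf, res_f = ols_fit(X_full, y)
    # F statistic for the joint nullity of the xi_hat terms (endogeneity check).
    q = X_full.shape[1] - X_restricted.shape[1]
    f_stat = ((res_r @ res_r - res_f @ res_f) / q) / (res_f @ res_f / (len(y) - X_full.shape[1]))
    return coef_cf, coef_naive, f_stat

# Simulated example: true function is 2x - 0.3x^2 and x is endogenous through v.
rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)                        # instrument
v = rng.normal(size=n)                        # first-stage error, also enters the outcome error
x = 2.0 * z + v                               # endogenous regressor
eps = v + 0.5 * rng.normal(size=n)            # outcome error correlated with x
y = 2.0 * x - 0.3 * x**2 + eps

coef_cf, coef_naive, f_stat = two_stage_control_function(y, x, z)
print("slope at x=0, control function:", coef_cf[1], "naive:", coef_naive[1], "F:", f_stat)
```

In this simulation the naive fit overstates the slope of the function at zero, whereas the fit that includes the residual terms recovers it. The F statistic is shown only to convey the idea of the test; because the residual is itself estimated, inference in practice requires the adjustments discussed by Marra and Radice (2011).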

3.4 Descriptive statistics and correlation analysis

Table 3 reports the descriptive statistics of all the variables. The firms in the present sample have an average (median) CSP score of 36.648 (33.740) on a scale between 0 and 100. This score is relatively low compared with other studies using the Refinitiv database. By comparison, Dorfleitner et al. (2021) and Habermann and Fischer (2021) report scores of 53.55 (for mutual funds from 2003 to 2018) and 43.53 (for 1215 U.S. firms from 2010 to 2019), respectively.

Table 3 Descriptive statistics (N = 3016)

Regarding the variable of interest, the mean (median) percentage of WOCB is 15.2% (12.5%), which is relatively similar to the 16% reported by Dang et al. (2021). Figure 1 shows the distribution of WOCB. In essence, almost 16% of sample firms have no female directors, 40% have one female director (WOCB ≃ 0.111) and 30% have two female directors (WOCB ≃ 0.205). These three categories represent 86% of the sample. The histogram also reveals that very few observations lie beyond a WOCB of 0.38 (i.e. four women or more): barely 2.35% of the total sample. Consequently, one should bear these statistics in mind when interpreting the nonparametric results below.

Fig. 1 Distribution of WOCB

Compared to the 500 largest companies (e.g. Dang et al. 2020), the present sample firms are smaller (firm size of 8.36) and less profitable (ROA of 6%). On the other hand, they seem relatively similar in terms of R&D expenses, leverage and board independence.

Table 4 reports the correlations among the variables. As a general rule, a correlation of 0.70 or higher in absolute value is indicative of a multicollinearity issue (Kutner et al., 2005). In Table 4, the highest correlation, 0.34, appears between ROA and Tobin’s Q (Q), which is below the reference value of 0.70. As such, multicollinearity does not seem to be a significant problem. To confirm this finding, variance inflation factors (VIFs) were calculated for all the variables. The maximum VIF in Table 4 is 1.47, which is well below the rule-of-thumb cut-off of 10 recommended by Wooldridge (2010). Consequently, Table 4 suggests that multicollinearity has little impact on these analyses.
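As a reminder of how the diagnostic works, the VIF of a regressor is 1 / (1 − R²), where R² comes from regressing that regressor on all the others. The sketch below computes it on simulated data; the function name and the simulated variables are hypothetical and unrelated to the variables in Table 4.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns plus a constant."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    vifs = np.empty(k)
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        fitted = others @ np.linalg.lstsq(others, target, rcond=None)[0]
        ss_res = np.sum((target - fitted) ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        vifs[j] = ss_tot / ss_res  # algebraically equal to 1 / (1 - R^2)
    return vifs

# Simulated example: the third column is nearly a copy of the first.
rng = np.random.default_rng(0)
a = rng.normal(size=2000)
b = rng.normal(size=2000)
c = a + 0.1 * rng.normal(size=2000)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(vifs)
```

Here the independent column has a VIF close to 1, while the two nearly collinear columns have VIFs far above the cut-off of 10.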

Table 4 Correlation matrix

4 Results

Table 5 reports the first results from the estimation of Eq. (1), including time averages of all observed confounders, using the classical 2SLS method. To obtain a consistent estimate of the impact of WOCB on CSP, the chosen instrument, board size, needs to be strongly associated with the endogenous variable. Cragg and Donald’s (1993) approach is first used to test whether Bsize is a strong instrument. Because there is only a single endogenous regressor, the Cragg–Donald statistic equals the F-statistic of the first-stage regression. Model 1 in Table 5 shows that the value is equal to 57.272, which exceeds the threshold of 10 above which Staiger and Stock (1994) suggest rejecting the null hypothesis that the chosen instrument is weak. Endogeneity is also tested using the Wu–Hausman test. This test gives a p-value of 0.010, suggesting that the null hypothesis of WOCB’s exogeneity is rejected at usual significance levels. The results of these two tests justify the use of 2SLS to measure the impact of WOCB on CSP. This result is consistent with studies such as Boulouta (2013), Francoeur et al. (2019) and Dang et al. (2021), which argue that the WOCB–CSP relationship is endogenous.

Table 5 Results from 2SLS and 2SGAM models estimation

Model 1 in Table 5 shows that the percentage of WOCB is positively and significantly associated with CSP at the 1% level of significance. This finding is consistent with Francoeur et al. (2019) and Dang et al. (2021), among others. Everything else being equal, the 2SLS estimate suggests that a one standard deviation increase in WOCB would increase CSP by 0.794 × 16.854 ≈ 13.38 points, an economically significant amount.

Model 1 (2SLS) in Table 5 imposes assumptions on the functional form of continuous covariates. To overcome this problem, we estimate the WOCB–CSP relationship within the GAM framework. Measuring this impact now requires the estimation of the unknown smooth function s(WOCB) and its first derivative.

Results from the estimation of Eq. (2), including time averages of all observed confounders, using 2SGAM are reported in Model 2 in Table 5. Here too, obtaining a consistent estimate of the WOCB–CSP relationship requires addressing the potential endogeneity of WOCB and the relevance of the chosen instrumental variable, i.e. board size. As in Model 1 of Table 5, we use Cragg and Donald’s (1993) statistic to test the null hypothesis that board size is a weak instrument. The statistic value is 13.285, which exceeds the threshold of 10 above which Staiger and Stock (1994) suggest rejecting the null hypothesis. Board size is thus a strong instrument for WOCB.

Moreover, first-stage estimation results show that: first, the estimated degrees of freedom (edf) associated with the estimated function \(\hat{s}_{Bsize} \left( {Bsize_{it} } \right)\), namely 7.971, exceed 1, suggesting a highly nonlinear relationship between WOCB and Bsize; second, the p-value associated with the F-statistic used to test the joint nullity of all the parameters involved in the spline basis expansion of the unknown smooth function \(s_{Bsize} \left( {Bsize_{it} } \right)\) is below the classical 1% significance level, clearly rejecting the null that \(s_{Bsize} \left( {Bsize_{it} } \right) = 0\).

Finally, we test for the endogeneity of WOCB by testing the null hypothesis that \(s_{\xi } \left( {\hat{\xi }_{it} } \right) = 0\) using estimates from the second stage of 2SGAM. This test yields a p-value of 0.002, leading to a clear rejection of the null hypothesis of no endogeneity of WOCB at the usual significance levels.

Regarding control variables, firm size (FSize) is positively and significantly associated with CSP (at the 1% level), which is consistent with McWilliams and Siegel (2000). This suggests that, even among Fortune 1000 companies ranked 501–1000, firm size drives CSR investments (McWilliams & Siegel, 2000). We also find that board independence (BIndep) is positively and significantly associated with CSP (at the 1% level). Jo and Harjoto (2011, 2012) find that effective CG, including independent boards, reduces conflicts between the firm and its various stakeholders by promoting CSR engagement, which positively influences FP. Firm age (FAge) is also positively related to CSP (at the 1% level), which is consistent with Jo and Harjoto (2011), who argue that older firms are more likely to bear CSR-related expenditures. Contrary to McWilliams and Siegel’s (2000) claim, R&D expenditures (R&D) and CSR (via CSP) are not significantly correlated at the 10% level. This is probably explained by the missing R&D expenditure data (Miss; as a reminder, these data are unavailable in 58% of cases; cf. Table 3), which are negatively and significantly correlated with CSP (at the 1% level), consistent with Koh and Reeb (2015) and Duru et al. (2016). This suggests that the missing values may have influenced the results herein. Finally, ROA, Tobin’s Q and leverage are not significantly correlated with CSP at the 10% level of significance. The R-squared is equal to 0.30.

Before starting the interpretation of nonparametric results, it is worth recalling, as mentioned previously and as shown in Fig. 2, that there are very few observations in Region D of Fig. 2. Specifically, few firms have more than three female directors on their boards: 3.88% of our sample firms. Consequently, as pointed out by Florackis et al. (2009) and Hamadi and Heinen (2015), care should be taken regarding the interpretation of the curve in Region D of Fig. 2.

Fig. 2 Estimated function

Hypothesis 4 suggests that the WOCB–CSP relationship is non-linear. The 2SGAM estimates (Model 2 in Table 5) seem to confirm this for the following reasons. First, the edf associated with the estimated function \(\hat{s}_{WOCB} \left( {WOCB_{it} } \right)\) exceeds 1, suggesting nonlinearity. Second, the F-test clearly rejects the null hypothesis of joint nullity of all the parameters involved in the spline basis expansion of the unknown smooth function \(s_{WOCB} \left( {WOCB_{it} } \right)\). Third, visually, this nonlinearity appears clearly in Figs. 2 and 3, where the estimates of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) and its first derivative are reported, together with the corresponding 95% confidence intervals. Hypothesis 4 is therefore supported.

Fig. 3 Estimated derivative of nonlinear function s(WOCB)

2SGAM (Model 2 in Table 5) offers a more nuanced picture of the WOCB–CSP relationship, with significant departures from the parametric 2SLS model (Model 1 in Table 5), for several reasons. First, below a threshold of 14.67% of WOCB, the value of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) is negative and significantly different from zero. In the present case, this represents 45.1% of companies in the sample. Put differently, below this threshold, the lower WOCB is, the lower CSP is, ceteris paribus. This negative effect decreases in absolute value as WOCB rises, and the function becomes positive and still significantly different from zero above the threshold. Second, Fig. 3 exhibits an increasing relationship between WOCB and CSP, but with a decreasing rate of growth. Third, this increasing relationship is estimated fairly accurately up to a threshold of 35% for WOCB. Beyond this threshold, the 95% confidence interval widens sharply. It then becomes difficult to determine the shape of the WOCB–CSP relationship: while significantly different from zero, it may be constant, increasing or even decreasing.

Hypothesis 1 suggests that firms with all-male boards will have significantly lower CSP compared to firms with WOCB. Figure 2 shows that, for these firms, the value of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) is negative (≃ −5.88) and significantly different from zero, thus supporting Hypothesis 1. Note that this concerns 15.2% of our sample firms.

Hypothesis 2 suggests that firms with a token or solo female director will achieve a zero, or close to zero, CSP. Figure 2 shows that Region A (i.e. firms with one female director on corporate boards, with WOCB varying from 0 to 11.1%) exhibits a value of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) that is negative, approximately −3.75, and significantly different from zero. Approximately 40% of our sample firms are located in Region A. Therefore, Hypothesis 2 is supported.

Hypothesis 3 suggests that firms with a critical mass of female directors (i.e. at least three female directors) will exhibit a stronger CSP. For WOCB levels between 11.0% and 20.49% (i.e. two female directors), the turning point (WOCB = 14.7%) is located in Region B. We can observe that the curve continues to increase through Region C (WOCB < 29.6%, or three female directors on corporate boards). The values of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) vary between −3.75 and 4.75, which is significantly different from zero. Approximately 70% of our sample firms are located in Regions B and C. In other words, this increasing relationship is also estimated fairly accurately in view of the confidence interval. The rug of observed values of WOCB, as reported on the x-axis of the figure, clearly shows that a large share of observations for this variable, around 96.1%, lie below the considered threshold.

In Region D, the estimated relationship continues to increase, with values of the function \(s_{WOCB} \left( {WOCB_{it} } \right)\) rising from 4.75 to 6.75. However, this trend should be interpreted with caution. Indeed, as mentioned previously, there are very few observations in that region (only 3.89%). Furthermore, Fig. 3 shows the increasing size of the confidence interval. As such, the relationship is estimated in an increasingly imprecise way. This increasing imprecision stems from the growing scarcity of observations for WOCB, as shown by the rug of these observations (see Fig. 1).

The estimated derivative of \(s_{WOCB} \left( {WOCB_{it} } \right)\), obtained by differentiating the estimated function, provides supplementary information on the WOCB–CSP relationship. As shown in Fig. 3, the estimated value of the derivative is positive up to a threshold of 40% for WOCB. The derivative first appears constant up to the 8.00% threshold for WOCB, with a value around 0.44, and then decreases. Beyond the 40% threshold for WOCB, the estimated values for the derivative are close to 0.11. This pattern of the derivative is consistent with the inverted-U shaped pattern of the function itself as depicted in Fig. 2. Consequently, based on these different results, Hypothesis 3 is not supported.

We observe that, with the exception of firm age (FAge), all control variables are found to be statistically significant in Model 2 of Table 5. The R-squared (equal to 0.374) is higher than in Model 1.

5 Concluding remarks

The purpose of this study was to re-examine the WOCB–CSP relationship. Despite strong theoretical arguments supporting that WOCB influence a firm’s CSP (Byron & Post, 2016), the existing empirical literature yields mixed results. We use a semiparametric approach, and specifically, 2SGAM (Wood, 2006) to examine the non-linear relationship between WOCB and CSP. In essence, this approach enables an examination of the non-linearity of variables, thus providing a complete picture of the non-linear relationship between two variables (Florackis et al., 2009, 2015; Hamadi & Heinen, 2015; Trinh et al., 2018). Our analysis is based on a large sample of firms from the Fortune 1000 ranking for companies ranked from 501 to 1000.

Several interesting findings emerge from this study. First, thanks to the semiparametric approach and GAM, we provide evidence of the specific non-linear relationship between WOCB and CSP. In furtherance of Florackis et al.’s (2009) work, our results suggest that parametric approaches are likely not suitable to investigate the exact nature of the WOCB–CSP relationship. The semiparametric approach can circumvent the restrictions imposed by parametric specifications. To our knowledge, and based on the meta-analyses of Rao and Tilt (2016) and Byron and Post (2016), our study is the first to show the non-linear relationship between WOCB and CSP.

Second, as a follow-up to Florackis et al.’s (2015) study, we emphasize that the WOCB–CSP relationship is significantly more complex than previously considered by the existing theoretical and empirical literature. Indeed, below a threshold of WOCB of around 14.7% (i.e. more or less one female director), the effect of WOCB on CSP is zero, or even negative, as shown in Figs. 2 and 3. This is the case for almost 45% of sample firms. This finding is consistent with the token theory (Kanter, 1977a), which argues that solo or token women cannot significantly influence decisions made by the BoD and, ultimately, CSP. Furthermore, we find that from this threshold, the WOCB–CSP relationship is positive and significant, suggesting that WOCB can contribute to CSP (and CSR) via the various contributions suggested by the agency and resource dependence theories (Kirsch, 2018; Terjesen et al., 2009). Finally, beyond the threshold of 30% of WOCB, the positive effect of WOCB on CSP (Figs. 2 and 3) should be considered cautiously because only 3.89% of sample firms have more than three female directors on their boards. Consistent with Florackis et al. (2009) and Hamadi (2010), the few observations at the end of the distribution (see Fig. 1) prompt a certain prudence in interpreting the effect of WOCB on CSP in this area of the curve (see Region D in Fig. 2). In view of this, our study can neither confirm nor deny CMT (Kanter, 1977a; Konrad et al., 2008), according to which three female directors are likely to significantly influence CSR (e.g., Ben-Amar et al., 2017; Cook & Glass, 2018) and three is a “magic number” (e.g., Jia & Zhang, 2013; Joecks et al., 2013). The semiparametric approach shows that a critical-mass effect of WOCB on CSP cannot be distinguished. In practice, given how few companies reach or exceed the 30% threshold, it is difficult to empirically validate CMT claims. We therefore suggest that the empirical results in this regard be treated with caution (e.g., Liu, 2018; Liu et al., 2014). Consequently, the semiparametric results offer a nuanced picture of the WOCB–CSP relationship, with significant departures from linearity, different facets of WOCB’s effect on CSP and some questions about critical mass.

The present study is important from theoretical, empirical and managerial perspectives. First, drawing on token (Kanter, 1977a) and critical mass theories (Childs & Krook, 2008; Kanter, 1977a), our study contributes to the existing theoretical literature (Byron & Post, 2016; Rao & Tilt, 2016) by providing further evidence of the WOCB–CSP relationship and offering a more complete picture, namely the non-linearity of the relationship. In essence, the effect of WOCB on CSP is a function of female representation on corporate boards. The link is not straightforward and is perhaps double-edged (Triana et al., 2013). In so doing, this research responds to the call of Rao and Tilt (2016), among others, who suggest that more studies are needed to examine the extent to which WOCB actually influence CSP.

Furthermore, empirically, an innovative econometric technique is used: a semiparametric GAM specification, similar to Trinh et al. (2018). As underlined by Hamadi and Heinen (2015), the semiparametric approach is not a tool commonly used in financial econometrics or empirical finance. We argue that the semiparametric approach is particularly suitable and relevant, given that the form of the WOCB–CSP relationship remains an open question. To our knowledge, this approach has never been applied to the WOCB–CSP relationship in the literature. Furthermore, with 384 firms and 3016 firm-year observations drawn from Fortune 1000 companies ranked between 501 and 1000, the firms in our sample are significantly smaller than those examined in previous studies (e.g., Boulouta, 2013; Francoeur et al., 2019). Consequently, this sample adds fresh and relevant empirical evidence regarding the WOCB–CSP relationship.

Finally, from a managerial perspective, the present results are useful for investors, as surveys indicate that more and more investors are integrating CSR criteria into their investment policy (Eccles et al., 2011). The findings indicate how WOCB can improve CSP and how female directors may be a useful investment criterion.

This study is not without limitations. Perhaps most importantly, the evidence is based on US data (an English-origin country; see La Porta et al., 1998). As such, the findings may not be transposable to other geographical areas, since the US lags behind regarding female representation on corporate boards (Cook & Glass, 2018), especially in comparison with Scandinavian countries. Moreover, this study focuses on large US listed companies. Further studies are needed, for instance, on small- and medium-sized enterprises (SMEs), as they are the backbone of the US economy (see the Office of the United States Trade Representative).