Introduction

In this paper, we examine whether US public companies with gender-diverse boards report better long-term non-financial and financial performance. We are motivated to examine this relationship because, in recent years, the issue of gender diversity in corporate boardrooms has gained significant momentum in many countries (Broome and Krawiec 2008, p. 431), driven in part by the argument that female participation on corporate boards improves boardroom governance and thereby leads to better firm performance (Erhardt et al. 2003). A 2018 census of Fortune 500 company boards finds that since 2010 the number of women directors has increased from 856 to 1,278 (Alliance for Board Diversity 2018, p. 17).

Examining the relationship between boardroom gender diversity and firm performance is also important from a policy perspective because many countries and regions are issuing regulatory mandates to increase the number of female directors in corporate boardrooms. For example, in the United States, California became the first state to require its public companies to have at least one female director on their boards by the end of 2019 and a minimum of two to three women directors by 2021 (NYT 2018). Additionally, the Securities and Exchange Commission (SEC) rules (Footnote 1) now require US public companies to disclose how diversity is considered in their director nomination process. In the lower house of the US Congress (2017), a bill is currently pending that proposes to amend the Securities Exchange Act of 1934 to require (1) each issuer to disclose in its proxy filings the gender composition of its board of directors and nominees for board leadership, and (2) that the SEC establish a Gender Diversity Advisory Group with the charge of studying and making recommendations to the Commission on how to increase gender diversity on US company boards (Footnote 2). Similarly, in Canada, the securities regulators from eleven jurisdictions now require non-venture issuers to disclose annually, on a “comply or explain” basis, information on female directors such as their number and percentage, company policy on inducting female directors, and targets for female directors (Canadian Securities Regulators 2015). The European Union now seeks to promote gender-diverse boards by setting a quota of at least 40% representation for each gender among the non-executive directors by 2020 within its member countries (European Commission 2014a, b).
Frustrated by the slow progress in achieving gender parity in the senior ranks of publicly listed companies in Europe, the European Commission has adopted a policy of compelling such companies to “positively discriminate” when recruiting new members for their boards to achieve the goal of 40% women in the boardrooms (Boffey 2017).

Few studies that we are aware of have so far examined the relationship between boardroom diversity and a firm’s long-term non-financial and financial performance. In this regard, Rhode and Packel (2014) specifically note that “research is lacking on the relationship between board diversity and long-term price performance, which is the “gold standard” measure of shareholder value” (p. 391). Moreover, the academic literature seeking to examine what, if any, relationship exists between board gender diversity and firm performance is largely inconclusive in its findings. For example, drawing on the results of 140 research studies that investigate the relationship between female directors and firm performance, Post and Byron (2015, p. 1546) conclude that “despite a relatively large body of literature examining [this relationship], the empirical evidence is decidedly mixed.” Similarly, Rhode and Packel (2014) review more than 20 research studies on boardroom diversity and firm performance covering the period from 1980 to 2010. They conclude, “The relationship between diversity and financial performance has not been convincingly established” (p. 377).

In addition, we use both ordinary least squares (OLS) and two-stage least squares (2SLS) to explore the link between board diversity and long-term firm performance. Most studies in the literature fail to address the endogeneity issues among gender and ethnic diversity and firm characteristics. According to Kmenta (1986, pp. 652–653), ignoring endogeneity and its sources, such as measurement error, omitted variables, and simultaneity, results in biased estimates because it violates the exogeneity assumption of the Gauss–Markov theorem. The problem of endogeneity is nevertheless overlooked by most researchers in the literature, which precludes them from making proper policy recommendations (Antonakis et al. 2010). In particular, the endogeneity problem is critical in the analysis of causal processes over time, which in our case is the long-term impact of board diversity on firm performance. Therefore, after controlling for firm-specific characteristics, such as firm profitability, firm size, beta, sales growth, board size, CEO duality, and board independence, and for the possibility that these firm characteristics and the board diversity measures are endogenous, we can have higher confidence in our estimates of, and conclusions about, the relationship between board diversity and a firm’s long-term performance.
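The contrast between OLS and 2SLS can be illustrated with a minimal simulation. The sketch below is not the estimation used in this paper: the data are synthetic, there is a single regressor and a single instrument, and the names `z` (instrument), `x` (endogenous regressor), `u` (unobserved factor) and `true_beta` are hypothetical. It only shows that, when an unobserved factor drives both the regressor and the outcome, the OLS slope is biased while the two-stage estimate is close to the true coefficient.

```python
import random


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equally long lists."""
    mean_a, mean_b = mean(a), mean(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (len(a) - 1)


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x (with intercept)."""
    return covariance(x, y) / covariance(x, x)


def two_stage_slope(z, x, y):
    """2SLS with one instrument: regress x on z, then y on the fitted x."""
    first_stage = ols_slope(z, x)
    mean_z, mean_x = mean(z), mean(x)
    x_hat = [mean_x + first_stage * (zi - mean_z) for zi in z]
    return ols_slope(x_hat, y)


def simulate(n=20000, true_beta=0.5, seed=0):
    """Synthetic data in which the unobserved u affects both x and y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        u = rng.gauss(0, 1)            # unobserved factor (source of endogeneity)
        zi = rng.gauss(0, 1)           # instrument: moves x, unrelated to u
        xi = zi + u + rng.gauss(0, 1)  # endogenous regressor
        yi = true_beta * xi + 2.0 * u + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y


z, x, y = simulate()
beta_ols = ols_slope(x, y)
beta_2sls = two_stage_slope(z, x, y)
print(f"true beta = 0.5, OLS = {beta_ols:.3f}, 2SLS = {beta_2sls:.3f}")
```

In this toy setting the OLS slope absorbs the effect of `u` and overstates the coefficient, whereas the instrumented estimate does not. Whether a valid instrument exists for board diversity in real data is a separate question that the simulation does not address.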

In our tests, we measure a firm’s non-financial performance with reference to its corporate social responsibility (CSR) scores in five specific areas (KLD Stats 2010). These are Environment (ENV-), Employee Relations (EMP-), Corporate Governance (CGOV-), Community (COM-), and Diversity (DIV-). Similarly, we use three prevalent measures to study boardroom gender diversity’s impact on a firm’s financial performance. Consistent with prior research, we use Tobin’s Q, our first financial metric, to measure firm performance (Brown and Caylor 2006). Within the governance research literature, Tobin’s Q is frequently used because it helps capture the effect of firm-specific intangible assets such as good managers or effective boards (Morck et al. 1988, p. 296) and financial/market valuation variables, both of which are documented to be influenced by firm-level governance (Brown and Caylor 2004; Anderson and Gupta 2009). Additionally, Tobin’s Q provides a forward-looking perspective because it is affected by how investors perceive and respond to corporate business strategies and ensuing market events affecting the firm (Gupta et al. 2009). The second measure proxies firm performance using accounting metrics such as return on equity (ROE) (Zahra and Stanton 1988; Shrader et al. 1997; Adams and Ferreira 2002; Erhardt et al. 2003; Smith et al. 2006; Catalyst 2007; Rose 2007; Campbell and Minguez-Vera 2008; Miller and del Carmen Triana 2009; Carter et al. 2010). The third measure focuses on the “gold standard” by capturing firm performance through stock-return-based measures such as the cumulative annual stock returns (AASR) and cumulative market-adjusted annual stock returns (CMAASR). We use a 3-year and a 5-year lag for board gender diversity and board characteristics to measure the effect of board gender diversity on a firm’s long-term performance (see subsequent discussion under “Lagged Board Variables”).

Using observations from 2003 to 2012, we find that firms with gender-diverse boards tend to perform better on non-financial measures over the longer term, even after controlling for the simultaneous effects of board characteristics. However, using the same model for financial performance, our findings are mixed: no impact on the “gold standard” market measures, a positive impact on ROE, and a mixed impact on Tobin’s Q.

The remainder of the paper is organized into four sections. Section 2 reviews the relevant literature and develops hypotheses. Section 3 discusses the sample selection and research design. Section 4 discusses the empirical results of the main model and the sensitivity tests. Section 5 concludes the paper.

Background and hypotheses

In addition to arguments about fairness in the workplace, the regulatory imperatives cited above note that gender-diverse boards may lead to better firm performance because such boards “engage in diverse critical thinking around business decisions, creating a more proactive business model” (European Commission 2013). The EU Fact Sheet “Women on Boards: The Economic Arguments” (European Commission, undated) cites improved company performance, better quality decision-making, improved corporate governance and ethics, and better use of the talent pool as plausible economic arguments to promote gender diversity on public company boards.

Although there is increased rhetoric about the possible positive effects of gender diversity on corporate boards on a firm’s performance, no theory directly predicts how and why gender-diverse boards may affect a firm’s performance. However, Srinidhi et al. (2011) conceptualize that diversity in the boardroom improves corporate governance by bringing a broader perspective into the boardroom, which enriches discussion and in turn could improve the board’s collective decision-making.

Along the same line of arguments, we can trace the possible impact of boardroom diversity through board of directors’ fiduciary duties. In common law countries, corporate directors have two primary fiduciary duties: duty of care and duty of loyalty (Larcker and Tayan 2016). The duty of care involves setting corporate strategy, monitoring and evaluating senior managers, providing them with advice and direction on matters brought forth to the board, and ensuring that the company is in compliance with all laws and regulations (Mallin 2004; Monks and Minow 2004). Unlike the “duty of loyalty”, in cases involving breach of “duty of care”, courts have rarely held directors liable individually. Instead, the courts have held the entire board liable for failing to discharge its “duty of care” (Ibrahim 2008). Thus, the composition of the board, its make-up and any other factors that improve its collective decision-making become important determinants of a board’s functioning and performance that ultimately—through board decisions—may impact a firm’s performance.

Within the corporate governance literature, agency theory, the most commonly invoked perspective, has successfully modeled the conflict between a company’s management (or executive directors) and independent directors. Under the agency perspective (Jensen and Meckling 1976), corporate directors, acting as monitors, are classified as either insiders or outsiders. The inside (or executive) directors, constrained by their non-diversifiable human capital investment in the firm, have incentives to engage in opportunistic behavior and inflate the value of the firm to investors. On the other hand, the “outside” directors, being independent, have a need to protect their professional reputations, which leads them to become effective monitors of their firms. In other words, board independence is an important determinant of board monitoring effectiveness (Jensen and Meckling 1976). Therefore, whether board effectiveness can be improved by changing the composition of the board is an important question to study (Hermalin and Weisbach 1991). Thus, within the context of boardroom diversity, we ask whether gender-diverse boards exhibit greater independence and are therefore more effective monitors of management.

According to Carter et al. (2003), diversity may increase board independence because “people with a different gender, ethnicity, or cultural background might ask questions that would not come from directors with more traditional backgrounds. In other words, a more diverse board might be a more activist board because outside directors with nontraditional characteristics could be considered the ultimate outsider” (p. 37). However, board members with different perspectives can be marginalized, especially because boards of directors, to a significant extent, are an “endogenously determined institution” and “the CEO has incentives to “capture” the board” (Hermalin and Weisbach 2003, p. 7). Thus, boards may appear independent on the surface, but independence “in-fact” may be low, thereby negatively impacting firm performance. Prior research finds that gender-diverse boards act as “tougher monitors” and are better at mitigating agency conflicts (Adams and Ferreira 2009; Byoun et al. 2016).

In reviewing current research on board diversity and its influence on corporate performance, Broome and Krawiec (2008) note that the rationales for enhancing board diversity include addressing fairness concerns, accessing an untapped talent pool, reducing agency costs, possessing more and better information, engaging in constructive dissent, and signaling to observers of corporate behavior. Some have even suggested that “the excessive risk-taking and mistreatment of customers in the pre-2007 boom were caused by the overwhelming masculinity of the industry; some have asked whether the crisis might have been avoided if Lehman Brothers had been Lehman Sisters” (Studer and Dalsley 2014, p. 1).

According to Carter et al. (2010, p. 398), corporate directors also act as “insiders, business experts, support specialists, and community influential.” Thus, diverse board members, by virtue of their invariant traits, may provide their company’s management with unique information about the company’s external environment and its constituencies, helping it make better decisions. It is therefore plausible that a gender-diverse board will provide services that are more valuable to the firm, which in turn may result in better overall firm performance.

Building a case for gender diverse boards, Westphal and Milton (2000) argue that corporate directors with majority status tend to exert more influence in board decision-making because directors drawn from homogeneous groups tend to possess similar views about business problems and solutions that may render them less effective in responding to changes in a firm’s business environment. Thus, making corporate boards more gender diverse will lead to “improved performance by facilitating idea generation and bringing in multiple perspectives for problem solving and strategy formulation” (Orlando et al. 2007, p. 1215). On the other hand, “heterogeneity of belief structures, priorities, information, and ideas that result from diversity lead to conflicts, lower cohesion, slower decision-making and overall lower [firm] performance” (Orlando et al. 2007, p. 1215).

As noted earlier, the academic literature on board gender diversity and firm performance has documented both a positive and a negative impact of boardroom gender diversity on firm performance. For example, Burke (2000) finds significant correlation coefficients between the number of female directors and revenue, assets, number of employees and profit margins for Canadian firms. However, the study cautions about reverse causality, suggesting that profitable firms may be more amenable to appointing female directors to their boards. Carter et al. (2003) examine the relationship between board diversity and firm value for Fortune 1000 firms. Board diversity is defined as gender and ethnic diversity among the corporate directors on a board. After controlling for size, industry, and other corporate governance measures, they find a significant positive relationship between the fraction of women on the board and firm value. Greene et al. (2020) find the same positive relationship between women on the board and firm value, this time using a sample of public corporations listed on major US stock exchanges with their headquarters in California and thus subject to California’s SB 826 law (Footnote 3). Erhardt et al. (2003) examine the relationship between gender and racial diversity on boards of directors and firm financial performance, measured by return on assets and return on investment. Using observations on 127 large US companies from two separate years, 1993 and 1998, they find that board diversity is positively associated with the financial indicators of firm performance.

Campbell and Mínguez-Vera (2008), using a panel of Spanish firms, investigate the link between the gender diversity of the board and firm financial performance. They find that gender diversity, as measured by the percentage of women on the board and by the Blau and Shannon indices, has a positive effect on firm value and that the opposite causal relationship is not significant. Their results suggest that greater gender diversity may generate better economic gains.
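For readers unfamiliar with the two indices, the sketch below computes them from board head counts using their standard textbook definitions: the Blau index is one minus the sum of squared group shares, and the Shannon index is the negative sum of each share times its natural logarithm. The function names and the example board are hypothetical, and the sketch makes no claim about the exact specification used in the cited study.

```python
import math


def group_shares(counts):
    """Convert head counts per group (e.g., [women, men]) into proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("board must have at least one member")
    return [count / total for count in counts]


def blau_index(counts):
    """Blau index: 1 - sum(p_i^2). Equals 0 for a homogeneous board."""
    return 1.0 - sum(p * p for p in group_shares(counts))


def shannon_index(counts):
    """Shannon index: -sum(p_i * ln p_i), skipping empty groups."""
    return -sum(p * math.log(p) for p in group_shares(counts) if p > 0)


# Hypothetical ten-member board with two women and eight men.
board = [2, 8]
print(f"Blau = {blau_index(board):.3f}, Shannon = {shannon_index(board):.3f}")
```

With two groups, both indices reach their maximum when the board is evenly split (0.5 for Blau, ln 2 for Shannon), so they reward balance rather than the share of women alone.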

There are also studies in the same literature stream that report negative results. For example, contrary to their expectations, Shrader et al. (1997) find some evidence of a significant negative relation between the percentage of female board members and accounting measures of performance for a sample of Fortune 500 companies in 1993. Shrader et al. (1997) argue that it may be necessary for a firm to achieve a critical number of female board members before they can exert any positive influence, because few firms in their sample had more than one female board member. Farrell and Hersch (2005) use a one-year lagged-variable design and find insignificant abnormal returns on the announcement of a woman added to the board, although a higher ratio of women on the board is associated with better financial performance. Adams and Ferreira (2009) find that, in a sample of US firms, female directors have better attendance records and fewer attendance problems than their male counterparts. Moreover, they also find that the more gender-diverse the board, the more likely women are to join monitoring committees, suggesting that women on gender-diverse boards allocate more effort to monitoring. However, the average effect of gender diversity on firm performance is negative, which is driven by companies with fewer takeover defenses. Their results suggest that mandating gender quotas for directors may reduce firm value for well-governed firms.

Rose (2007) uses a sample of listed Danish firms during the period 1998–2001 to study diversity and performance. Although Denmark is a leader in the liberalization of women, Danish boardrooms are still to a large extent dominated by men. The findings show no significant link between firm performance as measured by Tobin’s Q and female board representation or the proportion of foreigners on the board. This may suggest that board members with an unconventional background are unconsciously socialized into adopting the ideas of the majority of conventional board members, which makes potential performance effects, if any, insignificant. For a sample of US firms, Carter et al. (2010) use a lagged-variable design (both one-year and two-year lags) to examine the relationship between the number of women directors on a board and on important board committees and financial performance measures such as return on assets and Tobin’s Q. They do not find any significant relationship between the gender diversity of boards, or of board committees, and firm financial performance.

Based on the above review and prior mixed empirical results, we propose the following non-directional hypotheses, in their alternate forms, for board-gender diversity and their impact on a firm’s long-term non-financial and financial performance:

H1a

Board gender-diversity affects the long-term non-financial performance of a firm.

H1b

Board gender-diversity affects the long-term financial performance of a firm.

Research design

Sample selection

Our sample includes US publicly listed companies for the 10-year period 2003–2012. We collected the female directorship data from the Corporate Library database. The corporate social responsibility (CSR) data was gathered from the KLD STATS database. We collected the financial data from the Compustat database and the stock return and related beta data from CRSP.

The KLD STATS database, maintained by the RiskMetrics Group (Footnote 4), provides annual snapshots of a firm’s environmental, social and governance (ESG) (Footnote 5) performance. It provides a binary summary of strength (positive) and concern (negative) ESG ratings for each company on all Qualitative Issue Areas. The Environmental (ENV-) qualitative issue area measures a firm’s strengths and concerns relating to 15 different items such as pollution prevention, emission, recycling, use of clean energy, environmental impact of its product and services, environmental management and regulatory compliance systems. The Employee Relations (EMP-) area measures a firm’s strengths and concerns relating to 12 different items such as relations with its unions, health and safety issues, and extent of employee involvement in managerial decision-making and in profit sharing. The Community (COM-) area measures a firm’s strengths and weaknesses relating to 13 different items such as charitable giving, participation in volunteer programs, tax disputes and engagement with community. The Diversity (DIV-) area measures a firm’s strengths and weaknesses in 11 different items such as policies supporting women, minorities, gays, lesbians and other underrepresented groups, gender and ethnically diverse boards, and the share of contracts awarded to women and minorities. The Corporate Governance (CGOV-) area measures 11 different items such as transparency in political involvement, appropriateness of directors’ and officers’ compensation, and reporting on social and environmental measures.

Table 1 summarizes the sample selection. We begin with the Corporate Library database because it contains the key directorship data needed for our study. The database contains 1,757 of the largest firms listed in the United States as of 2003. We obtain 14,982 firm-year observations after excluding firms without matching CSR information, firms from the financial, insurance, and regulated utility industries (standard industrial classification codes 6000–6999 and 4900–4999, respectively), and firms without at least 5 years of directorship data. The financial and utility firms are excluded because of data compatibility issues. The final data set for the primary analysis consists of an unbalanced panel of 14,903 firm-years. We analyze the impact on a firm’s long-term financial and non-financial performance using five-year and three-year lagged data on board gender diversity and other board characteristics. Thus, in our model, current-year firm performance is a function of lagged board data. Our sample sizes for the analysis of three-year and five-year lagged data are 11,654 and 8,470 firm-year observations, respectively.
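The screening steps described above can be sketched as a simple filter over firm-year records. This is an illustration only: the record layout (the `firm`, `year`, `sic` and `has_csr` fields), the function names and the toy data are hypothetical and do not reproduce the actual database extracts or the order in which the screens were applied.

```python
from collections import defaultdict


def is_excluded_industry(sic):
    """Financial/insurance (6000-6999) and regulated utility (4900-4999) codes."""
    return 6000 <= sic <= 6999 or 4900 <= sic <= 4999


def select_sample(rows, min_years=5):
    """Keep firm-years with CSR data, outside excluded industries, and
    belonging to firms with at least `min_years` years of directorship data."""
    years_by_firm = defaultdict(set)
    for row in rows:
        years_by_firm[row["firm"]].add(row["year"])

    kept = []
    for row in rows:
        if not row["has_csr"]:
            continue
        if is_excluded_industry(row["sic"]):
            continue
        if len(years_by_firm[row["firm"]]) < min_years:
            continue
        kept.append(row)
    return kept


# Toy data: A passes every screen, B is a bank, C has only three years of data.
rows = []
for year in range(2003, 2009):
    rows.append({"firm": "A", "year": year, "sic": 3571, "has_csr": True})
    rows.append({"firm": "B", "year": year, "sic": 6021, "has_csr": True})
for year in range(2003, 2006):
    rows.append({"firm": "C", "year": year, "sic": 2834, "has_csr": True})

sample = select_sample(rows)
print(len(sample), sorted({row["firm"] for row in sample}))
```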

Table 1 Sample selection

Empirical model

Since we examine how gender diverse boards impact a firm’s long-term non-financial and financial performance, we utilize the following primary empirical model:

$$ \begin{aligned} & {\text{PERFMEAS}} = a + b_{1} {\text{PWDIR}} + b_{2} {\text{LTA}} + b_{3} {\text{BETA}} + b_{4} {\text{GRW}} + b_{5} {\text{BSIZE}} \\ & \quad + \, b_{6} {\text{PINDDIR}} + b_{7} {\text{DUAL}} + b_{8} {\text{ROA}} + b_{j} {\text{FIXED}}\;{\text{EFFECTS}} + e \\ \end{aligned} $$

where

PERFMEAS = Firm performance measures, which include financial and non-financial measures. Non-financial measures are a firm’s Corporate Social Responsibility (CSR) scores in the following areas: Environment (ENV-), Employee Relations (EMP-), Corporate Governance (CGOV-), Community (COM-), and Diversity (DIV-). We also use the Overall Corporate Social Responsibility Score for a firm as reported by the KLD STATS database (CSR). Financial measures include an accounting measure, return on equity (ROE), and market measures, namely Tobin’s Q, cumulative annual stock returns (AASR) and cumulative market-adjusted annual stock returns (CMAASR).

  • PWDIR = Percentage of women directors on the board;

  • LTA = Log of total assets;

  • BETA = A firm’s systematic risk;

  • GRW = Sales growth;

  • BSIZE = Number of total board members;

  • PINDDIR = Percentage of directors on board classified as independent;

  • DUAL = Dummy variable coded as 1 if CEO is also the chair of the board and 0 otherwise;

  • ROA = Return on assets;

  • FIXED EFFECTS = Industry fixed effects and year fixed effects. Industry fixed effects are based on the first digit of the SIC codes. The year fixed effects cover 2003 to 2012.
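Because current-year performance is modelled as a function of lagged board data, the model can be restated with explicit firm (i) and year (t) subscripts. The version below is a sketch under stated assumptions rather than the estimated specification: the subscripts, the lag length k, and the choice to lag only the board variables (PWDIR, BSIZE, PINDDIR, DUAL) while leaving the firm-level controls contemporaneous are assumptions made here for illustration.

$$ \begin{aligned} & {\text{PERFMEAS}}_{i,t} = a + b_{1} {\text{PWDIR}}_{i,t-k} + b_{2} {\text{LTA}}_{i,t} + b_{3} {\text{BETA}}_{i,t} + b_{4} {\text{GRW}}_{i,t} + b_{5} {\text{BSIZE}}_{i,t-k} \\ & \quad + \, b_{6} {\text{PINDDIR}}_{i,t-k} + b_{7} {\text{DUAL}}_{i,t-k} + b_{8} {\text{ROA}}_{i,t} + b_{j} {\text{FIXED}}\;{\text{EFFECTS}} + e_{i,t} , \quad k \in \{3, 5\} \\ \end{aligned} $$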

Lagged board variables

We use lagged variables for a firm’s board gender diversity and board characteristics because the effect of board gender diversity on financial and non-financial performance may occur over time. Prior studies (Carter et al. 2010) on board diversity have used lagged variables in their empirical tests, although there is no underlying theory that predicts the length of time for such an impact to occur. For example, Farrell and Hersch (2005) use a one-year lag and Carter et al. (2010) use both one-year and two-year lags. In this paper, we use three-year and five-year lagged values of board gender diversity and board characteristics as alternative measures because we seek to examine the long-term effect of board gender diversity on financial and non-financial performance.
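A minimal sketch of how such a lagged panel might be assembled is shown below. It pairs each firm's performance in year t with that firm's board variables from year t minus the lag and drops firm-years for which the lagged board record is unavailable. The data layout, the function name and the toy values are hypothetical; the `L` prefix simply mirrors the naming of lagged variables such as LPWDIR.

```python
def build_lagged_rows(performance, board, lag):
    """Match performance in year t with board variables from year t - lag.

    performance: {(firm, year): value}
    board:       {(firm, year): {"PWDIR": ..., "BSIZE": ..., ...}}
    """
    rows = []
    for (firm, year), value in sorted(performance.items()):
        lagged = board.get((firm, year - lag))
        if lagged is None:
            continue  # no board record that far back, so the firm-year is dropped
        row = {"firm": firm, "year": year, "PERFMEAS": value}
        row.update({"L" + name: val for name, val in lagged.items()})
        rows.append(row)
    return rows


# Toy firm observed every year from 2003 to 2012.
years = range(2003, 2013)
performance = {("A", y): 0.10 + 0.01 * (y - 2003) for y in years}
board = {("A", y): {"PWDIR": 0.08 + 0.005 * (y - 2003), "BSIZE": 9} for y in years}

rows_5 = build_lagged_rows(performance, board, lag=5)
rows_3 = build_lagged_rows(performance, board, lag=3)
print(len(rows_5), len(rows_3))
```

The longer the lag, the fewer usable firm-years remain, which is why a five-year lagged sample is necessarily smaller than a three-year lagged sample drawn from the same panel.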

Empirical results and discussion

Table 2 reports the descriptive statistics of the variables used in this study. For our sample, the average percentage of female directors (PWDIR) is around 9.60%, suggesting that about one in ten corporate directors in our sample is a woman. A recently released analysis of women in S&P 500 companies by Catalyst (2016a, b) reports that female directors hold 19.9% of the board seats. If the high-profile S&P 500 US companies report such a low percentage of board seats occupied by women, it is reasonable for our sample, which includes companies beyond the S&P 500, to report an even lower average share of directorships held by women. These descriptive statistics indicate that gender diversity on the corporate boards of our sample firms remained low over the 10-year period of 2003–2012.

Table 2 Sample descriptive statistics

For the accounting performance measure for our sample, the average ROE is 18.01% during the ten-year sample period ending in 2012. For the stock-return-based performance measures, the average cumulative market-adjusted annual stock return is 8.79%, while the average of cumulative annual return is 17.58%. In addition, the average Tobin’s Q is about 2.051. For control variables, the average ROA is 7.57%, the log of total assets (LTA) has an average of 3.046, the average beta (BETA) is about 1.20, and the average sales growth (GRW) is 12.26%.

The overall CSR rating scores of the firms in our sample vary with negative values (concerns) to positive values (strengths). While the mean ratings of environment CSR component (0.0401), community CSR component (0.0712) and diversity CSR component (0.0172) are positive, the mean ratings of the employee relation CSR component (− 0.1656) and the corporate governance CSR component (− 0.2829) are negative. This suggests that, on average, firms in our sample lag behind in performance on the corporate governance CSR and employee relation CSR. Net rating of a firm can be negative only if the KLD STATS analysts find more concerns than strengths.
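The sign logic of a net rating can be made concrete with a short sketch. It assumes that the net rating for an issue area is the number of flagged strengths minus the number of flagged concerns, each recorded as a 0/1 indicator; this reading is consistent with the description above but is an assumption here, and the function name and example flags are hypothetical.

```python
def net_rating(strengths, concerns):
    """Net rating for one issue area: count of strengths minus count of concerns.

    Both arguments are lists of binary (0/1) indicators.
    """
    for flag in list(strengths) + list(concerns):
        if flag not in (0, 1):
            raise ValueError("indicators must be 0 or 1")
    return sum(strengths) - sum(concerns)


# Hypothetical firm: two environmental strengths and one concern,
# no employee-relations strengths and two concerns.
env_net = net_rating(strengths=[1, 0, 1], concerns=[1, 0, 0, 0])
emp_net = net_rating(strengths=[0, 0], concerns=[1, 1])
print(env_net, emp_net)
```

Under this assumption, a negative sample mean for an area indicates that flagged concerns outnumber flagged strengths on average.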

Finally, on the governance dimension, our sample firms have an average board size of 8.6 members, ranging from a low of four members to a high of 16. The average percentage of independent directors is about 70%, and about 41% of our sample firms exhibit CEO duality, meaning that the CEO and the board chair are the same person.

Table 3 presents the Pearson correlations for all variables (Footnote 6). Overall, the percentage of female directors on a company’s board is significantly correlated with all variables in our study except Tobin’s Q (r = −0.02), for which the correlation is insignificant. Looking at the financial measures, the percentage of female directors on a company’s board is significantly and positively correlated with the accounting measure, ROE (r = 0.08). However, it is significantly and negatively correlated with the two market-based measures: CMAASR (r = −0.05) and AASR (r = −0.04).

Table 3 Pearson/Spearman correlation matrix

Another interesting observation is that firm size as measured by log of total assets (LTA) is highly correlated with the percentage of female directors (r = 0.29) but significantly and negatively correlated with both BETA (r =  − 0.07) and sales growth (GRW) (r =  − 0.08). It appears that boards with female directors (PWDIR) are more likely to be larger in size (BSIZE, r = 0.31), and more independent (PINDDIR, r = 0.16).

The CSR scores (overall as well as individual) are generally significantly and positively correlated (r ranges from 0.05 to 0.50) with PWDIR, except for a negative correlation (r = −0.08) between PWDIR and governance (CGOV). Additionally, the CSR scores (overall as well as individual) are mostly significantly and positively correlated with one another, suggesting that firms doing well in one CSR dimension usually have better scores on other CSR dimensions. Moreover, firms with better CSR scores (taking the overall CSR score as an example) are generally larger based on total assets (LTA, r = 0.23), have a larger board (BSIZE, r = 0.22), and have more independent directors (PINDDIR, r = 0.05).

In summary, firms with higher CSR ratings tend to have more assets and larger, more gender-diverse and more independent boards. Overall, our findings reinforce the findings in Carter et al. (2003) that the proportion of women directors on boards increases with firm size and the proportion of outside directors. Our results also partially validate the findings in Farrell and Hersch (2005) that female directors are more likely to be found in high-performing firms, as the correlation between ROE and PWDIR is positive and significant (Footnote 7).

Table 4 reports the OLS regression results on the effects of lagged board gender diversity on the non-financial performance variables (i.e., CSR, ENV-, EMP-, CGOV-, COM-, and DIV-) (Footnote 8). We find that board gender diversity significantly affects a firm’s non-financial performance as measured by the CSR ratings, as indicated by the positive and significant coefficients of LPWDIR in all the models except the one using CGOV- as the dependent variable.

Table 4 Regression analysis of board diversity on long-term non-financial measures

For example, using the 5-year lagged values for board gender diversity and board characteristics, the results show that firms with more gender-diverse boards have better environment scores (LPWDIR, t-statistic = 8.36), better employee relations scores (LPWDIR, t-statistic = 4.04), better community relations scores (LPWDIR, t-statistic = 7.18), and better diversity scores (LPWDIR, t-statistic = 35.97). However, more gender-diverse boards are not associated with better corporate governance scores, because for our sample firms the coefficient is not significant (LPWDIR, t-statistic = −0.25).

To draw broader conclusions, we also examine the impact of female directors on the “overall” measure of CSR performance (CSR), for which the coefficient is positive and significant. In these OLS regressions, we find that female directors on a board positively impact their firm’s long-term non-financial performance (e.g., for the 5-year lag with CSR as the dependent variable, LPWDIR has a t-statistic of 21.87). The 3-year lag findings are consistent with the 5-year lag findings for the various categories of CSR (e.g., for CSR, LPWDIR has a t-statistic of 28.28).

We also find that larger firms (LTA) are more likely to score better on all CSR dimensions except the governance category (CGOV-), possibly because they have more financial resources than smaller firms to address issues in these areas. It is also plausible that larger firms, across both lag periods, do not score positively on the governance category because they are more in the crosshairs of corporate governance watchdogs than their smaller counterparts. For our sample firms, higher systematic risk is associated with worse long-term performance on all CSR categories except governance, on which beta has no effect. This is consistent with prior studies finding that greater board diversity is associated with lower realized firm risk (Bernile et al. 2018) and more risk disclosures (Bravo 2018). Overall, firm size is positively associated with CSR scores and firm risk is negatively associated with them, irrespective of whether long-term performance is measured with a 5-year lag or a 3-year lag.

Overall, our OLS findings support the view that gender diversity increases board independence because female board members might ask questions that would not come from male directors. The positive evidence on the link between board gender diversity and firm nonfinancial performance is also consistent with the notion that gender diversity produces better performance by encouraging innovative ideas and distilling multiple perspectives before making decisions.

Table 5 reports the OLS regression results on the effect of board diversity on the accounting performance measure, ROE. We report the results using 5-year lags and 3-year lags to proxy for the long term. Regardless of the lag length, women directors (LPWDIR) have a positive and significant effect on the ROE of companies with gender-diverse boards. The t-statistic for the 5-year lag variable (LPWDIR) is 3.34; for the 3-year lag, it is 1.75. These results suggest that a more gender-diverse board contributes to better ROE.
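To help read t-statistics of this size, the following sketch converts a t-statistic into an approximate two-sided p-value. It assumes the sample is large enough for the standard normal distribution to approximate the t distribution; that assumption and the function name `two_sided_p` are ours and are not taken from the estimation itself.

```python
from statistics import NormalDist


def two_sided_p(t_stat: float) -> float:
    """Approximate two-sided p-value using the standard normal distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


# t-statistics reported for LPWDIR in the ROE regressions.
for label, t in [("5-year lag", 3.34), ("3-year lag", 1.75)]:
    print(f"{label}: t = {t:.2f}, approximate p = {two_sided_p(t):.4f}")
```

Under this approximation, a t-statistic of 1.75 exceeds the 10% two-sided critical value (about 1.645) but not the 5% critical value (about 1.96), whereas 3.34 exceeds the 1% critical value (about 2.576).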

Table 5 Regression analysis of board diversity on long-term financial measures

Larger firms (LTA) tend to obtain a higher ROE (t-statistics of 14.47 for the 5-year lag and 17.78 for the 3-year lag), while firms with high systematic risk (BETA) are associated with lower ROE (t-statistics of −4.40 for the 5-year lag and −4.06 for the 3-year lag). CEO duality (LDUAL), where the same person serves as CEO and board chair, has a significant positive effect on a firm’s long-term financial performance (5-year lag t-statistics of 2.77 and 3-year lag t-statistics of 2.37).

Table 5 also reports the OLS regression results of board diversity on Tobin’s Q. The results show that female directors (LPWDIR) have positive and significant effects on a firm’s Tobin’s Q regardless of whether we use 5-year lags or 3-year lags. For LPWDIR, the t-statistics for the 5-year and 3-year lags are 2.00 and 4.34, respectively. This finding suggests that companies with a more gender-diverse board do perform better over the long term as measured by Tobin’s Q.

Our findings extend Carter et al. (2003), who provided the first empirical evidence suggesting that female directors do exert a positive impact on a firm’s value as measured by Tobin’s Q. As in our study, they also find that the fraction of female directors on the board increases with firm size. However, their study, using a 2SLS, focused only on firm value and used only 1-year data on 697 Fortune 1000 firms.

Although not directly related, our results on female directors also complement the results reported by Srinidhi et al. (2011) which suggest that firms with female directors exhibit higher earnings quality. Our findings support the literature that documents that a higher degree of board gender diversity may influence firm performance positively—suggesting that commitment to board gender diversity helps a firm attract qualified applicants outside the normal echelons to increase the competitive advantage (e.g., Campbell and Minguez-Vera 2008; Carter et al. 2010; Rhode and Packel 2014).

In addition, Table 5 reports the OLS regression results of board diversity on stock returns, where AASR is the cumulative annual stock return for the year, and CMAASR is the cumulative market-adjusted annual stock return. The OLS regression results show that board gender diversity may have a short-lived negative effect on a firm’s stock returns: LPWDIR with 5-year lags has no effect on a firm’s stock returns, while LPWDIR with 3-year lags has a negative and significant effect (AASR t-statistics = −2.13 and CMAASR t-statistics = −2.22). Our findings complement the results obtained by Farrell and Hersch (2005). Although their primary goal was to “assess the extent to which gender impacts the selection of a director to serve on the board” (p. 86), they also investigated whether adding female directors to a board impacts firm value in any way. They find no significant “abnormal returns on the announcement date of a woman added to the board” (p. 104).

Sensitivity analyses

We conducted several sensitivity analyses to test the robustness of our primary findings. First, to address the potential endogeneity issue, we use lagged values of the gender and ethnic diversity variables in the tests described above. To further strengthen this analysis, we use the two-stage least squares (2SLS) method. In the first stage, we use the lagged explanatory variables (ROA, LTA, BETA, GRW, BSIZE, DUAL, and PINDDIR) and instrumental variables for L3PWDIR or L5PWDIR, assuming these variables are all endogenous. The first instrumental variable is an indicator variable for male-dominated industries (MD_IND), which is coded as one if a firm is from industries such as agriculture, fishing, logging, mining, construction and utilities, and zero otherwise.Footnote 9 A Catalyst (2015) study notes that these industries are mostly male-dominated and that female representation in them is relatively low. The second instrumental variable is BCON, a board connection proxy that measures the number of directorships held by directors. A similar version of the BCON variable has been used in Adams and Ferreira (2009).Footnote 10
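The mechanics of the two-stage procedure can be sketched as follows. This is a minimal illustration on simulated data, not our estimation code: the data-generating process, the single control variable, the true effect of 2.0, and the helper names (`ols`, `predict`, `two_stage_least_squares`) are all hypothetical. The instruments are named after MD_IND and BCON only for readability.

```python
import random
from typing import List

Matrix = List[List[float]]


def ols(X: Matrix, y: List[float]) -> List[float]:
    """OLS coefficients from the normal equations (X'X)b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(X: Matrix, b: List[float]) -> List[float]:
    return [sum(x * c for x, c in zip(row, b)) for row in X]


def two_stage_least_squares(y: List[float], endog: List[float],
                            exog: Matrix, instruments: Matrix) -> List[float]:
    """Return second-stage coefficients; the first is on the endogenous regressor."""
    # Stage 1: regress the endogenous regressor on controls and instruments.
    Z = [e + z for e, z in zip(exog, instruments)]
    fitted = predict(Z, ols(Z, endog))
    # Stage 2: regress the outcome on the fitted values and the controls.
    return ols([[f] + e for f, e in zip(fitted, exog)], y)


# Simulated data (hypothetical): an unobserved factor u drives both board
# diversity and the outcome, so plain OLS is biased.
rng = random.Random(0)
true_effect = 2.0
y, endog, exog, instruments = [], [], [], []
for _ in range(5000):
    md_ind = 1.0 if rng.random() < 0.3 else 0.0  # male-dominated industry dummy
    bcon = rng.gauss(0, 1)                       # board connection proxy
    size = rng.gauss(0, 1)                       # a single control variable
    u = rng.gauss(0, 1)                          # unobserved confounder
    pwdir = 0.5 - 0.8 * md_ind + 0.5 * bcon + 0.3 * size + u + rng.gauss(0, 1)
    outcome = 1.0 + true_effect * pwdir + 0.5 * size - 1.5 * u + rng.gauss(0, 1)
    y.append(outcome)
    endog.append(pwdir)
    exog.append([1.0, size])                     # constant plus control
    instruments.append([md_ind, bcon])

beta_ols = ols([[p] + e for p, e in zip(endog, exog)], y)[0]
beta_2sls = two_stage_least_squares(y, endog, exog, instruments)[0]
print(f"OLS estimate: {beta_ols:.3f}, 2SLS estimate: {beta_2sls:.3f}")
```

The sketch recovers point estimates only. Standard errors taken from a manually run second stage are not valid, so in practice a dedicated 2SLS routine should be used for inference.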

In Tables 6 and 7, Panels A and B, we report the first-stage and second-stage regression results with all 3-year lag values for board characteristics and board diversity, respectively. In Panel A, MD_IND has a significant and negative coefficient in the first stage model with L3PWDIR as dependent variable (t-statistics =  − 15.434), suggesting that female directors are less likely to be found in male-dominated industries.

Table 6 Panel A: two-stage least squares regression analysis of board diversity on long-term non-financial and financial measures: first-stage regression analysis with 3-year lag variables
Table 7 Panel B: two-stage least squares regression analysis of board diversity on long-term non-financial and financial measures: second-stage regression analysis 3-year lag variables

In Panel B, the second-stage model, L3PWDIR has a positive and significant coefficient, indicating a significant effect of women directors on a firm’s non-financial performance (t-statistics = 4.26 for the overall CSR score) for the 3-year lag. Likewise, all components of the CSR are significant and positive, suggesting that companies with female representation on their boards do perform better on non-financial measures. This is in line with Liu (2018), who finds that gender diversity helps reduce firms’ environmental violations.

However, our findings are less encouraging when it comes to the impact of female participation on company boards on the financial performance of the firm. We find a positive and significant effect only on a firm’s ROE (t-statistics = 1.78). With regard to Tobin’s Q, as reported earlier, our OLS results were significant and positive regardless of the time lag. However, in the 2SLS analyses (second stage), the impact of female directors (L3PWDIR) on their firm’s Tobin’s Q turns negative. This is a surprising finding and can only be attributed to the fact that, on average, women hold less than 10% of board seats in our sample. Likewise, female directors have no impact on their company’s cumulative annual stock returns (AASR) and cumulative market-adjusted annual stock returns (CMAASR). These findings are consistent with the OLS results discussed earlier in the paper. In other words, gender-diverse boards fail to pass the “gold standard” test of market measures of a firm’s performance. This could be interpreted to mean that the market does not differentiate between companies with and without diverse boards.

In Tables 8 and 9, Panels A and B, we report the first stage and second stage results of 2SLS regressions with five-year lags for board gender diversity and board characteristics, respectively. The results are largely similar to what we find for the three-year lag with one exception. When we compare the 3-year lag results with the 5-year lag results for non-financial performance measures, we find that the significant result on the Employee CSR (EMP) dimension becomes insignificant (t-statistics = 1.53) for the 5-year lag. This finding may suggest that the impact of board gender diversity on the Employee Relation component of the CSR may change over time.

Table 8 Panel A: two-stage least squares regression analysis of board diversity on long-term non-financial and financial measures: first-stage regression analysis with 5-year lag variables
Table 9 Panel B: two-stage least squares regression analysis of board diversity on long-term non-financial and financial measures: second-stage regression analysis 5-year lag variables

However, when it comes to assessing the impact of board gender diversity on financial performance over a 3-year lag vs. a 5-year lag, there is much less variability in our findings. ROE, the accounting measure of a firm’s financial performance, remains positive and significant (t-statistics = 3.61) even with a 5-year lag. Tobin’s Q remains significant and negative (t-statistics = −6.80), and once again there is no impact whatsoever of a gender-diverse board on the “gold standard” market measures: cumulative annual stock returns (AASR) and cumulative market-adjusted annual stock returns (CMAASR).

Second, instead of using the scores assigned by KLD STATS (KLD scores), we use the adjusted scores by normalizing the KLD scores with the maximum score that a firm can attain. We compute such adjusted scores for all categories of CSR measures and for the overall CSR scores. The results do not differ much from what is reported earlier in this paper.
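One plausible reading of this adjustment can be sketched as follows: each raw score is divided by the maximum score attainable in its category, so that categories with different numbers of rated items are placed on a common scale. The category maxima and raw scores below are hypothetical, and `adjusted_score` is our illustrative name rather than a KLD STATS field.

```python
from typing import Dict


def adjusted_score(raw: float, max_attainable: float) -> float:
    """Scale a raw score by the maximum score attainable in its category."""
    if max_attainable <= 0:
        raise ValueError("max_attainable must be positive")
    return raw / max_attainable


# Hypothetical raw scores and category maxima for one firm-year.
raw_scores: Dict[str, float] = {"ENV": 2, "EMP": -1, "COM": 3}
max_scores: Dict[str, float] = {"ENV": 5, "EMP": 4, "COM": 6}

adjusted = {cat: adjusted_score(raw_scores[cat], max_scores[cat])
            for cat in raw_scores}
print(adjusted)
```

Scaling of this kind changes the units of the coefficients but not the sign of the underlying scores, which is consistent with results that do not differ much from the unadjusted ones.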

Third, we have presented only the net scores (strengths minus concerns) as the key non-financial performance indicators. However, it would be interesting to learn whether the effect of gender diversity lies in fostering strengths or in minimizing concerns. Thus, we study the effect of gender diversity on strengths and concerns separately. Our dependent variables are, respectively, the strengths score and the concerns score in the five areas of interest (community activities, diversity, corporate governance, employee relations, and environmental record). For most of our measures, we find that our diversity measure has significant effects on non-financial performance regardless of whether the dependent variable is a strength or a concern. As shown in Table 10, the diversity measures have positive and significant effects on strengths and negative and significant effects on concerns, indicating that diversity affects both.Footnote 11

Table 10 Summary of coefficients and T-values of strengths and concerns of board diversity on long-term non-financial and financial measures partitioned by 3-year and 5-year lag variables

Conclusion

In this paper, we examine whether board gender diversity has any long-term impact on a firm’s non-financial and financial performance, given that during the last decade an increasing number of board seats have been allocated to female directors. We measure non-financial performance with reference to a firm’s corporate social responsibility score (CSR) and five of its components: Environment (ENV-), Employee Relations (EMP-), Corporate Governance (CGOV-), Community (COM-), and Diversity (DIV-). Likewise, we measure a firm’s financial performance using the accounting-based measure ROE and the market-based measures Tobin’s Q, Cumulative Annual Stock Return (AASR) and Cumulative Market-adjusted Annual Stock Return (CMAASR). We use 3-year lags and 5-year lags to proxy for long-term financial performance. Using observations for the 10-year period from 2003 to 2012, we find that more gender-diverse boards affect a firm’s non-financial performance more than its financial performance over the long term. We summarize the two-stage least squares regression results in Table 11 for comparison and contrast purposes.

Overall, based on the summary in Table 11, we conclude that board gender diversity does significantly impact a firm’s long-term non-financial performance, with one minor exception, more than it does a firm’s financial performance as measured by ROE and the market measures (Tobin’s Q, CMAASR and AASR).

Table 11 Summary of two-stage least squares regression analysis of board diversity on long-term non-financial and financial measures: second-stage regression analysis 3-year and 5-year lag variables

With regard to the impact on non-financial performance, we find that all five components of the CSR, as well as the overall CSR, are significant and positive for both 3-year and 5-year lags, with one minor exception: although the Employee Relations (EMP-) component is significant and positive for the 3-year lag, it becomes insignificant for the 5-year lag. Overall, these findings suggest that inclusion of women on corporate boards improves a company’s corporate social responsibility performance in the longer term.

In addition, the impact of women directors on the accounting measure of financial performance, ROE, is significant and positive for both 3-year and 5-year lags. However, the three market measures tell a somewhat different story. The effects on a company’s cumulative annual stock returns (AASR) and cumulative market-adjusted annual stock returns (CMAASR) are not significant for either the 3-year or the 5-year lag. These findings add to the literature in which studies such as Farrell and Hersch (2005) find insignificant abnormal returns on the announcement that a woman has been added to a board, and Lee and James (2007) find that investor reactions to the announcements of female CEOs are significantly more negative than those to the announcements of their male counterparts in top-executive positions. The negative or insignificant market perception of board gender diversity could arise because investors perceive corporate directors with non-traditional backgrounds as being marginalized in corporate boardrooms, and because of the activation of bias on the part of the institutional investors that now control a large portion of the shares of many public companies (Dobbins and Jung 2011). However, we find that Tobin’s Q is significant but negative for the 3-year as well as the 5-year lag, suggesting that board gender diversity in fact exerts a drag on a firm’s market performance.

Juxtaposing the two sets of findings (non-financial and financial), we infer that over a 3-year and a 5-year lag, the improvement in the non-financial measures of a firm’s performance comes at the cost of market measures of performance, even though there is a benefit to the accounting measure. Our findings also suggest that gender-diverse boards take a more “stakeholder” view of the corporation than the traditional “stockholder” view. These findings add to the prior literature documenting that board gender diversity has a positive influence on a firm’s short-term performance as measured by return on equity (ROE), return on sales, and return on invested capital (ROIC) (Catalyst 2007); return on assets (ROA) and return on investment (ROI) (Erhardt et al. 2003); and return on equity and Tobin’s Q (Adams and Ferreira 2002, 2009; Campbell and Minguez-Vera 2008; Carter et al. 2010).

Our study has its limitations. We focus only on the gender diversity of boards. Some related research studies also examine the knowledge, education, function, and experience diversity of organizations, management teams and boards, and suggest that diversity leads to a greater knowledge base, creativity and innovation, and therefore to a competitive advantage (Watson et al. 1993). For instance, Bantel (1993) investigates the relationship between the demographic nature of top management groups and strategic clarity in retail banks and demonstrates that greater education and functional background diversity in top management teams led to better strategic decision-making. Simons et al. (1999) report similar results in their study on executive diversity and suggest that both educational and cognitive diversity have positive effects on organizational performance. However, they also find that diversity in experience has a negative impact on return on investment and overall organizational performance due to informal communication among top teams. Hambrick et al. (1996) conducted a longitudinal study on the effects of diversity on top management team performance in 32 major US airlines, using diversity measures of functional, educational and tenure heterogeneity, and find that homogeneous top-management teams outperformed heterogeneous ones. Similar results are found by Li and Wahid (2018) and Clements et al. (2018) on board tenure diversity and monitoring effectiveness, and by Harjoto et al. (2019) on nationality and educational diversity and CSR performance. Future studies could expand this line of research to investigate whether the knowledge, education, function, and experience diversity of boards leads to a greater knowledge base, creativity and innovation, and therefore to better firm performance from both financial and non-financial perspectives.