Introduction

In the popular press, the representation of women on boards is heavily discussed (Holst and Schimeta 2011; Konrad and Kramer 2006). Male managers regularly hold the vast majority of board positions, and not only in Germany; compared to the increase in the overall percentage of women in the workforce during the last decades, the representation of female directors in the boardroom lags far behind (Farrell and Hersch 2005, p. 86).

Not surprisingly then, in many countries there has been pressure for governance reforms that may foster gender diversity in the boardroom. Norway was one of the first countries to impose a law, in 2003, requiring public-limited companies to fill at least 40 % of board positions with women by 2008 (Ahern and Dittmar 2012; Holst and Schimeta 2011, p. 7). Spain followed Norway’s example and enacted a law prescribing a 40 % quota of female board members by 2015 (Adams and Ferreira 2009, p. 292). While other European countries, e.g., the Netherlands or France, also imposed women quotas (Holst and Schimeta 2011, p. 11; Böhren and Ström 2010, p. 1282), Germany focuses on voluntary commitments. The so-called German Corporate Governance Code (2010), which asks firms to “comply or explain” with respect to its recommendations, states in article 5.4.1:

The Supervisory Board shall specify concrete objectives regarding its composition which … take into account the international activities of the enterprise … and diversity. These concrete objectives shall, in particular, stipulate an appropriate degree of female representation.

But with an average of less than 10 % women on the supervisory boards of the 30 largest and most actively traded companies listed on the Frankfurt stock exchange (DAX 30) (e.g., Holst and Schimeta 2011), female representation in the boardroom is still rather low.

While fostering female representation in the boardroom for ethical and social reasons is beyond dispute, the performance effects of increased female representation on the board are rather ambiguous: While some studies hint at a positive link between female representation in the boardroom and firm performance, others find no or even a negative link. In our paper, we add to the literature by postulating, based on critical mass theory, that the relation between gender diversity and firm performance is U-shaped and by providing a first empirical test of this supposition based on a hand-collected panel data set of 151 listed German firms for the time period 2000–2005.

The remainder of the paper is structured as follows. We first present a review of the recent literature on the performance effects of gender diversity, followed by a review and critique of critical mass theory as our basic theoretic point of reference. In the next section, we describe our data, variables, and methods. We then report our findings and analyses. In the final section, we conclude with a discussion of our results and our paper’s contribution.

Literature and Theoretic Starting Point

The Empirical Link Between Gender Diversity and Performance: A Literature Review

The empirical evidence on the link between female representation on the board and firm performance is controversial (for an overview of the literature see Table 1): While some studies find the relation between women on boards and firm performance to be positive, others provide evidence of a negative link, and still others do not find a link at all.

Table 1 Overview of the literature (chronological)

While some of the differences may be due to the data stemming from different countries (with differing board systems) and different time periods (Campbell and Mínguez-Vera 2010) or from the use of different performance measures and estimation methods (Campbell and Mínguez-Vera 2008, p. 441; Rhode and Packel 2010, p. 8), results may further be affected by studies being confronted with differing ratios of women on boards, i.e., there may be studies with overall rather low female representation and others with rather high female representation. If the link between gender diversity and performance were non-linear and, e.g., U-shaped, the first group of studies would most likely find the relation between gender diversity and performance to be negative, while the latter group would find it to be positive. By contrast, a study that covers boards with very low and very high female representation and that searches for a linear relation between gender diversity and performance would most likely find no link between the two.

Critical Mass Theory: A Review and Critique

In our study, we build on Kanter’s (1977a, b) seminal work concerning gender diversity in groups: critical mass theory. In her analysis of group interaction processes, Kanter constructs four different categories of groups according to their composition: uniform groups, skewed groups, tilted groups, and balanced groups:

  • Uniform groups are groups in which all members share the same (visible) characteristic. That is, with respect to gender, all members of the group are either male or female. Of course, also uniform groups develop their own differentiations, but with reference to salient external master statuses like gender, its members are similar (Kanter 1977a, p. 208).

  • Skewed groups are groups in which one dominant type (e.g., the males) controls the few (e.g., the females), and therefore also controls the group and its culture. The few are called “tokens.” Tokens are not treated as individuals, but as representatives for their category (Kanter 1977a, p. 208). Kanter suggests that a male dominated skewed group consists of up to 20 % women.

  • Tilted groups are groups with a less extreme distribution. Unlike in skewed groups, minority members can ally and influence the culture of the group. They do not stand for all of their kind, instead they represent a subgroup whose members are to be differentiated from each other in their skills and abilities (Kanter 1977a, p. 209). According to Kanter, a male-dominated tilted group consists of 20–40 % women.

  • In a so-called balanced group, majority and minority turn into potential subgroups where gender-based differences become less and less important. The focus turns to the different abilities and skills of men and women (Kanter 1977a, p. 209). A balanced group with respect to gender representation has 40–60 % women.

Concerning group interaction processes, Kanter regards skewed groups to be especially problematic: Either the tokens are in the focus or they are overlooked, and they may be subject to stereotyping (Kanter 1977a, p. 210). For women, there are different strategies to cope with a token status (Kanter 1977b, p. 968). Either they pretend that differences between women and men do not exist, or they hide their individual characteristics behind stereotypes (Kanter 1977a, p. 239). The incumbent men, too, will behave differently in skewed as opposed to uniform groups, leading skewed groups to be outperformed by uniform ones.

With an increase in their relative numbers from a skewed to a tilted or even a balanced group, women are more likely to be individually differentiated from each other. As a consequence they might then also bring in their different knowledge bases and perspectives. As is well documented in the literature, men and women differ in a whole range of respects: Women are more risk averse than men (e.g., Croson and Gneezy 2009; Niederle and Vesterlund 2007; Jianakoplos and Bernasek 1998), they are less aggressive in their choice of strategy, and more likely to invest in a sustainable way (Apesteguia et al. 2012; Charness and Gneezy 2012). Women may hence add value to a male-dominated boardroom by providing new perspectives and by asking different questions (Farrell and Hersch 2005, p. 87; Burgess and Tharenou 2002, p. 40; Burke 1997, p. 912). While in a skewed group, these new perspectives may either not be adequately expressed by the female tokens or not spotted by the dominant males, in tilted or balanced groups, the combination of female and male attributes will more likely allow for productive discussions and will hence positively affect group performance (Apesteguia et al. 2012; Konrad and Kramer 2006).

In sum, critical mass theory postulates that until a certain threshold or “critical mass” of women in a group is reached, the focus of the group members is not on the different abilities and skills that women bring into the group. As a consequence, skewed groups will have a lower performance than uniform or tilted and balanced groups. Tilted groups—i.e., groups where a critical mass of 20–40 % women has been reached—will outperform uniform and skewed groups.

Despite its popularity, critical mass theory has rarely been put to an empirical test. While studies on gender diversity often explicitly refer to Kanter (e.g., Tsui et al. 1992), they rarely directly test Kanter’s predictions on the performance of different group types. Among the few exceptions are Spangler et al. (1978) and Fenwick and Neal (2001). The latter provide empirical support for Kanter’s theory and find tilted groups in a student simulation study to outperform skewed and uniform ones, and Spangler et al. (1978) find achievements of women law students to be diminished in skewed as opposed to tilted student work groups. Both Spangler et al. (1978) and Fenwick and Neal (2001) are confined to simple mean comparisons and do not substantiate their results with the help of a multivariate analysis.

We not only add to the existing literature by testing Kanter’s predictions in a business context and by combining our univariate findings with a multivariate regression analysis but also explicitly address the fact that the “critical mass” in Kanter’s theory is exogenously (and rather arbitrarily) defined to lie in a range of 20–40 % women (for a corresponding criticism see Childs and Krook 2006, 2008, 2009; Celis et al. 2008; Grey 2006). Unlike the preceding literature, we attempt to endogenously determine the critical mass of women in the boardroom by regressing firm performance on gender diversity and including a quadratic term. Allowing for non-linearities, we expect to find a U-shaped link between gender diversity and performance. Finding such a U-shaped link would support Kanter’s theory of a critical mass, but at the same time highlight the need to endogenously determine the critical mass of women in the boardroom.

Methods

Sample

Our initial sample consists of all 160 companies listed in one of the German stock exchange indices DAX, MDAX, SDAX, and TecDAX on December 31, 2005. We exclude nine firms that were not of German legal form to make sure that all companies in the sample were subject to the same regulatory environment. Our sample hence consists of 151 companies which we observe over a 5-year period (2000–2005).

The board system in Germany is a two-tier system with the supervisory board appointing and supervising management (Dittmann et al. 2010, p. 41). Unlike in a one-tier board system, the main responsibility of the German supervisory board is to monitor, supervise, and appoint the management board, which in turn is responsible for firm operations. German supervisory boards comprise directors elected by shareholders and, depending on firm size, also employee representatives.

Variables and Data Sources

Concerning the dependent variable, similar to other studies that analyze the relation between women on boards and firm performance (e.g., Lindstaedt et al. 2011; Haslam et al. 2010; Shrader et al. 1997), we measure firm performance in terms of return on equity (ROE). The data on ROE are taken from Thomson Financial Datastream.

With respect to our central explanatory variable, gender diversity, we hand-collected data on board members’ gender from firms’ annual reports on the basis of board members’ first given names. We found none of the boards to be female dominated, i.e., there were no boards with more than 50 % women.

Following Kanter (1977a, b), we first created four dummy variables reflecting the different group types: uniform board (assuming the value “1” if a board has no woman; “0” otherwise), skewed board (assuming the value “1” if a board has at least one woman but less than 20 % women; “0” otherwise), tilted board (assuming the value “1” if the ratio of women in the boardroom is at least 20 %, but less than 40 %; “0” otherwise), and balanced board (assuming the value “1” if the ratio of women is at least 40 %; “0” otherwise).
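The classification rule behind these four dummies can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for the data set; the function names `kanter_board_type` and `board_type_dummies` are hypothetical, and the thresholds are the 20 % and 40 % cut-offs stated above.

```python
def kanter_board_type(n_women: int, board_size: int) -> str:
    """Classify a (male-dominated) board by its share of women.

    Thresholds follow the text: no woman -> uniform, below 20 % -> skewed,
    20 % up to (but excluding) 40 % -> tilted, 40 % or more -> balanced.
    Integer arithmetic avoids floating-point issues at the cut-offs.
    """
    if board_size <= 0 or not 0 <= n_women <= board_size:
        raise ValueError("need 0 <= n_women <= board_size and board_size > 0")
    if n_women == 0:
        return "uniform"
    if 100 * n_women < 20 * board_size:
        return "skewed"
    if 100 * n_women < 40 * board_size:
        return "tilted"
    return "balanced"


def board_type_dummies(n_women: int, board_size: int) -> dict:
    """Return the four 0/1 dummy variables for one board-year observation."""
    kind = kanter_board_type(n_women, board_size)
    return {name: int(name == kind)
            for name in ("uniform", "skewed", "tilted", "balanced")}


if __name__ == "__main__":
    # Hypothetical board: 3 women among 12 members (25 %).
    print(board_type_dummies(3, 12))
```

Exactly one of the four dummies equals 1 for any board, so one category has to be left out of a regression as the reference group.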

In search of an endogenous determination of the critical mass of women in the boardroom, we further calculated a measure of gender diversity. We used the so-called Blau index of diversity, one of the most widespread diversity measures for categorical variables (e.g., Bear et al. 2010; Webber and Donahue 2001; Hambrick et al. 1996; Magjuka and Baldwin 1991). Following Blau (1977), diversity of a group is given by

$$ H = 1 - \sum_{c = 1}^{k} s_{c}^{2}, $$

where k stands for the number of categories (i.e., k = 2 in the case of gender) and s_c is the fraction of supervisory board members with characteristic c (i.e., the fraction of female/male supervisory board members). Following Alexander et al. (1995), we standardize the index such that H = 0 signifies complete homogeneity (i.e., all board members are male) and H = 1 indicates complete heterogeneity (i.e., one half of all board members is female and the other is male). In order to account for potential non-linearities, the Blau index of gender heterogeneity enters our regression not only in its linear but also in its quadratic form.
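As an illustration, the index and its standardization can be computed as follows. The sketch assumes that standardization means dividing by the maximum attainable value, 1 − 1/k (0.5 for gender), which yields H = 1 for a half-female board as described above; the function names are hypothetical. The second function inverts the standardized index to recover the share of women on a male-dominated board, which is convenient when translating index values into percentages.

```python
import math


def blau_index(share_women: float, normalized: bool = True) -> float:
    """Blau index H = 1 - sum of squared category shares for two genders.

    With normalized=True the index is divided by its maximum (1 - 1/k),
    an assumption consistent with H = 1 for a 50/50 board.
    """
    if not 0.0 <= share_women <= 1.0:
        raise ValueError("share_women must lie in [0, 1]")
    shares = (share_women, 1.0 - share_women)
    h = 1.0 - sum(s * s for s in shares)
    k = len(shares)
    return h / (1.0 - 1.0 / k) if normalized else h


def share_from_normalized_blau(h: float) -> float:
    """Invert the normalized index for the minority share (at most 50 %).

    For two categories the normalized index equals 4p(1 - p), so
    p = (1 - sqrt(1 - h)) / 2.
    """
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must lie in [0, 1]")
    return (1.0 - math.sqrt(1.0 - h)) / 2.0


if __name__ == "__main__":
    for h in (0.4, 0.85):
        print(f"H = {h:.2f} -> share of women = {share_from_normalized_blau(h):.3f}")
```

Under this normalization, an index value of 0.4 corresponds to roughly 11 % women and a value of 0.85 to roughly 31 % women.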

As controls, besides year and industry dummies and in accordance with the literature (e.g., Lindstaedt et al. 2011; Bermig and Frick 2010), we include a firm’s market value as well as a dummy variable for the use of the German accounting standard HGB, as both are obviously apt to influence our dependent variable ROE. Further, and again in accordance with the literature, we control for a set of board-related variables: board size (Lückerath-Rovers 2011; Adams and Ferreira 2009; Farrell and Hersch 2005), codetermination (Lindstaedt et al. 2011; Oehmichen et al. 2010; Fauver and Fuerst 2006), and multiple directorships (e.g., Lindstaedt et al. 2011). Board size is measured by the number of members on the board and is potentially related to gender diversity in the boardroom. Codetermination is measured by a dummy variable that takes the value “1” if the board is codetermined (i.e., besides shareholders’ representatives there are also employee representatives on the board) and “0” otherwise. Codetermination might be related to our dependent variable ROE (e.g., Bermig and Frick 2011b) and, as Arnegger et al. (2010) have shown, potentially also to gender diversity. Finally, the variable “multiple directorships” is calculated as the average number of board memberships a board member holds besides the one on the board under consideration. Again, this variable might well affect ROE (positively due to board members’ further experience, Sarkar and Sarkar 2009; or negatively because of time constraints, Fich and Shivdasani 2006) and it might also relate to gender diversity (Farrell and Hersch 2005, p. 87). Information on the different controls is taken from diverse sources, e.g., Thomson Financial Datastream, Deutsche Börse (2010), and firms’ annual reports.

Analysis

The central challenge for our empirical analysis is reverse causality, as we cannot exclude that well-performing firms are more likely to appoint women to their boards (Smith et al. 2006, p. 579) or that women self-select into the boards of well-performing firms. Further, unobserved factors may influence both the percentage of women on boards and firm performance. To address potential problems of endogeneity, and in accordance with a similar approach by Dittmann et al. (2010) and Farrell and Hersch (2005), we use panel estimations and lag our central explanatory variable gender diversity by one year. Further, we also lag the board controls board size, codetermination, and multiple directorships as they are potentially related to gender diversity.
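The one-year lag has to be taken within each firm, so that a firm’s performance in year t is matched with its own board composition in year t − 1. A minimal sketch in plain Python, assuming the panel is held as a list of dictionaries with the hypothetical keys `firm`, `year`, and `blau`; the sample values are invented for illustration and are not taken from our data:

```python
def lag_within_firm(rows, column, periods=1):
    """Add `<column>_lag`: the firm's own value `periods` years earlier.

    Observations without a matching earlier firm-year get None and would
    drop out of a regression that uses the lagged variable.
    """
    by_key = {(r["firm"], r["year"]): r[column] for r in rows}
    lagged_rows = []
    for r in rows:
        previous = by_key.get((r["firm"], r["year"] - periods))
        lagged_rows.append({**r, f"{column}_lag": previous})
    return lagged_rows


if __name__ == "__main__":
    # Hypothetical firm-year observations.
    panel = [
        {"firm": "A", "year": 2000, "blau": 0.0},
        {"firm": "A", "year": 2001, "blau": 0.3},
        {"firm": "A", "year": 2002, "blau": 0.5},
        {"firm": "B", "year": 2001, "blau": 0.2},
    ]
    for row in lag_within_firm(panel, "blau"):
        print(row)
```

Looking values up by the (firm, year) pair, rather than shifting a sorted list, keeps the lag from spilling over from one firm to the next and handles gaps in the yearly series.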

In a first step, we compare firm performance for different board types according to the classification by Kanter and then analyze the link between board type and firm performance in a multivariate regression analysis. In a next step, we regress firm performance on our measure of gender diversity in its linear and also in its quadratic term to account for potential non-linearities and to endogenously determine the “critical mass” of women on the supervisory board. In an attempt to further substantiate our results on the critical mass of women in the boardroom, we close with a regression on the apparent “magic number” of women in the boardroom. In all models, we use ordinary least squares (OLS) estimators with robust standard errors clustered at the firm level. As the Breusch–Pagan Lagrange multiplier (LM) test shows the random effects (RE) estimator to be more appropriate in all models, we additionally report RE estimates. We also include the lead of the central explanatory variable in the regression in order to test for strict exogeneity, and find gender diversity to be exogenous in all specifications. We decide against the use of fixed effects (FE) estimators because for more than a third of our firm population, our main explanatory variable, gender diversity, does not change over time. According to a Hausman test, we further find the RE estimator to be more efficient than the FE estimator.

Results

Descriptives

Table 2 contains the means, standard deviations, and correlations for all the variables included in our analysis. After the elimination of outliers, mean ROE in our sample is 9.42 with a standard deviation of 20.87. The average Blau index of gender diversity is 0.26, corresponding to a ratio of female board members of about 8 % (only slightly increasing over time from about 7 % in 2000 to about 9 % in 2005). The Blau index of gender diversity in our sample ranges from zero (no women on the supervisory board) to one (half of the members of the supervisory board are women). There are no boards in our sample where the ratio of women is larger than 50 %. 20 % of firms in our sample report according to the German standard HGB. Market value is on average 5,544.81 million Euros, about three quarters of the firms in our sample are codetermined, each board member holds on average about three other directorships, and average board size is 11.4, ranging from 2 to 21.

Table 2 Means, standard deviations, and correlations

As to the industry distribution, the largest percentage of firms in our sample belongs to Industrials (28.5 %) followed by Financials (18.5 %) and Consumer Goods (12.6 %). Female representation on the board is higher in Financials, Telecommunication, Pharma & Healthcare and in Consumer Goods, and less prevalent in Industrials and Basic Materials. These results are consistent with the literature according to which female directors are more often to be found in Consumer Goods or Financials than Industrials (Adams and Ferreira 2009, p. 295; Brammer et al. 2009; Grosvold et al. 2007, p. 353).

Concerning correlations with our dependent variable ROE, we find it to be slightly positively related to market value (r = 0.05*) and to co-determination (r = 0.08**), and slightly negatively related to multiple directorships (r = −0.13***). Consistent with our theoretic prediction, we do not find an indication for a linear relationship between ROE and gender diversity.

As to potential interrelations with our main explanatory variable gender diversity, we find it to be positively related to market value (r = 0.14***), codetermination (r = 0.33***), multiple directorships (r = 0.28***), and board size (r = 0.27***). That is, firms with a larger market value are characterized by a (slightly) higher degree of gender diversity in the boardroom. The same is true for codetermined firms as opposed to non-codetermined firms. Further, gender diversity in the boardroom is positively related to multiple directorships as well as to board size. That is, larger and more experienced boards have, on average, more women.

Concerning interrelations between the different controls, the most striking correlations concern board size: It is strongly positively related with multiple directorships (r = 0.67***) and with codetermination (r = 0.48***). In order to test for potential multicollinearity, we examined the variance inflation factors (VIF). As all VIF values were below 2.58, there is no multicollinearity problem.

ROE and Female Board Representation: Following Kanter (1977a, b)

Before starting with the regression analysis, in Table 3, we first take a look at the average ROE for the different degrees of female participation in supervisory boards according to the definition by Kanter (1977a, b). As expected (see Holst and Schimeta 2011), the most common groups in our sample are uniform groups with n = 394 and skewed groups with n = 360. Firms with a uniform supervisory board (i.e., no female representatives on the board) on average have an ROE of 9.6. Firms with a skewed supervisory board (<20 % females) on average have a significantly lower (p < 0.05) ROE of 7.7, while firms with a tilted supervisory board (20–40 % females) and those with a balanced supervisory board (≥40 % females) again have a higher average ROE (12.3 and 12.4, respectively), with the difference between ROE in skewed as opposed to tilted groups being statistically significant in a Mann–Whitney test (p < 0.05). That is, there is evidence that skewed boards perform worse than uniform boards, and that tilted boards outperform skewed boards. Hence, if there is a “critical mass” of women on supervisory boards that is needed in order for female representation to positively affect firm performance, this apparently is reached within tilted boards, just as proposed by Kanter.

Table 3 Average ROE for different board types according to Kanter

Our results from the Mann–Whitney test are mirrored by subsequently performed OLS and RE-regression analyses (Table 4) with ROE as the dependent variable and with dummy variables for the different types of boards as defined by Kanter (with skewed boards representing the reference category) and a set of further controls. Owing to the missing values, our sample size is reduced to 140 firms. Concerning controls, we find ROE to be positively related to market value and negatively related to board size, while the other controls are unrelated to ROE. With respect to the groups as defined by Kanter, we find that firms with a tilted board have a higher ROE than firms with a skewed board. The coefficients for the two other group dummies (uniform board and balanced board) are not statistically significantly different from zero, i.e., having a completely male (uniform) or a balanced board (40–50 % women) does not contribute to a higher ROE as compared to having a skewed board (<20 % women).

Table 4 OLS and RE regression with dummy variables for the different board types according to Kanter

In conclusion, the results hint at a critical mass of women being reached in tilted as opposed to skewed groups. Rather than pre-defining a critical ratio of female representation, in what follows, we attempt to endogenously determine the degree of female representation on supervisory boards at which a potentially negative effect will turn into a positive one by including a linear and a quadratic term of gender diversity in the regressions.

ROE and Female Board Representation: In Search of the Critical Mass

Table 5 shows the results of our OLS and RE estimation with ROE as the dependent variable and gender diversity in its linear term (in the a-variants) and also its quadratic term (in the b-variants).

Table 5 OLS- and RE-regression results with gender diversity in its linear and quadratic form

Starting with the controls, our results are quite similar to the regression with the different board types according to Kanter. Market value has a positive impact on performance; whereas, depending on the model, multiple directorships and board size have a negative effect.

Concerning the relation between gender diversity and ROE, our RE-regression in fact confirms it to be non-linear and convex. Figure 1 plots the link between gender diversity and ROE according to the RE-estimation including the quadratic term (model 2b in Table 5) and shows it to be U-shaped. The graph displays a global minimum at a normalized Blau index of about 0.4 (corresponding to a share of women on the board of about 10 %) and shows increasing performance levels starting from there. Only at a Blau index of about 0.85 (corresponding to a ratio of about 30 % women on the board) does ROE again reach the level of uniform boards with only male representatives. That is, we find evidence of the “critical mass” of female representatives on the board to be reached at a share of about 30 %. Over and above this threshold, the performance of a more diverse board exceeds that of a completely male board.
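The two points read off the estimated curve follow directly from the coefficients of a quadratic specification ROE = a + b1·H + b2·H², with H the normalized Blau index. The sketch below shows the arithmetic; the coefficient values are hypothetical, chosen only so that the minimum falls at 0.4, and are not the estimates reported in Table 5.

```python
def turning_point(b1: float, b2: float) -> float:
    """Index value at which a + b1*H + b2*H**2 reaches its minimum."""
    if b2 <= 0:
        raise ValueError("a U-shape requires a positive quadratic coefficient")
    return -b1 / (2.0 * b2)


def break_even(b1: float, b2: float) -> float:
    """Index value H > 0 at which ROE returns to its level at H = 0.

    Solves b1*H + b2*H**2 = 0 for the non-zero root, i.e. H = -b1 / b2,
    which for a pure quadratic is twice the turning point.
    """
    if b2 <= 0:
        raise ValueError("a U-shape requires a positive quadratic coefficient")
    return -b1 / b2


if __name__ == "__main__":
    b1, b2 = -20.0, 25.0  # hypothetical coefficients, not from Table 5
    print("minimum at H =", turning_point(b1, b2))
    print("back at the all-male level at H =", break_even(b1, b2))
```

A negative linear and a positive quadratic coefficient are what produce the U-shape: performance first falls as diversity rises from zero and recovers only beyond the turning point.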

Fig. 1 ROE and gender diversity

As our finding of a U-shaped relation between gender diversity and firm performance does not prove to be robust with respect to other performance measures and/or a different set of controls, our evidence on a “critical mass” of 30 % female representatives is to be regarded as rather tentative. However, as we will show below, our results are supported not only by the fact that a 30 % female representation lies within the spectrum of Kanter’s tilted groups but also by the recent literature on a supposedly “magic” number of three women on the board (Konrad et al. 2008).

A Magic Number?

With board size in our sample averaging 11.45, the critical percentage of about 30 % women on the board translates into an absolute critical mass of on average three women. Strikingly, this is exactly what Torchia et al. (2011) find in their recent analysis on female board representation and firm innovativeness: When there are three or more women on the board, firm innovativeness is higher than when there are fewer than three women on the board. Similarly, based on an interview study with 50 women directors and building on Kanter’s theory, Konrad et al. (2008) as well as Konrad and Kramer (2006) recently suggested the critical mass of women in the boardroom to be equal to three.

In what follows, we further substantiate our results, linking our analysis to the above cited studies. In our analysis, we distinguish firms with (a) no woman on their supervisory board from firms with (b) one woman on the board, (c) two women on the board, and (d) three or more women on the board. One woman on the board (b) corresponds to our global minimum of about 10 % female board representation, and three or more women on the board (d) correspond to our critical mass of female board representation of about 30 %. Again, we run OLS and RE-regressions (Table 6); the reference category is boards with only one woman (b).

Table 6 OLS- and RE-regressions with dummy variables for different numbers of women on the board

We find that having three or more women on the board significantly increases ROE as compared to having only one woman on the board. Unlike the preceding analysis, we find this result to be robust to the use of different performance measures (e.g., Tobin’s Q or PTBV) and/or control variables. Hence, our study is well in line with the recent literature on a critical mass of “three” as the “magic” number of women on the board, thus substantiating our preceding analysis.

Discussion and Conclusions

In our study, we explored the relation between gender diversity in the boardroom and firm performance based on critical mass theory. While the existing literature that builds on critical mass theory exogenously (and rather arbitrarily) defines the percentage of women on boards which is judged to be “critical” as being reached in tilted groups with 20–40 % women, we attempt to determine the critical mass of women on boards endogenously by including a quadratic term in the regression analysis. Further, we add to the existing empirical literature on board composition and firm performance by explicitly accounting for potential problems of endogeneity with the help of a panel dataset. Last but not least, our analysis is based on the supervisory boards in a dualistic corporate governance system. For the case of Germany, research on such boards has up to now mostly concentrated on the role of employee or bank representatives (e.g., Bermig and Frick 2011a; Fauver and Fuerst 2006 for the former and Dittmann et al. 2010 for the latter), and only very recently have gender issues been tackled (Lindstaedt et al. 2011; Oehmichen et al. 2010; Bermig and Frick 2010).

In accordance with critical mass theory, we find skewed supervisory boards to be outperformed by tilted supervisory boards, i.e., we find evidence for the critical mass of women on boards to be reached in tilted groups with a percentage share of women between 20 and 40 %. Aiming at an endogenous determination of what represents the critical mass of women in the boardroom, we subsequently analyze the relation between gender diversity in supervisory boards and firm performance, explicitly allowing for non-linearities. In fact, we find evidence for a U-shaped link between gender diversity on the board and firm performance: Apparently, a critical mass of women on the board is needed to realize the advantages a more diverse board may offer. We find this critical mass to be in the range of about 30 % female representation on the board, i.e., a clear case against tokenism on boards. Further, we find evidence of this critical mass to translate into a “magic” number of three women in the boardroom and hence lend support to the recent studies by Torchia et al. (2011), Konrad et al. (2008), and Konrad and Kramer (2006).

As for the managerial implications of our study, our results suggest that a more gender diverse board composition will only enhance performance if diversity is sufficiently large (10+ % female representation) and that only for boards with a critical level of 30+ % females (3+ women on the board) will performance exceed that of all-male boards. At very low levels of gender diversity (below 10 % female representation), an increase in diversity might even be associated with reduced firm performance.

Concerning political implications, our study suggests that, provided there are no restrictions on the supply side, female representation in the boardroom should be in the range of 30+ %. The question whether a women quota should be legally enforced or not, however, goes beyond the scope of our article. Drawing our data from a legal context without a women quota, we are not in a position to judge whether the established link between board diversity and performance would also exist in a system where women were appointed only because of the quota and not because of the knowledge and expertise they bring into the board. For example, the study by Ahern and Dittmar (2012) suggests that women who are appointed to a board due to a quota are, on average, younger and have less CEO experience than their male counterparts, which might in fact hint at restrictions on the supply side of eligible women that are ready and qualified to serve on supervisory boards.

As usual, our study also has several limitations. First, with a period of five years, our analysis is based on a quite short time period. Further studies may want to concentrate on longitudinal panel data covering a longer time span. Second, we study the link between board diversity and performance within one special national context (the German system with a two-tier board structure and codetermined supervisory boards). As Grosvold et al. (2007) stress, the institutional and cultural context might be of importance when analyzing board diversity and its effects. Hence, further studies should incorporate cross-country analyses.