As the distinctive economic and social roles of various kinds of nonprofits have become conspicuous in many countries, countywide, statewide, and nationwide data on the nonprofit sector have been gathered and examined in order to quantify the sector. What these data and the corresponding empirical studies reveal is noteworthy: the size of the nonprofit sector varies dramatically by locality.

Weisbrod (1975, 1986, 1988) presented a rational explanation for this feature of the nonprofit sector by introducing the government failure theory. According to this theory, government provision of quasi-public goods (Footnote 1) becomes relatively homogeneous because governments supply these goods mainly to satisfy the preferences of median voters. In this case, voters other than the median voter are left unsatisfied by the government supply of quasi-public goods, and this dissatisfaction creates demand for quasi-public goods supplied by nonprofits. Thus, the government failure theory implies that nonprofits are most active where the preferences of populations are most diverse or where populations are heterogeneous (Matsunaga and Yamauchi 2002). In short, the core source of government failure is demand heterogeneity, and under the theory it is the most influential factor determining the size of the nonprofit sector.

Many researchers have examined the concept of demand heterogeneity in order to determine whether it really has explanatory power. As summarized in Table 1, their estimation results suggest that its explanatory power is not robust. Some researchers have found that demand heterogeneity has a positive effect on the size of the nonprofit sector (e.g., Abzug and Turnheim 1998; Ben-Ner and Van Hoomissen 1992; James 1987; Corbin 1999). Others, however, have found either that it has no explanatory power or that it has explanatory power but is negatively correlated with nonprofit sector size (e.g., James 1993; Marcuello 1998; Salamon et al. 2000; Gronbjerg and Paarlberg 2001). Some results from these empirical studies are consistent with what the government failure theory suggests, since a proxy for demand heterogeneity has explanatory power and its coefficient is positive. Other results, however, either invalidate or run counter to what the theory suggests. Hence it is not surprising that empirical nonprofit sector size models specified in accordance with the government failure theory, with the inclusion of a demand heterogeneity variable, have reinforced suspicions about the robustness of the theory. It is for this reason that further empirical studies have blossomed in this area of nonprofit research.

Table 1 Empirical results from previous studies

Among others who raise questions about the robustness of the government failure theory, Salamon et al. (2000) and Salamon and Anheier (1998) examined a nationwide data set collected by the Johns Hopkins Comparative Nonprofit Sector Project (CNP) undertaken by The Johns Hopkins Center for Civil Society Studies in 1995. The CNP was a project to collect descriptive data for countries with the goal of comparing nonprofit sector scale, structure and revenue sources across the selected countries. As indicated in Global Civil Society: Dimensions of the Nonprofit Sector (Salamon et al. 1999), some 150 researchers and 300 advisors around the world were mobilized with the core objective of developing “a common base of data about a similar set of ‘nonprofit’ or ‘voluntary’ institutions in a disparate set of countries”.

Using the CNP data set, Salamon et al. (2000) find that, contrary to what the government failure theory implies, the nonprofit sector did not grow in proportion to the degree of demand heterogeneity. Alternatively, they find that the nonprofit sector grew in proportion to the actual level of government support of nonprofit activities. Consequently, they conclude that the cross-country variation in nonprofit sector sizes is to be largely attributed to the level of government support of nonprofit activities in different countries. They claim that the observed financial relationship between governments and the nonprofit sector supports the robustness of the interdependence theory. The interdependence theory states that the government sector is a partner of the nonprofit sector in the production of quasi-public goods.

Though the sample size was undeniably insufficient to establish beyond doubt the validity of either the government failure theory or the interdependence theory, their research was a milestone in the search for the underlying causal mechanisms that explain observable regional size variations among nonprofit sectors. However, we argue, on the basis of a panel analysis of the CNP data set, that the government failure theory should not be rejected. What this study adds to the previous study is a better way to test the government failure theory using cross-country panel data, the advantages of fixed effects modeling, and attention to subsectors of the nonprofit sector (health, education) rather than the sector as a whole.

This article is arranged as follows. First, we briefly review the government failure theory and previous empirical studies in its vein. We also summarize the data collection methodology of the Johns Hopkins Comparative Nonprofit Sector Project and highlight some issues in Salamon et al. (2000). We next introduce the criteria by which the government failure theory can be empirically examined according to Corbin’s (1999) argument. We then present the estimation results and include a discussion of unobservable heterogeneity, an advantageous feature of the panel regression estimation approach. Finally, the conclusion summarizes our findings and the caveats of this study.

Government Failure Theory and Its Empirical Examination

Within a given society the kind of quasi-public goods and the level to which these quasi-public goods are produced by a government depends on the preferences of the median voter. The preferences of the non-median voter or non-median voter groups will therefore be purposefully neglected in order to satisfy the group containing the median voter. This scenario, given that non-median voter groups with homogeneous preferences are able to organize and mobilize resources, will result in unsatisfied demand for quasi-public goods, driving the establishment of nonprofit organizations to provide such goods. Nonprofit organizations, therefore, act as a substitute to the governmental provision of quasi-public goods, as the nonprofits have the power to crowd-out the governmental provision of quasi-public goods by satisfying a wide variety of unmet demands. In order to emphatically validate the government failure theory empirically, Corbin (1999) claims that the satisfaction of the following two hypotheses is necessary:

Hypothesis 1:

An increase in demand heterogeneity will have a positive effect on the size of the nonprofit sector.

Hypothesis 2:

An increase in government expenditure on quasi-public goods will have a negative effect on the size of the nonprofit sector.

Hence, a test of the government failure theory is a joint test of both demand heterogeneity and the size of governmental direct expenditures on quasi-public goods, although many previous studies have ignored this fact and concentrated on singularly determining the significance of demand heterogeneity irrespective of the size of governmental expenditure.

If nonprofits can be better suppliers of heterogeneous quasi-public goods than governments, then rational governments may choose to cut the budget for the direct supply of quasi-public goods and instead increase financial support for nonprofits so that the nonprofit provision of heterogeneous quasi-public goods will flourish. Many researchers (e.g., James 1987, 1993; Salamon 1987; Smith and Lipsky 1993; Frank and Salkever 1994; Kapur and Weisbrod 2000; Gronbjerg and Paarlberg 2001) have observed this relationship between the two. We may call this financial relationship between government and the nonprofit sector ‘the complementary financing hypothesis.’ Unlike the median voter, non-median voters may have little effect on authoritarian states. However, a significantly excessive unmet demand could lead to changes of government. Governments which are afraid to take risks support nonprofits financially because a nonprofit provision of quasi-public goods could meet the demands of non-median voters.

Since governmental financial support would be a major source of income for the nonprofit sector, when the government sector delegates the provision of heterogeneous quasi-public goods to the nonprofit sector, the accompanying increase in income stimulates the nonprofit sector’s growth. Given that Hypotheses 1 and 2 both cannot be rejected, that is, that the government failure theory is validated, this insight can be examined by testing the following hypothesis:

Hypothesis 3:

An increase in the share of governmental financial support in nonprofit revenue will have a positive effect on the size of the nonprofit sector (Footnote 2).
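The decision logic implied by Hypotheses 1 to 3 can be sketched in code. The snippet below is only an illustration: the `Estimate` container, the function name, the 5% significance threshold, and the numbers in the usage example are all assumptions introduced here, not part of Corbin (1999) or of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    """A coefficient estimate and its p-value (hypothetical container)."""
    coef: float
    p_value: float


def evaluate_hypotheses(heterogeneity, gov_expenditure, gov_support_share, alpha=0.05):
    """Apply a sign-and-significance reading of Hypotheses 1 to 3.

    alpha is an assumed significance level; the text does not fix one here.
    """
    def significant(e):
        return e.p_value < alpha

    # Hypothesis 1: demand heterogeneity has a positive effect.
    h1 = significant(heterogeneity) and heterogeneity.coef > 0
    # Hypothesis 2: government expenditure has a negative effect.
    h2 = significant(gov_expenditure) and gov_expenditure.coef < 0
    # The theory is a joint test: both hypotheses must hold.
    theory_supported = h1 and h2
    # Hypothesis 3 is examined only once the theory is validated.
    h3 = (theory_supported
          and significant(gov_support_share)
          and gov_support_share.coef > 0)
    return {"H1": h1, "H2": h2, "government_failure": theory_supported, "H3": h3}


# Made-up estimates, for illustration only.
result = evaluate_hypotheses(
    heterogeneity=Estimate(0.8, 0.01),
    gov_expenditure=Estimate(-0.3, 0.03),
    gov_support_share=Estimate(0.5, 0.20),
)
```

In this made-up case Hypotheses 1 and 2 hold jointly, while Hypothesis 3 fails because the support-share coefficient is not significant at the assumed threshold.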

Previous Empirical Examinations of the Nonprofit Sector Size Model

Numerous researchers have conducted empirical analyses on the nonprofit sector size with the objective of determining which variables explain the variability of the nonprofit sector by locality. Table 1 summarizes the estimation results from several previous studies.

As becomes clear from Table 1, several explanatory variables have been repeatedly selected by researchers. These explanatory variables correspond to the factors that the relevant theoretical models identify as influencing the size of the nonprofit sector, and thus define the empirical model specification. Proxy variables representing demand heterogeneity and the scale of governments, the latter usually measured by data on governmental expenditures, form the basis of almost all of these previous studies. Although these previous empirical studies can, in essence, be reduced to tests of the government failure theory, the variety of approaches and data sets must be noted. Gronbjerg and Paarlberg (2001), Ben-Ner and Van Hoomissen (1992) and Marcuello (1998) approach the problem using data at the county-wide level, Corbin (1999) at a metropolitan-wide level, James (1987) at a statewide level, and James (1993) and Salamon et al. (2000) at a worldwide level. Many of these researchers, however, did not set out to explicitly test the government failure theory but rather to engage in a general search for factors affecting the size of the nonprofit sector.

As is observable in the studies in Table 1, the choice of proxy variable constructed and included in the model dramatically affects the sign and statistical significance of the demand heterogeneity regressor. For example, Corbin (1999) and James (1993) both chose religious diversity as a proxy for demand heterogeneity and found that demand heterogeneity had a positive effect on the size of the nonprofit sector. Salamon et al. (2000) also chose religious diversity as a proxy (Footnote 3); they, however, found that underlying demand heterogeneity had no explanatory power. Abzug and Turnheim (1998) chose racial diversity as a proxy, but concluded that racial diversity had no explanatory power. Table 1 therefore shows that there appears to be no consensus as to which proxy measure best captures all the pertinent dimensions of demand heterogeneity. This is not unexpected, considering that the concept of demand heterogeneity is multidimensional. In addition, the fragile explanatory power of demand heterogeneity may be due to the implicit measurement error of the proxy variable approach or, as Ben-Ner and Van Hoomissen (1991) and Steinberg (1997) suggest, to the indeterminacy of the “lumpy” distribution implied by the heterogeneous demand argument or, as James (1993) suggests, to data problems.

When modeling the relationship between governments and nonprofits, we again encounter some difficulties. For example, Abzug and Turnheim (1998) considered Moody’s municipal bond ratings as a measure of the financial competence of local governments. They reasoned that if a government were appropriately supplying quasi-public goods, the government’s Moody’s municipal bond rating would be favorable. However, upon estimation they found that the coefficient on the Moody’s municipal bond ratings was not statistically significant. Salamon et al. (2000), by contrast, use both government social spending and governmental financial support as a share of nonprofit sector revenues as proxies for the size of governments and find positive and statistically significant relationships between nonprofit sector size and the size of governments. James (1993), on the other hand, uses government expenditure on education as a proxy for the size of governments and finds a negative relationship between nonprofit sector size and the size of governments.

Across the studies reviewed above, there appears to be little consistency in the signs or statistical significance of either demand heterogeneity or the size of governments. This leaves the fundamental question unanswered: is the nonprofit sector a substitute provider of quasi-public goods, as in the government failure theory, or does the government act to complement the nonprofit provision of quasi-public goods? It is with this motivation that we reexamine the government failure theory.

Research Design

In this section we primarily seek to address econometric issues and to reexamine the government failure theory empirically using the same data set collected by the CNP. It is highly likely that the research of Salamon et al. (2000) fails to support the robustness of the government failure theory because their empirical model faces the following two major problems:

Issues of Specification Error

The first problem concerns their model specification. They conduct two single-variable regression analyses in order to test the government failure theory. When a theory requires two covariates, including only one of them at a time in the econometric model creates an omitted variable problem. If a theoretically relevant variable is omitted, the estimated significance of the included variable may be misleading, appearing to support a theory that would not have been supported had the omitted variable been included. Hence, from a theoretical perspective, a test of the government failure theory requires that all variables relating to the theory be included in the estimated model. This article improves on the Salamon et al. (2000) model by using a multivariate specification, as in James (1993), thereby correcting the misspecification by explicitly including the variables omitted in Salamon et al. (2000). We thus hope to avoid generating misleading results.
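The omitted variable problem can be made concrete with the standard textbook result. As a sketch, suppose the true model contains two regressors; the generic symbols \( x_{1} \) and \( x_{2} \) are introduced here for illustration only and are not the article's notation:

$$ y = \beta_{0} + \beta_{1} x_{1} + \beta_{2} x_{2} + \varepsilon . $$

If \( y \) is regressed on \( x_{1} \) alone, the resulting slope estimate \( \tilde{\beta}_{1} \) converges to

$$ \operatorname{plim} \tilde{\beta}_{1} = \beta_{1} + \beta_{2} \, \frac{\operatorname{Cov}(x_{1}, x_{2})}{\operatorname{Var}(x_{1})} . $$

The single-variable estimate is therefore consistent only if \( \beta_{2} = 0 \) or the two regressors are uncorrelated. Otherwise its sign and apparent significance can reflect the omitted regressor rather than the included one.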

Issues of Small Sample Size

The second problem concerns their sample size. Salamon et al. (2000) estimate several single-variable linear regression models using the CNP data set of 22 observations (Footnote 4). Few degrees of freedom are therefore available, which leaves the analysis open to the criticism of a small sample problem. In this article, by decomposing the CNP data into the health and education subsectors of the nonprofit sector and then estimating a fixed effects model, we effectively double the sample size to 44 and consequently increase the degrees of freedom. Beyond the increase in sample size and degrees of freedom, the fixed effects model has a distinct advantage: an additional source of variation is incorporated into the model, thereby alleviating potential bias due to the aggregation of the 12 sectors (see Table 3) in the International Classification of Nonprofit Organizations (ICNPO).

The fixed effects model approach is relevant in this instance, as it can be argued that the proportionality of the nonprofit subsectors will vary, hence any decomposition by industry will indeed introduce further variability into the model. It seems a natural hypothesis that differing theories may pertain to different nonprofit subsectors, but the claims from James (1993) and James and Rose-Ackerman (1986) indicate that the health and education subsectors have something in common with each other, although debating whether this commonality is due to group identity or relevant experience in the subsector is merely to engage in circular argumentation. Consequently, we consider a model specification that explicitly models the health and education sectors.

Strategies for Modeling the Nonprofit Sector Size

In order to compare nonprofit sector size variation across countries, a definition of nonprofit sector size is required. There are two different approaches by which the size of the nonprofit sector can be gauged. The first is on the basis of the number of nonprofits in the nonprofit sector. The drawback of using this definition to measure the size of the nonprofit sector is that nonprofit organizations with a budget of 100,000 dollars and those with a budget of 2,000,000 are regarded as similar organizations. The second proxy for the size of the nonprofit sector is the number of people employed in the nonprofit sector. Ben-Ner and Van Hoomissen (1992) measured the size of the nonprofit sector, the government sector, and the for-profit sector by the level of employment across the sectors. Salamon et al. (2000) measured the size of the nonprofit sector by paid full-time equivalent employment (with or without volunteers) in the nonprofit sector as a share of nonagricultural employment. However, their measurement method is also not perfect because this captures only one aspect of the size of the nonprofit sector, that of the nonprofit labor market.

This article uses a measurement approach based on employment in the nonprofit sector because this second definition is the one used to measure the size of the nonprofit sector in the Handbook on Nonprofit Institutions in the System of National Accounts (2001) issued by the United Nations. Our definition of the size of the nonprofit sector is given by

$$ \text{Nonprofit Sector Size} = \frac{\text{FTE Employment}}{\text{Nonagricultural Employment}} \tag{1} $$

Full-time equivalent (FTE) employment is the number of paid full-time equivalent jobs with or without volunteers defined as total hours worked divided by the average annual hours worked in full-time jobs. Following Salamon et al. (2000) we eliminate the scale effect of different countries by dividing FTE employment by nonagricultural employment so that we can compare the size of the nonprofit sector across the nations.
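As a small worked example of Eq. 1, the following sketch computes FTE employment from hours worked and then the size measure. The function names and all figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def fte_employment(total_hours_worked, avg_annual_full_time_hours):
    """FTE jobs: total hours worked divided by the average annual hours of a full-time job."""
    return total_hours_worked / avg_annual_full_time_hours


def nonprofit_sector_size(fte, nonagricultural_employment):
    """Eq. 1: FTE employment as a share of nonagricultural employment."""
    return fte / nonagricultural_employment


# Hypothetical figures, not taken from the CNP data.
fte = fte_employment(total_hours_worked=3_600_000, avg_annual_full_time_hours=1_800)
size = nonprofit_sector_size(fte, nonagricultural_employment=50_000)
```

With these made-up inputs, 3,600,000 hours correspond to 2,000 FTE jobs, and the size measure is 2,000 / 50,000 = 0.04, that is, 4% of nonagricultural employment.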

The definition of nonprofit subsector size could be improved by replacing the denominator, nonagricultural employment, which is a general measure, with total employment in the education and health sectors. The ratio of nonprofit subsector FTE employment to employment in the entire (nonagricultural) economy would then become the ratio of FTE employment in nonprofit subsector j to total employment in subsector j. The authors attempted to make this change using the International Standard Industrial Classification (ISIC) sectoral employment data; however, classification inconsistencies and missing data made the task intractable. Such an improvement would effectively eliminate the measurement error implicit in the definition of nonprofit sector size, although, econometrically, measurement error in the dependent variable is not detrimental to the estimation results. Provided that the measurement error is uncorrelated with the independent variables, it is absorbed into the error term, and the coefficient estimates remain unbiased and consistent, although less efficient.
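The argument about measurement error in the dependent variable can be written out briefly. As a sketch in generic notation (the symbols \( y^{*} \), \( y \), and \( u \) are introduced here for illustration), let \( y^{*} \) be the true size, let the observed size be \( y = y^{*} + u \), and suppose the true size follows a linear model \( y^{*} = \alpha + \sum_{k} \beta_{k} x_{k} + \varepsilon \). Substituting gives

$$ y = \alpha + \sum_{k} \beta_{k} x_{k} + (\varepsilon + u) . $$

If \( u \) has mean zero and is uncorrelated with every \( x_{k} \), the composite error \( \varepsilon + u \) satisfies the same conditions as \( \varepsilon \), so least squares remains unbiased and consistent. If \( u \) is also uncorrelated with \( \varepsilon \), the error variance rises from \( \sigma_{\varepsilon}^{2} \) to

$$ \sigma_{\varepsilon}^{2} + \sigma_{u}^{2} , $$

which is the source of the efficiency loss.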

In Salamon et al. (2000) the dependent variable represents the size of the nonprofit sector for each country as an aggregated whole; the individual subsectors that compose the nonprofit sector are not considered at all. When seeking evidence of a causal relationship, as in the case of testing the validity of a theory, it is necessary to control for potential alternatives. The construction of the dependent variable in Salamon et al. (2000) suffers from two fundamental problems. Salamon et al. (2000) reject the notion that different theories may have a differential impact within different industries of a nonprofit sector, and they also do not take into account the fact that the composition of nonprofit activities by industry may differ in proportion across countries. As a consequence, potential evidence of causal relations determining nonprofit sector size in one industry may be drowned out in the aggregate by evidence to the contrary in another subsector. For example, demand heterogeneity may pertain strongly to a nonprofit education industry, as non-median voter families with strong ideological preferences stimulate a nonprofit response, whereas in the sporting and recreation subsectors the government may be encouraged to support nonprofit activities without a clear mandate from the median voter for the direct provision of such quasi-public goods. In this instance, an interdependence explanation may dominate. Yet evidence of a large reaction in the education industry may, in the aggregate, drown out evidence supporting an interdependence theory in the sporting and recreation industries, or vice versa.

The estimated model using this pooled data is specified as:

$$ \text{Nonprofit Sector Size}_{ij} = \alpha_{j} + \sum_{k=1}^{6} \beta_{k} x_{kij} + \varepsilon_{ij} \tag{2} $$

where \( i = 1, \ldots, 22 \) indexes the 22 countries in the sample and \( j = 1, 2 \) indexes the two subsectors (education and health). Thus, we have effectively increased the sample size to 44 and introduced an additional source of variation into the model, as the relative sizes of the education and health nonprofit subsectors vary across countries, thus aiding efficiency of estimation.

In the case of Salamon et al. (2000), three forms of composition effects to data aggregation exist: (1) composition effects across countries inherent within the composition of subsectors, (2) composition effects across countries inherent in the compositional structure of subsectors within nonprofit sectors, and (3) compositional effects due to the aggregation of nonprofit subsectors, preventing analysis in an industry-by-industry manner, thus not allowing for the possibility that the different theories of nonprofit sector size may be more applicable to some subsectors than to others. Since in this article, our focus is on only two industries, education and health, any aggregation bias due to the proportional composition of nonprofit sectors across countries will be alleviated when estimating a pooled data structure relating specifically to these two industries.

The explanatory variables \( x_{1ij} \), \( x_{2ij} \), \( x_{3ij} \), \( x_{4ij} \), \( x_{5ij} \), and \( x_{6ij} \) denote, respectively, religious fractionalization (a proxy for demand heterogeneity), government expenditure on quasi-public goods, governmental financial support as a share of nonprofit revenues (government subsidies to the nonprofit subsectors), the interaction term of government expenditure and governmental financial support, per capita income, and political framework. The term \( \varepsilon_{ij} \) is the usual disturbance term, which is independently and identically distributed with mean zero and constant variance \( \sigma_{\varepsilon }^{2} \).
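To make the structure of Eq. 2 concrete, the following sketch estimates a one-way fixed effects model with subsector-specific intercepts using the within transformation, which yields the same slope estimates as including one dummy variable per subsector. It is a minimal illustration under stated assumptions, not the authors' estimation routine: the function names are hypothetical, only two regressors are used instead of six, and the data are synthetic and noise-free so that the known coefficients are recovered.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fixed_effects(y, x, group):
    """Return (slopes, intercept per group) for y = alpha_group + x @ beta + error."""
    n, k = len(y), len(x[0])
    labels = sorted(set(group))
    # Group means of the outcome and of each regressor.
    means = {}
    for g in labels:
        idx = [i for i in range(n) if group[i] == g]
        y_bar = sum(y[i] for i in idx) / len(idx)
        x_bar = [sum(x[i][c] for i in idx) / len(idx) for c in range(k)]
        means[g] = (y_bar, x_bar)
    # Demeaning within each group sweeps out the group intercepts.
    yd = [y[i] - means[group[i]][0] for i in range(n)]
    xd = [[x[i][c] - means[group[i]][1][c] for c in range(k)] for i in range(n)]
    # Normal equations (X'X) beta = X'y on the demeaned data.
    xtx = [[sum(r[p] * r[q] for r in xd) for q in range(k)] for p in range(k)]
    xty = [sum(r[p] * v for r, v in zip(xd, yd)) for p in range(k)]
    beta = solve(xtx, xty)
    # Recover each intercept from the group means.
    alpha = {g: means[g][0] - sum(bk * xb for bk, xb in zip(beta, means[g][1]))
             for g in labels}
    return beta, alpha


# Synthetic data: two subsectors, slopes 2 and -3, intercepts 1 and 4.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0)]
true_alpha = {"education": 1.0, "health": 4.0}
y, x, group = [], [], []
for g, a in true_alpha.items():
    for x1, x2 in rows:
        y.append(a + 2.0 * x1 - 3.0 * x2)
        x.append([x1, x2])
        group.append(g)

beta, alpha = fixed_effects(y, x, group)
```

Because the synthetic outcome contains no noise, `beta` and `alpha` reproduce the assumed slopes and subsector intercepts up to floating-point error. With real data the same routine would return least squares estimates, and standard errors would have to be computed separately.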

Since nonprofit organizations are not only providers of goods and services but also important factors of social and political coordination (Seibel 1990), they have deep historical roots in a political framework (e.g., Liberal, Corporatist, Statist, Social-democratic, and so on). How effectively a median voter’s voice influences a government’s decision on supplying quasi-public goods depends heavily on the political framework of a country. Therefore, it is necessary to control for the political framework of a country in the model so that the model captures the pure effect of demand heterogeneity. To measure the political framework, we use the index from Freedom in the World.

Freedom House, the international watchdog organization, issues a yearly report, “Freedom in the World,” which aims to measure the degree of democracy and political freedom in every country, district, and territory around the world by producing scores representing the levels of political rights and civil liberties in each. Each score is on a scale from 1 (most free) to 7 (least free), and we use the average of the two scores (PRCL). A higher average score therefore represents a lower degree of democracy and political freedom. The status of democracy and political freedom is also grouped as “Free”, “Partly Free”, or “Not Free”, according to the scores. Of the countries used in our analysis, 17 are classified as “Free”, 5 as “Partly Free”, and none as “Not Free” (Footnote 5).

We define the size of the nonprofit sector as shown in Eq. 1, and this measure confounds the size of the nonprofit sector with changes in employment due to the business cycle or the distribution of employment by sector. Economic downturns can affect manufacturing and retail more than they do the service sector, where nonprofit organizations are predominantly located. The nonprofit sector may therefore appear larger during downturns and in economies with a large agricultural sector, and the effect of economic downturns should be captured and controlled for in our empirical analysis. This also suggests that a measure of nonprofit sector size based on funding or capital may give a different result from a labor-based measure. Thus, per capita income (PCI) is included as an independent variable that proxies for the stage of economic development and is expected to control for such issues.

The validity of the government failure theory for the emergence and expansion of the nonprofit sector in society is acknowledged by explaining that the government can only meet the demands of median voters and fails to satisfy other voters’ heterogeneous preferences which the nonprofit sector is able to cater. This theoretical hypothesis is only robust under the precondition that voters are able to directly influence government tax and spending policies. Therefore, the scope condition of the government failure theory is the type and level of democracy with regard to reaching political consensus on allocating necessary government expenditures. However, the state of democracy as measured by the influence of median voters on decisions about government expenditure is not controlled in Salamon et al. (2000). For this reason, it is crucial to test this hypothesized relationship between the government and the nonprofit sector after controlling the effect of the state of democracy on the size of the nonprofit sector so that we can reach our research goals using the CNP data set.

On the other hand, it is conceivable that the size and scope of a given nonprofit sector may be affected independently by other economic factors, which need to be controlled for in a test of the government failure and interdependence theories. More specifically, the stage of economic development of a given country may independently affect nonprofit sector size. The stage of economic development captures such effects as the capital-to-labor ratio of the for-profit sector, where a capital-intensive for-profit sector with low employment may result in, ceteris paribus, a larger nonprofit sector. Salamon et al. (2000) also neglect to control for foreign donations. Much of the money earmarked for nonprofits in developing countries originates outside the country, from foreign donors. If this omitted variable is correlated with the included explanatory variables, misleading results may be produced. Although this article eliminates this form of composition effect by focusing on only two subsectors within the nonprofit sector, we consider it important to control for the level of economic development because the definition of sector size is based on labor hours.

Measuring Demand Heterogeneity

A one-way fixed effects model includes two types of heterogeneity. One is observable demand heterogeneity (ODH). The other is unobservable demand heterogeneity (UDH). It is very difficult to observe demand heterogeneity, and even more difficult to measure it accurately. Salamon et al. (2000) make use of a religious fractionalization index as a proxy measure of ODH. Religious fractionalization indices, however, are highly sensitive to the construction of the heterogeneity index. Depending on how finely the various religious sects are categorized, differences in the degree of nonprofit activeness among the various religions may not be well captured. A good example of an index created with a specific purpose in mind can be found in James (1993). James (1993) argues that, apart from the size of the minority or majority group with homogeneous preferences, some religious groups have a greater tendency to engage in proselytizing activities, which is likely to be a contributing factor to the establishment of nonprofit organizations. James (1993) also postulates that, in the case of groups with homogeneous preferences, a large minority or small majority are cases where competition over resources is greatest and thus a large nonprofit response is expected. This kind of relationship is not monotonic and only becomes evident when the index takes its highest value (most heterogeneous) within a population with a wide but even distribution of religious groups and takes its lowest value (most homogeneous) when the distribution is uneven. James (1993) overcomes some of these problems by weighting her index in a way that increases when certain religious groups are present in the population.
The principle to be drawn from James (1993) is that the construction of measures to gauge heterogeneity is highly sensitive to the purposes for which they are constructed, and hence an index created for one purpose may not necessarily be relevant for another. Since this article uses the Salamon et al. (2000) religious fractionalization index, a judgment on the influence of religious sects in a cross-country context is outside its scope, although the weakness of using a fractionalization index that fails to incorporate important information is acknowledged. This is another weakness of the Salamon et al. (2000) article that this article inherits.
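The sensitivity of a fractionalization index to how finely groups are categorized can be illustrated with a short sketch. It assumes the common Herfindahl-based form, one minus the sum of squared group shares; the text does not spell out the exact formula behind the Salamon et al. (2000) index, so both the formula and the shares below are illustrative assumptions.

```python
def fractionalization(shares):
    """Herfindahl-based fractionalization: 1 minus the sum of squared group shares.

    Shares are normalized by their total, so counts or proportions both work.
    """
    total = sum(shares)
    return 1.0 - sum((s / total) ** 2 for s in shares)


# Hypothetical population shares of religious groups.
fine = [0.40, 0.20, 0.20, 0.20]    # four groups counted separately
coarse = [0.40, 0.60]              # the last three merged into one category
even = [0.25, 0.25, 0.25, 0.25]    # an even distribution over four groups

index_fine = fractionalization(fine)
index_coarse = fractionalization(coarse)
index_even = fractionalization(even)
```

For the same hypothetical population, the index is 0.72 under the fine categorization but 0.48 once three groups are merged, and it reaches 0.75 when four groups are evenly distributed. The categorization alone therefore changes the measured heterogeneity.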

One of the unique features of the econometric model specified by Eq. 2 is that it can control for unobservable demand heterogeneity (UDH). The advantage of the one-way fixed effects model (Footnote 6) is that the UDH of education and health is explicitly modeled by decomposing the intercept into two intercepts, one for each subsector. As Moulton (1986, 1987) claims, an analysis of panel data (cross section by cross section in this article’s case) that does not control for this heterogeneity runs the risk of obtaining biased results. As noted by Baltagi (1995), the fixed effects approach to estimation can alleviate biases caused by the excessive aggregation of data. The constant term \( \alpha_{j} \) in Eq. 2 captures the effect of UDH on the size of the nonprofit sector. Since the health and education sectors differ in terms of their history, financial institutions, political regimes, and so on, not accounting for this sector-specific UDH causes serious misspecification. If \( H_{0}: \alpha_{1} = \alpha_{2} \) is rejected, we say that unobservable demand heterogeneity has an impact on the size of the nonprofit sector; if it cannot be rejected, we say that it has no impact. After all, if the government failure theory is to be supported empirically, either ODH or UDH or both should have explanatory power. In summary, applying Corbin’s (1999) argument in a panel data setup, a test of the government failure theory examines whether ODH or UDH or both have explanatory power and whether government expenditure on quasi-public goods has explanatory power with a negative coefficient. If, in addition to this condition being met, governmental financial support as a share of nonprofit organizations’ revenue has explanatory power with a positive coefficient, the complementary financing hypothesis is empirically supported. The test of the government failure theory based on Corbin’s (1999) argument with a panel data set (the Augmented Corbin’s test) and the test of the complementary financing hypothesis are summarized in Table 2.
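One standard way to test the restriction \( H_{0}: \alpha_{1} = \alpha_{2} \) in Eq. 2 is an F test that compares the restricted model (a common intercept) with the unrestricted model (subsector-specific intercepts). The text does not state which statistic the authors use, so the following is offered only as a sketch. With \( \mathrm{SSR}_{R} \) and \( \mathrm{SSR}_{U} \) denoting the restricted and unrestricted sums of squared residuals (symbols introduced here), one restriction, 44 observations, and 8 parameters in the unrestricted model (six slopes and two intercepts), the statistic is

$$ F = \frac{(\mathrm{SSR}_{R} - \mathrm{SSR}_{U}) / 1}{\mathrm{SSR}_{U} / (44 - 8)} . $$

Under \( H_{0} \) and the classical assumptions with normally distributed disturbances, \( F \) follows an \( F(1, 36) \) distribution. Because there are only two subsectors, this test is equivalent to a two-sided t test on the coefficient of a single subsector dummy.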

Table 2 Empirical examination of theory and hypothesis

Data Sets and Descriptive Statistics

The data set used in this article is made up of data collected on 22 countries from Western Europe (Austria, Belgium, Finland, France, Germany, Ireland, Netherlands, Spain, United Kingdom), Central and Eastern Europe (Czech Republic, Hungary, Romania, Slovakia), Latin America (Argentina, Brazil, Colombia, Mexico, Peru), and a final category made up of other developed countries (Australia, Israel, Japan, United States). In order to generate an estimate of the size of the nonprofit sector, entities that met the criteria of being (a) an organization; (b) self-governing; (c) not profit-distributing; (d) private; and (e) voluntary were considered to be nonprofit. Once the nonprofit nature of an entity had been determined, it was categorized by principal activity.

For the purposes of this article, we consider nonprofits with principal activities in the fields of education and health. Following Salamon et al. (2000), religious organizations were included in the relevant field of activity together with their nonreligious counterparts. Religious worship organizations, on the other hand, were recorded in the nonprofit activity field of religious congregations. Therefore, schools and hospitals that are run by religious orders are included in our data set. After the categorization process was complete, data on the key characteristics of the nonprofits, namely paid full-time equivalent (FTE) employment, volunteer employment (as converted into FTE), operating expenditures, and revenue sources (governmental financial support, private fees and charges, and private philanthropy), were enumerated. Following Salamon et al. (2000) the countrywide aggregated dimension used in this article is FTE, which is used in the construction of the dependent variable. For the education and health subsectors, governmental financial support received from the state as a share of nonprofit revenues is used in the construction of the independent variable.

Tables 3 and 4 present the descriptive statistics and the data sources of our data, respectively.

Table 3 Descriptive statistics
Table 4 Data sources

Estimation Results and Testing the Government Failure Theory

Following Salamon et al. (2000), the religious fractionalization index proxies for ODH. The first and second columns of Table 5 are the estimation results of Eq. 2 when the dependent variable is nonprofit sector paid FTE employment divided by total nonagricultural employment whereas the third and fourth columns of Table 5 are the estimation results of Eq. 2 when the dependent variable is nonprofit paid and unpaid (volunteer) FTE employment divided by total nonagricultural employment.

Table 5 Results from estimation

The first and third columns of Table 5 are the estimation results of a pooling model whereas the second and fourth columns are the estimation results of a one-way fixed effects model.

According to Salamon et al. (2000), ODH has no explanatory power, but governmental support of nonprofit activities, proxied by government expenditure on quasi-public goods, does have explanatory power. However, its coefficient is positive. Consequently, Salamon et al. (2000) reject the government failure theory.

However, the result of Salamon et al. (2000) may be due to both the small sample problem and specification error. Because 22 observations are not sufficient even for a simple regression, we alleviate the small sample problem by generating a pooled data set. The pooling model is an ordinary least squares regression of the dependent variable on a single constant and the regressors. The output consists of the standard results for the least squares regression. In our case, the 44 observations are simply treated as if they were 44 cross-sectional observations.
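The pooling model described above, and the one-way fixed effects model it is compared with, can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic random numbers (not the data used in this article), the six regressors are generic placeholders, and the helper name `ssr` is hypothetical. The sketch shows how 22 countries times 2 subsectors are stacked into 44 rows, and how the two designs differ only in the intercept columns.

```python
import numpy as np

rng = np.random.default_rng(0)
n_countries, n_sectors, n_slopes = 22, 2, 6

# Synthetic stand-in data (NOT the article's data): one row per country-sector pair.
sector = np.repeat(np.arange(n_sectors), n_countries)   # 0 and 1 label the two subsectors
X = rng.normal(size=(n_countries * n_sectors, n_slopes))  # six placeholder regressors
y = 0.5 * sector + X @ np.full(n_slopes, 0.3) + rng.normal(size=X.shape[0])


def ssr(design, response):
    """Fit OLS by least squares and return the sum of squared residuals."""
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    resid = response - design @ coef
    return float(resid @ resid)


# Pooling model: a single constant plus the regressors.
pooled_design = np.column_stack([np.ones(len(y)), X])

# One-way fixed effects model: one intercept per subsector plus the regressors.
sector_dummies = (sector[:, None] == np.arange(n_sectors)).astype(float)
fe_design = np.column_stack([sector_dummies, X])

ssr_pooled = ssr(pooled_design, y)
ssr_fe = ssr(fe_design, y)
print(pooled_design.shape, fe_design.shape)
print(ssr_pooled, ssr_fe)
```

Because the single constant is the sum of the two subsector dummies, the pooling model is nested in the fixed effects model, so its sum of squared residuals can never be smaller. That gap is what a test of equal intercepts evaluates.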

The estimation results in the first and second columns show that if the dependent variable is the size of the nonprofit sector without volunteers, observable demand heterogeneity (ODH) has no explanatory power, as in Salamon et al. (2000). However, unlike in Salamon et al. (2000), government expenditure on education and health (GEXP) has explanatory power and its coefficient is negative in both the pooling and one-way fixed effects models.

According to both the likelihood ratio test (χ2 = 3.78) and the F-test (F-statistic = 3.23), the null hypothesis of no UDH between the two sectors (\( H_{0} :\alpha_{1} = \alpha_{2} \)) can be rejected in both tests at the 10% level of significance, and therefore a one-way fixed effects model rather than a pooling model is well suited when the dependent variable excludes volunteers. In short, there exists UDH between the two sectors. A cross term of GEXP and GFS (GEXPFS) is included in the regressors to capture the interaction effect of direct government expenditure on quasi-public goods and governmental financial support for the nonprofit provision of quasi-public goods, since both GEXP and GFS reflect decision making by a government.
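As a rough consistency check on the two statistics reported above, the following sketch converts a likelihood ratio statistic into the corresponding F-statistic for a linear model. It rests on assumptions that are not stated in the text: that the likelihood ratio statistic is computed as n ln(SSR_r / SSR_u), that the unrestricted model has eight parameters (six slopes plus two sector intercepts), and that the null hypothesis imposes a single restriction. The function name `f_from_lr` is hypothetical.

```python
import math


def f_from_lr(lr_stat, n_obs, n_params_unrestricted, n_restrictions=1):
    """Map LR = n * ln(SSR_r / SSR_u) to F = ((SSR_r - SSR_u) / q) / (SSR_u / (n - k))."""
    ssr_ratio = math.exp(lr_stat / n_obs)          # implied SSR_r / SSR_u
    df_resid = n_obs - n_params_unrestricted       # n - k
    return (ssr_ratio - 1.0) / n_restrictions * df_resid


N_OBS = 44          # 22 countries x 2 subsectors
N_PARAMS = 8        # assumption: 6 slopes + 2 sector intercepts
CHI2_1_CRIT_10 = 2.706  # 10% critical value of chi-square with 1 degree of freedom

# 3.78 is the LR statistic reported above (dependent variable excluding volunteers);
# 2.45 is the one reported for the specification that includes volunteers.
for lr in (3.78, 2.45):
    f_stat = f_from_lr(lr, N_OBS, N_PARAMS)
    reject = lr > CHI2_1_CRIT_10
    print(f"LR = {lr:.2f} -> implied F = {f_stat:.2f}, reject H0 at 10%: {reject}")
```

Under these assumptions, the likelihood ratio statistic of 3.78 implies an F-statistic of about 3.23, which agrees with the reported F-statistic, and it exceeds the 10% chi-square critical value of about 2.71.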

In order to see the marginal effect of GEXP, we take the partial derivative of Eq. 2 with respect to \( x_{2ij} \), evaluated at the sample mean of \( x_{3ij} \). From this we obtain \( \partial \left( \alpha_{j} + \sum\nolimits_{k = 1}^{6} \beta_{k} x_{kij} + \varepsilon_{ij} \right) / \partial x_{2ij} = \beta_{2} + \beta_{4} \bar{x}_{3} \), where \( \bar{x}_{3} \) is the average of \( x_{3ij} \). The estimation results of the one-way fixed effects model (the second column) show that the marginal effect is −1.65. That is, a 1% increase in GEXP, ceteris paribus, decreases the size of the nonprofit sector by about 1.65% [−1.78 − 0.14 × (−0.96) ≈ −1.65]. Corbin’s test in a framework of panel analysis, as in the augmented Corbin’s test shown in Table 2, suggests that the government failure theory can be empirically supported because, while no ODH exists, UDH does exist and the coefficient of GEXP is negative. In short, our estimation results back up the argument that the government failure theory can still be considered a robust theory in explaining why the size of the nonprofit sector varies from country to country.
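The arithmetic behind the marginal effect can be reproduced in a few lines. This sketch assumes, from the order of the terms in the bracketed expression above, that −1.78 is the estimate of β2, −0.14 the estimate of β4 (the GEXPFS cross term), and −0.96 the sample mean of x3. The variable names are illustrative only.

```python
# Values taken from the bracketed expression in the text; their mapping to
# beta_2, beta_4 and the mean of x_3 is an assumption based on term order.
beta_2 = -1.78      # coefficient of GEXP
beta_4 = -0.14      # coefficient of the GEXP x GFS cross term (GEXPFS)
mean_x3 = -0.96     # sample mean of x_3 (GFS)

# Marginal effect of GEXP evaluated at the mean of x_3: beta_2 + beta_4 * mean(x_3)
marginal_effect_gexp = beta_2 + beta_4 * mean_x3
print(round(marginal_effect_gexp, 2))
```

The cross term contributes +0.1344, so the interaction slightly dampens the direct negative effect of GEXP without changing its sign.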

However, governmental financial support as a share of revenues in the education and health sectors (GFS) has no explanatory power, and therefore, hypothesis 3 is rejected. Consequently, the complementary financing hypothesis cannot be supported empirically.

Per capita income (PCI) is used as a proxy independent variable to control for the level of economic development and the effects of economic depression. The estimation results indicate that a 1% increase in PCI, ceteris paribus, increases the size of the nonprofit sector by about 2.25%. In short, the size of the nonprofit sector is larger in countries with higher per capita income, and including PCI controls for differences in the countries’ economic scale and conditions.

The average score of Political Rights and Civil Liberties (PRCL) has no explanatory power in either the pooling or the one-way fixed effects model. This implies that differences in political framework across the countries have no impact on the size of the nonprofit sector when the dependent variable excludes the number of volunteers.

In contrast, when unpaid (volunteer) FTE employment is included in the dependent variable, a pooling model is better suited than a one-way fixed effects model for our panel data set, since the likelihood ratio test (χ2 = 2.45) and the F-test (F-statistic = 2.06) suggest that the null hypothesis of no UDH between the two sectors (\( H_{0} :\alpha_{1} = \alpha_{2} \)) cannot be rejected at the 10% level of significance in either test. Therefore, no UDH exists. In addition, the coefficient of ODH is statistically insignificant, as shown in the third column. Also, GEXP does not have explanatory power in the pooling model, thereby firmly rejecting hypothesis 1. Hence, the augmented Corbin’s test shown in Table 2 suggests that the government failure theory cannot be supported empirically when we take into account volunteer FTE employment. Governmental financial support (GFS) does not have explanatory power in the pooling model either, and the augmented James’ test shown in Table 2 therefore suggests that the complementary financing hypothesis cannot be supported, since there exists no ODH and no effect of GFS. Unlike the case of paid FTE employment excluding volunteers, PRCL has explanatory power. We posited that nonprofit activities are vigorous in countries with a high degree of democracy and political freedom and therefore expected its coefficient to be negative. However, the estimation result shows the opposite. The positive sign of the coefficient of PRCL indicates that a 1-point increase in PRCL causes, ceteris paribus, an approximately 87% increase in the size of the nonprofit sector, which suggests that the size of the nonprofit sector is larger in countries with lower degrees of democracy and political freedom.

It should be noted as a caveat to this article, however, that testing the complementary financing hypothesis corresponds more precisely to testing for conventional simultaneity among the size of the nonprofit sector, the size of government, and public financial support of nonprofit activities, in other words, to examining the correlations between \( x_{2ij} \), \( x_{3ij} \), and the error term \( \varepsilon_{ij} \). If a government is rational enough to recognize the nonprofit sector’s comparative advantage in supplying heterogeneous quasi-public goods to heterogeneous groups of the non-median voter variety, it is likely that the government will cut direct expenditures on quasi-public goods and entrust nonprofit organizations to provide them.Footnote 7 In order to examine this scenario more formally than we have done in this article, Eq. 2 should be reestimated by two-stage least squares (2SLS), with both GEXP (\( x_{2ij} \)) and GFS (\( x_{3ij} \)) treated as endogenous, as Matsunaga and Yamauchi (2002) claim. However, we are unable to apply this method because the statistical justification of 2SLS is of the large-sample type: 44 observations are not sufficient to rely on the consistency and large-sample normality of the 2SLS coefficient estimators.

In summary, we conclude that the government failure theory still offers a rational explanation for a unique feature of the nonprofit sector: the variation in nonprofit sector size from one country to another. This conclusion changes when nonprofit sector size is defined so that it includes not only paid FTE employment but also FTE volunteers. In that case, the government failure theory can no longer be empirically supported.

Summary and Conclusion

In this article we revisited the research done by Salamon et al. (2000) and empirically examined whether the government failure theory provides a rational explanation for why the size of the nonprofit sector varies from one place to another. Applying Corbin’s (1999) test to a panel analysis, we specified how to perform an empirical examination of the robustness of the government failure theory, namely an augmented Corbin’s test. We have also demonstrated how James’ complementary financing hypothesis can be examined empirically using panel data.

In order to alleviate the small sample problem, we created pooled data using CNP data and reexamined specifications different from those of Salamon et al. (2000). This process expanded the sample size from 22 to 44, and we carried out a panel analysis. As a result, the estimation of a one-way fixed effects model in this article revealed that the government failure theory has a good chance of giving a rational explanation for a unique feature of the nonprofit sector: variance in the size of the sector from one place to another. Our estimation results using a pooled data set imply that Salamon et al. (2000) may have suffered from specification error and/or small sample problems.

The conclusions of this analysis should be interpreted with care, since the government failure theory fails to explain variance in the size of the nonprofit sector when we include unpaid volunteer FTE employment in the dependent variable. This provides an opportunity for discussing the measurement of the nonprofit sector’s size and the scope of unpaid volunteer employment. Also, our empirical result indicated that democracy and political freedom have a negative effect on the size of the nonprofit sector, contrary to our expectations. The index score employed shows a small variance in the degree of democracy and political freedom among the countries used in the case study, and does not completely quantify the differences in or variety of political regimes and types of democracy. Room remains for an examination of the validity of a proxy variable for political framework and condition.

In addition, it should be noted that the selection of countries in our data is contingent upon data availability. In particular, data for FTE volunteers depend highly upon local researchers’ accessibility. Since this could cause a serious measurement error in the dependent variable including FTE volunteers, our estimation results in the third and fourth columns of Table 5, which do not support the government failure theory, could be misleading. Also, because mapping the nonprofit sector in Africa, the Middle East, and China is not an easy task, we currently have no choice but to depend on the Johns Hopkins Comparative Nonprofit Sector Project if we want to know about international trends in the nonprofit sector.

Although this article shows that the government failure theory should not be so easily rejected, it is clear that, due to several caveats in this article, further empirical studies on the robustness of the government failure theory are desirable.