1 Introduction

The re-emergence of interest in the determinants of growth has focused attention on the puzzle of the apparent non-convergence of per-capita income between low and high-income countries. One response to this dilemma has been what has come to be known as ‘endogenous growth’ theory, which in various forms drops the assumptions of the exogeneity of technological change, and the homogeneity of investment opportunities across countries (see Lucas, 1988; Romer, 1986, 1990 as seminal examples). The convergence prediction in Solow-Swan type growth models is of course conditional on homogeneity of the savings rate, the labour force growth rate, and the technology of production. Once the possibility of heterogeneity of countries in the relatively limited dimensions provided by Solow-Swan type growth models is recognised, the possibility that heterogeneity in other dimensions may be of significance to growth follows readily. Thus, to name a few, differences in the level of human capital (Barro, 1991; Mankiw, Romer, & Weil, 1992), the depth of financial development (King & Levine, 1993), the nature and quality of government intervention in economic processes (Barro, 1990; Fischer, 1991; King & Levine, 1993), have all been controlled for in growth equations. Yet even after correcting for a wide variety of additional explanatory variables, many growth equations struggle to account for cross-country variation in growth, particularly in Africa and Latin America (see for example Barro, 1991; Easterly & Levine, 1997).

Easterly and Levine (1997: 1205), in an influential article, drew attention to the potentially important role of ethnic diversity in influencing economic growth: “cross-country differences in ethnic diversity explain a substantial part of the cross-country differences in public policies, political instability, and other factors associated with long-run growth.” They find that ethnic diversity is an important predictor of economic performance, and of the African growth experience. Central to Easterly and Levine’s perspective is the question of whether the level of ethnic fractionalization bears on the potential for distributional conflict. Specifically, they hypothesize that, in the light of a significant body of recent literature, polarized societies will be prone to competitive rent-seeking that impedes agreement about the provision of public goods and that creates positive incentives for growth-reducing policies. This is attributed either to the pursuit of redistributive policies defined along ethnic lines, or to the fact that ethnically diverse societies are more prone to war and unrest which in turn is bad for economic growth.

A limitation of these studies on fractionalization is that, whilst they emphasize cross-country evidence, they do not provide a basis for exploring the dynamics of the process. That is, they do not provide a basis on which to establish how growth interacts with ethnolinguistic diversity. There are at least two dimensions to this question. The first is that evidence such as that presented by Easterly and Levine (1997) does not make explicit that African economic growth fell dramatically from the 1960s to the 1970s, and has remained dismal subsequently, at least for most of the continent. This leaves one with the puzzle of how a variable that remains constant for all countries (the ethnolinguistic measure employed in many studies is for the single year 1960) is to account for a series of growth rates that follow the shape of a step function, with the step occurring some considerable time after the point of measurement of the ethnolinguistic measure generally employed in these studies.

The second dimension to the question addresses whether or not it is legitimate to view ethnolinguistic fractionalization as a constant over time. Is it instead not at least feasible that ethnolinguistic diversity changes with growth? If so, in what direction? And why might such a change occur? To answer these kinds of questions, a time series based study is needed.

South Africa constitutes an interesting case in which to explore the purchase of the claim concerning the (negative) impact of ethnolinguistic fractionalization on growth. In terms of the Ethnolinguistic Fractionalization Index (ETHNIC) cited by Easterly and Levine (1997), South Africa in 1960 ranked 6th out of 66 countries covered by this index with a value on the measure of 88 (that is a probability of 88% that two randomly chosen individuals are of different ethnolinguistic origin). The most highly fractionalized country was Tanzania (score of 93) and the least fractionalized were Haiti, Japan and Portugal (score of 1). South African census data collected since 1910 provides historical evidence of this fractionalization. Important, too, in the South African case is that from 1948 through to the demise of Apartheid, a major attempt was made to socially and politically “engineer” South Africa along the lines of supposedly “essentialist” ethnic identities. At least with regard to the “African” community these ethnic divisions were meant to coincide with linguistic identities. That South Africa continues to be linguistically highly fractionalized is registered, at least in part, by the fact that the present (post 1994) Constitution acknowledges linguistic diversity through the legal recognition of eleven official languages. If fractionalization is an important explanatory variable in the long run growth process of a country, South Africa is ideal to test this hypothesis. That data also proves to be available for this case study renders it doubly interesting, since it becomes susceptible to rigorous analysis. This is reinforced by the fact that a number of studies have already demonstrated that the social, political and institutional variables (SPIVs) available for South Africa prove to be important in driving the long run economic performance of the economy.Footnote 1

This paper introduces, describes and analyses several new sets of fractionalizationFootnote 2 indicators for South Africa: ethnolinguistic, religious and cultural fractionalization, and a polarization measure—most of which go back to either 1910 or 1936. The societal indicators are, respectively, linguistic, religious and racial fractionalization. The choice of these indicators was occasioned by the desire to establish whether, and if so in what degree, social diversity might help explain the long-run growth performance of the South African economy.Footnote 3 What is unique to the study is the use of time series data over an extended time frame in a wide range of institutional dimensions.

Section 2 presents a review of the existing literature on fractionalization. Section 3 notes a range of measurement issues relevant to the institutional variables employed in the study. Section 4 examines the significance of dynamics in measures of fractionalization. Section 5 explores the question of whether the level of aggregation in the fractionalization measures carries analytical significance. Section 6 concludes.

2 Review of previous literature

The central question confronted by the literature has concerned why fractionalization might matter for long-term development prospects. A number of proposed mechanisms have emerged.

The first suggests that diversity carries positive consequences for economic growth. The reason for this is that a large diversity of abilities, experiences and cultures generates a productive, innovative and creative environment.Footnote 4 In addition, diversity increases with the size of the underlying population, and greater population size is associated with standard benefits associated with economies of scale, internalization of any (particularly policy) externalities, market size, easier defence and protection from external aggression, and readier insurance from stochastic shocks.Footnote 5

The more conventional mechanism suggests the presence of a negative association between diversity and growth prospects. This argument posits two alternative mechanisms. In one, the link is generated by the fact that diversity increases the probability of conflict, and instability in turn is inimical to economic growth prospects.Footnote 6 The second link is through the quality of public goods provision. Taking for granted the existence of a positive impact of public goods provision on economic growth, the presence of diversity in ethnic identity has been argued to disrupt public goods provision,Footnote 7 to produce multiple policy distortions,Footnote 8 or to lead to public goods provision which is inappropriate.Footnote 9 All of these approaches share the common insight that a conflict of preferences generates policies that are counterproductive, and in the extreme case leads to an oppression of minorities which may generate political unrest or civil war.

That link between diversity and development is not uncontested, however. Thus Acemoglu, Johnson, and Robinson (2004) argue that growth in the final instance depends on institutions, not on the underlying diversity of the population. In their argument, the contrast between the North American and Australasian experience on the one hand, and the African on the other, lies in the alternative colonial histories they faced. In the former, colonialism brought with it the transfer of forward-looking, sound institutional mechanisms that proved to be growth-compatible or growth-enhancing. By way of contrast, the African experience was one of a scramble for resources by the colonial powers, with little attention being paid to the construction of institutional mechanisms beyond those focussed on the maximal extraction of resource-bases.

An alternative to the suggestion that growth depends either on institutions or on fractionalization is that fractionalization may interact with institutional structure. A number of contributions have emerged in this vein. Thus democracies have been argued to be better able to deal with diversity,Footnote 10 institutions are modelled as endogenous to the nature of intergroup conflict,Footnote 11 fractionalization in Africa has been suggested to be endogenous to poor colonial institutions,Footnote 12 or the impact of diversity has been suggested to depend crucially on the quality of institutional dispensation.Footnote 13

A range of additional questions has been raised with respect to the micro-structure of the link between diversity and growth. First, the very concept of ethnic identity on which measures of diversity are generally based, has been argued to be subject to conceptual ambiguity. In particular, ethnic identities are viewed as malleable, subject to change over time.Footnote 14 Since individuals possess multiple ethnic identities at any given point in time, which identity proves salient depends on the circumstance in which agents find themselves. The consequence is that there exist multiple dimensions of ethnic cleavage in any country at any given time,Footnote 15 and instead of the fixed and unchanging entities assumed by some of the early contributions to the debate,Footnote 16 ethnic groups are instead the products of political and historical processes.Footnote 17

Measurement issues have loomed large in the debate, with a wide range of issues being raised. Given the presence of multiple forms of social cleavage pointed out in the preceding discussion, it follows immediately that the first hurdle of measurement is that the wrong cleavage may be identified by any formulaic or essentialist definition of ethnic or indeed any other (such as religious) form of identity.Footnote 18 Second, the number and size of groups along any dimension of cleavage may change endogenously over time, necessitating not only the use of relevant measures of cleavage, but also the use of time series data.Footnote 19 Third, the relevant index of measurement has itself been the subject of considerable contestation. The standard measure in the literature has proved to be the ELF measure of fractionalization, which identifies the probability that two randomly chosen individuals come from different groups.Footnote 20 An alternative index of fragmentation is provided by the RQ measure of polarization, which accounts for the relative size distribution of groups within the population by applying a larger weighting to relatively large groups which may be more likely to engage in conflict with one another.Footnote 21 Both of the preceding indexes suffer from the limitation that they are insensitive to the cultural distance between ethnic groups, prompting the proposal of the CF index of cultural diversity, which incorporates cultural distance.Footnote 22 Finally, the literature has pointed to problems in the quality of the underlying ethnographic data,Footnote 23 and to the possibility that any single measure of diversity may obscure relevant features of the ethnic landscape, either because the index is insensitive to relevant variation,Footnote 24 or because it contains no information about spatial distribution,Footnote 25 or about multiple dimensions of division.Footnote 26
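For reference, the three indexes named above are commonly written as follows. This is a sketch in standard notation rather than a restatement of the definitions in the footnoted sources: the symbols are assumptions introduced here, with s_i the population share of group i, N the number of groups, and r_ij a resemblance factor between groups i and j that lies between 0 and 1.

```latex
\begin{align}
\mathrm{ELF} &= 1 - \sum_{i=1}^{N} s_i^{2}, \\
\mathrm{RQ}  &= 1 - \sum_{i=1}^{N} \left(\frac{1/2 - s_i}{1/2}\right)^{2} s_i
              = 4 \sum_{i=1}^{N} s_i^{2}\,(1 - s_i), \\
\mathrm{CF}  &= 1 - \sum_{i=1}^{N} \sum_{j=1}^{N} s_i\, s_j\, r_{ij}.
\end{align}
```

In this notation ELF approaches 1 as the population splits into many small groups of equal size, whereas RQ reaches its maximum of 1 when two groups of equal size face each other. CF reduces to ELF when r_ij equals 1 for i = j and 0 otherwise.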

In the context of estimation, a number of additional concerns have emerged. Most important of these has been the concern that measures of diversity may be endogenous to processes of economic development. Growth itself has been argued to cause increased diversity,Footnote 27 or indeed to act as an incentive to assimilate into dominant linguistic or ethnic groups.Footnote 28 Alternatively, fractionalization may be endogenous to the institutional environment.Footnote 29 The retort in the literature has been that any feedback from growth to diversity is simply not proximate enough to be of any empirical consequence, and also that if it is true that growth raises diversity, and diversity lowers growth, then finding a negative association between diversity and growth despite the countervailing forces strengthens the finding of any negative association between fractionalization and economic performance.

A final class of concerns in the empirical literature has addressed the possibility that any partial correlation between diversity measures and economic performance may be spurious. Thus the suggestion is that both diversity and growth may be correlated with the level of development,Footnote 30 the age of the state,Footnote 31 geography,Footnote 32 civil wars,Footnote 33 and institutions and government policy.Footnote 34

Thus as Alesina and La Ferrara (2005) point out, the precise mechanisms governing the endogenous formation of identity, why precisely some dimensions of diversity prove to be salient to development, while others do not, the governing mechanisms of social mobility and their interaction with measures of diversity, and issues of measurement remain incompletely resolved in the debate.

3 Fractionalization: reflecting on the significance of measurement problems

Measures of ethnolinguistic fractionalization have come to be widely used in growth studies. In addition, the literature has seen the introduction of a variety of measures of such fractionalization.Footnote 35 Since the present study introduces a range of new measures of fractionalization, in a number of dimensions, we reflect briefly on some issues surrounding both the measurement and interpretation of linguistic and ethnic fractionalization.

The fractionalization indexes employed for the present study were constructed by consulting official South African government statistical sources which are based on census data. Between 1910 and 1960 our principal source has been the Union Statistics for 50 Years and various other official statistical yearbooks. Subsequently the Central Statistical Services Reports were our primary sources as well as various statistical reports of the apartheid-created ‘independent homelands’.

We begin our discussion of fractionalization with a general methodological warning that applies repeatedly in what follows. Obtaining consistent and reliable data series on linguistic usage and other forms of fractionalization in South Africa was non-trivial. The scope and accuracy of coverage of different population groups has not been consistent over time. Particularly with respect to Africans, data collection in early years was incomplete, rendering the data unreliable. Moreover, under Apartheid, with the repeated creation, redefinition, and disappearance of supposedly ethnically defined “Homelands,”Footnote 36 significant portions of the African population disappeared, moved classification, and reappeared in statistical sources. Finding, reclassifying, and reinserting such populations was not a trivial task. The implication is that the fractionalization data suffers substantially from data collection problems. After 1970 the active creation of Homelands for Africans under Apartheid policies shifted responsibility for census data collection away from Pretoria. For example, data collection on first language orientation by Homelands was negligible. The consequence is an error of measurement in any use of official data series, with a downward bias in the resulting linguistic fractionalization. The fractionalization indices reported in the present study are corrected for any Homeland exclusion.

The general point here is that the demands of this undertaking, even for a country such as South Africa, which relative to many other developing countries has fairly extensive and sound data collection procedures, raise significant questions concerning the quality of data for developing countries in general. There, data collection is frequently even more problematic than in South Africa, and high levels of fractionalization require high levels of accuracy.Footnote 37

For example, the highest levels of linguistic fractionalization reported in Alesina et al. (2002) are for the Cameroon, Chad, the CAR, the DRC, Ethiopia, the Gambia, Guinea-Bissau, Kenya, Liberia, Mali, Mozambique, Nigeria, Philippines, South Africa, Tanzania, Uganda, and Zambia, all of which report a probability value of above 0.8. The developing countries with the highest levels of fractionalization are quite likely to have the most contentious data being employed in cross-sectional growth studies. Furthermore, the data employed by Alesina et al. for a wide range of countries rely on the Encyclopedia Britannica, the CIA World Factbook, Scarritt and Mozaffar (1999), Levinson (1998) and the World Directory of Minorities of the Minority Rights Group International (1997). The South African data comes out of the Levinson work and differs from that calculated in this paper from primary data. A cursory glance through these sources reveals some of the problems. For example, the CIA Factbook lists South African ethnic groups on the basis of the four racial population groups. While racial fractionalization in South Africa is undoubtedly important (and we deal with it in a separate section below), ethnic divisions cleave racial groupings also. This is well illustrated by the high degree of fractionalization within the African population group in South Africa, where linguistic grouping does correlate to some extent with ethnic identity. The CIA Factbook also lists the religious groups as follows: Christian, Muslim, Hindu, and Indigenous with no further differentiation, whereas for South Africa more detailed data is available. Note further that for other countries in the Alesina et al. data set these categories are further disaggregated. An immediate question must therefore be whether like is being compared with like in international cross-country estimations.

In similar vein, Posner (2004) criticizes the ETHNIC measure used in growth studies, which is based upon the Atlas Narodov Mira. He highlights the ‘grouping problem’ in which umbrella categories sometimes subsume groups that are clearly distinct and often highly antagonistic. He cites the Tanzanian example in which the Nyamwezi and Sukuma are collapsed into a single category despite the fact that these groups are very distinct (culturally, linguistically and geographically) and are keen political competitors. In fact both the Rwanda and the Burundi measures collapse the Hutus and Tutsis into a monolithic category, which ignores a fundamental cleavage that played a role in the devastation faced by both countries towards the end of the 20th century. On other occasions he points to the ‘problem of inclusion’, where the Atlas study distinguishes linguistic groups that are irrelevant as political actors. Posner does not deny the importance of ethnic groups in the Atlas but rather claims that they are often unimportant for the political measure that ethnolinguistic fractionalization is trying to test, namely the competition for power and resources.

Posner (2004) furthermore takes issue with the use of Herfindahl indexes more generally and questions whether they are the best means of capturing a country’s diversity, because they are often insensitive to a great deal of variation in the ethnic landscapes being compared. A further problem he highlights is that the index contains no information about the depth of the divisions between different groups. It also provides no information about the relative political weights to be assigned to various ethnic groups.Footnote 38 So for example, the relatively small white proportion in South Africa does not capture the impact of this group on the country’s political landscape. The final problem presented by Posner is that these single measures contain multiple ethnic and political cleavages which change over time. He cites the Indian example of a population which is divided along ethnic, religious, linguistic and caste lines and questions which group division is more relevant.
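The insensitivity of a Herfindahl-type index can be illustrated with a minimal sketch. The group shares below are hypothetical and chosen only for illustration; the function names `elf` and `rq` are labels introduced here, and the polarization formula is the commonly used 4·Σ s²(1 − s) form, assumed rather than taken from the sources cited in the text.

```python
from fractions import Fraction

def elf(shares):
    """Herfindahl-based fractionalization: 1 minus the sum of squared shares."""
    return 1 - sum(s * s for s in shares)

def rq(shares):
    """Polarization index in the 4 * sum(s^2 * (1 - s)) form (assumed here)."""
    return 4 * sum(s * s * (1 - s) for s in shares)

# Hypothetical group shares, not drawn from any census.
two_equal_groups = [Fraction(1, 2), Fraction(1, 2)]
one_dominant_group = [Fraction(2, 3), Fraction(1, 6), Fraction(1, 6)]

for name, shares in [("two equal groups", two_equal_groups),
                     ("one dominant group", one_dominant_group)]:
    print(f"{name}: ELF = {float(elf(shares)):.3f}, RQ = {float(rq(shares)):.3f}")
```

Both landscapes return a fractionalization value of exactly 0.5, although one describes two evenly matched groups and the other a two-thirds majority facing two small minorities. The polarization measure does separate them (1 against 7/9), which is one reason a single summary index may be a poor guide to the potential for distributional conflict.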

While issues of measurement are undoubtedly crucial, these extend beyond simple questions of accuracy of measurement. In the sections that follow in this paper we consider a range of additional questions that arise in the context of appropriate construction, and interpretation of measures of fractionalization. The intention of the discussion is to move beyond the consideration of static measures with wide geographic sweep. Instead, the object of the discussion is to consider questions that arise from the introduction of time, and questions of aggregation that arise from in-depth evidence to emerge from a specific case study.

4 Dynamics: do changes over time exist in measures of fractionalization, and do they carry analytical significance?

Data questions surrounding fractionalization extend beyond problems of measurement. An equally important set of considerations, to be explored over the course of the following sections, concerns the appropriate interpretation of the evidence that emerges from the data.

The first question arises from the fact that studies employing measures of fractionalization frequently do so on the basis of data that have an observation at a single time point (Atlas Narodov Mira; Alesina et al., 2002). These observations bear a considerable analytical burden as explanatory variables of dynamic processes in economic development, generally per capita GDP. Moreover, some data sets have observations for distinct years across countries (the measure is not for the same year across countries). Implicit in the use of either set of fractionalization measures is the presumption that measures of fractionalization are essentially static, and exogenous to processes of economic development. Only on this presumption would the time point of measurement and/or the time-invariant nature of the fractionalization measure be irrelevant to estimation of the fractionalization impact. The immediate question is then whether the presumption of an unchanging magnitude of the fractionalization measures employed in studies is justified.

Evidence from South Africa at first blush suggests that it may be. In Table 1 we report measures of linguistic and religious fractionalization for the aggregate South African population, for available census years.Footnote 39 The presumption that measures of fractionalization are relatively invariant over time appears to be vindicated, with both linguistic and religious fractionalization changing relatively little in aggregate for South Africa over a period of more than half a century.

Table 1 Fractionalization measures for available South African census years

However, the same cannot be said of the measure of racial fractionalization also reported in Table 1. While the measure remained fairly constant over the 1911–1960 period (at a probability of approximately 0.49), from 1960 to 2001 fractionalization declined substantially in relative terms to end with a probability of 0.36 in 2001.

The distinction between the three measures of fractionalization is particularly important when placed into the context of the level of distributional conflict in South Africa. A number of studies have examined the impact of distributional conflict on growth in South Africa, including its impact on investment in physical capital stock. Results are unanimous in confirming a negative impact of a measure of political instability on either output directly, or on investment rates.Footnote 40 Yet, the measure is not readily tied to our measures of fractionalization in any manner that readily conforms to the priors that arise from the literature on fractionalization. In Fig. 1 we depict the measure of political instability (as proxy for distributional conflict),Footnote 41 as well as the measures of fractionalization.

Fig. 1 Fractionalization measures and distributional conflict

The measure of distributional conflict in South Africa identifies a number of periods of intense conflict: during the 1960s, the 1970s, and particularly the 1980s. This much is conventional wisdom about South Africa. But note that the aggregate linguistic fractionalization measure for South Africa remains virtually unchanged, both over periods of stability (1946–1964; the 1990s) and over periods of relative instability. Similarly, the measure of religious fractionalization shows its period of increase before the periods of intense conflict, and remains virtually static over the period of intense conflict.Footnote 42

Instead, the most likely source of distributional conflict, if such a source is to be found in a fractionalization measure, comes from the measure of racial fractionalization. This is the only fractionalization measure that demonstrates significant change from the period preceding significant distributional conflict, to the period in which distributional conflict rose dramatically. That distributional conflict in South Africa might most plausibly be linked to racial divisions in the society does not seem a dramatic finding to have reached. After all, what could possibly constitute a more clichéd view of the South African political landscape?

But there are at least four considerations in relation to the empirical evidence that nuance the findings from cliché to insight. First, note that in contrast to the measure employed by Alesina for fractionalization in South Africa, the measure of fractionalization most plausibly related to distributional conflict is not static, but strongly changes over the sample time frame. Indeed, it is this very change in the fractionalization measure that usefully might be said to distinguish between periods of relative stability and instability in South Africa.

Second, note that the direction of change in the racial fractionalization variable is the opposite to that predicted by the literature. It is not rising, but falling, racial fractionalization that appears to be associated with rising political instability, though by the 1990s political instability quietens again despite a continued decline in racial fractionalization which continues to this day. One response to this finding is to conjecture that the association between fractionalization and distributional conflict is more non-linear than generally hypothesized in the literature. Very high fractionalization measures may reflect iso-sized groups, in stable equilibrium rather than in distributional struggle. The relative strength of each grouping may be such as to render the cost of conflict prohibitive to each. Instead, it is precisely the disturbance of such an equilibrium through increased pressure toward homogenization, brought about by a relative gain in one or another of society’s constituent groups, that may trigger distributional conflict, either through aggression (such as increased repression of a growing disadvantaged group), or as a defensive measure (such as rising political resistance in the face of legislative measures designed to disempower). Eventually, the relative preponderance of one group may become sufficiently large to once again render the cost of conflict prohibitive for other groups, such that sufficiently high homogeneity in turn leads once again to political accommodation and relative stability.Footnote 43

A third consideration might concern the direction of causality between instability and the fractionalization measure. Racial fractionalization might well have fallen at least in part due to white emigration, in the face of rising opposition to the maintenance of white privilege. Under this interpretation of the evidence causality would run from instability to racial fractionalization, rather than vice versa. A further consideration here might be to question again the interpretation of the interaction between religious fractionalization and political instability—of whether rising religious fractionalization might have led to distributional conflict—though with a lag.

To investigate these questions we examine the interaction between the three measures of fractionalization and political instability, employing the bounds analysis of Pesaran, Shin, and Smith (2001).Footnote 44 We report the resultant F-tests in Table 2.
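The logic of testing for a direction of association with a bounds-type F statistic can be sketched as follows. This is a simplified illustration under stated assumptions, not the specification estimated in the paper: the conditional error-correction regression here has an intercept and no lagged differences, the data are synthetic, and the helper names `solve`, `rss` and `bounds_f` are hypothetical. The statistic is the ordinary F statistic for excluding the lagged levels, but under the bounds approach it must be compared with the critical value bounds tabulated by Pesaran, Shin, and Smith (2001) rather than with standard F tables; those bounds are not reproduced here.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def rss(rows, y):
    """Residual sum of squares from an OLS regression of y on the columns in rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def bounds_f(y, x):
    """F statistic for excluding the lagged levels y[t-1] and x[t-1] from
    dy[t] = c + a*y[t-1] + b*x[t-1] + d*dx[t] + e[t] (zero-lag case, an assumption)."""
    periods = range(1, len(y))
    dy = [y[t] - y[t - 1] for t in periods]
    unrestricted = [[1.0, y[t - 1], x[t - 1], x[t] - x[t - 1]] for t in periods]
    restricted = [[row[0], row[3]] for row in unrestricted]
    rss_u, rss_r = rss(unrestricted, dy), rss(restricted, dy)
    n_obs, n_params, n_restrictions = len(dy), 4, 2
    return ((rss_r - rss_u) / n_restrictions) / (rss_u / (n_obs - n_params))

# Synthetic data only: x is a random walk, and y adjusts towards x with a one-period lag.
rng = random.Random(0)
x, y = [0.0], [0.0]
for _ in range(80):
    x.append(x[-1] + rng.gauss(0, 1))
    y.append(y[-1] + 0.5 * (x[-2] - y[-1]) + rng.gauss(0, 0.2))

f_forward = bounds_f(y, x)  # y as outcome: association running from x to y
f_reverse = bounds_f(x, y)  # x as outcome: association running from y to x
print(f"F (x -> y) = {f_forward:.2f}, F (y -> x) = {f_reverse:.2f}")
```

Because the synthetic y is constructed to adjust towards x while x evolves independently, the statistic with y as outcome should be far larger than the one with x as outcome. Reading a table of such statistics in both directions is what allows a direction of association to be inferred.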

Table 2. PSS F-tests for direction of association

The evidence is instructive. First, the F-tests favor a direction of association running from racial fractionalization to political instability, with no evidence in favor of the reverse direction of association.Footnote 45 Second, there is no evidence of religious fractionalization impacting on political instability, nor of religious fractionalization being influenced either by any other measure of social cleavage or by the political instability measure. Third, it is noteworthy that the different measures of fractionalization appear to hang together, with linguistic fractionalization in particular appearing as an outcome variable flowing from other forms of social cleavage.

The fourth and final implication that follows from the evidence is that the trigger for distributional conflict in a society is at least not universally associated with a single measure of social cleavage. In South Africa linguistic fractionalization, at least at the aggregate level, is difficult to associate with variations in political instability, while racial fractionalization is more readily so associated.

The general point thus is that while there may well be links between distributional conflict and measures of fractionalization, such links may prove to be contingent on history and social context. Appropriate measures of fractionalization to be associated with distributional conflict may thus be heterogeneous across societies.Footnote 46

5 Is distributional conflict a problem emergent at the aggregate level?

Thus far we have considered data only at the aggregate level for South Africa. Both the linguistic and the religious fractionalization measures ignored any information to emerge at more disaggregated levels—for racial groups, for instance.

Yet given the evidence to have just emerged on the impact of racial fractionalization on political instability, it is not clear that aggregate measures of fractionalization are appropriate for capturing distributional conflict, at least in South Africa. Instead, distributional conflict may be the expression of developments that are located at greater levels of disaggregation than can be captured by the simple summary measures reported thus far.

In order to examine the plausibility of this hypothesis we measure linguistic fractionalization, disaggregated so as to capture not only the overall level of fractionalization, but also the levels of fractionalization within the racial groups as they were officially designated in the period of segregation prior to 1948, and which became statutorily encoded after 1948 under Apartheid.Footnote 47 In this way we are better able to capture dynamic processes and potentially interesting patterns of change that might not be observed if one were to focus only on the aggregate data. (Table 3)

Table 3 Fractionalization measures for available South African census years—the racial breakdown

We have already noted that the aggregate measures of both linguistic and religious fractionalization in South Africa remain constant over protracted periods of time. By contrast, the disaggregated fractionalization indexes in both dimensions, when decomposed by race, show substantial variation over time. In effect, the aggregate evidence on linguistic and religious fractionalization in South Africa is misleading in the sense that it fails to reflect the substantial extent of social change that occurs within the society.

A number of features stand out from the evidence:

  • Only two out of eight racially decomposed measures of linguistic and religious fractionalization show stability of the same order as the aggregate measures: African linguistic fractionalization and Coloured religious fractionalization. All other racially disaggregated measures in either dimension report relatively strong change over time.

  • Of the four racial groupings in South Africa, the two smallest, Coloureds and Asians, show the most dramatic change over time. Most spectacular of all, Asian linguistic fractionalization falls from a probability measure of 0.78 in 1970 to 0.10 in 1991, reflecting a switch from a range of Indian first languages to English over a period of roughly 20 years. Linguistic fractionalization amongst Coloureds has risen rather than fallen, from 0.21 to 0.29 over the same period, but it too has done so by virtue of a linguistic switch to English as first language, from Afrikaans in this instance. Over the period in which Asian linguistic fractionalization is falling dramatically, Asian religious fractionalization is increasing strongly, from 0.47 to 0.61.

  • One of the two “dominant” racial groups of South Africa also shows an instance in which a stable pattern of fractionalization is disturbed. Here, however, the disturbance is relatively temporary, with the former stable fractionalization pattern reemerging over time. Thus white linguistic fractionalization rises from 1970 to 1980, with the entry of significant Portuguese immigration from the two former Portuguese colonies of Mozambique and Angola. However, the increase was rapidly reversed, with fractionalization returning to its former level by 1991, as the Portuguese immigrants were assimilated into the Anglophone white linguistic grouping.

Two general implications are suggested by this evidence.

The first is that aggregate measures of fractionalization may prove to be substantively misleading as indicators of underlying potential for distributional conflict. In South Africa, the aggregate measure of linguistic fractionalization simply hides substantial movement in the extent of cleavage that emerges at more disaggregated levels, and which might conceivably have triggered distributional conflict. Linguistic fractionalization at the micro level appears considerably less stable, with dramatic and sustained change appearing over a relatively short time period. The corollary is that testing for the impact of fractionalization on instability, or growth, cannot necessarily simply have recourse to aggregate measures in any dimension, since the real change of relevance triggering the conflict might not be observable at the aggregate level.
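The masking effect can be illustrated with a minimal sketch. It assumes the standard Herfindahl-based index, that is, one minus the sum of squared group shares, which can be read as the probability that two randomly drawn individuals belong to different groups. The head counts and language labels are hypothetical and are not the census figures underlying Table 3.

```python
from collections import Counter


def fractionalization(counts):
    """Return 1 - sum(share_i ** 2) for a mapping of group -> head count."""
    total = sum(counts.values())
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


def aggregate(period):
    """Pool all subgroups into one population-wide count per language."""
    pooled = Counter()
    for counts in period.values():
        pooled.update(counts)  # Counter.update adds counts, it does not overwrite
    return pooled


# Hypothetical head counts (thousands) by first language within two subgroups.
# Between the periods only the small subgroup changes: it converges on language G.
period_1 = {
    "large": {"A": 4000, "B": 3500, "C": 2500},
    "small": {"D": 100, "E": 100, "F": 100},
}
period_2 = {
    "large": {"A": 4000, "B": 3500, "C": 2500},
    "small": {"G": 285, "D": 5, "E": 5, "F": 5},
}

for label, period in (("period 1", period_1), ("period 2", period_2)):
    print(
        label,
        "aggregate:", round(fractionalization(aggregate(period)), 4),
        "small subgroup:", round(fractionalization(period["small"]), 4),
    )
```

Under these assumed counts the aggregate index changes by less than 0.001, while the index within the small subgroup falls from about 0.67 to about 0.10. A test that relied on the aggregate series alone would register essentially no change.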

The second implication is potentially both more interesting and of greater significance for estimation purposes. We have already noted that larger marginalized minorities in South Africa show a relatively rapid degree of acceptance of the language of trade and industry: English. The preceding evidence reports that the Coloured, Portuguese and, in particular, Asian communities of South Africa show substantive tendencies to adopt English as their main language. Such changes may reflect decisions to improve employment and economic prospects by acquiring the linguistic currency of economic activity. In effect, this encompasses a choice to invest in more appropriate forms of both human and social capital in order to increase the prospects for both individual occupational mobility and the reduction of the transactions costs entailed in participation in the economy.Footnote 48 The suggestion is that at least larger, more significant minorities face considerable incentives to assimilate into dominant linguistic patterns.Footnote 49 By contrast, small minorities (such as the German and/or French speakers in the White population) may be able to retain linguistic identity longer since they are insignificant enough to be left alone and develop their own cultural enclaves within the broader society.

The net consequence of this may be that linguistic fractionalization at the micro level manifests the lack of stability noted above. Importantly, the implication is that linguistic fractionalization is no longer exogenous (a constant), but becomes endogenous to the potential for distributional conflict.Footnote 50 Those population groups most at risk from conflict (large minorities) have an incentive to assimilate, while either larger groupings and/or small minorities have the means to maintain their identity over time.

In turn, this carries a potential challenge for the notion that fractionalization necessarily leads to distributional conflict between linguistic groups. At the very least one would have to recognize that in the South African context minority groups had an incentive to integrate into the prevailing language of commerce and industry, and did so with alacrity. Perhaps this is true only of minorities, and in the presence of roughly equally sized groups (e.g. Belgium) the story might be different. Indeed the lengthy stand-off between white and African population groups in South Africa would suggest that where “opponent” groups are of sufficient size or influence, distributional conflict may indeed be protracted and intense. Nevertheless, the evidence points to the importance and significance of nuance and clinical study as a valuable extension of aggregate or summary cross-sectional evidence.

Lastly, we examine the implications of using alternative fractionalization measures in the South African case. We focus on two particular measures which have been widely employed recently, namely a polarization measure (Montalvo & Reynal-Querol, 2005) and a cultural fractionalization measure (Fearon, 2003). The calculation of these measures was discussed in Sect. 2. The polarization measure accounts for the relative size distribution of groups within the population by applying larger weighting to relatively larger groups which may be more likely to engage in conflict with one another. This follows the seminal work by Horowitz (1985) which argues that the relationship between ethnic diversity and conflict is not monotonic. There is less violence in highly homogeneous and highly heterogeneous societies, and more conflict in societies where a large ethnic minority faces an ethnic majority. Table 4 reports our polarization calculation for South Africa from 1946, denoted RQ.
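The non-monotonic relationship can be made concrete with a short sketch. It assumes the usual forms of the two indexes: ELF as one minus the sum of squared shares, and RQ as 1 - sum(((0.5 - p_i) / 0.5) ** 2 * p_i), following Montalvo and Reynal-Querol (2005). The group shares are hypothetical and are not taken from Table 4.

```python
def elf(shares):
    """Fractionalization: 1 - sum of squared population shares."""
    return 1.0 - sum(p ** 2 for p in shares)


def rq(shares):
    """Polarization (assumed RQ form): penalizes distance of each share from 0.5."""
    return 1.0 - sum(((0.5 - p) / 0.5) ** 2 * p for p in shares)


# Hypothetical societies, ordered from homogeneous to highly fragmented.
scenarios = {
    "one group": [1.0],
    "two equal groups": [0.5, 0.5],
    "majority facing large minority": [0.6, 0.4],
    "ten equal groups": [0.1] * 10,
}

for name, shares in scenarios.items():
    print(f"{name:32s} ELF={elf(shares):.2f}  RQ={rq(shares):.2f}")
```

On these assumed shares ELF keeps rising as the number of equally sized groups grows, whereas RQ peaks where two large groups face one another and declines again as the society fragments further. The two indexes can therefore rank the same societies quite differently.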

Table 4 Comparison of ethnolinguistic fractionalization (ELF), polarization (RQ) and cultural fractionalization (CF) measures for South Africa

Cultural fractionalization measures the ‘ethnic distance’ across groups in order to obtain indicators of cultural diversity. Fearon measures these distances in terms of the proximity in a tree diagram of the families of languages of different groups. In the South African case we calculate the linguistic distances on the basis of language identification provided in the Ethnologue by Gordon (2005).Footnote 51 Our cultural fractionalization measures for South Africa, denoted CF, are presented in Table 4.
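A minimal sketch of this kind of calculation follows. It assumes the form CF = 1 - sum over i and j of p_i * p_j * r_ij, where r_ij is a resemblance factor between 0 and 1 that rises with the number of language-tree branches two groups share. The shares and resemblance values are hypothetical placeholders, not the Ethnologue-based distances used for Table 4.

```python
def cultural_fractionalization(shares, resemblance):
    """Return 1 - sum_i sum_j p_i * p_j * r_ij for a square resemblance matrix."""
    n = len(shares)
    weighted = sum(
        shares[i] * shares[j] * resemblance[i][j]
        for i in range(n)
        for j in range(n)
    )
    return 1.0 - weighted


# Hypothetical population shares of three language groups.
shares = [0.5, 0.3, 0.2]

# Case 1: groups share no branches at all (r_ij = 0 off the diagonal).
unrelated = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

# Case 2: the first two groups speak closely related languages (assumed values).
related = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

print("CF, unrelated languages:", round(cultural_fractionalization(shares, unrelated), 3))
print("CF, related languages:  ", round(cultural_fractionalization(shares, related), 3))
```

When no two groups resemble one another the expression collapses to the ordinary fractionalization index. Any positive resemblance between groups lowers CF below ELF, which is why the two measures can differ in level while still moving together over time.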

A number of inferences emerge from the comparison of the distinct measures. First, for the Asian and Coloured population groups, the three measures are distinct in terms of the absolute level of social cleavage they indicate. But in the dynamic sense, there is little to distinguish between the three measures: they rise and fall together, and in similar proportions.

Second, for the White population group, the fractionalization (ELF) and cultural distance (CF) measures are distinct neither in terms of the absolute level of social cleavage they indicate, nor in terms of dynamic changes over time, leaving little to distinguish between the two measures. By contrast, the polarization index (RQ) is highly negatively correlated with both of the alternative measures, and returns very high measures of polarization.

Third, for the Black population group and the population as a whole, fractionalization (ELF) and both polarization (RQ) and cultural distance (CF) measures have high but negative correlations, while the distance and polarization measures appear to be uncorrelated. For both sets of correlations, however, it must be noted that while the measures are distinct in terms of the absolute level of cleavage they indicate, they do not manifest much dynamic change.

Fourth, we note that on all three measures of social cleavage, aggregate measures continue to mask sometimes quite considerable changes at the level of sub-groups within the aggregate. Thus the dramatic change in fractionalization for the Asian population group is evident for all three of the measures presented, without the change manifesting itself at the aggregate level.

Fifth, the three measures are not necessarily readily interpreted as triggers of conflict. Thus for instance, the Asian population group is highly fractionalized, polarized and culturally distant in terms of its constituent groups, and then rapidly homogenizes. The White population group is not fractionalized or subject to strong cultural distance, but is highly polarized. But neither of these groups, despite high polarization measures, has been subject to very strong internal conflict in the sample period under consideration. Similarly both the Black and the total South African population are highly fractionalized, but not very polarized—and yet only the latter has experienced substantial conflict.

The final conclusion appears to be that none of the measures proposed by the literature presents a sure means of identifying a predictor of conflict—and problems of aggregation may be a pervasive limitation of these sorts of measures of social cleavage.

6 Conclusion: implications for growth studies

The discussion has been focused on the development of general methodological lessons associated with a specific time series case study. It remains to draw the general conclusions.

First, we note immediately that while many of the aggregate fractionalization indexes show considerable stationarity over time, this is not universally true (racial fractionalization is an exception), and even where it does hold, the aggregate stability hides substantial movement at the micro level. The implication we draw is that indicators of social cleavage may themselves be subject to substantial change over time, rendering the question of what drives such changes (particularly whether economic development indicators do so) germane.Footnote 52

Second, the evidence presented above is suggestive of a more substantive hypothesis. We have observed that linguistic fractionalization in South Africa is subject to an (admittedly slight) downward trend, while religious fractionalization has been increasing (the contrast is suitably dramatic for Asians). One possible interpretation of the evidence is that the movement in the two indexes is due not so much to changes in underlying cleavages in South African society, but to changes in the nature of identity formation. In particular, the evidence is consistent with a shift from identity formation that is based on “essentialist” linguistic (ethnic) roots to one which is choice-based. Increased religious fractionalization might reflect the abandonment of religions of birth, in favor of religions of choice.

Third, we note the significant difficulties associated with data collection surrounding social cleavage. We note further that in cross-sectional contexts this issue surely compounds. Consistency, as well as reliability of data compilation must play a role across wide geographical reaches, and periods of time. Countries with the highest levels of fractionalization are quite likely to have the most contentious data being employed in cross-sectional studies.

But we also note that measurement difficulties surrounding fractionalization indexes are not necessarily terminal. On the South African time series data we show that even relatively egregious problems of measurement do not necessarily bias fractionalization measures as much as one might have anticipated. Certainly the means of exploring the relationship between different measures of social and institutional context remains alive, even in the presence of measurement error.

Finally, the evidence on various fractionalization indexes reported above suggests that various social cleavages (for instance religious and linguistic fractionalization) show a strong correspondence. Where a society is fractionalized in one dimension, it is tempting to conclude that it will be fractionalized in other dimensions also. Racial fractionalization may be mirrored in linguistic as well as religious diversity.

A further question that then arises in this regard relates to the importance of the intentional mobilization, often through political agency, of social cleavage which may itself be inherently latent.Footnote 53 How and when such mobilization may come to be successful may depend substantially on the institutional context. For instance, South Africa’s historical institutional emphasis on racial cleavage as a means of economic and political exclusion had the effect of exacerbating distributional conflict, here confirmed by bounds analysis, thus harming growth. By contrast, religious fractionalization may have had very little or no impact at all on economic activity.

Which particular line of social cleavage matters, and when, how and why, deserves close attention. The suggestion is that any one particular form of social cleavage may not be significant in itself: what matters is the existence of cleavage, and the institutional means of exploiting such cleavage.

Given the conclusions that have been reached in terms of the potential impact of fractionalization on long run growth, it is essential that we ensure that we are using the right measures. This paper does not purport to have developed the “right” measures but it has raised important caveats about the way in which cross-sectional growth studies employ the fractionalization indexes currently available.Footnote 54 It also suggests that the use of time series data for individual country case studies may be a fruitful new route for social scientists interested in the development prospects of developing countries.