Introduction

Why might you want to project the ethnic group populations of a developed country? The first reason is that if demographic rates or probabilities vary across subgroups of the population, then that heterogeneity needs to be incorporated into projections. There is plenty of evidence of such heterogeneity (Large and Ghosh 2006a, b). The second reason for projecting ethnic group populations is to provide information useful to organizations wanting to monitor the achievement of equality of opportunity in education, employment and housing. The third reason is to inform public and private service providers of the future ethnic mix of local populations, so that their provision can be tailored properly. You might object that the future is likely to be uncertain, so that projections will always turn out to be wrong. But the range of uncertainty can be estimated either by running many projections under different variants or scenarios (as in this paper) or by sampling from error distributions of summary indicators of the main component drivers, fertility, mortality and migration (Lutz et al. 2004).

The main aims of the research reported here were to understand (1) the demographic changes that the United Kingdom’s ethnic populations are likely to experience to mid-century, (2) the impact that international and internal migration will have on the size and ethnic composition of the UK population, (3) the role that differences in fertility rates between the UK’s ethnic groups play in shaping future trends, (4) how mortality differences between ethnic groups affect the changing demography of the UK populations and (5) how the ethnic diversity of UK national and local populations is likely to change in the future. To achieve the research aims, we built a model and input database for projecting the ethnic group populations of UK local areas and used the model to project alternative futures. To carry out the projections we made estimates of (1) ‘ethnic group fertility’ using alternative data sources, (2) ‘ethnic group mortality’ through combining information on local mortality for all groups with information on long-term limiting illness for ethnic groups, (3) ‘international migration’ for local areas by using census, survey and administrative data to produce new estimates of local immigration and (4) ‘internal migration’ into and out of local areas for ethnic groups using census and patient register data.

The population projection model delivered projected ethnic populations for local areas, including migration flows out of local areas and into them, specific to each ethnic group. We made estimates of ethnic group populations in Scotland and Northern Ireland using the England and Wales classification, so that our projections apply to the whole UK. The projection model includes ethnic group mixing through the birth of infants to parents from different ethnic groups. However, we did not try to handle transitions in ethnic group membership at older ages (Rees 2002) because an analysis of the Longitudinal Study linking the 1991 and 2001 Censuses of England and Wales showed that reliable estimates of these transitions were not possible (Simpson and Akinwale 2007; Simpson et al. 2005).

Why do we project the ethnic group populations of the UK for 355 local areas? The first reason is because local ethnic populations are of considerable interest to local planners and researchers engaged with local areas. The second reason is that local projections are likely to capture the heterogeneity of ethnic group demographic rates and flows across the country and therefore yield better forecasts as long as the local rates and flows can be estimated reliably. The third reason for projecting ethnic group populations at local scale is that the Office for National Statistics (ONS) has met a demand for such statistics by making estimates of ethnic group populations for local authorities in England (Large and Ghosh 2006a, b) while deciding, for now, not to extend the work to include projections. Official estimates of population by ethnic group are not provided in the rest of the UK by the Welsh Assembly Government (WAG), the General Register Office for Scotland (GROS) or the Northern Ireland Statistics and Research Agency (NISRA).

The 355 local areas consist of 352 local authorities (LAs) in England plus the home countries of Wales, Scotland and Northern Ireland. A detailed list of LAs is given in Wohland et al. (2010) together with maps showing their location. Full details of LA boundaries are provided in ONS (2011). The average LA in England had 143,000 inhabitants at mid-year 2001. The largest had 985,000 and the smallest 24,000 at mid-year 2001, with an upper quartile of 169,000 and a lower quartile of 86,000. The average LA-ethnic-group population is 8,800 with a minimum of zero and a maximum of 645,000. There are small numbers in many LA-ethnic-group combinations, so we frequently need to model the relevant probabilities or rates for components rather than being able to compute these projection inputs directly. Rees et al. (2009b) spell out in detail how estimates of life tables for LA-ethnic group combinations were achieved.

Throughout the paper we employ the usually resident population definition (de jure) used by National Statistics, which includes long-term international migrants (with durations of 12 months or more) but not short-term international migrants or people in second residences. A usual residence is the residence that a person declares to be his or her main home and where they spend a majority of their time. The usual-residence count includes refugees and asylum seekers and incorporates an adjustment for visitor and migrant switchers for international migrants, though these two corrections roughly cancel out. Data on internal migrants come from the 2001 Census, which counts as migrants those persons who were in a different LA 12 months before the census. This definition influences the form of the projection model we design.

The paper focuses on the methods used, component estimates, future assumptions and projection results at the national scale. Local results are discussed in Wohland et al. (2010) and Rees et al. (2011). The plan of the paper is as follows. The second section reviews approaches to ethnic population projection and selects a model for use in the UK. The third section describes the projection model formally. The fourth to seventh sections discuss how ethnic-specific estimates of the components of change were produced and future assumptions were designed. The eighth section of the paper describes the rationale for our four projections and the assumptions used in each. The ninth section discusses the results of four projections. The outcomes are described in terms of numbers, shares, growth rates and changing age distributions for the 16 ethnic groups. The final section of the paper compares our results with other UK ethnic-group projections, and summarizes the key findings.

Previous work on projecting the UK’s ethnic group populations

Here we review the field of ethnic population projection; we build on an earlier review by Coleman (2006b) but look at the alternative methods used rather than outcomes. A number of challenges are involved in carrying out ethnic population projections. How should ethnic groups be defined? How should they interact demographically? How do we estimate the key ingredients—fertility, mortality, internal and international migration by ethnic group—in the face of inadequate data? What kind of projection model should be used? What assumptions should we adopt for future fertility, mortality or migration differences? How do we validate our projections?

Ethnic groups: what are they?

‘Ethnic’ derives from the Greek word ethnos meaning a nation. Belonging to a nation may be defined using one or more variables that can be measured in surveys and censuses or recorded in registers. In general, persons are born into an ethnic group and tend to remain in that group for the rest of their lives. This situation contrasts with age or household status which change as the life course proceeds. It also differs from social class, linked to occupation, which changes through upward or downward social mobility. The variables used to define ethnicity include: country of birth, country of citizenship or nationality, country of family origin, racial group (defined mainly in terms of skin colour or facial features), language, religion and self-identification.

Many of these statuses used to define a person’s ethnicity change over time, and these changes lead to problems in identifying groups. For example, use of a country of birth different from that of current residence applies most usefully to groups that have immigrated recently. Their children and grandchildren born in the country to which they migrated no longer share this characteristic. Nationality changes through the acquisition of citizenship. People whose ethnicity is defined by religion may change through conversion of religious belief. Where a person’s ethnicity is self-defined, they may change their identification over time.

Ethnic classifications in the UK

Ethnic classifications in the United Kingdom are based on self-reporting through census or social survey questionnaires (ONS 2003). Considerable consultation informs the formulation of the question. The resulting categories are a compromise between the demands of pressure groups interested in counting and promoting their own group and the need to have a question that the whole population can understand. Ethnic classifications change over time, recognizing the evolution of groups as a result of migration from the outside world and as a result of marriage or partnership of people from different groups who then have children of mixed ethnicity.

Table 1 shows the ethnic group classifications adopted in the 1991 and 2001 Censuses of England and Wales and that used in the 2011 Census; there are different classifications for Scotland and Northern Ireland. The classifications are based on two concepts: race; and country of origin, either directly through migration or through ancestry. Many studies (e.g. Coleman 2010; Klodawski 2009; Rees 2008; Rees and Butt 2004; Rees and Parsons 2006) used a collapsed version of the classification (e.g. White, Mixed, Asian, Black, Chinese & Other) but these amalgamated classes hide huge differences in timing of migration to the UK, age-sex structures, population dynamics and socio-economic and cultural characteristics. Most studies (e.g. Coleman 2006b; Coleman and Scherbov 2005; Rees and Butt 2004) drop the Mixed group; since the 2001 Census revealed this to be the fastest growing group such an omission is regrettable.

Table 1 Ethnic classifications in the 1991, 2001 and 2011 censuses, England and Wales

In this research we have adopted the full set of 16 ethnic groups used in the 2001 Census for England and Wales and made estimates of the Scotland and Northern Ireland population of these groups using ancillary information: custom tables supplied by GROS and NISRA. Table 1 indicates that our results can easily be converted to the 2011 Census classification.

Ages: dealing properly with age-time

Period-cohorts are the key age-time concept used in cohort-component projection models. A period-cohort is the space occupied by a birth cohort in a time period and shows how persons aged x at the start of year t, born in year t − x, age forward over 1 year to be aged x + 1 at the start of year t + 1. Data which are published by period-age need conversion to period-cohort format. It is preferable to use single years of age in a projection model wherever the data allow so that projections for each year can be produced and so that aggregate age groups can be flexibly constructed. It is important to extend the age range to 100 and over, recognizing the current higher rates of survival into the older old ages.

Models for handling migration

Most ethnic population projections produced to date are for national populations (Coleman 2006a, 2010), though the US Bureau of the Census (Campbell 1996) produces state projections by race-ethnicity groups. Where subnational populations are studied, internal migration between them must be included in the projection model. There are three approaches: single region, multi-region and bi-region.

In the single-region model each subnational unit is treated as a single unit with streams of in- and out-migration, which are often reduced to total net migration, adding to or subtracting from the population. This is unsatisfactory as it gives no insight into real migration flows; it is better to recognize separate migration streams into and out of regions.

In the multi-region model, all subnational units are handled together by representing migration as flows between them, which recognizes that in-migrants to a subnational unit are out-migrants from other subnational units (Rogers 1990) and that the migration flows are best modelled simultaneously. The form of the multi-region model depends on the way in which the migration data used are measured. There are two types of measure: transition and movement. Transition migration results from comparison of a person’s location at two points in time; if they are different, a transition has occurred. Movement migration results from recording each migration event between subnational units.

In the bi-region model, the system is handled as a set of region pairs, the region and the rest of the country between which migration flows occur. This achieves a compromise between the large size and estimation difficulties of the multi-region model and the failure of the single-region model to allow proper interaction between regions. The bi-regional model was originally suggested by Rogers (1976) and has been further tested by Wilson and Bell (2004b) for a set of Australian regions: they found that the bi-region model gave results which were close to those of the multi-region model. The data requirements of a bi-region model are much smaller than for the multi-region model: it uses \( 2N \) probabilities rather than \( N^{2} \), where \( N \) is the number of regions. The bi-region model needs an additional step at each time interval: adjustment of the total of projected in-migrations to match the total of out-migrations.
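The two properties of the bi-region model noted above, the smaller number of migration probabilities and the extra balancing step, can be illustrated with a minimal Python sketch. The function names and the toy flows are hypothetical, and proportional scaling is only one plausible way to make in-migration totals match out-migration totals; the paper does not state which adjustment rule is used.

```python
# Illustrative sketch only: names and numbers are assumptions, not the paper's code.

def migration_parameter_counts(n_regions):
    """Return (bi-region, multi-region) counts of migration probabilities."""
    return 2 * n_regions, n_regions ** 2


def constrain_in_migrants(in_migrants, out_migrants):
    """Scale projected in-migrant counts so their total equals total out-migrants.

    Proportional scaling is an assumed adjustment rule.
    """
    total_in = sum(in_migrants)
    total_out = sum(out_migrants)
    if total_in == 0:
        return list(in_migrants)
    factor = total_out / total_in
    return [m * factor for m in in_migrants]


bi, multi = migration_parameter_counts(355)
print(bi, multi)  # 710 126025

# Toy flows for three zones (invented values).
adjusted = constrain_in_migrants([120.0, 80.0, 50.0], [100.0, 90.0, 85.0])
print(round(sum(adjusted), 6))  # 275.0
```

With 355 zones the bi-region form needs 710 probabilities per ethnic group, period-cohort and sex, against 126,025 for a full origin-destination matrix.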

Population projection models adapted for ethnic groups

Do we need to develop new models for handling ethnic population projections? Could not existing models and associated software be used to produce the projections? We consider the advantages and disadvantages of current models and software. Table 2 provides a summary of work over several decades in the UK that has produced either population estimates or projections by ethnicity. The methodologies used are listed in the final column of the table.

Table 2 Summary of UK work on ethnic population estimates and projections

Simpson, Andelin Associates and colleagues (Cathie Marsh Centre for Census and Social Research (CCSR 2009)) have developed a suite of spreadsheet macros called POPGROUP that implements a single-region cohort-component model with net migration or with gross in- and out-migration flows or rates. POPGROUP is widely used by UK local governments and has been applied to ethnic forecasts for Birmingham, Oldham, Rochdale and Leicester (Danielis 2007; Simpson 2007a, b, c; Simpson and Gavalas 2005a, b, c). Rees and Parsons (2006, 2009), in work for the Joseph Rowntree Foundation, used a single-region cohort-component model for UK regions with four migration streams: internal out-migration and emigration as intensities (probabilities) and immigration and internal in-migration as flows. These single-region models have the key advantage of being relatively easy to implement and use for a large number of subnational units and ethnic groups. They suffer from the disadvantage of neglecting the process whereby the out-migrants from one region become the in-migrants to other regions.

Since the 1970s various programs have been developed to implement the multi-regional cohort-component model. In the early 1990s a general version, LIPRO, was developed at Netherlands Interdisciplinary Demographic Institute (NIDI) by Van Imhoff and Keilman (1991) for use with household projections but in a form in which other state definitions could easily be introduced. The software is made available through NIDI (2008). Another multi-region program, the UKPOP model, was developed by Wilson (2001; Wilson and Rees 2003) for projecting the populations of a full set of local authorities. This accounts-based model relies on iteration to make consistent the relationship between observed deaths in a region, the variable generally available, and the deaths to the population in the region at the start of the interval, who die in that region and elsewhere. Parsons and Rees (2006) had difficulty in achieving convergence in the iterative procedure for older ages, because populations, deaths and migration come from different and inconsistent data sources. Wilson and Bell (2004a, b), and Wilson et al. (2004) have used simpler versions of the multi-region model for Australian projections; Wilson (2009) developed a model for the indigenous and non-indigenous population of the Northern Territory, Australia.

The Office for National Statistics (ONS) subnational projection model for local authorities in England has a long pedigree and is in continued use (ONS 2009c), but has not been extended to project ethnic groups.

As the local government body with the largest ethnic minority population, the Greater London Authority has a longstanding interest in ethnic group population trends in London. Ethnic projections were prepared by Storkey (London Research Centre 1999; Storkey 2002), which incorporated ethnic fertility estimates and which linked to the all-group projection model for London boroughs. The model was revised by Hollis and colleagues, and ethnic population projections became a regular publication that followed the main London Borough projections (Hollis and Chamberlain 2009) and were constrained to them (Bains 2008; Bains and Klodawski 2006, 2007; Hollis and Bains 2002; Klodawski 2009). Ethnic-specific fertility rates were estimated using Hospital Episode Statistics gathered by the London Health Observatory.

Kupiszewski and colleagues at CEFMR (Bijak et al. 2005, 2007; Kupiszewska and Kupiszewski 2005) have developed a nested multi-region model called MULTIPOLES, based on a prototype by Rees et al. (1992) that uses several spatial layers. For example, in a projection study of 27 EU states (Bijak et al. 2005) three layers are recognized: inter-region migration within states, inter-state migration within the EU and extra-EU migration. This approach enables different models to be used in the different layers within a consistent accounting framework (Kupiszewski and Kupiszewska 2011). The approach has been used to develop scenario projections for European regions (Rees et al. 2010a).

The projection model

The accounting framework for the model

The work reviewed above informed the design of our projection model for ethnic groups. The model uses a transition framework because the internal migration information used derives from the decennial census. Transition data derive from a question asked about a person’s usual residence at a fixed point in the past: 1 year before the 2001 Census, in the current analysis. Every projection model has an explicit or implicit accounting framework, which must be consistent. Table 3 shows the population accounting framework used in our bi-region model for each zone, period-cohort and gender group. We chose the bi-region model because it can handle the large number of local ethnic group subpopulations we study. Table 4 sets out the notation for model variables. We use single-letter variables as far as possible, but double- or triple-letter variables are sometimes needed. We use lower-case letters to refer to intensities (rates or probabilities), and upper-case letters to refer to counts of populations, migrants or cohorts.

Table 3 Bi-region accounts for subnational populations using migration (transitions) data from the UK census
Table 4 The notation for the ethnic population projection model

The projection model handles population groups classified by 355 local areas, 16 ethnic groups, 102 period-cohort ages and two sexes, constituting 1,158,720 subpopulations. The 355 local areas are made up of 352 local authorities in England with Wales, Scotland and Northern Ireland as additional single zones. We use the 16-group classification of the 2001 Census for England and Wales (Table 1). The 102 period-cohorts start with the newborn to age 0 period-cohort, followed by age 0 to age 1, and so on to age 99 to age 100, with the final period-cohort being 100+ to 101+.

In Table 3, the variable \( SM^{ir} \) represents the number of surviving migrants resident in zone i on 29 April 2000 who live in zone r (rest of the country) on 29 April 2001. The variables in the principal diagonal, \( SS^{i} \) and \( SS^{r} \), are persons present in zones i and r at both the start of the year and the end of the year (surviving stayers). These counts include migrants who moved within the zone.

From the start population, \( SP^{i} \), we subtract the deaths or non-survivors, \( DE^{i} \), among the zone i start population, the emigrant survivors, \( SE^{i} \), from the zone i start population, and the out-migrant survivors to the rest of the country, \( SM^{ir} \). Then we add the in-migrant survivors from the rest of the country, \( SM^{ri} \), and the surviving immigrants, \( SI^{i} \), from the rest of the world. So the end-of-interval population, \( EP^{i} \), for ethnic group e, age x and gender g in zone i, is given by:

$$ EP^{i} = SP^{i} - DE^{i} - SE^{i} - SM^{ir} + SM^{ri} + SI^{i} . $$
(1)

The surviving stayer term, \( SS^{i} \), does not appear in this accounting equation. However, we do need to estimate the \( SS^{i} \) variable, because in the projection model we use probabilities of migration conditional on survival within the country. The populations at risk for these probabilities are the sums of all populations originating in region i and surviving the time interval within the country, including the surviving stayer terms. We estimate the surviving stayers by subtracting internal in-migrants, \( SM^{ri} \), and surviving immigrants, \( SI^{i} \), from the 2001 Census population aged 1+.

Table 3 refers to each ethnic group-age-sex combination. The accounting framework is repeated 102 × 2 × 16 or 3,264 times in the model computations. The cohorts between 0–1 and 99–100 differ from the newborn period-cohort in their starting population stocks: in the typical period-cohort these are the populations at the start of the time interval, while for the newborn period-cohort the starting stocks are births during the period (by ethnic group of child).
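As a check on the bookkeeping, the accounting identity of Eq. 1 and the counts quoted above can be reproduced in a few lines of Python. The function name and the input counts are invented for illustration; only the structure of the identity and the dimensions (102 period-cohorts, 2 sexes, 16 ethnic groups, 355 zones) come from the text.

```python
# Illustrative sketch of Eq. 1; all counts below are invented.

def end_population(sp, de, se, sm_out, sm_in, si):
    """Eq. 1: EP = SP - DE - SE - SM(i to r) + SM(r to i) + SI."""
    return sp - de - se - sm_out + sm_in + si


ep = end_population(sp=1000, de=10, se=5, sm_out=40, sm_in=35, si=12)
print(ep)  # 992

# Number of times the accounts are repeated per zone, and in total.
accounts_per_zone = 102 * 2 * 16
print(accounts_per_zone)        # 3264
print(accounts_per_zone * 355)  # 1158720
```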

The projection equations

We convert the accounting Eq. 1 into a projection model equation by substituting probabilities multiplied by populations at risk for the transitions, adding age subscripts:

$$ \begin{aligned} EP_{x}^{i} & = SP_{x}^{i} - sm_{xc}^{ir} \left( {SP_{x}^{i} - se_{xc}^{i} SP_{x}^{i} - ns_{xc}^{i} SP_{x}^{i} } \right) - se_{xc}^{i} SP_{x}^{i} - ns_{xc}^{i} SP_{x}^{i} \\ & \quad + sm_{xc}^{ri} \left( {SP_{x}^{r} - se_{xc}^{r} SP_{x}^{r} - ns_{xc}^{r} SP_{x}^{r} } \right) + SI_{xc}^{i} . \\ \end{aligned} $$
(2)

Table 4 gives the definitions of the variables and indices (subscripts). The equation applies to all period-cohorts, all ethnic groups, both genders and all areas. The populations of the rest of the UK are computed by subtracting the local area population from the total of all area populations:

$$ SP_{x}^{r} = \left( \sum\limits_{j} SP_{x}^{j} \right) - SP_{x}^{i} . $$
(3)

The migration probabilities, \( sm_{xc}^{ir} \), are conditional on survival within the country and so are multiplied by the estimate of within-country survivors in the bracketed term to project the number of surviving migrants. The surviving emigrants are computed by multiplying the start population in zone i by the surviving emigrant probability, \( se_{xc}^{i} \). The number of deaths to the start population is projected by multiplying it by a non-survivorship probability, \( ns_{xc}^{i} \), the complement of the life table survivorship probability for period-cohorts, \( s_{xc}^{i} \), that is, \( ns_{xc}^{i} = 1 - s_{xc}^{i} \). The migration from the rest of the country is computed as a probability of migration conditional on survival within the country multiplied by the estimate of within-country survivors. Finally, surviving immigrants are added as a projected count.
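The following Python sketch works through Eqs. 2 and 3 for one zone, one ethnic group, one sex and one period-cohort. The function and variable names and all input values are hypothetical. The sketch assumes that the within-country survivors of the rest of the country are computed with that region's own emigration and non-survivorship probabilities.

```python
# Illustrative sketch of Eqs. 2 and 3; names and values are assumptions.

def rest_of_country(start_pops, zone):
    """Eq. 3: total over all zones minus the zone's own start population."""
    return sum(start_pops.values()) - start_pops[zone]


def project_zone(sp_i, sp_r, ns_i, ns_r, se_i, se_r, sm_ir, sm_ri, si_i):
    """Eq. 2: end population of zone i for one period-cohort."""
    # Within-country survivors: start population less emigrants and deaths.
    survivors_i = sp_i - se_i * sp_i - ns_i * sp_i
    survivors_r = sp_r - se_r * sp_r - ns_r * sp_r
    out_migrants = sm_ir * survivors_i  # surviving out-migrants, i to r
    in_migrants = sm_ri * survivors_r   # surviving in-migrants, r to i
    return (sp_i - out_migrants - se_i * sp_i - ns_i * sp_i
            + in_migrants + si_i)


start_pops = {"A": 1000.0, "B": 3000.0, "C": 6000.0}  # invented zones
sp_r = rest_of_country(start_pops, "A")
print(sp_r)  # 9000.0

ep_a = project_zone(sp_i=start_pops["A"], sp_r=sp_r,
                    ns_i=0.01, ns_r=0.01, se_i=0.005, se_r=0.005,
                    sm_ir=0.04, sm_ri=0.004, si_i=12.0)
print(round(ep_a, 2))  # 993.06
```

Because the migration probabilities are applied only to within-country survivors, the out-migrant count (39.4 here) can never exceed the population that is still alive and resident in the country at the end of the interval.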

These equations for a typical ethnic group, gender and period-cohort are repeated for all period cohorts. In the first period-cohort from birth to age 0, projected births are substituted for the start population (\( SP_{ (-1)}^{i} = B^{i} \)), where the subscript refers to the ‘start age’ of the newborn period-cohort. Care is taken in the estimation for the terms for the first period-cohort to allow, either empirically or by assumption, for the shorter period of exposure to transitions for infants born during a year. The last period-cohort is treated differently only when the projected end populations of a time interval are converted into the start populations of the next. For a typical period-cohort this is achieved thus:

$$ SP_{x}^{i} (t + 1) = EP_{x - 1}^{i} (t) $$
(4)

where t and t + 1 refer to successive time intervals and x is the age that starts the period cohort. For the last period-cohort, this assignment combines the end populations of the last but one, age z − 1 period-cohort, and the last period-cohort, z:

$$ SP_{z}^{i} (t + 1) = EP_{z - 1}^{i} (t) + EP_{z}^{i} (t). $$
(5)
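A minimal Python sketch of the ageing-on step in Eqs. 4 and 5 follows. It assumes the period-cohorts are held in a list whose first entry is the newborn cohort and whose last entry is the open-ended cohort; this storage layout, the function name and the toy values are illustrative assumptions.

```python
# Illustrative sketch of Eqs. 4 and 5; the list layout is an assumption.

def age_on(end_pops, births):
    """Convert end populations of interval t into start populations of t + 1.

    end_pops[0] is the newborn period-cohort (birth to age 0), end_pops[1] is
    age 0 to 1, and so on; the last entry is the open-ended period-cohort.
    The newborn start slot is filled with the next interval's projected births.
    """
    z = len(end_pops) - 1
    start_next = [births]
    for p in range(1, z):
        start_next.append(end_pops[p - 1])            # Eq. 4
    start_next.append(end_pops[z - 1] + end_pops[z])  # Eq. 5
    return start_next


end_pops = [10.0, 9.0, 8.0, 3.0]  # four toy period-cohorts
start_next = age_on(end_pops, births=11.0)
print(start_next)  # [11.0, 10.0, 9.0, 11.0]
```

Apart from the new births, the total population is unchanged by this step: everyone in the last two period-cohorts is carried into the open-ended group.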

Estimation of the inputs to the projection model

Given the number of zones, ages and ethnic groups represented in our projection model, we should not expect to find reliable data to count directly the flows and transition probabilities needed for the projection model. Instead we will need to estimate these probabilities and rates using a variety of sub-models which use more aggregate and reliable data together with a set of assumptions.

The projection of fertility rates, births and mixed births

For convenience, births are calculated first in the projection model so that all period-cohorts can be computed together. We approximate the fertility rate for women in a period-cohort by averaging successive period-age fertility rates:

$$ f_{xc,me}^{i} = 0.5\left( {f_{xa,me}^{i} + f_{xa + 1,me}^{i} } \right) $$
(6)

where me stands for mother’s ethnic group. So we project births to mothers of each ethnic group as follows:

$$ B_{ + ,me}^{i} = \sum\limits_{xa = 15}^{xa = 45 + } {f_{xa,me}^{i} \left( {SP_{xfe}^{i} } \right)} $$
(7)

where \( f_{xa,me}^{i} \) are the age-specific period-age (xa) fertility rates for ethnic group me in zone i. This is a female-dominant fertility model. The + subscript attached to the births variable refers to sex of the newborn which is not specified at this stage of the model.
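The averaging in Eq. 6 and the summation in Eq. 7 can be sketched in Python as follows. The rates and populations are invented, the function names are hypothetical, and the sketch applies the averaged period-cohort rates to the female start populations, which is one reading of how the two equations fit together.

```python
# Illustrative sketch of Eqs. 6 and 7; rates and populations are invented.

def period_cohort_rates(period_age_rates):
    """Eq. 6: average each pair of successive period-age fertility rates."""
    return [0.5 * (a + b)
            for a, b in zip(period_age_rates, period_age_rates[1:])]


def project_births(rates, female_start_pops):
    """Eq. 7: sum of rate times female start population over fertile ages."""
    return sum(f * sp for f, sp in zip(rates, female_start_pops))


# Five successive period-age rates give four period-cohort rates.
f_pa = [0.02, 0.06, 0.10, 0.08, 0.00]
f_pc = period_cohort_rates(f_pa)
women = [1000.0, 1200.0, 1100.0, 900.0]

births = project_births(f_pc, women)
print(round(births, 6))  # 271.0
```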

We then add one ingredient to this model to represent newborn children of mixed ethnicity. The births in Eq. 7 are defined with respect to mother’s ethnicity; if the father of the child is of a different ethnicity, the child may be assigned mixed origin. Mixed groups are recognized in the 2001 Census ethnic-group question: parents may decide to assign their child to a mixed ethnicity group or they may assign offspring to the mother’s ethnic group or the father’s. Rather than apply an arbitrary rule, we use detailed tables from the 2001 Census which classify infants aged 0 by their mother’s ethnicity and their own. From these tables we compute the probability that an infant has ethnicity ie given mother’s ethnicity me, P(ie|me), apply the probability to the projected births and assign a gender to the newborn as well:

$$ B_{g,ie}^{i} = v_{g} \sum\limits_{me} B_{ + ,me}^{i} P^{I} (ie|me), $$
(8)

where \( v_{g} \) is the sex proportion at birth, assumed constant at 0.513 for boys and 0.487 for girls over all UK ethnic groups, mothers’ ages and time intervals. The probability, \( P^{I}(ie|me) \), is computed for a larger region I (usually the Government Office Region) to which the zone i of interest belongs. The highest probabilities occur where the mother’s ethnicity and the child’s ethnicity are the same. There are significant probabilities for some groups where the ethnicity of the newborn is different from that of the mother. For example, the majority of children of White Irish mothers are classified as White British. Many mothers from Asian groups and Black groups also have children of mixed ethnic origins. Many children are born to non-White British mothers and White British fathers.
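The allocation of births to infant ethnic group and sex can be sketched in Python as below. The group labels, birth counts and conditional probabilities are invented placeholders; only the sex proportions (0.513 and 0.487) come from the text. The sketch sums contributions over mothers' groups, which is an assumption about how Eq. 8 is aggregated.

```python
# Illustrative sketch of Eq. 8; groups, births and probabilities are invented.

SEX_PROPORTIONS = {"boys": 0.513, "girls": 0.487}  # values given in the text


def allocate_births(births_by_mother, p_infant_given_mother):
    """Distribute births by mother's group to (sex, infant group) cells."""
    out = {}
    for me, births in births_by_mother.items():
        for ie, p in p_infant_given_mother[me].items():
            for g, v in SEX_PROPORTIONS.items():
                out[(g, ie)] = out.get((g, ie), 0.0) + v * births * p
    return out


births_by_mother = {"Group A": 1000.0, "Group B": 200.0}
# Each row of P(ie | me) sums to one.
p_infant_given_mother = {
    "Group A": {"Group A": 0.97, "Mixed": 0.03},
    "Group B": {"Group B": 0.90, "Mixed": 0.10},
}

allocated = allocate_births(births_by_mother, p_infant_given_mother)
print(round(allocated[("boys", "Mixed")], 2))  # 25.65
print(round(sum(allocated.values()), 6))       # 1200.0
```

Since each row of the conditional probability table and the two sex proportions sum to one, the allocation conserves the total number of projected births.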

Non-survivors projected using survivorship and non-survivorship probabilities

At older ages, for small populations, there is a problem in measuring mortality rates accurately. Sometimes rates will exceed one because the death counts and population numbers are drawn from different data sources which may not match exactly. To avoid these problems, we use survivorship and non-survivorship probabilities from life tables. We assume that non-survivorship probabilities derived from the life table produced using mortality rates based on area of usual residence at time of death, \( d_{xa}^{ + i} \), are a reasonable estimate for non-survivorship probabilities for origin-zone populations at the start of the period, \( d_{xa}^{i + } \):

$$ d_{xa}^{i + } = d_{xa}^{ + i} $$
(9)

where the + superscript means summation over all locations at death (LH side term) or all locations prior to death (RH side term). To estimate non-survivorship probabilities for period-cohorts, we use the life table equation for survivorship probabilities, \( s_{xc}^{i} \), for region i:

$$ s_{xc}^{i} = \frac{{L_{x + 1}^{i} }}{{L_{x}^{i} }}. $$
(10)

We then compute non-survivorship probabilities as:

$$ ns_{xc}^{i} = 1 - s_{xc}^{i} . $$
(11)

We derive the survivorship probabilities from ethnic group life tables for local authorities (Rees and Wohland 2008; Rees et al. 2009b). Survivorship and non-survivorship probabilities are used to generate the total number of non-survivors, \( DE_{xc}^{i} \), from the start populations of origin zones, \( SP_{xc}^{i} \):

$$ DE_{xc}^{i} = ns_{xc}^{i} SP_{xc}^{i} $$
(12)

for all period-cohorts, genders and ethnic groups.
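Eqs. 10 to 12 can be traced with a short Python sketch. The life table values and start populations are invented, and the function names are hypothetical; the calculation itself follows the three equations directly.

```python
# Illustrative sketch of Eqs. 10 to 12; life table values are invented.

def survivorship(person_years):
    """Eq. 10: s = L(x + 1) / L(x) for each period-cohort."""
    return [person_years[x + 1] / person_years[x]
            for x in range(len(person_years) - 1)]


def non_survivors(person_years, start_pops):
    """Eqs. 11 and 12: DE = (1 - s) * SP for each period-cohort."""
    return [(1.0 - s) * sp
            for s, sp in zip(survivorship(person_years), start_pops)]


L = [99000.0, 98000.0, 96040.0]  # life table person-years by age (toy)
start_pops = [500.0, 400.0]

deaths = non_survivors(L, start_pops)
print([round(d, 2) for d in deaths])  # [5.05, 8.0]
```

Provided the life table person-years are non-increasing with age, as in this toy example, each survivorship probability lies between zero and one, which avoids the rates above one that can arise when deaths and populations come from mismatched sources.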

Emigration and surviving emigrants projected using emigration rates and survivorship probabilities

The next terms we need to estimate and project are the surviving emigrants. Because the accounting framework uses the transition concept, we need to estimate probabilities of emigration and survival. The statistics available on emigration derive from the International Passenger Survey (IPS), which estimates the number of emigrations occurring over a 1 year interval. The estimate is based on a question about intention to leave the country for 12 months or more. However, some of these emigrants may die before the year is out and the projection of non-survivors in Eq. 12 already contains an implicit estimate of these non-surviving emigrants. The emigration counts must be multiplied by survivorship probabilities to project surviving emigrants. The survivorship probabilities must be modified to reflect the reduced risk of exposure to dying of emigrants who spend on average only half the time interval in their destination zone. We therefore multiply the projected emigration flow, \( E_{xc}^{i} \), by the square root of the survival probability, \( s_{xc}^{i} \), and divide by the start population to obtain the surviving emigrant probability, \( se_{xc}^{i} \):

$$ se_{xc}^{i} = \left( {s_{xc}^{i} } \right)^{1/2} \frac{E_{xc}^{i}}{SP_{x}^{i}} . $$
(13)

Projected emigration flows are set by assumption or through an independent analysis of flow trends in most of our projections. However, in one projection we make assumptions instead about the rate of emigration, \( re_{xc}^{i} \), defined as the total emigration count, \( E_{xc}^{i} \), divided by the start population as an approximate population at risk:

$$ re_{xc}^{i} = \frac{E_{xc}^{i}}{SP_{x}^{i}} $$
(14)

so that the projected emigration flow is given by

$$ E_{xc}^{i} = {re_{xc}^{i}}\times {SP_{x}^{i}}. $$
(15)

which is used in Eq. 13 to yield the surviving emigrant probability. The number of surviving emigrants, \( SE_{xc}^{i} \), is projected by applying the surviving emigrant probabilities to the starting population:

$$ SE_{xc}^{i} = se_{xc}^{i} SP_{x}^{i} . $$
(16)
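The emigration-rate variant of Eqs. 13–16 can be illustrated with a minimal Python sketch. It adopts one consistent reading of the equations, in which the half-interval survival factor (the square root of the survivorship probability) is combined with the emigration rate to give a surviving emigrant probability, which is then applied to the start population. The function name and the numbers are hypothetical and are not taken from the projection database or from the R scripts.

```python
import math

def surviving_emigrants(start_pop, survival_prob, emigration_rate):
    """Illustrative reading of Eqs. 13-16 for one period-cohort, zone and group."""
    # Eq. 15: projected emigration flow = emigration rate x start population
    flow = emigration_rate * start_pop
    # Half-interval exposure: square root of the full-interval survival probability
    se_prob = math.sqrt(survival_prob) * emigration_rate
    # Eq. 16: surviving emigrants = surviving emigrant probability x start population
    se_count = se_prob * start_pop
    return flow, se_prob, se_count

# Hypothetical inputs: 10,000 people, survival probability 0.98, emigration rate 1%
flow, se_prob, se_count = surviving_emigrants(10_000.0, 0.98, 0.01)
print(round(flow, 1), round(se_count, 1))
```

Under this reading the surviving emigrants are always slightly fewer than the emigration flow, the difference being the emigrants expected to die within the interval.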

Then we can estimate surviving internal migrants within a country:

$$ SM_{xc}^{ir} = sm_{xc}^{ir} \left( {SP_{x}^{i} - se_{xc}^{i} SP_{x}^{i} - ns_{xc}^{i} SP_{x}^{i} } \right) $$
(17)

where

$$sm_{xc}^{ir} = \frac{SM_{xc}^{ir}}{\left({SS_{xc}^{i} + \sum_{j \ne i}{SM_{xc}^{ij}}}\right)}.$$
(18)

These are the probabilities of migration given survival within the country, measured from the latest census migration tables. The surviving migrant variables, \( SM_{xc}^{ij} \), are recorded directly in the census migration tables, but within-region surviving stayers, \( SS_{xc}^{i} \), are not usually tabulated. We must therefore compute this variable from the census migrant data and the census population by subtracting surviving in-migrants to a zone and surviving immigrants from abroad from the end population:

$$ SS_{xc}^{i} = EP_{xc}^{i} - SM_{xc}^{ri} - SI_{xc}^{i} . $$
(19)

What does this reformulation of the projection model achieve? Essentially, the re-formulation using internal migration probabilities conditional on survival decouples the processes of mortality and migration and enables us to develop separate models for each component. We use two sets of properly defined probabilities, the sums of which will not exceed one.
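The bi-regional calculation in Eqs. 17–19 can be sketched as follows. The sketch assumes a single origin zone and one rest-of-the-country partner region, so the sum over destinations in Eq. 18 has one term. All function names and input values are hypothetical.

```python
def conditional_migration_prob(end_pop, in_migrants, immigrants, out_migrants):
    """Eqs. 18-19: probability of out-migration given survival within the country."""
    # Eq. 19: surviving stayers are not tabulated, so derive them from the end population
    stayers = end_pop - in_migrants - immigrants
    # Eq. 18: surviving out-migrants divided by all within-country survivors of the origin
    return out_migrants / (stayers + out_migrants)

def surviving_internal_migrants(start_pop, ns_prob, se_prob, sm_prob):
    """Eq. 17: apply the conditional probability to those who neither die nor emigrate."""
    survivors_within_country = start_pop * (1.0 - se_prob - ns_prob)
    return sm_prob * survivors_within_country

# Hypothetical census counts for one period-cohort in one zone
sm = conditional_migration_prob(end_pop=1000.0, in_migrants=50.0,
                                immigrants=20.0, out_migrants=70.0)
# Hypothetical projection inputs: 1% non-survival, 2% surviving emigration
sm_count = surviving_internal_migrants(start_pop=1000.0, ns_prob=0.01,
                                       se_prob=0.02, sm_prob=sm)
print(round(sm, 3), round(sm_count, 1))
```

Because the migration probability is conditional on survival within the country, it can be estimated and updated separately from the mortality and emigration terms.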

Software for implementing the projection model

To implement the ethnic group and local area cohort-component model for the UK we use the software R. The current version of the model implementation consists of four scripts: Script 1 reads in and arranges the data; Script 2 runs the model for 2001–2002 and computes the 2002 midyear populations; Script 3 compiles the R function used to run the projection; Script 4 runs the model and creates the output. Scripts 1 and 4 can be specified for particular projections; Scripts 2 and 3 are never changed. Fuller details of the R scripts that constitute the UPTAP projection model are given in Wohland et al. (2010).

Time intervals for estimation and projection

The time framework for the analysis is as follows. We project populations from midyear (June 30–July 1) in one year to midyear in the next year. This enables us to compare our estimates and projections with those of the Office for National Statistics, which are produced for midyears. Where necessary, we averaged statistics for successive calendar years to estimate midyear-to-midyear interval variables. We define the starting point of our projection (the jump-off point) to be mid-2001. We use the projection model for all subsequent midyear-to-midyear intervals. For the first set of years, from 2001–2002 to 2007–2008, the outputs are estimates rather than projections because we use published data to estimate the inputs to the projection. From 2008–2009 onwards the inputs are set by assumption. Our estimates in the years 2001–2002 to 2007–2008 are independent and distinct from the ethnic population estimates for local authorities produced by ONS (Large and Ghosh 2006a, b). We chose to produce independent estimates because the ONS population estimates do not use ethnic-specific mortality, have low ethnic-fertility estimates and use local authority immigration estimates based on data from sample surveys (International Passenger Survey, Labour Force Survey), which may not yield reliable estimates.
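The averaging of successive calendar years can be illustrated with a short Python sketch. It assumes a simple equally weighted mean of the two calendar years that overlap each midyear-to-midyear interval; the function name and the figures are hypothetical.

```python
def midyear_interval_values(calendar_values):
    """Average successive calendar-year values to approximate midyear-to-midyear intervals."""
    return {
        f"{year}-{year + 1}": 0.5 * (calendar_values[year] + calendar_values[year + 1])
        for year in sorted(calendar_values)
        if year + 1 in calendar_values  # the last calendar year has no successor
    }

# Hypothetical calendar-year counts (for example, births in one local authority)
calendar_births = {2001: 100.0, 2002: 110.0, 2003: 120.0}
interval_births = midyear_interval_values(calendar_births)
print(interval_births)
```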

Fertility estimates, trends and assumptions

Age-specific fertility rates (ASFRs) by ethnic group are not readily available in the UK. In the following section we describe the steps used to estimate ethnic-group-specific ASFRs for local authorities (LAs) in the UK (see Wohland et al. 2010).

Calculating a time-series of ASFRs and TFRs from the 1980s to 2006 has been achieved for all women using vital statistics on births and official midyear estimates as denominators with all data allocated to the LA geography by the national statistics agencies (Tromans et al. 2008). TFRs by ethnic group and LA are estimated from 1991 and 2001 Census data using child-to-woman ratios (CWRs) which are assumed to emulate family size by ethnic group (Sporton and White 2002). Annual trends in national level ASFRs by ethnic group are derived from the Labour Force Survey (LFS) by modelling the probability of a woman having a child based on her age and ethnicity.

These data are combined to provide the fertility estimates. For each year from the early 1980s to 2006, fertility trends for all women have been identified for each LA and by ethnic group at national level using the LFS. The UK’s Census provides indicators of changes in family size by ethnic group between 1991 and 2001. In combination, these variables underpin the calculation of ASFRs and trends for all LAs across the UK by ethnic group, as appropriate to each country.

Assumptions are needed on the direction of fertility in the future. Fertility rates have risen recently (Tromans et al. 2008) from an all-time low in 2001; demographic momentum and social change will affect the number of future births. Since we have information estimated from 1991 for ethnic groups assumed common across the 1991 and 2001 Censuses, we can use a trend over this time period which encompasses both falling and rising fertility and differences by age of woman and by ethnic group. The trends for each age and broad ethnic group are modelled using curve fitting, with the parameters of the curve applied to estimate future fertility rates up to the year 2021. The general picture is of parallel curves across the groups with relative differences maintained, but the White group shows less of a decline between 1991 and 2001 than the general trend and, after the current period, the fertility of the White and Other groups stays roughly constant whilst the fertility levels of all other ethnicities tend to decline.

In the projection model, the decline (growth) rates from one year to the next by 5-year age group are used to scale the single-year information after the projection jump-off point. Taking these model-based assumptions past 2021 is ill-advised, so the rates after that time point are assumed to stay constant. The trends for each broad group are applied to constituent subgroups, e.g. White rates to White–British, to White–Irish and to White Other. Table 5 sets out the assumed TFRs. The highest TFRs in 2006–2011 are experienced by the Pakistani and Bangladeshi groups, followed by the Indian group, all above a UK replacement rate of 2.07 (Smallwood and Chamberlain 2005). All the other groups have below-replacement fertility rates, with the TFRs for the Chinese group particularly low. Small declines in South Asian fertility are assumed to 2021 while the White and Mixed groups experience smaller declines. Thus we assume convergence towards UK fertility norms of smaller families.
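The scaling of single-year rates by 5-year group trends can be sketched in Python as below. For brevity the sketch assumes a constant annual rate of change per 5-year age group, whereas the fitted curves described above imply year-specific changes. The function names, the freeze year parameter and all values are hypothetical.

```python
def project_asfr(base_asfr, annual_change, base_year, target_year, freeze_year=2021):
    """Scale single-year ASFRs by their 5-year group's annual change, held constant after freeze_year."""
    # No further change is applied beyond the freeze year
    steps = max(0, min(target_year, freeze_year) - base_year)
    projected = {}
    for age, rate in base_asfr.items():
        group_start = 5 * (age // 5)  # e.g. ages 25-29 map to group 25
        projected[age] = rate * (1.0 + annual_change[group_start]) ** steps
    return projected

def total_fertility_rate(asfr):
    """TFR as the sum of single-year age-specific rates."""
    return sum(asfr.values())

# Hypothetical single-year ASFRs and annual changes by 5-year group
base = {25: 0.10, 26: 0.11, 30: 0.09}
change = {25: -0.01, 30: 0.0}
asfr_2011 = project_asfr(base, change, base_year=2006, target_year=2011)
asfr_2021 = project_asfr(base, change, base_year=2006, target_year=2021)
asfr_2031 = project_asfr(base, change, base_year=2006, target_year=2031)
print(round(total_fertility_rate(asfr_2021), 4))
```

In this sketch the rates for 2031 are identical to those for 2021, which mirrors the assumption that rates stay constant after the fitted trend ends.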

Table 5 The fertility assumptions of the UPTAP projections

Mortality estimates, trends and assumptions

Mortality data by ethnic groups are not available in the UK since a person’s ethnic group is not registered when they die. Even though a place of birth has been noted on English death certificates since 1969, this only indicates mortality for first-generation immigrants. A direct source for ethnic-group mortality is the ONS Longitudinal Study (LS) but this represents only 1% of the England and Wales population and has considerable loss to follow-up of LS members, up to 30% at older ages (Harding and Balarajan 2002). The LS cannot provide local mortality information.

Longitudinal research finds that self-reported health is a strong predictor for subsequent mortality, for both total populations and subgroups (e.g. Burström and Fredlund 2001; Heistaro et al. 2001; Helweg-Larson et al. 2003; McGee et al. 1999). Thus, with no adequate ethnic mortality data available, we use a proxy measure for which data are available by UK LA level and ethnic group: answers to the 2001 Census question, ‘Do you have any long-term illness, health problem or disability which limits your daily activities or the work you can do?’.

To estimate mortality by ethnic group, we use a suite of census, official midyear population estimates and vital statistics data to estimate ethnic-group life expectancy. First, we calculated standardized illness ratios (SIRs) for each LA by sex with data from the 2001 Census. We also calculated standardized mortality ratios (SMRs) for all local areas and both sexes from midyear population estimates and vital statistics mortality data. Next, we use these ratios to define all-person SMRs as a function of all-person SIRs. This all-person function is then applied to each ethnic group’s local-area SIR to calculate an ethnic group-specific SMR. These ethnic-group SMRs are used to adjust upwards or downwards age-sex specific mortality rates (ASMRs) for each local area. These ASMRs are fed into life tables to derive survivorship probabilities for our projection model. During this procedure, we found men reporting less illness than women but experiencing higher mortality. We also found different SIR/SMR relationships for the UK’s constituent countries.
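The chain from illness ratios to adjusted mortality rates can be illustrated with a small Python sketch. Two assumptions are made purely for illustration: the all-person relationship between SMR and SIR is taken to be a straight line fitted by least squares, and the local ASMRs are scaled by the ratio of the ethnic-group SMR to the local all-person SMR. The source does not state the functional form or the exact adjustment, and all names and numbers are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical all-person SIRs and SMRs for four local areas (standard = 100)
all_person_sir = [80.0, 100.0, 120.0, 140.0]
all_person_smr = [90.0, 100.0, 110.0, 120.0]
intercept, slope = fit_line(all_person_sir, all_person_smr)

# Apply the all-person function to one ethnic group's SIR in one local area
ethnic_sir = 130.0
ethnic_smr = intercept + slope * ethnic_sir

# Scale the local all-person ASMRs up or down (assumed adjustment, see lead-in)
local_smr = intercept + slope * 120.0
local_asmr = [0.001, 0.005, 0.020]
ethnic_asmr = [rate * ethnic_smr / local_smr for rate in local_asmr]
print(round(ethnic_smr, 1), [round(r, 5) for r in ethnic_asmr])
```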

Thus, we estimated life expectancies and survivorship probabilities for all ethnic groups defined in the UK 2001 Census for each local authority, by single year of age and sex. Table 6 shows population-weighted ethnic-group life expectancy in rank order for men and women. Three groups are ranked above the national average for women and four for men, with the Chinese group on top in both cases. Within the White group, we estimate the White Irish group to occupy the lowest rank. This ranking is due to the rather low life expectancy for Irish men, whereas life expectancy of Irish women is estimated to be close to that of White British women. The lowest life expectancies are for the Bangladeshi and Pakistani groups which have the poorest labour-market positions (Simpson et al. 2006). That the Other Asian and the Indian groups occupy moderate ranks shows the importance of having well-defined subgroups. We also find a strong contrast in the Black group, where the Black African group is one rank below the total population, in contrast to the Black Caribbean group which occupies rank 12. The Black African estimate is reasonable considering the ‘healthy migrant’ effect (Fennelly 2005). Persons changing countries are advantaged in various ways (compared with their origin and/or their destination populations) including good health which enables their move. The Black African group contains a higher proportion of recent immigrants than the Black Caribbean group which is longer-established in the UK.

Table 6 Mean life expectancies at birth for men and women by ethnic group, 2001

To establish recent trends, before ethnic mortalities are introduced into the population projection, they are updated to 2007. Since there is no comprehensive source of local ethnic illness data beyond the 2001 Census, we update ethnic mortality in line with the local mortalities for all groups. We use abridged life tables for local areas for 2001 (2000–2002) to 2007 (2006–2008) to update the survivorship probabilities needed for the projection model. For each ethnic group and local area, we multiply the survivorship probability from 2001 by the year y to 2001 ratio:

$$ s_{xc,g}^{ei} (y) = s_{xc,g}^{ei} (2001)\frac{{s_{Xc,g}^{+i}(y)}}{{s_{Xc,g}^{+i} (2001)}} $$
(20)

where \( s_{xc,g}^{ei} (y) \) is the survivorship probability for ethnic group e, area i, single age period-cohort xc, gender g in year y, \( s_{xc,g}^{ei} (2001) \) is the same probability for 2001, \( s_{Xc,g}^{+i} (y) \) is the survivorship probability for all groups, area i, 5-year age Xc, gender g in year y and \( s_{Xc,g}^{+i} (2001) \) is the same probability in the year 2001.
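A worked instance of Eq. 20 in Python is given below. The cap at 1.0 is an added safeguard that is not part of Eq. 20, and the function name and probabilities are hypothetical.

```python
def update_survivorship(s_ethnic_2001, s_all_year, s_all_2001):
    """Eq. 20: scale the 2001 ethnic-group probability by the all-group year-to-2001 ratio."""
    updated = s_ethnic_2001 * s_all_year / s_all_2001
    # Assumption for illustration only: keep the result a valid probability
    return min(updated, 1.0)

# Hypothetical values: ethnic-group probability 0.9800 in 2001;
# all-group probability for the matching 5-year age group rises from 0.9850 to 0.9900
s_2007 = update_survivorship(0.9800, 0.9900, 0.9850)
print(round(s_2007, 4))
```

The ethnic-group probability therefore moves in proportion to the local all-group improvement, so relative differences between groups in 2001 are carried forward.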

For the TREND projection, we implemented the assumptions built into the 2008-based National Population Projections. These involve adopting rates of percentage per annum decline in mortality rates for each age and sex. The declines start with the experience of recent years and then are converged to a uniform percentage decline across all ages and sexes within 25 years and held constant thereafter.

In our model we work with non-survivorship probabilities for period-cohorts rather than mortality rates for period-ages and, after trending, convert them back into survivorship probabilities. For the TREND projection we adopted the long-term rate of decline of 1% used by ONS. For our own UPTAP projections we adopted a higher (2%) rate of decline. Table 7 shows the period life expectancies associated with our 2% decline assumption. The rate of improvement in life expectancy under our UPTAP assumption looks at first glance quite optimistic, but is, in fact, half the rate experienced between 1981 and 2006.
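The trending of non-survivorship probabilities can be sketched as follows. The sketch assumes a constant annual percentage decline applied from the base year, and so leaves out the initial age-specific declines and their convergence over 25 years. The function name and the base probability are hypothetical.

```python
def trend_survivorship(s_base, annual_decline, years):
    """Reduce the non-survivorship probability by a fixed annual rate, then convert back."""
    ns_base = 1.0 - s_base
    ns_trended = ns_base * (1.0 - annual_decline) ** years
    return 1.0 - ns_trended

# Hypothetical base survivorship probability of 0.98 for one period-cohort
s_one_percent = trend_survivorship(0.98, 0.01, 10)   # 1% long-term decline
s_two_percent = trend_survivorship(0.98, 0.02, 10)   # 2% long-term decline
print(round(s_one_percent, 5), round(s_two_percent, 5))
```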

Table 7 Life expectancies, UPTAP projections, with 2% rate of decline of mortality rates

International migration estimates, trends and assumptions

There are various alternative sources which provide intelligence about the movement of population into and out of the UK (Rees et al. 2009a). These sources include census, survey, administrative and ‘composite’ datasets with each having its limitations depending upon the question asked, purpose of data collection and the population covered (see Green et al. 2008; Rees and Boden 2006). The UK’s official source of data on immigration and emigration is the Total International Migration (TIM) statistics (ONS 2010a), which are primarily based on the International Passenger Survey’s question on each migrant’s intentions to stay or leave the UK. Immigration estimation in the Labour Force Survey (LFS) is part of the official subnational calibration process with 2001 Census data used for the proportional allocation of flows to local authority areas. To estimate emigration a ‘migration propensity’ model (emigration rate multiplied by the population at risk) is adopted by ONS to estimate the distribution of flows from each local authority. An ONS ongoing program of improvement to international migration statistics includes an evaluation of the explicit use of administrative statistics (Bijak 2010; ONS 2009a; Rees et al. 2009a).

A ‘New Migrant Databank’ (NMD) (Rees and Boden 2006) has been developed to produce a repository of UK-wide migration statistics from national to local authority level (Boden and Rees 2008, 2009, 2010). The NMD provides a single source of migration statistics for each LA and has facilitated the development of alternative migration estimation methods. Using the NMD repository in parallel with data from the ONS improvement program, we have developed methods for sub-national estimation incorporating intelligence from administrative datasets. An alternative method for distributing immigration flows has been derived by combining TIM statistics at a national level with subnational statistics from three administrative sources: National Insurance Number (NINo) registrations by migrant workers, the registration of international migrants with a local GP and Higher Education Statistics Agency (HESA) data on international students (Boden and Rees 2009). The method uses flow ‘proportions’ to distribute national TIM totals to subnational areas. The number of immigrants to a local authority is estimated by decomposing the national total by purpose of immigration (formal study, work or other purpose), using probabilities derived from the International Passenger Survey. The national total by purpose is then allocated to regions using probabilities based on administrative data: HESA data on foreign students, new NI numbers issued to persons previously resident abroad and new NHS numbers issued to patients previously resident abroad. These regional totals by purpose are then allocated to local authorities using the NHS new registrations by immigrants.
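The three-stage proportional allocation can be illustrated with a minimal Python sketch. The purpose, region and local authority shares below are invented for the example, as are the region and LA labels; only the sequence of steps follows the description above.

```python
def allocate_immigration(national_total, purpose_shares, region_shares, la_shares):
    """Distribute a national total by purpose, then region, then local authority."""
    la_totals = {}
    for purpose, p_share in purpose_shares.items():
        for region, r_share in region_shares[purpose].items():
            # National total by purpose allocated to a region
            regional_flow = national_total * p_share * r_share
            for la, l_share in la_shares[region].items():
                # Regional total allocated to LAs using one set of within-region shares
                la_totals[la] = la_totals.get(la, 0.0) + regional_flow * l_share
    return la_totals

# Hypothetical proportions (each set sums to one)
purpose_shares = {"study": 0.25, "work": 0.5, "other": 0.25}
region_shares = {
    "study": {"North": 0.4, "South": 0.6},
    "work": {"North": 0.3, "South": 0.7},
    "other": {"North": 0.5, "South": 0.5},
}
la_shares = {
    "North": {"LA1": 0.5, "LA2": 0.5},
    "South": {"LA3": 0.25, "LA4": 0.75},
}
la_totals = allocate_immigration(1000.0, purpose_shares, region_shares, la_shares)
print({la: round(total, 1) for la, total in la_totals.items()})
```

Because every stage uses proportions that sum to one, the local authority estimates add back to the national total.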

The alternative model results in a very different distribution of immigration flows to that recorded in official statistics (Boden and Rees 2009). This redistribution of immigration flows reflects the differences between immigration counts derived from administrative sources and those produced from ONS estimates which combine IPS and LFS sample data with census counts at a local level. There are significant differences between our estimates and the ONS estimates of immigration. We suggest that a distribution of flows based on administrative data is likely to be more robust than an estimation process which relies upon a relatively small national sample (IPS) in combination with the dated census to produce its local authority estimates. For our local authority estimates of international migration by ethnic group we have used our alternative immigration totals based on the ‘administrative data’ model. In the absence of further empirical evidence on emigration we have retained the existing emigration estimates produced by ONS for each local authority.

Our chosen disaggregation of immigration and emigration flows by ethnicity, age and sex has relied upon census information in combination with aggregate age-sex profiles from the published TIM statistics. For immigration, local authority totals have been disaggregated by ethnic group using local area profiles from the 2001 Census immigration tables. Decomposition by single year of age and sex has then been applied using the national age-sex schedule in 2001. To make the age-sex profile consistent with the most recent evidence at a national level, the age-sex profile of immigration has been constrained to the TIM aggregate age-group totals recorded since 2001. This composite estimation process has produced an immigration profile by ethnicity, age and sex for each local authority area.

For emigration the process of ethnicity, age and sex disaggregation has required a more creative approach given the absence of census information on international outflows. Using TIM statistics at a national level, an estimate of the British/non-British split of emigration was derived. Using this split at a local authority level, the ethnic profile of non-British emigration flows has been based upon the observed 2001 Census immigration profile; the ethnic profile of British emigration flows has mirrored that of the 2001 Census internal out-migration profile. The same age and sex profiles were applied as for immigration, although the TIM aggregate age split for emigration provided an important additional weight to the profile of emigration flows. The emigration estimation is by no means a perfect solution but one which makes best use of the alternative sources that are available.

Later in the paper we explain how we construct four different projection scenarios. In three of these projections we handle immigration and emigration streams as fixed flow counts. In a fourth projection, labelled ER (emigration rates), we adopt an alternative model for emigration, recognizing that the populations at risk of emigration are known and that emigration can be projected by multiplying a UK population at risk by an assumed emigration rate. The resulting flows are not adjusted to an assumed total but are free to change as the populations at risk change.

These two alternative models of emigration adopt different views of the international migration system. Use of flow totals is based on the assumption that emigration flows can be controlled through policy. Use of populations at risk and emigration rates assumes that migrants are free to move to other parts of the world like internal migrants because there is no policy constraint on emigration applied in the UK. Both views are partly true. Some immigration streams are subject to control but other immigration streams are not. There are no constraints on the return of nationals who have moved overseas, on the migrant flows from the European Union, and on the migration of family members who are able to join immigrants with the right to reside permanently. Conversely, while emigrants are free to migrate to some destinations such as other European member states, other destinations have their own immigration controls which will affect emigration from the UK. In the projections reported later, we are able to measure what effect these alternative models of international emigration have on the projected population.
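The contrast between the two emigration models can be illustrated with a toy single-population projection in Python. This is a deliberately simplified sketch with one aggregate population, a fixed immigration flow and a constant natural growth rate; it is not the cohort-component model, and every name and number is hypothetical.

```python
def project_population(pop, years, immigration, natural_growth,
                       emigration_flow=None, emigration_rate=None):
    """Toy projection with emigration as a fixed flow or as a rate on the population at risk."""
    net_by_year = []
    for _ in range(years):
        if emigration_rate is not None:
            emigrants = emigration_rate * pop   # rates model: flow follows the population
        else:
            emigrants = emigration_flow         # flows model: flow fixed by assumption
        net = immigration - emigrants
        net_by_year.append(net)
        pop = pop * (1.0 + natural_growth) + net
    return pop, net_by_year

# Both models start from the same emigration level (10 per year from 1,000 people)
pop_ef, net_ef = project_population(1000.0, 40, 20.0, 0.005, emigration_flow=10.0)
pop_er, net_er = project_population(1000.0, 40, 20.0, 0.005, emigration_rate=0.01)
print(round(pop_ef), round(pop_er), round(net_ef[-1], 2), round(net_er[-1], 2))
```

In the rates run the net balance shrinks every year as the growing population generates more emigrants against a fixed immigration flow, whereas in the flows run it stays constant.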

Table 8 sets out the net international migration result of our estimates and assumptions for the UPTAP projections for the current five-year period leading up to the next census, a period 25 years hence and a period at the end of our projection horizon. The table shows how much effect a switch from a fixed-flows model to a rates model for emigration has on projected net international migration. Under the flows model, all groups except the White British experience net immigration. Under the emigration-rates model the net international balances fall dramatically because a growing population leads to rising emigration balanced against a fixed immigration flow.

Table 8 Net international migration associated with the UPTAP projections

Internal migration estimates, trends and assumptions

As the purpose of the paper is to present an account of our ethnic population projections at national scale, we summarize the methods used to develop internal migration probabilities. Internal migration between local areas within the UK does have some effect on the national population as it changes the distribution of groups across areas with different growth regimes (see Wohland et al. 2011).

We require probabilities of migration conditional on survival by ethnicity as model inputs. Using a commissioned table from the 2001 Census of inter-local-authority migration by 16 ethnic groups, we assembled the bi-regional accounts of Table 3. Surviving outmigrants from each LA and its rest-of-the-country partner region are divided by the totals of within-country survivors to produce probabilities of migration within the UK conditional on survival within the UK. National age-sex migration profiles in the form of ratios to the average migration probability are applied to the all-age conditional probabilities to generate the age-sex probabilities of internal migration required in the projection model. Finally, we update these probabilities from 2000–2001, the year before the census, using migration flow information for 2001–2002 to 2007–2008 supplied by ONS. Important gaps in the flow matrix had to be filled using methods explained in Dennett and Rees (2010). Beyond 2008 we hold the internal migration probabilities constant, as the 2001–2008 time series had exhibited a high degree of stability. Table 9 illustrates the outcomes of these estimates and assumptions for selected periods for two of the projections.
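The application of a national age profile to an all-age conditional probability can be sketched in Python as below. The cap at 1.0 is an added safeguard, and the function name, ages and ratios are hypothetical.

```python
def age_specific_probs(all_age_prob, age_ratios):
    """Multiply an all-age conditional migration probability by age-specific ratios to the average."""
    # Assumption for illustration only: cap each result so it remains a probability
    return {age: min(1.0, all_age_prob * ratio) for age, ratio in age_ratios.items()}

# Hypothetical all-age probability for one group and zone, and ratios to the national average
probs = age_specific_probs(0.05, {20: 2.0, 40: 1.0, 70: 0.4})
print({age: round(p, 3) for age, p in probs.items()})
```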

Table 9 Projected totals of inter-zone migration for 355 zones by ethnic group (1,000s)

Design of the projections

Here we describe the assumptions which underpin each projection. Table 10 sets out the design for four projections we have implemented.

Table 10 Assumptions used in the four projections

The Benchmark projection

The ‘Benchmark projection’ was designed to test the model and the associated R software. We used as start populations the 2001 midyear ethnic-group population estimates produced by the Office for National Statistics for local authorities in England supplemented by our own estimates of the ethnic-group populations of Wales, Scotland and Northern Ireland adjusted to the England and Wales classification. We made our own estimates of the components of change in local ethnic populations. We did not use ONS ethnic-group estimates because our methods and estimates of these components differ to a greater or lesser extent. The benchmark estimates use component estimates for the midyear to midyear interval 2001–2002, except for internal migration derived from the census, which refers to the year before the census date, 29 April 2001. We assume that these benchmark component intensities (rates, probabilities or flows) continue unchanged into the future. Such a projection is, of course, likely to be wrong but it serves as a comparator for later projections in which more recent information is introduced along with variable assumptions. What is remarkable about the benchmark projection is how far it differs from later ones and the 2008-based ONS National Population Projection (NPP). These differences are due to radical rises in fertility and immigration in the decade after 2001 and the continued fall in mortality rates.

The Trend projection

The second projection we term ‘Trend’, which indicates that we made estimates of the components of change for years subsequent to 2001–2002 using published ethnic information, for example the fertility and international migration components, or by assuming that all group population trends applied to ethnic groups, for example the mortality and internal migration components. We were able to make such updated estimates for all years to 2006–2007. From midyear 2007 forward we continue the latest estimate rates, probabilities and flows forward at levels aligned as far as possible with the assumptions made in the 2008-based National Population Projections (ONS 2009c). The internal migration assumptions derive from the Sub-national Projections for England, which, in fact, assume continuation of the redistribution effected in the 2004–2006 migration estimates. An analysis of internal migration trends (Dennett and Rees 2010) suggests a fair measure of stability. However, even the application of a constant migration structure results in substantial changes in the distribution of populations across local areas and in our projection of ethnic-group populations across the local areas of England.

The UPTAP-EF and UPTAP-ER projections

The third and fourth scenarios we call the ‘UPTAP’ projections. UPTAP stands for Understanding Population Trends and Processes, the ESRC program which supported the research. Here we have applied our own judgments on the assumptions for the future from 2007–2008 onwards, which may differ from or coincide with the official assumptions by ONS, GROS, NISRA and WAG. For ethnic fertility our assumptions are usually higher than those estimated by ONS though we adopt roughly the same view about long-term fertility. Our long-term mortality improvement assumption of 2% decline per annum is more optimistic than ONS’s convergence to a 1% decline. Our international migration assumptions are a little lower than the ONS assumptions in the UPTAP-EF (Emigration Flows) projection and substantially below the ONS assumptions in the UPTAP-ER (Emigration Rates) projection. The assumptions used in the UPTAP projections are reported for fertility in Table 5, for mortality in Table 7, for international migration in Table 8 and for internal migration in Table 9.

Projection results

The aim here is to present the results of our four projections, selecting highlights and comparing 2001 and 2051 populations. We first present the summary populations for the UK, compare them with the official projections and discuss the reasons for the differences. Then we describe the projected ethnic-group populations, showing how each group evolves in the four projections. In this paper we discuss the national results of our projections, the summation of projections for 355 zones and 16 ethnic groups. More details of local results are provided in Wohland et al. (2010) and Rees et al. (2011). A set of files with selected population and component outputs has been deposited with the UK Data Archive (2011). The full results of our projections are available in a web accessible database (Norman et al. 2011).

Projections for the United Kingdom

Table 11 presents the total populations for the United Kingdom in 2001 and 2051. Figure 1 graphs the projected populations and adds the projected populations from the 2008-based ONS National Population Projections (ONS 2009c).

Table 11 Projected ethnic populations (1,000s), 2001–2051
Fig. 1 Trends in the UK population, ONS 2008-based projections and five ethnic group projections, 2001–2051

A comparison of the Benchmark projection (BENCHEF in Fig. 1) with the other three projections shows how profoundly the UK’s demographic regime has changed in the 2000–2009 decade, with increased net inflows from outside the UK, increased fertility rates leading to higher numbers of newborns and continued improvement in survival chances leading to higher numbers of older people.

The UK population was 59.1 million in 2001; under the 2008-based NPP, the population grows steadily to 77.1 million by mid-century. If this level of growth comes to pass, it is likely that the UK will have Europe’s largest population (Europa 2008; Rees et al. 2010b). Our Trend projection, with assumptions aligned with those of the 2008-based NPP (ONS 2008a), produces slightly higher projected populations. The UPTAP-EF projection, using a model that handles international migration as flows, produces slightly higher numbers than the Trend projection (TRENDEF in Fig. 1).

The NPP projection is a set of four single region cohort-component models linked by a matrix of net migration flows between the four home countries. Our results come from summing the projected 16 ethnic-group populations for 355 zones using a bi-regional cohort-component model that links zones through internal migration and ethnic groups through mixed-ethnicity births. We can interpret the differences between NPP-2008 and the Trend projection as a result of using 5,680 local ethnic-group populations compared with four national populations. The differences between the Trend and UPTAP-EF projections can be interpreted as mainly due to the additional population surviving to older ages because of the more optimistic UPTAP mortality assumptions.

The fourth projection, the UPTAP-ER projection, shows projected populations that differ considerably from the Trend projection. The model for handling emigration is different: we use rates of emigration multiplied by populations at risk to project the numbers of emigrants. As the projected population grows so does the number of emigrants, so the net contribution of international migration to population growth diminishes because immigration is assumed to be a set of roughly constant flows. This asymmetry in the treatment of the immigration and emigration streams leads to 7.4 million fewer people in 2051 compared with the 2008-based NPP projections. We regard the UPTAP-ER projection as the most likely future trajectory for the UK population, though the differences between projections indicate how uncertain this conclusion is. In the analysis of our projection results that follows we mainly present results of the preferred UPTAP-ER projection. Selected results from the Trend and UPTAP-EF projections are presented as appropriate.

Projections for the 16 ethnic groups

Our analyses yield projected populations for 16 ethnic groups for the whole UK, summing the results for the individual zones; these sums are set out for our four projections in Table 11. In the Benchmark projection, we see that the White British and White Irish groups decrease in size by 2051, while the other ethnic-group populations grow, in some cases substantially. The differences between groups are due mainly to the following factors: the favourable age structure for growth in many minority groups, concentrations in the fertile age range leading to a favourable demographic momentum; the higher fertility rates for some groups; and the higher gains from international migration, counterbalanced for some groups by higher mortality.

How does the ethnic composition of the UK population change under the four projections? In 2001, 87% of the UK population was White British (the host group) and 13% belonged to ethnic minorities. Some 92% of the population was White (the first three groups) and 8% non-White. In 2051 the White British share of the population falls to between 73 and 80% while the White share falls to between 83 and 86%. The difference between the White British and White shares is due mainly to the rapid growth of the Other White population, which gained from heavy immigration during the 2000–2009 decade that is reflected in the Trend and UPTAP-EF projections. The UPTAP-ER projection assumes that growing numbers of migrants from Eastern Europe will return home. The latest international migration estimates suggest that this has begun.

In Fig. 2, we plot the ethnic group changes as time-series indexes using a 2009 base (2009 = 1), the jump-off point from data-informed estimates into true projection. The sixteen ethnic groups are arranged into four groups for presentation purposes: White and other groups that grow slowly (Fig. 2a), mixed groups that grow rapidly (Fig. 2b), South Asian and Other Asian groups which grow strongly (Fig. 2c) and various newer groups that grow strongly (Fig. 2d).

Fig. 2 Time series indices for 16 ethnic groups, UPTAP-ER projection, 2009–2051. a Lower growth groups, b four mixed groups, c four traditional groups, d four newer groups
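The indexing used for these time series (2009 = 1) amounts to dividing each year's population by the base-year population. A minimal sketch follows; the function name `index_series` and the population values are hypothetical and are not figures from the projections.

```python
def index_series(series, base_year):
    """Rescale a {year: population} series so that the base year equals 1."""
    base = series[base_year]
    if base == 0:
        raise ValueError("base-year population must be non-zero")
    return {year: value / base for year, value in series.items()}


# Hypothetical population of one group, in thousands.
population = {2009: 1200.0, 2031: 1800.0, 2051: 2400.0}
indexed = index_series(population, base_year=2009)

for year, value in indexed.items():
    print(year, round(value, 2))
```

An index of 2 in 2051 would mean that the group has doubled in size since the 2009 jump-off point, which makes groups of very different absolute size comparable on one plot.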

The white and other groups that grow slowly (Fig. 2a)

The White British group grows by 10% over the 50 years under the UPTAP-ER projection. The age profile becomes older over the 2009–2051 period. The White British share of the population falls by 12 percentage points under the Trend projection, 11 under the UPTAP-EF projection and 7 under the UPTAP-ER projection. The White Irish group has its origins in a long history of migration between Ireland and the UK; by 2051 the older ages are made up of the children of a 1940s and 1950s wave of migrants from Ireland. Fertility levels of this group are forecast to be low. Intermarriage and assimilation mean that offspring ‘move’ into the White British group. Under the Trend and UPTAP-EF projections the group grows by 11–12% but by only 1% under the UPTAP-ER projection, where more of the group return to the Irish Republic. The group’s share of the UK population falls under all projections, from 2.46% in 2001 to 2.04–2.08% in 2051 under the Trend and UPTAP projections. The Other White group grows strongly at first, because of the large influx of new migrants from central and east Europe, but then levels off. Their fertility is, however, low, and the group has a low number of children by 2051; there is also evidence that there is likely to be return migration to Poland (ONS 2010c). Under the Trend and UPTAP-EF projections the Other White population increases by over 200% by 2051 while it increases by just under 100% in the UPTAP-ER projection. Under this projection emigration rises so that fewer people are added to the group’s population. The Black Caribbean population experiences a high level of emigration back to its West Indies origins. The population growth for the group between 2001 and 2051 varies between 21% (UPTAP-ER projection) and 42% (Trend projection). The UPTAP-ER projection applies emigration rates to the UK local populations which reflect high levels of return migration to the West Indies among older ages. Continuing low fertility and a high level of mixed marriages or unions mean the demographic momentum effect is subdued and return migration reduces ageing.

The mixed groups (Fig. 2b)

The mixed groups all have a very young age structure in 2001 and so have the potential to grow substantially as the children move into the family-building ages. The White and Black African group grows fastest, followed by the White and Asian group and the Other Mixed group. The White and Black Caribbean group grows slightly less. The White and Black Caribbean group increases by between 170 and 232% of its 2001 population, depending on projection. Its share of the population increases to around 1% of the population. The White and Black African group grows by between 210 and 337% of its 2001 population, depending on projection. Its share of the population increases to between 0.37 and 0.46% of the population. The White and Asian group grows by 187% to 298% of its 2001 population, depending on projection. Its share of the population increases to 0.8–1.0% of the population. The Other Mixed group increases by between 184 and 313% of its 2001 population, depending on projection. Its share of the population increases to 0.65–0.84% of the population.

The traditional immigrant groups (Fig. 2c)

The Asian groups all have a young age structure in 2001 reflecting their immigration in the 1960s to 1990s, so they have the potential to grow given the concentration of the population in the family-building ages. The Pakistani group grows fastest, followed by the Bangladeshi and Other Asian groups and the slower-growing Indian group. The Indian group population increases by 95% to 149% of its 2001 population by 2051, depending on projection. Its share of the population increases from 1.8% to 3.0–3.4%. In 2001 the Indian group was the third largest ethnic minority group after the Other White and White Irish groups; in 2051 it is projected to be the second largest. The Pakistani group increases its population by 125–179% of its 2001 population by 2051, depending on projection. Its share of the population increases from 1.3% to 2.4–2.7%. In 2001 the Pakistani group was the fourth largest ethnic minority group after the Other White, White Irish and Indian groups; in 2051 it is projected to be the third largest. The Bangladeshi population increases 2.2–2.6 times between 2001 and 2051, depending on the projection chosen. Its share of the population increases from 0.5% to close to 1%, about twice its 2001 share. The Other Asian population increases by 105–194% of its 2001 numbers by 2051, depending on projection; its share of the population increases from 0.4% to 0.73–0.95%.
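Growth is reported above both as a percentage increase over the 2001 population and as a multiple of it, so a small conversion helps when comparing groups. The sketch below assumes that a statement such as "increases 2.2–2.6 times" refers to the ratio of the 2051 to the 2001 population; that reading is an assumption rather than something the text confirms. The helper names are hypothetical.

```python
def percent_increase_to_multiple(percent_increase):
    """Convert growth expressed as a % increase into end/start population ratio."""
    return 1 + percent_increase / 100


def multiple_to_percent_increase(multiple):
    """Convert an end/start population ratio into a % increase."""
    return (multiple - 1) * 100


# Ranges quoted in the text for 2001-2051 (low and high projection).
indian_pct = (95, 149)
pakistani_pct = (125, 179)
bangladeshi_multiple = (2.2, 2.6)  # assumed to be a 2051/2001 ratio

indian_multiple = tuple(percent_increase_to_multiple(p) for p in indian_pct)
pakistani_multiple = tuple(percent_increase_to_multiple(p) for p in pakistani_pct)
bangladeshi_pct = tuple(multiple_to_percent_increase(m) for m in bangladeshi_multiple)

print("Indian multiple:", [round(m, 2) for m in indian_multiple])
print("Pakistani multiple:", [round(m, 2) for m in pakistani_multiple])
print("Bangladeshi % increase:", [round(p) for p in bangladeshi_pct])
```

Under that reading, the Bangladeshi range corresponds to an increase of roughly 120–160%, which places it between the Indian and Pakistani ranges at the low end.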

The newer groups (Fig. 2d)

These newer groups are projected to grow substantially under our preferred UPTAP-ER projection. The newer groups all have an age structure in 2001 dominated by the younger age groups in which immigration is high, and they therefore have the potential to grow considerably. The Other Black group grows fastest, followed by the Other Ethnic, Black African and Chinese groups. Note that the ‘Other Black’ and ‘Other Ethnic’ groups are collective labels for a large number of separate ethnicities.

The Black African population increases by 93–191% of its 2001 value by 2051, depending on projection. The Black African share of the population increases from 0.9% to 1.5–2.6%. The Other Black population grows by 110–158% of its 2001 level by 2051, depending on projection; its share of the population increases from 0.17% to 0.30–0.32%. The Chinese population increases by 86–202% of its 2001 value by 2051; its share of the population increases from 0.4% to 0.68–1.15%. The choice of projection makes a substantial difference for this group. As a substantial proportion of this group enters as students taking higher education (HE) courses, it is reasonable to expect high emigration once those courses are completed. The Other Ethnic group, an amalgam of many groups not included elsewhere, increases 2.4–6.7 times its 2001 level by 2051, depending on projection; its share of the population increases from 0.4% to 0.68–1.15%. Choice of projection also makes a substantial difference for this group.

Discussion and conclusions

Comparisons of our projections with other estimates and projections

ONS has a rolling programme for producing midyear ethnic population estimates for local authorities in England (Large and Ghosh 2006b; ONS 2009b, 2010b). We compare estimates for mid-2007 with our projections. ONS estimates the components of population change for each year from mid-2001 using techniques described in Large and Ghosh (2006a). We have developed independent estimates of each component and introduce these estimates as rates, probabilities and flows into our projection model. The projection results for mid-2007 are compared directly with the ONS estimates in Table 12. The differences over just 6 years are considerable: our figure for the England population is 359,000 (0.70%) greater than that of ONS. Our estimates for the White population are larger than those of ONS while our ethnic minority estimates are lower. Some of the lower figures for Asian or Asian British groups or Black or Black British groups may be a result of introducing ethnic-specific mortality as these groups had lower life expectancies than the total population (Table 6). That we should obtain such different estimates over a very short period is concerning. The differences serve to highlight a great deal of uncertainty in estimating the population disaggregated by ethnicity.

Table 12 Comparison of ONS estimates and TREND projections, England, mid-year 2007 (1,000s)

The Greater London Data Management and Analysis group, led by John Hollis, has a history of preparing London Borough projections since the 1970s and of ethnic group projections since 1999, reviewed above. We have aggregated our 16 ethnic groups to match the 10 groups used by the GLA and summed our London Borough projections to yield totals for Greater London. The GLA combines the White and Black Caribbean and White and Black African groups with the Black Other group. The White and Asian group is merged into the Other Asian group while the Other Mixed group is combined into the Other Ethnic group. The GLA projections have an estimate base at mid-year 2008 while the UPTAP-ER projection starts in 2001 and uses the emigration rates model, which matches the technique used by the GLA. The results are set out in Table 13. The UPTAP-ER projections are 2.6% lower than the GLA projections. The UPTAP-ER White population is larger while the Black and Minority Ethnic (BAME) population is smaller. The differences vary between groups: the Indian and Other Asian group populations are very close, while projected numbers in the Black and Other South Asian groups are lower in the UPTAP-ER projections than in the GLA projections. These differences may be a consequence of the adoption of ethnic-specific survivorship probabilities in our projections: these groups have worse than average mortality experience. Other sources of difference may be the way international migration is handled or the constraining of GLA model ethnic projections to the total population projections. The projected percentage of the population of Greater London that belongs to the BAME population is similar though lower in our projections, 35% compared with 40% in the GLA projections. Some 35% of the UK BAME population in 2031 reside in Greater London under our UPTAP-ER projection. Overall, we can be pleased with the degree of similarity of our projections to those of the organization with most experience in this field.
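The aggregation from 16 to 10 groups described above can be sketched as a lookup table followed by a sum. The mixed-group assignments follow the text; the merging of the three White groups into one category and the exact category labels are assumptions made here to arrive at ten groups, and the populations are hypothetical placeholders rather than projected values.

```python
from collections import defaultdict

# Mapping from the 16 projection groups to 10 GLA-style groups.
GLA_GROUP = {
    # Assumption: the three White groups form a single White category.
    "White British": "White",
    "White Irish": "White",
    "Other White": "White",
    # Stated in the text: two mixed groups join the Black Other group.
    "White and Black Caribbean": "Black Other",
    "White and Black African": "Black Other",
    "Other Black": "Black Other",
    # Stated in the text: White and Asian joins Other Asian.
    "White and Asian": "Other Asian",
    "Other Asian": "Other Asian",
    # Stated in the text: Other Mixed joins Other Ethnic.
    "Other Mixed": "Other Ethnic",
    "Other Ethnic": "Other Ethnic",
    # Groups assumed to be carried over unchanged.
    "Indian": "Indian",
    "Pakistani": "Pakistani",
    "Bangladeshi": "Bangladeshi",
    "Black Caribbean": "Black Caribbean",
    "Black African": "Black African",
    "Chinese": "Chinese",
}


def aggregate(population_by_group, mapping):
    """Sum populations of detailed groups into their aggregate groups."""
    totals = defaultdict(float)
    for group, population in population_by_group.items():
        totals[mapping[group]] += population
    return dict(totals)


# Hypothetical populations (thousands), one value per detailed group.
detailed = {group: 100.0 for group in GLA_GROUP}
aggregated = aggregate(detailed, GLA_GROUP)

for group, population in sorted(aggregated.items()):
    print(f"{group}: {population:.0f}")
```

Because every detailed group maps to exactly one aggregate group, the total population is preserved by the aggregation, which is what makes the two sets of projections directly comparable.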

Table 13 Comparison of GLA and UPTAP-ER projections for Greater London, 2031, ten groups

Table 14 assembles results for the UK from Coleman (2010) for 2031 and 2056 and compares them with our UPTAP-ER projections in 2031 and 2051. Again we aggregate group populations from our projections to match the ethnic groups used by Coleman: the White Irish group was merged with the Other White group; the Mixed groups were summed. The Coleman projection produces higher populations for the UK than either of our UPTAP projections. The projections for the White British group and BAME population are very different. In order to understand why this might be we need to compare assumptions. We can ignore our internal migration assumptions because Coleman’s projection is for one spatial unit only. We cannot compare the mortality assumptions because Coleman uses the all-group mortality rates for all ethnicities whereas we use ethnic-specific mortality rates. This difference will probably result in lower projected numbers for Other Black, Bangladeshi and Pakistani groups, given their lower than average life expectancies, while Chinese, Other White and Other Ethnic groups will have higher numbers. We can compare fertility and international migration assumptions and these are quite different. Overall the UK TFR is slightly higher in our projections than in Coleman’s. However, the profiles of fertility across groups are different. We assume higher fertilities for the White British group for the later periods of the projection and the Indian group throughout, while Coleman assumes higher fertility for the other BAME groups. Differences are substantial (over 0.4 of a child) for the Black Caribbean, Black African, Other Black and Other Ethnic groups and higher than the Pakistani and Bangladeshi groups at the start of the projections. These differences will contribute to the differences in projected ethnic mix: in particular, to the lower UPTAP-ER projected populations for the Asian and Black groups.

Table 14 Comparison with the UK ethnic group projections of Coleman (2010) for twelve groups

Reflections

These comparisons have shown that our projections differ considerably from the estimates of ONS and from the projections of Coleman, but are quite close to the projections of the Greater London Authority. There are many sources of difference. First, there are the methods used to estimate the components of change for each ethnic group. Our projections are the only ones to estimate ethnic-specific mortality. Each of the projection endeavours makes estimates of ethnic-group fertility, drawing on vital statistics, survey and census data in different mixes. Our projections assume much lower fertility rates for the main BAME groups than the Coleman projections. The projections differ substantially in the way international migration is allocated across the ethnic groups. Our projections make use of internal migration estimates by ethnicity, drawing on both the 2001 Census and the post-census all-groups migration data, although the internal migration estimates could be improved by using the LFS data used by Raymer et al. (2008) and Raymer and Giulietti (2009). So there is considerable uncertainty about the degree of change in the UK’s ethnic populations. There is, however, agreement about the direction of change: towards increasing population diversity.

Summary of findings

This paper has reported on the findings of an investigation of ethnic population trends at local-area scale in the United Kingdom and built a model to project those trends under a variety of assumptions into the future. To carry out the projections, we have made new estimates of component rates, probabilities and flows for 16 ethnic groups for 355 local areas, the UK summaries of which have been reported in this paper.

The key achievements and findings of the research are as follows. We have designed an innovative model to project forward ethnic-group populations for local areas in the UK simultaneously. The key feature is the bi-regional structure that captures the migration connections between areas and enables simultaneous projection of 355 zone populations. The model handles internal migration through probabilities of outmigration conditional on survival within the country. Such probabilities enable the proper separation of mortality and migration processes. The model design makes possible different configurations of the international migration process as gross or net flows or rates. We have explored two configurations: treating immigration and emigration as gross flows (the EF model) and treating immigration as gross flows and emigration as a product of emigration rates and populations at risk (the ER model). The projections connect together ethnic groups by generating births of mixed ethnic parentage, using information from the 2001 census. The model handles explicitly all population components of change: fertility, mortality, immigration, emigration, internal in-migration and internal outmigration for each local area and for each ethnic-group population. The model uses single years of age from 0 to 100+, which recognizes the need to know more about the distribution of the population of the very old, as the population ages. The model has been written as a set of R scripts; R is a general purpose statistical computer language and package, which handles large arrays well and enables each projection to be run in a couple of hours.
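The separation of mortality and migration through conditional probabilities can be illustrated for a single cohort in a single zone. This is a simplified sketch, not the bi-regional model itself: the function name and the two input probabilities are hypothetical, and international migration is ignored.

```python
def cohort_outcomes(survival_prob, outmigration_given_survival):
    """Split a cohort into three exclusive one-year outcomes.

    survival_prob: probability of surviving the year within the country.
    outmigration_given_survival: probability of leaving the zone for the
        rest of the country, conditional on surviving.
    """
    for name, p in (("survival_prob", survival_prob),
                    ("outmigration_given_survival", outmigration_given_survival)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return {
        "dies": 1.0 - survival_prob,
        "survives_and_leaves": survival_prob * outmigration_given_survival,
        "survives_and_stays": survival_prob * (1.0 - outmigration_given_survival),
    }


# Hypothetical probabilities for one age, zone and ethnic group.
outcomes = cohort_outcomes(survival_prob=0.99, outmigration_given_survival=0.05)
cohort_size = 10_000
for outcome, probability in outcomes.items():
    print(f"{outcome}: {probability * cohort_size:.0f} people")
```

Because migration is conditioned on survival, the three outcomes are mutually exclusive and their probabilities sum to one, so a change in mortality does not silently alter the migration assumption and vice versa.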

New estimates of ethnic-group mortality have been prepared, which show moderate variation. The range in life expectancies between best and worst experience is 5 years, lower than in other countries where equivalent information is available such as the USA or New Zealand. For the UPTAP projections we adopt a decline rate in mortality probabilities of 2% per annum, which is slower than the decline in the last decade, about equivalent to the declines of the past 25 years and faster than the 1% per annum assumed by National Statistics. As a result we project a larger old-age population than alternative projections.
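The effect of the assumed rate of mortality decline can be sketched by compounding the annual decline over the projection horizon. The sketch assumes a constant geometric decline applied every year, which may be simpler than the schedule actually used in the projections; the starting probability and the function name are hypothetical.

```python
def project_mortality(q0, annual_decline, years):
    """Return mortality probabilities for years 0..years under a constant decline."""
    if not 0.0 <= annual_decline < 1.0:
        raise ValueError("annual_decline must be in [0, 1)")
    return [q0 * (1.0 - annual_decline) ** t for t in range(years + 1)]


Q0 = 0.01      # hypothetical starting probability of dying at some age
HORIZON = 42   # years from 2009 to 2051

path_2pct = project_mortality(Q0, 0.02, HORIZON)  # decline assumed in UPTAP
path_1pct = project_mortality(Q0, 0.01, HORIZON)  # decline attributed to National Statistics

print(f"Remaining share of initial mortality at 2%: {path_2pct[-1] / Q0:.2f}")
print(f"Remaining share of initial mortality at 1%: {path_1pct[-1] / Q0:.2f}")
```

Under these assumptions, a 2% annual decline leaves mortality probabilities at well under half their starting level by the end of the horizon, compared with roughly two thirds under a 1% decline, which is consistent with the larger projected old-age population.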

Our fertility rate estimates are based on three sources: annual vital statistics, census populations (mothers and children) and LFS data for post-census information on ethnic fertility. The method is calibrated for 1991 and 2001. For 2006–2011 the total fertility rate estimates range from 1.47 for Chinese women to 2.47 for Bangladeshi women, with TFRs for White women estimated to be 1.88 and for Mixed women 1.74. Asian group fertility is estimated to be higher than Black group fertility. These estimates are higher than those of National Statistics but lower than those of Coleman.

Our work on international migration has focused on improving local-area estimates of immigration using administrative sources. We combined this with the ethnic profile based on the 2001 Census immigrations. These estimates are different from the ONS and Coleman alternatives.

When we align our projection assumptions as closely as possible with the 2008-based National Population Projections (NPP), we obtain a comparable trajectory for the UK population as a whole. In 2051 in these Trend projections, the UK population grows to 78.2 million compared with 77.1 million in the NPP. The gap of 1.1 million is an estimate of the aggregation effect in projection, being due to the difference between projecting four home-country populations and projecting a large number of local authority-ethnic populations.

Our Benchmark projections produced much lower projected populations than the NPP at 62.2 million in 2051. The gap of 6 million people demonstrates the dramatic demographic shift in the 2000s, resulting from the combined effect in the 2001–2009 period of lower mortality (gains of 2.1 years in male life expectancy and 1.5 years in female for the UK 2000–2007), higher fertility (gains of 0.33 of a child in TFR for the UK 2001–2008) and higher net immigration (+154,000 in 2000 and +217,000 in 2007).

The differences between our UPTAP-EF and UPTAP-ER projections demonstrate the impact of a change in the model for emigration. Modelling emigration by applying a fixed rate to a changing population at risk rather than a fixed flow count produces a total population in 2051 that is lower by 8.8 million.

Our projections show huge differences in the potential growth of the different ethnic groups. As a result of these differences, the ethnic composition of the UK will change substantially over the period to 2051.

Our projections establish that the UK in 2051 will be a more diverse society than in 2001 irrespective of which set of component assumptions is adopted. Diversity will be higher under the Trend and UPTAP-EF projections than under the UPTAP-ER projections because of differences in the model for international migration. These differences across our projections and those of others show that the degree of increase in ethnic diversity is still quite uncertain.