Introduction

The origins of two key measures of inequality, the Lorenz curve and the Gini coefficient (also known as the Gini index or Gini ratio), date back more than a century. The American economist Max Otto Lorenz developed the Lorenz curve, which he published in a paper while he was a doctoral student at the University of Wisconsin-Madison (Lorenz 1905). The term “Lorenz curve” was coined by Willford King in 1912 (Schneider 2021). The Italian statistician and demographer Corrado Gini developed the Gini coefficient. Together, these two scholars left an indelible mark on the study of the distribution of income and wealth and on the measurement of inequality.

In his article in The Economic Journal, Dalton (1920) showed how inequality can be measured by using a Lorenz curve. He stressed that this is a simple and convenient graphical method of exhibiting any distribution of income, provided that the Lorenz curve is based on a ranked cumulative distribution in both directions, i.e., of income and number of recipients. More specifically, the Lorenz curve is a graphical representation of income or wealth inequality that plots the cumulative percentage of total income (or some other variable) against the cumulative percentage of the corresponding population, ranked in ascending order of share. The Gini coefficient quantifies the magnitude of inequality as the ratio of the area between the line of perfect equality and the Lorenz curve to the area of the lower triangle beneath that line. It is a measure of the statistical dispersion by which the Lorenz curve bends away from the line of perfect equality (Lerman and Yitzhaki 1984).

Additionally, Dalton (1920) affirmed that Gini was the first to mathematically prove that inequality can be measured as the area enclosed between the Lorenz curve and line of equal distribution. Gini demonstrated that the larger this area, the greater the degree of inequality. In a letter responding to the seminal work of Dalton (1920), also published in The Economic Journal more than a century ago, Gini (1921) suggested that his method of measuring inequality could be applied not only to measuring income inequality, but also to other quantitative characteristics, for example, economic, demographic, anatomical and physiological attributes.

However, the application of these measures to other domains has thus far been rather limited, despite Gini’s belief that they could be applied beyond the traditional measures of income and wealth inequality. The breadth and richness of these relatively simple mathematical and graphical expositions have therefore been largely untapped. This study aims to fill some of that void by utilizing the methodological and theoretical framework of the Lorenz curve and the Gini coefficient to model inequality in the population distribution in Canada. This adaptation of the traditional Lorenz curve constitutes a novel representation of population growth inequality and provides new insights into the distribution of subnational population in Canada.

Literature Review

Decades after his death, Gini’s original work in Italian was republished in English (Gini 2005), which reiterated that the Gini coefficient can be calculated as the area between the line of perfect equality and the Lorenz curve divided by the total area under the perfect equality line. The following paragraphs provide an overview of selected studies that utilize the Lorenz curve and Gini coefficient to measure inequality, first with respect to income and wealth distributions and then in other areas.

Kuhn et al. (2020) analyzed archival data from the Survey of Consumer Finances (SCF) to examine the Gini coefficient for a better understanding of income and wealth inequality in the United States from 1949 to 2016. The authors showed that SCF data can be a valuable resource for empirical and theoretical research on inequality and household finance. Benhabib et al. (2011) used simulated wealth distributions to study how fiscal policies such as capital income taxes and estate taxes affect the Gini coefficient and, in turn, wealth inequality. While estate taxes can have ambiguous effects on wealth inequality, the Gini coefficient declined consistently as capital income taxes increased. To examine the world distribution of income, Sala-i-Martin (2006) estimated the Gini coefficient along with seven other indexes of income inequality for 138 countries between 1970 and 2000. He showed that during the 1980s and 1990s all indexes decreased, thereby implying a reduction in global inequality. To analyze income inequality and polarization across time or different entities, Duclos et al. (2004) computed the Gini coefficients for 21 countries over two time periods, 1989–1992 and 1994–1997. They showed that although polarization and inequality are correlated, they are empirically distinct. Chesher and Schluter (2002) used the Gini coefficient to show that measurement error in survey-based income and expenditure data can have a significant impact on inequality and poverty measurements. Using the Gini coefficient, Wolff (1992) analyzed the distribution of household wealth in the United States (U.S.) over the period 1922 to 1981. He showed that over 25% of total wealth was owned by the top 1% in each of those years, except 1976 and 1981. He also stressed the role of social-security wealth, which may decrease the Gini coefficient. Danziger and Gottschalk (1987) used the Gini coefficient to examine male earnings inequality in the U.S. in 1939 and 1979. They subsequently evaluated the Wilson hypothesis, which suggested that outmigration of middle-class Black males led to increased inequality of income, resulting in adverse behavioural consequences for low-income Blacks and provoking the development of an underclass. While it was shown that the Gini coefficient did increase for all Black men relative to that of Whites over the 40-year period, the authors suggested that more research was needed to prove that the changes in community composition resulted in attitude changes. In an earlier mathematical note, Lerman and Yitzhaki (1984) demonstrated an exhaustive method of calculating and interpreting the Gini coefficient.

Beyond the traditional applications of measuring income and wealth distributions, several studies extended the Lorenz/Gini characterization to measure inequality in other contexts. Xiong et al. (2019) evaluated public health resilience to ambient air pollutants. Using the Lorenz curve and Gini coefficient, the authors examined the relationship between the risk caused by air pollution and hospital density in Sichuan, China. Such research could provide a basis for optimizing the size and environmental characteristics of new hospital construction and future air pollution control measures. Frees et al. (2013) extended the Lorenz curve and Gini coefficient by ordering insurance risks. The ordering variable was a risk-based score relative to price, known as relativity. The new relativity-based measures coped with adverse selection and quantified potential profit. Similarly, Teng et al. (2011) used the Lorenz curve and Gini coefficient (adjusted to per capita historical cumulative emissions) to construct a carbon Gini coefficient, which measures inequality in the allocation of emission space in the context of climate change. The results indicated that 70% of the world’s total emission space is allocated unequally and that developed countries have occupied an excessive share of it. According to the principle of the United Nations Framework Convention on Climate Change (UNFCCC), developed countries should take the lead in reducing their emissions to redress this unequal situation (United Nations 2023).

As noted earlier, this study looks at a novel characterization of the Lorenz curve and the Gini coefficient. The current study takes the notion of these related and inseparable measures of inequality as it relates to income and wealth distributions and extends this notion to the distribution of population. No previous study has attempted this type of analysis.

Research Method

This analysis required data on the population, land area, and hence population density (i.e., population per square kilometer) of census divisions (CDs; see Footnote 1) for each successive census period from 1991 up to and including 2021 (Statistics Canada 1998, 2007, 2017, 2022a). Statistics Canada fulfilled these data requirements. Accordingly, the degree of inequality in the distribution of population was determined for 1991, 1996, 2001, 2006, 2011, 2016, and 2021, respectively.

The distribution of census divisions by province and territory over time is presented in Table 1. Notwithstanding certain minor exceptions, the number of census divisions by jurisdiction remained relatively stable throughout the duration of this study. For Quebec, the number of census divisions decreased from 99 in 1991 to 98 in 2001 and subsequent census years. Similarly, in British Columbia, this number dropped from 30 in 1991 to 28 in 2001, remaining at 28 in 2006 before rising to 29 in 2011, where it has since remained. The Northwest Territories experienced the largest fluctuation in census divisions, as shown in Table 1 (see Footnote 2).

Table 1 Census divisions by provinces and territories, 1991 – 2021

The nature of Canada’s geography is such that a significant proportion of the aggregate land mass is largely uninhabitable. In essence, there is a vast expanse within the northern territories where population density approximates zero due to inhospitable climatic conditions. Thus, it was determined that the inclusion of the territorial regions might skew the resultant data. Therefore, this study measured (i) the population distribution for all of Canada, including the three thinly populated territories of Northwest Territories, Yukon, and Nunavut, as well as (ii) the population distribution for only the ten Canadian provinces measured separately.

In this study, the degree of inequality in the population distribution for each respective census period was represented graphically using a Lorenz curve. The Lorenz curve was formed by plotting the cumulative population share against the cumulative land area share, with census divisions ordered by increasing population density (Fig. 1).

Fig. 1

Lorenz curve diagram. The margin by which the Lorenz curve deviated from the 45-degree line of equality shows the degree of inequality in the population distribution

The population, density of population, and land area of each respective census division were first determined, and divisions were then ordered with respect to ascending population density. As such, the census division possessing the lowest population density (i.e., the poorest division in terms of population) was ordered first, and the census division comprising the highest population density was ordered last. Then, for each census division, the percentage of population and the percentage of land area were determined to calculate the cumulative population percentage and the cumulative percentage of aggregate land area, respectively. Thereafter, the cumulative population percentage was plotted against the cumulative percentage of land area. In the case of perfect equality, the relationship between the cumulative percentage of national population and the cumulative percentage of land area is proportional, as shown by the 45-degree line of equality (Fig. 2).

Fig. 2

Step-by-step algorithm for construction of the Lorenz curve and calculation of the Gini coefficient
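The ordering and accumulation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `lorenz_points` and the four census divisions in the example are hypothetical, and the numbers are invented rather than taken from Statistics Canada. Shares are expressed as fractions (0 to 1) rather than percentages.

```python
import numpy as np

def lorenz_points(population, land_area):
    """Return (cum_land, cum_pop): cumulative shares with divisions
    ordered by ascending population density. Both start at 0 and end at 1."""
    population = np.asarray(population, dtype=float)
    land_area = np.asarray(land_area, dtype=float)

    # Order divisions from the least to the most densely populated.
    density = population / land_area
    order = np.argsort(density, kind="stable")

    # Cumulative shares, with the origin (0, 0) prepended.
    cum_land = np.concatenate(([0.0], np.cumsum(land_area[order]) / land_area.sum()))
    cum_pop = np.concatenate(([0.0], np.cumsum(population[order]) / population.sum()))
    return cum_land, cum_pop

# Hypothetical divisions (illustrative values only).
population = [1_000, 50_000, 9_000, 940_000]
land_area = [500_000.0, 20_000.0, 300_000.0, 1_000.0]  # square kilometers

cum_land, cum_pop = lorenz_points(population, land_area)
for x, y in zip(cum_land, cum_pop):
    print(f"land share {x:.4f} -> population share {y:.4f}")
```

Plotting `cum_pop` against `cum_land` gives the Lorenz curve; the 45-degree line from (0, 0) to (1, 1) is the line of equality.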

The corresponding Gini coefficient was then derived, thereby quantifying the margin by which the Lorenz curve deviated from the 45-degree line of perfect equality. The Gini coefficient was calculated using Eq. (1):

$$G=\frac{A}{A+B} \tag{1}$$

where A is the area between the Lorenz curve and the line of equality, B is the area beneath the Lorenz curve, and A + B is the area beneath the line of equality, which is equal to the area of a right isosceles triangle. The Gini coefficient is a real number between 0 and 1, where 0 corresponds to a perfectly equal distribution of population and 1 to a totally unequal distribution of population. Area B was determined by iteratively calculating the area of each right trapezoid bounded by successive data points. The sum of these trapezoid areas constituted the total area under the Lorenz curve (Eq. 2):

$$B=\sum\nolimits_{i=1}^{100}\frac{P_{i}+P_{i-1}}{2}\left(L_{i}-L_{i-1}\right) \tag{2}$$

where P is the cumulative percentage of the population, L is the cumulative percentage of land area, and i is an index that denotes data points from 1 to 100, with the curve starting at the origin (P₀ = L₀ = 0).
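Equations (1) and (2) translate directly into code. The sketch below is a minimal illustration under stated assumptions: the function name `gini_from_lorenz` is hypothetical, and the number of data points is taken from the length of the input rather than fixed at 100. The area under the line of equality (A + B) is computed from the end point of the curve, so the function works whether shares are given as fractions or as percentages. When shares are fractions, A + B = 1/2 and Eq. (1) reduces to G = 1 − 2B.

```python
import numpy as np

def gini_from_lorenz(cum_land, cum_pop):
    """Gini coefficient G = A / (A + B) from Lorenz curve points.

    cum_land, cum_pop: cumulative shares starting at 0, in ascending order.
    """
    L = np.asarray(cum_land, dtype=float)
    P = np.asarray(cum_pop, dtype=float)

    # Eq. (2): sum of trapezoid areas under the Lorenz curve.
    B = np.sum((P[1:] + P[:-1]) / 2.0 * np.diff(L))

    # Area of the right triangle under the line of equality (A + B).
    triangle = 0.5 * L[-1] * P[-1]

    # Eq. (1).
    A = triangle - B
    return A / triangle

# Hypothetical three-point Lorenz curve: half the land holds 10% of the people.
print(gini_from_lorenz([0.0, 0.5, 1.0], [0.0, 0.1, 1.0]))  # approximately 0.4

# A curve lying on the line of equality gives zero inequality.
print(gini_from_lorenz([0.0, 0.25, 1.0], [0.0, 0.25, 1.0]))  # 0.0
```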

Results and Discussion

Over successive census periods, the Lorenz curve deviated notably from the line of equality, indicating significant inequality in the population distribution in Canada (Online Supplemental Appendix Figs. 1–7). The Gini coefficient has risen without exception from one census period to the next since 1991 (i.e., the earliest census year included in this study). As shown in Fig. 3, the Gini increased steadily from 0.9392 in 1991 to 0.9418 in 2001, 0.9457 in 2011, and 0.9483 in 2021. While the rate of change was relatively modest from one census year to the next, the upward trajectory of the Gini coefficient is nonetheless unambiguous, suggesting an increasing concentration of population growth within urban agglomerations on the one hand and severe decline in rural jurisdictions on the other.

Fig. 3

Upward trajectory of the Gini coefficient over time. The three territories of Northwest Territories, Yukon, and Nunavut are included. Source: Own calculations using data from Population and Dwelling Count Highlight Tables (Statistics Canada 1998, 2007, 2017, 2022a)

Thus, in 1991, the least densely populated 90% of the aggregate land mass accounted for only 9.1% of the national population (Table 2). Furthermore, a mere 3.4% of the national population inhabited the least densely populated 80% of the country’s land area (Table 3). Conversely, approximately 90.9% of the national population was contained within the most densely populated 10% of the aggregate land area, and roughly 96.6% of the population resided within the most concentrated quintile of the nation’s land mass. Tables 2 and 3 provide further details of these extreme characteristics: an increasing concentration of Canada’s population in the most densely populated areas and an unambiguous thinning of population in the least densely populated areas.

Table 2 Percentage of population in the least densely and most densely populated deciles by land area
Table 3 Percentage of population in the least densely and most densely populated quintile by land area

In 2011, the Gini coefficient increased to a value of 0.9457, and thus the most densely populated decile of the land mass contained approximately 92.3% of the national population, while the most concentrated quintile comprised 96.9% of the total populace. By 2016, the Gini coefficient had risen to 0.9469. Accordingly, 7.5% of the total population were dispersed over the least dense 90% of the aggregate land area, while approximately 92.5% of the population were concentrated within the most densely populated 10% of the total land mass. In terms of quintile distributions, only 3% of the national population inhabited the most sparsely populated 80% of the country’s land area, while 97% of the total population were contained within the most concentrated 20% of the territory of Canada. By 2021, the least densely populated 90% of the aggregate land mass comprised approximately 7.1% of the national population, while 92.9% of the population resided within the most densely populated 10% of the nation’s land area. Furthermore, the most concentrated 20% of the aggregate land area contained approximately 97.1% of the national population, and thus a mere 2.9% of the population inhabited the least densely populated 80% of the Canadian geographical expanse. As well, across all census periods, the three most populated census divisions, Toronto, Montreal, and Greater Vancouver, comprised around 20% of the nation’s total population.
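Decile and quintile shares of this kind can be read off a Lorenz curve by evaluating it at a given cumulative land share. The sketch below is one possible way to do this and is not necessarily the procedure used in this study: it assumes linear interpolation between data points (i.e., population spread evenly within a census division), the function name `population_share_below` is hypothetical, and the curve values are invented for illustration.

```python
import numpy as np

def population_share_below(cum_land, cum_pop, land_fraction):
    """Population share living on the least dense `land_fraction` of land,
    using linear interpolation between Lorenz curve points (an assumption)."""
    return float(np.interp(land_fraction, cum_land, cum_pop))

# Hypothetical Lorenz curve points (fractions, ascending density order).
cum_land = [0.0, 0.5, 0.9, 1.0]
cum_pop = [0.0, 0.01, 0.09, 1.0]

bottom_90 = population_share_below(cum_land, cum_pop, 0.90)
bottom_80 = population_share_below(cum_land, cum_pop, 0.80)

print(f"Least dense 90% of land: {bottom_90:.1%} of population")
print(f"Most dense 10% of land:  {1 - bottom_90:.1%} of population")
print(f"Least dense 80% of land: {bottom_80:.1%} of population")
print(f"Most dense 20% of land:  {1 - bottom_80:.1%} of population")
```

The share in the most densely populated decile or quintile is simply the complement of the share in the remaining land area, which is why the paired figures in the text sum to 100%.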

Exclusion of the territories lowered the Gini coefficient, but not its upward trajectory (Fig. 4). Once more, the Gini coefficient grew consistently over the 30-year period, rising from 0.9040 in 1991 to approximately 0.9220 by 2021.

Fig. 4

Upward trajectory of the Gini coefficient over time. The three territories of Northwest Territories, Yukon, and Nunavut are excluded. Source: Own calculations using data from Population and Dwelling Count Highlight Tables (Statistics Canada 1998, 2007, 2017, 2022a)

Table 4 presents the trajectory of Gini coefficients for each province from 1991 to 2021. Each province displays rising inequality over time with one minor exception, Manitoba, from 1991 to 1996. Quebec, Ontario, and Manitoba exhibit the highest levels of inequality, due largely to these provinces having highly concentrated urban areas at one extreme and large expanses of thinly populated rural areas at the other. Newfoundland, Alberta, and British Columbia present an intermediate case with rising Gini coefficients, but not to the same extent as the first group of provinces. They are followed by Saskatchewan, which is in a category of its own, with the Gini coefficient rising from 0.6706 in 1991 to 0.7175 in 2021. The lowest (but nonetheless increasing) inequality is found in Prince Edward Island, Nova Scotia, and New Brunswick. In particular, Prince Edward Island stands out with its relatively low Gini coefficients, which nonetheless increased sharply from 0.2098 in 1991 to 0.2789 in 2021. The smallest province in both land area and population, yet the most densely populated, Prince Edward Island lacks a census metropolitan area (CMA; see Footnote 3), as the province is largely dominated by relatively small communities. According to the 2021 census, the two largest communities are the capital, Charlottetown, with a population of fewer than 40,000, and Summerside, with just over 16,000 inhabitants.

Table 4 Provincial Gini coefficients, 1991 – 2021

Figure 5 presents the line graph trajectory of Canada’s 10 provincial Gini coefficients for the 1991 – 2021 period. This visual characterization shows that Quebec exhibits the highest inequality over time followed by Ontario, Manitoba, and British Columbia in that order. Alberta and Newfoundland are seemingly tied, with Alberta’s Gini slightly higher for all census years studied except for 2006 and 2011 when Newfoundland’s Gini overtook it marginally. Saskatchewan occupies the middle ground. The three Maritime provinces exhibit lower inequality in comparison to the remaining seven provinces. This graphical presentation reinforces the relatively lower Gini for Nova Scotia, with New Brunswick trailing immediately behind, and Prince Edward Island exhibiting the least inequality in the population distribution among the 10 Canadian provinces over time.

Fig. 5

Line graph trajectory of Canada’s 10 provincial Gini coefficients for the 1991 – 2021 period. Source: Own calculations using data from Population and Dwelling Count Highlight Tables (Statistics Canada 1998, 2007, 2017, 2022a)

An interesting area of variability is evident from the calculation of the percentage change of provincial Gini coefficients, as presented in Table 5. At one extreme, the Gini for Prince Edward Island rose 32.9% between 1991 and 2021, due in large part to the low initial Gini value. The intermediate cases occurred in Nova Scotia and New Brunswick, which registered changes of 14.2% and 15.7%, respectively. At the other extreme, provinces possessing an elevated Gini coefficient in 1991, such as Quebec, Ontario, and Manitoba, did not have much room to move upwards. The Gini for these three provinces grew by a mere 0.8%, 1.6%, and 1.2%, respectively, between 1991 and 2021. Nevertheless, the increase in inequality was unambiguous for each province. Over this period, for Canada as a whole, the Gini rose by 1%. Not surprisingly, if one excludes the thinly populated territories of Northwest Territories, Yukon, and Nunavut, and considers only the ten provinces, the Gini is lower in every year: 0.9040 rather than 0.9392 in 1991, and 0.9220 rather than 0.9483 in 2021.

Table 5 Percentage change of provincial Gini coefficients, 1991 – 2021
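The percentage changes discussed above follow from the start and end values of each Gini coefficient. As a brief arithmetic check, the sketch below applies the standard relative-change formula to the 1991 and 2021 values quoted in the text; the function name `percent_change` is a hypothetical helper, and only coefficients stated in this section are used.

```python
def percent_change(start, end):
    """Relative change from `start` to `end`, in percent."""
    return (end - start) / start * 100.0

# (1991, 2021) Gini coefficients quoted in the text.
gini_1991_2021 = {
    "Prince Edward Island": (0.2098, 0.2789),
    "Saskatchewan": (0.6706, 0.7175),
    "Canada (territories included)": (0.9392, 0.9483),
}

for name, (g_1991, g_2021) in gini_1991_2021.items():
    print(f"{name}: {percent_change(g_1991, g_2021):.1f}%")
```

For Prince Edward Island this gives (0.2789 − 0.2098) / 0.2098 ≈ 32.9%, which shows how a low starting value magnifies the relative change even when the absolute increase is modest.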

Like all other measures of inequality, the Lorenz curve and the Gini coefficient suffer from certain limitations. The Lorenz curve is a smooth and continuous curve derived from disaggregated data that may not necessarily be continuous. This could lead to imperfect measures of inequality (Kristensen 2022), but such imperfections are likely to be relatively minor since this study uses actual population data. Use of high-quality census data published by the government suggests that this study does not suffer from potentially erroneous assumptions and other shortcomings that could affect estimates based on sample data. Thus, the sensitivity of the shape of the Lorenz curve to sample data and sample size is not an issue here. Changes in the middle of the distribution characterized by the Lorenz curve, however, have a relatively larger impact on the Gini coefficient than similar changes at either end of the distribution. Finally, two intersecting Lorenz curves with different slopes drawn from two different distributions could possibly lead to similar or even identical Gini coefficients (Osberg 2017). The impact of these potential limitations is believed to be negligible and unlikely to alter the key findings of this study.
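The last limitation, that two intersecting Lorenz curves can share a Gini coefficient, can be illustrated with a small numerical example. The two curves below are hypothetical and chosen only to make the point; they are not derived from census data. The helper `gini` assumes shares expressed as fractions, so that the area under the line of equality is 1/2 and G = 1 − 2B.

```python
import numpy as np

def gini(cum_land, cum_pop):
    """Gini coefficient for shares given as fractions: G = 1 - 2B."""
    L = np.asarray(cum_land, dtype=float)
    P = np.asarray(cum_pop, dtype=float)
    B = np.sum((P[1:] + P[:-1]) / 2.0 * np.diff(L))  # trapezoid rule
    return 1.0 - 2.0 * B

# Two hypothetical piecewise-linear Lorenz curves with different shapes.
land_a, pop_a = [0.0, 0.5, 1.0], [0.0, 0.1, 1.0]
land_b, pop_b = [0.0, 0.8, 1.0], [0.0, 0.4, 1.0]

print(gini(land_a, pop_a), gini(land_b, pop_b))  # both approximately 0.4

# The curves cross: B lies above A at 50% of land, but below A at 80%.
for x in (0.5, 0.8):
    a = np.interp(x, land_a, pop_a)
    b = np.interp(x, land_b, pop_b)
    print(f"land share {x:.1f}: curve A = {a:.2f}, curve B = {b:.2f}")
```

Both distributions yield the same Gini coefficient even though one concentrates inequality in the sparsest half of the land and the other in the densest fifth, which is why the Lorenz curves themselves are worth inspecting alongside the summary coefficient.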

Conclusions

The findings exhibit significant and growing inequality in the population distribution between successive census periods. Accordingly, the geographic dispersion of the Canadian population has become increasingly skewed, and thus a substantial proportion of the national populace is concentrated within a handful of metropolitan jurisdictions. This has fundamentally altered the economic and social reality of the country.

By 2021, most Canadians (73.7%) were living in one of Canada's major cities, illustrating the ongoing trend of urbanization in Canada, as large urban centers continued to consolidate their pre-eminent stature as the principal destinations for new immigrants. From 2016 to 2019, Canada experienced an unprecedented influx of new immigrants with more than 9 out of 10 settling in Canada's major metropolitan areas. Additionally, the population of downtown cores has increased significantly and continues to expand at a faster rate than previously with a 10.9% growth between 2016 and 2021. This is more than twice the 4.6% rate of the previous census period (2011–2016). Toronto, Vancouver, Montréal, Ottawa, and Edmonton had the five most populous downtown cores, with Toronto’s downtown the largest at 275,931. Vancouver, meanwhile, holds the title of having the most densely populated downtown at 18,837 inhabitants per square kilometer (Statistics Canada 2022b).

This rapid expansion of population is crying out for more infrastructure, more public transportation, and more housing, which are all experiencing significant stress. As a consequence, between 2016 and 2021, the value of homes in both Ontario and British Columbia rose by more than 50% in most of their census subdivisions. Moreover, these two provinces were the least affordable provinces for housing in the country, largely because of the growing number of people competing for scarce rental housing and being forced to pay high rents to live in the city centers of Toronto and Vancouver. Real estate investors continue to thrive in this environment with nearly 40% of the occupied housing in the downtown cores consisting of condominiums in Canada’s major cities with one-half of these condominiums being rented out (Statistics Canada 2022c).

High housing prices are impacting unfavorably on large segments of the population, from young professionals who cannot finance a down payment for their first home to students and tenants living on tight budgets. As rental prices continue to rise, people with limited financial resources are being forced out to more affordable neighborhoods. Investors are often tearing down older structures to make way for expensive new housing and commercial real estate. Over time, prime real estate in the downtown cores and adjoining suburban areas has become increasingly available for high income earners and businesses that cater to the more affluent sections of society to move in. The influx of the relatively wealthy to the more attractive parts of urban areas is contributing to increasing gentrification, leading to a loss of cultural diversity and in the process crippling small business and increasing social tension (Chong 2017; Siddiq 2020).