1 Introduction

The similarity or dissimilarity of a collection of \(K ({>}2)\) distributions, \(f(x), g(x), h(x), \ldots \), of the random variable x (where x could be a scalar or a vector) is of interest in many fields. Wherever the degree of heterogeneity of types is an issue, measures and tests of the degree of similarity–dissimilarity would be of great use, especially in the era of big data. Notions of dissimilarity in many distributions are closely related to Gini’s Transvariation Measure for two discrete distributions [34, 35], which was generalized to many discrete distributions by Dagum [14, 15]. In economics this interest ranges from concerns regarding the similarity–dissimilarity of the outcome distributions of agents in different circumstance classes in the equality of opportunity and social justice literature, to the similarity–dissimilarity of technologies in the empirical growth literature, to the commonality of valuation distributions in the first price auction theory literature. In the economics and sociology literatures such measures have relevance for social classification [10, 12], social mobility [36] and segregation [7, 13, 18, 19, 21, 27, 30, 32, 43]. In the political science literature on polarization [33], conflict, and diversity [22, 23, 24, 51] the subject matter has much to do with increasing or diminishing dissimilarities of the latent subgroup distributions of an overall population distribution [20].

Perception of similarities and differences from particular viewpoints is also of interest. In the population ethics literature [8, 11, 16, 17] versions of the constituency principle (Footnote 1) argue that only the views of a particular population or constituency (essentially those affected by the comparison) are relevant in assessing goodness (or badness) of states. Underlying this principle is the notion that “better-ness” from the point of view of one population may differ from the corresponding view of another, apparently unaffected, population. Discourse in this literature was pursued in the context of successive populations, but similar ideas can be explored in the context of contemporaneous populations or constituencies. For example, it is this doctrine that underlies the Focus Axiom in poverty analysis, wherein only the incomes of the poor constituency should matter in the poverty calculation. As a principle, it is strong in the sense that only the views of one constituency are of account; the views of all other constituencies have zero weight in the calculus. Weaker versions of the constituency principle would see primacy given to the views of the affected population with lower but non-zero weight attached to other constituencies. Extending the poverty measurement analogy, higher orders of the Foster et al. [29] family of poverty measures attach increasing weight to constituencies further below the poverty line. This in turn leads to the idea of intensifying the focus on, or magnifying in some sense, particular aspects of the distribution of interest to those constituencies with primacy.

Here these ideas will be explored and extended to comparisons of continuous distributions, to the idea of relative dissimilarity, and to the idea of comparisons of dissimilarity of particular features of distributions, all of which could potentially be multivariate. In Sect. 2 the concepts will be developed in the context of Gini’s original two-distribution world. Various extensions to the many distribution case are considered in Sect. 3. The comparison techniques will be exemplified in Sect. 4 in a study examining differences by gender in Aboriginal and non-Aboriginal constituency income distributions in Canada in the twenty-first century. Section 5 concludes.

2 Dissimilarity in a two-constituency world

Gini’s Transvariation Measure, “GINITD”, for comparing two discrete distributions \(f_i\) and \(g_i \) defined on I outcomes \(i=1,\ldots ,I\), where \(f_i\) and \(g_i\) are respectively the probabilities of outcome “i” occurring, is given by:

$$\begin{aligned} GINITD=\frac{1}{2}\mathop {\sum }\limits _{i=1}^I \left| {f_i -g_i } \right| \end{aligned}$$

It can be shown that GINITD is bounded by 0 and 1. In addition, when f and g have mutually exclusive support, \(GINITD = 1\) (since \({\sum }_i | {f_i -g_i } |={\sum }_i f_i + {\sum }_i g_i =2)\), and when f and g are identical \(GINITD = 0\). A convenient way of writing GINITD in what follows is:

$$\begin{aligned} GINITD=\frac{1}{2}\mathop {\sum }\limits _{i=1}^I | {f_i -g_i } |=\frac{1}{2}\mathop {\sum }\limits _{i=1}^I ( {\max ( {f_i , g_i } )-\min ( {f_i , g_i } )} )=\frac{1}{2}( {\theta _U -\theta _L } ), \end{aligned}$$

where \(\theta _U ={\sum }_{i=1}^I \max ({f_i , g_i})\) and \(\theta _L ={\sum }_{i=1}^I \min ( {f_i , g_i } )\).

Noting that \(\theta _U = 2-\theta _L \) this can be rewritten as:

$$\begin{aligned} GINITD=1-\theta _L \end{aligned}$$
(1)
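The identity in Eq. (1) can be checked numerically. The following is a minimal Python sketch, assuming two made-up probability vectors on four outcomes; the function names `overlap` and `gini_td` and the example values are illustrative and do not come from the source.

```python
def overlap(f, g):
    """theta_L: sum of the pointwise minima of two probability vectors."""
    return sum(min(fi, gi) for fi, gi in zip(f, g))


def gini_td(f, g):
    """GINITD: half the sum of absolute differences between f and g."""
    return 0.5 * sum(abs(fi - gi) for fi, gi in zip(f, g))


# Hypothetical distributions over I = 4 outcomes (illustrative values only).
f = [0.1, 0.2, 0.3, 0.4]
g = [0.4, 0.3, 0.2, 0.1]

theta_l = overlap(f, g)
theta_u = sum(max(fi, gi) for fi, gi in zip(f, g))  # should equal 2 - theta_l

# The three expressions for GINITD should agree up to rounding error.
print(gini_td(f, g), 1 - theta_l, 0.5 * (theta_u - theta_l))
```

With identical inputs the function returns 0, and with vectors on disjoint support it returns 1, matching the bounds stated above.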

The continuous version of Gini’s Transvariation Measure, GINITC, is half the integral of absolute differences between two PDFs, f(x) and g(x), each defined on the real line, and is similarly related to the overlap measure \(\theta _L ( {{=}\int \min \{ {f(x), g(x)} \}dx} )\):

$$\begin{aligned} GINITC=0.5\mathop {\int }\limits _{-\infty }^\infty \left| {f(x)-g(x)} \right| dx=1-\theta _L \end{aligned}$$
(1a)
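As an illustration of Eq. (1a), the sketch below approximates both sides by a midpoint rule for a pair of known normal densities. The choice of densities, the truncated integration range and the names `normal_pdf` and `gini_tc` are assumptions made for the example only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gini_tc(f, g, lo, hi, n=20000):
    """Midpoint-rule approximations of 0.5 * int |f - g| dx and of theta_L."""
    dx = (hi - lo) / n
    half_abs = 0.0
    theta_l = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * dx
        fx, gx = f(x), g(x)
        half_abs += 0.5 * abs(fx - gx) * dx
        theta_l += min(fx, gx) * dx
    return half_abs, theta_l


# Illustrative choice: f = N(1, 1) and g = N(0, 1), integrated over [-10, 11],
# a range outside which both densities are negligible.
half_abs, theta_l = gini_tc(lambda x: normal_pdf(x, 1.0, 1.0),
                            lambda x: normal_pdf(x, 0.0, 1.0), -10.0, 11.0)
print(half_abs, 1.0 - theta_l)
```

For this particular pair the densities cross once, at x = 0.5, so the measure also has the closed form `math.erf(0.5 / math.sqrt(2))`, which gives a convenient check on the numerical approximation.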

In both the continuous and discrete cases the estimator \(\hat{\theta }\) of the overlap can be shown to be asymptotically normally distributed [2, 6] and, since \(\hat{\theta }\sim _{a } N( {\theta , V} )\), the estimators of both GINITD and GINITC are \(\sim _{a } N( {1-\theta , V} )\), thus facilitating inference. The distribution and properties of the parameter \(\theta \), which is a measure of the degree of overlap or commonality between the distributions, have been provided for the two distribution multivariate case in Anderson et al. [2]. In essence, when kernel estimates of f(x) and g(x) are employed it is shown that:

$$\begin{aligned} \sqrt{n}\big ( {\hat{\theta }_L -\theta _L } \big )-a_n \sim N( {0,v} ) \end{aligned}$$

Thus from the relationship between GINIT and \(\theta _L \) in Eq. (1) above it follows that:

$$\begin{aligned} \sqrt{n}\big ( {\widehat{GINIT} -GINIT} \big )+a_n \sim N( {0,v}) \end{aligned}$$

Here \(a_n \), a bias correction term, and v, the variance, depend on the “contact set”, the set of points where \(f(x)=g(x)\). When this set is empty or of measure 0, the bias term is 0 and v is given by:

$$\begin{aligned} v=h(n)\{ {p_f ( {1-p_f } )+p_g ( {1-p_g } )+( {p_{f, g} -p_f p_g } )} \}, \end{aligned}$$

where \(p_f \) is the estimated probability that \(f(x)-g(x)<-c\), \(p_g\) is the estimated probability that \(f(x)-g(x)>c\), and \(p_{f, g}\) is the estimated probability that \(-c<f(x)-g(x)<c\), for some tuning parameter c. (Note that the last term is omitted if the random variables under f and g are not jointly distributed.)

A convenient way of thinking about this comparison is to note that it can be written as:

$$\begin{aligned} GINITC= & {} 0.5\mathop {\int }\limits _{-\infty }^\infty \left| {f(x)-g(x)} \right| dx=0.5\mathop {\int }\limits _{-\infty }^\infty \left| {\frac{f(x)}{g(x)}-1} \right| g(x)dx\\= & {} 0.5E_{g(x)} \left( {\left| {\frac{f(x)}{g(x)}-1} \right| } \right) \end{aligned}$$

similarly:

$$\begin{aligned} GINITD= & {} 0.5\mathop {\sum }\limits _{i=1}^I \left| {\frac{f_i }{g_i }-1} \right| g_i \nonumber \\= & {} 0.5E_{g(x)} \left( {\left| {\frac{f(x)}{g(x)}-1} \right| } \right) \end{aligned}$$
(2)

GINIT then has the interpretation of half of the first moment or expected value of \(\left| {\frac{f(x)}{g(x)}-1} \right| \) under distribution g(x) (Footnote 2). In essence it is a first moment comparator and the sample average provides an immediate estimator with a method of moments interpretation.
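The method of moments reading can be illustrated with a small simulation. This is a sketch under strong simplifying assumptions: f and g are taken to be known normal densities (in practice they would be replaced by estimates), the sample is drawn from g with a fixed seed, and the names `normal_pdf` and `gini_t_moment` are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gini_t_moment(sample_from_g, f, g):
    """Half the sample mean of |f(x)/g(x) - 1| over draws x ~ g."""
    total = sum(abs(f(x) / g(x) - 1.0) for x in sample_from_g)
    return 0.5 * total / len(sample_from_g)


random.seed(0)  # fixed seed so that the sketch is reproducible
draws = [random.gauss(0.0, 1.0) for _ in range(20000)]  # x ~ g = N(0, 1)

estimate = gini_t_moment(draws,
                         lambda x: normal_pdf(x, 1.0, 1.0),  # f = N(1, 1)
                         lambda x: normal_pdf(x, 0.0, 1.0))  # g = N(0, 1)
exact = math.erf(0.5 / math.sqrt(2.0))  # closed form for this pair of normals
print(estimate, exact)
```

The sample average should settle near the closed-form value as the number of draws grows; the ratio f/g is heavy tailed here, so convergence is slower than for a bounded integrand.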

2.1 Importance weighted comparisons in a two distribution world

Often interest will focus on particular aspects of the differences between f(x) and g(x) in various regions of x, for example differences at the extremes of a reference distribution or differences at the center of a reference distribution. Alternatively, interest may focus on the differences between f(x) and g(x) in high frequency or low frequency regions of x according to the reference or target distribution. For convenience, suppose the reference distribution h(x) has positive support over the whole domain of x. Then, for some chosen value \(\lambda \), importance weighted versions of GINITD and GINITC may be written as:

$$\begin{aligned} GINITCI= & {} \mathop {\int }\limits _{-\infty }^\infty \left| {f(x)-g(x)} \right| \frac{h(x)^{\lambda }}{E_{h(x)} \big ( {h(x)^{\lambda }} \big )}dx \\= & {} E_{g(x)} \left( {\left| {\frac{f(x)}{g(x)}-1} \right| \frac{h(x)^{\lambda }}{E_{h(x)} \big ( {h(x)^{\lambda }} \big )}} \right) \\= & {} \frac{1}{E_{h(x)} \left( {h(x)^{\lambda }} \right) }E_{g(x)} \left( {\left| {\frac{f(x)}{g(x)}-1} \right| h(x)^{\lambda }} \right) \\ GINITDI= & {} \mathop {\sum }\limits _{i=1}^I \left| {f_i -g_i} \right| \frac{h_i^\lambda }{E_{h(x)} \left( {h_i^\lambda } \right) }\\= & {} E_{g(x)} \left| {\frac{f_i }{g_i }-1} \right| \frac{h_i^\lambda }{E_{h(x)} \big ( {h_i^\lambda } \big )}\\= & {} \frac{1}{E_{h(x)} \left( {h_i^\lambda } \right) }E_{g(x)} \left( {\left| {\frac{f_i }{g_i }-1} \right| h_i^\lambda } \right) \end{aligned}$$

Whereas the Gini Transvariation Measure can be seen as cumulating the absolute difference between the functions over all possible outcomes, the importance weighted versions can be seen as cumulating the “importance” weighted absolute difference between the distributions (Footnote 3). Generally h(x) will be related to g(x) or f(x) and, when h(x) is a regular distribution or a cumulative distribution, the weighting function \(h^{*}(x)=\frac{h(x)^{\lambda }}{E_{h(x)} \left( {h(x)^{\lambda }} \right) }\), when applied to the regular distribution f(x), can be seen to transform it to another distribution \(f^{*}(x) = h^{*}(x)f(x)\) which retains all of the properties of a regular probability distribution (\(f^{*}(x)\ge 0\), \(0\le F^{*}(x)\le 1\)) with aspects of f(x) amplified or muted according to the choice of h(x) and \(\lambda \). Thus the importance weighted version may be written as in Eq. (2) in terms of transformed distributions, retaining all of the properties of that measure (boundedness between 0 and 1, association with the overlap measure including its asymptotic normality, etc.). As a result, the Importance Weighted Gini Transvariation Measure may be written as:

$$\begin{aligned} GINITCI= & {} 0.5\mathop {\int }\limits _{-\infty }^\infty \left| {f^{*}(x)-g^{*}(x)} \right| dx =0.5\mathop {\int }\limits _{-\infty }^\infty \left| {\frac{f^{*}(x)}{g^{*}(x)}-1} \right| g^{*}(x)dx\\= & {} 0.5E_{g^{*}(x)} \left( {\left| {\frac{f^{*}(x)}{g^{*}(x)}-1} \right| } \right) \end{aligned}$$

similarly:

$$\begin{aligned} GINITDI=0.5\mathop {\sum }\limits _{i=1}^I \left| {\frac{f_i^*}{g_i^*}-1} \right| g_i^*=0.5E_{g^{*}(x)} \left( {\left| {\frac{f^{*}(x)}{g^{*}(x)}-1} \right| } \right) \end{aligned}$$
(2a)
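A minimal discrete sketch of the importance weighted measure in the transformed-distribution form of Eq. (2a) follows. It assumes every \(h_i > 0\) and takes \(E_{h(x)}(h^{\lambda })\) to be \(\sum _i h_i \cdot h_i^{\lambda }\); the names `importance_weights` and `gini_tdi` and the example vectors are illustrative assumptions.

```python
def importance_weights(h, lam):
    """h*_i = h_i**lam / E_h(h**lam), with the expectation taken under h."""
    norm = sum(hi * hi ** lam for hi in h)
    return [hi ** lam / norm for hi in h]


def gini_tdi(f, g, h, lam):
    """0.5 * sum |f*_i - g*_i| with f*_i = h*_i f_i and g*_i = h*_i g_i."""
    w = importance_weights(h, lam)
    return 0.5 * sum(abs(wi * fi - wi * gi) for wi, fi, gi in zip(w, f, g))


# Hypothetical distributions over four outcomes; the target h is set to g.
f = [0.1, 0.2, 0.3, 0.4]
g = [0.4, 0.3, 0.2, 0.1]

unweighted = gini_tdi(f, g, g, 0.0)    # lam = 0 gives unit weights
weighted = gini_tdi(f, g, g, -0.5)     # lam = -0.5 magnifies low-g outcomes
print(unweighted, weighted)
```

Setting `lam` to 0 makes every weight equal to one, so the function reduces to the unweighted GINITD; a negative `lam` raises the contribution of outcomes that are rare under the target.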

In the following \(\lambda = -0.5\) is considered (Footnote 4), a choice inspired by, and in the spirit of, entropic measures of variation such as those in [31, 46, 52, 56]. For example, by letting \(h(x) = g(x)\), the continuous version of Theil’s Entropic Measure (TE) can be related to the continuous version of Pearson’s Chi Squared Dissimilarity Measure (PCHI) as follows (Footnote 5):

$$\begin{aligned} TE= & {} \int f(x)\ln \left( {\frac{f(x)}{g(x)}} \right) dx=\int f(x)\ln \left( {\frac{f(x)+g(x)-g(x)}{g(x)}} \right) dx\\= & {} \int f(x)\ln \left( {1+\frac{f(x)-g(x)}{g(x)}} \right) dx\approx \int f(x)\left( {\frac{f(x)-g(x)}{g(x)}} \right) dx\\= & {} \int \left( {\frac{\left( {f(x)-g(x)} \right) ^{2}}{g(x)}} \right) dx=PCHI \end{aligned}$$
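The approximation step uses the first-order expansion \(\ln (1+u)\approx u\). The final equality then relies on both densities integrating to one; writing \(f(x)=( {f(x)-g(x)} )+g(x)\) makes this step explicit (a worked step added for clarity, using only the definitions above):

$$\begin{aligned} \int f(x)\left( {\frac{f(x)-g(x)}{g(x)}} \right) dx= & {} \int \frac{\left( {f(x)-g(x)} \right) ^{2}}{g(x)}dx+\int \left( {f(x)-g(x)} \right) dx\\= & {} PCHI+(1-1)=PCHI \end{aligned}$$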

Note that the argument under the integral sign in GINITCI can be seen to be the square root of the argument under the integral sign of PCHI. Intuition for Pearson-type measures may be gleaned from noting that:

$$\begin{aligned} PCHI= & {} \int \left( {\frac{\left( {f(x)-g(x)} \right) ^{2}}{g(x)}} \right) dx=\int \left( {\frac{\left( {f(x)-g(x)} \right) ^{2}}{\left( {g(x)} \right) ^{2}}} \right) g(x)dx\nonumber \\= & {} \int \left( {\frac{\left( {f(x)} \right) ^{2}+\left( {g(x)} \right) ^{2}-2f(x)g(x)}{\left( {g(x)} \right) ^{2}}} \right) g(x)dx\nonumber \\= & {} \int \left( {\frac{\left( {f(x)} \right) ^{2}}{\left( {g(x)} \right) ^{2}}+1-2\frac{f(x)}{g(x)}} \right) g(x)dx\nonumber \\= & {} \int \left( {\frac{f(x)}{g(x)}-1} \right) ^{2}g(x)dx\nonumber \\= & {} E_{g(x)} \left[ {\left( {\frac{f(x)}{g(x)}-1} \right) ^{2}} \right] \end{aligned}$$
(3)

PCHI in turn has the interpretation of the expected value under g(x) of \(\big | {\frac{f(x)}{g(x)}-1} \big |^{2}\), yielding an interpretation of the statistic as a second moment measure of \(\big | {\frac{f(x)}{g(x)}-1} \big |\) with the sample average of \(\big | {\frac{f(x)}{g(x)}-1} \big |^{2}\) providing an estimator of the dissimilarity parameter with a method of moments interpretation (Footnote 6).
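The equivalence of the two forms in Eq. (3) can be seen in a discrete analogue. The sketch below is illustrative only: it replaces the integrals by sums over a common finite support, assumes every \(g_i > 0\), and uses hypothetical names `pchi` and `pchi_as_moment`.

```python
def pchi(f, g):
    """Discrete analogue of PCHI: sum of (f_i - g_i)^2 / g_i."""
    return sum((fi - gi) ** 2 / gi for fi, gi in zip(f, g))


def pchi_as_moment(f, g):
    """The same quantity written as E_g[(f/g - 1)^2]."""
    return sum(gi * (fi / gi - 1.0) ** 2 for fi, gi in zip(f, g))


# Hypothetical distributions over three outcomes.
f = [0.6, 0.3, 0.1]
g = [0.2, 0.3, 0.5]

print(pchi(f, g), pchi_as_moment(f, g))  # the two forms should agree
print(pchi(g, f))                        # reference distribution swapped
```

Swapping the roles of f and g changes the value, which reflects that the statistic is measured relative to whichever distribution sits in the denominator.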

Importance weighting can be seen as weighting the differences between f(x) and g(x) by some monotonic function of the “target” function so that for \(\lambda < 0\), a given difference from a small target plays a bigger role in the calculation than the same order of difference in a correspondingly larger target (for \(\lambda > 0\) the reverse is true). In essence the differences are weighted with respect to some reference distribution (in this case h) and the statistic can thus be considered a “relative” dissimilarity measure where differences are measured relative to the target measure h(x).

Choice of the target distribution is a matter of some consequence, and some special cases are obvious. When the target measure h(x) is set equal to either g(x) or f(x), PCHI has the form of a classic Pearson goodness-of-fit test of distributional form or independence where the target distribution is the null hypothesized distribution. Suppose g and f are empirical distributions such that f(x) first-order stochastically dominates g(x); then measures with a target g(x) are dissimilarity measures relative to the poor distribution and measures with a target f(x) are dissimilarity measures relative to the non-poor distribution. When h(x) is the sample size weighted sum of f(x) and g(x), the statistic has the form of a goodness-of-fit test for homogeneity of distributions; in essence f and g are being compared to some “average” distribution. Generally, when the distributions f and g are not identical, the dissimilarity of the distributions f(x) and g(x) in terms of f(x) will not be the same as the dissimilarity of those distributions in terms of g(x) (Footnote 7).

3 Indices of dissimilarity in a many distribution world

These ideas can be extended to the comparison of many distributions in either continuous or discrete paradigms. (Intuition suggests that the above statistics will work for parametric, semiparametric and non-parametric representations of the distributions; see Footnote 8.) Consider first continuous scalar x with continuous support on the interval \([ {a, b} ]\) and discrete scalar x with support on the set of non-negative integers, with a set \(U_K \) of K distributions \(f(x),g(x), h(x), \ldots \) under consideration:

  • For x continuous:

    $$\begin{aligned} DIS_C =\frac{\int _a^b \max \left( {f(x),g(x), h(x), \ldots } \right) dx-\int _a^b \min \left( {f(x),g(x),h(x),\ldots } \right) dx}{K} \end{aligned}$$
  • For x discrete:

    $$\begin{aligned} DIS_D =\frac{\sum _{x=0}^\infty \max \left( {f(x),g(x), h(x), \ldots } \right) -\sum _{x=0}^\infty \min \left( {f(x),g(x), h(x), \ldots } \right) }{K} \end{aligned}$$

Note that \(0 \le DIS \le 1\). In particular, for complete dissimilarity (perfect segmentation in the terminology of Ref. [61]) \(DIS = 1\), and for complete similarity (all distributions identical) \(DIS = 0\). This index may be construed as the distributional analogue of a range statistic for a variable measuring as it does the cumulative distances between the upper and lower extremes of the collection of distributions over the values of x. The asymptotic distribution of Eq. (3) and bootstrapping techniques for standard errors are developed in [5].
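The discrete index and its bounds can be sketched in a few lines of Python. The function name `dis` and the example vectors are assumptions for illustration; the distributions are taken to share a common finite support.

```python
def dis(distributions):
    """(theta_U - theta_L) / K for K probability vectors on a common support."""
    k = len(distributions)
    theta_u = sum(max(col) for col in zip(*distributions))  # upper envelope
    theta_l = sum(min(col) for col in zip(*distributions))  # lower envelope
    return (theta_u - theta_l) / k


# Three hypothetical distributions on three outcomes.
identical = [[0.2, 0.5, 0.3]] * 3
segmented = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
partial = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.25, 0.5, 0.25]]

print(dis(identical), dis(segmented), dis(partial))
```

For K = 2 the function coincides with GINITD, since \(( {\theta _U -\theta _L } )/2\) is exactly the two-distribution measure.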

Conceptually these indices depend upon two components \(\theta _U \) and \(\theta _L \) given by:

$$\begin{aligned} \theta _U= & {} \int _a^b \max \left( {f(x),g(x), h(x), \ldots } \right) dx \nonumber \\&\left\{ =\sum \limits _{x=0}^\infty \max \left( {f(x),g(x), h(x), \ldots } \right) \hbox { for }x\hbox { discrete}\right\} \nonumber \\ \theta _L= & {} {\int }_a^b \min \left( {f(x),g(x),h(x),\ldots } \right) dx \nonumber \\&\left\{ =\sum \limits _{x=0}^\infty \min \left( {f(x),g(x), h(x), \ldots } \right) \hbox { for }x\hbox { discrete}\right\} \end{aligned}$$
(4)

When dissimilarity is complete (perfect segmentation of all distributions) \(\theta _U =K\) and \(\theta _L =0\). When similarity is complete (perfect overlap of all distributions) \(\theta _U = 1\) and \(\theta _L = 1\). When U is the set of outcome distributions for K inheritance classes, \(M = 1-DIS\) can be interpreted as a mobility index with \(M = 1\) representing perfect mobility (outcome distributions identical for all inheritance classes) and \(M = 0\) representing complete immobility (inheritance class outcome distributions have no common points of support). When U is the set of outcome distributions for K social classes whose alienation is ordinally measured on x, DIS can be considered a polarization measure. If dissimilarity from a base or target distribution (for convenience, g(x)) is of interest then, following Tukey [57] for interpretational purposes, one could contemplate:

For x continuous:

$$\begin{aligned}&DIS_C \nonumber \\&\quad =\frac{\int _a^b \max \left( {f(x),g(x), h(x), \ldots } \right) g(x)^{-0.5}dx\!-\!\int _a^b \min \left( {f(x),g(x),h(x),\ldots } \right) g(x)^{-0.5}dx}{K} \end{aligned}$$

For x discrete:

$$\begin{aligned}&DIS_D\nonumber \\&\quad =\frac{\sum _{x=0}^\infty \max \left( {f(x),g(x), h(x), \ldots } \right) g(x)^{-0.5}\!-\!\sum _{x=0}^\infty \min \left( {f(x),g(x), h(x), \ldots } \right) g(x)^{-0.5}}{K} \nonumber \\\end{aligned}$$
(5)

So, for example, g(x) could be a weighted average of all the other distributions, i.e. a mixture where the weights reflect the proportions of the population under the corresponding distribution (commonly used in the segregation literature; see for example [43] and references therein). Alternatively, g(x) could be the “poorest” or “richest” distribution in a collection of distributions or the “median” distribution. In effect “inequality in distribution”, or degrees of difference from a distribution g(x) (Footnote 9), is being measured, where g(x) would be the basis of the “importance” weighting function \(g^{*}(x)\) as in \(h^{*}(x)\) above.

These may be compared with the “many distribution” analogue to the Pearson goodness-of-fit test [1] which for K distributions \(f_k (x) \, k=1,\ldots ,K\) may be written in the continuous case as:

$$\begin{aligned} PT=\frac{1}{K-1}\mathop {\sum }_{k=1}^K \int \frac{\left( {f_k (x)-g(x)} \right) ^{2}}{g(x)}dx, \end{aligned}$$

where g(x), the target distribution, is usually of the form:

$$\begin{aligned} g(x)={\sum \limits _{k=1}^{K}} w_k f_k (x)\quad \hbox { where }\;\; \sum \limits _{k=1}^{K} w_k =1 \end{aligned}$$

Following Eq. (3) above this multivariate Pearson Test has a similar Method of Moments interpretation (Footnote 10):

$$\begin{aligned} \frac{1}{K-1}\mathop {\sum }_{k=1}^K E_{g(x)} \left[ {\left( {\frac{f_k (x)}{g(x)}-1} \right) ^{2}} \right] . \end{aligned}$$
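A discrete analogue of the many-distribution Pearson statistic with a mixture target can be sketched as follows. The sums replace the integrals, all mixture weights are assumed strictly positive, and the names `mixture` and `pearson_pt` are hypothetical.

```python
def mixture(dists, weights):
    """Target g_i = sum_k w_k f_{k,i}; the weights are assumed to sum to one."""
    size = len(dists[0])
    return [sum(w * d[i] for w, d in zip(weights, dists)) for i in range(size)]


def pearson_pt(dists, weights):
    """Discrete analogue of PT: (1/(K-1)) * sum_k sum_i (f_ki - g_i)^2 / g_i."""
    g = mixture(dists, weights)
    k = len(dists)
    total = 0.0
    for d in dists:
        # With strictly positive weights, g_i = 0 only where every f_ki = 0,
        # so such cells contribute nothing and are skipped.
        total += sum((fi - gi) ** 2 / gi for fi, gi in zip(d, g) if gi > 0)
    return total / (k - 1)


# Two hypothetical distributions with equal mixture weights.
dists = [[0.5, 0.5, 0.0], [0.0, 0.5, 0.5]]
weights = [0.5, 0.5]
print(mixture(dists, weights), pearson_pt(dists, weights))
```

When all component distributions are identical the mixture equals each of them and the statistic is zero.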

3.1 Specific dominance based indices

Stochastic dominance tests have been employed in a wide variety of situations, particularly in the empirical wellbeing and finance literatures, where there is a specific theoretical rationale for comparing distributions at a particular order of dominance in order to see how different they are in that particular aspect. In empirical welfare analysis different orders of dominance correspond to specific types of wellbeing measure. In finance different orders of dominance correspond to specific types of risk class.

The condition for the dominance of G() by F() at order J is given by:

$$\begin{aligned} F_J (x)\le G_J (x) \ \forall \ x \in [ {a,b} ]\quad \hbox { and }\quad F_J (x)<G_J (x)\quad \hbox { for some }\;\; x \in [ {a,b} ] \end{aligned}$$

where

$$\begin{aligned} F_i (x)= \int _a^x F_{i-1} (z)dz\quad \hbox { and }\quad F_0 (x)=f(x)\quad (\hbox {similarly for}\; G_i (x)) \end{aligned}$$
(6)

Essentially the condition requires that the functions \(F_J (x)\) and \(G_J (x)\) do not cross so that one, the dominating distribution, is “unambiguously” below the other (Footnote 11).

A variety of tests have been proposed for these dominance conditions at various orders. It may readily be seen that successive orders of dominance attach increasing weight/importance to lower values of x. If the concern were with differences at high values of x, one would work with the condition for the dominance of G() by F() at order J given by (Footnote 12):

$$\begin{aligned} F_J (x)\le G_J (x) \ \forall \ x \in [ {a,b} ]\quad \hbox { and }\quad F_J (x)<G_J (x)\quad \hbox { for some }\quad x \in [ {a,b} ], \end{aligned}$$

where

$$\begin{aligned} F_i (x)= {\int }_x^b F_{i-1} (z)dz\quad \hbox { and }\quad F_0 (x)=f(x)\quad (\hbox {similarly for}\; G_i (x)) \end{aligned}$$
(7)

Essentially the condition requires that the functions \(F_J (x)\) and \(G_J (x)\) do not cross so that one, the dominating distribution, is “unambiguously” below the other.

For the indices of similarity for the collection of K distributions \(f(x),g(x), h(x), \ldots \) of the variable x at the Jth order, we construct \(U_J (x) \!=\! {\max }_x ( {F_J (x),G_J (x), H_J (x),\ldots } )\) and \(L_J (x) = {\min }_x ( {F_J (x),G_J (x), H_J (x),\ldots } )\). The maximum range of variation of the collection of distributions at the Jth order is given by the Jth order distance measure:

$$\begin{aligned} DISS(J)=\mathop {\int }\limits _a^b ( {U_J (x)-L_J (x)} )dx \end{aligned}$$

To provide a complete ordering of a collection of distributions at a particular order one could also contemplate “proximity to the boundary” indices (see [4, 5]) for a particular distribution say f(x) of the form:

$$\begin{aligned} PROXU_J ({F_J (x)})= & {} 1-\frac{\int _a^b ( {U_J (x)-F_J (x)})dx}{DISS(J)}\\ PROXL_J ({F_J (x)})= & {} 1-\frac{\int _a^b ( {F_J (x)-L_J (x)} )dx}{DISS(J)}, \end{aligned}$$

where \(PROXU_J ({F_J (x)}) \, \{ {PROXL_J ({F_J (x)})} \}\) measures proximity to the upper {lower} boundary.
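The Jth order envelopes, DISS(J) and the proximity indices can be approximated on a grid. The sketch below is an illustration under simplifying assumptions: densities are given as values on equally spaced cells of width `dx` over \([a,b]\), the repeated integrals of Eq. (6) are replaced by running Riemann sums, and DISS(J) is assumed to be non-zero. All function names are hypothetical.

```python
def integrate_up(values, dx):
    """Running Riemann sum approximating F_i(x) = int_a^x F_{i-1}(z) dz."""
    out, acc = [], 0.0
    for v in values:
        acc += v * dx
        out.append(acc)
    return out


def order_j(density, dx, j):
    """Apply the running integral j times, starting from F_0 = f."""
    cur = list(density)
    for _ in range(j):
        cur = integrate_up(cur, dx)
    return cur


def dominance_indices(densities, dx, j):
    """Return DISS(J) and a (PROXU, PROXL) pair for each distribution."""
    curves = [order_j(d, dx, j) for d in densities]
    upper = [max(col) for col in zip(*curves)]   # U_J(x)
    lower = [min(col) for col in zip(*curves)]   # L_J(x)
    diss = sum(u - l for u, l in zip(upper, lower)) * dx
    prox = []
    for c in curves:
        prox_u = 1.0 - sum(u - ci for u, ci in zip(upper, c)) * dx / diss
        prox_l = 1.0 - sum(ci - l for ci, l in zip(c, lower)) * dx / diss
        prox.append((prox_u, prox_l))
    return diss, prox


# Hypothetical example on [0, 1]: three uniform densities, compared at order 1.
n = 1000
dx = 1.0 / n
f1 = [2.0 if k < n // 2 else 0.0 for k in range(n)]  # uniform on [0, 0.5]
f2 = [0.0 if k < n // 2 else 2.0 for k in range(n)]  # uniform on [0.5, 1]
f3 = [1.0] * n                                        # uniform on [0, 1]
diss, prox = dominance_indices([f1, f2, f3], dx, 1)
print(diss, prox)
```

Because \(( {U_J -F_J } )+( {F_J -L_J } )=U_J -L_J \), the two proximity indices of any distribution sum to one, so a distribution lying on the upper boundary has PROXU equal to 1 and PROXL equal to 0.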

3.2 Distribution separation measures

The degree of separation between the classes reflecting increased within class homogeneity or better identified group classification in a polarization sense is of interest. To fix ideas, suppose class distributions are ordered by some location measure, then one minus the average overlap of contiguous classes is a useful measure, as is the overlap of distributions at the extremes. These measures are closely associated with the measures of polarization in Ref. [20] (Footnote 13), and the segmentation terminology of Ref. [61] in the following sense.

When the kth pair of contiguous lower and upper class distributions, \(f_L (x)\) and \(f_U (x)\) respectively, do not overlap (i.e. \(\theta _{LUk} =\int \min ( {f_L (x),f_U (x)} )dx=0)\), they are perfectly segmented in the terminology of Ref. [61]. When they overlap perfectly (i.e. \(\theta _{LUk} =\int \min ( {f_L (x),f_U (x)} )dx=1)\) separate distributions cannot be identified. Thus \(1- \theta _{LUk} \) provides a measure of the polarization or distance between two contiguous groups and \(1-\theta =K^{-1}\mathop {\sum }\nolimits _k ( {1- \theta _{LUk} } )\), the average over all pairs of contiguous groups, provides an index of polarity in the collection of groups. When all pairs are perfectly segmented the average of these overlaps will be 0 and \(1-average\, \theta \) will be 1. When all classes overlap perfectly and cannot be separately identified it will be 0. In contrast the extent of overlap of the extreme distributions yields a lower bound to the average overlap, reflects the extent to which the extremes have polarized, and provides a measure of potential for polarization in the collection of classes.
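The separation measures can be sketched for discrete distributions that are already ordered by location. Two assumptions are made loudly here: the average is taken over the number of contiguous pairs actually formed from the ordered list, and the function names `overlap`, `polarization_index` and `extreme_overlap` are hypothetical.

```python
def overlap(f, g):
    """theta for a pair: sum of pointwise minima of two probability vectors."""
    return sum(min(fi, gi) for fi, gi in zip(f, g))


def polarization_index(ordered):
    """One minus the average overlap of contiguous pairs of distributions."""
    pairs = list(zip(ordered[:-1], ordered[1:]))
    return sum(1.0 - overlap(lo, hi) for lo, hi in pairs) / len(pairs)


def extreme_overlap(ordered):
    """Overlap between the lowest and the highest distribution."""
    return overlap(ordered[0], ordered[-1])


# Three hypothetical classes on four outcomes, ordered from low to high.
classes = [[0.6, 0.3, 0.1, 0.0],
           [0.1, 0.5, 0.3, 0.1],
           [0.0, 0.1, 0.3, 0.6]]

print(polarization_index(classes), extreme_overlap(classes))
```

In this example each contiguous pair overlaps by 0.5 while the extremes overlap by only 0.2, so the extreme overlap sits below the average contiguous overlap.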

3.3 “Leave one out” measures and tests of exceptionality

Where exceptionality (the distinct difference of one distribution from a collection of others) is of interest, a “leave one out” test, i.e. one based on the change in the dissimilarity or overlap statistic when a distribution is omitted, is likely to have particular power. If concern is about differences in the lower regions of the distributions, higher orders of dominance comparators are the relevant instrument since they implicitly weight lower components of the indices more heavily. If concern is about differences in the higher regions of the distributions, higher orders of dominance comparators for the counter-cumulative densities are the relevant instrument since they implicitly weight higher components of the indices more heavily.

In general, when some of the groups are segmented or non-overlapping in distribution with all other groups they may be thought to be exceptional. Leaving them out of the calculus, but not reducing K, would reduce DIS by at most the proportion of the number of segmented groups. If the segmented groups were first order dominated by and first order dominated other distributions in the collection, that is they were internal to the collection, they would be unexceptional and their omission would have no effect on DIS (if K was not reduced). In general, for exceptionality measures, interest focuses on \(\frac{DIS-DIS_O K_1 }{K}\) where \(DIS_O \) relates to the many group dissimilarity measure with \(K-K_1 \) classes omitted. If the omitted classes were perfectly segmented, the biggest contribution to the transvariation of K distributions that the \(K-K_1 \) constituencies could make would be \(\frac{K-K_1 }{K}\), thus \(DIF_O =\frac{K\left( {\frac{DIS-DIS_O K_1 }{K}} \right) }{K-K_1 }\) provides a [ 0, 1 ] bounded measure of the degree of segmentation of a group of \(K-K_1 \) subgroups in a K constituency comparison.

4 An application: comparing the incomes of Aboriginal and non-Aboriginal constituencies in Canada in the twenty-first century

Canada’s constitution recognizes three Aboriginal peoples: North American Indians, Inuit, and Metis (Footnote 14). It is well known that Aboriginal peoples in Canada have, on average, lower incomes than non-Aboriginal people. For example, in comparison with non-minority native-born workers with similar characteristics, Aboriginal women faced income and earnings gaps of 10–20% between 1995 and 2005, while Aboriginal men faced gaps of 20–50% [47]. Less commonly known, however, is that there are also significant disparities between Aboriginal identity groups within Canada (NAEDB [44]; NAEDB [45]). The Right Honourable Paul Martin, a former Prime Minister, argued that Canada faced “a moral imperative” to close the income gaps that exist between Aboriginal and non-Aboriginal people: “the descendants of the people who first occupied this land deserve an equal chance to work for and to enjoy the benefits of our collective prosperity” [59, p. vi]. This prompted a variety of government policies directed at improving the lot of Aboriginal peoples (Footnote 15).

To check the response to government policies, income gaps have been tracked and documented over time. The previous literature tended to emphasize average incomes; here the focus is on differences in the distributions of incomes over the 8 constituencies of Inuit, Metis, North American Indian and non-Aboriginal of each gender. In the spirit of Paul Martin’s “equal chance” declaration, the issue is very much one of equality of opportunity: income distributions should be independent of Aboriginal status, i.e. income distributions conditional on Aboriginal status should be identical. An index of the degree of variation in these conditional income distributions provides a measure of the extent to which equality of opportunity prevails in the sense of similar outcomes across Aboriginal and non-Aboriginal communities by gender. Usually this is examined in the context of a \(k \times k\) transition matrix (see for example [26, 50, 53,54,55, 58]). Here the outcome state is continuous, i.e. incomes have not been categorized with the consequent loss of information. Furthermore, an adaptation of Tukey’s “Rootgram” approach facilitates extra focus on particular aspects of the distribution, magnifying differences within regions of interest in the comparison distributions. By setting \(\lambda \) at −0.5 and setting h(x) to be equal to one of the probability density or cumulative densities of interest, differences in low frequency regions in the former case and differences in the lower income strata in the latter case are magnified. To exemplify this, the extent to which Canadian Aboriginal and non-Aboriginal groups have similar or dissimilar income distributions, and how those differences have progressed over time, is studied.

Table 1 Summary statistics, total income, Aboriginal and non-Aboriginal, 2000, 2005 and 2010

4.1 Data

Data from the 2001 and 2006 Census and the 2011 National Household Survey are employed to construct various weighted and unweighted transvariation estimates for the three observation years (Footnote 16). Statistics Canada’s 2001 and 2006 censuses provide detailed data on the social and demographic characteristics of the Canadian population. Each census is composed of two parts, the long form and the short form; with fewer questions, the short-form census requires less time to complete than the long-form census. In 2001 and 2006, the short-form census was delivered to 100% of households, while the long-form census was distributed to one-fifth of Canadian households. Both surveys were mandatory. The long-form census had a response rate of about 94% in 2006 [48]. In 2011, the short-form census was delivered as usual, but the 2011 National Household Survey (NHS) replaced the long-form census, the main differences being that it was voluntary and distributed to 33% of households. The response rate, unsurprisingly, dropped precipitously to 69% [48].

4.2 Results

The relative positions of Aboriginal and non-Aboriginal constituencies in the income distribution have long been an issue, especially with respect to their relative progress. Eight identifiable subgroups were examined for the years 2000, 2005 and 2010: the groups were male or female members of Inuit, Metis, North American Indian and non-Aboriginal societies (Footnote 17). Nominal incomes for individuals over the age of 15 were employed in the analysis. Note that there was some top and bottom coding: values greater than the 99th percentile in each geographical region and gender were top coded and some negative values were down coded to a low threshold.

Table 1 presents basic statistics for income by identity and gender. The data show that male incomes are higher than female incomes for all four groups across all 3 years. Non-Aboriginal people face the largest gap between male and female incomes (Table 2). Metis have the next highest male–female relative income measure, followed by North American Indians and, finally, the Inuit. One surprising result is the consistency of the relative male–female income gap across this decade for all four groups. There were slight declines for non-Aboriginal people and North American Indians, but the Metis and the Inuit had the same level of relative male–female income in all 3 years.

Table 2 Relative male–female total income, Aboriginal and non-Aboriginal, 2000, 2005, and 2010
Table 3 Coefficients of variation, total income, Aboriginal and non-Aboriginal, 2000, 2005, 2010

Coefficients of variation (Table 3) show that in 2000, non-Aboriginal people of both genders had lower coefficients of variation than all three Aboriginal groups. In 2005, the reverse holds true. The data for males in 2010 suggest that non-Aboriginal people continued to have higher coefficients of variation, but the gap was declining. For females, the picture is less clear, as the highest coefficient of variation was among the Inuit, while the lowest coefficient of variation was among the Metis.

The data also show that coefficients of variation in 2005 are higher than coefficients of variation in 2000, while coefficients of variation in 2010 are lower for non-Aboriginal people, North American Indians, and female Metis, but higher for male Metis and the Inuit. The log income distributions for the 8 constituencies together with upper and lower boundaries for the years 2000, 2005 and 2010 respectively are illustrated in Figs. 1, 2 and 3.

Fig. 1 Distribution of log income, Aboriginal and Non-Aboriginal by Gender in 2000

Fig. 2 Distribution of log income, Aboriginal and Non-Aboriginal by Gender in 2005

Fig. 3 Distribution of log income, Aboriginal and Non-Aboriginal by Gender in 2010

Table 4 reports the Unweighted Transvariation Measures for the 8 individual constituent distributions (combined), the 4 female constituent distributions, the 4 male constituent distributions and the 4 gender-integrated constituent distributions. In addition, it reports the corresponding Transvariation Measures for the Aboriginal constituencies. First, from the full sample results, note that there is substantially more distributional variability among males than among females in all 3 years, and that there is a general downward trend in distributional variability over time. Such trends are less discernible in the Aboriginal group results. Over the whole period, the variation across Aboriginal categories in their distributions is much lower than for the full collection, and male distributional differences tightened somewhat in 2010. Considering the 8 distributions as a collection, however, there has been a considerable tightening, signalling a closing of the gaps between the extreme distributions. Overall, this may be seen as an improvement in equality of opportunity as indexed by \(1 - DIS\).
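As an illustration of the kind of calculation underlying Table 4, the sketch below computes the area between the pointwise upper and lower boundaries of a collection of curves evaluated on a common grid. It assumes the unweighted measure is this area scaled by \(1/K\); the formal definition given earlier in the paper governs, and the normal densities, grid and function names are hypothetical stand-ins for the estimated log income distributions.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density, used here only as a stand-in for an estimated density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def envelope_gap(curves, dx):
    """Riemann-sum area between the pointwise upper and lower boundaries."""
    return sum(max(vals) - min(vals) for vals in zip(*curves)) * dx

def unweighted_transvariation(curves, dx):
    """Envelope gap scaled by 1/K (an assumed normalisation)."""
    return envelope_gap(curves, dx) / len(curves)

dx = 0.01
grid = [-10.0 + i * dx for i in range(4001)]  # common grid from -10 to 30
f = [normal_pdf(x, 0.0, 1.0) for x in grid]
g = [normal_pdf(x, 20.0, 1.0) for x in grid]

identical = unweighted_transvariation([f, f], dx)  # boundaries coincide
disjoint = unweighted_transvariation([f, g], dx)   # essentially no overlap
```

Under this assumed scaling, two identical distributions give a value of zero and two non-overlapping distributions give a value close to one, so smaller values correspond to greater similarity.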

Turning to the exceptionality results, Table 5 reports the exceptionality measures for each of the constituencies. In the context of the 8 constituencies, the non-Aboriginal groups are the most exceptional; that is to say, their distribution contributes most to the transvariation measure, generating the biggest drop in transvariation when it is left out. The impact is greater when the male, female and male–female combined distributions are considered, but its omission is ubiquitously more important than the omission of any other group.

With regard to trends, the overriding trend in the 8 constituency analysis is a reduction in the exceptionality of the non-Aboriginal group and the increasing exceptionality of the North American Indian group. These results are a consequence of the non-Aboriginal group being the primary constituency in the lower boundary (i.e. the highest income group) and North American Indians being the primary constituency in the upper boundary (i.e. the lowest income group). Thus it can be deduced that improvements in equality of opportunity have been achieved largely as a result of reductions in the exceptionality of the non-Aboriginal constituencies.
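The leave-one-out logic behind the exceptionality results can be sketched as follows. The sketch assumes exceptionality is the drop in the area between the upper and lower boundaries when one distribution is omitted; the unscaled area is used so that the comparison is not affected by the change in the number of distributions, which is a simplifying assumption rather than the paper's stated normalisation. The three normal densities are hypothetical.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density, used here only as a stand-in for an estimated density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def envelope_gap(curves, dx):
    """Riemann-sum area between the pointwise upper and lower boundaries."""
    return sum(max(vals) - min(vals) for vals in zip(*curves)) * dx

def leave_one_out_drops(curves, dx):
    """Drop in the envelope gap when each curve is omitted in turn."""
    full = envelope_gap(curves, dx)
    drops = []
    for k in range(len(curves)):
        rest = curves[:k] + curves[k + 1:]
        drops.append(full - envelope_gap(rest, dx))
    return drops

dx = 0.01
grid = [-8.0 + i * dx for i in range(1901)]  # common grid from -8 to 11
means = [0.0, 0.5, 3.0]                      # the third group sits apart
curves = [[normal_pdf(x, m, 1.0) for x in grid] for m in means]

drops = leave_one_out_drops(curves, dx)
most_exceptional = max(range(len(drops)), key=drops.__getitem__)
```

In this hypothetical collection the third distribution, which lies furthest from the others, produces the largest drop when omitted, mirroring the role the text attributes to the groups that define the boundaries.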

Table 4 Unweighted transvariation measures, Aboriginal and Non-Aboriginal by Gender in 2000, 2005, 2010
Table 5 Exceptionality of Aboriginal and non-Aboriginal people, change in transvariation measures, total income, 2000, 2005, 2010
Table 6 Weighted transvariation measures, total income, Aboriginal and non-Aboriginal, 2000, 2005, 2010

4.3 Importance weighted transvariation measures

Importance weighting presents an opportunity to assess transvariation with respect to the aspects of distributions that are relevant to particular groups. First, importance weighting with respect to the full societal distribution is explored, then with respect to the Aboriginal communities, and then with respect to the non-Aboriginal communities. Two types of weighting are entertained: weighting with respect to the probability density function and weighting with respect to the cumulative distribution function. The former will emphasize variations in low density regions of the focus distribution which, given the foregoing figures, will be in the low and high income regions of the focus distribution. The latter emphasizes the low income regions of the focus distribution. The results are reported in Table 6.
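The qualitative effect of the two weighting schemes can be illustrated with a minimal sketch. The weight functions used here, one minus the focus density relative to its peak and one minus the focus cumulative distribution function, are hypothetical forms chosen only to reproduce the behaviour described above (emphasis on low density regions and on low income regions respectively); they are not taken from the paper's definitions, and the normal curves are stand-ins.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density, used here only as a stand-in for an estimated density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def weighted_envelope_gap(curves, weights, dx):
    """Weighted area between the pointwise upper and lower boundaries."""
    return sum(w * (max(v) - min(v))
               for w, v in zip(weights, zip(*curves))) * dx

dx = 0.01
grid = [-10.0 + i * dx for i in range(2001)]  # common grid from -10 to 10

# Hypothetical focus distribution: a standard normal.
focus_pdf = [normal_pdf(x, 0.0, 1.0) for x in grid]
peak = max(focus_pdf)
pdf_weights = [1.0 - p / peak for p in focus_pdf]              # high in both tails
cdf_weights = [1.0 - normal_cdf(x, 0.0, 1.0) for x in grid]    # high at low incomes

# Two hypothetical pairs that differ only in where their disagreement lies.
low_pair = [[normal_pdf(x, -3.0, 1.0) for x in grid],
            [normal_pdf(x, -2.0, 1.0) for x in grid]]
high_pair = [[normal_pdf(x, 2.0, 1.0) for x in grid],
             [normal_pdf(x, 3.0, 1.0) for x in grid]]

low_cdf = weighted_envelope_gap(low_pair, cdf_weights, dx)
high_cdf = weighted_envelope_gap(high_pair, cdf_weights, dx)
low_pdf = weighted_envelope_gap(low_pair, pdf_weights, dx)
high_pdf = weighted_envelope_gap(high_pair, pdf_weights, dx)
```

With these assumed weights, the density-based scheme treats differences in the lower and upper tails symmetrically, whereas the scheme based on the cumulative distribution function registers the lower tail differences far more strongly.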

The full societal PDF focus reveals increased variation in the tails of the distribution in the middle of the decade, which becomes attenuated at the end of the decade. The gender-specific analysis reveals males having the same trend, whereas females have the opposite trend, which nets out when the genders are aggregated. The cumulative distribution function results are similar, suggesting that this is in essence a lower tail issue. Turning to the non-Aboriginal and Aboriginal focus, an increasing trend in distributional variation in the tails is observed in virtually all categories of analysis. This suggests that, while there is increasing equality of opportunity overall, it is occurring mainly in the middle of the distribution, and there is increasing variation in incomes (i.e. increasing inequality of opportunity) in the tails of the distribution. That is to say, while average Metis, Inuit, North American Indian and non-Aboriginal people are becoming increasingly similar in their outcomes, the low and high income earners are not.

In essence, while some progress in equality of opportunity has been achieved over the decade, it has not been the experience of low-income earners in the Aboriginal communities.

5 Conclusions

A methodology for analyzing the degree of variation in a collection of distributions has been presented, which facilitates exploration of particular aspects of the differences between many distributions. The techniques were employed to study differences between the income distributions of males and females drawn from Metis, Inuit, North American Indian and non-Aboriginal constituencies in Canada in the first decade of the twenty-first century. It was found that, while the distributions were becoming more alike (which may be interpreted as increasing equality of opportunity), this was occurring in the middle of the income distributions, and that at their extremes the distributions were diverging. This suggests that such improvements in equality of opportunity were not for all: high and, in particular, low Aboriginal income earners were experiencing diminishing equality of opportunity over the period. An exceptionality analysis revealed that these results were largely driven by non-Aboriginal income earners, who contributed the most to the variation in income distributions, followed by the North American Indian constituency, the former at the high end of the collection of distributions and the latter at the bottom.