Abstract
This paper presents a new family of parametric Lorenz curves based on the arctan function and adding a parameter \(-\infty <\alpha <\infty \), \(\alpha \ne 0\) to an initial Lorenz curve \(L_0(p),\,0\le p\le 1\). The particular case obtained when \(\alpha \) tends to zero is reduced to the initial Lorenz curve \(L_0(p)\). The corresponding distribution functions are shown. Some inequality measures are calculated, and a method to compute the Gini index based on the use of the inverse of the Lorenz curve is proposed. Finally, an application to two well-known data sets is presented and a good fit is obtained.
1 Introduction
This paper introduces a parametric family of Lorenz curves obtained by a general method, based on adding a parameter \(-\infty <\alpha <\infty \), \(\alpha \ne 0\) to an initial Lorenz curve \(L_0(p),\,0\le p\le 1\), using the arctan function. The particular case obtained when \(\alpha \) tends to zero is reduced to the initial Lorenz curve \(L_0(p)\).
The development of new functional forms of Lorenz curves has been an attractive area of research in recent decades; see, for example, Kakwani (1980), Aggarwal and Singh (1984), Gupta (1984), Ortega et al. (1991), Basmann et al. (1990), Chotikapanich (1993), Ogwang and Rao (1996), Sarabia et al. (1999, 2010). For a recent review of Lorenz curves and income distributions, see Chotikapanich (2008). These methods also provide new functional forms of Leimkuhler curves, which are interesting in terms of informetrics and in particular regarding concentration aspects in this field (see Burrell 1992, 2005; Sarabia and Sarabia 2008; Sarabia et al. 2010, among others).
The densities and distribution functions corresponding to the new Lorenz curves and the corresponding Gini and Pietra inequality indices are shown in closed forms for some particular cases. A method based on the use of the inverse of the initial Lorenz curve is given, which facilitates the computation of the Gini index with the family proposed here.
In this study, we use two data sets (1977 and 1990) from the US Current Population Survey (CPS), considered in Ryu and Slottje (1996), and compare the results with those of the initial Lorenz curves examined.
The structure of this paper is as follows. In Sect. 2, we describe the new family of \(\arctan \) Lorenz curves and the corresponding Leimkuhler curves. Some particular cases obtained by starting with an initial Lorenz curve \(L_0(p)\) are shown. In Sect. 3, the Gini and Pietra indices are obtained, together with the population functions for some cases. In Sect. 4, we compare the performance of the proposed Lorenz curves with that of the initial ones by fitting them to the two data sets, and finally, in Sect. 5, our main conclusions are presented.
2 The new family of Lorenz curves
This section begins with the definition of the Lorenz curve provided by Gastwirth (1971) in accordance with the original proposal by Pietra (1915). Thus:
Definition 1
Given a distribution function F(x) with support in the subset of the positive real numbers and with finite expectation \(\mu \), we define a Lorenz curve as
\[ L(p)=\frac{1}{\mu }\int _0^p F^{-1}(x)\,\mathrm{d}x,\quad 0\le p\le 1, \qquad (1) \]
where \(F^{-1}(x)=\sup \left\{ y:F(y)\le x \right\} \).
A characterization of the Lorenz curve, which is well known in the literature, is given by the following result.
Theorem 1
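Assume that L(p) is defined and continuous in the interval [0, 1] with second derivative \(L^{\prime \prime }(p)\). The function L(p) is a Lorenz curve if and only if
\[ L(0)=0,\quad L(1)=1,\quad L^{\prime }(0^+)\ge 0,\quad L^{\prime \prime }(p)\ge 0\ \text{ for } p\in (0,1). \qquad (2) \]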
The main result of this paper is expressed in the following theorem.
Theorem 2
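Let \(L_0(p)\) be a Lorenz curve, \(-\infty <\alpha <\infty \), \(\alpha \ne 0\), a real parameter and consider the transformation
\[ L_{\alpha }(p)=1-\frac{\arctan \left( \alpha (1-L_0(p))\right) }{\arctan \alpha },\quad 0\le p\le 1. \qquad (3) \]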
Then, \(L_{\alpha }(p)\) is also a Lorenz curve.
Proof
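Simple algebra provides that \(L_{\alpha }(0)=0,\;L_{\alpha }(1)=1\),
\[ L_{\alpha }^{\prime }(p)=\frac{\alpha }{\arctan \alpha }\,\frac{L_0^{\prime }(p)}{1+\alpha ^2(1-L_0(p))^2}\ge 0,\qquad L_{\alpha }^{\prime \prime }(p)=\frac{\alpha }{\arctan \alpha }\,\frac{L_0^{\prime \prime }(p)\left[ 1+\alpha ^2(1-L_0(p))^2\right] +2\alpha ^2(1-L_0(p))\left[ L_0^{\prime }(p)\right] ^2}{\left[ 1+\alpha ^2(1-L_0(p))^2\right] ^2}\ge 0, \]
and \(L_{\alpha }(p)\le p\), where the signs follow because \(\alpha /\arctan \alpha >0\) for every \(\alpha \ne 0\). Then, if \(L_0(p)\) is a genuine Lorenz curve, expression (3) satisfies the convexity and slope constraints that ensure it always lies in the lower triangle of the unit square, and therefore, \(L_{\alpha }(p)\) represents a genuine Lorenz curve. \(\square \)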
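Using the well-known result that establishes that
\[ \arctan x-\arctan y=\arctan \left( \frac{x-y}{1+xy}\right) ,\quad xy>-1, \]
(3) can be rewritten in a more compact form as
\[ L_{\alpha }(p)=\frac{1}{\arctan \alpha }\arctan \left( \frac{\alpha L_0(p)}{1+\alpha ^2(1-L_0(p))}\right) . \qquad (4) \]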
By taking in (3) or alternatively in (4) the limit when the parameter \(\alpha \) tends to zero and applying L’Hospital’s rule, it is straightforward to derive that the initial Lorenz curve \(L_0(p)\) is obtained as a special case, i.e., \(L_{\alpha }(p)\rightarrow L_0(p)\) when \(\alpha \rightarrow 0\). Thus, the methodology proposed here can be considered as a mechanism for adding a parameter to an initial Lorenz curve and therefore a means of obtaining a more flexible Lorenz curve.
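The construction and its limiting behaviour can be checked numerically. The following is a minimal sketch, assuming that transformation (3) has the form \(L_{\alpha }(p)=1-\arctan (\alpha (1-L_0(p)))/\arctan \alpha \), which is consistent with the boundary values, the limit and the symmetry stated in the text; the function names `arctan_lorenz` and `compact_form` are illustrative and do not come from the paper.

```python
import math
from typing import Callable

Curve = Callable[[float], float]

def arctan_lorenz(l0: Curve, alpha: float) -> Curve:
    """Build L_alpha from an initial Lorenz curve l0 (assumed form of (3))."""
    if alpha == 0:
        raise ValueError("alpha must be nonzero; the limit alpha -> 0 is l0 itself")
    denom = math.atan(alpha)

    def curve(p: float) -> float:
        return 1.0 - math.atan(alpha * (1.0 - l0(p))) / denom

    return curve

def compact_form(l0: Curve, alpha: float) -> Curve:
    """Compact form (4), obtained from the arctan difference identity."""
    denom = math.atan(alpha)

    def curve(p: float) -> float:
        q = l0(p)
        return math.atan(alpha * q / (1.0 + alpha ** 2 * (1.0 - q))) / denom

    return curve

def egalitarian(p: float) -> float:
    """Initial Lorenz curve L0(p) = p."""
    return p

grid = [i / 20 for i in range(21)]
L2 = arctan_lorenz(egalitarian, 2.0)
L2c = compact_form(egalitarian, 2.0)
Lsmall = arctan_lorenz(egalitarian, 1e-4)

# The two algebraic forms should agree, and a small alpha should recover L0.
max_gap_forms = max(abs(L2(p) - L2c(p)) for p in grid)
max_gap_limit = max(abs(Lsmall(p) - egalitarian(p)) for p in grid)
```

Here `max_gap_forms` compares the two algebraic forms on a grid, and `max_gap_limit` illustrates that a small \(\alpha \) returns a curve numerically close to \(L_0(p)\).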
Other ways to write \(L_{\alpha }(p)\) given in (4) can be obtained by using the following representation of the \(\arctan \) function (see Castellanos 1988):
Here \(_2F_1(a,b;c,z)\) represents the hypergeometric function which has the integral representation
and where \(\varGamma (\cdot )\) is the Euler gamma function.
Approximations to the arctan function can be obtained using second- and third-order polynomials and simple rational functions (see Rajan et al. 2006 for details), and it is thus obtained that \(\arctan \left( (1+x)/(1-x)\right) \approx \pi (x+1)/4\). Applying this to (3) and after some algebra, we have
Observe that the right-hand side in (7) is a genuine Lorenz curve and coincides with expression (27) in Sarabia et al. (2010). Additionally, the Aggarwal and Singh Lorenz curve (see Aggarwal and Singh 1984; Arnold 1986) is obtained from (7) when \(L_0(p)=p\). The mechanism proposed here is more general than the one proposed in Sarabia et al. (2010).
Expression (7) can also be obtained by considering the ordered sequence of Lorenz curves given by
where n is an integer. It is possible to build a new family of Lorenz curves beginning from (8), but now assuming that the powers \(\{1, 2, \dots , n,\dots \}\) are not fixed, and are distributed according to a convenient discrete random variable with probability mass function \(P_j = \Pr (X = j),\ j = 1, 2, \dots \). In the particular case that \(P_j=\frac{1}{1+\alpha }\left( \frac{\alpha }{1+\alpha }\right) ^{j-1}\), \(\alpha >0\), i.e., the geometric distribution, the resulting family of Lorenz curves is (7).
It is known that the Lorenz curve determines the distribution of X up to a scale factor transformation, since \(F^{-1}(x)=\mu L^{\prime }(x)\). Moreover, the relation
\[ K(p)=1-L(1-p) \qquad (9) \]
determines the relationship between the Lorenz and the Leimkuhler curves (see Sarabia and Sarabia 2008 and Sarabia et al. 2010, among others). This curve plays an important role in informetrics (see, for instance, Burrell 1992, 2005). Therefore, from (3) and (9), we can also define a family of \(\arctan \) Leimkuhler curves starting from an initial Lorenz curve \(L_0(p)\), given by
\[ K_{\alpha }(p)=\frac{\arctan \left( \alpha (1-L_0(1-p))\right) }{\arctan \alpha },\quad 0\le p\le 1. \]
2.1 Lorenz ordering
Lorenz ordering is an important aspect in the analysis of income and wealth distributions. If we define \(\mathcal {L}\) to be the class of all nonnegative random variables with positive finite expectation, the Lorenz partial order \(\le _L\) on the class \(\mathcal {L}\) is defined by
\[ X\le _L Y\iff L_X(p)\ge L_Y(p)\quad \text{for all } p\in [0,1]. \]
If \(X\le _L Y\), then X exhibits less inequality than Y in the Lorenz sense. In the next result, we show that family (3) is ordered with respect to parameter \(\alpha \).
Proposition 1
The Lorenz curve \(L_{\alpha }(p)\) is ordered with respect to \(\alpha \), i.e., if \(|\alpha _1|\le |\alpha _2|\), \(-\infty <\alpha _1,\alpha _2<\infty \), \(\alpha _1,\alpha _2\ne 0\), then \(L_{|\alpha _1|}(p)\ge L_{|\alpha _2|}(p)\), for \(0\le p\le 1\).
Proof
Computing the derivative of the logarithm of (3) with respect to \(\alpha \) shows that the sign of \(\mathrm{d}L_{\alpha }(p)/\mathrm{d}\alpha \) depends on the sign of
Now, using the following inequalities
it is simple to see that \(\varPhi _{\alpha }(p)<0\).
Hence, the result. \(\square \)
The following result states that equality holds, i.e., X exhibits the same inequality as Y, when \(\alpha _1=-\alpha _2\).
Proposition 2
It is verified that \(L_{\alpha }(p)=L_{-\alpha }(p)\), for all \(-\infty <\alpha <\infty \), \(\alpha \ne 0\) and \(0\le p\le 1\).
Proof
Self-evident. \(\square \)
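Propositions 1 and 2 can be illustrated on a grid. The sketch below assumes the form \(L_{\alpha }(p)=1-\arctan (\alpha (1-L_0(p)))/\arctan \alpha \) for (3) and uses the Aggarwal and Singh curve with an arbitrary \(\theta =0.5\) as the initial curve; the helper names and the chosen values of \(\alpha \) are illustrative only. A grid check of this kind is not a proof, but it is a quick sanity test of the ordering and of the symmetry in \(\alpha \).

```python
import math

def arctan_lorenz(l0, alpha):
    """L_alpha built from an initial Lorenz curve l0 (assumed form of (3))."""
    denom = math.atan(alpha)
    return lambda p: 1.0 - math.atan(alpha * (1.0 - l0(p))) / denom

def aggarwal_singh(theta):
    """Initial curve L0(p) = p / (1 + theta * (1 - p)), theta > 0."""
    return lambda p: p / (1.0 + theta * (1.0 - p))

grid = [i / 50 for i in range(51)]
l0 = aggarwal_singh(0.5)
alphas = [0.5, 1.0, 2.0, 5.0]          # increasing |alpha|
curves = [arctan_lorenz(l0, a) for a in alphas]

# Proposition 1: a larger |alpha| gives a pointwise lower curve.
ordered = all(
    curves[i](p) >= curves[i + 1](p) - 1e-12
    for i in range(len(curves) - 1)
    for p in grid
)

# Proposition 2: alpha and -alpha give the same curve.
symmetric = all(
    abs(arctan_lorenz(l0, a)(p) - arctan_lorenz(l0, -a)(p)) < 1e-12
    for a in alphas
    for p in grid
)
```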
2.2 New functional forms of Lorenz curves
In order to derive new functional forms of Lorenz curves, we now consider the following initial Lorenz curves: egalitarian, Aggarwal and Singh Lorenz curve and Pareto Lorenz curve.
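The \(\arctan \) egalitarian Lorenz curve is obtained in (4), by replacing the initial Lorenz curve with \(L_0(p)=p\). Thus, it is given by
\[ L_{\alpha }(p)=\frac{1}{\arctan \alpha }\arctan \left( \frac{\alpha p}{1+\alpha ^2(1-p)}\right) . \qquad (10) \]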
The \(\arctan \) Aggarwal and Singh Lorenz curve is obtained in a similar way, replacing the initial Lorenz curve (see Aggarwal and Singh 1984; Arnold 1986) with \(L_0(p)=p/(1+\theta (1-p)),\quad \theta >0\), and therefore we have
\[ L_{\alpha }(p)=\frac{1}{\arctan \alpha }\arctan \left( \frac{\alpha p}{1+\theta (1-p)+\alpha ^2(1+\theta )(1-p)}\right) , \qquad (11) \]
where \(\theta >0,\, -\infty <\alpha <\infty ,\, \alpha \ne 0\).
Consider now the Pareto Lorenz curve
from which we obtain the \(\arctan \) Pareto Lorenz curve
Finally, by taking as the initial one the Chotikapanich Lorenz curve given by \(L_0(p)=(\exp (\theta p)-1)/(\exp (\theta )-1),\;\theta >0\), we obtain the \(\arctan \) Chotikapanich Lorenz curve
Of course, other \(\arctan \) Lorenz curves can be obtained by replacing \(L_0(p)\) in (4) with other initial Lorenz curves, such as the Gupta or generalized Pareto Lorenz curves. We chose the above initial Lorenz curves because, as discussed in the next section, closed-form expressions can be obtained for some inequality measures and population functions.
3 Inequality measures and population functions
The corresponding Gini and Pietra indices can be computed straightforwardly when the egalitarian and the Aggarwal and Singh initial Lorenz curves are chosen as \(L_0(p)\).
3.1 Gini and Yitzhaki indices
The Gini coefficient (also known as the Lorenz concentration ratio) is a measure (degree of concentration) of the inequality of a variable in a distribution of its elements, on a scale from 0 to 1. If \(|\alpha |<1\), \(\alpha \ne 0\), and using the following representation of the \(\arctan \) function
then the Gini index, which is defined as
can be written as
When \(|\alpha |>1\), more algebra is required to obtain a closed form for the Gini index. In this case, and when the inverse of the initial Lorenz curve can be obtained simply, the Gini index is derived from the following result.
Proposition 3
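The Gini index for the Lorenz curve in (3) is given by
\[ G=\frac{2}{\arctan \alpha }\int _0^{\arctan \alpha }L_0^{-1}\left( 1-\frac{\tan y}{\alpha }\right) \mathrm{d}y-1, \qquad (15) \]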
for \(-\infty <\alpha <\infty \), \(\alpha \ne 0\). Here, \(\tan \) is the circular tangent function and \(L_0^{-1}(\cdot )\) is the inverse of the initial Lorenz curve.
Proof
By computing the inverse function of the Lorenz curve in (3) and using a result given by Anderson (1970), we have
Now, by computing the inverse of the Lorenz curve \(L_{\alpha }(p)\), we obtain the result after some simple algebra. \(\square \)
Expression (15) facilitates calculation of the Gini index, instead of using expression (14), especially when the inverse of the initial Lorenz curve can be computed straightforwardly.
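The inverse-based route can be compared with direct integration of the Lorenz curve. The sketch below assumes that (3) reads \(L_{\alpha }(p)=1-\arctan (\alpha (1-L_0(p)))/\arctan \alpha \), so that inverting it and substituting \(y=(1-q)\arctan \alpha \) gives \(G=\frac{2}{\arctan \alpha }\int _0^{\arctan \alpha }L_0^{-1}(1-\tan y/\alpha )\,\mathrm{d}y-1\). The Simpson rule, the function names and the parameter values are illustrative choices and are not taken from the paper.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def gini_direct(l0, alpha):
    """G = 1 - 2 * integral of L_alpha over [0, 1]."""
    curve = lambda p: 1.0 - math.atan(alpha * (1.0 - l0(p))) / math.atan(alpha)
    return 1.0 - 2.0 * simpson(curve, 0.0, 1.0)

def gini_inverse(l0_inv, alpha):
    """G = 2 * integral of the inverse curve - 1, after y = (1 - q) * arctan(alpha)."""
    upper = math.atan(alpha)
    integrand = lambda y: l0_inv(1.0 - math.tan(y) / alpha)
    return 2.0 * simpson(integrand, 0.0, upper) / upper - 1.0

theta, alpha = 0.5, 2.0

# Aggarwal and Singh initial curve and its inverse.
l0_as = lambda p: p / (1.0 + theta * (1.0 - p))
l0_as_inv = lambda u: u * (1.0 + theta) / (1.0 + theta * u)

g_direct = gini_direct(l0_as, alpha)
g_inverse = gini_inverse(l0_as_inv, alpha)

# Egalitarian initial curve: the integral can be done by hand under the assumed form.
g_egal_numeric = gini_inverse(lambda u: u, alpha)
g_egal_closed = 1.0 - math.log(1.0 + alpha ** 2) / (alpha * math.atan(alpha))
```

Both routes should agree up to quadrature error. The inverse-based one is convenient whenever \(L_0^{-1}\) is available in closed form, as for the Aggarwal and Singh curve used here.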
For example, if we assume that the initial Lorenz curve is the egalitarian Lorenz curve then, by using (15), the Gini index is given by
\[ G=1-\frac{\log (1+\alpha ^2)}{\alpha \arctan \alpha }. \]
This result can also be obtained by performing integration by parts, taking into account that
An important generalization of the Gini index was proposed by Yitzhaki (1983), who suggested the generalized Gini index, which is defined as
where \(\nu >1\) and L(p) is the Lorenz curve. Of course, if \(\nu =2\), we obtain the Gini index. When \(L_0(p)=p\), after some algebra, we obtain that the Yitzhaki index is given (see “Appendix”) by
In the case of the Aggarwal and Singh initial Lorenz curve, using (15), the Gini index is given by
Then, the Gini index (see “Appendix”) is expressed as
Finally, assume the classical Pareto Lorenz curve as the initial Lorenz curve, and again using (15), the Gini index is given by
The above integral is developed in the “Appendix,” and the Gini index is found to be
Using numerical integration techniques, Gini and Yitzhaki indices can also be calculated when other Lorenz curves are assumed as \(L_0(p)\).
3.2 Pietra index
An interesting but less well-known index of inequality is the Pietra index, given by the proportion of total income that would need to be reallocated across the population to achieve perfect equality in income. This proportion is given by
\[ P=\max _{0\le p\le 1}\left\{ p-L(p)\right\} \]
and corresponds to the maximal vertical deviation between the Lorenz curve and the egalitarian line (Pietra 1915; Frosini 2012 calls this same index Pietra–Ricci index, owing to the extensive study made by Ricci (1916) on the same subject). Frosini (2005) also provides a simple graphical representation of this index.
Differentiating \(p-L_{\alpha }(p)\) and using (3), we find that the Pietra index is attained for a value of p satisfying the equation
In particular, when \(L_0(p)=p\), the maximum is attained when
Then, the Pietra index is given, in this case, by
When the initial Lorenz curve considered is the Aggarwal and Singh Lorenz curve, the maximum is attained when
and the Pietra index is then
Finally, for the Chotikapanich Lorenz curve, the Pietra index is
where \(p_0\) is derived from
Numerical computation can be used to obtain the Pietra index in other cases, when the initial Lorenz curve assumed is other than the egalitarian and Aggarwal and Singh Lorenz curves.
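A minimal numerical sketch of this computation follows. It assumes the form \(L_{\alpha }(p)=1-\arctan (\alpha (1-L_0(p)))/\arctan \alpha \) for (3) and evaluates the Pietra index as the largest gap \(p-L_{\alpha }(p)\) on a fine grid. For \(L_0(p)=p\), solving \(L_{\alpha }^{\prime }(p)=1\) under this assumed form gives \(p_0=1-s/|\alpha |\) with \(s=\sqrt{\alpha /\arctan \alpha -1}\), which is used as a cross-check. The function names and the grid size are illustrative.

```python
import math

def arctan_lorenz(l0, alpha):
    """L_alpha built from an initial Lorenz curve l0 (assumed form of (3))."""
    denom = math.atan(alpha)
    return lambda p: 1.0 - math.atan(alpha * (1.0 - l0(p))) / denom

def pietra_numeric(curve, n=100_000):
    """Largest vertical gap between the egalitarian line and the curve on a grid."""
    return max(i / n - curve(i / n) for i in range(n + 1))

def pietra_egalitarian(alpha):
    """Closed form for L0(p) = p, derived here from the assumed form of (3)."""
    s = math.sqrt(alpha / math.atan(alpha) - 1.0)
    a = abs(alpha)
    return math.atan(s) / math.atan(a) - s / a

alpha = 2.0
p_numeric = pietra_numeric(arctan_lorenz(lambda p: p, alpha))
p_closed = pietra_egalitarian(alpha)

# Any other initial curve can be handled by the grid search alone,
# for example the Chotikapanich curve with an arbitrary theta = 1.5.
theta = 1.5
chotikapanich = lambda p: (math.exp(theta * p) - 1.0) / (math.exp(theta) - 1.0)
p_chot = pietra_numeric(arctan_lorenz(chotikapanich, alpha))
```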
3.3 Population functions
In some particular cases, closed-form expressions can be obtained for the distribution functions. For example, if we assume that \(L_0(p)=p\) we have, if \(\alpha <0\)
and
if \(\alpha >0\), where \(\kappa _1(x;\mu ,\alpha ) =\sqrt{\frac{\mu \alpha }{x\arctan \alpha }-1}\) and \(\kappa _2(\mu ,\alpha ) =\frac{\mu \alpha }{(1+\alpha ^2)\arctan \alpha }\). The corresponding probability density functions are
and
for \(\alpha >0\) and \(\alpha <0\), respectively.
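For the egalitarian case, the population function can be sketched from the relation \(F^{-1}(p)=\mu L^{\prime }(p)\) and the quantities \(\kappa _1\) and \(\kappa _2\) defined above. The code below assumes that (3) reads \(L_{\alpha }(p)=1-\arctan (\alpha (1-p))/\arctan \alpha \) when \(L_0(p)=p\), and it writes the distribution function with \(|\alpha |\) so that a single expression covers both signs of \(\alpha \); this is a simplification made for the sketch, not the paper's presentation. The function names and the values \(\mu =3\), \(\alpha =\pm 2\) are illustrative.

```python
import math

def kappa1(x, mu, alpha):
    """kappa_1(x; mu, alpha) from the text, guarded against rounding below zero."""
    return math.sqrt(max(0.0, mu * alpha / (x * math.atan(alpha)) - 1.0))

def kappa2(mu, alpha):
    """kappa_2(mu, alpha) from the text: the lower end of the support."""
    return mu * alpha / ((1.0 + alpha ** 2) * math.atan(alpha))

def quantile(p, mu, alpha):
    """F^{-1}(p) = mu * L_alpha'(p) for L0(p) = p (assumed form of (3))."""
    return mu * alpha / (math.atan(alpha) * (1.0 + alpha ** 2 * (1.0 - p) ** 2))

def cdf(x, mu, alpha):
    """Distribution function obtained by inverting the quantile function."""
    lower = kappa2(mu, alpha)
    upper = mu * alpha / math.atan(alpha)
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return 1.0 - kappa1(x, mu, alpha) / abs(alpha)

mu, alpha = 3.0, 2.0
n = 20_000
# Midpoint rule for the mean: the integral of the quantile function over [0, 1].
mean_check = sum(quantile((i + 0.5) / n, mu, alpha) for i in range(n)) / n
round_trip = [cdf(quantile(p, mu, alpha), mu, alpha) for p in (0.1, 0.5, 0.9)]
```

The round trip `cdf(quantile(p))` should return p, and `mean_check` should be close to \(\mu \), since the Lorenz curve determines the distribution only up to the scale fixed by \(\mu \).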
Let \(L_0(p)\) be the Aggarwal and Singh Lorenz curve. In this case, if \(\alpha <0\)
and
if \(\alpha >0\), where
The corresponding probability density functions are
and
for \(\alpha >0\) and \(\alpha <0\), respectively.
Finally, for the arctan Chotikapanich Lorenz curve, the population function becomes
where
with \(-\infty <\alpha <\infty \), \(\alpha \ne 0\) and
4 Numerical application
To compare the performance of the functional forms given in (10), (11) and (12), we used the US data (for 2009 and 2013) obtained from the US Census Bureau, Current Population Survey, 2014 Annual Social and Economic Supplement (see “Appendix, Tables 5 and 6”). Three methods of estimation are considered, as described below.
4.1 Nonlinear least squares estimators
These are defined by the estimators which minimize the sum of the squared differences between the predicted and observed values. For a particular Lorenz curve \(L_{\alpha }(p)\), the minimization is associated with the expression
\[ \min \sum _{i=1}^{n}\left[ q_i-L_{\alpha }(p_i)\right] ^2, \]
where the minimization is over the parameters of the curve and the points \((p_i,q_i)_{i=1}^n\) are available from an empirical Lorenz curve.
From the approximation given in (7), we consider as initial estimates those obtained by least squares, replacing \(L_0(p)\) for the classical expression and in every case mapping from the observations to the estimated parameters. This expression can also be employed to obtain estimates by the method proposed by Castillo et al. (1998). In this case, we begin by considering a single point \((p_i, q_i)\) of the empirical Lorenz curve, and by substituting in (7), we obtain the simple estimate for \(\alpha \) given by
where \(\widehat{\phi }\) is the least squares estimate obtained from the classical Lorenz curve, which depends on parameter \(\phi \) (which is a vector of parameters when the classical Lorenz curve depends on more than one parameter). By combining all the initial estimators (16) using a function such as the mean or median, the final estimators are obtained. For example, if we use the mean function, the final estimation of \(\alpha \) will be \(\widehat{\alpha }=\frac{1}{n}\sum _{i=1}^{n}\widehat{\alpha }_i\).
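The least squares step can be sketched for the one-parameter \(\arctan \) egalitarian curve. The code assumes the form \(L_{\alpha }(p)=1-\arctan (\alpha (1-p))/\arctan \alpha \), generates synthetic points from a known \(\alpha _0=2\) (these are not the CPS data used in the paper), and minimizes the sum of squared errors by a golden-section search over \(\alpha >0\). The search interval, the tolerance and the function names are illustrative assumptions, and a general-purpose nonlinear optimizer would be the natural choice for curves with several parameters.

```python
import math

def arctan_egalitarian(p, alpha):
    """Arctan egalitarian Lorenz curve (assumed form of (3) with L0(p) = p)."""
    return 1.0 - math.atan(alpha * (1.0 - p)) / math.atan(alpha)

def sse(alpha, points):
    """Sum of squared differences between fitted and observed ordinates."""
    return sum((arctan_egalitarian(p, alpha) - q) ** 2 for p, q in points)

def golden_section(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if f(c) < f(d):
            hi, d = d, c                      # minimum lies in [lo, d]
            c = hi - inv_phi * (hi - lo)
        else:
            lo, c = c, d                      # minimum lies in [c, hi]
            d = lo + inv_phi * (hi - lo)
    return 0.5 * (lo + hi)

# Synthetic empirical Lorenz points generated from alpha_0 = 2 (illustration only).
alpha_0 = 2.0
points = [(i / 10, arctan_egalitarian(i / 10, alpha_0)) for i in range(1, 10)]

alpha_hat = golden_section(lambda a: sse(a, points), 0.01, 20.0)
mse = sse(alpha_hat, points) / len(points)
```

Because the curve depends on \(\alpha \) only through \(|\alpha |\), restricting the search to \(\alpha >0\) loses nothing.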
Finally, the results for the two data sets, 2009 and 2013, are shown in Table 1. The parameter estimates, the mean squared error (MSE) and the maximum absolute error (MAX) were computed for both data sets. The table shows that the new models provide better results than the initial Lorenz curves considered in terms of MSE, MAX, and the Gini and Pietra indices (the empirical Gini index, computed according to Brown’s formula, and the empirical Pietra index are 0.450007 and 0.324733, respectively, for the 2009 data and 0.457607 and 0.330401 for the 2013 data), and that the best fit is obtained with the new functional forms proposed.
Figure 1 presents a graphical comparison between the empirical Lorenz curves and the corresponding estimated Lorenz curves based on the nonlinear least squares estimators for the Egalitarian and Pareto cases.
4.2 Maximum likelihood estimation based on the use of the population function
Maximum likelihood estimation based on the use of the population function was also studied, using the cumulative distribution functions given in Sect. 3.3. When data are grouped, let \(n_j\) be the number of observations in the interval \((c_{j-1},c_j]\). The log-likelihood function is then,
where n is the sample size and \(\phi \) the parameter/s to be estimated. See Chotikapanich (2008) for details. From Table 2, we can see that the arctan model attains a better value of the maximized log-likelihood function than the approach based on the Dirichlet distribution.
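As an illustration of grouped-data likelihood evaluation, the sketch below uses the common multinomial form \(\sum _j n_j\log \left[ F(c_j;\phi )-F(c_{j-1};\phi )\right] \), which omits terms that do not depend on \(\phi \); whether the paper's expression includes such constants is not assumed here. The distribution function is the one for \(L_0(p)=p\) under the assumed form \(L_{\alpha }(p)=1-\arctan (\alpha (1-p))/\arctan \alpha \). The class limits, the fixed mean \(\mu =1\) and the counts are synthetic; the counts are expected (non-integer) counts generated from \(\alpha =2\).

```python
import math

def cdf(x, mu, alpha):
    """Population function for L0(p) = p under the assumed form of (3)."""
    upper = mu * alpha / math.atan(alpha)      # upper end of the support
    lower = upper / (1.0 + alpha ** 2)         # lower end, kappa_2 in the text
    if x <= lower:
        return 0.0
    if x >= upper:
        return 1.0
    return 1.0 - math.sqrt(max(0.0, upper / x - 1.0)) / abs(alpha)

def grouped_loglik(alpha, mu, limits, counts):
    """Multinomial log-likelihood for grouped data, up to an additive constant."""
    total = 0.0
    for j, n_j in enumerate(counts):
        prob = cdf(limits[j + 1], mu, alpha) - cdf(limits[j], mu, alpha)
        if n_j > 0:
            if prob <= 0.0:
                return -math.inf               # observed class impossible under alpha
            total += n_j * math.log(prob)
    return total

# Synthetic grouped data: expected counts under mu = 1, alpha = 2.
limits = [0.0, 0.5, 0.75, 1.0, 1.5, 3.0]
true_probs = [cdf(limits[j + 1], 1.0, 2.0) - cdf(limits[j], 1.0, 2.0)
              for j in range(len(limits) - 1)]
counts = [1000.0 * pr for pr in true_probs]

candidates = [1.8, 2.0, 2.5]
best_alpha = max(candidates, key=lambda a: grouped_loglik(a, 1.0, limits, counts))
```

In practice \(\mu \) and \(\alpha \) would be estimated jointly with a numerical optimizer; the three-point comparison above only shows how the objective is evaluated.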
Because there is a mapping from the Lorenz curve to the density of the data and in order to correct standard errors for model misspecification, we have estimated the parameters of interest by maximizing the log-likelihood and obtained robust (sandwich) standard errors. See Freedman (2006) for details.
Finally, when the population function associated with a given Lorenz curve is not known, estimation based on the use of the Dirichlet distribution is adequate for comparing different models (see Chotikapanich and Griffiths 2002).
4.3 Model validation
For the situation in which the models are non-nested, a Vuong test was conducted to compare the estimates of the different Lorenz curves. In this regard, we test the null hypothesis that the two models are equally close to the actual model, against the alternative that one of them is closer (Vuong 1989). The z-statistic is
where \(\widehat{\theta }_1\) and \(\widehat{\theta }_2\) are vectors of the estimated parameters and
where f and g represent the probability density functions of the two models to be compared, respectively.
Because the Z statistic is asymptotically standard normal under the null hypothesis, the null hypothesis is rejected at significance level \(\alpha \) in favor of the alternative that f is closer to the true model when \(Z>z_{1-\alpha }\), where \(z_{1-\alpha }\) is the \((1-\alpha )\) quantile of the standard normal distribution.
To work with this test, we choose a critical value c from the standard normal distribution that corresponds to the desired level of significance (e.g., for \(c = 1.96\), \(\Pr (|z|\ge c)=0.05\)). Then, if \(z>c\), we reject the null hypothesis that the models are equivalent in favor of the alternative that f is better than g; if \(z<-c\), we reject it in favor of the alternative that g is better than f; and if \(|z|\le c\), we cannot reject the null hypothesis that the models are equivalent. Under this criterion, and from Table 3, we conclude that the classical Aggarwal and Singh Lorenz curve outperforms all the arctan models proposed and that the Chotikapanich Lorenz curve outperforms the arctan Egalitarian Lorenz curve. In the remaining cases, the arctan models are better than the Pareto and the Chotikapanich Lorenz curves.
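The decision rule can be sketched as follows. The code uses the usual form of the Vuong (1989) statistic, \(Z=\sqrt{n}\,\bar{d}/\widehat{\omega }\), where \(d_i=\log f(x_i)-\log g(x_i)\), \(\bar{d}\) is their mean and \(\widehat{\omega }^2\) is their sample variance with divisor n. The pointwise log-densities below are hypothetical numbers chosen only to exercise the functions, and the names `vuong_z` and `decide` are illustrative.

```python
import math

def vuong_z(loglik_f, loglik_g):
    """Vuong z-statistic from pointwise log-densities of two non-nested models."""
    n = len(loglik_f)
    diffs = [a - b for a, b in zip(loglik_f, loglik_g)]
    mean = sum(diffs) / n
    var = sum(d * d for d in diffs) / n - mean ** 2
    return math.sqrt(n) * mean / math.sqrt(var)

def decide(z, c=1.96):
    """Two-sided decision rule at the level implied by the critical value c."""
    if z > c:
        return "f preferred"
    if z < -c:
        return "g preferred"
    return "no significant difference"

# Hypothetical pointwise log-densities under models f and g.
loglik_f = [-1.0, -1.2, -0.9, -1.1]
loglik_g = [-1.5, -1.4, -1.6, -1.3]

z = vuong_z(loglik_f, loglik_g)
verdict = decide(z)
```

Swapping the roles of f and g changes only the sign of the statistic, so the rule is symmetric in the two models.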
Finally, we examined whether likelihood ratio tests suggested that nested versions were adequate. This test was computed, and the results obtained are shown in Table 4. As we can see, the arctan model outperforms the classical model.
5 Conclusions
The proposed family of Lorenz curves seems to be a worthy addition to the existing class of single parameter Lorenz curves. The family was applied to two data sets with satisfactory results, using least squares and maximum likelihood. Thus, the new specification is well capable of modeling income data.
References
Aggarwal V, Singh R (1984) On optimum stratification with proportional allocation for a class of Pareto distributions. Commun Stat Theory Methods 13:3017–3116
Anderson N (1970) Integration of inverse functions. Math Gaz 54(387):52–53
Arnold BC (1986) A class of hyperbolic Lorenz curves. Sankhyā: Indian J Stat, Ser B 48(3):427–436
Basmann RL, Hayes KL, Slottje DJ, Johnson JD (1990) A general functional form for approximating the Lorenz curve. J Econom 43:77–90
Burrell QL (1992) The Gini index and the Leimkuhler curve for bibliometric processes. Inf Process Manag 28:19–33
Burrell QL (2005) Symmetry and other transformation features of Lorenz/Leimkuhler representations of informetric data. Inf Process Manag 41:1317–1329
Castellanos D (1988) The ubiquitous pi. Math Mag 61:67–98
Castillo E, Hadi AS, Sarabia JM (1998) A method for estimating Lorenz curves. Commun Stat-Theory Methods 27:2037–2063
Chotikapanich D (1993) A comparison of alternative functional forms for the Lorenz curve. Econ Lett 41:129–138
Chotikapanich D (2008) Modeling income distributions and Lorenz curves. Springer, Berlin
Chotikapanich D, Griffiths WE (2002) Estimating Lorenz curves using a Dirichlet distribution. J Bus Econ Stat 20(2):290–295
Freedman DA (2006) On the so-called “Huber sandwich estimator” and “robust standard errors”. Am Stat 60(4):299–302
Frosini BV (2005) Inequality measures for histograms. Statistica 65:27–40
Frosini BV (2012) Approximation and decomposition of Gini, Pietra-index and Theil inequality measures. Empir Econ 43:175–197
Gastwirth JL (1971) A general definition of the Lorenz curve. Econometrica 39:1037–1039
Gupta MR (1984) Functional forms for fitting the Lorenz curve. Econometrica 52:1313–1314
Kakwani N (1980) On a class of poverty measures. Econometrica 48:437–446
Ogwang T, Rao URG (1996) A new functional form for approximating the Lorenz curve. Econ Lett 52:21–29
Ortega P, Martín G, Fernandez A, Ladoux M, García A (1991) A new functional form for estimating Lorenz curves. Rev Income Wealth 37:447–452
Pietra G (1915) Delle relazioni tra gli indici di variabilità. Atti Regio Istituto Veneto 74(II):775–792
Rajan S, Wang S, Inkol R, Joyal A (2006) Efficient approximations for the arctangent function. IEEE Signal Process Mag 23(3):108–111
Ricci U (1916) L’indice di variabilità e la curva dei redditi. Giornale degli economisti e Rivista di statistica 53:177–228
Ryu HK, Slottje DJ (1996) Two flexible functional form approaches for approximating the Lorenz curve. J Econom 72:251–274
Sarabia JM, Castillo E, Slottje DJ (1999) An ordered family of Lorenz curves. J Econom 91:43–60
Sarabia JM, Gómez-Déniz E, Sarabia M, Prieto F (2010) A general method for generating parametric Lorenz and Leimkuhler curves. J Informetr 4:424–539
Sarabia JM, Sarabia M (2008) Explicit expressions for the Leimkuhler curve in parametric families. Inf Process Manag 44:1808–1818
Vuong Q (1989) Likelihood ratio tests for model selection and non-nested hypotheses. Econometrica 57:307–333
Yitzhaki S (1983) On an extension of the Gini inequality index. Int Econ Rev 24:617–628
Acknowledgments
The authors thank the two anonymous referees and the Associate Editor for their valuable comments and suggestions. EGD was partially funded by Grant ECO2013-47092 (Ministerio de Economía y Competitividad, Spain).
Appendix
To compute the integral
we perform integration by parts, which gives
The above integral is obtained by making the change of variable \(\omega =1-p\), and the result is obtained after some algebra, taking into account (6).
To obtain the integral
we make the change of variable \(\omega =\alpha \theta -\tan y\) and thus obtain the rational integral
which is simple to calculate.
In order to obtain the integral \(\int _{0}^{\arctan \alpha }(\tan y)^{1/k}\,\mathrm{d}y\) we make the change of variable \(\omega =\frac{1}{\alpha ^2}\tan ^2 y\), giving the integral
from which the result is obtained after some algebra, taking into account (6).
Gómez-Déniz, E. A family of arctan Lorenz curves. Empir Econ 51, 1215–1233 (2016). https://doi.org/10.1007/s00181-015-1031-y
DOI: https://doi.org/10.1007/s00181-015-1031-y