
1 Introduction

The present debate on well-being measurement is clearly pointing out that a valuable evaluation process has to take into account many different and complementary aspects, in order to get a comprehensive picture of the problem and to effectively support decision-making. Assessing well-being requires sharing a conceptual framework about its determinants and about society and needs the identification of the most consistent and effective methodologies for building indicators and for communicating purposes. From a statistical perspective, one of the critical points concerns the preservation of the true nature of the socio-economic phenomena to be analysed. This calls for an adequate methodological approach. Several socio-economic phenomena have an intrinsic ordinal nature (e.g. material deprivation, democratic development, employment status), and correspondingly, there has been an increasing availability of ordinal datasets. Nevertheless, ordinal data have been often conceived as just a rough approximation of truly numerical and precise, yet non-observable, features, as if a numerical latent structure would exist under ordinal appearances. As a result, the search for alternative statistical procedures has been slowed down, and many epistemological, methodological and statistical problems regarding ordinal data treatment are still open and unsolved:

  1. Methodological approaches: between objectivity, subjectivity and arbitrariness. The epistemological research of the last century has focused on the role of the subject in knowledge production and has clearly shown that pure objectivism cannot account for the knowledge process, even in scientific disciplines. This is particularly evident when observing and analysing socio-economic phenomena. Given the complexity and the nuances of socio-economic issues, data can often be considered as a (fragmented) “text” to be “read” by the researcher, in search of a “sense” and a structure in it. This “sense structuring” process is not an arbitrary one, but necessarily involves some subjectivity. Consider, for example, the issue of defining poverty thresholds in deprivation studies, both in a monetary and in a multidimensional setting, and the consequences that different choices have on the final picture. Admittedly, in many applied studies, subjectivity is generally felt as an issue to be removed, and many evaluation procedures are designed to accomplish this task. Ironically, removing subjectivity is not an objective process and often produces arbitrary results. Thus, it is important to distinguish between a necessary “objectivity” of the research methodology (e.g. observation and data collection procedures) and an unavoidable “subjectivity” related, for instance, to the definition and choice of the conceptual framework and the analytical approaches. The real methodological issue is not removing subjectivity; rather, it is building a sound statistical process, where subjective choices are clearly stated and their consequences can be worked out in a formal and unambiguous way.

  2. Ordinal data: between accuracy and ambiguity. A great part of the methodological and statistical effort has been dedicated to making measures quantitatively more precise. In practice, this has often meant applying multivariate statistical tools to ordinal data, after transforming or interpreting them in cardinal terms, through more or less sophisticated scaling procedures. These procedures may sometimes lead to useful results, but they are often quite questionable, not being consistent with the intrinsic nature of the data. De facto, the efforts to get more precise measures frequently end up forcing the true nature of socio-economic phenomena. On the contrary, it could be wise to realize that most socio-economic phenomena are characterized by nuances and “ambiguities”, which are not obstacles to be removed, but often represent what really matters.

  3. Ordinal data: technical issues. Whether or not they are transformed into quantitative terms, ordinal data are generally submitted to traditional statistical tools, typically designed for quantitative data analysis and usually based on the analysis of linear structures. The results are quite arbitrary and questionable, since the data are forced into a conceptual and technical framework which is ultimately inconsistent with their nature. Although these problems are well known, and new methodologies are continuously being developed, they are still unsolved. Basically, it can be asserted that the issue of ranking and evaluation in an ordinal setting is still an open problem, even from a pure data treatment point of view.

Motivated by these issues and by the relevance of the topic, in this chapter we introduce new tools for ranking and evaluation of ordinal data, with the aim of overcoming the main problems of the classical methodologies and, particularly, of the composite indicator approach. We address the evaluation problem through a benchmark approach. Each statistical unit in the population is described in terms of its profile, that is, in terms of the sequence of its scores on the evaluation dimensions; profiles are then assessed against some reference sequences, chosen as benchmarks, to get the evaluation scores. We address the comparison of profiles to benchmarks in a multidimensional setting by using tools and results from partially ordered set theory (poset theory, for short). Indeed, through poset tools, sequences of scores can be assessed without involving any aggregation of the underlying variables since the evaluation is performed by exploiting the relational structure of the data, which involves solely the partial ordering of the profiles.

The remainder of this chapter is organized as follows. Section 2 gives a brief account of the composite indicator approach, highlighting its main critical issues, particularly in the ordinal case. Section 3 introduces a few basic concepts from poset theory. Section 4 describes the basic evaluation strategy and the procedure to compute the evaluation scores. Section 5 tackles the problem of “weighting” evaluation dimensions. Section 6 specializes the methodology to the fundamental case of binary variables. Section 7 concludes. The aim of this chapter is primarily methodological, leaving the systematic application of the evaluation procedure to real data to future work. Nevertheless, for the sake of clarity, all the basic concepts and the key ideas behind the methodology are introduced by examples, all of which pertain to material deprivation and multidimensional poverty.

2 The Composite Indicator Approach and Its Critical Issues

Addressing the complexity of socio-economic phenomena for evaluation aims is a complex task, often requiring the definition of large systems of indicators. Frequently, the complexity of the indicator system itself leads to the need of computing composite indicators in order to (Noll 2009):

  • Answer the call by “policy makers” for condensed information.

  • Improve the chance to get into the media.

  • Allow multidimensional phenomena to be synthesized.

  • Allow easier time comparisons.

  • Compare cases (e.g. nations, cities, social groups) in a transitive way (e.g. through rankings).

Despite its widespread use, the composite indicator approach is currently being strongly criticized as inappropriate and often inconsistent (Freudenberg 2003). Critics point out conceptual, methodological and technical issues, especially concerning the difficulty of conveying into unidimensional measures all the relevant information pertaining to phenomena which are complex, dynamic, multidimensional and full of ambiguities and nuances. The methodology aimed at constructing composite indicators is very often presented as a process needing specific training, to be performed in a scientific and objective way. Actually, the construction procedure, even though scientifically defined, is far from being objective and neutral. Generally, it comprises different stages (Nardo et al. 2005; Sharpe and Salzman 2004), each introducing some degree of arbitrariness in the decisions concerning:

  • The analytical approach to determine the underlying dimensionality of the available elementary indicators and the selection of those to be used in the evaluation process.

  • The choice of the weights used to define the importance of each elementary indicator.

  • The aggregation technique adopted to synthesize the elementary indicators into composite indicators.

Indicator selection. Selecting the indicators to be included in the composite is a fundamental stage in the construction process since it operationally defines the latent concept that the composite is supposed to measure. Selection criteria should aim (Nardo et al. 2005) at reducing redundancies, at allowing comparability both among statistical units and over time, and at obtaining politically relevant results. From a statistical point of view, indicator selection often involves a principal component analysis or a factor analysis, to reveal correlations and associations among evaluation variables and to perform some dimensionality reduction. Irrespective of the statistical tool adopted, dimensionality reduction raises some relevant questions concerning its consequences on the composite indicator construction. If the concept to be measured turns out to be actually unidimensional, computing a single composite indicator could be justifiable. But when concepts are truly multidimensional, then singling out just one, albeit composite, indicator is very questionable. The nuances and ambiguities of the data would in fact be forced into a conceptual model where all the features conflicting with unidimensionality are considered as noise to be removed. Moreover, synthetic scores could be biased towards a small subset of elementary indicators, failing to give a faithful representation of the data.

Weighting variables. When constructing composite indicators, particular attention is paid to the weighting process, which gives different importance to the elementary indicators forming the composite. The necessity of choosing weights based on objective principles is frequently asserted (Nardo et al. 2005; Ray 2008; Sharpe and Salzman 2004), leading to a preference for statistical tools like correlation analysis, principal component analysis or data envelopment analysis, to mention a few. However, adopting purely statistical methods in the weighting process must be carefully considered. Removing any control over the weighting procedure from the analyst gives a possibly false appearance of objectivity that is actually difficult to achieve in social measurement (Sharpe and Salzman 2004). Moreover, since defining weights is often interpreted as identifying personal and social values, the procedure should necessarily involve individuals’ judgments. If indicators concern societal well-being, their construction turns out to be not just a technical problem, but part of a larger debate aimed at obtaining broader legitimacy. In this perspective, the weighting issue can even be considered as a lever for democratic participation in decisions. For example, Hagerty and Land (2007) stress that building composite indicators should take into account and maximize the agreement among citizens concerning the importance of each elementary indicator. Choosing consistent weighting criteria is thus a critical issue, largely subjective and possibly data independent.

Aggregating indicators. Further criticisms concern the aggregation process (Munda and Nardo 2009), needed to get unidimensional scores out of multidimensional data, and which raises methodological difficulties when dealing with ordinal data. The process is in fact quite controversial since:

  • The indicators to be aggregated are rarely homogeneous and need not share common antecedents (Howell et al. 2007).

  • The aggregation technique might introduce implicitly meaningless compensations and trade-offs among evaluation dimensions.

  • It is not clear how to combine ordinal variables, using numerical weights.

Even when scaling tools are used, turning ordinal scores into numerical values is not satisfactory: it forces the nature of the data and is definitely not a transparent process, since different choices of the scaling tools may imply very different final results.
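The dependence of the final result on the scaling can be illustrated with a minimal sketch. The two units, the two scalings and the equal-weight additive composite below are hypothetical choices made only for illustration; they are not taken from the chapter's data.

```python
# Two hypothetical units scored on two ordinal variables with grades 0..3.
unit_u = (3, 0)
unit_v = (2, 2)

# Two order-preserving scalings of the same four grades (both are assumptions).
linear_scale = {0: 0, 1: 1, 2: 2, 3: 3}
convex_scale = {0: 0, 1: 1, 2: 2, 3: 10}


def composite(profile, scale):
    """Equal-weight additive composite of the scaled ordinal scores."""
    return sum(scale[grade] for grade in profile)


def ranks_higher(scale):
    """Return the name of the unit with the larger composite score."""
    u, v = composite(unit_u, scale), composite(unit_v, scale)
    return "u" if u > v else "v" if v > u else "tie"


print(ranks_higher(linear_scale))  # v, since 3 < 4
print(ranks_higher(convex_scale))  # u, since 10 > 4
```

Both scalings respect the order of the grades, yet they rank the two units in opposite ways. Neither unit scores at least as high as the other on both variables, so the ordinal data alone do not settle the comparison; the ranking is produced entirely by the scaling.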

Composite indicators represent the mainstream approach to socio-economic evaluation, yet the discussion above shows how many critical issues affect their computation. The difficulties are even greater when ordinal variables are dealt with since statistical tools based on linear metric structures can hardly be applied to non-numeric data. In a sense, socio-economic analysis faces an impasse: (1) implicitly or not, it is generally taken for granted that “evaluation implies aggregation”; thus (2) ordinal data must be scaled to numerical values, to be aggregated and processed in a (formally) effective way; unfortunately (3) this often proves inconsistent with the nature of the phenomena and produces results that may be largely arbitrary, poorly meaningful and hardly interpretable. Realizing the weakness of the outcomes based on composite indicator computations, statistical research has focused on developing alternative and more sophisticated analytic procedures, but almost always assuming the existence of a cardinal latent structure behind ordinal data. The resulting models are often very complicated and still affected by the epistemological and technical issues discussed above. The way out of this impasse can instead be found by realizing that evaluation need not imply aggregation and that it can be performed in purely ordinal terms. This is exactly what poset theory allows us to do.

3 Basic Elements of Partial Order Theory

In this section, we introduce some basic definitions pertaining to partially ordered sets. In order to avoid collecting too many technicalities in a single paragraph, other results, needed in subsequent developments, will be presented throughout this chapter.

A partially ordered set (or a poset) P = (X, ≤ ) is a set X (called the ground set) equipped with a partial order relation ≤, that is, a binary relation satisfying the properties of reflexivity, antisymmetry and transitivity (Davey and Priestley 2002):

  1. x ≤ x for all x ∈ X (reflexivity).

  2. If x ≤ y and y ≤ x, then x = y, for x, y ∈ X (antisymmetry).

  3. If x ≤ y and y ≤ z, then x ≤ z, for x, y, z ∈ X (transitivity).

If x ≤ y or y ≤ x, then x and y are called comparable; otherwise, they are said to be incomparable (written x || y). A partial order P where any two elements are comparable is called a chain or a linear order. On the contrary, if any two distinct elements of P are incomparable, then P is called an antichain. A finite poset P (i.e. a poset over a finite ground set) can be easily depicted by means of a Hasse diagram, which is a particular kind of directed graph, drawn according to the following two rules: (1) if s ≤ t, then node t is placed above node s; (2) if s ≤ t and there is no other element w such that s ≤ w ≤ t (i.e. if t covers s), then an edge is inserted linking node t to node s. By transitivity, s ≤ t (or t ≤ s) in P if and only if in the Hasse diagram there is a descending path linking the corresponding nodes; otherwise, s and t are incomparable. Examples of Hasse diagrams are reported in Fig. 4.1. Like any binary relation, a partial order (X, ≤) can be regarded as a subset of the Cartesian product X²; we will write (x, y) ∈ ≤ if and only if x ≤ y in P. With this notation, the poset axioms read: (1) (x, x) ∈ ≤, for all x ∈ X; (2) if (x, y) ∈ ≤ and (y, x) ∈ ≤, then y = x; and (3) if (x, y) ∈ ≤ and (y, z) ∈ ≤, then (x, z) ∈ ≤, for x, y, z ∈ X. It is then meaningful to consider expressions like “a subset of a partial order” or “the intersection of a family of partial orders” or similar since they reduce just to ordinary set operations.

Fig. 4.1 Hasse diagrams of a poset (a), a chain (b) and an antichain (c)
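The three poset axioms and the covering relation that defines the edges of a Hasse diagram can be checked mechanically on a small finite poset. The following is a minimal sketch; the function names and the example (the divisors of 12 ordered by divisibility) are assumptions introduced here for illustration only.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity of leq on elements."""
    reflexive = all(leq(x, x) for x in elements)
    antisymmetric = all(
        x == y or not (leq(x, y) and leq(y, x))
        for x, y in product(elements, repeat=2)
    )
    transitive = all(
        leq(x, z) or not (leq(x, y) and leq(y, z))
        for x, y, z in product(elements, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def cover_pairs(elements, leq):
    """Return the pairs (s, t) such that t covers s: the Hasse diagram edges."""
    def strictly_below(x, y):
        return x != y and leq(x, y)

    return {
        (s, t)
        for s, t in product(elements, repeat=2)
        if strictly_below(s, t)
        and not any(strictly_below(s, w) and strictly_below(w, t) for w in elements)
    }


def divides(x, y):
    """Hypothetical order relation: x is below y when x divides y."""
    return y % x == 0


divisors = [1, 2, 3, 4, 6, 12]
print(is_partial_order(divisors, divides))
print(sorted(cover_pairs(divisors, divides)))
```

Only the covering pairs need to be drawn; every other comparability (for instance 1 ≤ 12) is recovered by following a path of edges, which is the role transitivity plays in the diagram.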

4 Evaluating Multidimensional Ordinal Phenomena Through Poset Theory

4.1 Representing Ordinal Data as Posets

In this paragraph, we use poset theory to give a simple and effective representation of multidimensional ordinal data which proves essential for the development of the evaluation procedure. The presentation follows mainly Fattore et al. (2011), generalizing and extending it in many directions.

Let \(v_1,\dots,v_k\) be k ordinal evaluation variables. Each possible sequence s of scores on \(v_1,\dots,v_k\) defines a different profile. Profiles can be (partially) ordered in a natural way, by the dominance criterion given in the following definition:

Definition 1.

Let s and t be two profiles over \(v_1,\dots,v_k\); we say that t dominates s (written s ⊴ t) if and only if \(v_i(s)\le v_i(t)\ \forall i=1,\dots,k\), where \(v_i(s)\) and \(v_i(t)\) are the scores of s and t on \(v_i\).

Clearly, not all the profiles can be ordered based on the previous definition; as a result, the set of profiles gives rise to a poset (in the following, called the profile poset).
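The dominance criterion of Definition 1 amounts to a componentwise comparison of scores. A minimal sketch follows, assuming profiles are represented as tuples of integer grades; the function names are illustrative and not part of the original text.

```python
def dominates(t, s):
    """True if profile t dominates profile s, i.e. s ⊴ t componentwise."""
    if len(s) != len(t):
        raise ValueError("profiles must be scored on the same variables")
    return all(si <= ti for si, ti in zip(s, t))


def comparable(s, t):
    """True if one of the two profiles dominates the other."""
    return dominates(s, t) or dominates(t, s)


print(dominates((1, 2, 1), (0, 1, 1)))   # True: every score is at least as high
print(comparable((1, 0, 3), (0, 2, 1)))  # False: the profiles are incomparable
```

The second pair shows why the order is only partial: the first profile scores higher on the first and third variables, the second profile on the second variable, so neither dominates the other.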

Example 1 (Material deprivation).

Let us consider the following three deprivation dimensions from the Italian EU-SILC survey:

  1. HS120 – The household makes ends meet with difficulty.

  2. DIFCIB – The household has received food donations over the last year.

  3. DIFDEN – The household has received money donations over the last year.

Variable HS120 is coded in binary form (Footnote 1) (“0 – No”, “1 – Yes”). Variables DIFCIB and DIFDEN are recorded on a four-grade scale (“0 – Never”, “1 – Seldom”, “2 – Sometimes”, “3 – Often”). The 32 profiles resulting from considering all the sequences of scores over HS120, DIFCIB and DIFDEN can be partially ordered according to Definition 1. The Hasse diagram of the resulting poset is shown in Fig. 4.2. The top node (⊤) represents the completely deprived profile (133); correspondingly, the bottom node (⊥) represents the completely non-deprived profile (000).
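The profile poset of Example 1 is small enough to be built explicitly. The sketch below enumerates the profiles from the grade ranges stated above and locates the top and bottom elements; the variable and function names are assumptions made for illustration.

```python
from itertools import product

# Grades of HS120, DIFCIB and DIFDEN as coded in Example 1.
grades = [range(2), range(4), range(4)]
profiles = list(product(*grades))


def dominated_by(s, t):
    """True if s ⊴ t according to Definition 1."""
    return all(si <= ti for si, ti in zip(s, t))


# The top profile dominates every profile; the bottom is dominated by all.
top = [p for p in profiles if all(dominated_by(q, p) for q in profiles)]
bottom = [p for p in profiles if all(dominated_by(p, q) for q in profiles)]

# Count the unordered pairs of profiles that Definition 1 cannot order.
incomparable_pairs = sum(
    1
    for i, s in enumerate(profiles)
    for t in profiles[i + 1:]
    if not dominated_by(s, t) and not dominated_by(t, s)
)

print(len(profiles), top, bottom, incomparable_pairs)
```

The count of incomparable pairs gives a rough idea of how much ambiguity the purely logical ordering leaves unresolved in this poset.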

4.2 The Evaluation Strategy

Since ordinal phenomena cannot be measured against an absolute scale, the evaluation scores of the statistical units are computed by comparing them against some reference units, assumed as benchmarks (Footnote 2). In operational terms, the procedure is organized in three steps:

  1. Given the evaluation dimensions, reference profiles are selected, identifying benchmarks in the profile poset.

  2. Given the benchmarks, the evaluation function is computed assigning a score to each profile in the poset. This score depends upon the “position” of the profile with respect to the benchmarks and is computed analysing the partial order structure.

  3. Once the scores of the profiles are computed, each statistical unit is assigned the score corresponding to its profile. This way, the evaluation is extended from the poset to the population.

Fig. 4.2 Hasse diagram for the poset of material deprivation profiles

In this chapter, we focus on assessing the elements of the profile poset. It is worth noticing that the computation of the evaluation scores depends only upon the benchmarks and the structure of the profile poset, but not upon the statistical distributions of the evaluation variables on the population. Thus, our procedure is in a sense halfway between an absolute and a relative approach to evaluation and can be tuned in one direction or the other, with a convenient choice of the benchmarks.

4.3 The Evaluation Function

The evaluation function η(·) assigns a score to each element in the profile poset P. Formally, it is an order-preserving map from P to [0, M], that is, a map

$$\begin{array}{rcl}\eta : P & \to & [0,M]\\ s & \mapsto & \eta(s)\end{array}$$
(4.1)

such that

$$s \trianglelefteq t\Rightarrow \eta(s)\le \eta(t),$$
(4.2)

where M > 0 is the maximum evaluation score and can be seen as a scaling factor.

Condition (4.2) states the minimal consistency requirement that the score computed through η(·) does not decrease as we move towards the top of the profile poset.

Given η(·), the profile poset P is naturally partitioned into the union of the following disjoint subsets:

  • The set D of profiles such that η(s) = M

  • The set W of profiles such that η(s) = 0

  • The set A of profiles such that 0 < η(s) < M

Sets D and W have the following useful property: if s ∈ D and s ⊴ t, then, according to (4.2), η(t) = M, that is, t ∈ D; similarly, if s ∈ W and t ⊴ s, then η(t) = 0 and t ∈ W. In poset theoretical terms, sets like D and W are called up-sets and down-sets, respectively.

4.4 Benchmarks and Poset Thresholds

To pursue a benchmark approach to evaluation, the concept of evaluation threshold, typical of quantitative evaluation studies, must be extended to the ordinal case. To this end, we draw upon the following property of up-sets and down-sets. Given the up-set D, there is a unique subset \(\underline{d}\subseteq D\) of mutually incomparable elements (i.e. an antichain), such that s ∈ D if and only if d ⊴ s for some \(d\in \underline{d}\) (Davey and Priestley 2002). The up-set D is said to be generated by \(\underline{d}\) (in formulas, \(D=\uparrow\underline{d}\)). Excluding trivial cases, any element of the generating antichain is below only elements of D and is above only elements of P \ D, so that it plays, in the profile poset, the same role as a numerical threshold in the quantitative case. Thus, \(\underline{d}\) will be called the superior threshold. The same result can be dually stated for the down-set W: an antichain \(\underline{w}\subseteq W\) can be found such that s ∈ W if and only if s ⊴ w for some \(w\in \underline{w}\). W is said to be generated by \(\underline{w}\) (in formulas, \(W=\downarrow\underline{w}\)), and \(\underline{w}\) will be called the inferior threshold.

Example 1 (continuation).

Given the deprivation poset, at a pure illustrative level we can identify the superior (deprivation) threshold and the inferior (non-deprivation) threshold as

$$\underline{d} = (121, 112),$$
(4.3)
$$\underline{w} = (011).$$
(4.4)

In other words, we state that any statistical unit whose profile is 121 or 112, or dominates one of them, is considered as completely deprived and will be assigned a deprivation score equal to M. Similarly, any statistical unit whose profile is 011, or is dominated by it, will be assigned a deprivation score equal to 0. The sets D of completely deprived profiles and W of completely non-deprived profiles are represented by the black nodes in Fig. 4.3. Note that, for logical consistency, any element of \(\underline{d}\) is unambiguously more deprived than the element of \(\underline{w}\).
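The sets D and W generated by these thresholds can be computed directly from the dominance relation. The following is a minimal sketch under the coding of Example 1; the helper names (`up_set`, `down_set`, `minimal_elements`) are illustrative assumptions.

```python
from itertools import product

profiles = list(product(range(2), range(4), range(4)))


def dominated_by(s, t):
    """True if s ⊴ t componentwise."""
    return all(si <= ti for si, ti in zip(s, t))


def up_set(generators):
    """Profiles dominating at least one generator."""
    return {s for s in profiles if any(dominated_by(d, s) for d in generators)}


def down_set(generators):
    """Profiles dominated by at least one generator."""
    return {s for s in profiles if any(dominated_by(s, w) for w in generators)}


def minimal_elements(subset):
    """Elements of subset with no other element of subset below them."""
    return {s for s in subset if not any(t != s and dominated_by(t, s) for t in subset)}


superior = [(1, 2, 1), (1, 1, 2)]  # threshold (4.3)
inferior = [(0, 1, 1)]             # threshold (4.4)

D = up_set(superior)
W = down_set(inferior)
A = set(profiles) - D - W
print(len(D), len(W), len(A))  # 8 4 20
```

Taking the minimal elements of D returns the superior threshold itself, which illustrates the statement that an up-set is identified by its generating antichain.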

4.5 Computation of the Evaluation Function

In the socio-economic literature, evaluation is often performed according to a “response from a population” approach (Cerioli and Zani 1990). In the language of social choice, this means that a set of judges is identified, each assigning an evaluation score to the statistical units. Judges’ scores are then averaged into the final evaluation scores. Usually, judges coincide with the evaluation dimensions, leading to the definition of a composite indicator (Alkire and Foster 2007). In the ordinal case, this choice is unsatisfactory since it is unclear how to aggregate ordinal scores. Although our methodology also follows a “response from a population” approach, it overcomes the problem of ordinal score aggregation by selecting the set of judges in a different way. The key idea behind judge selection can be explained as follows. Judges produce rankings of profiles out of the poset P; when accomplishing this task, they are free to order incomparable pairs as preferred (no ties are allowed), but they cannot violate the constraints given by the profile poset, that is, if s ⊴ t in P, then any judge must rank t above s in their own ranking. Thus, the set of all possible different judges (i.e. judges not producing the same rankings) coincides with the set of all the linear extensions of P.

A linear extension of a poset P is a linear ordering of the elements of P which is consistent with the constraints given by the partial order relation. For example, if P is composed of three elements x, y and z, with y ≤ x, z ≤ x and y || z, only two linear extensions are possible, namely, z ≤ y ≤ x and y ≤ z ≤ x, since x is greater than both y and z in P. The set of all the linear extensions of a poset P is denoted by Ω(P); it comprises all the linear orders compatible with P and identifies uniquely the partial order structure (Neggers and Kim 1998; Schroeder 2003). Thus, considering the set of linear extensions of P is just a different way to consider the whole poset; in other words, the set of judges exploits all the information contained in the original partial order.
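The three-element example can be reproduced by brute force. The sketch below lists every ordering of the elements and keeps those that respect the partial order; this is feasible only for very small posets, and the function name and data representation are assumptions made for illustration.

```python
from itertools import permutations


def linear_extensions(elements, below):
    """Yield every ordering (lowest first) that respects the partial order.

    below is a set of pairs (a, b) meaning that a lies below b in the poset.
    """
    for ordering in permutations(elements):
        rank = {e: i for i, e in enumerate(ordering)}
        if all(rank[a] < rank[b] for a, b in below):
            yield ordering


# The example from the text: y and z both lie below x, and y || z.
relation = {("y", "x"), ("z", "x")}
extensions = list(linear_extensions(["x", "y", "z"], relation))
print(extensions)  # [('y', 'z', 'x'), ('z', 'y', 'x')]
```

Each tuple is one “judge”: both place x on top, and they differ only in how they order the incomparable pair y, z.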

Fig. 4.3 Thresholds for the deprivation poset and corresponding sets D and W (black nodes)

In view of the definition of the evaluation function, it must be determined (1) how each linear extension (i.e. each “judge”) ω ∈ Ω(P) assigns a score \(\eta_{\omega}(s)\) to each profile s ∈ P and (2) how such scores are aggregated into the final evaluation score η(s). The two steps are described in sequence.

Let \(\underline{d}\) be the superior threshold. If in ω a profile t is ranked above a profile \(d\in \underline{d}\), that judge (i.e. ω) must assign a score M to t, consistently with (4.2). Similarly, for the inferior threshold, judge ω will assign an evaluation score equal to 0 to any profile ranked, in ω, below a profile \(w\in \underline{w}\). On the contrary, if in ω a profile s falls below every element of \(\underline{d}\) and above every element of \(\underline{w}\), then it can receive neither a score equal to M nor a score equal to 0. As a consequence, it will be assigned an evaluation score equal to 0.5M (Footnote 3), to reflect the uncertainty in the evaluation. This way, elements of D and W are assigned scores equal to M and 0, respectively, by any judge ω. On the contrary, elements of A are assigned scores equal to M, 0, or 0.5M by ω according to whether they are ranked, in the linear extension ω, over elements of \(\underline{d}\), below elements of \(\underline{w}\) or in between (Fig. 4.4).

Fig. 4.4 Example of a linear extension ω of a poset with 12 elements and corresponding evaluation function \(\eta_{\omega}(\cdot)\), when \(\underline{d}=(d_1,d_2)\) and \(\underline{w}=(w_1,w_2)\)

Formally, let ω ∈ Ω(P) and let us define the following sets:

$$D_{\omega}=\left\{s\in \omega : d \trianglelefteq s \text{ in } \omega, \text{ for at least one } d\in \underline{d}\right\};$$
(4.5)
$$W_{\omega}=\left\{s\in \omega : s \trianglelefteq w \text{ in } \omega, \text{ for at least one } w\in \underline{w}\right\};$$
(4.6)
$$A_{\omega}=\left\{s\in \omega : s\notin D_{\omega}\cup W_{\omega}\right\}.$$
(4.7)

Then the evaluation function \(\eta_{\omega}(\cdot)\) associated to judge ω is defined by

$$\eta_{\omega}(s)=\begin{cases}M & s\in D_{\omega},\\ 0.5M & s\in A_{\omega},\\ 0 & s\in W_{\omega}.\end{cases}$$
(4.8)
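The scoring rule (4.8) for a single judge can be written as a short function. This is a minimal sketch: the function name, the representation of a linear extension as a list ordered from lowest to highest, and the six-element ranking are hypothetical and serve only to illustrate the rule.

```python
def judge_score(ranking, profile, superior, inferior, M=1.0):
    """Score given by one judge, following (4.8).

    ranking lists the profiles from lowest to highest in the linear extension.
    """
    rank = {p: i for i, p in enumerate(ranking)}
    r = rank[profile]
    if any(rank[d] <= r for d in superior):
        return M          # at or above some element of the superior threshold
    if any(r <= rank[w] for w in inferior):
        return 0.0        # at or below some element of the inferior threshold
    return 0.5 * M        # strictly between the two thresholds


# Hypothetical linear extension of a six-element poset, lowest element first.
ranking = ["w1", "a", "b", "d1", "c", "d2"]
scores = {p: judge_score(ranking, p, ["d1", "d2"], ["w1"]) for p in ranking}
print(scores)
```

In this ranking, element c receives the maximum score because this particular judge places it above d1, even though c need not belong to D in the underlying poset; another judge may rank it differently.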

Given \(\eta_{\omega}(\cdot)\) for each single judge ω ∈ Ω(P), an aggregation function g(·, …, ·) is to be selected in order to define the evaluation function η(·) as

$$\eta(s)=g(\eta_{\omega_1}(s),\dots,\eta_{\omega_n}(s)),$$
(4.9)

where n is the number of linear extensions of the profile poset P.

To restrict the possible forms of the aggregation function, we impose the following list of axioms on g(·, …, ·):

  1. \(g(x,\dots,x)=x\).

  2. \(g(k\cdot x_1,\dots,k\cdot x_n)=k\cdot g(x_1,\dots,x_n)\).

  3. Other things being equal, if \(x_i<\widehat{x}_i\), then \(g(x_1,\dots,x_i,\dots,x_n)\le g(x_1,\dots,\widehat{x}_i,\dots,x_n)\).

  4. g(·, …, ·) must be quasi-linear.

  5. \(g(z-x_1,\dots,z-x_n)=z-g(x_1,\dots,x_n)\).

The first three axioms are self-evident; quasi-linearity means that the computation of the evaluation function is consistent with grouping judges into disjoint classes. The fifth axiom requires a deeper explanation. Consider the complement to M of the evaluation function η(·):

$$\theta(\cdot)=M-\eta(\cdot)=M-g(\eta_{\omega_1}(\cdot),\dots,\eta_{\omega_n}(\cdot)).$$
(4.10)

If η(·) measures (say) the deprivation degree, θ(·) consistently measures the non-deprivation degree. Alternatively, the non-deprivation degree could also be obtained as

$$\theta(\cdot)^{\ast}=g(M-\eta_{\omega_1}(\cdot),\dots,M-\eta_{\omega_n}(\cdot))$$
(4.11)

since \(M-\eta_{\omega_i}(\cdot)\) is the complement to M of \(\eta_{\omega_i}(\cdot)\). Axiom 5 requires θ(·) and θ(·)* to coincide, which is a logical consistency requirement. By a theorem of de Finetti (1931), the only functions satisfying axioms (1)–(5) turn out to belong to the class of the weighted arithmetic means, so that the following proposition can be stated:

Proposition 1.

The evaluation function η(·) must be computed as a weighted arithmetic mean of the evaluation functions \(\eta_{\omega}(\cdot)\), ω ∈ Ω(P).
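The easy direction of this result can be checked directly. As a sketch, with the weights \(p_i\) introduced here purely for illustration, let \(g(x_1,\dots,x_n)=\sum_{i=1}^{n}p_i x_i\) with \(p_i\ge 0\) and \(\sum_{i=1}^{n}p_i=1\). Then axioms 1, 2 and 5 follow from linearity and from the weights summing to one:

$$\begin{aligned} g(x,\dots,x) &= \sum_{i=1}^{n}p_i\,x = x,\\ g(k\cdot x_1,\dots,k\cdot x_n) &= k\sum_{i=1}^{n}p_i\,x_i = k\cdot g(x_1,\dots,x_n),\\ g(z-x_1,\dots,z-x_n) &= \sum_{i=1}^{n}p_i\,(z-x_i) = z-g(x_1,\dots,x_n). \end{aligned}$$

Axiom 3 holds because the weights are non-negative. The converse, namely that no other functions satisfy the five axioms, is the content of the cited theorem and is not rederived here.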

Since there is no reason to treat judges asymmetrically (i.e. judges are anonymous), we adopt a uniform weighting scheme and compute the evaluation function (Footnote 4) for profile s as

$$\eta(s\mid \underline{d},\underline{w})=\frac{1}{|\Omega (P)|}\sum_{\omega\in \Omega(P)}\eta_{\omega}(s\mid \underline{d},\underline{w}).$$
(4.12)
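For a very small poset, formula (4.12) can be evaluated exactly by listing all linear extensions. The sketch below does so on a hypothetical toy poset (all profiles over two three-grade variables) with hypothetical thresholds and M = 1; neither the poset nor the thresholds come from the chapter, and the function names are illustrative.

```python
from itertools import product

# Hypothetical toy poset: all profiles over two three-grade variables.
profiles = list(product(range(3), repeat=2))


def dominated_by(s, t):
    """True if s ⊴ t componentwise."""
    return all(si <= ti for si, ti in zip(s, t))


def linear_extensions(remaining, prefix=()):
    """Yield every linear extension (lowest first) by peeling off minimal elements."""
    if not remaining:
        yield prefix
        return
    for s in remaining:
        if not any(t != s and dominated_by(t, s) for t in remaining):
            rest = [t for t in remaining if t != s]
            yield from linear_extensions(rest, prefix + (s,))


def judge_score(ranking, profile, superior, inferior):
    """Score of one judge with M = 1, following (4.8)."""
    rank = {p: i for i, p in enumerate(ranking)}
    if any(rank[d] <= rank[profile] for d in superior):
        return 1.0
    if any(rank[profile] <= rank[w] for w in inferior):
        return 0.0
    return 0.5


def evaluation(extensions, superior, inferior):
    """Uniform average of the judges' scores, as in (4.12)."""
    return {
        s: sum(judge_score(e, s, superior, inferior) for e in extensions) / len(extensions)
        for s in profiles
    }


extensions = list(linear_extensions(profiles))
eta = evaluation(extensions, superior=[(2, 1), (1, 2)], inferior=[(0, 0)])
for s in sorted(profiles, key=lambda p: eta[p]):
    print(s, round(eta[s], 3))
```

Profiles in D and W receive scores 1 and 0 from every judge, while profiles in A receive intermediate averages that depend on how often the judges rank them above a superior threshold element.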

In the following, we set M = 1, so that η(s) ∈ [0, 1].

In principle, to compute the evaluation function, it would be necessary to list all the linear extensions of P, assigning the corresponding score to any profile of the poset. In practice, listing all the linear extensions of real posets is computationally infeasible, so that the evaluation function must be estimated, based on a sample of linear extensions. Many algorithms exist to perform this task, but the most efficient is known to be the Bubley-Dyer algorithm (Bubley and Dyer 1999), and all the computations presented in this chapter have been performed using it.
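A minimal sketch of how linear extensions of the deprivation poset might be sampled with a Markov chain is given below. It is a simplified lazy adjacent-swap chain with uniformly chosen positions, offered only to illustrate the general idea; it is not the Bubley-Dyer algorithm itself, whose details are not given in this chapter. The function names, the number of steps between samples (`steps_between`) and the seed are assumptions, and no claim is made that 500 steps are enough for the chain to mix.

```python
import random
from itertools import product

profiles = list(product(range(2), range(4), range(4)))  # profiles of Example 1


def dominated_by(s, t):
    """True if s ⊴ t componentwise."""
    return all(si <= ti for si, ti in zip(s, t))


def is_linear_extension(ordering):
    """True if no profile is ranked above a profile that dominates it."""
    return not any(
        dominated_by(ordering[j], ordering[i])
        for i in range(len(ordering))
        for j in range(i + 1, len(ordering))
    )


def sample_linear_extensions(n_samples, steps_between=500, seed=0):
    """Sample linear extensions (lowest first) with a lazy adjacent-swap chain."""
    rng = random.Random(seed)
    # Sorting by total score gives a valid starting linear extension.
    state = sorted(profiles, key=sum)
    samples = []
    for _ in range(n_samples):
        for _ in range(steps_between):
            i = rng.randrange(len(state) - 1)
            # Lazy step: with probability 1/2 do nothing; otherwise swap the
            # two neighbours, but only if they are incomparable.
            if rng.random() < 0.5 and not dominated_by(state[i], state[i + 1]):
                state[i], state[i + 1] = state[i + 1], state[i]
        samples.append(tuple(state))
    return samples


samples = sample_linear_extensions(n_samples=5)
print(all(is_linear_extension(s) for s in samples))
```

Because only incomparable neighbours are ever swapped, every visited state is a linear extension; averaging the judges' scores over such a sample would then give an estimate of the evaluation function.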

Example 1 (continuation).

Given the deprivation and non-deprivation thresholds, the evaluation function has been computed sampling 108 linear extensions out of the deprivation profile poset. Results are reported in Table 4.1 and depicted in Fig. 4.5. As expected, profiles in D and profiles in W are assigned deprivation scores equal to 1 and 0, respectively. All other profiles have deprivation scores in (0, 1). It is worth noticing that (1) the evaluation function increases gradually over the deprivation poset and (2) profiles sharing the same level in the Hasse diagram may receive different scores. This shows how the evaluation procedure is effective in extracting information out of the data structure, reproducing the nuances of multidimensional deprivation.

Remark.

Differently from other approaches, our procedure extracts the evaluation information directly out of the data structure, so that no aggregation of ordinal variables is required. In fact, once the thresholds are identified, the problem of computing evaluation scores is solved assessing the “relational position” of each element of the profile poset, with respect to the benchmarks. Since the structure of the profile poset can be rigorously investigated through poset theory tools, numerical evaluation scores are obtained without scaling the original ordinal dimensions into cardinal variables.

Table 4.1 Evaluation function η(s | 121, 112; 011) for the deprivation poset

Fig. 4.5 Graph of η(s | 121, 112; 011) (deprivation profiles are listed on the x axis according to increasing deprivation scores)

The effectiveness of poset methodologies is even more clearly revealed when addressing one of the main problems in evaluation studies, namely, how to account for the different relevance of the evaluation dimensions, when computing the evaluation scores. We refer to this issue as the “weighting” problem, and we explore it in the next section.

5 The “Weighting” Problem

The methodology introduced in the previous paragraphs assumes the evaluation dimensions to share the same relevance. As a matter of fact, some asymmetry among the dimensions is only implicitly introduced when the thresholds are identified. As a legacy of the composite indicator methodology, the problem of accounting for the relevance of the evaluation dimensions is usually tackled using numerical weights, even in an ordinal setting (Cerioli and Zani 1990; Lemmi and Betti 2006). As we show in the following, an alternative and more consistent solution comes from poset theory. However, before introducing it, the weighting problem must be carefully reconsidered.

5.1 Extension of a Poset

Generally, weighting schemes are introduced in order to improve the informative content of the analysis and to reduce ranking ambiguities (often, weights are computed through a principal component analysis, so as to maximize the variance, i.e. the informative power, of the final index). Ambiguity reduction is also the key to addressing the weighting problem in an ordinal setting. The profile poset P, built as described in Sect. 4, comprises only those comparabilities which are implied by the purely logical ordering criterion stated in Definition 1. Still, many ambiguities, that is, incomparabilities, remain in P, since the ordering criterion is not informative enough to “resolve” all of them. Adding information to the evaluation procedure should therefore yield a reduction of the set of incomparabilities in the profile poset. This idea can be formally stated through the concept of extension of a partial order.

Definition 2.

Let \(P_1 = (X, \le_1)\) and \(P_2 = (X, \le_2)\) be two posets over the same ground set X. If \(a \le_1 b\) implies \(a \le_2 b\), for any \(a, b \in X\), then \(P_2\) is called an extension of \(P_1\).

In set terms, \(P_2\) is an extension of \(P_1\) if and only if \(P_1 \subseteq P_2\) as subsets of \(X^2\). In general, a poset P has many extensions, and clearly, if \(P_1\) and \(P_2\) are extensions of P, then so is \(P_1 \cap P_2\).

Example 2.

Let X = {a, b, c} and let P = {a ≤ a; b ≤ b; c ≤ c; b ≤ a}. P admits five extensions, as reported in Fig. 4.6.
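The count in Example 2 can be checked by brute force. The sketch below encodes a partial order as a set of ordered pairs (x, y) meaning x ≤ y, an encoding chosen here for illustration, and enumerates every relation on X that contains P and is itself a partial order.

```python
from itertools import combinations, product

X = ["a", "b", "c"]
# P as a set of pairs (x, y) meaning x <= y: reflexive pairs plus b <= a.
P = {(x, x) for x in X} | {("b", "a")}

def is_partial_order(rel, ground):
    reflexive = all((x, x) in rel for x in ground)
    antisymmetric = all(not ((x, y) in rel and (y, x) in rel)
                        for x, y in product(ground, repeat=2) if x != y)
    transitive = all((x, z) in rel
                     for (x, y) in rel for (y2, z) in rel if y == y2)
    return reflexive and antisymmetric and transitive

# Try every way of adding pairs to P and keep the results that are posets.
candidates = [p for p in product(X, repeat=2) if p not in P]
extensions = []
for r in range(len(candidates) + 1):
    for extra in combinations(candidates, r):
        rel = P | set(extra)
        if is_partial_order(rel, X):
            extensions.append(rel)

proper = [e for e in extensions if e != P]
print(len(proper))  # 5 extensions besides P itself
```

Exhaustive enumeration is only viable for toy posets, since the number of candidate relations grows exponentially with the number of incomparable pairs.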

5.2 The Weighting Procedure

We are now in a position to outline how new ordinal information can be added to the profile poset; for the sake of clarity, we introduce the “weighting” procedure through a simple example.

Fig. 4.6 Poset P and its five extensions \(P_1, \dots, P_5\)

Example 3.

Consider two variables \(v_1\), \(v_2\), each recorded on a three-grade scale: 0, 1 and 2. The corresponding profile poset is depicted in Fig. 4.7a. Suppose that, based on some exogenous considerations, profile 12 is regarded as more deprived than profile 21. Poset P is then enlarged with the addition of a new comparability, namely (21, 12) (i.e. 21 ⊴ 12). Unfortunately, P ∪ (21, 12) is not a poset, since it does not satisfy the transitivity axiom. In fact, since 20 ⊴ 21 in P, from 21 ⊴ 12 it follows that 20 ⊴ 12 must also be added to P, so as to restore transitivity and to get a partial order P* which is an extension of P. Technically, setting P* = P ∪ (21, 12) ∪ (20, 12) defines P* as the transitive closure \(\overline{P\cup (21,12)}\) of P ∪ (21, 12), that is, as the smallest extension of P comprising 21 ⊴ 12. The extension is depicted in Fig. 4.7b. It is important to realize that there are other extensions of P comprising 21 ⊴ 12, but each of them would also comprise comparabilities not directly implied by it. Therefore, choosing an extension different from the transitive closure would be arbitrary.

The procedure outlined in the example can be generalized, allowing more comparabilities to be added at a time. If C is the set of new comparabilities to be added, it is in fact sufficient to consider P ∪ C and to compute the transitive closure \(\overline{P\cup C}\), so as to get the desired extension (see Footnote 5).
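As a small check of Example 3, the sketch below builds the profile poset of two three-grade variables as a set of pairs, adds C = {(21, 12)} and computes the transitive closure by repeatedly composing the relation with itself. The pair encoding and the function name are illustrative assumptions, not the routines used by the authors.

```python
from itertools import product

def transitive_closure(rel):
    """Smallest transitive relation containing rel; (x, y) means x is below y."""
    closure = set(rel)
    while True:
        new_pairs = {(x, z) for (x, y) in closure
                     for (y2, z) in closure if y == y2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Profiles "00", "01", ..., "22"; single digit characters compare like digits.
profiles = ["".join(map(str, p)) for p in product(range(3), repeat=2)]
P = {(u, w) for u in profiles for w in profiles
     if all(a <= b for a, b in zip(u, w))}

C = {("21", "12")}            # 12 is regarded as more deprived than 21
P_star = transitive_closure(P | C)
added = P_star - P
print(sorted(added))          # [('20', '12'), ('21', '12')]
```

The only comparability forced by transitivity is 20 ⊴ 12, in agreement with the example.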

Fig. 4.7 Addition of comparabilities to a poset (a) and Hasse diagram of the transitive closure (b)

After extending the profile poset to the transitive closure, the evaluation process may proceed as before, with the selection of the benchmarks and the computation of the scores. As P is turned into P*, the number of incomparabilities decreases and the partial order structure changes. Correspondingly, the number of linear extensions decreases, modifying the ranking distribution of the profiles and the evaluation scores of the statistical units.

Example 1 (continuation).

Let us consider the deprivation poset introduced in the previous sections. Suppose that difficulty in making ends meet is considered more relevant than seldom receiving both food and money donations. Correspondingly, profile 100 is ranked as more deprived than profile 011, and 011 ⊴ 100 is added to the poset P. By transitivity, 010 ⊴ 100 and 001 ⊴ 100 are also added to P, so as to get the transitive closure \({P}^{\ast }=\overline{P\cup (011 \trianglelefteq 100)}\), whose Hasse diagram is depicted in Fig. 4.8. As can be seen, the symmetric structure of the original profile poset is broken by the addition of the new comparabilities, and the evaluation function, given the same thresholds, is slightly more polarized, being steeper in the left part (Table 4.2 and Fig. 4.9).

Table 4.2 Evaluation function η(s | 121, 112;  011) for the extended deprivation poset

Remark.

In real cases, posets consist of more profiles than those discussed in these examples. In such cases, transitive closures cannot be computed by inspection, as done in this chapter. However, the computations involved in the weighting procedure are easily accomplished by drawing on the matrix representation of a poset. Let P = (X, ⊴) be a poset over a set of n profiles \(s_1, \dots, s_n\). We associate with P an n × n binary matrix Z, defined by \(Z_{ij} = 1\) if \(s_i \trianglelefteq s_j\) and \(Z_{ij} = 0\) otherwise. When a new comparability \(s_h \trianglelefteq s_k\) is added to P, \(Z_{hk}\) is set to 1. If C is the set of new comparabilities added to P, and \(\widehat{Z}\) is the matrix corresponding to P ∪ C, the matrix \(Z^{\ast}\) associated with the transitive closure \({P}^{\ast }=\overline{P\cup C}\) is obtained as

$${Z}^{\ast }=Bin\left(\sum _{\ell =0}^{n-1}{\widehat{Z}}^{\ell }\right),$$
(4.13)

where Bin(·) is the operator that sets to 1 all the non-null elements of its argument (Patil and Taillie 2004).
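Formula (4.13) translates almost literally into code. The sketch below uses plain Python lists rather than a matrix library, and applies the formula to a hypothetical poset of three binary variables extended with the single comparability 011 ⊴ 100; the helper names are assumptions made for illustration.

```python
from itertools import product

def mat_mult(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def closure_matrix(Z_hat):
    """Z* = Bin(sum of Z_hat**l for l = 0, ..., n-1), following Eq. (4.13)."""
    n = len(Z_hat)
    power = [[int(i == j) for j in range(n)] for i in range(n)]  # Z_hat**0
    total = [row[:] for row in power]
    for _ in range(1, n):
        power = mat_mult(power, Z_hat)
        total = [[total[i][j] + power[i][j] for j in range(n)]
                 for i in range(n)]
    return [[int(x != 0) for x in row] for row in total]          # Bin(.)

profiles = ["".join(map(str, p)) for p in product(range(2), repeat=3)]
idx = {s: i for i, s in enumerate(profiles)}
Z = [[int(all(a <= b for a, b in zip(s, t))) for t in profiles]
     for s in profiles]

Z_hat = [row[:] for row in Z]
Z_hat[idx["011"]][idx["100"]] = 1      # add the comparability 011 below 100
Z_star = closure_matrix(Z_hat)

new_pairs = sorted((s, t) for s in profiles for t in profiles
                   if Z_star[idx[s]][idx[t]] and not Z[idx[s]][idx[t]])
```

Here `new_pairs` contains the added comparability together with the six further ones implied by transitivity. For large n, repeated squaring or a Warshall-type algorithm would be cheaper than summing n matrix powers, but the version above stays closest to the formula.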

Remark.

The weighting procedure described above is based on a subjective judgment pertaining to the ordering of incomparable profiles. The identification of the comparabilities to be added to the profile poset should be based on some kind of socio-economic analysis; nevertheless, it necessarily involves value judgments and includes individuals’ contribution in attributing importance to different domains. Such subjectivity should not be an issue: it is in fact the responsibility of the decision-maker to take a stand on these aspects, turning a “pre-policy” profile poset into a “policy-oriented” profile poset, useful as an evaluation tool.

Fig. 4.8 Extension of the deprivation poset and corresponding sets D and W (black nodes)

Fig. 4.9 Graph of η(s | 121, 112; 011) for the extended deprivation poset (deprivation profiles are listed on the x axis according to increasing deprivation scores)

Checking the incomparabilities of a partial order to decide whether and how to extend it is not an easy task, particularly when the number of profiles is large. Moreover, socio-economic scientists are likely to tackle the weighting problem at the evaluation variable level, rather than at the profile level. Therefore, it would be desirable to have a procedure capable of extending the profile poset P directly, based on an assessment of the relative importance of the evaluation dimensions. In the following section, we build such a procedure in the fundamental case of binary data.

6 Binary Variables

In many socio-economic studies, the evaluation dimensions have a simple binary form. Typical examples can be found in surveys about material deprivation, where information pertaining to the ownership of goods (e.g. car, telephone, television) is collected as dichotomous data. All the concepts and tools described in the previous paragraphs apply to binary variables as well, but the simpler structure of the binary case makes it possible to further develop the methodology, particularly concerning the weighting problem.

6.1 The Structure of the Hasse Diagram for Binary Data

When the k variables \(v_1, \dots, v_k\) are binary, the set X of possible profiles has cardinality \(2^k\) and becomes a poset P = (X, ⊴) under the usual order relation defined by

$$u \trianglelefteq w\iff {v}_{i}(u)\le {v}_{i}(w),\forall i=1,\dots,k.$$
(4.14)

The Hasse diagram of P has a simple and symmetrical structure since each level of the graph comprises profiles with the same number of 0s and 1s, so that two profiles u and w share the same level in the diagram if and only if one is a permutation of the other.
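The level structure can be made concrete with a short sketch for k = 3 (a value chosen only for illustration). Profiles are grouped by their number of 1s, and the edges of the Hasse diagram are the pairs of comparable profiles that differ in exactly one variable.

```python
from collections import defaultdict
from itertools import product

k = 3
profiles = ["".join(map(str, p)) for p in product(range(2), repeat=k)]

def below(u, w):
    """Order relation (4.14): u is below w if it is below on every variable."""
    return all(a <= b for a, b in zip(u, w))

# Level of a profile = number of 1s it contains.
levels = defaultdict(list)
for s in profiles:
    levels[s.count("1")].append(s)

# Edges of the Hasse diagram: comparable profiles differing in one variable.
covers = [(u, w) for u in profiles for w in profiles
          if below(u, w) and sum(a != b for a, b in zip(u, w)) == 1]

for level in sorted(levels):
    print(level, levels[level])
```

Within each level, every profile is a permutation of the others, and no two distinct profiles on the same level are comparable.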

Example 4 (Binary material deprivation).

Let us consider the following set of five deprivation variables, from the EU-SILC survey for Italy:

  1. HS160 – Problems with the dwelling: too dark, not enough light
  2. HS170 – Noise from neighbours or from the street
  3. HS180 – Pollution, grime or other environmental problems
  4. HS190 – Crime, violence or vandalism in the area
  5. UMID – Dampness in walls, floor, ceiling or foundations

All five variables are coded in a binary form: 0 if the household does not report the issue and 1 if it does. The set X comprises \(2^5 = 32\) profiles. The Hasse diagram of the profile poset P is depicted in Fig. 4.10. For future reference, we have computed the degree of deprivation of each profile, when the superior (deprivation) threshold is set to \(\underline{d}=(01110,11001)\) and the inferior (non-deprivation) threshold to \(\underline{w}=01000\). The result is reported in Table 4.3 and depicted in Fig. 4.11.

Table 4.3 Evaluation function η(s | 01110, 11001;  01000) for the deprivation poset built on five binary variables

Clearly, the variables considered in the example have different relevance. For instance, dampness in the house may be considered as less relevant than living in an area affected by crime or pollution, and the profile poset should be extended accordingly. In the next paragraph, we show how this extension can be accomplished, directly based on the existence of (partial) hierarchies among binary evaluation dimensions.

Fig. 4.10 Hasse diagram of deprivation profiles, built on five binary deprivation variables

Fig. 4.11 Graph of η(s | 01110, 11001; 01000) for the deprivation poset built on five binary variables (deprivation profiles are listed on the x axis according to increasing deprivation scores)

6.2 The Weighting Procedure in the Binary Case: The Connection Rule

Let \(V = \{v_1, \dots, v_k\}\) be the set of binary evaluation variables. The set V can be turned into a poset Π = (V, ≺) (in the following, called the relevance poset), defining the strict partial order ≺ through

$${v}_{j}\prec {v}_{i}\iff {v}_{j}\ \text{is less relevant than}\ {v}_{i}$$
(4.15)

Given the relevance poset, the next step is to define a way to link its structure to the extension of the profile poset. This is done introducing the following connection rule that we present in three steps.

Step 1.:

For \(v_i \in V\), let us consider the set \(L_i\), defined as

$${L}_{i}=\{{v}_{j}\in V:{v}_{j}\prec {v}_{i}\},$$
(4.16)

that is, the set of variables less relevant than \(v_i\) (in the following, \(v_i\) is called the pivot). An incomparable pair s ∥ t such that

  1. \(v_i(s) < v_i(t)\)
  2. \(v_j(t) < v_j(s)\), for \(v_j \in L_i\)
  3. \(v_j(s) = v_j(t)\) for \(v_j \ne v_i\), \(v_j \notin L_i\)

is turned into the comparability s ⊴ t and added to P. Let us denote by \(CR_{\Pi}(P, v_i)\) the set of comparabilities added to P when \(v_i\) is selected as the pivot. It can be easily checked that \(CR_{\Pi}(P, v_i)\) can be defined in a more compact way as

$$CR_{\Pi}(P,v_i)=\{(u,w)\in X^{2}: v_j(u)<v_j(w)\iff j=i \ \text{ and } \ v_j(w)<v_j(u)\Rightarrow v_j\in L_i\}.$$
Step 2.:

The procedure described in Step 1 is repeated, selecting each variable in Π as the pivot. This way, the set CR Π (P), comprising all of the new comparabilities directly implied by Π, is obtained as

$$CR_{\Pi}(P)=\bigcup_{i=1}^{k}CR_{\Pi}(P,v_i).$$
(4.17)
Step 3.:

The set \(CR_{\Pi}(P)\) is then added to P, getting \(P \cup CR_{\Pi}(P)\), that is, enriching the profile poset with the comparabilities derived from Π. In general, \(P \cup CR_{\Pi}(P)\) is not a partial order, since transitivity need not be fulfilled. Therefore, we finally compute the transitive closure

$$\overline{CR}_{\Pi}(P)=\overline{P\cup CR_{\Pi}(P)}.$$
(4.18)

The set \(\overline{CR}_{\Pi}(P)\) is the desired extension of the original profile poset P and includes all the comparabilities comprised in \(CR_{\Pi}(P)\) and all those implied by transitivity.

We are now in the position to state the connection rule formally.

Definition 3.

Connection rule. Let Π = (V, ≺) be the relevance poset over a set \(V = \{v_1, \dots, v_k\}\) of binary variables and let P = (X, ⊴) be the original profile poset. Then the information contained in Π is added to P by extending P to \(\overline{CR}_{\Pi}(P)\).

We now give some examples to show how the connection rule works in practice.

Example 5.

Consider a set of three binary variables \(v_1\), \(v_2\) and \(v_3\), and consider the corresponding profile poset P, whose Hasse diagram is reported in Fig. 4.12a. Suppose that variable \(v_1\) is considered more relevant than variable \(v_2\) (in symbols, \(v_2 \prec v_1\)), while no criterion to order \(v_3\) with respect to \(v_1\) or \(v_2\) is provided. Quite naturally, any incomparability in P due to a “disagreement” between \(v_1\) and \(v_2\) only can be turned into a comparability, since \(v_1\) “prevails” over \(v_2\). Explicitly, the incomparabilities 101 ∥ 011 and 100 ∥ 010 turn into 011 ⊴ 101 and 010 ⊴ 100, respectively. Adding these new comparabilities to P and taking the transitive closure, a new poset P* is produced, as the smallest extension of P consistent with the additional information conveyed by \(v_2 \prec v_1\) (Fig. 4.12b).

Example 6.

Example 5 can be easily generalized considering, for instance, \(v_2 \prec v_1\) and \(v_3 \prec v_1\). In this case, the incomparability 100 ∥ 011 is turned into 011 ⊴ 100, since \(v_1\) “prevails” over both \(v_2\) and \(v_3\). Adding this comparability to P and taking the transitive closure, six other comparabilities are added to P, namely, 011 ⊴ 110, 011 ⊴ 101, 010 ⊴ 100, 001 ⊴ 100, 001 ⊴ 110 and 010 ⊴ 101. The Hasse diagrams of the profile poset and the resulting extension P* are depicted in Fig. 4.13.

Figure 4.14 reproduces the Hasse diagrams for the relevance posets Π 1 and Π 2, implicitly defined in Examples 5 and 6.

Fig. 4.12 Hasse diagrams of P and its extension P*, when \(v_2 \prec v_1\)

Fig. 4.13 Hasse diagrams of P and its extension P*, when both \(v_2 \prec v_1\) and \(v_3 \prec v_1\)

Fig. 4.14 Hasse diagrams of the relevance posets \(\Pi_1\) and \(\Pi_2\)

Example 7.

Let \(V = \{v_1, v_2, v_3\}\) be the set of three binary variables of Examples 5 and 6. In addition to the posets of Fig. 4.14, three other posets can be defined on V (apart from label permutations), namely, the antichain \(\Pi_3 = (v_1 \parallel v_2 \parallel v_3)\), the poset \(\Pi_4\) given by \((v_3 \prec v_1, v_3 \prec v_2)\) and the chain \(\Pi_5 = (v_3 \prec v_2 \prec v_1)\). The corresponding extensions \(\overline{CR}_{\Pi}(P)\) for each case are represented in Fig. 4.15.

As these examples make clear, when the relevance poset is an antichain (i.e. when no information on the relative importance of the evaluation variables is available), the transformation \(\overline{CR}_{\Pi}(\cdot)\) has no effect and leaves the profile poset unchanged. Conversely, when Π is a chain (i.e. when the evaluation variables are ranked in a complete hierarchy), \(\overline{CR}_{\Pi}(\cdot)\) transforms P into a linear order.
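The connection rule and the examples above can be reproduced with a short sketch. It implements the compact definition of the set of comparabilities added for each pivot, followed by the transitive closure of Step 3. The function names, the 0-based indexing of the variables and the encoding of the relevance poset as a set of pairs (j, i), meaning that variable j is less relevant than variable i, are assumptions made for illustration; the relevance relation is assumed to be passed already transitively closed.

```python
from itertools import product

def transitive_closure(rel):
    closure = set(rel)
    while True:
        new_pairs = {(x, z) for (x, y) in closure
                     for (y2, z) in closure if y == y2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

def connection_rule(k, relevance):
    """Return (P, extension) for the binary profile poset on k variables."""
    X = list(product(range(2), repeat=k))
    P = {(u, w) for u in X for w in X if all(a <= b for a, b in zip(u, w))}
    added = set()
    for i in range(k):                                   # pivot v_i
        L_i = {j for (j, p) in relevance if p == i}      # less relevant than v_i
        for u, w in product(X, repeat=2):
            # u is strictly below w on the pivot and on no other variable ...
            up_only_on_pivot = all((u[j] < w[j]) == (j == i) for j in range(k))
            # ... and w is strictly below u only on variables in L_i.
            down_only_in_L = all(j in L_i for j in range(k) if w[j] < u[j])
            if up_only_on_pivot and down_only_in_L:
                added.add((u, w))
    return P, transitive_closure(P | added)

def show(pairs):
    return sorted(("".join(map(str, u)), "".join(map(str, w)))
                  for u, w in pairs)

P, ext5 = connection_rule(3, {(1, 0)})                   # Example 5
_, ext6 = connection_rule(3, {(1, 0), (2, 0)})           # Example 6
_, ext_antichain = connection_rule(3, set())             # antichain
_, ext_chain = connection_rule(3, {(2, 1), (1, 0), (2, 0)})  # chain

print(show(ext5 - P))   # [('010', '100'), ('010', '101'), ('011', '101')]
print(len(ext6 - P))    # 7
```

In this sketch the antichain leaves P unchanged, while the chain yields a relation with 36 pairs, that is, a linear order on the eight profiles.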

Remark.

Comparing the transformations \(\overline{CR}_{\Pi_i}(\cdot)\) described in Examples 5–7, it can be directly checked that if \(\Pi_1 \subseteq \Pi_2\), then \(\overline{CR}_{\Pi_1}(P)\subseteq \overline{CR}_{\Pi_2}(P)\). Since any poset Π can be extended to a linear order, from the discussion above it follows that \(\overline{CR}_{\Pi}(P)\) is always comprised in some linear extension of P. This ensures that, when applying the connection rule, no loops are accidentally introduced in the profile poset.

We end this section applying the connection rule to the data of Example 4.

Example 4 (continuation).

Suppose that the five evaluation variables are (partially) ordered according to the poset Π depicted in Fig. 4.16.

The relevance poset comprises two levels, and all the variables in the upper level dominate each variable in the lower level (see Footnote 6). An application of the connection rule directly gives the extension presented in Fig. 4.17. As can be directly seen, the extended poset P* has far fewer incomparabilities than the original profile poset. In particular, it is worth noticing that, in P*, 11001 ⊴ 01110, so that the deprivation threshold reduces to just a single profile, namely, 11001.

Fig. 4.15 Extensions of P through \(\Pi_3\), \(\Pi_4\) and \(\Pi_5\)

Fig. 4.16 Relevance poset for the deprivation example

Fig. 4.17 Extension of the binary deprivation poset

Table 4.4 reports the deprivation scores computed on the extended poset. As can be easily checked (Fig. 4.18), the scores are more polarized towards the extreme values 0 and 1 than in the case of the original profile poset. As expected, the added information has reduced the ambiguity of the original partial order, resulting in a much steeper evaluation function.

Fig. 4.18 Graph of η(s | 01110, 11001; 01000) for the extended deprivation poset, built on five binary variables (deprivation profiles are listed on the x axis according to increasing deprivation scores)

7 Conclusions and Perspectives

In this chapter, we have introduced a new methodology for evaluation purposes in multidimensional systems of ordinal data. The methodology is based on a benchmark approach and draws upon poset theory, so as to overcome the conceptual and computational drawbacks of the standard aggregative procedures, which involve composite indicators. Poset tools make it possible to describe and exploit the relational structure of the data, so as to compute evaluation scores in purely ordinal terms, avoiding any aggregation of variables. The effectiveness of the partial order approach is particularly evident in the way the “weighting” problem is addressed and solved. Exogenous information pertaining to the relevance of the evaluation dimensions is in fact taken into account by modifying the structure of the profile poset, through the transitive closure device, avoiding the introduction of numerical weights in the computations. Although simplified, the examples discussed in this chapter show how the methodology can be applied in practice and to real datasets. The software routines needed for the computations can also be easily implemented in standard programming languages. Like any novel proposal, our methodology can be improved in many respects and extended in many directions, at both the theoretical and the applied level. These are interesting avenues for future research.

Table 4.4 Evaluation function η(s | 01110, 11001; 01000) for the extended deprivation poset, built on five binary variables