
2.1 Data Matrix

Suppose there are five objects. Think of them as sediment samples a, b, c, d, e which we would like to rank. The first and main question is: What is the aim of ranking? We can rank the five sediment samples according to their age, or according to their content of a mineral, etc. If we know the aim of ranking, we need to identify properties that are relevant. In the case of ranking according to their age, it may be simple. Just order the samples according to their age!

In other cases, it may not be as simple. If, for example, the hazard for humans is of concern, then how do we define the hazard caused by sediments to humans? One way of doing this is to determine properties like acute toxicity, or hygienic aspects, or potential carcinogenicity. Even hygienic aspects have several facets which need to be considered. Thus we come up with some properties, say \({q_1}\), \({q_2}\), \({q_3}\), which define the columns of a data matrix, whereas its rows represent the objects.

The next question we are concerned with is the orientation. Do all the properties \({q_1}\), \({q_2}\), and \({q_3}\) contribute to the aim of the ranking in the same way? This means: Is an increasing value for each property associated with an increasing hazard? For example, toxicity \({q_1}\) is measured as the concentration at which a well-defined adverse effect can be observed for a fraction of the test species. A large value of this (acute) toxicity measure is less hazardous than a low value. On the other hand, the hygienic aspect \({q_2}\) is measured by the number of fecal coliforms. Here a large number of fecal coliforms is of more hygienic concern than a low number. The two properties of a sediment sample are not similarly oriented. Therefore we must transform the properties so that they have a common monotonicity with the aim. Without knowing the aim and without checking the correct orientation of the attributes, a partial order analysis is meaningless.

The question of orientation is closely connected with another problem which is more a matter of convention: Should large values of (transformed) properties always mean “bad” or should they mean “good”? Here we do not follow a general rule; the important point is that the same orientation is considered for all properties for ranking. The consequence is that for every study, the kind of orientation and its meaning in terms of “good” and “bad” must be explicitly given.
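A minimal sketch of such an orientation step may help. It assumes that a property pointing the wrong way is simply multiplied by −1 (any strictly decreasing transformation would serve equally well in an ordinal analysis). The sample names, the values, and the helper `orient` are hypothetical and only illustrate the idea.

```python
# Hypothetical raw data: (q1 = toxicity as effect concentration, q2 = fecal coliform count).
raw = {
    "s1": (12.0, 300.0),
    "s2": (3.5, 900.0),
}

# Assumed convention for this sketch: after orientation, a large value means "bad".
# q1: a large effect concentration is LESS hazardous, so q1 has to be reversed.
# q2: a large coliform count is MORE hazardous, so q2 is kept as it is.
higher_is_worse = (False, True)


def orient(row, higher_is_worse):
    """Return the row with every property oriented so that larger means worse."""
    return tuple(v if keep else -v for v, keep in zip(row, higher_is_worse))


oriented = {name: orient(row, higher_is_worse) for name, row in raw.items()}
print(oriented)
```

Whatever convention is chosen, it should be recorded together with the data, as is done here with the tuple `higher_is_worse`.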

Having clarified the two basic questions, we can begin to compare each object with another. The comparison has to be based on the data matrix. So, let us assume that sediment sample a has values (2.0, 7.3, 1.0) and sediment sample b has values (3.1, 8.4, 1.5). Let us furthermore think of “high values” indicating a bad state. Then examining the three properties of sediment samples a and b, we conclude that sediment sample b is worse than sediment sample a, because all values of sample b are simultaneously larger than those of sample a. Let us now consider sediment sample c: (4.2, 8.1, 2.3). We see that c is worse than a but c cannot be compared with b because two properties favor b (q 1 and q 3) but the property q 2 favors c.

Adding two more sediment samples d (1.7, 2.6, 0.1) and e (5.8, 12.3, 3.7), we see that d is better than a, a is better than b, and hence d is better than b. There are still some comparisons to be performed, and the reader should realize that even for small data matrices with only a few properties, deciding which object is better (or worse) than another is not difficult but tedious. What we have established among the objects is a partial order, because not every pair of objects can be ordered.
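The pairwise check described above can be sketched in a few lines of Python. The data are those of samples a to e; the function name `favors` is our own choice for this illustration and not part of any library.

```python
# Data rows (q1, q2, q3) of the five sediment samples; high values indicate a bad state.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}


def favors(x, y):
    """Return the 1-based indices of the properties in which x has the smaller (better) value."""
    return [i + 1 for i, (xi, yi) in enumerate(zip(data[x], data[y])) if xi < yi]


for x, y in [("a", "b"), ("b", "c")]:
    print(f"{x} vs {y}: properties favoring {x}: {favors(x, y)}, favoring {y}: {favors(y, x)}")
```

If only one of the two lists is non-empty, the objects are comparable; two non-empty lists signal a conflict such as the one between b and c.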

In the next section, we will explain the partial order more thoroughly and introduce some useful notation.

2.2 Characteristics of Partial Order

2.2.1 Axioms

Let us suppose that an “object set” X (in technical terms also called a “ground set”) consists of our objects of interest. Suppose that X is a finite set (we do not mention it further). In our example above, X consists of objects a, b, c, d, e. We also write \(X = \{ a,b,c,d,e\}\). Furthermore, recall that we wish to compare objects of the object set. Therefore we use the symbol ≤ as a binary relation among the objects. The role of this relation is fixed by the following axioms:

$${\textrm{Axiom }}1: \quad {\textrm{Reflexivity:}} \quad x \le x\;{\textrm{for all }}x \in X$$
(2.1a)
$${\textrm{Axiom }}2: \quad {\textrm{Anti-symmetry:}} \quad x \le y\;{\textrm{and }}y \le x\;{\textrm{implies }}x = y$$
(2.1b)
$${\textrm{Axiom }}3: \quad {\textrm{Transitivity:}} \quad x \le y\;{\textrm{and }}y \le z\;{\textrm{implies }}x \le z$$
(2.1c)

Reflexivity: An object can be compared with itself.

Anti-symmetry: If both comparisons are valid, i.e., x ≤ y and at the same time y ≤ x, then this axiom requires that x is identical with y. Later we will see that this requirement is very restrictive.

Transitivity: Transitivity is present if the objects are characterized by properties which are at least ordinal scaled. Any measurable quantity like height, length, and price implicitly bears the transitivity. There are also properties, like “color,” where the meaning for ordering is unclear. If color is just a category like “red,” “blue” by which objects can be labeled, then color is not a property relevant for ranking. If, however, color is given an order like red ≤ green ≤ blue, then objects can be ordered. It is a question of design of the matrix, availability of this kind of information, and use for the ranking aim.
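For a finite object set, the three axioms can be checked by brute force. The following sketch does so for the component-wise comparison of the five sediment samples of Section 2.1; the helper name `leq` is an assumption made for this example.

```python
from itertools import product

# Data rows (q1, q2, q3) of the five sediment samples of Section 2.1.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}
X = list(data)


def leq(x, y):
    """x <= y: no property of x exceeds the corresponding property of y."""
    return all(xi <= yi for xi, yi in zip(data[x], data[y]))


# Axiom 1: every object is comparable with itself.
reflexive = all(leq(x, x) for x in X)
# Axiom 2: x <= y and y <= x is only possible for x = y.
antisymmetric = all(x == y for x, y in product(X, X) if leq(x, y) and leq(y, x))
# Axiom 3: x <= y and y <= z implies x <= z.
transitive = all(leq(x, z) for x, y, z in product(X, X, X) if leq(x, y) and leq(y, z))

print(reflexive, antisymmetric, transitive)
```

Anti-symmetry holds here only because no two samples share the same data row; with ties it would fail, which is the restriction taken up in the next section.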

2.2.2 Quotient and Object Sets

In applications, it is convenient to relax slightly the requirements concerning partial order. Several objects may have the same numerical values but are certainly different individuals (ties). So we consider the objects as equivalent, expressing that they have identical rows in the data matrix, but must nevertheless be considered as different items. These objects form an equivalence class and one may take one object out of the equivalence class and let it represent all the others. In such cases we proceed as follows (Patil and Taillie, 2004): We consider only one of the objects of any equivalence class as a representative and perform all operations which can be done in partial order theory. We keep in our memory, or in the computer memory, all the other objects being represented. We insert them whenever needed.

To make a clear distinction:

  • The set of equivalence classes under an equivalence relation ℜ is called the quotient set, denoted, e.g., by X/ℜ. From any equivalence class, one element is selected as representative. (2.2a)

  • When, however, all objects, even the equivalent ones, are to be taken into consideration, then we speak of the object set. (2.2b)
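The bookkeeping of representatives can be sketched as follows, assuming that two objects are equivalent exactly when their data rows are identical. The object names and values are hypothetical, and taking the alphabetically first member as representative is an arbitrary choice made for this illustration.

```python
from collections import defaultdict

# Hypothetical data matrix with ties: a and f share a row, as do b and g.
data = {
    "a": (2.0, 7.3),
    "b": (3.1, 8.4),
    "f": (2.0, 7.3),
    "g": (3.1, 8.4),
    "h": (5.0, 1.0),
}

# Group the objects by their data row; each group is one equivalence class.
groups = defaultdict(list)
for name, row in data.items():
    groups[row].append(name)

equivalence_classes = [sorted(members) for members in groups.values()]

# Quotient set: one representative per class (here simply the first member).
representatives = [members[0] for members in equivalence_classes]

# Remember which objects each representative stands for, so they can be re-inserted later.
represented = {members[0]: members[1:] for members in equivalence_classes}

print(equivalence_classes)
print(representatives)
print(represented)
```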

2.3 Product Order

2.3.1 Notation

How do we arrive at a partial order if a data matrix is at hand?

Let x, y be two different objects of the object set X. Let Q be the space of measurements (of different scaling levels). If, for instance, data are continuous in concept, then \(Q \subset {\mathbb{R}^m}\) (the m-dimensional space of real numbers). Let q(x) be the data row for x and q(y) the data row for y, i.e., \(q(x), q(y) \in Q\). We say

$$\begin{array}{c} x \le y\;{\textrm{if and only if }}q(x) \le q(y), \\ q(x) \le q(y)\;{\textrm{if and only if }}{q_i}(x) \le {q_i}(y)\;{\textrm{for all }}i \\ \end{array}$$
(2.3)

The space of measurements, Q, having the order relation property allows us to define order relations of the object set.

If x, y are different objects but \(q(x) = q(y)\), i.e., \({q_i}(x) = {q_i}(y)\), for all i, then the objects x and y are called equivalent and the equivalence relation in (2.2a) is the equality. Equivalence is denoted as

$$x \cong y$$
(2.4)

If we want to exclude equivalence, then we also write

$$x < y$$
(2.5)

Consequently

$$\begin{array}{l} x < y\;{\textrm{if and only if }}q(x) \le q(y) \\ {\textrm{and there is at least one }}{q_{i^*}}\;{\textrm{for which }}{q_{i^*}}(x) < {q_{i^*}}(y)\;{\textrm{is valid}}. \\ \end{array}$$
(2.6)

Sometimes it is necessary to specify the attribute set on which the ≤ or < relation is based. In that case, ≤ or < gets an appropriate subscript, e.g., \({ \le _{\{ {q_1},\ {q_2}\} }}\) or \({ \le _{\{ {q_1},\ {q_3}\} }}\) indicates that different partial orders are considered, one with the attributes \({q_1}\) and \({q_2}\), and the other one with \({q_1}\) and \({q_3}\).

The order among the objects based on Eqs. (2.3) and (2.6) is called “product order” or “component-wise order.” Product order is our method to obtain a partial order from a data matrix and the focus of the monograph is on the partial order analysis (PoA) of data matrices. With \(x{\ < _{\{ {q_i},\ {q_j}\} }}\ y\) or \(x\left.\right\| {_{\{ {q_i},\ {q_j}\} }} y\) (for \(\left.\right\|\), see below) we indicate that the relation between x and y is based on a certain subset of attributes.

In our example above, the condition (2.6) cannot be established for the sediment samples b and c. It is convenient to express this fact by \(\left. b \right\|c\). The symbol \(\left.\right\|\) expresses that “b is incomparable to c” or that there is a conflict among the attribute values of b and c. When for the objects x, y it is valid that \(q(x) \le q(y)\) or \(q(x) \ge q(y)\), then x and y are comparable. If the comparability between two objects is to be indicated without defining the orientation, then we write \(x\ \bot\ y\).
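The four possible outcomes of a comparison (x < y, x > y, equivalence, incomparability) can be sketched as a small function. The name `relation`, its return strings, and the parameter `attrs` (zero-based column indices of the attribute subset) are assumptions of this illustration.

```python
# Data rows (q1, q2, q3) of the five sediment samples of Section 2.1.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}


def relation(x, y, attrs=(0, 1, 2)):
    """Classify the pair (x, y) using only the attribute columns listed in attrs."""
    qx = [data[x][i] for i in attrs]
    qy = [data[y][i] for i in attrs]
    le = all(u <= v for u, v in zip(qx, qy))  # q(x) <= q(y), Eq. (2.3)
    ge = all(u >= v for u, v in zip(qx, qy))  # q(x) >= q(y)
    if le and ge:
        return "equivalent"  # identical rows on the chosen attributes
    if le:
        return "<"           # Eq. (2.6)
    if ge:
        return ">"
    return "||"              # incomparable


print(relation("a", "b"))                 # all three attributes
print(relation("b", "c"))                 # conflict between q2 and the others
print(relation("b", "c", attrs=(0, 2)))   # only q1 and q3
```

The last call shows why the attribute subset matters: b and c are incomparable under all three attributes but comparable when only \({q_1}\) and \({q_3}\) are used.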

When the object set X is equipped with a partial order, meaning that the objects of X are related to each other by a relation which obeys the above-mentioned axioms, then we write \((X, \le )\). If no confusion is possible, we also use a bold symbol for the object set X to denote the corresponding partial order and add indices if necessary. For example, \({{\textbf{X}}_i} = ({X_i}, \le )\). An object set equipped with a partial order is often called a poset (partially ordered set). Our analysis is based on a data matrix, and we see from Eq. (2.3) that whether \(x\ \bot\ y\) or \(x\ \left.\right\|\ y\) holds depends on the attributes used. It is convenient to speak of an attribute set. Bruggemann et al. (1995) introduced the concept of an information base (IB), which is the set of attributes used in the data matrix. Therefore, we will write either \((X,\{ {q_1},{q_2}, \ldots \} )\) if it is important to refer to the attributes or (X, IB). As (X, IB) is the basis for an ordinal analysis of the data matrix, we introduce the following definition.

(X, IB) is the partial order based on Eq. (2.3), where

$${q_i} \in {\textrm{IB and }}x \in X$$
(2.6a)

\((X/_{\cong },\;{\textrm{IB}})\) is the partial order based on Eq. (2.6), where \({q_i} \in {\textrm{IB}}\) and we take from each equivalence class exactly one element x, the representative (see Eq. (2.2a)).

Thus \((X/_{\cong },\;{\textrm{IB}})\) is the partial order of representatives.
(2.6b)

Furthermore, we denote the number of elements of a set A as usual as |A|. The set A may be either X or IB or subsets of them. In the following, we use the term “attribute” when we are speaking of the columns of the data matrix without specifically referring to partial order, whereas we use the term “indicator” when their use for an ordinal analysis is to be stressed. As the focus of the monograph is the ordinal analysis, we will be using “attributes” and “indicators” interchangeably. The data rows are identified by the objects. Sometimes, if we stress that the objects belong to some set, we also speak of them as “elements of a set X.”

2.3.2 Example

In Section 2.1, we have an example, so we will fix the concepts and notation just with that (Table 2.1).

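Table 2.1 Illustrative example (values as given in Section 2.1)

Object   q1     q2     q3
a        2.0    7.3    1.0
b        3.1    8.4    1.5
c        4.2    8.1    2.3
d        1.7    2.6    0.1
e        5.8    12.3   3.7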

The object set is \(X = \{ a,b,c,d,e\}\). Let us introduce \({\textrm{IB}} = \{ {q_1},{q_2},{q_3}\}\). If we would like to know whether \(a \le b\), we have to check \({q_1}\), \({q_2}\), and \({q_3}\) for a and b. Generally, taking the whole set of objects, there are \(|X| \cdot (|X| - 1) \cdot |{\textrm{IB}}|/2\) single attribute comparisons, i.e., \(5 \cdot 4 \cdot 3/2 = 30\) in our example.

In our example, we find that

  • $$a < b,a < e,a < c,b < e,c < e,d < a,d < b,d < e,d < c$$

With this list, everything is said! However, it is convenient to explicitly state that \(b\left.\right\|c\).
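The list can be reproduced mechanically. The sketch below, with helper names of our own choosing, runs through all pairs of objects, applies Eqs. (2.3) and (2.6), and also counts the single attribute comparisons.

```python
from itertools import combinations

# Data rows (q1, q2, q3) of the illustrative example.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}
X = sorted(data)
n_attributes = 3


def leq(x, y):
    """Product order of Eq. (2.3)."""
    return all(xi <= yi for xi, yi in zip(data[x], data[y]))


# Number of single attribute comparisons: |X| * (|X| - 1) * |IB| / 2.
n_single = len(X) * (len(X) - 1) * n_attributes // 2

strict, incomparable = [], []
for x, y in combinations(X, 2):
    if leq(x, y) and not leq(y, x):
        strict.append((x, y))        # x < y
    elif leq(y, x) and not leq(x, y):
        strict.append((y, x))        # y < x
    elif not leq(x, y) and not leq(y, x):
        incomparable.append((x, y))  # x || y
    # The remaining case (identical rows, i.e. equivalence) does not occur in this example.

print(n_single)
print(strict)
print(incomparable)
```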

2.4 Some Basic Concepts

  1. Data matrices having the same rank matrix have the same partial order.

  2. If there is no incomparability, then we speak of a complete, total, or linear order. In the case of a complete order, the objects \(x \in X\) can be arranged in a sequence \({x_1} < {x_2} < \ldots < {x_n}\), i.e., a ranking is found.

  3. Chain: If a subset \(X' \subset X\) can be found such that all pairs \((x,y) \in X' \times X'\) are comparable, i.e., the subset is completely ordered, then this subset, together with the partial order relation, is a chain.

  4. When for a chain C, no element of X can be found to extend C, then C is called maximal. A maximal chain of greatest length is called a maximum chain.

  5. Weak order: Representative elements of equivalence classes are in a chain, but there are nontrivial equivalence classes.

  6. Antichain: If a subset \(X' \subset X\) can be found such that for no pair \((x,y) \in X' \times X'\) with \(x \ne y\) the relation \(x\ \bot\ y\) holds, then this subset, equipped with the partial order relation, is called an antichain.

  7. When for an antichain (AC), no element of X can be found by which AC can be extended, then AC is maximal. A maximal antichain with the greatest number of elements is called a maximum antichain.

  8. In finite data matrices, chains and antichains contain a finite number of objects. Therefore, we can speak of chains or antichains having a certain length, according to the number of elements they contain. Within a partial order in general, there can be several maximal chains and several maximal antichains.

  9. Height: The number of elements of the longest chain is called the height of the poset.

  10. Width: The number of elements of a maximum antichain is called the width of the poset.

  11. Maximal, minimal, greatest, least, isolated elements of a poset:

    • A maximal element \(x \in X\) is an element for which no \(y \in X\) with \(x < y\) can be found.

    • The set of maximal elements of (X, IB) is denoted as MAX(X, IB) or if no confusion is possible, we write simply MAX.

    • A minimal element \(x \in X\) is an element for which no \(y \in X\) with \(y < x\) can be found.

    • The set of minimal elements of (X, IB) is denoted as MIN(X, IB) or if no confusion is possible, we write simply MIN.

  12. Greatest/least element: If there is only one maximal/minimal element (in the quotient set), it is called the greatest/least element.

  13. Isolated element: An element \(x \in X\) which is at the same time a maximal and a minimal element is called an isolated element.

    Let us call an isolated element i; then \(i\ \left.\right\|\ x\) for all \(x \in X - \{ i\}\).

    The set of isolated elements of (X, IB) is denoted as ISO(X, IB) or if no confusion is possible, we write simply ISO.

  14. Proper maximal/minimal element: A maximal/minimal element \(x \in X\) which is not isolated.

  15. Cover relation: x is covered by y if \(x < y\) and there is no element \(z \in X\) for which \(x < z\) and \(z < y\). We write this as \(x \le :y\).
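Several of these concepts can be computed directly from the list of relations of Section 2.3.2. The following sketch uses brute force over all subsets, which is only reasonable for very small object sets; all function and variable names are our own and hypothetical.

```python
from itertools import combinations

X = ["a", "b", "c", "d", "e"]
# Strict relations x < y of the illustrative example (Section 2.3.2).
less = {("a", "b"), ("a", "e"), ("a", "c"), ("b", "e"), ("c", "e"),
        ("d", "a"), ("d", "b"), ("d", "e"), ("d", "c")}


def comparable(x, y):
    return (x, y) in less or (y, x) in less


# Maximal / minimal elements: nothing above / nothing below.
MAX = {x for x in X if not any((x, y) in less for y in X)}
MIN = {x for x in X if not any((y, x) in less for y in X)}
ISO = MAX & MIN  # isolated elements are maximal and minimal at the same time


def is_chain(subset):
    return all(comparable(x, y) for x, y in combinations(subset, 2))


def is_antichain(subset):
    return not any(comparable(x, y) for x, y in combinations(subset, 2))


# Brute force over all non-empty subsets (fine for five objects only).
subsets = [s for r in range(1, len(X) + 1) for s in combinations(X, r)]
height = max(len(s) for s in subsets if is_chain(s))
width = max(len(s) for s in subsets if is_antichain(s))

# Cover relation: x < y with no z strictly in between.
covers = {(x, y) for (x, y) in less
          if not any((x, z) in less and (z, y) in less for z in X)}

print(MAX, MIN, ISO, height, width)
print(sorted(covers))
```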

We may examine points 1–15 on the basis of the partial order list. However, it is far simpler to apply the graphical display of a partial order! Therefore this is introduced next.

2.5 Hasse Diagram

With the cover relation at hand, we can get a diagrammatic representation of the partially ordered set (poset).

Let us consider x and y, and assume that \(x \le :y\). Then we draw x in a vertical plane below y and connect both with a straight line. This is repeated for every ordered pair, i.e., for all pairs of objects for which the ≤: relation holds. The resulting diagram is called a Hasse diagram (sometimes partial order set diagram, order diagram, line diagram, or simply the diagram) after the German mathematician Hasse, who made this kind of visualization popular.

In our example, \(X = \{ a,b,c,d,e\}\) (Fig. 2.1).

Fig. 2.1 Hasse diagram based on data of Table 2.1

There are many remarks to be made:

  1. Differently drawn Hasse diagrams may nevertheless graphically represent the same partial order. In that case we speak of isomorphic Hasse diagrams.

  2. As the Hasse diagram gives an overview of the order relation in a very convenient way, it is very important to draw the Hasse diagram carefully. Aeschlimann and Schmid (1992) have given many recommendations. Nevertheless, there are many degrees of freedom in drawing a Hasse diagram.

  3. The objects are located vertically in the drawing plane in order to get them organized in “levels.” For example, object d forms the first level, object a the second, objects b and c the third, and finally object e the fourth level. If an object could be located in several vertical positions, the highest possible one is selected (see Fig. 2.2 for a demonstration).

  4. If avoidable, the lines should not cross each other in locations which are not those of objects (see Fig. 2.2 for a demonstration).

  5. There should be as few different slopes as possible for the single lines which represent the cover relations.

  6. Most software realizations locate the objects symmetrically. The next five items refer to Fig. 2.1.

  7. The relation \(d \le b\) can easily be deduced from the Hasse diagram by transitivity (\(d \le a\) and \(a \le b\)); therefore no line is drawn for \(d \le b\).

  8. There is one maximal element, namely the object e. There is one minimal element, namely the object d. Object d is the only minimal element; therefore object d is the least element, and similarly object e is the greatest element.

  9. A chain is, for example, \(d < a < b\). This is not a maximal chain, because we could add e.

  10. The set {b, c} is an example of an antichain. The width of the partial order is 2.

  11. The height of the poset is 4 (counting the objects d, a, b, e).
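The level assignment of remark 3 can be sketched from the cover relation. We assume here that “highest possible” means that an object is placed as close to the top as its upper neighbors allow, so that its level is determined by the longest chain above it; this reading and the helper `chain_above` are our assumptions, not a description of any particular software.

```python
from functools import lru_cache

X = ["a", "b", "c", "d", "e"]
# Cover relation (the lines of the Hasse diagram): (x, y) means x is covered by y.
covers = {("d", "a"), ("a", "b"), ("a", "c"), ("b", "e"), ("c", "e")}

# Upper neighbors of each object.
upper = {x: [y for (u, y) in covers if u == x] for x in X}


@lru_cache(maxsize=None)
def chain_above(x):
    """Number of elements of the longest chain that starts at x and runs upward."""
    return 1 + max((chain_above(y) for y in upper[x]), default=0)


height = max(chain_above(x) for x in X)

# Level 1 is the bottom level; maximal elements with nothing above sit on the top level.
level = {x: height - chain_above(x) + 1 for x in X}

print(height)
print(level)
```

For the diagram of Fig. 2.1 this reproduces the levels named in remark 3 and the height named in remark 11.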

Fig. 2.2 Drawing rules of Hasse diagrams

Figure 2.2 shows examples of “crossings” and how the level convention of remark 3 works.

In Fig. 2.2, the Hasse diagrams (1) and (2) on the one side and (3) and (4) on the other side are all correct from an order-theoretical point of view (within each pair they are isomorphic). However, in (1) there is an avoidable crossing, and (4) follows remark 3, whereas diagram (3) does not.

Sometimes it is convenient to refer to the “fence relation” and to a “dual” poset or “dual” Hasse diagram. An example may be sufficient for an explanation (Fig. 2.3).

Fig. 2.3 Fences and dual Hasse diagrams

At the top of Fig. 2.3, objects x and y are in a “fence relation.” Fences or “zigzag posets” are often denoted by F(n), according to the number of objects. Objects in a fence relation are connected in the ordinary graph theoretical sense, but they are not necessarily comparable. At the bottom, an example of duality between posets, i.e., between Hasse diagrams, is shown.

2.6 Components

Let us assume the object set \(X = \{ a,b,c,d\}\) and the Hasse diagram of its partial order in Fig. 2.4.

Fig. 2.4 Hasse diagram of (X, IB). \({\textrm{IB}} = \{ {q_1},{q_2}\}\), see table

A partially ordered set can be considered a directed graph (digraph) without cycles. We speak of a weak connection if its underlying graph is connected. In Fig. 2.4, objects c and d are weakly connected, because the underlying graph contains a sequence of edges (a single edge in this case) between c and d. The maximal weakly connected subsets, to which no outside object can be added, such as \((\{ a,b\} ,\;{\textrm{IB}})\) and \((\{ c,d\} ,\;{\textrm{IB}})\) in Fig. 2.4, we simply call components of the partially ordered set. Isolated elements can also be considered as components (“trivial components”). The appearance of components is exciting, because their presence indicates interesting data structures and a high sensitivity to any weights of a composite indicator (see Chapter 5). Components of partially ordered sets are maximal, because no element of X can be found to extend them. In Chapter 5, we call subsets of components with the inherited order relation “separated subsets.”
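Components can be found by a standard graph search on the underlying undirected graph. The sketch below assumes the cover relations a ≤: b and c ≤: d for the poset of Fig. 2.4; the direction of the two lines is our assumption, since only the connections matter for the components.

```python
X = ["a", "b", "c", "d"]
# Assumed cover relations for Fig. 2.4 (orientation chosen for illustration only).
covers = {("a", "b"), ("c", "d")}

# Underlying undirected graph: forget the direction of each line.
neighbors = {x: set() for x in X}
for x, y in covers:
    neighbors[x].add(y)
    neighbors[y].add(x)

components = []
seen = set()
for start in X:
    if start in seen:
        continue
    # Depth-first search collecting everything weakly connected to `start`.
    component, stack = set(), [start]
    while stack:
        v = stack.pop()
        if v in component:
            continue
        component.add(v)
        stack.extend(neighbors[v] - component)
    seen |= component
    components.append(sorted(component))

print(components)
```

An isolated element would show up here as a component with a single object, i.e., a trivial component.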

2.7 ζ Matrix and Other Representations of Partial Order

A convenient way to code a partial order is the ζ matrix: The rows and columns of this matrix are labeled with the object names. If \(a < b\), then the corresponding cell gets a 1, in all other cases a 0.

Let

$$x,y \in X,\;{\textrm{then }}\zeta (x,y) = 1: \Leftrightarrow x < y$$
(2.7)

For example, the partial order represented in Fig. 2.1 obtains the following ζ matrix:

$$\zeta = \begin{array}{c|ccccc} & a & b & c & d & e \\ \hline a & 0 & 1 & 1 & 0 & 1 \\ b & 0 & 0 & 0 & 0 & 1 \\ c & 0 & 0 & 0 & 0 & 1 \\ d & 1 & 1 & 1 & 0 & 1 \\ e & 0 & 0 & 0 & 0 & 0 \\ \end{array}$$

Note that transitive relations are coded by giving the corresponding cell a 1. So the entry, belonging to row d and column b, gets a 1. A variant of the matrix ζ is the cover matrix, which we will not use in this book. The main diagonal of the matrix ζ contains 1 if in Eq. (2.7) the ≤ relation is used.
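The ζ matrix of the example can be generated directly from the data rows of Section 2.1. This is a minimal sketch with plain nested lists; the helper name `less` is hypothetical.

```python
# Data rows (q1, q2, q3) of the illustrative example.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}
X = ["a", "b", "c", "d", "e"]  # row and column labels of the matrix


def less(x, y):
    """x < y according to Eq. (2.6): component-wise <= and rows not identical."""
    return all(xi <= yi for xi, yi in zip(data[x], data[y])) and data[x] != data[y]


# zeta[i][j] = 1 if X[i] < X[j], otherwise 0 (Eq. 2.7).
zeta = [[1 if less(x, y) else 0 for y in X] for x in X]

for label, row in zip(X, zeta):
    print(label, row)
```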

Another possibility to represent a partial order is to describe it as a subset of the set of ordered pairs \({X^2}\): Let \((x,y) \in {X^2}\) be an ordered pair, then

$$(x,y) \in (X,\;{\textrm{IB}}):\; \Leftrightarrow x < y$$
(2.8)

as defined in (2.6).

The set of ordered pairs consists of all pairs of comparable elements except the diagonal of \({X^2}\). Hence

$$|(X,\;{\textrm{IB}})| = \sum\limits_i {\sum\limits_j {\;{\zeta _{ij}}} } $$
(2.9)

The Hasse diagram (Fig. 2.1) would therefore get the following representation (suppressing the reflexivity relation):

$$(X,\;{\textrm{IB}}) = \{ (d,a),(d,b),(d,c),(d,e),(a,b),(a,c),(a,e),(b,e),(c,e)\}$$
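Both representations carry the same information. As a small check, the following sketch reads the ordered pairs off the ζ matrix given above and compares their number with the sum of the matrix entries, Eq. (2.9).

```python
X = ["a", "b", "c", "d", "e"]
# Zeta matrix of the example; rows and columns are labeled a, b, c, d, e.
zeta = [
    [0, 1, 1, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0],
]

# Every 1 in row x, column y stands for the ordered pair (x, y) with x < y.
pairs = {(X[i], X[j]) for i in range(len(X)) for j in range(len(X)) if zeta[i][j] == 1}

# Eq. (2.9): the number of ordered pairs equals the sum of all matrix entries.
n_pairs = sum(sum(row) for row in zeta)

print(sorted(pairs))
print(len(pairs), n_pairs)
```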

This kind of representation will be useful in Chapter 10 and is useful for programming purposes. Now we know a lot about posetic characteristics of the data matrix. But how does this help in our ranking problem? This we discuss in Chapter 3.

2.8 Summary and Commentary

The evaluation of a data matrix by partial order needs the following: (i) a ranking aim, (ii) orientation, and (iii) comparison of objects according to (i). The graphical display by a Hasse diagram offers an easy way to identify the concepts mentioned in Section 2.4. How far are they helpful for interpretation? Let us go back to the ideas discussed in Section 2.1: The analysis of a set X with respect to prioritization and ranking means (1) establishing a partial order, (2) clarifying how far equivalence classes (ties) appear and how to handle them (by analyzing the quotient set), (3) finding out the chains and antichains, and (4) finding the maximal, minimal, and isolated elements.

Maximal or minimal elements are priority elements, which most often are of special concern. Isolated elements can be considered as elements which are maximal and minimal elements simultaneously. In this respect they are also priority elements. However, as we will see later, isolated elements indicate peculiarities of the data matrix. Chains: Elements are in a chain if their attributes vary simultaneously, either (weakly) increasing or (weakly) decreasing. Often, there is a positive rank correlation among the attributes for the elements of a chain. We will later see that under a weighting scheme, i.e., a set of weights to construct a composite indicator, elements of a chain will keep their mutual order, whereas elements of an antichain can get very different positions in the final ranking.
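The last statement can be illustrated with the sediment example. The sketch assumes that the composite indicator is a weighted sum with non-negative weights; this aggregation form and the two weight vectors are our assumptions for the illustration.

```python
# Data rows (q1, q2, q3) of the illustrative example; high values indicate a bad state.
data = {
    "a": (2.0, 7.3, 1.0),
    "b": (3.1, 8.4, 1.5),
    "c": (4.2, 8.1, 2.3),
    "d": (1.7, 2.6, 0.1),
    "e": (5.8, 12.3, 3.7),
}


def composite(row, weights):
    """Weighted sum of the attribute values (assumed form of the composite indicator)."""
    return sum(w * v for w, v in zip(weights, row))


def ranking(weights):
    """Objects sorted from best (smallest composite value) to worst."""
    return sorted(data, key=lambda x: composite(data[x], weights))


r1 = ranking((0.8, 0.1, 0.1))  # emphasis on q1
r2 = ranking((0.1, 0.8, 0.1))  # emphasis on q2

print(r1)
print(r2)
```

In both rankings the chain d < a < b < e keeps its order, whereas the incomparable objects b and c swap their positions when the emphasis moves from \({q_1}\) to \({q_2}\).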