Keywords

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

1 Introduction

Order relationships are often used in chemistry and environmental sciences, as can be recognised in statements as “less reactive than”, “less polluting than”, etc. If objects are included in the above claims, then they turn into “x is less reactive than y”, “x is less polluting than y”, etc. Now, if “is less polluting (reactive) than” is symbolised by ≼,Footnote 1 then we obtain x ≼ y. The binary relation ≼ satisfies three properties (Trotter 1992), i.e. reflexivity, antisymmetry and transitivity, which means that x ≼ x, that if x ≼ y and y ≼ x , then x = y, and that if x ≼ y and y ≼ z, then x ≼ z, respectively, with x, y, z being objects to order. If the objects are grouped into a set X, then the couple (X,≼) is called a partially ordered set, abbreviated as poset (Trotter 1992). It is called “partially” because it might happen that not every couple of objects in X satisfies the relation ≼. In general, for any couple x, y ∈ X, one of these possibilities holds: (1) x ≼ y, y ≼ x; (2) x ⋠ y, y ⋠ x; in the first case we say that x and y are comparable and we write, in general, x ⊥ y and in the second case we state that x and y are incomparable (x ∥ y) (Brüggemann and Bartel 1999). For the particular case where x ≼ y and y ≼ x, we say that x and y are equivalent (x ∼ y). This kind of relation is only explored in Sect. 5.2.3. If every pair of objects of X is comparable, the poset is called a chain and ≼ a linear order (or total order). In contrast, an antichain is made of incomparable objects (Brüggemann and Bartel 1999). A poset may contain several chains in it. An incomparable object is a chain as well as an antichain; if the chains of a poset are collected, the chain containing more objects than all the other chains is called a maximum chain. Likewise, a maximum antichain is the antichain of a given poset having more objects than the other antichains (Trotter 1992).

The derived idea from the above generalisation is that it is not always possible to end up with a total order. A particular popular kind of total order is a ranking, where there is only one first, one second, one third, and so on. Hence, given a set of objects to order, it might occur (according to the ordering constraints) that some couples are equivalent and incomparable and others comparable. Brüggemann, Klein, and their co-workers have shown how these situations abound in daily life and in chemistry and environmental sciences (Brüggemann and Patil 2011; Restrepo et al. 2011).

Once (X,≼) is given, a graphical representation can be generated, namely the Hasse diagram (Halfon 2006; Neggers and Kim 1998). This diagram is a directed graph

 = (X,E) with (x,y) ∈ E iff x, y ∈ X, x ≼ y, and there exists no z ∈ X with x ≼ z ≼ y (symbolised as x ≼ : y and called a cover relation of y over x) (Trotter 1992); E is called the set of directed edges of.

By convention,

is drawn in the Euclidean plane whose horizontal/vertical coordinate system requires that the vertical coordinate of y ∈ X be larger than the one of x ∈ X iff x ≼ : y (Restrepo and Brüggemann 2008).

If X = {a,b,c,d,e} and the following order relations hold: a ≼ : b, c ≼ : b, c ≼ : d, b ≼ : e, and d ≼ : e, then the corresponding Hasse diagram is shown in Fig. 5.1a. For the ensuing discussion it is important to define linear extensions and intersection of posets. A linear extension of the poset (X,≼) is a total order obtained from the poset such that it preserves the order relations contained in (X,≼) (Trotter 1992). The corresponding linear extensions of the Hasse diagram in Fig. 5.1a are shown in Fig. 5.1b.

Fig. 5.1
figure 1

(a) Hasse diagram and (b) linear extensions of the poset ({a,b,c,d,e},{a ≼ : b, c ≼ : b, c ≼ : d, b ≼ : e, d ≼ : e})

In spite of the simplicity gained when drawing cover relations rather than comparabilities, Hasse diagrams may become rather entangled as the number of elements and cover relations grows. For practical reasons, mainly related to the readability of a Hasse diagram, it is important to develop indicators of such readability, which may be associated with the complexity of the diagram.Footnote 2 Here we discuss some approaches to treating posetic complexity and moot a novel complexity indicator, which is further generalised at the end of the chapter.

2 Complexity of Posets

Trotter and Bogart, in 1976, related the complexity of a poset to its dimension (Trotter and Bogart 1976). The dimension of a poset was defined by Dushnik and Miller in 1941 as the minimum number of linear extensions of the poset whose intersectionFootnote 3 yields the poset. Hence, for the poset (X,≼) shown in Fig. 5.1a, its dimension is 2 since (X,≼) = L 4 ∩ L 5. Trotter and Borgart gave an interesting justification of dimension as a posetic complexity indicator: “Suppose each of the finite number of observers expresses his individual opinion on the relative merits of a finite set of options by ranking the options in a linear order. A partial ordering on the options is obtained by ranking option x higher than option y when all observers have agreed that x is preferred to y. Conversely, the dimension of a partial order indicates the minimum number of observers necessary to produce the given partial order as a statement of those preferences on which the observers agree unanimously” (Trotter and Bogart 1976). Trotter and Borgat defined two additional complexity indicators, namely interval dimension and semi-order dimension, which are based on the original dimension definition and require the introduction of functions from the poset to the real numbers (Trotter and Bogart 1976). Another dimension complexity indicator is the greedy dimension (Kierstead and Trotter 1985), which follows the same dimension principle explained above but with restrictions on the kind of linear extensions to use, called greedy linear extensions. As explained by Trotter, greedy dimension arose as a solution to the scheduling problem known as the jump number problem: “An ordered set P Footnote 4 represents a set of tasks to be performed on a single processor. If x < y in P, then x must be performed before y. An admissible schedule is then a linear extension of P. Suppose that a set-up cost is paid for each pair x, y ∈ P with x incomparable to y in P and x and y occurring consecutively in the linear extension. Find a linear extension L of P which minimizes the number of consecutive pairs of L which are incomparable in P” (Kierstead and Trotter 1985). The minimum number of those greedy extensions able to reproduce the poset by their intersection is called the greedy dimension of the poset. Some other results on dimension theory are found in Kelly and Trotter (1982) and West (1985).

Although complexity associated with posetic dimension has generated a wealth of studies, its calculation is a difficult task, to the extent that it is an NP-hard problemFootnote 5 (Yannakakis 1982; Yáñez and Montero 1999).

Another posetic complexity indicator was developed in 2000 by Luther et al., who developed a heuristic indicator, defined as follows:

$$ \mathrm{J}=-\mathrm{D}\times'' C/'' X $$

where T(s) is given by

$$ T(s):= \left\{ \begin{array}{lll} 0 & 0\le s\le \beta,\\ \frac{s-\beta\;}{a-\beta} & \beta\le s\le \alpha,\\ \frac{1-s}{1-\alpha} & \alpha\le s,\kern0.36em \end{array}\right. $$

with

$$ s:=\frac{ number\; of\; incomparabilities}{ number\; of\; objects\times \left( number\; of\; objects-1\right)/2} $$

where α and β are steering parameters, which are set by the authors as α = 0.8 and β = 0.3. To illustrate this method, let us consider the unlabelled posets of a set of three objects,Footnote 6 the corresponding values, along with the posets, are depicted in Fig. 5.2.

Fig. 5.2
figure 2

Unlabelled posets of three objects and their respective values s and T(s) needed to calculate their complexity C according to Luther et al.’s method. Values p , p , and H are explained in Sect. 5.2.1

The complexity indicator C captures the intuitive idea that a maximum chain and a maximum antichain are not complex posets, while posets holding a mixture of comparabilities and incomparabilities are more complex. However, the method has a subjectivity, namely the setting of α and β.

In the current chapter, we develop a novel posetic complexity indicator following a Shannon’s entropy approach based on the number and distribution of order relationships in the poset.

2.1 An Entropic Posetic Complexity Indicator

If the system whose complexity is going to be calculated can be described as a discrete random variable with possible values {A 1,A 2, …,A n } and associated probabilities {p 1,p 2, …,p n }, then Shannon’s entropy (Shannon 1948) (complexity) of the system is H = − ∑ p i log2 p i .

The random variable we define on a poset is given by the relationships among the elements of the poset and its values are the number of comparabilities and of incomparabilities.

Definition 1

Let (X,≼) be a poset and let R ⊂ X × X be the possible relationships between couples of objects to order. Let ⊥ ⊆ R and ∥ ⊆ R be the comparabilities and incomparabilities of X, respectively, such that R = ⊥ ∪ ∥. As |R| = N(N − 1)/2 is the cardinality of R, with |X| = N, we define p  = | ⊥ |/|R| and p  = | ∥ |/|R| as the probabilities of having a comparability and an incomparability in (X,≼), respectively. We say that H(X,≼) = − ∑ p i log2 p i , with i ∈ {⊥,∥}, is the complexity of the poset (X,≼).

For the posets of Fig. 5.2, the respective probabilities and complexities are shown at the bottom of the same figure, where the posets with maximum chain and maximum antichain have lowest complexity while all the other posets have maximum complexity. This is so since the distribution of relationships for those with maximum complexity is the same, i.e. 2 and 1, regardless of if they are comparabilities or incomparabilities. Note that for the maximum chain the number of comparabilities is 3 even if the Hasse diagram depicts only 2; this occurs because the diagram shows cover relationships rather than comparabilities.

If both complexity indicators, i.e. C and H, are normalised, then they yield similar results for the posets in Fig. 5.2, except for the one with one comparability and two incomparabilities. In Luther et al.’s indicator, this poset is the most complex one, while in our methodology it has the same complexity as the others that are neither maximum chain nor maximum antichain. This occurs because Luther et al.’s method gives more importance to incomparabilities than to comparabilities, which, as mentioned, is a subjectivity of the method. In our case, both comparabilities and incomparabilities are treated with the same importance.

Some properties of H can be explored on the basis of its relationship with integer partitions and therefore with Young diagrams (Andrews and Eriksson 2004). A Young diagram is a graphical representation of a partition; it is made up of a two-dimensional arrangement of boxes where the kth row has the same number of boxes as the kth term in the partition. If the partition of the integer n is a + b + … + c, for a list a, b, … c, of r positive integers in such a way that a ≥ b ≥ … ≥ c, then the diagram is the arrangement of n boxes in r rows (Andrews and Eriksson 2004).

As R is always an integer and ⊥ and ∥ partition it, then |R| = | ⊥ | + | ∥ |. Hence, the associated Young diagrams for the posets shown in Fig. 5.2 are depicted in Fig. 5.3. The Young diagrams here described always have two rows, one referring to the number of comparabilities and the other to the incomparabilities. The top row always corresponds to the relation with more cases. Hence, if there are more comparabilities than incomparabilities, then the top row refers to comparabilities. If the incomparabilities are highest, the top row corresponds to incomparabilities. In the case of having the same number of comparabilities and incomparabilities, then there is no distinction as to which relation is at the top or at the bottom.

Fig. 5.3
figure 3

Unlabelled posets of three elements, considering two relationships, and their associated Young diagrams for |R| partitioned into one or two parts

Since our complexity indicator always requires R to be partitioned into comparabilities and incomparabilities and as extreme cases one has either |R| comparabilities or |R| incomparabilities, then the number of possible Young diagrams for unlabelled posets of |R| relations is ⌊<Emphasis aid:cstyle=”Italic” Type=”Italic”>(|R|/2) + 1</Emphasis>⌋,Footnote 7 which is the number of partitions of |R| into one or two parts (Hardy 1920). This implies that there are always (|R|/2) + 1 different values of H for a given |R|, which is the same as there are always (N(N − 1)/4) + 1 different values of H for the set of all unlabelled posets on a set X of cardinality N.

Proposition 1

For an unlabelled poset (X,≼), with |X| = N, there are (N(N − 1)/4) + 1 different values of H (Definition 1).

Proof

It is already stated in the previous paragraph□

For all 16 posets on a set of four objects, their respective complexities and Young diagrams are shown in Fig. 5.4.

Fig. 5.4
figure 4

Entropy values H of unlabelled posets of four objects, which are ordered according to their number of comparabilities | ⊥ |. At the top of the plot, the corresponding Young diagram for each set of equivalent posets is depicted

Proposition 1 is important as it states that just by knowing the number of objects to order, one knows the possible complexity values the different unlabelled posets can take.

Now the question that arises is: having calculated the complexity of a poset is it possible to state that the poset is complex or not? The flat answer is no, for the entropic complexity calculated is not an absolute indicator. The problem can be overcome if one knows the total number of unlabelled posets for the N in question, i.e. the number of posets for N objects. It turns out that knowing the total number of unlabelled posets of a given N is an open question in mathematicsFootnote 8 (McKay and Brinkmann 2002). Fortunately there is a way to overcome the problem of counting posets, namely through knowing the upper and lower bounds of H. Hence, if we know the maximum and minimum values H can take, then the values of H make more sense, as a bounded scale is given for H and then one can know whether the poset and its complexity are close to the H upper bound, in which case one can claim that the poset is complex to a great extent. Or one can state that the poset is not that complex if its complexity is close to the lower bound.

Let us first consider the lower bound of H. According to the Young diagram’s representation of the partition of N induced by ⊥ and ∥, the minimum value H can occur when there is either only comparabilities or only incomparabilities, i.e. p  = 1 and p  = 0, or p  = 0 and p  = 1, respectively. In that case, the Young diagram is made of a single row containing N boxes. The complexity associated with such a case is H min = 0, which is the lower bound for H.

The upper bound of H occurs when the number of comparabilities and incomparabilities is equal or almost equal, i.e. when p  = p  ≅ 0.5, which implies a Young diagram with two rows evenly or close to evenly populated.

In general, we have H values ranging between the real scale depicted in Fig. 5.5.

Fig. 5.5
figure 5

Upper H max and lower H min bounds of H, with their respective Young diagrams. The line corresponds to the real scale bounded by H max and H min

The above findings can be formalised as follows:

Proposition 2

For an unlabelled poset (X,≼), with |X| = N, the upper (H max) and lower (H min) bounds of H (Definition 1) are given by H min = 0 and Footnote 9

$$ {H}_{\max }=\Big\{\;\begin{array}{l}\kern2.4em 1\kern2.76em for\; even\;N, with\;{p}_{\hat{\mkern6mu} }={p}_{\left|\right|}=0.5\\ {}-{\displaystyle \sum {p}_i\;{ \log}_2{p}_i\kern1.08em for\; odd\;N, with\;{p}_{\hat{\mkern6mu}}\; and\;{p}_{\left|\right|}\; as\; follows}\end{array} $$
$$ \begin{array}{l}{p}_{\perp{\mkern6mu} }=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}}\kern0.24em and\;{p}_{\left|\right|}=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}}\\ {}\kern4.92em or\\ {}\\ {}{p}_{\perp{\mkern6mu} }=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}}\kern0.24em and\;{p}_{\left|\right|}=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}}\end{array} $$

Proof

In the previous paragraph proof was given that H min = 0. The upper bound of H is attained when N is partitioned into two parts p and p of the same or almost the same cardinality. It is the same when N is even, in which case the total number of possible relationships (N(N − 1))/2 is evenly divided into two parts of cardinality (N(N − 1))/4. Hence, \( {p}_{\perp{\mkern6mu} }={p}_{\left|\right|}=\frac{\left(N\left(N-1\right)\right)/4}{\left(N\left(N-1\right)\right)/2}=0.5 \); therefore H = 1. If N is an odd integer, then (N(N − 1))/4 is not an integer number and the most homogeneous distribution of comparabilities and incomparabilities is given when the absolute value of (| ⊥ | − | ∥ |) = 1. In that case the respective cardinalities are given by \( \left\lfloor \frac{N\left(N-1\right)}{4}\right\rfloor\;and\;\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil \) or \( \left\lceil \frac{N\left(N-1\right)}{4}\right\rceil\;and\;\left\lfloor \frac{N\left(N-1\right)}{4}\right\rfloor \). Hence, \( {P}_{\perp }=\frac{\left\lfloor \frac{N\left(N-1\right)}{4}\right\rfloor }{\frac{N\left(N-1\right)}{2}}\kern0.48em and\kern0.24em {p}_{\left|\right|}=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}} \) or \( {p}_{\perp }=\frac{\left\lceil \frac{N\left(N-1\right)}{4}\right\rceil }{\frac{N\left(N-1\right)}{2}}\kern0.24em and\kern0.24em {p}_{{}_{\left|\right|}}=\frac{\left\lfloor \frac{N\left(N-1\right)}{4}\right\rfloor }{\frac{N\left(N-1\right)}{2}} \) The complexity H is then calculated as in Definition 1□

Having found the upper and lower bounds of H for posets of N objects, it is now possible to frame each complexity indicator for a particular poset of N objects into the range defined by the upper and lower bounds. Hence, the closer the complexity to its upper bound, the more complex it is; likewise, the closer to its lower bound, the less complex it is.

2.2 An Application to the Ordering of Countries by Performance in Chemistry

A widespread source of posets in chemistry and environmental sciences is the ordering of objects that are characterised by different properties, something that has been further explored in the so-called Hasse diagram technique (Brüggemann and Bartel 1999; Restrepo and Brüggemann 2008; Restrepo et al. 2008a). If objects x, y ∈ X are characterised by properties q 1(x), q 2(x), …, q i (x) and q 1(y), q 2(y), …, q i (y), respectively, x is ordered lower than y (x ≼ y) if all its properties are lower in magnitude than those of y, or if at least one property is lower for x while all others are equal. This gives place to comparabilities. If all properties of x and y are equal, both objects are equivalent. Note that in the entropic complexity approach introduced in this chapter, equivalences are not considered. If at least one property q j satisfies q j (x) < q j (y) while the others are opposite (q i (x) ≥ q i (y)), x and y are incomparable.

Let us consider the ordering of countries according to the circulation of their scientific production in chemistry from 1996 to 2007, i.e. a bibliometric ordering of countries. We took the data from SCImago (2007). The example takes 195 countries characterised by three properties: NDoc, number of citable documents (articles, reviews, and conference papers); ACit, average citations of documents published during 1996–2007Footnote 10; and Hind, the H-index (Hirsch 2005), which takes the value h if the country’s number of documents has at least h citations. To make a fair comparison of countries, we took into account the 2011 population (Pop) of the countries listed in the database of the World Bank (2013), which corresponds to 183 countries. Thus, we obtained three properties: NDoc-cap, number of citable documents per capita [NDoc/Pop]; ACit-cap, average citations of documents published during 1996–2007 per capita [ACit × NDoc-cap]; and Hind-cap, the H-index per capita [H-index × (NDoc/Pop)]. For the sake of simplicity, we selected those countries with NDoc ‐ cap ≥ 0.001, ACit ‐ cap ≥ 0.01, and Hind ‐ cap ≥ 0.05. Hence, we ended up with 29 countries, whose information is shown in Table 5.1. The complete information for all countries is shown in Table SP1 of the supplementary material uploaded in the following link: https://docs.google.com/file/d/0B4tX8gtPEjlbNjJFUWsxOTk1c0k/edit.

Table 5.1 29 countries characterised by NDoc-cap (number of citable articles, reviews, and conference papers, per capita), ACit-cap (average citations of documents published during 1996–2007, per capita), and Hind (H-index, per capita) of chemical documents published between 1996 and 2011

To assess the effect of each property upon the complexity of the poset, we considered four posets: one considering the three properties and three others where only two out of the three properties are regarded. These posets are depicted in Figs. 5.6, 5.7, 5.8 and 5.9.

Fig. 5.6
figure 6

Hasse diagram of 29 countries ordered by three bibliometric properties, with complexity of 0.9831. Country’s labels are found in Table 5.1.

Fig. 5.7
figure 7

Hasse diagram of 29 countries ordered by two bibliometric properties: ACit-cap (average citations of documents published during 1996–2007, per capita) and Hind-cap (H-index per capita) of chemical documents published between 1996 and 2011, with complexity of 0.9073. Country’s labels are found in Table 5.1

Fig. 5.8
figure 8

Hasse diagram of 29 countries ordered by two bibliometric properties: NDoc-cap (number of citable articles, reviews, and conference papers) and Hind-cap (H-index per capita) of chemical documents published between 1996 and 2011, with complexity of 0.9593. Country’s labels are found in Table 5.1

Fig. 5.9
figure 9

Hasse diagram of 29 countries ordered by two bibliometric properties: NDoc-cap (number of citable articles, reviews, and conference papers, per capita) and ACit-cap (average citations of documents published during 1996–2007, per capita) of chemical documents published between 1996 and 2011, with complexity of 0.5917. Country’s labels are found in Table 5.1

The three properties are oriented, i.e. high values of them indicate better circulation than low values. The poset of the 29 countries is shown in Fig. 5.6, which was drawn and further analysed with the software WHasse (Brüggemann and Bartel 1999; Brüggemann et al. 1995) available from Rainer Brüggemann. It shows that there is a country with maximum circulation of chemical scientific production, i.e. Switzerland (SWI). There are two countries behaving better than the others except for SWI, i.e. Sweden (SWE) and Germany (GER). Five are the countries with minimum circulation of chemical literature: Monaco (MONA), Hungary (HUN), Estonia (EST), South Korea (SKO), and USA. By inspecting each one of the three properties, we found that SWI always had the maximum score in each of them. Therefore, by removing one or even two properties out of the three discussed, SWI keeps its position as a country with maximum circulation of chemical literature (Figs. 5.7, 5.8 and 5.9).

As N = 29, H can take 204 different values (Young diagrams). According to Proposition 2, the maximum value H can take is a number close to 1. The poset has 234 comparabilities and 172 incomparabilities, yielding H = 0.9831, i.e. the poset has a complexity of 98 %, which indicates that it is closer to the maximum allowed complexity of 100 %. The Young diagram depicting this partition would be that with the largest row having 234 boxes and the shortest with 172 ones.

The deletion of NDoc-cap and the further ordering of the countries based on the remaining two properties give place to the poset shown in Fig. 5.7 and to its respective complexity (H = 0.9073). Sweden (SWE) and Germany (GER) keep their position as countries with better circulation than other countries except Switzerland (SWI). Note that USA now accompanies SWE and GER in their behaviour of good circulation. In fact USA, in this poset, is not part of the countries with minimum circulation as it was in the poset of Fig. 5.6. USA is now behaving better than several other countries, e.g. Japan (JAP), Australia (AUS*), Italy (ITA), and Czech Republic (CZE), among others. The countries with minimum circulation are Monaco (MONA), Hungary (HUN), Estonia (EST), and South Korea (SKO). In the diagram of Fig. 5.6 the only thing we knew about USA was that SWI was better than it in terms of chemical scientific circulation. Hence, by ordering countries disregarding the number of citable documents per capita, the complexity of the poset is reduced.

The deletion of ACit-cap and the ordering of the countries based on the remaining two properties yielded the poset shown in Fig. 5.8 with complexity H = 0.9593. Sweden (SWE) and Germany (GER) keep their good behaviour and USA leaves this group by becoming, again, a country with minimum scientific circulation along with Monaco (MONA), Estonia (EST), and South Korea (SKO). By comparing this result with that of Fig. 5.6, Hungary (HUN) is not part of the countries with minimum circulation, as it is now better than New Zealand (NZE), Norway (NOR), and Estonia (EST). Hence, by ordering countries disregarding the average citations of documents published per capita, the complexity of the poset is reduced (Fig. 5.8).

The deletion of Hind-cap and the ordering of the countries based on NDoc-cap and ACit-cap yielded the poset shown in Fig. 5.9 with complexity H = 0.5917. This poset has several changes, e.g. Sweden (SWE) and Germany (GER) leave their good circulation, for there are some other countries behaving better than them. In the case of SWE, now Monaco (MONA) is better than it. In the case of GER, it turns out that Belgium (BELG), Singapore (SIN), Denmark (DEN), and even SWE now have better circulation than GER. South Korea (SKO) and USA are the countries with minimum chemical scientific circulation. In this case MONA and Hungary (HUN) leave the set of countries with minimum circulation as MONA becomes better than many other countries; in fact the only country with better circulation than MONA is Switzerland (SWI). Regarding HUN, in spite of leaving the set of countries with minimum circulation, it is only better than South Korea (SKO). Thus, by ordering countries disregarding the H-index per capita, the complexity of the poset is largely reduced regarding the complexity considering the three properties. This result indicates that this poset is more ordered or possesses more comparabilities than the previous ones. An explanation of such an effect can be seen in the meaning of NDoc-cap and ACit-cap, properties used to obtain the discussed poset. NDoc-cap refers to the citable documents per capita and ACit-cap to the average citations per capita, which depict similar orderings. In fact, SWI > MONA > SWE shows a complete agreement of both properties for these three countries. Thus, it is the inclusion of Hind-cap which gives complexity to the poset.

2.3 Some Possibilities for Posetic Complexity Indicators

Coming back to the complexity indicator here presented (Definition 1), such an indicator is based on comparabilities and incomparabilities. However, there is another kind of relationship between objects of a poset, namely equivalence, which occurs when x ≼ y and y ≼ x, with x and y being objects to order. This relation between the two objects is written (x ∼ y). Hence, a general posetic complexity indicator could include the three relationships. In this case the total number of relationships R, given the number of objects to order, would be partitioned into three parts. The number of different possibilities of doing that (Hardy 1920) is given by (n + 3)2/12, which yields the number of corresponding Young diagrams for those partitions. However, not all Young diagrams are attainable by partitioning relationships into comparabilities, incomparabilities and equivalences. This can be seen by a set X of three elements, which yields three possible relationships. For example, the Young diagram with three rows indicating one comparability, one incomparability and one equivalence is not a real possibility, as the equivalence of two elements implies either the comparability or incomparability with the third one, giving place to a partition 2,1, i.e. a Young diagram of only two rows, one with two boxes and another one with one. A mathematical question to be solved is the calculation of the realisable Young diagrams associated with these particular posetic partitions including three parts (three relationships).

Even if such a relation between relation partitions and Young diagrams is not established in this chapter, the complexity indicator settled in Definition 1 can be generalised to the three order relationships.

Definition 2

Let (X,≼) be a poset and let R ⊂ X × X be the possible relationships between couples of objects to order. Let ⊥ ⊆ R, ∥ ⊆ R and ∼ ⊆ R be the comparabilities, incomparabilities, and equivalences of X, respectively, such that R = ⊥ ∪ ∥ ∪ ∼. As |R| = N(N − 1)/2 is the cardinality of R, with |X| = N, we define p  = | ⊥ |/|R|, p  = | ∥ |/|R|, and p  = | ∼ |/|R| as the probabilities of having a comparability, an incomparability and an equivalence in (X,≼), respectively. Hence, we say that H(X,≼) = − ∑ p i log2 p i , with i ∈ {⊥,∥,∼}, is the complexity of the poset (X,≼).

The posets with three objects and their respective complexities are shown in Fig. 5.10.

Fig. 5.10
figure 10

Unlabelled posets of three objects, considering three relationships, and their respective values of complexity H. Multiple circles indicate equivalence among circles

Following a similar analysis of the upper and lower bounds found in Proposition 2, we found that here the respective values are H min = 0 and H max is equal to 1 if N is even and close to 1 if N is odd. In this latter case, the probabilities of each one of the three relationships are almost equal and are of the form \( {p}_i\approx \frac{\frac{N\left(N-1\right)}{6}}{\frac{N\left(N-1\right)}{2}}\approx 0.33 \), always satisfying ∑ p i  = 1.

3 Conclusions and Outlook

The posetic complexity indicator, based on comparabilities and incomparabilities, solves the shortcomings of the complexity indicators reviewed in the chapter, namely the one based on dimension theory and the heuristic one using parameters α and β. When contrasted with the first one, our complexity indicator does not have the problems of dimension calculation for it is based on counting comparabilities and incomparabilities, rather than intersection of linear extensions. These kinds of counting are already included in several statistical packages to treat posets, e.g. WHasse and PyHasse (Brüggemann and Voigt 2009), available from Rainer Brüggemann. Regarding the contrast of our complexity indicator with the heuristic one, our method does not fit any parameter before complexity calculation, which makes it an objective complexity indicator. Additionally, our method does not emphasise the importance of incomparabilities over comparabilities. In the indicator presented in this chapter, both comparabilities and incomparabilities are evenly regarded.

In a recent workshop on posets and their applications,Footnote 11 Fattore mooted the idea of calculating the complexity through a Kolmogorov’s approach. In this case the complexity is not based on counting comparabilities and incomparabilities but on the treatment of a posetic derived matrix, e.g. a cover matrix (showing cover relationships between couples of objects of the poset). This approach is related to the compressibility of the given matrix. Hence, a quite complex poset is one requiring more bits to be represented in a string code, e.g. a binary string. In fact, for a given set of objects, the poset with maximum complexity is the one that after compression has the longest string. In contrast, the less complex poset is the one that, after compression, is represented by a minimum number of bits. It would be interesting to explore this approach and its mathematical properties, as well as its relationship with other complexity indicators.

The example showing the applicability of the complexity indicator introduced here is of particular importance given the current interest on academic rankings. For several reasons, including distribution of funds for research based on research performance, academic rankings have become popular. Examples of these rankings are the Academic Ranking of World Universities (Shanghai ranking), Times Higher Education and the QS World University Rankings, among others. All these rankings, in the end, yield a total order resulting from the weighted aggregation of indicators (properties); the resulting indicator is called a composite indicator. The difference among those rankings lies on the kind of indicators used, i.e. some more oriented to research, some others to education, etc. The additional difference is the importance each ranking gives to the indicators. It turns out that the aggregation of indicators is customarily a linear combination, whose weights are selected upon the importance of the indicators. That is why, contrary to the popularisation of these rankings and to their use for decision-making processes, we think they are not “the” best option due to their subjectivities. Changing the subjectivities here mentioned changes the final ranking (Restrepo et al. 2008b). A possible way to overcome the subjectivity on the indicators’ weights is to avoid aggregations, something the Hasse diagram technique allows.

The example of countries’ ordering based on chemical literature circulation was, then, selected as a way to show an alternative option to academic rankings. The analysis of the posetic complexity shows that, among the properties considered, the H-index per capita constitutes the property introducing more incomparabilities when combined with the other two bibliometric properties. This kind of approach—not a ranking—and several other techniques designed to extract information from posets under the shade of the Hasse diagram technique constitute a novel approach worth studying and using by decision-makers. The results on the ranking of countries show that Switzerland (SWI), regardless of the three kinds of descriptors used, is always the best country in chemical circulation of knowledge. There is not “a” worst country regarding this circulation, which constitutes one of the advantages of posets, i.e. if data do not allow it and if aggregations are not performed, several “firsts”, several “seconds”, and several “lasts” may result. Among the 29 countries considered, Monaco (MONA), Hungary (HUN), Estonia (EST), South Korea (SKO), and the USA are countries that need to take measures to address (increase) their circulation of chemical knowledge. It is a matter of surprise to find the USA in the group of countries needing action, as it is common to consider this country as one of the best in this kind of circulation of its research. This idea is true only if the bibliometric data are not considered per capita. For the particular case considered in this chapter, it would be interesting to know the investment on chemical research of the countries studied to see how this information maps the ordering found here or how it affects the ordering. Then, considering the population of a country is important as it is not fair to compare, e.g., the USA with SWI based on raw information, where USA has by far more scientists than SWI. If that is done, the USA appears with the best chemical circulation.