Keywords

1 Introduction

Several scientific disciplines are involved in the study of chemical processes that occur in water, air, terrestrial and living environments, and the effects of human activity on them. Environmental chemistry, as one of these, is not only the discipline handling substances in difficult targets, such as sludge (Jin et al. 2017), trees (Ferretti et al. 2002), or lichens (Pirintsos et al. 2006), but also having the task to support decisions in environmental systems (Pirintsos and Loppi 2008).

Due to the complexity of environmental systems a series of well-defined indicators is constructed, representing the knowledge about the system and thus supporting decisions for an appropriate management (Buonocore et al. 2018; Grönlund 2019). Hence, the start for a management and a decision based on a ranking of chemicals (our example) is the analysis of multi-indicator systems (MIS). An example of MIS is the output of several lichen biomonitoring studies concerning metal pollution in the atmospheric environment. To be more specific, the indicators are related to locations, their values are derived following a procedure described by Nimis and Bargagli (1999).

Lichens are perennial, slow-growing organisms, highly dependent on the atmosphere for nutrients. The lack of a waxy cuticle and stomata allows many contaminants, which are deposited on lichens by precipitation, fog and dew, dry sedimentation and gaseous absorption, to be absorbed over the whole lichen thallus surface, indicating levels of these contaminants in the surrounding environment (Loppi et al. 1999). By biomonitoring at specifically selected sites, for example near roads (Frati et al. 2006) or more pristine areas (Loppi and Pirintsos 2003), information is obtained about the transport and origins of pollution.

In Pirintsos et al. (2014) 11 metals/metalloids are investigated in 20 sites of an urban and industrial area based on the lichen biomonitoring data set of Demiray et al. (2012), where Xanthoria parietina lichen specimen have been used as a biomonitoring organism. The evaluation of the corresponding data matrix is based on the conception that the Hasse diagram technique (see below) can further be expanded and improved in the direction of (i) cumulative risk, (ii) the up-to-date formal presentation and (iii) the interpretation of results in biomonitoring studies of metal atmospheric pollution.

The analysis of biomonitoring results can be crudely characterized by two aspects: (a) attempts to support a decision, based on order relations and (b) attempts to present small scale spatial variations within a geostatistical approach. Here our focus is on the order theoretical aspects.

As the metals and metalloids are measured at m different sites the concentrations found in lichens of each site define, after transformations as recommended by Nimis and Bargagli (1999) an indicator. Hence the MIS contains m indicators, the values describing the pollution due to a single metal or metalloid.

The question arises how to derive a decision when confronted with m indicators. Here we show first a Hasse diagram, which is a visualization of the partial order, induced by the set of indicators (cf. Bruggemann and Patil 2011), then we discuss, as to how far a single ranking (a weak order (see below)) can be obtained without the need of a subjective weighting scheme of the indicators in order to aggregate them by a weighted sum. As an exact solution of the problem how to get a weak order is hardly computationally tractable, we investigate a new calculation method.

2 Material and Methods

2.1 Data Set

Eleven Metals and metalloids, i.e., Hg, Al, As, Cd, Cu, Fe, Mn, Ni, Pb, V and Zn for which their pollution has been monitored are included in the study (Pirintsos et al. 2014). For a management it is of importance which of these metals or metalloids (in the following we call them simply metals, although this is not a chemically correct term) is highly concentrated. As sites we select only 10 (of 20) because of reasons which become clear in following sections. The concentrations are transformed into a scale of integers from 1 to 7, following the suggestion of Nimis and Bargagli (1999). 1 indicates a high naturality, whereas 7 express a high deviation from the natural state. The data are shown in Table 1, in the Results-section. An entry of the data matrix, dm(I,j) is associated with jth site and the ith metalloid. Details can be found in Pirintsos et al. (2014).

Table 1 data of 11 metals and their scores in lichens in 10 sites

2.2 Basic Concepts of Partial Order

Let X be a finite set of n objects, labeled by x(i) (i = 1,…,n). Objects could be but not limited to

  • chemical compounds (here: metals/metalloids)

  • nations, characterized by for example child well-being indicators

  • strategies, characterized by performance indicators

  • geographical units, characterized for example by pollution, or (within a socio-economic context) by poverty indicators

Here, indeed the elements of the real example are “chemical elements”, namely n = 11 metals.

To define an order relation among them, the relation “≤” has to obey the following order axioms:

  • reflexivity: the object can be compared with itself

  • antisymmetry: if x ≤ y and y ≤ x ⇒ x = y

  • transitivity: if x ≤ y and y ≤ z ⇒ x ≤ z

A special realization of order relations is given by Eqs. (1, 2, and 3):

$$ x(i)=\left(x\left(\mathrm{i},1\right),\mathrm{x}\left(\mathrm{i},2\right),\dots .,\mathrm{x}\left(\mathrm{i},\mathrm{m}\right)\right) $$
(1)

The quantity x(i, j) is the value of the ith object (i = 1,…,n) (here the ith metal), the jth indicator (j = 1,..,m) (here the jth site) and m the number of indicators used (here m = 10).

Equation 1 describes a mapping XIR m,wherein X is the set of objects (the metals) and IRm is the set of tuples of real numbers with m components. Note that the tuples are also denoted as data profiles.

According to m = 10 sites, we will have a system of 10 indicators.

$$ x(i1)\le x(i2):\iff \left(x\left( i1,1\right),\dots, \mathrm{x}\left( i1,m\right)\right)\le \left(x\left( i2,1\right),\dots .,x\left(\underline{\mathrm{i}2},m\right)\right) $$
(2)

Equation 2 needs clarification, as it is not yet clear under which conditions one tuple (that of x(i1) is to be considered less or equal to that of x(i2). The way how Eq. 2 can be given a meaning, opens the door to many variants. By Eq. 3

$$\setcounter{equation}{2}\begin{array}{llll} &&\left(x\left( i1,1\right),\dots, \mathrm{x}\left( i1,m\right)\right){\le} \left(x\left( i2,1\right),\dots .x\left( i2,m\right)\right):\iff x\left( i1,j\right){\le} x\left( i2,j\right)\nonumber\\ &&\quad \mathrm{for}\ \mathrm{all}\ j=1,..,m \end{array}$$
(3)

a special partial order is defined. Two objects, following Eq. 3 are called “comparable”, otherwise “incomparable”.

The immediate relation to the data and the corresponding indicators has two consequences:

  1. 1.

    Any order relation xy is a direct reflection of the data values of x and y. This is in contrast to many decision support systems, where an order relation cannot easily be traced back to the original data, i.e., to the data matrix.

  2. 2.

    The partial order methodology, based on Eq. 3 is applicable wherever a data matrix is available and where a ranking aim can be defined.

In the literature the method, based on Eq. 3 together with appropriate supporting software, is often denoted Hasse diagram technique (HDT) (Galassi et al. 1996; Grisoni et al. 2015; Halfon and Reggiani 1986; Bruggemann et al. 2001, 2008; Patil and Taillie 2004; Klein and Ivanciuc 2006; Simon et al. 2004, 2006; Helm 2003; Bruggemann and Voigt 2008, 2011, 2012; Carlsen and Bruggemann 2011, 2014a, b; Carlsen 2008a, b, 2013, 2018; Newlin and Patil 2010; Annoni et al. 2014; Sørensen et al. 1998, 2000; Pavan and Todeschini 2004; Pudenz and Heininger 2006; Quintero et al. 2018; Restrepo and Bruggemann 2008; Restrepo et al. 2008a, b; Voigt et al. 2004a, b) with reference to the German mathematician Helmut Hasse (1967). Sets X, equipped with a partial order (and thus in this paper by the Eqs. 1, 2, and 3) are called partially ordered sets and are conveniently denoted as posets and indicated by (X, ≤).

Two objects x, y mutually incomparable are denoted as x ǁ y. Within a poset (X, ≤) the number of incomparable pairs x ǁ y is called U. When the orientation xy or xy is of minor interest then the mere fact of comparability is denoted by xy.

Note that partial order methodology can also be applied, by evaluation of the space of all possible data profiles, when the indicators are discrete (cf. e.g. Fattore and Maggino 2014, as well as Maggino et al. this book).

2.3 Hasse Diagram

The construction of a Hasse diagram, starting from a set of partial order relations (as an outcome of Eq. 3) is frequently explained in the literature (see e.g. Bruggemann and Halfon 1997). For the sake of reader’s convenience, some words about Hasse diagrams may nevertheless useful here: The basis is the order relation x < y. Usually the object x will be drawn below object y; both are vertices of a graph and presented by small circles, with the label of the object in the centre. In case x < y a line is connecting x with y, called an edge, if the vertices are in a cover relation, i.e if there is no object z for which is valid: x < z < y. The orientation of the order relation is just obtained from the vertical position. When two objects are not connected by a system of oriented edges the two objects are incomparable.

By this construction a Hasse diagram allows a two-fold interpretation:

  1. 1.

    Upwards: The numerical values of the objects are nondecreasing along a system of edges. This “vertical” oriented analysis allows a ranking of objects of subsets of X, so-called chains.

  2. 2.

    In contrast to (1) there is also a “horizontal” evaluation. This evaluation has its focus on not connected objects. Following the construction principles of a Hasse diagram, the objects of in the same vertical position are mutually incomparably. A set of mutually incomparable objects is called an antichain.

2.4 Weak Order

When the general policy of decision is to find not only the optimal option but also alternatives, then ranking is a good starting point, since suboptimal objects can be easily identified if the optimal object is not suitable (e.g., due to political or economic reasons). The task is how to get a ranking, which is at least a weak order, if ties are accepted. Whereas a complete (i.e., total or linear) order is a set of objects, in which all elements x, yX are mutually comparable with x ≠ y, a weak order does not require the condition xy, i.e., it accepts equivalent elements (or in terms of statistics: it accepts ties).

A Hasse diagram allows identifying rankings for subsets of X, without any subjectivity beyond the data matrix. It is clear that the task to get a weak order should be parameter free too. Hence, the typical procedure to aggregate the values of the m indicators into a composite indicator by a numerical procedure, where weights for each indicator and other parameters are required, is to be avoided. An important device, how to get a weak order without the need of finding additional parameters, such as weights for the indicators, out of a partially ordered set is found in the paper of Winkler (1982). The crucial term is the average height, denoted as Hav.

2.5 Average Height

Any poset can be represented by a set of linear order, whose elements are called linear extensions (Davey and Priestley 1990; Trotter 1992). A linear extension is a linear order, respecting all order relations within a poset. For example the set X = {a, b, c, d} may have the following order relations:

$$ \mathrm{a}<\mathrm{b},\mathrm{a}<\mathrm{c},\mathrm{a}<\mathrm{d}. $$
(4)

Obviously, b ǁ c and c ǁ d, i.e. U = 2. Then the set of linear extensions is:

$$ \left\{\left(\mathrm{a},\mathrm{b},\mathrm{c},\mathrm{d}\right),\left(\mathrm{a},\mathrm{c},\mathrm{b},\mathrm{d}\right),\left(\mathrm{a},\mathrm{c},\mathrm{d},\mathrm{b}\right)\right\}. $$

Within the above set {(a, b, c, d), (a, c, b, d), (a, c, d, b)} a linear extension is for example (a, b, c, d), others are (a, c, b, d) and (a, c, d, b). Each single linear extension indicates a complete ordered set, for example (a, b, c, d) denotes: a < b, a < c, a < d, b < c, b < d, c < d). All order relations of Eq. 4 are reproduced. The fact that the poset in Eq. 4 includes some incomparabilities leads to the necessity to consider the 3 linear extensions simultaneously. Within each linear extension any object x has a height that is the number of objects ≤ x. For example, in the linear extension (a, b, c, d) object a has the height 1, b the height 2, whereas in the linear extension (a, c, d, b) object b has the height 4.

The idea of Winkler (1982) is to calculate the average of all heights of all objects, denoted as Hav(x). Let L(k) be the kth linear extension and h(L(k),x) the height of x in L(k), then, after Winkler (1982)

$$ Hav(x):= \left(\sum h\left(L(k),x\right)\right)/ LT\kern1.25em \left(k=1,\dots, LT\right) $$
(5)

where LT is the number of linear extensions derived from a specific poset.

Equation 5 could be a good starting point, when the set of linear extensions is small. Taking into mind that the number of linear extensions for an object set with n objects can be up to n! the problem to generate linear extensions and store them into a memory is computationally hard (see e.g. Atkinson and Chang 1986).

The above mentioned difficulty leads to several variants:

  • There is still an exact method available. It is based on the fact that the storage of some sets derived from the poset needs less memory than the storage of the linear extensions. From a methodological, mathematical point of view this method transforms the original poset into a lattice and the quantities of interest can be directly derived from this lattice (De Loof et al. 2006). However, the lattice-method is only working, when U*n (U: number of incomparabilities in a set of n objects) is not too large, for details see Bruggemann and Carlsen (2011).

  • Some approximations seem to have found more applications, for instance the method of Bubley and Dyer (1999), which suggests a “good” sampling of linear extensions.

  • Another one has a graph – theoretical background and considers the local environment around each object within a poset. There are two variants: (1) the LPOM0 (local partial order model 0) Bruggemann et al. 2004) and (2) an extended model (LPOMext) (Bruggemann and Carlsen 2011). Although the extended variant is thought of as delivering better results than LPOM0, it turned out (Rocco and Tarantola 2014) that the more simple method (LPOM0) may be in some cases a better approximation than the extended one.

2.6 Idea for an Alternative for the Hav-Calculation

Often partial order can be considered as being composed from simpler posets, here for example, the concept of linear sum is of specific interest. It is defined as follows: Let X 1, X 2 be disjoint subsets of X with

$$ X={X}_1\oplus {X}_2 \vspace*{-23pt}$$
(6a)
$$ x\in {X}_1,y\in {X}_2\ \mathrm{implies}\ x>y\ \mathrm{for}\ \mathrm{every}\ x,y $$
(6b)

Equation (6b) can be formulated as follows: If two sets can be found where for an element of the first set, x, and for any element of the second set, y, is valid: x > y; the relations among the first, and the second set, resp., are not of interest.

We may speak of “X 1 is fully dominating X 2”. Equation 6b does not imply that within X 1 or X 2 the elements are mutually comparable.

Let Hav(x, X) denote the average height of x, considering the set X and the settings of Eqs. 6a and 6b. Then:

$$ Hav\left(x,X\right)=\mid {X}_2\mid + Hav\left(x,{X}_1\right) $$
(7)

where |…| denotes the cardinality of the set. Eq. 7 is a simple conclusion found from Eq. 6b:

$$ H\left(L(k),\mathrm{x}\ \mathrm{in}\ X\right)=\mid {X}_2\mid +H\left(L(k),x\ \mathrm{in}\ {X}_1\right). $$

Thus, a calculation method can be thought of, which can be formulated as follows:

$$ x\in {X}_1: Hav\left(x,X\right)= Hav\left(x,{X}_1\right)+\mid {X}_2\mid $$
(8)
$$ y\in {X}_2: Hav\left(y,X\right)= Hav\left(y,{X}_2\right), $$
(9)

supposed that Eq. 6b is exactly fulfilled.

The concept of X 1 , X 2X with X 1X 2 =  was already studied by (Restrepo and Bruggemann 2008) and lead to two quantities, the dominance of X 1 over X 2 and the separability of X 1 and X 2 (Eqs. 10 and 11).

$$ \underline{\mathrm{Dom}}\left({X}_1,{X}_2\right):= \mid \left\{\left(x,y\right)\ \mathrm{with}\ x\in {X}_1,y\in {X}_2\ \mathrm{and}\ x>\mathrm{y}\right\}\mid /\left(|{X}_1|\ast |{X}_2|\right) $$
(10)
$$ Sep\left({X}_1,{X}_2\right):= \mid \left\{\left(x,y\right)\ \mathrm{with}\ x\in {X}_1,y\in {X}_2\ \mathrm{and}\ x\ \Big\Vert\ y\right\}\mid /\left(|{X}_1|\ast |{X}_2|\right) $$
(11)

By a set of subsets the quantities, defined in Eqs. 10 and 11 can be conveniently denoted as matrices, dominance (Dom) and separability (Sep) matrices.

Equation (6b) demands that Dom(X 1, X 2) = 1 and Sep(X 1, X 2) = 0.

If Dom(X1, X2) = 0, then X1, X2 are completely separated subsets, meaning that then x ∈ X1, y ∈X2 implies x ǁ y. In that case it is easily seen that we find:

$$ \mathrm{Hav}\left(\mathrm{x},{\mathrm{X}}_1\right)<\mathrm{Hav}\left(\mathrm{x},\mathrm{X}\right)<\mid {\mathrm{X}}_2\mid +\mathrm{Hav}\left(\mathrm{x},{\mathrm{X}}_1\right) \vspace*{-21pt}$$
(12a)
$$ \mathrm{Hav}\left(\mathrm{y},{\mathrm{X}}_2\right)<\mathrm{Hav}\left(\mathrm{y},\mathrm{X}\right)<\mid {\mathrm{X}}_1\mid +\mathrm{Hav}\left(\mathrm{y},{\mathrm{X}}_2\right) $$
(12b)

because as an extremal case the subposet based on X1 can once be completely below the subposet (X 2 , <) or completely above (X 2 , <). Hence: When Dom(X 1 , X 2) < 0.5 then the role of the separability matrix is overwhelming (because the sum of Dom- and Sep-matrices is bounded, due to the finite number of comparabilities and incomparabilities and the Eqs. 8 and 9 fail. Therefore it is needed that the poset, to be considered, has more comparabilities than incomparabilities. This is the reason, why instead of 20 sites (the real example) only the 10 first sites were selected.

Summarizing: From a methodological point of view, we want to check, as to how far a deviation of Dom(X1, X2) from 1 can lead to acceptable results.

3 Results

3.1 Randomly Generated Datasets

In order to test as to how far deviations of Dom(X1, X2) from 1 lead to errors in the estimation of Hav, 22 smaller datasets (each of 10 objects) were randomly generated. For each object x of these artificial data sets the Hav-value based on the scheme given in Eq. 12 was calculated, HavDom(x) and the exact value, Havexact, based on the lattice theoretical method presented by De Loof et al. (2006, 2011, 2012). The deviation was calculated:

$$ Eps(x):= \mid Haxexact(x)\hbox{--} HavDom(x) \mid $$
(13)

For each dataset a final value epsav was determined:

$$ epsav:= \sum Eps(x)/n\kern0.75em \mathrm{with}\ n=\mid X\mid $$
(14)

the quantity epsav being the average error related to any single object. In Fig. 1 the scatterplot, together with the regression equation is shown.

Fig. 1
figure 1

Scatterplot of epsav vs Dom(X 1, X 2) and the regression equation with R2 ≈ 0.76

Figure 1 confirms that the deviations epsav will be rather large, when Dom(X1, X2) becomes small values. It is clear that the way, how the partitioning of X into two subsets X1 and X2 is selected, plays an important role. However, aiming at an efficient method for the calculation of Hav, the principles were:

  1. 1.

    To select the X 1, X 2 in that manner that they have approximately the same number of elements

  2. 2.

    To find a selection that maximizes Dom(X 1 , X 2).

The principle (1) was a priori considered as more important than the principle (2). From Fig. 1 it becomes clear that obviously in the specific considered randomly generated case (for details, see below) the deviations epsav require Dom(X 1, X 2) ≥ 0.8.

When the regression is restricted to those pairs of values (Dom(X 1, X 2) , epsav), where Dom(X 1, X 2) ≥ 0.8, then the result is (more or less trivially) better, see Fig. 2.

Fig. 2
figure 2

epsav vs Dom(X 1 , X 2) , with Dom(X 1 , X 2) ≥ 0.8

The regression equation based on 6 pairs (the pair (1,0) is realized three times) has the striking structure:

$$ epsav=a\ast \left(1- Dom\left({X}_1,{X}_2\right)\ \right)\ \mathrm{for}\ 0.8< Dom\left({X}_1,{X}_2\right)\le 1 $$
(15)

with the coefficient of determination, R2 = 0.88 and the coefficient a around 1.35. This statistical result indicates that the relevant quantity is the deviation Δ:

$$ \varDelta := 1- Dom\left({X}_1,{X}_2\right) $$
(16)

Hence, Eq. 15 expresses proportionality between the error epsav and Δ. The crucial value 0.8 for separating relevant Dom(X 1,X 2)-values from irrelevant ones, may vary from case to case and is open for future research. Furthermore, Fig. 2 shows that the average error related to single objects is less than 0.25 and the deviations from the regression line will be larger the smaller the value Dom(X1, X2) is.

3.2 Application to Real Data Set

The estimation method needs the following steps:

  1. 1.

    Defining X 1, X 2 and the Hasse diagram for the full set X

  2. 2.

    Calculation of Dom(X 1, X 2)

  3. 3.

    Providing the data for X 1 and X 2

  4. 4.

    Application of the lattice theoretical method: (a) for X, (b) for X 1 , (c) for X 2

  5. 5.

    Performing the calculations due to Eqs. 8 and 9

  6. 6.

    Inspecting epsav to check the quality of the results

Up to now there is no program performing all 6 steps. However, for steps (1)–(3) the program package PyHasse (see for details Bruggemann et al. 2014) was extended by the new module DomRkav. Its graphical user interface is shown in Fig. 3. The data is found in Table 1.

Fig. 3
figure 3

Graphical user interface for the new PyHasse module DomRkav

The corresponding Hasse diagram is shown in Fig. 4.

Fig. 4
figure 4

Hasse diagram of the metals, according to their deviations from their natural state in the studied stations

Some remarks concerning Fig. 4 may be useful here:

  • Hg and As are minimal elements, they cause the least deviation from a natural state

  • Fe and Zn are maximal elements, they are most problematic because the deviation of the natural state is very high.

  • Incomparabilities, such as for Zn and Fe show that the loading of the lichens in general is high, however with some geographical differentiation.

In order to perform the calculation scheme based on Eqs. 8 and 9 the first step is to select X 1 and X 2, i.e. the partitioning of set X.

  • Step 1:

The sets X 1 and X 2 are:

  • set X 1: Fe, Zn, Mn, Pb, V, As

  • set X 2: Ni, Al, Cd, Cu, Hg

  • Step2:

The Dom-matrix is:

 

X 1

X 2

X 1:

0.361

0.733

X 2:

0.0

0.48

  • Remark 1:

Although Dom(X 1, X 2) = 0.733 is less 0.8 the next calculation steps are documented, just for a demonstration.

  • Remark 2:

The partitioning selected above is not the only possible one. For example, the metalloid As is a minimal element. Why not assign As to X 2? Let X 2 = X 2 ∪ {As} and X 1 = X 1 – {As}. Indeed the value of Dom(X 1 , X 2 ) = 0.833 is better than that of Dom(X 1 , X 2) and correspondingly epsav = 0.395. The disadvantage is that (X 2 ,≤) leads due to its symmetry to a very high degree of degeneracy: Fe≅ Zn, Pb ≅ V and As ≅ Al ≅ Cd ≅ Cu. Therefore we continue with the partitioning of X into X 1 and X 2 as given above.

  • Step 3: Calculation of the averaged ranks by the lattice-theoretical method (De Loof et al., 2006) due to X 1 and X 2 of step 1.

Figure 5 shows the Hasse diagrams of the two subsets.

Fig. 5
figure 5

The two Hasse diagrams due to X 1 and X 2

  • Step 4: Application of Eqs. 8 and 9.

  • Step 5: Check for the accuracy of the results.

The remaining steps 4 and 5 are summarized in the following Table 2. The column below X 1 and X 2 is the membership function, indicating whether or not the metal belongs to X 1 or to X 2.

Table 2 Summarizing the results of step 5 and step 5. Wether x ∈ X1 or ∈ X2 is indicated by a membership function. If x ∈ X1 then the value in the corresponding column = 1, otherwise 0. Similarly for xX 2

The value of epsav = 0.563 deviates from the value obtained from Eq. 15; (epsav eq.15 = 0.36). However the value of Dom(X 1 , X 2) is not within the range of applicability of Eq. 15. As to be expected, the measure of deviation, epsav, indicates a bad approximation. Due to pretty large deviations (in terms of epsav the final weak order shows two inversions:

  • Exact: Hg < Cd ≅ Cu < Al < As < Ni < Pb < Mn < V < Zn < Fe

  • Approx.: Hg < Cd ≅ Cu ≅ Al < Ni < As < Mn < Pb < V < Zn < Fe

As it is often the case, different methods coincide, when extremal ranking positions are to be detected. This empirical finding is found here as well, i.e., Hg Cd, Cu, as well as V, Zn and Fe coincide in their positions at the beginning or the end of the ranking sequence. The other positions in a ranking sequence are usually determined by many factors. Therefore, here different methods will lead to different ranking positions. Here, indeed, some other metals change their position (As, Ni) and (Mn, Pb), when the exact, lattice theoretical method is compared with the approximation, suggested here. The reasons for the inversion Mn, Pb is that Mn “sees” four vertices order theoretically less than Mn, whereas Pb only “sees” three vertices. In the approximation however, both are minimal elements, so that for both metals the Eq. 8 gives the same summand |X2|, being 4. A similar argument holds for the pair (As, Ni).

4 Discussion

4.1 Lichen Biomonitoring/Bioaccumulation Matrices as Multi-indicator Systems

Undoubtedly, any data pre-processing, as done here, are of high importance not only in chemical risk assessment and management but also broader, in the decision-making process of environmental policy. However, here is not the place to discuss in depth the pre-processing, defined by Nimis and Bargagli (1999), nevertheless, we think that here some words may be helpful:

In biomonitoring techniques of air quality with native lichens, an approach to the interpretation of data of native lichens is the so-called “naturality/alteration scales” based on thresholds identifying classes of increasing element concentrations, and obtained by the meta-analysis of a large set of bioaccumulation data. The method by Nimis and Bargagli (1999) defines seven classes of element concentrations. These classes are built up on hundreds of data points collected in Italy between the 1980s and the 1990s. The seven class scale refer to (1) very high naturality, (2) high naturality, (3) middle naturality, (4) low naturality/alteration, (5) middle alteration, (6) high alteration and (7) very high alteration based on the percentile distributions of element concentrations in lichens (Nimis et al. 2000).

Recently a paper was published, where the data pre-processing of data (is examined under the methodological background of partial order theory, see Fattore et al. (2019).

4.2 Applicability of the Proposed Method

The quantity epsav, Eq. 14 is an average value and is – as mentioned already above – related to a single object. The domain of validity for Eq. 15 is given by 0.8 ≤ Dom(X 1 , X 2) ≤ 1.0.

If Dom(X 1 , X 2) → 0.8 the deviations Eps (Eq. 13) become quickly large as Figs. 1 and 2 (randomly generated data) show. Consequently in the following paragraph we investigate reasons for large deviations of Eps.

4.3 Reasons for Large Deviations

First of all, a dissection of a poset (X, ≤) into two subposets (X 1, ≤) and (X 2, ≤) leads to more symmetry in the resulting graphs of the subposets (as already mentioned above). Hence, the degeneracy of Hav-values is increased. Even if the enhanced degree of ties is accepted, there can be large deviations, which result from structures like the one shown in Fig. 6.

Fig. 6
figure 6

X 1 dominates fully X 2 but not X 2. A typical situation causing deviations

Considering Fig. 6 a situation, similar to that, causing eq. 12, arises. For xX 1 , Hav(x, X) = Hav(x, X 1) + |X 2| is an overestimation, because by constructing the linear extensions, the elements of X 2 can also be located above the elements of X 1, whereas by Hav(x, X) = Hav(x, X 1) + |X 2 | an underestimation follows. Based on remark 2 (see above) the two Hasse diagrams are shown, when X 1 = X 1 – {As} and X 2 = X 2 ∪ {As} (Fig. 7)

Fig. 7
figure 7

A variant for partitioning of set X

The element Fe “sees” the same number of lower neighbours as Zn. Similarly, Pb and V have one upper neighbour. Therefore the exact method delivers Havexact(Fe, X 1 ) = Havexact(Zn, X 1 ) as well as Havexact(Pb, X 1 ) = Havexact(V, X 1 ). In Fig. 7, the subposet (X 2 , ≤) has also symmetries, leading to: Al ≅ Cd ≅ Cu with respect to Havexact.

4.4 Conclusive Consideration

By applying the dominance matrix and based on this, the calculation scheme seems to be attractive for an estimation method of Hav. However, the requirement of very high values of Dom(X1, X2) seems to be too restrictive to justify to propose this method as a general approximation method. Thus, up to our actual knowledge this new procedure will not be practically feasible in comparison with exact results. When, however, a first check is wanted, for example to start from this a refinement procedure, then the scheme based on Eqs. 8 and 9 may be useful. When this line of research is to be followed, then

  • A catalogue could be aimed, where structures are gathered, which typically lead to strong deviations

  • As a candidate for a better approximation the method by Bubley and Dyer (1999), may be selected and modified in that manner that the weak order as a result of Eqs. 8 and 9 is a starting linear extension.

Summarizing, we hope that the present study has revealed some mathematical ideas which may be of interest and attract new research by scholars of the mathematical chemistry scene.