
1 Introduction

The analysis of multi-indicator systems (Brüggemann and Patil 2010, 2011), which aims at ranking objects characterized by several indicators, is of increasing interest in many scientific fields, e.g., environmental health (Voigt et al. 2011, 2012) or sociology and politics (Annoni 2007; Carlsen and Brüggemann 2013a, b). In this context, partial order methodology is increasingly applied (see Brüggemann and Carlsen 2012 in their response to Huang et al. 2011). The tools of partial order are not as old as those of general decision-making methods, which started with the scientific work of Condorcet and Borda at the end of the eighteenth century (cf. Munda 2008). Partial order as a mathematical discipline seems to go back to the late nineteenth century, when Dedekind was exploring the Diedergroups. Strong impacts on the theory of partially ordered sets can be attributed to Hasse (1927, 1952) and Birkhoff (1984), two mathematicians who, like Dedekind, were mainly interested in algebraic aspects. Within the context of data matrices, i.e., from a statistical point of view, the main contributions can be traced back to Patil on the one side (within the context of biological diversity, see Patil and Taillie 1976) and, independently, to the team of Halfon and Reggiani (Halfon and Reggiani 1986) on the other side. The work of Halfon and his coauthors laid the basis for the computerized Hasse diagram technique (HDT), which applies partial order to the ranking of objects simultaneously described by several indicators, i.e., by data matrices. A third line of development in the analysis of data matrices is Formal Concept Analysis (FCA), developed in the 1980s (Ganter and Wille 1996), which is also attracting increasing interest (see for instance Bartel and Brüggemann 1998; Davey 2004; Carlsen 2009; Brüggemann and Patil 2011).

The application of many of the tools of partial order theory to data matrices is in principle extremely simple but tedious if performed manually. It is therefore understandable that, along with the development of computer programming, increasingly detailed software support for the ordinal analysis of data matrices has emerged.

In this book some chapters describe the application of selected modules of the PyHasse package, whereas in Brüggemann and Patil (2011), a state-of-the-art overview (by 2010) of the software packages Rapid and PyHasse is given.

This chapter explains some background material on Python, the programming language on which PyHasse is based, and gives some further general information about PyHasse.

2 HDT Software

An overview of the software available as of 2006 was given by Halfon (2006).

A complete overview about theory and applications of partial order on multi-indicator systems is outside the scope of this chapter, which instead aims at a description of PyHasse. For introductory texts we refer to papers by Brüggemann et al. (2001) and Brüggemann and Voigt (2008).

Table 19.1 gives an overview of software, which is certainly not complete. The newest package (as far as the authors are aware), PyHasse, will be explained in more detail in this chapter.

Table 19.1 Software aiming at partial order analysis of data matrices

3 Python as Programming Language for Contemporary Software Generation

The first question often asked may well be: Why Python, and why not Java, Perl, or traditional languages such as C++, Fortran, or Visual Basic? The most honest answer is simply that Python largely suits the personal taste of the programmer, in this case Rainer Brüggemann. Such a personal way of deciding on a programming language may be unsatisfactory for many readers. Hence, we discuss some objective points that, in the view of the programmer, favor Python (without arguing that other programming languages lack these features).

3.1 General Remarks

The arguments here closely follow those of Lutz and Ascher (2003); a book that specifically explains Python for scientific use is also available and recommended for further reading (Langtangen 2009).

When Python was developed by Guido van Rossum (cf. Venners 2003), it was developed in one step. Hence, Python had a very homogeneous structure from the very beginning. Python has of course been developed further and will continue to be. Python is currently delivered in version 3, whereas PyHasse is developed on the basis of Python 2.6, which was released in 2008.

In contrast to the C languages, Python is comfortably readable and coherent. Python consistently supports the object-oriented programming style.

It is of further interest that Python code is typically “1/3 to 1/5 the size of equivalent C++ or Java code” (Lutz and Ascher 2003, page 3).

Briefly speaking there is a Python slogan which says that “In the Python way of thinking, explicit is better than implicit, and simple is better than complex” (Lutz and Ascher 2003, page 5).

Python is an interpreted language, i.e., there is no need to compile and link the software before it is run. On the Web site python.org, we find “Python is a programming language that lets you work more quickly and integrate your systems more effectively.” Clearly, it is to be expected that interpreting the program code line by line may be time consuming. However, personal experience shows that even the combinatorial algorithms typical of the application field of partially ordered sets do not need much time, i.e., even the impatient programmer can wait for the result while sitting in front of the machine!

Technically speaking, Python belongs to the Very High-Level Languages (VHLL) (Müller and Schwarzer 2007). For the PyHasse author, the fact that new modules can be developed and tested without first compiling parts of the program makes Python a very efficient and quick programming tool.

3.2 Portability

The portability of Python programs is high. For example, PyHasse programs run without problems on different Windows operating systems as well as on different UNIX or Linux machines. Although there is not much experience with Macintosh operating systems, examples are known that PyHasse can be ported to the Macintosh without major difficulties.

3.3 Libraries

Like most other modern programming languages, Python provides many freely downloadable libraries. All possible applications of “modern life” can be handled by such libraries:

  • NumPy and Matplotlib: powerful libraries for numerical calculations and visualization which can replace MATLAB in many applications.

  • Statistical and drawing libraries are available, as well as libraries designed to handle databases (MySQL, Oracle, …).

  • Internet programming libraries are of increasing importance, and many web frameworks support the quick construction of Web sites, for example Plone (with parts of ZOPE) or Django.

Also more exotic applications are supported like:

  • PIL (Python Imaging Library): a library for manipulating digital images

  • PyGame: a library facilitating game programming

3.4 Programming Support

One important point is that Python supports the development of user-written libraries, which can be designed specifically for the scientific purpose of a software package.

Another important point is that Python supports the development process with a set of tools. For example, Cython extends Python by adding type information to a Python program in order to make Python modules faster. Cython may further be used to include C/C++ code in a Python library. Alternatively, SIP or SWIG can wrap existing C/C++ code for use as a Python module. Modules are available that include the Qt library, which allows modern graphical user interfaces to be implemented, and another module includes the GNU Scientific Library (GSL).

For some applications, such as Monte Carlo simulations, Python as an interpreted language is too slow. The techniques discussed above may be used to accelerate Python modules. It should further be noted that Python supports parallel programming: modules like PyPar connect Python programs to the powerful Message Passing Interface (MPI) and allow parallel processing even on a personal computer with four or eight cores. This underlines that Python is not restricted to fast prototyping but is a modern programming toolbox that can be used for challenging projects.

Some useful links are:

python: http://www.python.org/

numpy, scipy: http://numpy.scipy.org/

matplotlib: http://matplotlib.org/

networkX: http://networkx.lanl.gov/

Cython: http://cython.org/

SWIG: http://www.swig.org/

QT: http://qt.digia.com/

PyQT: http://www.riverbankcomputing.co.uk/software/pyqt/intro

gsl: http://www.gnu.org/software/gsl/

PyPy: http://pypy.org/

Pypar: http://code.google.com/p/pypar/

4 A Practical Ranking Problem, i.e., a Test Set for Explaining PyHasse

To have an illustrative example at hand, we look at nine regions in Germany. To measure the air quality, the concentrations of deposited Pb, Cd, and Zn are monitored in epiphytic mosses (Brüggemann et al. 1998). The question arises: Can we rank the regions while simultaneously taking into account the concentrations of all three metals? The data matrix is shown in Table 19.2.

Table 19.2 Data matrix, nine regions, three metals, concentrations in mg/kg dry weight (rounded)

Partial order theory provides an answer in the form of a Hasse diagram, i.e., a transitively reduced, acyclic, triangle-free digraph of order relations (as explained in several chapters of this volume) (Fig. 19.1). The first observation is that the Hasse diagram is not slim, i.e., it obviously deviates remarkably from a linear order.
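The notion of a transitively reduced digraph can be illustrated by a minimal Python sketch. It is not PyHasse code: the data matrix is a hypothetical toy example (not Table 19.2), and the function names `leq`, `strictly_below`, and `cover_relations` are chosen here for illustration only. An edge is kept only if no third object lies strictly between its two end points.

```python
from itertools import permutations

# Hypothetical toy data matrix (NOT Table 19.2): label -> indicator values.
data = {
    "a": (1, 1, 1),
    "b": (2, 1, 2),
    "c": (1, 2, 1),
    "d": (3, 2, 2),
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def strictly_below(data, p, q):
    """True if object p lies strictly below object q."""
    return data[p] != data[q] and leq(data[p], data[q])

def cover_relations(data):
    """Edges (lower, upper) of the Hasse diagram.

    A pair p < q is kept only if no r satisfies p < r < q, which is
    exactly the transitive reduction of the order relation.
    """
    edges = []
    for p, q in permutations(data, 2):
        if strictly_below(data, p, q) and not any(
            strictly_below(data, p, r) and strictly_below(data, r, q)
            for r in data
        ):
            edges.append((p, q))
    return sorted(edges)

print(cover_relations(data))
```

In this toy example the relation a < d holds but is not drawn, because it follows from a < b < d by transitivity.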

Fig. 19.1 Hasse diagram of nine regions with three attributes, namely the (rounded) metal concentrations in epiphytic mosses of Pb, Cd, and Zn

Figure 19.1 shows the main characteristic of HDT: There are regions (generally “objects”) which cannot be compared, e.g., regions 8 and 17. The reason is that region 8 has a higher concentration than region 17 with respect to one metal, whereas region 17 has a higher concentration than region 8 with respect to another metal. Thus, the two regions are “in conflict with each other.” Technically we describe an incomparability by the symbol ||, i.e., here as 8 || 17. A set of objects which are mutually incomparable is called an antichain, in contrast to a set of objects which are mutually comparable, i.e., a chain. Hence, the set {9, 16, 5} is an antichain, whereas the set {6, 7, 9, 8} is a chain.
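Comparability, chains, and antichains can be checked with a few lines of Python. The following is a sketch under stated assumptions, not PyHasse code: the rows for regions 8 and 14 use the (Pb, Cd, Zn) values quoted in Sect. 19.5.4.1, whereas the rows "p" and "q" are hypothetical and only serve the illustration. All function names are chosen here.

```python
from itertools import combinations

# Regions 8 and 14: (Pb, Cd, Zn) values as quoted in this chapter.
# Rows "p" and "q" are hypothetical illustration data.
data = {
    "8": (20, 0.4, 55),
    "14": (12, 0.6, 41),
    "p": (11, 0.3, 35),
    "q": (10, 0.25, 30),
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def comparable(data, a, b):
    """Two objects are comparable if one is <= the other in all indicators."""
    return leq(data[a], data[b]) or leq(data[b], data[a])

def is_chain(data, objs):
    """All pairs of the given objects are mutually comparable."""
    return all(comparable(data, a, b) for a, b in combinations(objs, 2))

def is_antichain(data, objs):
    """No pair of the given objects is comparable."""
    return not any(comparable(data, a, b) for a, b in combinations(objs, 2))

print(comparable(data, "8", "14"))        # region 8 is higher in Pb, lower in Cd
print(is_chain(data, ["q", "p", "8"]))
print(is_antichain(data, ["8", "14"]))
```

Regions 8 and 14 are incomparable because region 8 has the higher Pb value while region 14 has the higher Cd value, which is the same kind of conflict as described for 8 || 17.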

5 The PyHasse Software

5.1 Intention Behind the Software

5.1.1 Modules

PyHasse is a software package consisting of a series of mutually independent programs, called “modules.” When programming tools, interfaces, and all partial order analysis tools are counted, PyHasse comprises 91 modules (April 2013). However, this number changes continuously, as new modules may replace several older ones, and new ideas for analyzing partial orders derived from the ordinal analysis of data matrices eventually result in new modules.

In total, PyHasse is a software package with more than 50,000 lines of programming code (including comment lines, empty lines, which help to get a clear program code). Obviously, parts of program codes often appear several times, due to the intention that the modules should be mutually independent.

5.1.2 PyHasse as Experimental Software

PyHasse is intended to help solve daily problems in applying partial order concepts to data matrices. It does not intend to provide perfect statistical or graphical tools. In particular, it does not intend to include the vast number of applications that are more or less routinely performed with spreadsheet software, such as Microsoft Excel®. The same philosophy holds for the drawing of Hasse diagrams. Virtually all PyHasse modules offer the drawing of Hasse diagrams following the drawing convention that has its origin in the work of Halfon (Halfon and Reggiani 1986). Nevertheless, these PyHasse-generated graphs are far from being perfect drawings. Hence, in this context PyHasse cannot compete with the powerful, freely downloadable program Graphviz (see Gansner and North 1999), which visualizes partially ordered sets in an almost perfect manner (see Sect. 19.5.4.3).

In sum PyHasse tries to fill the gap between highly specialized programs often developed in laboratories but not generally applicable and professionally written software, which usually may not reflect the state of the art of the theoretical development, even though updates are made available from time to time.

5.2 Basic Structure

5.2.1 Contextual Categories

PyHasse is structured in two ways: Contextually and from the programming point of view. In Table 19.3, nine contextual categories are explained.

Table 19.3 Alphabetically sorted contextual categories of PyHasse, references are found in the Appendix

In Fig. 19.2 a bar diagram displays the distribution of the PyHasse modules over the nine categories described in Table 19.3.

Fig. 19.2 Distribution of the 91 PyHasse modules within the nine contextual categories given in Table 19.3

5.2.2 Programming Structure

The 91 modules are supported by four libraries (Table 19.4).

Table 19.4 Libraries, supporting the PyHasse modules

These four libraries are delivered together with the PyHasse modules (and some additional files), and the user has to put them into the folder where Python is installed.

In order to facilitate the installation of PyHasse software, the programmer, Brüggemann, did not make extensive use of other convenient libraries, such as Matplotlib or NumPy.

Together with the utility functions, the programming structure can be characterized by a scheme, as shown in Fig. 19.3.

Fig. 19.3 Programming structure of PyHasse

5.2.3 Graphical User Interface

Most of the modules have similar graphical user interfaces (GUIs). In Python, GUIs can be programmed with the standard library Tkinter, which is derived from Tcl/Tk. All user interfaces in the PyHasse package are built using Tkinter. The buttons (which govern the user activity) are arranged vertically, following the most typical logical sequence of steps. A few modules are menu oriented, such as pyhassemenue8_3.py, DAHP.py, and modelHD9.py. Almost every user interface contains an “about” function, which briefly informs about the aim of the module, the programmer, and (sometimes) the leading idea from the literature.

In addition, there is a “help” function, which has the following structure:

  • Aim

  • Prerequisites

  • Usage or steps

  • Results (not in all cases)

  • Difficulties

  • Literature

  • Example data files

5.2.4 PyHasse Data Flow (Example: Windows® as Operating System)

Within the Windows® environment the majority of potential users will apply Microsoft Excel®.

To fulfill the input requirements of the PyHasse modules, it is important that the rows as well as the columns have a short label (labels with up to three characters are optimal) and that the (0,0) position of the data matrix (cell A1 in Excel) is not empty. Furthermore, none of the PyHasse modules accepts data gaps. Hence, it is the responsibility of the users to provide a data sheet with all labels and no data gaps. In contrast, software packages such as DART (see Manganaro et al. 2008) and WHASSE (Brüggemann et al. 1999) provide some facilities to handle missing data.

Typically the PyHasse modules require the Excel sheet stored as a tab-separated txt file. Only the module EXCELHD1.py can directly apply the data by copying the appropriate field in the Excel sheet. Once the data matrix is read in, one may perform calculations and results can be stored in the internal format pdt. Some more important modules therefore offer to read these intermediate results as *.pdt files.
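The input requirements described above (tab-separated text, row and column labels, a non-empty (0,0) cell, no data gaps) can be sketched as a small reader. This is an illustration only and not the reader used by PyHasse; the function name `read_data_matrix`, the label "obj" in the (0,0) cell, and the error messages are assumptions made here. The two sample rows use the values for regions 8 and 14 quoted in Sect. 19.5.4.1.

```python
import csv
import io

def read_data_matrix(text):
    """Parse a tab-separated data matrix.

    Row 1 holds the column (attribute) labels, column 1 the row (object)
    labels. Raises ValueError if the (0,0) cell is empty or a value is missing.
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    if not rows:
        raise ValueError("empty input")
    header, body = rows[0], rows[1:]
    if not header[0].strip():
        raise ValueError("the (0,0) cell must not be empty")
    attributes = header[1:]
    matrix = {}
    for row in body:
        label, cells = row[0], row[1:]
        # A short row or an empty cell counts as a data gap.
        if len(cells) != len(attributes) or any(not c.strip() for c in cells):
            raise ValueError(f"data gap in row {label!r}")
        matrix[label] = [float(c) for c in cells]
    return attributes, matrix

sample = "obj\tPb\tCd\tZn\n8\t20\t0.4\t55\n14\t12\t0.6\t41\n"
attributes, matrix = read_data_matrix(sample)
print(attributes)
print(matrix)
```

Rejecting gaps at read time, rather than guessing a value, mirrors the rule that the user is responsible for a complete data sheet.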

5.3 Overview

5.3.1 Most Often Used Modules

The application of the following modules is well described (cf. Table 19.1 and the appendix at the end). Further specific references are available within the individual modules.

  • mainHD20_5.py and mHDCl2.py, resp.: Besides the Hasse diagram, these modules provide navigation tools and much structural information, as well as a variety of other facilities. As “basic” modules these are the most important

  • chain7_1.py: Search and analysis of chains

  • dds12.py: Dominance and separability of disjoint subsets of objects on the basis of the order relations among their elements

  • LPOMext4_2.py: Average ranks calculated after two different approximations based on the “local partial order concept”

  • fuzzyHD13.py: Instead of analyzing the “<” relation directly, a subsethood measure is defined (Kosko measure, cf. Van de Walle et al. 1995), from which a fuzzy partial order is derived

  • sensitivity19_1.py: A partially ordered set has a structure. This structure is characterizable by chains and antichains. What is the impact of any single matrix column (representing the indicator values for all the objects)? i.e., what is the impact of any single indicator on the structure of a poset?

  • similarity10_1.py: The same set of objects may be described by different multi-indicator systems. What is the proximity between the two resulting posets?

5.4 Description of Some Modules of PyHasse Software

5.4.1 Module mHDCl2_7: The “New Main”

This module is one of the newest and is written completely in an object-oriented programming style. There were three reasons for developing mHDCl2_7.py:

  1. The similar module mainHD20_5.py runs into a memory error when the data matrices are too large

  2. The GUI and the logical organization were no longer adequate

  3. After some years of practical application some adaptations appeared appropriate

The purpose is, as with mainHD20_5.py, to provide a complete basic analysis of a partially ordered set as derived from a data matrix. This includes as results:

  • Level structure

  • Information of each object about its successors, predecessors, and incomparable objects in the Hasse diagram, in tabular form

  • Hasse diagram

  • Navigation tools: principal down- and upsets, interval graphs, local Hasse diagrams, the most simple approximation of average rank by the local partial order (LPOM0) (Brüggemann et al. 2004)
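The level structure listed above can be illustrated by a short Python sketch. It assumes the drawing convention mentioned in the summary of this chapter, i.e., that each object is located in the highest position that is order-theoretically possible; the toy data matrix and the function names are hypothetical and not taken from PyHasse.

```python
# Hypothetical toy data matrix (NOT Table 19.2): label -> indicator values.
data = {
    "a": (1, 1, 1),
    "b": (2, 1, 2),
    "c": (1, 2, 1),
    "d": (3, 2, 2),
    "e": (0, 3, 0),   # incomparable with all other objects
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def strictly_below(data, p, q):
    """True if object p lies strictly below object q."""
    return data[p] != data[q] and leq(data[p], data[q])

def levels(data):
    """Return the levels from top to bottom.

    The maximal elements of the remaining objects are peeled off
    repeatedly, so every object ends up as high as the order allows.
    """
    remaining = dict(data)
    result = []
    while remaining:
        top = sorted(
            p for p in remaining
            if not any(strictly_below(remaining, p, q) for q in remaining)
        )
        result.append(top)
        for p in top:
            del remaining[p]
    return result

print(levels(data))
```

The isolated object "e" appears in the top level together with "d", which shows the effect of the "highest possible position" rule.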

The GUI and its subsequent windows are shown in Figs. 19.4 and 19.5.

Fig. 19.4 GUI of mHDCl2_7.py and the window opening after pressing “Order theoretical navigation.” Note that the first three navigation buttons need the input of one single object, whereas the buttons “intervalHD” and “from–to” need two objects as input

Fig. 19.5 Windows popping up after pressing “Save the different results” (a) and “Open the control board for graphics” (b)

In the following we describe each button given in Fig. 19.5, starting from the top in Table 19.5.

Table 19.5 Explanations of the buttons of the GUI of mHDCl2_7.py

In mHDCl2.py there are further tools to overcome the difficulties of drawing Hasse diagrams: (a) rendering the information in tabular form (Table 19.6) and (b) the FOU plot, which is a realization of the concept of posetic coordinates (see Chap. 8 and Fig. 19.6).

Table 19.6 Structural information of the data matrix of Table 19.2, related with the Hasse diagram of Fig. 19.1

Fig. 19.6 FOU plot (see text) based on Table 19.2, using posetic coordinates. The greater blue circles are obtained by clicking with the mouse on them. The abscissa counts from −10 to +10 with steps of 0.5, the ordinate, however, counts from 0 to 10 with steps of 1

In Fig. 19.6, a FOU plot is shown. Myers and Patil (2014) focus on possibilities to represent partially ordered sets by scatter plots in order to avoid overly complex Hasse diagrams. Our aim here is similar. The basic idea is to describe partially ordered sets by “posetic coordinates,” i.e., by numbers derived from partial order theory, e.g., the cardinalities of the principal down- and upsets and of the set U(x) of elements incomparable with x. When equivalence relations are present, the number of equivalent elements could also be used as a posetic coordinate. Here we characterize the poset by two order-theoretical coordinates for each object x: OF, the difference between the cardinalities of the downset O(x) and the upset F(x), and U, the cardinality of the set of elements incomparable with x:

$$ OF:=\left|O(x)\right|-\left|F(x)\right|\quad \mathrm{and}\quad U:=\left|U(x)\right|. $$
(19.1)
  • In contrast to the coordinates provided by the original data matrix (Pb, Cd, Zn), posetic coordinates, namely OF and U, are now used to characterize the objects.

  • In contrast to the triangle coordinate representation (Brüggemann and Patil 2011), which is more detailed, the scatter plot based on OF and U is simple to interpret.

Generally, it is a promising new task in partial order theory to find best “posetic coordinates” allowing presentations of partial orders not so much depending on the clarity of the relational graph, such as the Hasse diagram.
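The two posetic coordinates of Eq. (19.1) can be computed directly from a data matrix, as the following Python sketch shows. It is not the code of the FOU plot in PyHasse: the toy data matrix is hypothetical (not Table 19.2), the function names are chosen here, and it is assumed that x is counted in both O(x) and F(x), which cancels in the difference OF.

```python
# Hypothetical toy data matrix (NOT Table 19.2): label -> indicator values.
data = {
    "a": (1, 1, 1),
    "b": (2, 1, 2),
    "c": (1, 2, 1),
    "d": (3, 2, 2),
    "e": (0, 3, 0),
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def posetic_coordinates(data):
    """Return {x: (OF, U)} with OF = |O(x)| - |F(x)| and U = |U(x)|."""
    coords = {}
    for x in data:
        down = {y for y in data if leq(data[y], data[x])}   # O(x), contains x
        up = {y for y in data if leq(data[x], data[y])}     # F(x), contains x
        incomparable = set(data) - down - up                # U(x)
        coords[x] = (len(down) - len(up), len(incomparable))
    return coords

for obj, (of, u) in posetic_coordinates(data).items():
    print(obj, of, u)
```

In a scatter plot of these pairs, objects near the top of the order appear on the right (large OF), objects near the bottom on the left, and objects with many conflicts appear high up (large U).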

Figure 19.6 shows that

  • Two regions are selected (namely 8 and 14) that are both maximal elements; however, they differ in the values of their posetic coordinates.

  • One region has the largest number of incomparable objects, |U(x)| = 7; this is region 17, which is also a maximal element.

Checking the data matrix, one can see that regions 8 and 14 are indeed quite different with respect to their data profiles (for the sake of clarity, the min and max values over all regions for each of the three attributes are additionally given):

        Pb    Cd    Zn
Max:    20    0.6   63
8:      20    0.4   55
14:     12    0.6   41
Min:     9    0.2   29

Region 8 is predominantly polluted by lead and zinc, whereas the main contribution to the pollution of region 14 is cadmium. The maximal and minimal values of Pb, Cd, and Zn over all objects of the data matrix (Table 19.2) are included to facilitate the interpretation.

The FOU plot is mainly useful for an interactive analysis and can be further explored using the mouse. Thus the FOU plot fulfills tasks similar to those explained by Myers in this book (Myers and Patil 2014). The abscissa describes the relative position on a bad–good axis, and the ordinate quantifies the conflicts associated with each object.

Clicking with the left mouse button while pressing “ALT” opens a window with more information (Fig. 19.6, top, left side, and right side). Basically, depending on the ranking aim, the points near the lines given by (19.2a) and (19.2b)

$$ \left|U(x)\right|=n+1- OF,\quad OF=\left|O(x)\right|-\left|F(x)\right|,\quad \left|F(x)\right|=0 $$
(19.2a)

and

$$ \left|U(x)\right|=n+1+ OF,\quad OF=\left|O(x)\right|-\left|F(x)\right|,\quad \left|O(x)\right|=0 $$
(19.2b)

are of most interest, as they are the extremal points.

In contrast to mainHD20_5.py, the module mHDCl2.py no longer contains the Bubley–Dyer algorithm (Bubley and Dyer 1999) for obtaining average ranks (see Patil and Joshi 2014) or the statistics concerning chain length. The Bubley–Dyer algorithm is now the central part of the module BubleyDyer8.py, which also provides the algorithm proposed by Patil and Taillie (2004), the Cumulative Rank Frequency (CRF) iterative method. The CRF algorithm can be applied to enrich the poset until a weak order is obtained. For details see Chap. 6.
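To make the notion of an average rank concrete, the following Python sketch computes it exactly by enumerating all linear extensions of a very small poset. This is deliberately not the Bubley–Dyer algorithm, which samples linear extensions at random and is therefore suited to larger posets; brute-force enumeration is only feasible for a handful of objects. The toy data matrix, the function names, and the convention that rank 1 denotes the bottom position are assumptions made here, and equivalent objects are not treated specially.

```python
from itertools import permutations

# Hypothetical toy data matrix (NOT Table 19.2): label -> indicator values.
data = {
    "a": (1, 1, 1),
    "b": (2, 1, 2),
    "c": (1, 2, 1),
    "d": (3, 2, 2),
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def strictly_below(data, p, q):
    """True if object p lies strictly below object q."""
    return data[p] != data[q] and leq(data[p], data[q])

def average_ranks(data):
    """Exact average rank over all linear extensions (rank 1 = bottom)."""
    objs = sorted(data)
    relations = [(p, q) for p in objs for q in objs if strictly_below(data, p, q)]
    totals = dict.fromkeys(objs, 0)
    count = 0
    for order in permutations(objs):
        pos = {p: i for i, p in enumerate(order)}
        # Keep the permutation only if it respects every relation p < q.
        if all(pos[p] < pos[q] for p, q in relations):
            count += 1
            for p in objs:
                totals[p] += pos[p] + 1
    return {p: totals[p] / count for p in objs}

print(average_ranks(data))
```

The incomparable objects "b" and "c" swap places in the two linear extensions of this toy poset and therefore share the same average rank.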

5.4.2 The Module to Check the Role of Single Indicator Values: POOC6.py

As mentioned by Annoni et al. (2011, 2012) and explained in more detail by Brüggemann and Patil (2011), there are two types of sensitivity analysis:

  • Variation of the set of indicators, e.g., to elucidate the effect if one indicator is eliminated from the data matrix

  • Variation of the values of indicators

The first is referred to as attribute-related sensitivity (ARS), the second as attribute value-related sensitivity (AVRS). The ARS is the task of sensitivity18_3.py and is well described in the literature. Attribute value-related sensitivity is the task of POOC6.py (perturbation on order characteristics). With the new concept of variance-based sensitivity (Annoni et al. 2011, 2012; see also Chap. 13), the development concerning POOC6.py was slowed down. Nevertheless, this module appears mandatory, as long as the variance-based sensitivity is not programmed within PyHasse.

The GUI of POOC6.py is shown in Fig. 19.7.

Fig. 19.7 GUI of POOC6.py

After selecting the same data matrix as for Fig. 19.1, a posetic overview over the data matrix (Fig. 19.8) is first obtained, whereby now four coordinates are used.

Fig. 19.8 Posetic “coordinates” (equiv, predec, succ, and incomp) of the data matrix, describing metal pollution of epiphytic mosses, in south-west of Germany (cf. Table 19.2)

The coordinates are:

  • equiv (eq): number of equivalent elements with x

  • predec (pred): number of elements above x, pred =|O(x)-{x}|

  • succ: number of elements below x, succ = |F(x)-{x}|

  • incomp (ic): number of elements incomparable with x, ic= |U(x)|

As an example, select region 9 as the element of interest: no element is equivalent to region 9, one element lies above it, two elements lie below it, and five regions are not comparable to it.

The role of POOC6.py is now to check how a change in an attribute value changes the set of posetic coordinates.

We enter a perturbing value of 1 and select object 9 and attribute “Pb,” i.e., lead pollution in the epiphytic moss. Note that a suitable perturbing value is specific to each column of the data matrix: 1 is a small value for the Pb concentration, whereas 1 for Cd would be larger than the whole span of Cd values! Here we perturb Pb by 1/20 of the maximal value.

Technically a perturbation by 1 means that we change the original entry q1(9) by adding 1, i.e., changing the value from 17 to 18, and thus observe the possible effects (Fig. 19.9).
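The perturbation idea can be mimicked in a few lines of Python. The sketch below is not POOC6.py: the toy data matrix with two indicators is hypothetical, and the function names `coordinates` and `perturb` are chosen here. It recomputes the four posetic coordinates (eq, predec, succ, incomp) of one object after a value has been added to one of its entries, with predec counting the elements above and succ the elements below the object, as in the list above.

```python
# Hypothetical toy data matrix with two indicators (NOT Table 19.2).
data = {
    "a": (5, 1),
    "b": (8, 3),
    "c": (10, 4),
    "d": (3, 5),
}

def leq(x, y):
    """x <= y in the product order: no indicator of x exceeds that of y."""
    return all(xi <= yi for xi, yi in zip(x, y))

def coordinates(data, x):
    """Return (eq, predec, succ, incomp) for object x."""
    eq = predec = succ = incomp = 0
    for y in data:
        if y == x:
            continue
        if data[y] == data[x]:
            eq += 1
        elif leq(data[x], data[y]):
            predec += 1      # y lies above x
        elif leq(data[y], data[x]):
            succ += 1        # y lies below x
        else:
            incomp += 1
    return eq, predec, succ, incomp

def perturb(data, obj, column, delta):
    """Return a copy of the data matrix with delta added to one entry."""
    new = dict(data)
    row = list(new[obj])
    row[column] += delta
    new[obj] = tuple(row)
    return new

print(coordinates(data, "b"))                       # original
print(coordinates(perturb(data, "b", 0, 1), "b"))   # small perturbation
print(coordinates(perturb(data, "b", 0, 4), "b"))   # larger perturbation
```

In this toy case a perturbation by 1 leaves the coordinates of "b" unchanged, whereas a perturbation by 4 removes its only predecessor and increases the number of incomparable objects, which is the same kind of effect as reported for region 9.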

Fig. 19.9 Posetic coordinates, after perturbing the value of attribute Pb of region 9 by changing the attribute by adding the value of 1

On the far left, the site of perturbation and the perturbed indicator are given; the remaining entries show how the different elements of the poset react.

A perturbation by adding 4 to the original value, i.e., 20 % of the maximum lead concentration among the regions considered, changes the coordinates.

                               eq   predec   succ   incomp
(Perturbed) (9, Pb): Obj: 9:    0      0       2       6

The number of predecessors of region 9 would in this case be reduced and the number of incomparable elements with region 9 increased.

We could conclude that the posetic information concerning region 9 is rather stable with respect to increasing the value of Pb. Clearly this procedure can be repeated for every element of interest, every attribute, and with every perturbing value.

Figure 19.10 gives a graphical display of what happens after perturbing the value of Pb for region 9 by 4 (in terms of region 9 being less than (lt), greater than (gt), incomparable with (ic), and equivalent with (eq) other regions).

Fig. 19.10 Schematic overview about the four posetic coordinates: eq: number of equivalences, ic: number of incomparabilities, gt: number of predecessors (greater), and lt: number of successors (less than) after changing the attribute Pb of region 9 by four units

Figure 19.10 deserves some further explanations: The large yellow circle informs about the perturbation itself.

The affected object (obj) is region 9, the attribute (attr) perturbed is Pb, and the amount of perturbation (perturb) is 4.0. In the little white circle the four posetic coordinates are indicated, and in the rectangular box information is given on what happens to the specific coordinate. For example, the value of “lt” does not change, i.e., it is not perturbed (perturbed value of lt:=0). In contrast, the coordinate “ic” changes under perturbation: the original value is 5, and after perturbation it is 6 (perturb (of ic): 6).

Figures 19.8, 19.9, and 19.10 are the results of POOC6.py, which is designed to help answer questions about the Hasse diagram in Sect. 19.4, namely the effect of data uncertainty. In addition, although not shown here, any perturbed data matrix can be visualized by a Hasse diagram.

5.4.3 Module graphvizHD1.py

5.4.3.1 Introduction

This module serves as a nice example for the general philosophy in the context of PyHasse, i.e., not to compete with professional software, if available. In the module graphvizHD1.py, some information on the partially ordered set is given. However, the graph drawing is a matter of the well-known graph–theoretical program Graphviz (Gansner and North 1999), which is explained below. In Fig. 19.11 the GUI is shown.

Fig. 19.11 GUI of graphvizHD1

As for other modules, about and help functions are found. Behind the button “select and open a file,” the facilities of the Tkinter library are applied.

After the selection of the data file, a window pops up with more information (see Table 19.7).

Table 19.7 Content of the window, popping up after selecting and opening the file (containing the data matrix about pollution in epiphytic mosses) (and doing subsequently all needed calculations to obtain the partial order)

Local information (i.e., information related not to the complete object set but to a user-selected pair of objects) is available by inserting objects into the two open entry fields. Inserting, for example, “9” and “17,” two objects of the data matrix of the selected example file, yields the information (a) whether the two regions are comparable and (b) if so, in which orientation. Here it is found that 9 || 17; see also Fig. 19.12.

Fig. 19.12 Local information about a pair of objects, here about a pair of regions

For a deeper analysis procedure, the concept of “distance due to incomparability” (Bartel and Mucha, Chap. 3) may be applied.

Further, a window is opened to select name and site of the file, subsequently to be analyzed by graphviz.

5.4.3.2 Graphviz

Graphviz is a professional program to draw graphs, i.e., visualize binary relations on a ground set. Graphviz draws binary relations of the object set, or more exactly, of the set of representatives. The software Graphviz is freely downloadable from the Internet and is described by Gansner et al. (1993) and Gansner and North (1999).

The version used here is 2.26.3, and among the programs available in Graphviz, the program Gvedit, v: 1.01 is used.

Fig. 19.13 Result after running Gvedit: a gif File

After a successful run of Gvedit, a gif file, for instance, is obtained (see Fig. 19.13); many other formats are available too. It represents the same order relation as Fig. 19.1. However, the drawing rules in Graphviz are dominated by minimizing the crossings of lines. Gvedit allows many controlling interactions by the user, but for most purposes those specifications are not needed. A more detailed description of the Graphviz options is outside the scope of the present chapter.
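How a partial order can be handed over to Graphviz may be sketched as follows. The Python function below writes the cover relations of a poset as text in the DOT language, which Graphviz reads. It is an illustration and not the export routine of graphvizHD1.py; the function name `to_dot`, the graph name, the chosen attributes, and the example edges are assumptions made here.

```python
def to_dot(edges, name="hasse"):
    """Return DOT text for a list of cover relations (lower, upper).

    rankdir=BT asks Graphviz to draw edges from bottom to top, so that
    greater objects appear higher, as in a Hasse diagram.
    """
    lines = [
        f"digraph {name} {{",
        "  rankdir=BT;",
        "  node [shape=circle];",
        "  edge [arrowhead=none];",
    ]
    lines += [f'  "{lower}" -> "{upper}";' for lower, upper in edges]
    lines.append("}")
    return "\n".join(lines)

# Hypothetical cover relations of a small poset.
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
dot_text = to_dot(edges)
print(dot_text)
```

The resulting text can be saved to a file, say hasse.dot, and opened in Gvedit or, assuming the Graphviz command-line tools are installed, rendered with a call such as `dot -Tpng hasse.dot -o hasse.png`.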

6 Summary and Conclusions

6.1 Summary

When a data matrix is to be analyzed with respect to some ranking or evaluation, then usually one has to select a software. Whereas the construction of a composite indicator is simple and can be done with spreadsheet facilities, like MS Excel, the analysis within partial order methodology can in general not be done by spreadsheet software.

With the example of a small real-life data matrix, where the regional pollution is measured in the special target of epiphytic the technical performance by some modules of PyHasse are demonstrated. There are PyHasse modules available, which unequivocallly are important and often used, for example, mainHD20_5.py, mHDCL2.py, similarity10_1, or sensitivity18_3, LPOM4ext.py, and dds12.py. Some others are described and follow crudely the logical line:

  1. What information is provided by the Hasse diagram? (mHDCl2_7.py, Sect. 19.5.4.1)

  2. What happens when data entries are changed? (POOC6.py, Sect. 19.5.4.2)

  3. Graphical display in the form of Hasse diagrams is appealing. However, the display of partial orders leaves many degrees of freedom. In PyHasse, firstly, a conservative point of view is taken, i.e., each object is located in the highest position that is order-theoretically possible. Secondly, the objects are arranged in levels. In the graphical presentation of the results (the Hasse diagram), these two principles often lead to crossings of lines, which may be rather confusing. Thus, an alternative is discussed, and the use of the freely downloadable software Graphviz is suggested (Sect. 19.5.4.3).
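The two display principles named in item 3 can be sketched as follows. This is one possible reading, not the PyHasse code: an object is assigned the level given by the length of the longest chain above it, so that maximal and isolated objects are in the top level (level 0 here). The function name and the example matrix are assumptions.

```python
from functools import lru_cache


def levels_from_top(data):
    """Map each object to the length of the longest chain above it."""
    names = sorted(data)

    def strictly_below(p, q):
        # Product order on the indicator values, excluding equal rows.
        return data[p] != data[q] and all(
            x <= y for x, y in zip(data[p], data[q]))

    @lru_cache(maxsize=None)
    def depth(p):
        above = [q for q in names if strictly_below(p, q)]
        # A maximal object has nothing above it and sits in the top level.
        return 0 if not above else 1 + max(depth(q) for q in above)

    return {p: depth(p) for p in names}


# Hypothetical data matrix: object name -> indicator values.
example = {"a": (1, 1), "b": (2, 1), "c": (1, 2), "d": (2, 2), "e": (3, 0)}
levels = levels_from_top(example)
print(levels)
```

In this example the isolated object e is drawn in the top level next to d, which shows why such a layout rule can produce long lines and crossings in larger diagrams.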

6.2 Conclusions

PyHasse is today applied by many teams around the world. Correspondingly, many ideas have been expressed on how PyHasse can be improved:

  • In its technical handling

  • Contextually, in its tools to ordinally analyze data matrices

PyHasse does not claim to be user-friendly software with good user guidance. It should be clear that PyHasse does not want to (and cannot) compete with, for instance, DART (Manganaro et al. 2008), which provides a very convenient tool for obtaining Hasse diagrams as well as some basic information derived from a data matrix, even with missing data. Applying PyHasse requires some preparatory steps in data handling before it can be run. Most often it also requires some a posteriori activity by the user. This is a consequence of the conception behind PyHasse: to help specifically in studies of partial ordering, i.e., with all consequences that arise from the ordinal analysis of multi-indicator systems. PyHasse provides copy-and-paste texts to support the documentation of results.

Where graphical representations are available (bar diagrams, Hasse diagrams, scatter plots, etc.), their purpose is to give the user a first impression. When the user wants to use professional graphics software, such as Excel, a few steps of data handling are necessary.

PyHasse is developing rapidly, as ideas from users as well as concepts from the literature can be programmed relatively easily, leading to new modules. The price is that the total absence of bugs cannot be guaranteed, although most modules are tested rather carefully. Further, the user interfaces may not always be as comfortable as desirable, and philosophies of how to guide the user are only rudimentarily realized. PyHasse is an “experimental” software under constant development, and suggestions, comments, and wishes from users are always welcome and appreciated.

7 Outlook

For the time being, eight major objectives are still on the agenda. However, time is obviously required, and their further development constantly competes with other, more rapidly realizable ideas. The eight future objectives can be summarized as:

  1. PyHasse being made available in an Internet version. Some preliminary attempts have been made. However, the cooperation with web designers, etc., appears crucial.

  2. Although the powerful conexp3, written in Java, is available for the analysis of formal concepts (Yevtushenko 2003; Ganter and Wille 1996; Burmeister 2003), a Formal Concept Analysis module within PyHasse would facilitate many applications.

  3. POSAC is a program that reduces the attributes of the data matrix to two coordinates. The underlying idea is to maintain the typical outcome of partial order theory, i.e., the appearance of incomparabilities, while at the same time simplifying the analysis. POSAC is an approximation. Nevertheless, a possibly simplified version in PyHasse would be helpful (see Brüggemann and Patil 2011 and references therein).

  4. When a reduction to two new attributes as in POSAC is intended, a calculation of the poset dimension would be useful. However, the calculation of the dimension of a poset is computationally extremely difficult. Nevertheless, it is important to get ideas about the dimension of posets.

  5. An implementation of the variance-based sensitivity analysis in PyHasse is most urgently needed. So far, the required calculations are performed using Matlab. Consequently, the extensive numerical part should be programmed in C++ or at least with the help of the NumPy library.

  6. In multivariate statistics, cluster analysis plays an important role. A straightforward application of cluster analysis is suitable for obtaining clear Hasse diagrams by reducing the number of vertices. This reduction can be done by deriving a poset on cluster centers instead of on the single objects. This reduction is the main feature of the PyHasse module pycluster1_2.py. However, an order-theoretical approach would be helpful too: Instead of defining equivalence relations such as “belonging to the same cluster,” one could appropriately define equivalence relations among the elements of a poset, whose classes are also called “blocks” (Davey and Priestley 1990), and analyze the resulting poset based on the representative elements, which is clearly simpler than the original poset.

  7. Finally, a project aiming at extending PyHasse by an additional fuzzy-poset analysis is in progress. A first variant is provided in fuzzydds7.py. Now an intensive testing phase is needed.

  8. The further analysis of the two approximations of average ranks based on the local partial order model (LPOM) is a task for the future. It is hoped that improved statements about the accuracy of the LPOM can be given.
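The reduction described in objective 6, i.e., deriving a poset on cluster centers instead of on the single objects, may be illustrated by the following minimal Python sketch. It does not reproduce pycluster1_2.py. The cluster labels are assumed to come from any cluster analysis, the centers are taken as arithmetic means, and all names and data are hypothetical.

```python
def cluster_centers(data, labels):
    """Return the mean indicator vector of each cluster."""
    groups = {}
    for name, row in data.items():
        groups.setdefault(labels[name], []).append(row)
    return {cluster: tuple(sum(col) / len(rows) for col in zip(*rows))
            for cluster, rows in groups.items()}


def strict_order(points):
    """Return sorted pairs (p, q) with p < q in the product order."""
    return sorted((p, q) for p in points for q in points
                  if points[p] != points[q]
                  and all(x <= y for x, y in zip(points[p], points[q])))


# Hypothetical data matrix and cluster labels.
data = {"o1": (1, 1), "o2": (1, 2), "o3": (4, 4),
        "o4": (5, 4), "o5": (6, 0), "o6": (7, 1)}
labels = {"o1": "C1", "o2": "C1", "o3": "C2",
          "o4": "C2", "o5": "C3", "o6": "C3"}

centers = cluster_centers(data, labels)
print(centers)
print(strict_order(centers))
```

Six objects are reduced to three vertices. The incomparability of C3 with the other two centers is preserved, while the relations within each cluster are no longer visible.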