1 Introduction

Formal Concept Analysis (FCA) has grown over the years from a small research area in the 1980s into a reasonably sized research community with thousands of published papers. Rudolf Wille originally perceived FCA as an example of ‘restructuring mathematics’ because it provides a tool for data analysis with applications in non-mathematical disciplines (Wille 1982). Early applications of FCA in psychology were aimed at establishing FCA as an alternative to statistical analyses (Spangenberg and Wolff 1993). Because FCA formalises conceptual hierarchies it should be a natural tool for investigating and modelling hierarchical structures in dictionaries, lexical databases, biological taxonomies, library classification systems and so on. Thus FCA has the potential to be used as a tool in a wide variety of non-mathematical fields. So far, however, it is still mostly used by researchers with mathematical or computational training. A few papers have been published in other areas since the 1990s, but these were mostly written by mathematicians or computer scientists rather than by non-mathematicians working in those fields. This is in contrast to other mathematical techniques such as statistics which appear to have a much wider application domain.

As an example, we manually examined the first 100 documents that were retrieved by a query for ‘formal concept analysis and linguistics’ in a bibliographic search engine. It seems that the majority of the retrieved papers were either written by a computer scientist or mathematician or submitted to an FCA or a computer science conference. The topics of these papers centred on ontologies, data mining/processing/retrieval and logic or artificial intelligence in the widest sense. A query for ‘formal concept analysis and psychology’ yields a similar result. On the other hand, searching for ‘statistics and linguistics’ retrieves mostly papers in the area of computational linguistics and searching for ‘statistics and psychology’ retrieves papers on research methods, textbooks for psychologists and papers written for a psychological audience. Thus it appears that FCA researchers are still writing papers that attempt to demonstrate that FCA can be used in linguistics and psychology, and how, whereas statistics is an established tool in these disciplines. The question arises as to why it seems to be so difficult for FCA to establish itself as a commonly used tool in non-mathematical or non-computational fields even though so many papers show how FCA can be used in such fields. Clearly a large number of factors could be causing this, such as the availability and usability of FCA software or a general resistance to change and to adopting new paradigms. One possible factor, which is examined in this paper, is that the mathematics underlying FCA might be surprisingly difficult for non-mathematicians to understand. We are not arguing that FCA is more difficult to learn than, for example, statistics but simply that the difficulties faced by FCA novices should not be underestimated.

From a mathematical viewpoint, the core definitions of FCA are short and simple and can easily be explained to fellow mathematicians. But it may be that non-mathematicians need a significant amount of time and motivation in order to learn FCA. This is not contradicted by Eklund et al.’s (2004) observation that novices can read the line diagrams in their FCA software because their experiment was more focussed on the usability of their software than on establishing whether users have a mathematically correct understanding of lattices. Again, there can be many reasons why learning FCA is difficult but it is at least a possibility that FCA contains a few ‘threshold concepts’ which are challenging to learn. According to Meyer and Land (2003) threshold concepts are difficult concepts which function like a portal into a domain of knowledge. Because their notion of ‘concept’ is not the same as the one in FCA we use the term ‘learning threshold’ instead of ‘threshold concept’ in the remainder of this paper. Meyer and Land state that experts are people who have overcome the learning thresholds of a domain. Learning thresholds have several characteristic properties. They are transformative because they invoke a shift in understanding, values, feeling or attitudes. They are irreversible because once one has mastered a learning threshold one cannot later unlearn it. They are integrative because they establish connections with other concepts. But they are also troublesome because they might contradict prior beliefs or intuitions. Unfortunately, because learning thresholds are both integrative and irreversible, experts tend to forget what their thinking used to be like before they acquired the concepts. That means that experts often lack an understanding of the exact difficulties that novices are experiencing. Thus the notion of ‘learning threshold’ can be a learning threshold itself for teachers in training.

The purpose of this paper is to start a discussion within the FCA community about encountering learning thresholds within the teaching material of FCA. Once teachers have identified learning thresholds and possible misconceptions in a domain they can devote more time, materials and exercises to the teaching of difficult concepts. One possible outcome of such a discussion would be to establish a ‘concept inventory’ for FCA. Concept inventories are lists of learning thresholds for a domain. For example Almstrum et al. (2006) develop a concept inventory for discrete mathematics and at the same time describe the process of establishing such an inventory. Presumably the list of learning thresholds for the basic notions of FCA would not be long. But the questions are: what concepts belong to this list and why are they difficult to learn? This paper examines some concepts that might belong to the list of learning thresholds of FCA. Deciding on a list of learning thresholds is usually a community effort. Thus it is hoped that this paper will stimulate a discussion amongst FCA teachers about this topic.

The background for this paper was teaching a class on discrete structures (abbreviated as DS in this paper) to 70 first-year computer science students. The class was taught using a just-in-time teaching method. The lecturer took notes about any concepts that appeared difficult because many students asked about them, many students had problems with exercises relating to them or there was a particularly lengthy class discussion about them. The class covered the usual discrete structures topics (logic, sets, functions, relations, groups, graphs) and concluded with the topic of partially ordered sets and lattices, which was discussed in the last two 1.5 h class sessions. In addition to compiling a list of learning threshold candidates, an attempt was made to investigate why the concepts appeared to be difficult to learn.

In analysing the list of difficult concepts gathered from the DS class it appeared that students have a general misunderstanding of how mathematical concepts are to be used. This analysis which employs semiotic-conceptual analysis (SCA) according to Priss (2016) is described in the next section. Section 3 then uses a further semiotic analysis to identify learning thresholds related to line diagrams of partially ordered sets and concept lattices. The paper finishes with a concluding section.

2 The Notion of ‘Formal Concept’ is a Learning Threshold

This section introduces a slightly more general notion of ‘formal concept’ which is seen as the core building block of mathematics and the main reason for why mathematics can be difficult to learn. The definitions in this section are not strictly formal because they attempt to build a bridge between the normal FCA definitions and less formal notions of ‘concept’. A more formal description of SCA is provided by Priss (2016, 2017). The following notion of ‘open set’ is adapted from linguistics (Lyons 1968, p. 436).

Definition 1:

An open set is a set for which there is a precise method for determining whether an element belongs to it or not but which is too large to be explicitly listed and which does not have an algorithmic construction rule.

The sets of even numbers or of all finite strings are not open because they have algorithmic construction rules. Examples of open sets are the sets of prime numbers, the words of the English language and currently existing species of primates. The last two examples are finite open sets at any point in time even though there may be more elements added to them in the future. The reference to an ‘abstract idea’ in the next definition is non-formal but is added in order to avoid having any pair of sets being called ‘concept’. Whether something is a concept depends on the formal condition and on the informal property of someone considering it to be an abstract idea.

Definition 2:

(a) A concept is an abstract idea corresponding to a pair of two sets (extension and intension) which can be crisp, rough, fuzzy or open.

(b) A formal instance concept is a concept whose extension is finite and whose attributes are clearly determined by the extension.

(c) A mathematical concept is a concept for which a necessary and sufficient set of attributes in its intension can be identified which determine exactly whether an item is in the extension of the concept.

(d) A formal concept is a formal instance concept and/or a mathematical concept.

(e) An associative concept is a concept that is not formal.

Instance concepts arise, for example, when someone perceives an object or a set of objects. Depending on whether the attributes of that object are clear, an instance concept can be formal or associative. For example, a cat perceiving a mouse will most likely form an associative concept in its mind whereas a person thinking about the number 5 may be thinking of a formal concept. Values of variables in programming languages are formal instance concepts. Their extension is their value and their intension consists of the properties of the value such as its datatype. Any mathematical definition establishes a mathematical concept. Extensions and intensions of mathematical concepts can be open sets. For example the set of zeros of Riemann’s zeta function is currently open. For any given number it can be determined whether it belongs to the set but it is currently not possible to list the complete set. Although a mathematical concept has a set of necessary and sufficient attributes, the set of all of its attributes tends to be open. Mathematical concepts also occur in other scientific disciplines. For example, the concept of the plant genus ‘bellis’ (to which the common daisy belongs) has a necessary and sufficient definition. With modern genetic methods it can be clearly distinguished which plants belong to this genus. But it is not possible to list all species that belong to bellis because some may not yet have been discovered. It is also not possible to provide a definitive list of all attributes of bellis.

The notion of formal concepts in Definition 2 extends the normal FCA definition. Normal FCA formal concepts are both formal instance concepts and mathematical concepts. Intensions of FCA concepts are always sufficient. The subset of necessary attributes can be determined by the FCA methods of clarifying and reducing (Ganter and Wille 1999). For formal concepts a clear ordering can be defined either extensionally for formal instance concepts or intensionally for mathematical concepts. For mathematical concepts this corresponds to evaluating attribute implications.

Concepts formed when interpreting natural language tend to be associative and not formal. For example, it is impossible to provide a set of necessary and sufficient conditions for a concept such as ‘democracy’. Because cats have an associative concept of ‘mouse’ they might chase anything that is mouse-like but not a mouse. Even everyday concepts are difficult to define exactly and to delimit from other concepts. For example ‘chair’ is difficult to distinguish precisely from ‘armchair’, ‘recliner’ and ‘bench’. Cognitive theories such as Rosch’s (1973) prototype theory explain that the extensions and intensions of such concepts tend to be prototypical and fuzzy. While concepts are the core units of thought, signs are the core units of communication. The next definition is taken from Priss (2017). Again the reference to ‘unit of communication’ is non-formal but important in order to avoid having any kind of triple which fulfils the formal condition automatically being a sign.

Definition 3:

A sign is a unit of communication corresponding to a triple \((i, r, d)\) consisting of an interpretation i, a representamen r and a denotation d with the condition that i and r together uniquely determine d.

A sign always involves at least two layers of interpretation. One interpretation i is an explicit component of the sign, but there is also a second interpretation by someone who observes or participates in a communicative act. This observer decides what i, r and d are. This second level is not explicitly mentioned because it usually happens in the mind of a person. The same applies to FCA: there is always someone who creates a formal context but this is not explicitly mentioned. To provide an example of signs, the set of even numbers can be represented verbally as ‘set of even numbers’, as \(\{2, 4, 6, \ldots\}\) or as ‘\(\{n \in \mathcal{N} \mid n \equiv 0 \pmod{2}\}\)’. These are three different representamens \(r_1, r_2, r_3\) which are interpreted by mathematicians i to have the same denotation d. This results in three signs \((i,r_1,d)\), \((i,r_2,d)\) and \((i,r_3,d)\). Signs with different representamens and equal denotations are called strong synonyms. Thus the three signs in this example are strong synonyms. Students, however, might interpret some or all of these representamens incorrectly. Thus if sign use by students is not synonymous with sign use by teachers, this indicates that students have misunderstood something.
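The condition in Definition 3 and the notion of strong synonyms can be illustrated with a minimal Python sketch. The names `Sign`, `is_sign_set` and `strong_synonyms`, the string labels and the hypothetical student reading are assumptions made for illustration only and are not part of SCA.

```python
from collections import namedtuple

# A sign as a triple (interpretation, representamen, denotation).
# The field names mirror i, r and d; everything else is illustrative.
Sign = namedtuple("Sign", ["i", "r", "d"])

def is_sign_set(signs):
    """Check that each pair (i, r) determines exactly one denotation d."""
    seen = {}
    for s in signs:
        key = (s.i, s.r)
        if key in seen and seen[key] != s.d:
            return False
        seen[key] = s.d
    return True

def strong_synonyms(s1, s2):
    """Signs with different representamens and equal denotations."""
    return s1.r != s2.r and s1.d == s2.d

EVENS = "the set of even numbers"  # stand-in label for the denotation d

teacher_signs = [
    Sign("mathematician", "verbal description", EVENS),
    Sign("mathematician", "{2, 4, 6, ..}", EVENS),
    Sign("mathematician", "set-builder formula", EVENS),
]

# Hypothetical student who reads the listing as a finite set.
student_sign = Sign("student", "{2, 4, 6, ..}", "the finite set {2, 4, 6}")
```

In this sketch the three teacher signs are pairwise strong synonyms, whereas the student sign shares a representamen with one of them but has a different denotation, which is the kind of mismatch the text describes as a misunderstanding.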

It might be possible to connect Definitions 2 and 3 because denotations can be considered to be concepts. But representamens and interpretations can also be modelled as concepts. Furthermore, interpretations, representamens and denotations can also be signs and if a concept is represented in some form then it must be a sign itself. Thus the relationship between Definitions 2 and 3 is complex.

For some signs, the three components r, i and d are not totally independent of each other. For example, icons are signs where the representamen and the denotation are similar to each other in some respect such as a traffic sign for ‘bike path’ showing the picture of a bike. The signs for abstract associative concepts (such as ‘democracy’) tend to have three independent components. Because it is not possible to provide an exact definition of the denotation of democracy, the representamen is always needed when communicating this sign. It is also not possible to talk about democracy without explaining what interpretation is used. This is contrary to formal concepts. The set of even numbers can be defined as ‘\(\{n \in \mathcal{N} \mid n \equiv 0 \pmod{2}\}\)’ and discussed without calling it ‘set of even numbers’. Thus the definition of even numbers can be a representamen itself. The same holds for the formal instance concept with the extension \(\{1\}\). Both are examples of anonymous signs as defined below.

Definition 4:

An i-anonymous sign is a sign \((i, r, d)\) with \(r = d\). If i is clear, it can also be referred to as an anonymous sign.

Thus \((i, d, d)\) is a sign where the representamen equals the denotation. This is an extreme form of an icon. In mathematics it is often sufficient to assume a single general interpretation which consists of understanding the notation. For example, a whole textbook on mathematics might use a single interpretation. Mathematical variables tend to be just placeholders for their values and thus anonymous signs. Programming values are also anonymous. Variables in programming languages, however, are not anonymous signs because they change their values at run-time depending on the state (which can be considered an interpretation). The following conjecture is based on the idea that formal concepts are fully described by their definition. It may be convenient to give a name to a formal concept but it is not necessary to do so. Associative concepts do not have a precise definition. The relationship between the representamen that is used for an associative concept, its interpretation and its fuzzy definition provides further information. If an associative concept is to be communicated it requires a triadic sign. The representamen need not be a word. It could be a cat meowing in the vicinity of a fridge and looking at its owner in order to express ‘I am hungry’. But for an associative concept this triad cannot be reduced.

Conjecture 1:

(a) A formal concept can be used as an anonymous sign.

(b) The denotation of an anonymous sign corresponds to a formal concept.

Nevertheless because values of variables in programming languages are formal concepts one cannot conclude that signs with formal concepts as denotations are always anonymous signs. A claim of Priss (2016) is that the concepts underlying mathematical definitions are always formal but students often interpret them in an associative manner. Until students have a grasp of the nature of formal concepts, every mathematical concept is a learning threshold for them because they are using an inappropriate cognitive approach. This claim cannot be formally proven but there is evidence for it. For example, when asking students in the DS class what a graph is (according to graph theory), their first answer was that it is something that is graphically represented. But that is a prototypical attribute of graphs that is neither necessary nor sufficient. Thus such an answer is incorrect. A correct answer is easy to produce by stating the definition. Since the students were allowed to use the textbook for this answer, they could have just looked it up. An incorrect answer for such a simple question shows that the students have not yet grasped the nature of mathematical concepts and what it means to understand a mathematical concept.

As another example, when teaching FCA to linguists I have many times experienced at least one linguist objecting: ‘what you call a concept is not a concept’. Such a statement is mathematically nonsensical because a formal concept is exactly what it is defined to be. Whether it is called ‘concept’ or something else is just a convention. But for a (non-mathematically trained) linguist the notion ‘concept’ is associatively defined with some reference to abstract ideas. The rest of Definition 2 and in particular the notion that a concept is a pair of sets is meaningless for a linguist. From a mathematical viewpoint an appropriate question to discuss with linguists is whether the mathematical model suggested in the formal definition approximates the associative concepts that linguists have about concepts. But this discussion is only possible if the distinction between formal and associative concepts is clear.

Further evidence for the importance of formal concepts in mathematics comes from Moore (1994) and Edwards and Ward (2004) who highlight the role of formal definitions in university mathematics. This is in contrast to primary and secondary school where mathematics is often taught in an associative manner using practical examples and introductory exercises. First year university students are having difficulties with mathematics because they are relying too much on associations instead of concept definitions. Another source of evidence for a difference between associative and formal concepts is the work by Amalric and Dehaene (2016) who argue that expert mathematicians use different parts of their brains when they are listening to mathematical and non-mathematical statements. Unfortunately, the difference between associative and formal concepts also raises questions about whether concrete examples help or hinder the teaching of abstract mathematical ideas. Kaminski et al. (2008) started a debate on this topic which is still ongoing but there seems to be a consensus that transfer from concrete to abstract is not easy.

3 A Semiotic Analysis of Further FCA Learning Thresholds

The previous section argues that representamens and denotations coincide for mathematical concepts. Therefore analysing mathematical representamens coincides with analysing mathematical meanings. Mathematical concepts usually have many synonymous representamens. For example, Fig. 1 shows two different representamens of a set operation. From a semiotic-conceptual perspective it is of interest to analyse how structures amongst representamens relate to each other and to denotational structures. For example, in Fig. 1 the circles on the left correspond to the curly brackets on the right. On the left the intersection appears static; on the right it is more obvious that intersection is an operation. It is probably more apparent on the left that the sequence of elements in a set is not important. The right representamen might be misinterpreted by students as imposing a fixed sequence on set elements. The left representamen might be more difficult to read for students with mild dyslexia because there is no clear reading direction for an image. The signs on the left and right are synonyms. Although visualisations might seem more intuitive they tend to be less precise than formulas and include irrelevant information. For example the size of the circles on the left is irrelevant because it does not correspond to the sizes of the sets. Mathematicians tend to use visualisations frequently and, for example, scribble on paper or blackboards while they are thinking. Amalric and Dehaene (2016) show that the part of the brain that is used by mathematicians while thinking about mathematics is the one responsible for spatial tasks, numbers and formulas in non-mathematicians.

Fig. 1. Two different representamens for the same denotation

From a semiotic viewpoint there are two major research questions about mathematical representamens: (a) how the formal language of mathematics functions and (b) how denotational structures are preserved, highlighted or hidden in different kinds of representamens or translations between representamens. The idea that a semiotic analysis should focus on how structures are represented and translated is similar to Goguen’s (1999) algebraic semiotics. We suspect that one major difficulty in teaching mathematics is that teachers are used to the different types of representamens and know what to look for and what to ignore whereas students might misinterpret them. The next two sections provide a semiotic analysis of line diagrams of partially ordered sets and lattices, respectively, based on our experiences with the DS class.

3.1 Reading Line Diagrams of Partially Ordered Sets

In the 1990s Rudolf Wille’s research group organised workshops for non-mathematicians to learn FCA. As far as I remember, the workshops lasted for at least 3 hours and started with letting the workshop participants manually construct concept lattices from formal contexts by first generating a list of extensions (or intensions) and then creating the concept hierarchy from that list. It is quite possible that Wille’s teaching method would avoid some of the conceptual difficulties described in this paper. But most people probably encounter FCA first via line diagrams, which were also the starting point for FCA in the DS class. Furthermore, manual construction of lattices is time-consuming and requires learners to be highly motivated to learn FCA. Teaching by constructing examples also has limits. For example, because concept lattices are always complete, one cannot teach what it means for a lattice to be complete or incomplete using concept lattices.
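The two workshop steps (listing the extensions, then ordering them) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the toy context and all function names are hypothetical, and the intersection-based procedure is one standard way of listing extensions, not necessarily the procedure used in the workshops.

```python
# Hypothetical toy formal context: each object is mapped to its attributes.
context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies", "hunts"},
    "shark": {"swims", "hunts"},
}
objects = set(context)
attributes = set().union(*context.values())

def intent(objs):
    """Attributes shared by all objects in objs."""
    return {m for m in attributes if all(m in context[g] for g in objs)}

def extent(attrs):
    """Objects that have all attributes in attrs."""
    return {g for g in objects if attrs <= context[g]}

def all_extents():
    """Step 1: start with the set of all objects and close the
    attribute extents under intersection."""
    extents = {frozenset(objects)}
    for m in attributes:
        column = frozenset(extent({m}))
        extents |= {e & column for e in extents}
    return extents

# Each extension is paired with its intension to form a formal concept.
concepts = [(e, frozenset(intent(e))) for e in all_extents()]

# Step 2: the hierarchy; (A1, B1) <= (A2, B2) iff A1 is a subset of A2.
order = {(c1, c2) for c1 in concepts for c2 in concepts if c1[0] <= c2[0]}
```

For this toy context the sketch yields eight concepts, including a top concept with all objects and an empty intension and a bottom concept with an empty extension and all attributes.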

As mentioned in the introduction, Eklund et al. (2004) observe that users can interact with the line diagrams in their FCA software in order to conduct queries. Our experience with the DS class showed, however, that without some detailed instruction students employ incorrect interpretations when they first encounter line diagrams. There exists an overwhelming amount of research about the use of visualisation in learning and teaching of mathematics (Presmeg 2006). Nevertheless we have not been able to find any existing literature on the specifics of learning to use line diagrams. Before introducing lattices as a topic in the DS class, the students had already seen the visualisations shown in Fig. 2. The lattices on the left were introduced earlier in the semester as visualisations of power sets and divisors without mentioning lattices. At that point the students were asked to construct such lattices for the relations of subset and divides based on some examples. Thus the students had already constructed examples of lattices. On the right of Fig. 2 there are examples of a graph and a visualisation of an equivalence relation. These were visualisations used earlier in the semester which could potentially be a source of misconceptions about line diagrams because, for example, reflexivity and transitivity are explicitly represented in such diagrams but omitted in line diagrams.

Fig. 2. Different types of graphs: partially ordered sets, graph and equivalence relation
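The difference between a full ordering relation and the edges drawn in a line diagram can be made concrete with a small Python sketch. It computes the cover relation (the pairs with nothing strictly in between) for the subset ordering on a power set. The function names and the three-element base set are assumptions chosen for illustration.

```python
from itertools import combinations

def powerset(base):
    """All subsets of base as frozensets."""
    items = sorted(base)
    return [frozenset(c)
            for k in range(len(items) + 1)
            for c in combinations(items, k)]

def cover_relation(elements, leq):
    """Pairs (x, y) with x < y and no z strictly between them.
    These are the edges of a line diagram."""
    def lt(a, b):
        return a != b and leq(a, b)
    return {
        (x, y)
        for x in elements
        for y in elements
        if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements)
    }

def is_subset(x, y):
    return x <= y

elements = powerset({"a", "b", "c"})

# The full relation contains the reflexive and transitive pairs as well.
full_order = {(x, y) for x in elements for y in elements if is_subset(x, y)}

# The line diagram only draws the covering pairs.
edges = cover_relation(elements, is_subset)
```

Here the full subset relation has 27 pairs while the line diagram needs only 12 edges. The remaining pairs are implied by reflexivity and transitivity, which is exactly the information that is drawn explicitly in the other diagram types.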

The main source of misconceptions about line diagrams in the DS class, however, appeared to be vector addition (cf. Fig. 3). Students commented that a coordinate system seems to be missing from line diagrams. Because linear algebra is taught in German secondary schools but graph theory is not, vector spaces appear to be the primary visualisation model for the students. A major difference between line diagrams and vector addition visualisations is that line diagrams are discrete and represent exactly the elements that exist whereas the vector addition visualisation imposes a few lines on a continuous space. Thus it was at first difficult for the students to realise that whether an element is smaller than another element depends on the existence of lines in the line diagram whereas it depends on distances and coordinates for vectors. For example, when asked about the ordering of nodes in a line diagram one student used a ruler in order to measure distances. Other examples of prior knowledge which the students mentioned were class inheritance in Java and levels. Because class inheritance in Java forms a partially ordered set, this is probably a case of supportive prior knowledge. The notion of ‘levels’, however, is probably more distracting than helpful because levels in a partially ordered set change depending on whether they are counted from above or from below. Most likely asking about ‘levels’ is further evidence of an assumed underlying coordinate system and a misconception that the lengths of the edges or the absolute height of nodes in a line diagram carry meaning.

Fig. 3. Conflicting interpretations: line diagrams and vector addition
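The point that the ordering is read from the existence of lines and not from coordinates can be shown with a short Python sketch. The diagram, its node names and its drawing positions are hypothetical. The positions are stored only to show that they play no role in deciding the order.

```python
# Hypothetical line diagram: drawing positions (x, y) and upward edges.
positions = {"bottom": (0, 0), "a": (-1, 1), "b": (1, 3), "top": (0, 4)}
up_edges = {"bottom": ["a", "b"], "a": ["top"], "b": ["top"], "top": []}

def leq(x, y):
    """x <= y iff y can be reached from x by following edges upwards.
    The positions are deliberately not consulted."""
    stack, seen = [x], set()
    while stack:
        node = stack.pop()
        if node == y:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(up_edges[node])
    return False

def drawn_lower(x, y):
    """The misconception: comparing heights as if there were a coordinate system."""
    return positions[x][1] < positions[y][1]
```

In this example node `a` is drawn lower than node `b`, yet the two are incomparable because no upward path connects them. Measuring heights with a ruler would therefore give a wrong answer.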

The example of divisor lattices (in the centre of Fig. 3) appeared to be most intuitive for the students. They were familiar with the notions of greatest common divisor (gcd) and least common multiple (lcm). The students were able to read these and the division relation from the diagram and to create similar diagrams themselves. Nevertheless it is not so easy for first-year students to then perform the abstraction to lattices in general. Accepting that the two lattices on the left in Fig. 2 are both examples of a shared abstract idea is difficult. This involves understanding that the lines in line diagrams represent an ordering relation which in some examples corresponds to subset and in others to division. The transfer from finding gcds in the middle example in Fig. 3 to finding infima in the left example in Fig. 4 is another hurdle. The gcds can be found either by following lines or by calculating the numbers and then searching for the number in the diagram. It appears that some students (who may have forms of dyslexia) have difficulty reading line diagrams and finding infima without some aid such as tracing the lines with their fingers.
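The abstraction step from gcd and lcm to infimum and supremum can be sketched in Python. The functions below compute infima and suprema from the ordering relation alone. Applied to the divisors of 12 (an arbitrary example number chosen here) with the divides relation, they agree with gcd and lcm. All names are illustrative.

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def lcm(a, b):
    return a * b // gcd(a, b)

def infimum(elements, leq, x, y):
    """Greatest lower bound of x and y, or None if it does not exist."""
    lower = [z for z in elements if leq(z, x) and leq(z, y)]
    greatest = [z for z in lower if all(leq(w, z) for w in lower)]
    return greatest[0] if greatest else None

def supremum(elements, leq, x, y):
    """Least upper bound, obtained as the infimum in the reversed order."""
    return infimum(elements, lambda a, b: leq(b, a), x, y)

def divides(a, b):
    return b % a == 0

D = divisors(12)
```

For example, `infimum(D, divides, 4, 6)` gives 2 and `supremum(D, divides, 4, 6)` gives 12. The same two functions work unchanged for the subset ordering on a power set, where they return intersection and union. That is the shared structure the students are asked to recognise.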

In summary, it appears that line diagrams are not instantly intuitive to many users because they conflict with other types of graphical representations which the users already know. The main learning thresholds in this context appear to be the challenge of overcoming the misconception of line diagrams as embedded into vector spaces and the abstraction from the specific examples of divisor and powerset lattices to the underlying shared algebraic structure. Students with some form of dyslexia might find visualisations difficult to read in general.

3.2 Understanding Concept Lattices

After introducing line diagrams, the next steps in the DS class were to introduce lattices and then concept lattices. Priss (2016) collects a list of typical questions about and problems with lattices that users tend to have when they first encounter concept lattices. This list was also confirmed in the DS class:

  • What is the purpose of the top and bottom node?

  • Why are there unlabelled nodes?

  • How can the extensions and intensions be read from the diagram?

  • What is the relationship between nodes that do not have an edge between them but can be reached via a path?

  • What is a supremum or an infimum?

  • How can one tell whether it is a lattice?

The SCA analysis conducted by Priss (2016) resulted in the two lattices presented in Fig. 4. The left-hand side presents a lattice for the structures contained in line diagrams and the right-hand side a lattice for the concepts of lattice theory. The figure shows that a fair number of concepts are involved and that the mapping from line diagrams to lattices is reasonably complex. Ultimately, in order to answer questions such as ‘why are there unlabelled nodes’ one needs to know what nodes and edges are, how they can be traversed and how extensions and intensions are formed. One also needs to know what concepts, joins, meets, operators and sets are. In order to understand what a lattice is, one needs knowledge of all of the concepts in Fig. 4, an understanding of lattices as an abstraction and examples of partially ordered sets which are not lattices.

Fig. 4. Structures of line diagrams and concept lattices (Priss 2016)
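The question ‘How can one tell whether it is a lattice?’ can be answered mechanically for finite partially ordered sets. The following Python sketch checks that every pair of elements has a supremum and an infimum. The two example diagrams and all names are hypothetical and serve only to contrast a lattice with a partially ordered set that is not a lattice.

```python
def make_leq(covers):
    """Build the ordering from the edges of a line diagram (lower, upper).
    Assumes the diagram has no cycles."""
    def leq(x, y):
        return x == y or any(u == x and leq(v, y) for (u, v) in covers)
    return leq

def is_lattice(elements, leq):
    """Every pair needs a least upper bound and a greatest lower bound."""
    for x in elements:
        for y in elements:
            ups = [z for z in elements if leq(x, z) and leq(y, z)]
            lows = [z for z in elements if leq(z, x) and leq(z, y)]
            has_sup = any(all(leq(u, v) for v in ups) for u in ups)
            has_inf = any(all(leq(v, w) for v in lows) for w in lows)
            if not (has_sup and has_inf):
                return False
    return True

# A diamond-shaped diagram: this is a lattice.
diamond_nodes = ["bot", "a", "b", "top"]
diamond = {("bot", "a"), ("bot", "b"), ("a", "top"), ("b", "top")}

# a and b both lie below c and d, so a and b have no least upper bound.
crossed_nodes = ["bot", "a", "b", "c", "d", "top"]
crossed = {("bot", "a"), ("bot", "b"), ("a", "c"), ("a", "d"),
           ("b", "c"), ("b", "d"), ("c", "top"), ("d", "top")}
```

In the second diagram the upper bounds of `a` and `b` are `c`, `d` and `top`, and none of them lies below all the others. The check fails for that pair even though the diagram has a top and a bottom node.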

To conclude this section it should be stressed that a few students in the DS class appeared to have a very good grasp of lattices towards the end of the class sessions on this topic. Most of the students who passed the class achieved at least 75% of the points related to the exam question about partially ordered sets and lattices. Thus the misconceptions that students were initially having can be overcome. The students can learn to interpret line diagrams correctly after some time spent with reading, instruction and exercises. But some students are having difficulties with the required abstraction and with the visualisations. FCA contains several learning thresholds and is not instantly intuitive.

4 Conclusion

This paper discusses learning thresholds in the teaching materials of FCA. A major learning threshold that applies to all of mathematics is the realisation that mathematical concepts are very different from the associative concepts underlying natural language. A different cognitive strategy must be used for each of the two types of concepts. Visualisations such as line diagrams are helpful for users who understand them but cannot be assumed to be instantly intuitive for people who have never seen them before. It would be of interest to conduct a more comprehensive analysis of mathematical notation and visualisations from a semiotic-conceptual perspective. Most pedagogical studies tend to focus on particular examples. We are not aware of an existing larger-scale analysis of the semiotic structures of mathematical notation and visualisation.