1 Introduction

A Fuzzy Cognitive Map (FCM) is a method for modeling complex systems that utilizes existing knowledge and human experience. It has learning capabilities and characteristics that improve its structure and computational behavior [39, 44, 63]. It was introduced by Kosko [31] as an extension of cognitive maps [10], providing a powerful machinery for modeling dynamical systems. As a knowledge representation and reasoning technique, it depicts a system in a form that corresponds closely to the way humans perceive it. It can also incorporate experts’ knowledge and knowledge extracted from data in the form of rules [44, 63, 69, 71]. This approach represents knowledge by emphasizing causal connections and map structure.

The resulting fuzzy model is used to analyze, simulate, and test the influence of parameters and predict system behavior. The FCM model is easily understandable, even by a non-technical audience, and each parameter has a perceivable meaning [61].

Owing to their simplicity, their support for inconsistent knowledge, and their ability to represent circular causality in knowledge modeling and inference, FCMs have been applied in many diverse scientific areas, including engineering [79], medicine [55, 68], business [85], software engineering [36, 67], environmental sciences [29, 46], and politics [8]. Most of the applications concern knowledge modeling and decision making tasks (e.g. [1, 3, 4, 6, 8, 9, 12, 21, 23, 25, 30, 34, 37, 42, 47, 48, 50–53, 58, 61, 62, 68, 72, 74]).

A number of FCM modeling methodologies and FCM extensions have also been proposed [49]. These FCM-based approaches are either FCM extensions or enhanced FCM structures that inherit characteristics and advantages of other intelligent techniques. The current extensions are usually designed to address three FCM drawbacks [49]: uncertainty modeling (FGCM, iFCM, BDD-FCM, RCM), dynamic issues (DCN, DRFCM, FCN, E-FCM, FTCM, TQFCM), and rule-based knowledge representation (RB-FCM, FRI-FCM). Extending conventional FCMs thus appears to be a useful trend for overcoming their limitations.

The ability of FCMs to improve their operation in the light of experience (learning of the connection matrix) is a crucial issue in modeling. The adaptation of the connection matrix (also known as the weight matrix) can be carried out by diverse learning methods: unsupervised learning based on the Hebbian method [51–53, 57], supervised learning with evolutionary computation [5, 11, 17, 18, 59, 74–76], and gradient-based methods [38, 86]. In most known approaches to learning FCMs, the set of concept labels \(C\) is provided a priori by experts, and only the weight matrix is learned from raw data.

This chapter presents methods and learning algorithms for FCM-based modeling. It shows that FCMs are useful for exploiting the knowledge and experience that humans have accumulated over years of operating a complex system. It also shows how FCM-based methods and their learning capabilities have been used in decision analysis and decision support research. These methodologies and algorithms support engineers who aim to construct intelligent decision support systems, since the more intelligent a system is, the more symbolic and fuzzy representation it utilizes [25, 70, 79].

2 Theoretical Background

A Fuzzy Cognitive Map combines fuzzy logic and cognitive mapping, and it is a way to represent knowledge of systems that are characterized by uncertainty and complex processes. FCMs were introduced by Kosko [31, 32] and have since gradually emerged as a powerful paradigm for knowledge representation [66]. They provide a more flexible and natural mechanism for knowledge representation and reasoning, which are essential to intelligent systems [40, 55, 64, 80, 81].

An FCM consists of factors (concepts/nodes), which represent the important elements of the mapped system, and directed arcs, which represent the causal relationships between the factors. The directed arcs are labeled with fuzzy values in the interval \([0, 1]\) or \([-1,+1]\) that show the strength of impact between the concepts. The fuzzy part allows degrees of causality, represented as links between the concepts of these diagrams. This structure establishes the forward and backward propagation of causality, allowing the knowledge base to grow as concepts and links between them are added.

Each of FCM’s edges is associated with a weight value that reflects the strength of the corresponding relation. This value is usually normalized to the interval \([0,1]\) or \([-1,+1]\). The matrix \(E\) stores the weights assigned to the pairs of concepts. We assume that the concepts are indexed by subscripts \(i\) (cause node) and \(j\) (effect node).

In the simplest case, it is possible to distinguish Binary Cognitive Maps (BCM), for which the concept labels are mapped to binary states denoted as \(A_i \in \{0, 1\}\), where the value \(1\) means that the concept is activated. The weights of a BCM are usually mapped to a crisp set, i.e., \(e_{ij} \in \{-1, 0, 1\}\). The value \(1\) represents positive causality, understood, for example, in the sense that the activation (change from \(0\) to \(1\)) of concept \(c_i\) occurs concurrently with the activation of concept \(c_j\), or that the deactivation (change from \(1\) to \(0\)) of \(c_i\) occurs concurrently with the deactivation of \(c_j\). The value \(-1\) represents the opposite situation, in which the activation of \(c_i\) deactivates \(c_j\) or vice versa. The value \(e_{ij} = 0\) means that there are no concurrently occurring changes in the states of the two concepts. In FCMs, each node quantifies the degree to which the corresponding concept in the system is active at a given iteration step.

Usually, experts develop an FCM or a mental model manually, based on their knowledge of the related area. First, they identify the key domain aspects, namely the concepts. Second, each expert identifies the causal relationships among these concepts and estimates their strengths. The resulting digraph (FCM) shows not only the components and their relationships but also the strengths of those relationships (Fig. 1).

Fig. 1 A simple FCM with five generic vertices (\(F_1\)–\(F_5\)) and weighted edges showing the relationships between concepts

Once the FCM is constructed, it can receive data from its input concepts, perform reasoning and infer decisions as values of its output concepts [32, 79].

3 Fuzzy Cognitive Map Reasoning

The FCM reasoning process usually relies on a simple mathematical formulation. The value of concept \(C_i\) at step \(k\) is denoted by \(A_i(k)\), and the state of the whole FCM is described by the state vector \(A(k) = [A_1(k),\ldots ,A_n(k)]\), which represents a point within the fuzzy hypercube \(I^n = [0,1]^n\) that the system reaches at that step.

Starting from an input vector \(A(0)\), the whole system traces a trajectory within the multidimensional space \(I^n\), which can gradually converge to an equilibrium point, a periodic attractor, or a chaotic attractor within the fuzzy hypercube. Which attractor the system converges to depends on the value of the input vector \(A(0)\).

The value \(A_i\) of each concept \(C_i\) at step \(k+1\) is calculated by adding its previous value \(A_i(k)\) to the sum, over the cause nodes \(C_j\), of the products of the value \(A_j(k)\) and the weight \(e_{ji}\) of the cause-effect link from \(C_j\) to \(C_i\), and then passing the result through a threshold function. The mathematical representation of FCMs has the following form:

$$\begin{aligned} A_i (k+1) = f\left( A_i (k) + \sum _{j=1}^N A_j (k) \cdot e_{ji}\right) \end{aligned}$$
(1)

where \(f(\cdot )\) is a threshold (activation) function [82, 83]. The sigmoid threshold function gives concept values in the range \([0,1]\), and its mathematical form is:

$$\begin{aligned} f(x) = \frac{1}{1+e^{-m \cdot x}} \end{aligned}$$
(2)

where \(m\) is a real positive number and \(x\) is the value \(A_i(k)\) at the equilibrium point [79, 82]. A concept is turned on (activated) or off by setting its vector element to 1 or 0, or to an intermediate value in \([0,1]\). The transformation function reduces the unbounded weighted sum to a fixed range, which hinders quantitative analysis but allows qualitative comparisons between concepts [79].

New state vectors showing the effect of the activated concept are computed by the method of successive substitution, i.e., by iteratively multiplying the previous state vector by the relational matrix using standard matrix multiplication and applying the threshold function, \(A^k = f(A^{k-1} + (A^{k-1} \cdot W))\), where \(W\) is the weight matrix. The iteration stops when a limit vector is reached, i.e., when \(A^k = A^{k-1}\) or when \(|A^k - A^{k-1}| \le e\), where \(e\) is a residual whose value depends on the application type (in most applications it is equal to \(0.001\)). Thus, a final vector \(A_f\) is obtained, in which the decision concepts are assessed to clarify the final decision of the specific decision support system.
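As an illustration, the reasoning loop of Eqs. (1) and (2) with the stopping rule just described can be sketched in a few lines of Python. This is a minimal sketch, not the authors' implementation: the three-concept weight matrix, the initial state, the function names, and the use of the largest componentwise change as the distance measure are all assumptions made for the example.

```python
import math


def sigmoid(x, m=1.0):
    """Sigmoid threshold function of Eq. (2); m sets the steepness."""
    return 1.0 / (1.0 + math.exp(-m * x))


def fcm_step(state, weights, m=1.0):
    """One application of Eq. (1).

    weights[j][i] holds e_ji, the influence of cause concept j on effect
    concept i. The sum runs over all j, as Eq. (1) is printed; with a zero
    diagonal the self-term contributes nothing.
    """
    n = len(state)
    return [
        sigmoid(state[i] + sum(state[j] * weights[j][i] for j in range(n)), m)
        for i in range(n)
    ]


def fcm_run(initial_state, weights, m=1.0, eps=0.001, max_iter=100):
    """Iterate until the largest change in any concept value is <= eps."""
    state = list(initial_state)
    for step in range(1, max_iter + 1):
        new_state = fcm_step(state, weights, m)
        if max(abs(a - b) for a, b in zip(new_state, state)) <= eps:
            return new_state, step
        state = new_state
    return state, max_iter


# Hypothetical three-concept map (values chosen only for illustration).
example_weights = [
    [0.0, 0.6, -0.4],
    [0.0, 0.0, 0.7],
    [0.3, 0.0, 0.0],
]
final_state, steps = fcm_run([0.8, 0.2, 0.5], example_weights)
```

With weights this small the update is a contraction, so the loop stops well before `max_iter`. Other weight matrices may instead cycle or behave chaotically, in which case the loop simply returns the last state after `max_iter` steps.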

4 FCM for Decision Support

Real-world problems are not static: the environment changes continuously while decision makers attempt to make a choice, and it also changes as a result of those choices. In fact, most real-world decision making is dynamic. Critical decisions in finance, sales, engineering, manufacturing, and other fields require interrelated, resource-constrained decisions in highly complex and uncertain environments.

Overall, decision support involves selecting, from several strategies, the optimal one for reaching the goals. The risks and uncertainties associated with each alternative form a set of constraints that influence this process [7]. Real-world issues often consist of several elements that are interrelated in complex ways. In addition, they are frequently dynamic, since they evolve over time through the interactions among their elements [63].

Intelligent decision support systems (DSSs) often incorporate Artificial Intelligence (AI) techniques for knowledge representation and rule-based inference. They have resulted from the use of AI techniques to improve the performance of more traditional systems, with these techniques applied in DSS knowledge bases and inferential procedures [47].

One promising tool for modeling and controlling complex systems is the FCM, which has emerged as an alternative tool for representing and analyzing system behavior. FCMs capture different aspects of the system’s behavior as concepts, and these concepts interact with each other, showing the dynamics of the system.

The main goal of building an FCM around a problem is to be able to forecast the outcome by letting the relevant issues interact with one another. In this sense, it can be used for finding out whether a decision is consistent with the whole set of stated causal assertions [63]. FCM application may contribute to the effort toward more intelligent control methods and the development of autonomous decision making systems.

By using FCMs for decision support, we also get the following benefits [63]:

  • Simplicity. By transforming decision problems into causal graphs, decision makers with no technical background can easily understand all of the components in a given problem and their relationships.

  • Simulation and prediction. With FCMs, it is possible to determine the most critical factor that appears to affect the target variable and to simulate its impact.

  • Timeliness. By relying on FCM models, the decision maker has a strong support, and hence is able to decide faster.

  • Reliability. By relying on FCM models from a reputable source, decision makers have the guarantee, or the expectation, that it was built with all the required care, including extensive testing and some validation techniques.

  • Investment. FCM models are a way to preserve the know-how and ingenuity of the best decision makers, turning a volatile asset into a permanent one.

  • Efficiency. Decision makers can aim at the best decisions in their fields of excellence, and for the remainder rely on someone else’s expertise modeled in FCMs. In this sense, FCM models could be an efficiency trigger.

  • Visual modeling. FCMs provide an intuitive, yet precise way of representing concepts and reasoning about them at their natural level of abstraction.

In addition, FCMs represent knowledge efficiently, handle fuzziness, model situations that include uncertain descriptions, adapt to different situations, and are flexible in accommodating new knowledge.

5 FCM Models/Methodologies

5.1 Rule-based FCMs

Rule-based Fuzzy Cognitive Maps (RB-FCM) are an evolution of FCMs that covers several types of interrelations, not just monotonic causality [15, 16]. RB-FCMs represent the dynamics of complex real-world qualitative systems with feedback and allow the simulation of events by modeling their impact on the system.

RB-FCMs are iterative fuzzy rule-based systems that use fuzzy mechanisms to deal with feedback, and they add timing mechanisms and new methods for uncertainty propagation. RB-FCMs propose additional types of relations between concepts, such as causal, inference, alternatives, probabilistic, opposition, and conjunction relations. Moreover, they include a new fuzzy operation (Fuzzy Carry Accumulation) to model qualitative causal relations (Fuzzy Causal Relations) (Figs. 2 and 3).

Fig. 2 Rule-based fuzzy cognitive maps, illustrated with a pair of nodes (\(c_1\) and \(c_3\)) and an RB-FCM relationship between them. Fuzzy rules and a defuzzification process compute the new state of \(c_3\)

Fig. 3 The three kinds of relationships in an FGCM. The relationship between \(x_2\) and \(x_3\) is a white one, that between \(x_1\) and \(x_2\) is a grey one, and that between \(x_3\) and \(x_1\) is a black one. FCMs represent only white relationships

In addition, RB-FCMs represent time in different ways. The RB-FCM modeler must be able to identify the implicit time in each relationship. Base Time (B-Time) represents the highest level of temporal detail that a simulation can provide in the RB-FCM model (the resolution of the simulation). B-Time must always be implicit in the design of each rule in an RB-FCM, because a rule means something different when B-Time is one day than when it is one year.

5.2 Dynamical Cognitive Networks

Dynamical Cognitive Network (DCN), proposed by [40], improves FCM by quantifying the concepts and introducing nonlinear, dynamic functions to the edges. Therefore, DCNs are able to model the dynamic nature of causal processes and perform sensible inference robustly.

DCN relies on the Laplacian framework to represent the causal relationships. The transformation between fuzzy knowledge and Laplacian functions demands additional effort from DCN modelers. Each DCN node (concept) has its own value set, chosen according to how accurately it needs to be represented.

In this sense, DCNs are more flexible and scalable than conventional FCMs. A DCN can be as simple as a Cognitive Map or an FCM, or as complex as a nonlinear dynamic system. DCNs consider three causal inference factors: the value of the cause, the value of the causal relationship, and the degree of the effect.

The value set of the DCN (\(\varPhi _G\)) is the product space of the spaces \(\varPhi _v\), \(v \in G\), where \(\varPhi _v\) is the value space of a concept \(v\) contained in \(G\). It is defined as follows:

$$\begin{aligned} \varPhi _G&= \displaystyle \prod _{v\in G} \varPhi _v \nonumber \\&= \{x|x=(x_1,\ldots ,x_n)^T, \; x_i \in \varPhi _{v_i} \; i=1,\dots , n\} \end{aligned}$$
(3)

where \(G\) is the digraph representing the DCN. The value set of a concept \(v\) is an ordered set denoted by \(\varPhi _v\); every element of the set is a possible state of the concept.

Every DCN concept has its own value set (a binary set, a triple set, a fuzzy set, or a real interval) according to its properties. Conventional FCMs, in contrast, do not handle dynamics.
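For concepts with finite value sets, the product space of Eq. (3) can be enumerated directly. The sketch below is only an illustration: the three concepts and their value sets are hypothetical, and a concept whose value set is a real interval could not be enumerated this way.

```python
from itertools import product

# Hypothetical per-concept value sets: a binary set, a triple set,
# and an ordered set of linguistic levels.
value_sets = {
    "v1": (0, 1),
    "v2": (-1, 0, 1),
    "v3": ("low", "medium", "high"),
}

# Each element of the product space is one possible state
# x = (x_1, ..., x_n) of the whole network, with x_i taken from
# the value set of concept v_i.
network_states = list(product(*value_sets.values()))
```

The number of network states is the product of the sizes of the individual value sets, here 2 · 3 · 3.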

5.3 Fuzzy Grey Cognitive Maps

The Fuzzy Grey Cognitive Map (FGCM) is an FCM-based generalization designed for environments with high uncertainty and with discrete, incomplete, and small data sets [65]; it is based on Grey Systems Theory. The FGCM nodes are variables, and the relationships between them are represented by grey weighted directed edges. An interval grey weight between the nodes \(x_i\) and \(x_j\) is denoted as \(\otimes w_{ij} \in [\underline{w}_{ij}, \overline{w}_{ij}]\), with a lower limit \((\underline{w}_{ij})\) and an upper limit \((\overline{w}_{ij})\). FGCMs represent human intelligence better than FCMs do, because they can represent unclear relations between nodes and incomplete information about the modeled system.

The state values of the nodes are updated in an iterative process with an activation function, which maps the grey node value monotonically into a normalized range [65].

$$\begin{aligned} \otimes \mathbf{{C}} (t+1)&= f\bigg (\otimes \mathbf{{C}} (t) \cdot A(\otimes )\bigg ) \nonumber \\&= f\bigg ( \otimes \mathbf{{C}}^* (t+1) \bigg )\nonumber \\&= f\bigg ( ( \otimes c_1^* (t+1), \otimes c_2^* (t+1), \ldots , \otimes c_n^* (t+1) )\bigg ) \nonumber \\&= \bigg ( f( \otimes c_1^* (t+1)), f(\otimes c_2^* (t+1)), \ldots , f(\otimes c_n^* (t+1))\bigg ) \nonumber \\&= \bigg ( \otimes c_1 (t+1), \otimes c_2 (t+1), \ldots , \otimes c_n (t+1)\bigg ) \end{aligned}$$
(4)

where \(A(\otimes )\) is the grey adjacency matrix, and \(f(\cdot )\) the grey activation function. Usually, the grey activation function is a unipolar grey sigmoid

$$\begin{aligned} \otimes w_i (t+1) \in \left[ \frac{1}{1+e^{-\lambda \cdot \underline{w}_i^* (t+1)}}, \frac{1}{1+e^{-\lambda \cdot \overline{w}_i^* (t+1)}} \right] \end{aligned}$$
(5)

or a grey hyperbolic tangent

$$\begin{aligned} \otimes w_i (t+1) \in \left[ \frac{e^{\lambda \cdot \underline{w}_i^* (t+1)}-e^{-\lambda \cdot \underline{w}_i^* (t+1)}}{e^{\lambda \cdot \underline{w}_i^* (t+1)}+e^{-\lambda \cdot \underline{w}_i^* (t+1)}}, \frac{e^{\lambda \cdot \overline{w}_i^* (t+1)}-e^{-\lambda \cdot \overline{w}_i^* (t+1)}}{e^{\lambda \cdot \overline{w}_i^* (t+1)}+e^{-\lambda \cdot \overline{w}_i^* (t+1)}} \right] \end{aligned}$$
(6)
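One grey update step of Eq. (4) with the unipolar grey sigmoid of Eq. (5) can be sketched as follows. The sketch assumes that grey numbers are plain closed intervals stored as `(lower, upper)` tuples and combined with standard interval arithmetic; the function names and the two-node example are hypothetical.

```python
import math


def grey_mul(a, b):
    """Interval product: the extremes lie among the four bound products."""
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))


def grey_add(a, b):
    """Interval sum: add lower bounds and upper bounds separately."""
    return (a[0] + b[0], a[1] + b[1])


def grey_sigmoid(g, lam=1.0):
    """Unipolar grey sigmoid of Eq. (5), applied to each bound.

    The sigmoid is monotonically increasing, so the order of the
    bounds is preserved.
    """
    return (
        1.0 / (1.0 + math.exp(-lam * g[0])),
        1.0 / (1.0 + math.exp(-lam * g[1])),
    )


def fgcm_step(state, adjacency, lam=1.0):
    """One step of Eq. (4); adjacency[i][j] is the grey weight from i to j."""
    n = len(state)
    new_state = []
    for j in range(n):
        total = (0.0, 0.0)
        for i in range(n):
            total = grey_add(total, grey_mul(state[i], adjacency[i][j]))
        new_state.append(grey_sigmoid(total, lam))
    return new_state


# Hypothetical two-node map: one grey (interval) state and one crisp state.
grey_state = [(0.4, 0.6), (0.5, 0.5)]
grey_adjacency = [
    [(0.0, 0.0), (0.3, 0.7)],
    [(-0.5, -0.2), (0.0, 0.0)],
]
next_grey_state = fgcm_step(grey_state, grey_adjacency)
```

A crisp value is simply an interval whose bounds coincide, so a conventional FCM state can be passed through the same code unchanged.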

5.4 Intuitionistic FCMs

Intuitionistic FCMs (iFCM) address the inability of FCM models to take into account the hesitancy introduced into the modeled problems by imperfect facts, indecision, and lack of information [49]. The iFCM proposal has been shown to be effective in numeric, reproducible examples of process control and decision support.

iFCMs use Intuitionistic Fuzzy Sets (IFS) to handle the experts’ hesitancy in their judgements. They improve on conventional FCMs through intuitionistic theory, modeling the degree of hesitancy in the relations defined by the experts (Fig. 4).

Fig. 4 A relation between a pair of nodes (\(x_1\) and \(x_2\)) in iFCM-II. Each node has an impact weight and a hesitancy weight

The experts propose the cause-effect relations between two concepts, and the degree to which the expert hesitates to express that relation. IFS is a generalization of conventional fuzzy sets since the IFS membership is a fuzzy logical value rather than a single truth value.

The iFCM-I proposal considers only the hesitancy in the influence between a pair of concepts, whereas iFCM-II also introduces hesitancy in the determination of concept values [49]. The hesitancy of an element \(x\) of a fuzzy set \(A\) is defined from its membership degree \(\mu _A(x)\) and non-membership degree \(\gamma _A(x)\) as follows

$$\begin{aligned} \pi _A(x) = 1 - \mu _A(x) - \gamma _A(x) \end{aligned}$$
(7)

The iFCM-I iterative reasoning process is computed as follows

$$\begin{aligned} c_i (t+1) = f\bigg ( (2 \cdot c_i (t)) + \sum _{j=1}^{n} ((2 \cdot c_j (t)) \cdot \zeta _{ji} \cdot w_{ji}^\mu \cdot (1- w_{ji}^\pi )) \bigg ) \end{aligned}$$
(8)

where \(c_i \in [0,1]\), \(i = 1,\ldots , n\), are the real node values at iteration \(t\); \(w_{ji}^\mu \in [0,1]\) and \(w_{ji}^\pi \in [0,1]\) are the impact weight and the hesitancy weight; and the factor \(\zeta _{ji}\) models the sign (positive or negative) of the impact between the related concepts.

iFCM-II considers that nodes \(i=1,\ldots , n\) are modeled with linguistic variables represented by IFSs as follows

$$\begin{aligned} L_n^{c_i} = \{\langle x, v_i^{\mu }(x), v_i^{\gamma }(x) \rangle \; | \; x \in E^+ \} \end{aligned}$$
(9)

5.5 Dynamic Random Fuzzy Cognitive Maps

Dynamic Random Fuzzy Cognitive Maps (DRFCM) improve conventional FCMs by adding the nodes’ activation probability and including a nonlinear dynamic function in the inference process [2]. The main focus of the DRFCM proposal is on dynamic causal relationships. The edge weights are updated during the FCM dynamics to adapt them better to new conditions. DRFCMs thus support on-line adaptive procedures, as required by real-world problems.

The node’s state in the DRFCM (the probability of activation of a given concept \(c_j\)) is computed as follows

$$\begin{aligned} p_j = \min \{\varphi ^+_j, \max \{r_j, \varphi ^-_j\} \} \end{aligned}$$
(10)

where

$$\begin{aligned} \varphi ^+_j = \max _{i=1\ldots n} \{\min \{q_i, w^+_{ij}\}\} \end{aligned}$$
(11)
$$\begin{aligned} \varphi ^-_j = \max _{i=1\ldots n} \{\min \{q_i, w^-_{ij}\}\} \end{aligned}$$
(12)
$$\begin{aligned} r_j = \max _{i=1\ldots n} \{ w^+_{ij}, w^-_{ij} \} \end{aligned}$$
(13)

where \(r_j\) is the firing rate, and \(w_{ij}\) represents the influence of node \(c_i\) on node \(c_j\). If the relationship between the two nodes is direct, then \(w^+_{ij} >0\) and \(w^-_{ij}=0\). If the relationship is inverse, then \(w^-_{ij} >0\) and \(w^+_{ij}=0\). Finally, if there is no relationship between them, then \(w^+_{ij}=w^-_{ij}=0\).
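Eqs. (10) to (13) translate almost literally into code. The sketch below follows the equations as printed, using \(r_j\) from Eq. (13) inside Eq. (10). The list `q` is assumed to hold the values \(q_i\) that appear in Eqs. (11) and (12), and the three-node weights are hypothetical.

```python
def activation_probability(j, q, w_pos, w_neg):
    """Activation probability p_j of node j, following Eqs. (10)-(13).

    q[i] is the value q_i of node i; w_pos[i][j] and w_neg[i][j] are the
    direct and inverse weights from node i to node j.
    """
    n = len(q)
    phi_pos = max(min(q[i], w_pos[i][j]) for i in range(n))      # Eq. (11)
    phi_neg = max(min(q[i], w_neg[i][j]) for i in range(n))      # Eq. (12)
    r_j = max(max(w_pos[i][j], w_neg[i][j]) for i in range(n))   # Eq. (13)
    return min(phi_pos, max(r_j, phi_neg))                       # Eq. (10)


# Hypothetical example: node 0 influences node 1 directly (0.6),
# node 2 influences node 1 inversely (0.4).
q_values = [0.8, 0.3, 0.5]
w_positive = [[0.0, 0.6, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
w_negative = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.4, 0.0]]
p_1 = activation_probability(1, q_values, w_positive, w_negative)
```

For node 1 in this example, \(\varphi ^+_1 = 0.6\), \(\varphi ^-_1 = 0.4\), and \(r_1 = 0.6\), so \(p_1 = \min \{0.6, \max \{0.6, 0.4\}\} = 0.6\).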

Fig. 5 The interactive operation of an FCN-based system. The experts provide information about the structure and the initial weights of the FCN. The desired values represent the system’s goals

5.6 Fuzzy Cognitive Networks

Fuzzy Cognitive Networks (FCNs) are an extension of FCMs [13, 33]. The edge weights are updated in each iteration, providing quicker and smoother convergence. FCNs store previously encountered operational situations in a fuzzy rule database, which avoids intensive interference with the real-world system during updating.

FCNs always reach equilibrium points when they use continuous, differentiable, sigmoid-like activation functions with non-expansive (or even contractive) properties.

The FCN adjacency matrix is extracted from historical data of the physical system. Moreover, FCNs are in continuous interaction with the system they model. Their main contribution is the updating mechanism, which receives feedback from the real-world system and stores the acquired knowledge throughout the system dynamics (Fig. 5).

The FCN’s updating process takes into account feedback node states from the real-world system. The proposed updating rule is based on the conventional delta rule as follows

$$\begin{aligned} \begin{array}{rl} \delta _j (k) &{} = c_j^{system}(k) - c_j^{FCN}(k)\\ &{} = c_j^{system}(k) - \Bigg (1+ e^{-\big (\textstyle \sum _{i=1,i\ne j}^n c_i^{system}(k) \cdot w_{ij}(k)+c_j^{system}(k)\big )}\Bigg )^{-1} \\ \end{array} \end{aligned}$$
(14)
$$\begin{aligned} w_{ij}(k) = w_{ij}(k-1) + a \cdot \delta _j(k-1) \cdot (1-\delta _j(k-1))\cdot c_j^{FCN}(k-1) \end{aligned}$$
(15)

where \(\delta _j(k)\) is the error at iteration \(k\), \(a\) is the learning rate (usually set at \(a = 0.1\)), and \(c_i^{FCN}(k)\) is the response of node \(i\) of the FCN at iteration \(k\), when the nodes take their state values from the system’s feedback.
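The error of Eq. (14) and the weight update of Eq. (15) can be sketched as two small functions. The code follows the equations as printed; the function names and the two-node example values are hypothetical, and the response term passed to the update is whichever FCN response Eq. (15) calls for.

```python
import math


def fcn_error(j, system_state, weights):
    """Error delta_j of Eq. (14) and the FCN response it is based on.

    system_state[i] is the fed-back state of node i from the real system;
    weights[i][j] is the weight of the edge from node i to node j.
    """
    n = len(system_state)
    net = sum(system_state[i] * weights[i][j] for i in range(n) if i != j)
    net += system_state[j]
    fcn_response = 1.0 / (1.0 + math.exp(-net))
    return system_state[j] - fcn_response, fcn_response


def update_weight(w_prev, delta_prev, response_prev, a=0.1):
    """Delta-rule update of Eq. (15) with learning rate a."""
    return w_prev + a * delta_prev * (1.0 - delta_prev) * response_prev


# Hypothetical two-node example with all-zero feedback and weights.
delta_0, response_0 = fcn_error(0, [0.0, 0.0], [[0.0, 0.0], [0.0, 0.0]])
new_weight = update_weight(0.2, delta_0, response_0)
```

In the example the net input is zero, so the FCN response is 0.5 and the error is −0.5; the weight therefore moves from 0.2 to 0.2 + 0.1 · (−0.5) · 1.5 · 0.5 = 0.1625.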

5.7 Evolutionary Fuzzy Cognitive Maps

Evolutionary Fuzzy Cognitive Maps (E-FCM) simulate concept states in real time [14]. They have been used to model complex and dynamic causally related context variables. An E-FCM models every temporal state value, called the Evolving State, during the running process.

Node states evolve in real time, based on their internal states, external assignments, and even external causalities. The nodes update their internal states asynchronously, with a tiny mutation probability. The causal relationship \(E\) represents the strength and probability of the causal effect between a pair of nodes. This proposal considers two kinds of system uncertainty, fuzziness and randomness, as follows

$$\begin{aligned} E = [W, S, P_m] \end{aligned}$$
(16)

where \(W\) is a vector of relationship weights, \(S\) is a vector of the signs of the causal relationships, \(P_m\) is a vector of the probabilities of the causal edges, and \(m\) is the number of edges.

The E-FCM causal weights can be computed as the statistical correlation of the input data (changes in the presynaptic nodes) and output data (changes in the postsynaptic nodes) if training datasets are available.

$$\begin{aligned} w_{ij} = \frac{Cov(c_i,c_j)}{\sqrt{var(c_i) \cdot var(c_j)}} \end{aligned}$$
(17)

where \(var(c_i)\) is the variance of the changes in the node state \(c_i\), and \(Cov(c_i,c_j)\) is the co-variance of the changes in node state \(c_i\) and the changes in node state \(c_j\).
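Eq. (17) is the correlation coefficient of the step-to-step changes in two node states, which can be computed from recorded time series as in the following sketch. The function names and the sample series are hypothetical, and population (rather than sample) variance is assumed; the choice cancels out in the ratio.

```python
import math


def changes(series):
    """Step-to-step changes of a recorded node state."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def causal_weight(series_i, series_j):
    """Weight w_ij of Eq. (17): correlation of the changes in c_i and c_j."""
    d_i, d_j = changes(series_i), changes(series_j)
    n = len(d_i)
    mean_i, mean_j = sum(d_i) / n, sum(d_j) / n
    cov = sum((x - mean_i) * (y - mean_j) for x, y in zip(d_i, d_j)) / n
    var_i = sum((x - mean_i) ** 2 for x in d_i) / n
    var_j = sum((y - mean_j) ** 2 for y in d_j) / n
    if var_i == 0.0 or var_j == 0.0:
        raise ValueError("a node whose changes are constant has no defined correlation")
    return cov / math.sqrt(var_i * var_j)


# Hypothetical recorded states: c_j rises and falls with c_i,
# while c_k moves in the opposite direction.
c_i = [0.1, 0.3, 0.2, 0.6]
c_j = [2.0 * value for value in c_i]
c_k = [1.0 - value for value in c_i]
w_ij = causal_weight(c_i, c_j)
w_ik = causal_weight(c_i, c_k)
```

Changes that move together give a weight close to +1 and changes that move in opposite directions give a weight close to −1, which matches the sign convention for causal weights in \([-1,+1]\).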

The updating rule is computed as follows

$$\begin{aligned} \begin{array}{rl} \varDelta c_i(t+T)&{} = f\Big (k_1 \cdot \sum _{j=0}^n \varDelta c_j(t) \cdot w_{ij} + k_2 \cdot \Delta c_i(t)\Big ) \\ c_i(t+T) &{} = c_i(t) + \Delta c_i(t+T) \end{array} \end{aligned}$$
(18)

where \(T\) is the time for concept i to update its value (Evolving Time schedule), and \(k_1\) and \(k_2\) are two weight constants.

E-FCM allows a different update time schedule for each node, that is, an asynchronous update of the concepts’ states. As a result, nodes can evolve in a dynamic and probabilistic way.

5.8 Fuzzy Time Cognitive Maps

Fuzzy Time Cognitive Maps (FTCM) are an FCM extension that includes time in the edges between nodes [56]. FTCMs model the delay in the influence of the presynaptic node on the postsynaptic one. The relationship between a pair of nodes has two values, the conventional weight and the time lag.

$$\begin{aligned} \varpi = \{w_{ij},t_{ij}\} \;| \; t_{ij} \ge 1 \end{aligned}$$
(19)

FTCM introduces value-preserving dummy nodes to translate an FTCM with arbitrary time delays into one with unit-time delays (Fig. 6). In addition, it allows the model dynamics of the FTCM and the FCM to be compared in order to analyze the effects of time delays on the system.

Fig. 6 An FTCM with time delays (top) and its translation into a unit-time FTCM with dummy nodes
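The translation into unit-time delays can be sketched as a graph rewrite: an edge with time lag \(t_{ij}\) becomes a chain of \(t_{ij}\) unit-delay edges passing through \(t_{ij} - 1\) dummy nodes. The sketch rests on assumptions that the text does not spell out: the edges between dummy nodes carry weight 1 so that the value is passed on unchanged, the original weight \(w_{ij}\) sits on the last edge of the chain, and the naming scheme for dummy nodes is invented for the example.

```python
def expand_delays(edges):
    """Rewrite (source, target, weight, lag) edges as unit-delay edges.

    Returns a list of (source, target, weight) tuples. An edge with lag 1
    is kept as it is; longer lags get lag - 1 dummy nodes in between.
    """
    unit_edges = []
    for source, target, weight, lag in edges:
        if lag < 1:
            raise ValueError("time lags must be at least 1")
        previous = source
        for step in range(1, lag):
            dummy = f"{source}->{target}#{step}"  # hypothetical naming scheme
            # Assumption: weight 1 carries the value forward unchanged.
            unit_edges.append((previous, dummy, 1.0))
            previous = dummy
        unit_edges.append((previous, target, weight))
    return unit_edges


# Hypothetical map: A affects B after three time units, B affects C after one.
unit_time_edges = expand_delays([("A", "B", 0.5, 3), ("B", "C", -0.2, 1)])
```

For the dummy nodes to be truly value-preserving, they would also need an identity activation rather than the sigmoid used for ordinary concepts; that detail is outside this sketch.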

5.9 Fuzzy Rules Incorporated with Fuzzy Cognitive Maps

Fuzzy Rules Incorporated with FCMs (FRI-FCM) extends conventional FCMs by inheriting the rule-based representation of RB-FCMs to describe the causality underlying the modeled systems from a connected point of view [72]. FRI-FCM translates the reasoning mechanism of conventional FCMs into a set of fuzzy IF-THEN rules.

The FRI-FCM proposal is a four-layer fuzzy neural network designed to enhance the capability of conventional FCMs to automatically identify membership functions and quantify the causalities from raw data [72].

FRI-FCM makes comprehensive use of the dimensional data underlying the input state vectors and avoids the troublesome degradation of fuzzy rule activations as the input dimension increases [47].

5.10 Fuzzy Cognitive Maps Extensions Comparison

Table 1 shows the advantages and disadvantages of each FCM modeling method and the domain for which it is suggested in decision support. To this end, we propose the following kinds of domains:

Type I: Dynamic systems with uncertainties and/or time delays.

Type II: Extremely uncertain environment.

Type III: Human decision making oriented.

Type IV: Real-time systems and control.

As a result, DCN, DRFCM, FCN, and FTCM are suitable for Type I domains, where the environments are dynamic and may include time delays. FGCM and iFCM are better for Type II domains, where the real world has a high level of uncertainty. For Type III domains, the best approaches are RB-FCM, FGCM, iFCM, and FRI-FCM, and for Type IV domains, E-FCM is the best modeling option.

Table 1 FCM extensions comparison

6 Learning Algorithms for FCMs

The learning approaches for FCMs concentrate on learning the connection matrix \(E\), i.e., the causal relationships (edges) and their strengths (weights), based on expert intervention and/or the available historical data. According to the type of knowledge available, the learning techniques can be categorized into three groups: Hebbian-based, population-based, and hybrid, the last combining the main aspects of Hebbian-based and evolution-based learning algorithms [45].

They were compared in a recent review [45], which described their main features and pinpointed the degree of success of each one. Since that review, however, new learning methodologies have emerged and been investigated for constructing FCMs, especially from data.

The following three subsections describe the algorithms in each of the three groups and also present new learning algorithms for evolutionary-based and hybrid techniques, together with their application domains. At the end of this section, the main advantages and disadvantages of each learning category are described, showing the appropriateness of each one according to the problem domain.

6.1 Hebbian-based Methods

Dickerson and Kosko were the first to propose a simple Differential Hebbian Learning (DHL) method [19, 20], which is based on Hebbian theory [26]. During DHL learning, the values of the weights are iteratively updated until the desired structure is found. In general, the weights in the connection matrix are modified only when the corresponding concept value changes. The main drawback of this learning method is that the formula updates the weight between each pair of concepts by taking into account only these two concepts and ignoring the influence of the other concepts.
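The text above does not give the update formula. A commonly cited form of the DHL rule changes \(e_{ij}\) only when the cause concept \(i\) changes, moving the weight toward the product of the changes in the two concepts. The sketch below assumes that form and a constant learning rate of 0.1; both are assumptions made for illustration and should be checked against [19, 20].

```python
def dhl_update(weights, previous_state, state, rate=0.1):
    """One DHL-style update of the connection matrix.

    weights[i][j] is e_ij, the weight from cause i to effect j.
    Assumed rule: e_ij <- e_ij + rate * (dA_i * dA_j - e_ij) if dA_i != 0,
    otherwise e_ij is left unchanged.
    """
    n = len(state)
    delta = [state[i] - previous_state[i] for i in range(n)]
    updated = [row[:] for row in weights]
    for i in range(n):
        if delta[i] == 0:
            continue  # the cause concept did not change: keep its row
        for j in range(n):
            if i != j:
                updated[i][j] = weights[i][j] + rate * (
                    delta[i] * delta[j] - weights[i][j]
                )
    return updated


# Hypothetical two-concept example: only concept 1 changes.
start_weights = [[0.0, 0.3], [0.5, 0.0]]
after_update = dhl_update(start_weights, [0.0, 0.0], [0.0, 1.0])
```

The code makes the stated drawback visible: the update of `updated[i][j]` reads only the changes in concepts `i` and `j` and ignores every other concept.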

An improved version of DHL, the Balanced Differential Algorithm (BDA), was introduced by Huerga [28]. This algorithm eliminates one of the limitations of the DHL method by taking into account all the concept values that change at the same time when updating the weights. More specifically, it takes into consideration changes in all concepts if they occur at the same iteration and have the same direction. However, it was applied only to binary FCMs, which limits its application areas.

One year later, Papageorgiou and her colleagues introduced two unsupervised Hebbian-based learning algorithms, Active Hebbian Learning (AHL) and Nonlinear Hebbian Learning (NHL), which iteratively adjust FCM weights; in both, the learning of FCMs relies mainly on experts’ intervention [12, 48, 50–52, 55]. In the NHL approach, experts are required to suggest the nodes that are directly connected, and only these edges are modified during learning.

The experts have to indicate the sign of each edge according to its physical interpretation, and only the non-zero edges are updated. The experts also have to define the decision concepts and specify the range of values that these concepts can take. The validation is based on checking whether the model state satisfies these constraints. In a nutshell, the NHL algorithm yields a model that retains the initial graph structure imposed by the expert(s), and it therefore requires human intervention before the learning process starts.

In the AHL approach [52], experts determine the desired set of concepts, the initial structure, and the sequence of activation concepts. A seven-step AHL procedure, which is based on Hebbian learning, is used iteratively to adjust the weights until predefined stopping criteria are satisfied. This approach exploits the determination of the sequence of activation concepts.

Later, Stach and coworkers [76] proposed an improved version of the NHL method, called Data-Driven Nonlinear Hebbian Learning (DD-NHL), which is based on the same learning principle as NHL. However, it takes advantage of historical data (a simulation of the actual system) and uses output/decision concepts to improve the learning quality. An empirical comparative study has shown that, if historical data are available, the DD-NHL method produces better FCM models than those developed using the generic NHL method.

6.2 Population-based Methods

In the case of population-based algorithms, the experts are replaced by historical data, and learning or optimization algorithms are used to estimate the entries of the connection matrix \(E\). The population-based learning algorithms are usually oriented towards finding models that mimic the input data. They are optimization techniques and, for this reason, they are computationally quite demanding. Several population-based algorithms have been proposed for training FCMs, such as evolutionary strategies [34], genetic algorithms [23, 74], the real-coded genetic algorithm (RCGA) [73–75], Swarm Intelligence [43], Chaotic Simulated Annealing [4], Tabu search [6], game-based learning [37], Ant Colony Optimization [21], the extended Great Deluge algorithm [89], and Big Bang-Big Crunch [87].
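The general pattern shared by these methods, namely generating candidate connection matrices and keeping those whose simulated behavior best matches the historical data, can be illustrated with a small sketch. It is a plain elitist mutation search, not an implementation of RCGA or of any other cited algorithm. The sigmoid inference rule with a self-memory term, the weight layout `weights[j][i]` and all function names and parameter values are assumptions made for illustration.

```python
import math
import random


def simulate(weights, initial_state, steps):
    """Iterate an FCM with a sigmoid threshold (assumed inference rule).

    weights[j][i] is the influence of concept j on concept i.
    """
    n = len(initial_state)
    states = [list(initial_state)]
    for _ in range(steps):
        prev = states[-1]
        nxt = []
        for i in range(n):
            total = prev[i] + sum(prev[j] * weights[j][i] for j in range(n) if j != i)
            nxt.append(1.0 / (1.0 + math.exp(-total)))
        states.append(nxt)
    return states


def data_error(weights, observed):
    """Mean absolute difference between simulated and observed sequences."""
    simulated = simulate(weights, observed[0], len(observed) - 1)
    n = len(observed[0])
    total = sum(
        abs(s - o)
        for sim, obs in zip(simulated[1:], observed[1:])
        for s, o in zip(sim, obs)
    )
    return total / (n * (len(observed) - 1))


def evolve(observed, generations=200, offspring=20, sigma=0.1, seed=0):
    """Elitist mutation search for a weight matrix that mimics the data."""
    rng = random.Random(seed)
    n = len(observed[0])
    best = [[0.0 if i == j else rng.uniform(-1, 1) for j in range(n)] for i in range(n)]
    best_err = data_error(best, observed)
    for _ in range(generations):
        for _ in range(offspring):
            # Gaussian mutation of every off-diagonal weight, clipped to [-1, 1].
            child = [
                [
                    0.0 if i == j else max(-1.0, min(1.0, best[i][j] + rng.gauss(0, sigma)))
                    for j in range(n)
                ]
                for i in range(n)
            ]
            err = data_error(child, observed)
            if err < best_err:
                best, best_err = child, err
    return best, best_err


# Historical data generated from a known map, then relearned from data alone.
true_weights = [[0.0, 0.6], [-0.4, 0.0]]
observed = simulate(true_weights, [0.2, 0.7], 5)
learned, err = evolve(observed, generations=50)
```

Every candidate requires a full simulation of the map, which is why such methods are computationally demanding. The learned matrix may also differ from `true_weights` even when the error is small, since only the simulated behavior is being matched.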

Owing to the need for new approaches to the automated generation of fuzzy cognitive maps from historical data, some innovative and promising learning algorithms have been proposed recently. For example, an Ant Colony Optimization (ACO) algorithm was presented for learning FCM models from multiple observed response sequences [21]. Experiments on simulated data suggest that the ACO-based FCM learning algorithm is capable of learning FCMs with at least 40 nodes. The performance of the algorithm was tested on both a single response sequence and multiple response sequences. In these experiments, the ACO algorithm outperformed RCGA, NHL and DD-NHL in terms of model error and SS mean measures when multiple response sequences were used in the learning process [21].

Also, a new learning algorithm called Big Bang-Big Crunch was proposed for the automated generation of Fuzzy Cognitive Maps from data [87]. Two real-world examples, namely a process control system and a radiation therapy process, and one synthetic model were used to demonstrate the effectiveness and usefulness of the proposed methodology.

Moreover, the evolutionary mechanism of Cellular Automata (CA) was used to learn the connection matrix of an FCM [18]. One-dimensional cellular automata were used to encode the weight parameters, and the cellular states were chosen within the range \([0, 1]\) to form a cell space. In order to guide the optimization direction effectively and accelerate convergence, a mutation operator was added to the algorithm. This approach was applied to short-term stock prediction. The data came from the Shanghai Securities Exchange and covered the period from 2002-02-27 to 2002-06-20; 52 days were used for training and the rest for testing. However, the experimental analysis showed that the system error fluctuated randomly, which indicates that the evolution of the CA did not converge.

A new adaptation algorithm for FCM design and optimization, the Self-Organizing Migration Algorithm (SOMA), was proposed by Vascak [84]. It was compared with other methods, namely particle swarm optimization, simulated annealing, and active and nonlinear Hebbian learning, in target-catching experiments intended for future use in robotic soccer. The results showed advantageous characteristics of the proposed method, which may also be useful in other application domains.

Moreover, supervised learning using a gradient method was proposed by Yastrebov and Piotrowska [86]; it modifies the weights in the direction of steepest descent of an error function. Although this gradient-based method seems promising, it needs further theoretical foundation and experimental analysis.

Little research has been done on goal-oriented analysis with FCMs. A methodology for decision support was suggested that uses an immune algorithm to find the initial state of the system for a given goal state. The proposed algorithm treats the error objective function and the constraints as the antigen; through genetic evolution, the antibody that best fits the antigen becomes the solution [35].

6.2.1 Evolutionary Approaches for Prediction Tasks

The prediction of multivariate time series is one of the targeted applications of evolutionary fuzzy cognitive maps (FCM). The objective of the research presented in [22] was to construct an FCM model of prostate cancer using real clinical data and then to apply this model to the prediction of a patient’s health state. Due to the requirements of the problem, an improved evolutionary approach for learning the FCM model was proposed. The focus of the new method was to improve the effectiveness of long-term prediction [22]. The evolutionary approach was verified experimentally using real clinical data acquired over a period of two years. A preliminary pilot evaluation study with 40 male patients suffering from prostate cancer was carried out. The in-sample and out-of-sample prediction errors were calculated, and their reduced values supported the proposed approach for long-term prediction.

In the theoretical part, to address these requirements of the medical problem, a multi-step enhancement of the evolutionary algorithm used to learn the FCM was introduced. The advantage of using this method was justified theoretically and then verified experimentally [41].

6.2.2 Learning Approaches for Classification Tasks

Papakostas et al. [55] implemented FCMs for pattern recognition tasks. In their study, a new hybrid classifier was proposed as an alternative classification structure, which exploited both neural networks and FCMs to ensure improved classification capabilities. A simple GA was used to find a common weight set for which the hybrid classifier settles at different equilibrium points for different initial states of the input concepts. Recently, Papakostas et al. [54] presented some Hebbian-based approaches for pattern recognition, showing the advantages and limitations of each one.

Another very challenging learning category, which has emerged recently and has also been applied to classification tasks, is ensemble learning [49]. This method inherits the main ideas of ensemble-based learning approaches, such as bagging and boosting. In FCM ensemble learning, the model is first trained using the nonlinear Hebbian learning (NHL) algorithm, and its performance is then enhanced using ensemble techniques. The FCM ensembles are built from FCMs produced by the established and efficient data-driven NHL algorithm. Applied to a case study on the identification of autism, the FCM ensemble approach achieved higher classification accuracy than the NHL learning technique alone.

6.3 Hybrid Learning Methods

In this type of FCM learning methodology, the learning goal is to modify/update the weight matrices based on initial experience and historical data in a two-stage process. The algorithms proposed in the literature target different application requirements and try to overcome some limitations of FCMs. Little literature exists in this direction [42, 88]. Papageorgiou and Groumpos [42] proposed, for the first time, a hybrid learning scheme composed of Hebbian-type and differential evolution algorithms and showed its applicability to real-world decision-making problems.

Later, Ren [60] presented a hybrid FCM learning method combining NHL and the Extended Great Deluge Algorithm (EGDA). This hybrid learning approach has the efficiency of NHL and the global optimization ability of EGDA. The FCM is first trained with NHL in order to obtain a set of weights close to the optimal structure, and then the model is optimized with EGDA for error minimization. The results were obtained on a simple FCM structure; therefore, experiments on more complex structures are needed to validate this type of learning.

Another hybrid scheme using RCGA and the NHL algorithm was presented by Zhu and Zhang [88] and investigated in a partner selection problem. Their algorithm inherits the main features of both learning techniques, the population-based RCGA and the Hebbian-type NHL, thus combining expert and data input. Although the first results are encouraging, more research is needed.

6.3.1 Learning Algorithms Advantages and Limitations

Most of the FCM learning algorithms are devoted to FCM modeling and optimization. They adapt the weight matrix using the available knowledge from experts and/or historical data. After training, the produced FCM model follows the characteristics of the system or problem. In the case of evolutionary computation techniques, the FCM design is based on the minimization of an error/cost or fitness function. The fitness function for each population-based algorithm might be modified according to the problem type [45]. Some studies have shown that the population-based algorithms increase FCM functionality and robustness and have generalization abilities [45].

Table 2 describes the main applications of FCM learning algorithms, which concern modeling/design, optimization, prediction and decision support. The main domains of each FCM-based application are also listed in Table 2.

Each learning category has its advantages and limitations, which make it appropriate for specific types of problems according to the availability of data and knowledge. Table 3 gathers the most significant advantages and limitations of each learning category and highlights the usefulness of each one for FCM modeling/design and optimization.

Table 2 Learning algorithms for FCM modeling, optimization, prediction and decision support
Table 3 FCM learning comparison

One challenge is to develop efficient semi-automated algorithms, such as hybrid and ensemble learning algorithms, to address the limitations/problems presented in Table 3. The semi-automated methods are preferred if some structural constraints have been imposed on the map by the experts. The hybrid learning approaches, which are based on the functionalities of Hebbian and population-based learning algorithms and inherit the advantages and disadvantages of both, exhibit fewer limitations, as most limitations can be overcome by fusing the two computational methods. Thus, they could be more advantageous for modeling complex systems and systems that evolve over time.

If the only criterion is the quality of the model’s dynamic behavior, then fully automated genetic optimization seems to be the best solution [74, 75, 78]. The population-based methods have wider applicability due to their ability to learn FCMs from multiple observed response sequences; they are able to predict time series, classify patterns, simulate chaotic behavior, model evolving virtual systems, etc. [45].

The main drawback of the population-based learning methods is that they provide solutions that are hard or impossible to interpret and which may lead to incorrect static analysis. Some experimental results that concern both static and dynamic properties of the FCMs learned with the genetic-based methods are promising [73].

7 Example Case Study on FCM Design Using Learning Methods

An example process control problem, described in [51], was selected to show the effectiveness of each learning algorithm for FCM design. We concentrate our presentation of results on FCM design and adaptation methods using the learning algorithms described previously. This process control problem is a well-established case study, as it has been used by most researchers to experimentally analyze and test their suggested learning techniques. A brief description of how this FCM model can be used to perform various analysis and simulation tasks, in order to obtain useful knowledge about the system being modeled, is presented in [73].

This chemical process problem consists of a tank with three valves that influence the amount of liquid within the tank. Valve 1 and valve 2 empty two different kinds of liquid into the tank, and during the mixing of the two liquids a chemical reaction takes place. Valve 3 opens when a proper mixing of the two liquids is accomplished.

A gauge sensor located inside the tank measures the specific gravity of the produced liquid. The desired quantity of the liquid within the tank depends (a) on the value of the specific gravity G, which should range between a minimum (\(G_{min}\)) and a maximum value (\(G_{max}\)), and (b) on the height H of the liquid within the tank, which should also range within a minimum \(H_{min}\) and a maximum value \(H_{max}\). In this case process control aims to maintain the values of G and H within the desired ranges, which are:

$$\begin{aligned} 0.74 \le G \le 0.80 \end{aligned}$$
(20)
$$\begin{aligned} 0.68 \le H \le 0.74 \end{aligned}$$
(21)

The FCM model for this system involves the following five concepts:

Concept 1 (\(c_1\)):

represents the amount of liquid as measured by its height H within the tank; it depends on the operational state of valves 1, 2 and 3.

Concept 2 (\(c_2\)):

represents the state of valve 1; it may be closed, open or partially open.

Concept 3 (\(c_3\)):

represents the state of valve 2; it may be closed, open or partially open.

Concept 4 (\(c_4\)):

represents the state of valve 3; it may be closed, open or partially open.

Concept 5 (\(c_5\)):

represents the specific gravity G of the liquid within the tank.

Since the monitored parameters of this problem are the specific gravity and the height, the decision concepts (DC) of the cognitive map are the concepts \(c_1\) and \(c_5\). These concepts are connected as illustrated in the form of a graph shown in Fig. 7, which also shows the weights associated with directed connections between all pairs of the concepts.

Fig. 7 The tank control model
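As a small illustration of how the control objective in Eqs. (20) and (21) relates to the decision concepts, the check below tests whether a given map state keeps H (concept \(c_1\)) and G (concept \(c_5\)) inside their desired ranges. It assumes that the concept values are expressed on the same scale as the ranges; the function name and the list layout of the state are hypothetical.

```python
G_RANGE = (0.74, 0.80)  # Eq. (20): specific gravity G, carried by concept c5
H_RANGE = (0.68, 0.74)  # Eq. (21): liquid height H, carried by concept c1


def decision_concepts_ok(state):
    """Return True if both decision concepts lie in their desired ranges.

    state is assumed to be [c1, c2, c3, c4, c5].
    """
    h, g = state[0], state[4]
    return H_RANGE[0] <= h <= H_RANGE[1] and G_RANGE[0] <= g <= G_RANGE[1]


inside = decision_concepts_ok([0.70, 0.50, 0.50, 0.50, 0.78])
outside = decision_concepts_ok([0.60, 0.50, 0.50, 0.50, 0.78])
```

Here `inside` is `True` and `outside` is `False`, because a height of 0.60 lies below the lower bound of Eq. (21).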

We use the process control system in Fig. 7 to investigate the quality of models learned using the state-of-the-art learning methods which exist in the literature. The usage of different FCM development methods may result in different maps. Following that, we focus on the central theme of this chapter which is the design of FCM-based models, defining concepts relevant to a given system and weighted connections (weights) between the selected concepts.

More specifically, the following learning approaches have been implemented to learn this FCM model: AHL, NHL, NHL-DE, DD-NHL, RCGA, PSO, ACO, memetic PSO, and Big Bang-Big Crunch. These nine learning methodologies were tested experimentally on this process control problem. The simulation results of the conventional FCM model that describes the process control were used as data to learn the FCM model [74].

Since the RCGA and the other evolutionary-based methods were initialized with 100 randomly generated maps, whereas the three Hebbian-based methods use just a single map, the experiments for all Hebbian-based methods were repeated 100 times using the 100 initial maps generated for the RCGA method. The final output was selected as the map that provides simulations with either the lowest simulation-error value or the minimum cost function.

Table 4 presents a summary of the results for the nine learning approaches applied in this process control problem. Both matrix-error and calculation-error (concerning either simulation error or cost/fitness function minimization) were calculated to quantify the quality with respect to both the static and the dynamic analysis. The average values together with the corresponding standard deviations (shown in brackets) are reported.

$$\begin{aligned} \text{Matrix-error} = \frac{1}{N\cdot (N-1)} \sum _{i=1}^{N} \sum _{j=1}^{N} \left| w_{ij}^{estimated} - w_{ij}^{real} \right| \end{aligned}$$
(22)

where \(w_{ij}^{estimated}\) and \(w_{ij}^{real}\) are the estimated weights after learning and the real (initial) weights, respectively. The simulation error is computed as follows

$$\begin{aligned} \text{Simulation-error} = \frac{1}{N\cdot (K-1)} \sum _{k=1}^{K-1} \sum _{j=1}^{N} \left| DC_{j}^{estimated}(k) - DC_{j}^{real}(k) \right| \end{aligned}$$
(23)

where \(DC_{j}^{estimated}(k)\) and \(DC_{j}^{real}(k)\) are the estimated and real values of the decision concepts (\(DC\)) at iteration \(k\), \(K\) is the number of available iterations to compare and \(N\) is the number of concepts. The cost function is calculated as follows

$$\begin{aligned} \text{Cost-function} = \frac{1}{N\cdot (K-1)} \sum _{k=1}^{K-1} \sum _{j=1}^{N} \left( DC_{j}^{estimated}(k) - DC_{j}^{real}(k) \right) ^{2} \end{aligned}$$
(24)
Table 4 Applied learning methods in a case study problem
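The three quality measures of Eqs. (22) to (24) can be computed as in the following sketch. It assumes that the matrix error sums absolute weight differences and that each sequence passed to the simulation-error and cost functions already contains the \(K-1\) compared iterations, one list of \(N\) values per iteration; the function names are hypothetical.

```python
def matrix_error(estimated, real):
    """Eq. (22): mean absolute weight difference over the N*(N-1) possible edges."""
    n = len(real)
    total = sum(abs(estimated[i][j] - real[i][j]) for i in range(n) for j in range(n))
    return total / (n * (n - 1))


def simulation_error(estimated_seq, real_seq):
    """Eq. (23): mean absolute difference over all compared iterations."""
    steps, n = len(real_seq), len(real_seq[0])
    total = sum(
        abs(e - r)
        for est, rea in zip(estimated_seq, real_seq)
        for e, r in zip(est, rea)
    )
    return total / (n * steps)


def cost_function(estimated_seq, real_seq):
    """Eq. (24): mean squared difference over all compared iterations."""
    steps, n = len(real_seq), len(real_seq[0])
    total = sum(
        (e - r) ** 2
        for est, rea in zip(estimated_seq, real_seq)
        for e, r in zip(est, rea)
    )
    return total / (n * steps)


m_err = matrix_error([[0.0, 0.5], [0.3, 0.0]], [[0.0, 0.4], [0.5, 0.0]])
s_err = simulation_error([[0.5, 0.7]], [[0.4, 0.9]])
cost = cost_function([[0.5, 0.7]], [[0.4, 0.9]])
```

For these toy inputs the matrix error and the simulation error are both about 0.15 and the cost is about 0.025. The first measure reflects the static structure of the map and the other two reflect its dynamic behavior, so they need not agree.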

It is observed that the weight matrix-error values obtained with maps learned using the population-based methods are substantially lower than the errors of AHL, NHL and DD-NHL. This is due to the small deviations of the learned weights around their initial values. ACO, PSO and Big Bang-Big Crunch have shown better performance regarding the quality of the produced maps than the RCGA and Hebbian-based models. In the case of AHL, the matrix error increases because all the zero weights are updated to new non-zero values.

The simulation-error values obtained with maps learned using the Hebbian-based methods are higher than the errors of the population-based approaches. In spite of the relatively different connection matrices generated by the population-based algorithms, the minimized error/cost or fitness function value is very small (it equals \(0.001\) if we consider only the final state). This observation can be generalized, based on experiments reported in the literature (e.g. reference [77]), to the statement that structurally different maps can generate very similar simulations. The population-based approaches find a number of suboptimal solutions, which in the case of PSO and memetic PSO can be large. This is acceptable given how evolutionary approaches operate in optimization tasks.

Summarizing, population-based methods outperform Hebbian-based methods in terms of model error. The evolutionary learning algorithms use fitness functions for design and optimization tasks, and thus define the problem constraints more efficiently in the learning approach.

8 Conclusions and Future Directions

FCM-based methods and computational learning algorithms have attracted attention in recent years for modeling and decision support tasks. Many successful applications in diverse domains clearly indicate the effectiveness of these methodologies and learning techniques devoted to FCM modeling and decision making.

A considerable number of methodologies and learning approaches for FCMs have resulted from several years of research and are explored in this chapter. They emerged to eliminate the drawbacks and limitations of conventional FCMs, thus improving and automating FCM modeling and construction. Proposing powerful and efficient FCM-based methodologies and learning algorithms for modeling the complex and difficult tasks of evolving systems remains a research challenge.

Models developed by experts are vulnerable to the subjectivity of the experts’ beliefs and can be difficult to develop for complex problems that involve dozens of concepts. Although maps developed by experts provide an accurate static analysis of the FCM model, they may lead to inaccurate dynamic analysis.

These limitations motivated researchers towards the use of learning algorithms that would provide models with a more accurate representation of the FCM system. The hybrid learning approaches seem to be more feasible than the Hebbian-type or population-based approaches for designing an FCM, and this is a promising direction for FCM learning. Therefore, new approaches are required for training FCMs effectively, which will increase their potential application in practice.

Recent interest in new FCM methodologies and computational algorithms suggests that these will be the main direction for future research. Even though the first steps towards the automatic construction of FCMs from data have been made, there are problems that still need to be overcome.

Although there are many recent attempts at modeling and learning of FCMs, the application of the FCM technique to a wide variety of scientific areas makes it crucial to develop a commonly used tool that can assist the creation and simulation of FCMs. Few attempts have been made towards the creation of such a tool, e.g. FCMapper (http://www.fcmappers.net/), jfcm (http://jfcm.megadix.it/), the FCM designer tool, and the FCModeler tool.

Although many scientists construct their own FCMs using experts’ knowledge and experience, there is no solid, standard representation that would make them easily reusable and transportable. There is no standard software or programming tool for simulating these FCMs, so every scientist has to create their own program and software system for simulating and analyzing FCMs.

The development of such a software system would be very useful to the research community because it could assist the creation and simulation of dynamic FCMs, facilitate the sharing and reuse of knowledge, and include learning algorithms and theoretical foundations for dynamic behavior as well.

Finally, more research is needed on modeling, learning, automatic construction, knowledge representation and software tools development. New theoretical contributions could be included in current or emerging FCM extensions as well.