
1 Introduction

The Spiking Neural Network (SNN)—the “third generation” of Artificial Neural Network (ANN) [47]—overcomes shortcomings of earlier ANN generations in terms of biological plausibility, energy efficiency, and computational power, and it can mimic the human brain to a great extent. SNN is biologically plausible [58, 86], energy-efficient when simulated on neuromorphic hardware [20, 49, 86], and computationally powerful [19, 48], as it can approximate the same functions that other generations of ANN compute with far fewer spiking neurons [48]. SNN primarily differs from traditional ANN in information coding, synapse model, and neuron model. Traditional ANN uses rate coding, which is considered unlikely; strong arguments against the use of rate coding in the human brain are provided by Thorpe et al. [65, 83]. SNN instead uses the precise timing of spikes as information, called temporal coding, akin to the human brain [8, 11, 60]. There exist single as well as multiple synapse models in SNN, implemented using the concept of kernel functions. In addition, SNN mainly uses the Leaky-Integrate-and-Fire (LIF) model [1, 7, 44, 72, 73, 85], the Hodgkin-Huxley model [29,30,31,32], the Spike Response Model (SRM) [18, 19, 43], and the Izhikevich model [37] as neuron models. On the other hand, traditional ANN generally refers to neuron models as activation functions. There exist various linear as well as non-linear activation functions for traditional ANN, the most popular being the logistic or sigmoid activation function [25], ReLU [57], and softmax [23].

SNN can be defined by a finite set N of spiking neurons, a finite set \(S \subseteq N \times N\) of synapses establishing connections between elements of N, synaptic weights \(w_{ij} \in \mathrm{I\!R}\) between two neurons i and j, a response function \(\varPsi _{ij}: \mathrm{I\!R} \rightarrow \mathrm{I\!R}\) between i and j (where \((i, j) \in S\)), and a spike firing threshold \(\varTheta \). Spiking neurons mimic biological neurons, where the relevant information between two neurons is carried through the synapse(s) connecting them in the form of short electrical pulses with an amplitude of about 10 mV and a duration of about 1 ms [19, 60], called action potentials or spikes. The synaptic terminals are the junctions where the exchange of information between two biological neurons takes place through the receptive fields by the diffusion of neurotransmitters, and spiking neurons mimic a simple form of this concept mathematically. Generally, the information-sending neuron is called the presynaptic neuron, and the information-receiving neuron is called the postsynaptic neuron. When presynaptic neuron(s) send information to a postsynaptic neuron, the internal state of the postsynaptic neuron changes. At rest, in the absence of input stimuli, the membrane potential of a biological neuron remains at about –65 mV to –70 mV [19, 60]. Upon receiving input stimuli from presynaptic neuron(s), the membrane potential of the postsynaptic neuron, called the Postsynaptic Potential (PSP), may increase or decrease according to the synapse model. An excitatory synapse model increases the PSP value, which is called the Excitatory Postsynaptic Potential (EPSP) [19, 49, 86], and an inhibitory synapse model decreases the PSP value, which is called the Inhibitory Postsynaptic Potential (IPSP) [19, 49, 86]. Note that a postsynaptic spiking neuron issues a spike only when its PSP reaches a certain threshold, and not at each propagation cycle like traditional ANN. The typical threshold value in a biological neuron is about –55 mV [19, 49].
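To make the threshold behaviour above concrete, the following minimal Python sketch integrates weighted presynaptic spikes into a membrane potential that leaks back towards rest and emits a spike once the threshold is crossed. The simple exponential-leak dynamics, the function name, and the toy parameter values are illustrative assumptions rather than a model taken from the cited references; only the resting potential and threshold follow the typical values quoted in the text.

```python
def simulate_lif(input_spikes, weights, v_rest=-65.0, v_thresh=-55.0,
                 tau_m=10.0, dt=1.0, t_max=50.0):
    """Toy integrate-and-fire neuron (values in mV and ms): weighted EPSPs/IPSPs
    are added to the membrane potential, which leaks back towards rest and
    fires (then resets) when it crosses the threshold."""
    v = v_rest
    output_spikes = []
    for step in range(int(t_max / dt)):
        t = step * dt
        v += dt * (v_rest - v) / tau_m            # leak towards rest
        for w, spike_times in zip(weights, input_spikes):
            if any(abs(t - s) < dt / 2 for s in spike_times):
                v += w                            # w > 0: EPSP, w < 0: IPSP
        if v >= v_thresh:                         # threshold crossed: emit a spike
            output_spikes.append(t)
            v = v_rest                            # reset after firing
    return output_spikes

# toy usage: two excitatory afferents and one inhibitory afferent
print(simulate_lif([[5, 6, 7], [6, 7, 8], [20]], weights=[4.0, 4.0, -2.0]))
```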

SNN has been used widely, including in classification and clustering tasks, and its implementation in hardware requires less energy. IBM TrueNorth [9, 13], Intel Loihi [12], Tsinghua Tianjic [61], and the DARPA Quad Copter [28] are energy-efficient neuromorphic hardware platforms that use SNN. Note that SNN is computationally powerful [48]: it can efficiently handle non-linear data with a single spiking neuron, i.e., it works without hidden layer(s) or hidden neuron(s). Since it can efficiently classify non-linear patterns without any hidden layer(s), the synaptic load is lower than in conventional neural networks. Furthermore, SNN is used to develop neuroprosthetic systems, where its ability to adjust to the nonstationarity of the neuro-musculoskeletal system makes it suitable for controlling neuro-prostheses [34, 64, 71].

2 Review of Supervised Learning Methods

The most challenging and crucial part of any supervised learning approach is hyperparameter tuning to optimise the predicted outputs with respect to the target outputs. This optimisation phenomenon is generally referred to as learning. Over time, various supervised learning algorithms have been developed to train SNN, utilising heterogeneous optimisation techniques. However, almost none of them is satisfactory when a fair trade-off between computational efficiency and biological plausibility is required. This section discusses the most popular supervised learning algorithms developed for SNN, thoroughly and categorically. In addition, [39, 46, 76, 90] present detailed reviews of the different supervised learning algorithms developed for SNN using various approaches.

2.1 Learning by Finding the Gradient

The gradient-based method is well-known and widely used as an optimisation tool to train a neural network. In general, the gradient or slope gives the direction of the error on a continuous curve, making it easy to move in that direction to fine-tune the overall network error. However, the constraint is that the curve must be continuous, which means the method can only handle continuously varying quantities. It is therefore complicated and challenging to apply this approach to SNN, since all information processing happens in discrete form. For presynaptic spike-times \(x_{i}\), hidden neurons' spike-times \(y_{j}\), synaptic delays \(d^{k}\) for the k-th synapse, and predicted output spike-times \(z_{m}\), the changes in synaptic weights are represented using Eqs. (1), (4), (5), and (6), applying the gradient estimation approach. The change in weights between the hidden and output layers \(\Delta w_{jm}^{k}\) is given in Eqs. (1) and (4):

$$\begin{aligned} \Delta w_{jm}^{k} = - \eta \times \delta _{m} \times \xi \left( t-y_{j}-d^{k}\right) \end{aligned}$$
(1)

where \(\xi (t)\) is the \(\alpha \)-kernel that shapes the synaptic response; its definition is given in Eq. (2):

$$\begin{aligned} \xi (t) = \frac{t}{\tau }~\exp \left( 1-\frac{t}{\tau } \right) H(t) \end{aligned}$$
(2)

where H(t) is the Heaviside function represented using Eq. (3), and \(\tau \) is the synaptic time constant:

$$\begin{aligned} H(t) = {\left\{ \begin{array}{ll} 1, &{} \text {if } t>0 \\ 0, &{} \text {otherwise} \end{array}\right. } \end{aligned}$$
(3)
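For reference, Eqs. (2) and (3) translate directly into code; the numerical value of the time constant below is an illustrative assumption.

```python
import numpy as np

def heaviside(t):
    """Heaviside step function H(t) of Eq. (3)."""
    return np.where(t > 0, 1.0, 0.0)

def xi(t, tau=7.0):
    """Alpha-shaped kernel of Eq. (2); tau is the synaptic time constant
    (the value 7.0 ms is only an illustrative choice)."""
    return (t / tau) * np.exp(1.0 - t / tau) * heaviside(t)

# the kernel rises from 0, peaks at t = tau, and decays back towards 0
print(xi(np.array([0.0, 3.5, 7.0, 14.0])))
```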

The value of \(\delta _{m}\), where \(z_{m}^{d}\) denotes the desired (target) spike-time of output neuron m, is calculated for later use in Eq. (4):

$$\begin{aligned} \delta _{m} = \frac{\left( z_{m}^{d}-z_{m}\right) }{\sum _{j=1}^{q} \sum _{k=1}^{r} w_{jm}^{k} \times \xi \left( z_{m}-y_{j}-d^{k}\right) \times \left( \frac{1}{z_{m}-y_{j}-d^{k}}-\frac{1}{\tau } \right) } \end{aligned}$$
(4)

Now, the change in weights between the input and hidden layers \(\Delta w_{ij}^{k}\) is given in Eqs. (5) and (6):

$$\begin{aligned} \Delta w_{ij}^{k} = - \eta \times \delta _{j} \times \xi \left( t-x_{i}-d^{k}\right) \end{aligned}$$
(5)
$$\begin{aligned} \delta _{j} = \frac{\sum _{m=1}^{s} \delta _{m} \sum _{k=1}^{r} w_{jm}^{k} \times \xi (z_{m}-y_{j}-d^{k}) \times \left( \frac{1}{z_{m}-y_{j}-d^{k}}-\frac{1}{\tau } \right) }{\sum _{i=1}^{p} \sum _{k=1}^{r} w_{ij}^{k} \times \xi \left( y_{j}-x_{i}-d^{k}\right) \times \left( \frac{1}{y_{j}-x_{i}-d^{k}}-\frac{1}{\tau } \right) } \end{aligned}$$
(6)

The change in synaptic weights \(\Delta w_{ij}^{k}\), calculated using the value of \(\delta _{j}\) given in Eq. (6), is added to the initial synaptic weights \(w_{ij}^{k}\) to obtain the new synaptic weights. This is how training proceeds in the gradient-based approach for SNN. There are \(i=1,2,3, ..., p\) input neurons, \(j=1,2,3, ..., q\) hidden neurons, \(m=1,2,3, ..., s\) output or readout neurons, and \(k=1,2,3, ..., r\) synapses per connection in the network.
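The following sketch shows how the hidden-to-output updates of Eqs. (1) and (4) might be evaluated, reading the t in Eq. (1) as the output firing time \(z_{m}\) and writing z_target for the target spike-times \(z^{d}_{m}\); the array shapes, learning rate, and toy values are assumptions for illustration, not the reference SpikeProp implementation.

```python
import numpy as np

def xi(t, tau=7.0):
    # alpha kernel of Eq. (2), zero for t <= 0 (tau value is illustrative)
    return np.where(t > 0, (t / tau) * np.exp(1.0 - t / tau), 0.0)

def output_layer_update(y, z, z_target, w, d, eta=0.01, tau=7.0):
    """Sketch of Eqs. (1) and (4) for the hidden-to-output weights.
    y: hidden spike-times (q,), z: output spike-times (s,), z_target: desired
    output spike-times (s,), w: weights (q, s, r), d: synaptic delays (r,)."""
    q, s, r = w.shape
    delta = np.zeros(s)
    for m in range(s):
        denom = 0.0
        for j in range(q):
            for k in range(r):
                u = z[m] - y[j] - d[k]
                if u > 0:                                   # kernel is zero otherwise
                    denom += w[j, m, k] * xi(u, tau) * (1.0 / u - 1.0 / tau)
        delta[m] = (z_target[m] - z[m]) / denom             # Eq. (4)
    dw = np.zeros_like(w)
    for j in range(q):
        for m in range(s):
            for k in range(r):
                dw[j, m, k] = -eta * delta[m] * xi(z[m] - y[j] - d[k], tau)  # Eq. (1)
    return w + dw                                           # new weights

# toy usage: 2 hidden neurons, 1 output neuron, 3 delayed synapses per connection
rng = np.random.default_rng(1)
w_new = output_layer_update(y=np.array([3.0, 5.0]), z=np.array([12.0]),
                            z_target=np.array([10.0]),
                            w=rng.uniform(0.1, 1.0, size=(2, 1, 3)),
                            d=np.array([1.0, 3.0, 5.0]))
print(w_new)
```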

The challenge of discontinuity was solved to some extent by Bohte et al. [5], who introduced probably the first popular supervised learning algorithm to train a feed-forward SNN, named SpikeProp [5]. An exciting aspect of SpikeProp is its close analogy with the popular backpropagation algorithm of ANN. SpikeProp eliminates discontinuity by allowing a single spike-time and discarding the later spikes. Although an SNN, if implemented efficiently, can handle non-linear classification problems without the need for hidden layer(s) or hidden neuron(s), SpikeProp uses hidden layers and thereby suffers from a heavy computational cost. The reason is that hidden layers increase the synaptic load in the architecture and, as a result, more computational power is required.

SpikeProp uses the population coding scheme [19] combined with the concept of time-to-first-spike firing [19], i.e., for every neuron, the first firing time is more important than the later ones. The use of time-to-first-spike eliminates the discontinuity problem by omitting the later spike-times and considering only the earliest spike-time. It has been observed that first spike-times are the most relevant information carriers [51]. Thus, the input, hidden, and readout (i.e., output) neurons are restricted to fire only a single spike. The SRM [19] is selected as the neuron model to provide the dynamics of the membrane potential, and each pair of SRM neurons is connected through multiple synaptic terminals with different delays. The error direction is determined in SpikeProp by finding the slope, since the use of time-to-first-spike as given in [5] turns the discrete nature of spiking into a continuous one. Although SpikeProp was a success to some extent, it fails to update the weights if a postsynaptic neuron no longer fires a spike after receiving the input stimuli.

Moreover, inhibitory neurons are not investigated properly, and SpikeProp uses a very small learning rate. In [50], QuickProp and RProp improve SpikeProp to some extent, and it is observed that the small learning rate (step size) used in SpikeProp can be increased to a large value that still leads to successful training to a certain extent. Note that the convergence rate in online mode, the limited biological plausibility (the synapses are not well-explored and are thus less similar to biological ones), and the computational cost are the flaws of SpikeProp. In [6, 21, 50, 68, 91, 92], the convergence rate and multiple spiking behaviour are further investigated, making SpikeProp more generalised and faster than the previous version in [5]. However, the major problem of SpikeProp being a gradient-based supervised learning algorithm persists, namely stagnation at local minima, which is a problem with any gradient-based optimisation algorithm.

The surges or sudden jumps present in any optimisation algorithm that uses the gradient rule to determine the error direction disturb the algorithm's consistency. In addition, SpikeProp does not consider a mixture of inhibitory and excitatory neurons because, in that case, there is always a threat to the convergence of the algorithm, and it is also a barrier when we want a synapse model to be biologically more realistic. In [68], Shrestha et al. also explore some demerits of this kind, along with problems in formulating the loss function. Some other gradient-based supervised learning algorithms use a slightly different concept by utilising the extended delta learning rule developed in [53, 55]. In these algorithms, each spike-train is convolved with a suitable kernel function, which distinguishes them from the others. The gradient descent-based SPAN algorithm proposed in [54] uses the concept of spike-pattern association, working with a single synapse whose response takes the form of an \(\alpha \)-shaped synaptic curve. Its use of the area under the curve to compute the overall network loss during training is an exciting feature.

However, the aforementioned common problem persists. Therefore, it becomes necessary to move in a different direction in search of another approach. This approach is primarily based on the concept of Hebbian learning, especially asymmetric Hebbian learning, which is discussed in the next section.

2.2 Asymmetric Supervised Hebbian Learning

Spike-Time-Dependent Plasticity (STDP) is a biological process that optimises the information processing mechanism among neurons. It is considered the asymmetric form of Hebbian learning, adjusting the synaptic efficacies or weights between neurons based on the relative timing of a neuron's output spike and its input spikes. The temporal correlation between pre- and postsynaptic spiking neurons is taken into consideration. Plasticity generally means change; here it refers to the change in the synapse, expressed in terms of synaptic efficacy. Like other synaptic plasticity mechanisms, STDP is believed to underlie the development and fine-tuning of neuronal circuits during the brain's development phase, as well as learning and the storage of information inside the brain [4, 69]. It partially explains activity-dependent development in terms of two concepts: Long-Term Potentiation (LTP) and Long-Term Depression (LTD). When a repeated presynaptic spike arrives a few milliseconds before the postsynaptic spike, the result is LTP; when the repeated presynaptic spike arrives after the postsynaptic spike, the result is LTD.

The learning window, also called the STDP function, varies for different synapse models. The rapid change in the learning window's value requires the time scale to be expressed in milliseconds. Although STDP primarily learns in an unsupervised manner and is considered a partial learning algorithm, most researchers combine it with a concept called anti-STDP to train in a supervised manner. Various supervised algorithms for SNN have been developed using STDP; however, only a few are successful to some extent, both computationally and biologically.

The time difference \(\Delta t\) between a presynaptic spike (\(t_{\textrm{pre}}\)) and a postsynaptic spike (\(t_{\textrm{post}}\)) is defined as \(\Delta t = (t_{\textrm{pre}}-t_{\textrm{post}})\). The change in synaptic weights for an excitatory synapse \(w_{\textrm{excitatory}}\) is given in Eq. (7). The exponentially decaying shape shown in Fig. 1 indicates the dependency on the spike-time difference \(\Delta t\):

$$\begin{aligned} \Delta w_{\textrm{excitatory}} = {\left\{ \begin{array}{ll} \mathcal {A}^{+} \exp (\frac{\Delta t}{\tau ^{+}}), &{} \forall \Delta t<0 \\ \mathcal {A}^{-} \exp (-\frac{\Delta t}{\tau ^{-}}), &{} \forall \Delta t>0 \\ 0, &{} \forall \Delta t=0 \end{array}\right. } \end{aligned}$$
(7)

where \(\mathcal {A}^{+}\) and \(\mathcal {A}^{-}\) represent constant amplitudes (usually taken as 1.0) for LTP and LTD, respectively. The time constants \(\tau ^{+}\) and \(\tau ^{-}\) shape the curves for LTP and LTD, respectively.

Fig. 1

The learning window for STDP (relation between synaptic weight change and spike-time difference), where LTP is represented by the left curve and LTD by the right curve
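Eq. (7) can be expressed as a small helper function. The time-constant values are assumptions, and whether the LTD branch is applied with a negative \(\mathcal{A}^{-}\) or subtracted from the weight depends on the sign convention adopted.

```python
import math

def stdp_dw(delta_t, a_plus=1.0, a_minus=1.0, tau_plus=20.0, tau_minus=20.0):
    """Learning window of Eq. (7) with delta_t = t_pre - t_post (in ms).
    Time constants of 20 ms are an assumption; in many formulations a_minus is
    taken negative (or the LTD value is subtracted) so that this branch
    depresses the weight."""
    if delta_t < 0:                               # pre before post -> LTP
        return a_plus * math.exp(delta_t / tau_plus)
    if delta_t > 0:                               # pre after post -> LTD
        return a_minus * math.exp(-delta_t / tau_minus)
    return 0.0

# a presynaptic spike 5 ms before (LTP) and 5 ms after (LTD) the postsynaptic spike
print(stdp_dw(-5.0), stdp_dw(+5.0))
```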

In [70], a learning algorithm for SNN is proposed in which STDP and anti-STDP are combined to fit the algorithm into a supervised paradigm. In this algorithm, multiple spiking activity is used, where each spiking neuron can fire multiple spikes at different time steps. The network architecture is feed-forward with hidden layers. The demerit of the algorithm is that it neglects the precise spike-times produced by the hidden-layer neurons during training. Qiang et al. proposed an algorithm that uses temporal coding to represent real-valued continuous information in the form of discrete spikes to train SNN in a supervised manner [93]. In [81], Aboozar et al. proposed a biologically plausible supervised learning algorithm called BPSL, which is capable of firing multiple target spikes from a spiking neuron. Although it is referred to as biologically plausible, essential biological elements are not properly implemented in its synapse model.

In [88], John et al. proposed a supervised algorithm using synaptic weight association training, called SWAT. It is used to classify non-linear input patterns into their respective target classes. An exciting feature of SWAT is its use of the dynamic synapse model [10, 84], which can operate in terms of the mechanism of long-term plasticity, giving SWAT biological properties to some extent. However, the major drawback of SWAT is its computational cost, since its huge synaptic load demands high computational power. The increased synaptic load results from the large number of connections formed by the many hidden neurons in the network topology. Therefore, it is challenging for a computer with moderate computational power to adjust and fine-tune so many network parameters. As far as training is concerned, SWAT is trained using the STDP rule transformed into the supervised paradigm.

The Tempotron algorithm, proposed in [24] to train SNN in a supervised manner, paints a slightly different picture. It allows a neuron to learn a spike firing decision (whether to issue a spike or not) when its membrane is updated with the potential of incoming input stimuli from several presynaptic neurons. Tempotron's response works like an “on”/“off” switch, akin to a digital system. Instead of learning precise spike-times, Tempotron decides whether a neuron should fire (it acts as a decider). This algorithm also falls short when a balanced trade-off between biological plausibility and computational efficiency is in question. In addition, Tempotron can be used only in a single-layered network topology, which is a barrier to multilayered topologies. Also, its output is restricted to 0 or 1, which does not encode information in precise spike timing.
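A rough sketch of the binary firing decision described above is given below, using the double-exponential PSP kernel commonly reported for the Tempotron; the kernel form, constants, and names are assumptions based on the usual description of the rule rather than code from [24].

```python
import numpy as np

def psp_kernel(t, tau=15.0, tau_s=3.75):
    """Double-exponential PSP kernel often used with the Tempotron
    (form and constants are assumptions, not taken from [24])."""
    return np.where(t > 0, np.exp(-t / tau) - np.exp(-t / tau_s), 0.0)

def tempotron_decision(input_spikes, weights, threshold=1.0,
                       t_grid=np.arange(0.0, 100.0, 1.0)):
    """Binary 'fire / do not fire' decision: the neuron fires if the maximum
    of its membrane potential over the trial exceeds the threshold."""
    v = np.zeros_like(t_grid)
    for w, spikes in zip(weights, input_spikes):
        for s in spikes:
            v += w * psp_kernel(t_grid - s)
    return v.max() >= threshold

# toy usage with two afferents
print(tempotron_decision([[10.0, 12.0], [11.0]], weights=[0.6, 0.5]))
```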

Several other supervised learning algorithms are primarily based on STDP; a few of the most used ones are discussed in this review. J. Wang et al. proposed the OSNN algorithm in [89], an online supervised learning algorithm for SNN with an adaptive network structure trained in an online fashion. In [56, 80], supervised learning algorithms are proposed in which the concepts of STDP and anti-STDP are used to make the algorithm work in a supervised manner. It is well-known that STDP primarily works in an unsupervised fashion. It is not considered a fully functional learning algorithm due to its plasticity updating mechanism, which changes the sign of synaptic efficacy instead of updating a fair value based on the spike firing times of all presynaptic neurons; this is a barrier for STDP-based supervised learning. Note that Hebbian-based supervised learning algorithms share a common problem: the synapse parameters keep changing even when the fired spikes exactly match the target spikes. Thus, additional learning rules or constraints must be added to the original algorithm to provide stability. Moreover, in supervised Hebbian learning, all undesired spike timings are usually suppressed by the “teaching signal” during the training phase. Therefore, correlation occurs only between pre- and postsynaptic spikes around the desired spike timings. Since this type of correlation is absent in all other circumstances, synaptic strength cannot be weakened even if a neuron fires spikes at undesired times during the testing phase.

It is observed from the literature that spiking neurons can successfully classify non-linear patterns into their respective target classes without using any hidden layer(s); this powerful feature is not exploited in the aforementioned learning algorithms except SEFRON, proposed in [38]. SEFRON does not use any hidden layer, yet it successfully classifies non-linear patterns, thereby decreasing the synaptic load, and it explores the computational power of a single spiking neuron to a certain extent. However, our analysis shows that the number of network parameters can be reduced by half while keeping the classification accuracy unhampered, which we have verified experimentally.

2.3 Learning with Remote Supervision

Ponulak et al. [62] proposed a distinguished learning algorithm called ReSuMe that is based on the concept of “remote supervision”. It is argued that ReSuMe eliminates the significant drawbacks found in the supervised Hebbian learning approach, and it also implements some other exciting features. The primary principle is to impose the input-output characteristics onto the SNN so that it yields the target spike trains in response to the corresponding input spikes. Unlike supervised Hebbian learning, ReSuMe does not directly feed the desired signals to the learning neurons; nevertheless, they co-determine the plasticity of the synaptic connections. ReSuMe also builds on the supervised Hebbian approach, but its “remote supervision” feature is what primarily distinguishes it from other supervised Hebbian methods. The concept of “remote supervision” is biologically justifiable based on an experimentally observed neurophysiological phenomenon, heterosynaptic plasticity [63, 75, 87]. The working rule of ReSuMe is briefly explained in Eq. (8):

$$\begin{aligned} \frac{\textrm{d}}{\textrm{d}t} w(t) = \left[ \mathcal {S}^{d}(t)-\mathcal {S}^{l}(t)\right] \left[ a + \int \limits _{0}^{\infty } \mathcal W(s) \times \mathcal {S}^{in}(t-s) \textrm{d}s\right] \end{aligned}$$
(8)

where \(\mathcal {S}^{d}(t)\), \(\mathcal {S}^{l}(t)\), and \(\mathcal {S}^{in}(t)\) represent the desired spike train, the learning (postsynaptic) neuron's output spike train, and the presynaptic (input) spike train, respectively. The parameter a denotes the amplitude of the non-correlated contribution to \(\frac{d}{dt} w(t)\), and the convolution term in Eq. (8) is the Hebbian-like modification of w. The variable s represents the time delay between the spikes at the synaptic sites, over which the integral kernel \(\mathcal W(s)\) is defined, as shown in Eq. (8). A positive value of a corresponds to excitatory synapses, where the shape of \(\mathcal W(s)\) resembles the STDP rule, and a negative value of a corresponds to inhibitory synapses, where the shape of \(\mathcal W(s)\) resembles the anti-STDP rule.
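A discretised sketch of Eq. (8) is shown below, treating the spike trains as binary arrays sampled at a fixed time step and assuming an exponentially decaying STDP-like kernel for \(\mathcal{W}(s)\); the kernel shape and all parameter values are illustrative assumptions.

```python
import numpy as np

def resume_weight_change(s_desired, s_learn, s_in, a=0.01,
                         a_w=1.0, tau_w=20.0, dt=1.0):
    """Discretised version of Eq. (8): integrates dw/dt over one trial.
    s_desired, s_learn, s_in are binary arrays (1 = spike in that time bin)."""
    n = len(s_in)
    lags = np.arange(n) * dt
    kernel = a_w * np.exp(-lags / tau_w)          # assumed STDP-like W(s)
    dw = 0.0
    for t in range(n):
        # convolution term: sum over s of W(s) * S_in(t - s)
        hebb = np.dot(kernel[:t + 1], s_in[t::-1])
        dw += (s_desired[t] - s_learn[t]) * (a + hebb) * dt
    return dw

# toy usage: the learning neuron fires later than the desired spike
s_d  = np.array([0, 0, 0, 1, 0, 0, 0, 0])
s_l  = np.array([0, 0, 0, 0, 0, 0, 1, 0])
s_in = np.array([0, 1, 0, 0, 0, 0, 0, 0])
print(resume_weight_change(s_d, s_l, s_in))
```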

An exciting merit of ReSuMe is its independence from the spiking neuron model; therefore, it can work with a variety of spiking neuron models. ReSuMe can also learn target temporal as well as spatio-temporal spike patterns efficiently, and it converges quickly towards the optimum value. There exist algorithms that explore ReSuMe further; for example, in [77, 78], the ReSuMe algorithm is investigated with synaptic delays added, although the delays used are static constant values rather than random ones. In addition, in [79], multiple neurons are successfully trained using the training rules of ReSuMe instead of a single neuron. However, ReSuMe has several disadvantages despite these advantageous features: it claims to be suitable for online learning, but due to its fixed network topology it is not adaptive to the incoming stimuli. Also, ReSuMe is unable to make correct predictions after only a single presentation of the training patterns. Although ReSuMe is biologically plausible, its local behaviour restricts its learning ability.

Another exciting supervised learning algorithm that works on the ReSuMe principle, developed to train SNN, is the Chronotron, proposed by Florian in [16]. The Chronotron is experimented with using three different learning rules: the first is gradient descent learning (called E-learning), where the delta learning rule is used; the second is I-learning, where gradient descent E-learning and the ReSuMe learning rule are combined; and the third is the ReSuMe learning rule itself. In Chronotron, supervised learning is implemented using a sophisticated distance metric called the Victor-Purpura distance, an exciting feature of the algorithm. However, Chronotron trains the synaptic efficacies in batch mode with a fixed network topology, making it unsuitable for online learning.
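Since the Chronotron's loss is built around the Victor-Purpura spike-train distance, the sketch below shows a common dynamic-programming formulation of that metric (shifting a spike costs q per unit time, inserting or deleting a spike costs 1); this is the textbook form of the distance, not code from [16].

```python
import numpy as np

def victor_purpura_distance(train_a, train_b, q=1.0):
    """Victor-Purpura distance between two spike trains (lists of spike-times).
    Moving a spike by dt costs q*|dt|; adding or deleting a spike costs 1."""
    na, nb = len(train_a), len(train_b)
    d = np.zeros((na + 1, nb + 1))
    d[:, 0] = np.arange(na + 1)       # delete all spikes of train_a
    d[0, :] = np.arange(nb + 1)       # insert all spikes of train_b
    for i in range(1, na + 1):
        for j in range(1, nb + 1):
            shift = q * abs(train_a[i - 1] - train_b[j - 1])
            d[i, j] = min(d[i - 1, j] + 1,          # delete a spike
                          d[i, j - 1] + 1,          # insert a spike
                          d[i - 1, j - 1] + shift)  # shift a spike
    return d[na, nb]

# toy usage: identical trains give 0, a shifted spike costs q*|shift|
print(victor_purpura_distance([10.0, 20.0], [10.0, 22.0], q=0.5))  # -> 1.0
```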

2.4 Learning with Metaheuristics

Heuristic methods are used as powerful and comprehensive tools for solving challenging optimisation problems. Although heuristics provide “well-balanced” solutions relatively close to the global optimum at affordable cost and time, their design and development become complicated as they depend on “problem-specific” characteristics [59]. Therefore, to address this flaw, metaheuristics came into existence [22]. Metaheuristics are “problem-agnostic” rather than “problem-specific” and have become remarkably popular in many optimisation areas, such as the development of learning algorithms for ANN. However, the power of metaheuristics is much less explored and experimented with in the case of SNN. In this section, the metaheuristic approaches used to train SNN are briefly discussed.

Metaheuristics such as evolutionary algorithms are mathematically simpler and can work on real numbers directly, without wasting time encoding these real numbers into other formats; this strategy therefore suits most complex classification problems. In [67], an evolving network of spiking neurons called eSNN is proposed, based on the Thorpe model [82]. The advantages of eSNN include fast real-time simulation achieved at a low computational cost even in a large network architecture. Also, without retaining past data, the model can accumulate knowledge as data arrive. The use of fuzzy rules to yield the inference engine is an exciting feature of eSNN. However, eSNN has several disadvantages, such as the “infinite repository” problem: for each new pattern arriving in an online fashion, its repository of neurons grows without bound. Also, due to the rank-order-based averaging of synaptic weights, eSNN cannot distinguish input patterns having the same rank order (despite different spike-times), and the averaging can also increase the number of neurons in the network, which may lead to the loss of relevant stored information.
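The rank-order weighting idea underlying the Thorpe model and eSNN can be sketched as below, where earlier-firing inputs receive larger weights and the firing threshold is taken as a fraction of the maximum possible potential; the exact training and neuron-merging rules of eSNN are not shown, and all names and values here are illustrative assumptions.

```python
import numpy as np

def rank_order_weights(first_spike_times, mod=0.9):
    """Rank-order weighting as commonly described for the Thorpe/eSNN model:
    earlier-firing inputs get larger weights, w_j = mod ** rank(j).
    (Illustrative sketch; the eSNN training and merging rules are omitted.)"""
    order = np.argsort(first_spike_times)          # earliest spike -> rank 0
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(order))
    weights = mod ** ranks
    psp_max = np.sum(weights * (mod ** ranks))     # potential for this exact pattern
    return weights, psp_max

# toy usage: the threshold is often taken as a fraction c of psp_max
w, psp_max = rank_order_weights([4.0, 1.0, 7.0], mod=0.9)
threshold = 0.7 * psp_max
print(w, threshold)
```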

In [14], the synaptic efficacies of an SNN are optimised using evolutionary techniques to reduce the overall network error; the algorithm, called SRESN, appropriately implements a “self-regulatory” concept that regulates the learning process. Based on the currently stored knowledge, the output-layer neurons can evolve automatically from the training patterns. SRESN can add a neuron, change network parameters, or forgo learning from a sample based on the class-specific and sample knowledge stored in the network; thus, SRESN works in a “self-regulatory” mode of learning. The method has both online and offline modes of training. However, SRESN does not use synaptic delays, an essential factor, in order to reduce the computational cost, thereby compromising biological plausibility.

Evolutionary methods are also used to improve the gradient-based SpikeProp algorithm [5] by means of the Particle Swarm Optimisation (PSO) technique [41]; the result is referred to as SpikeProp-PSO. It enhances the learning process of SpikeProp using the angle-driven dependency-learning rule. However, it increases the computational cost, and it is biologically less plausible since the biological elements present in synapses are neglected.

Differential Evolution (DE) [74] is a powerful optimisation tool known for its simplicity and good performance; it is combined with eSNN [67] to develop another supervised learning algorithm called DEPT-ESNN [66]. The primary goal of DEPT-ESNN is to select the optimum values of eSNN parameters such as the modulation factor, similarity factor, and threshold. In DEPT-ESNN, DE plays a vital role by providing suitable values for these eSNN parameters adaptively rather than by trial and error. The advantages of DEPT-ESNN include its simple implementation and generalisation. However, biological elements present in synapses are not considered, which makes DEPT-ESNN biologically less plausible.
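A generic sketch of how DE could adaptively search for the eSNN parameters named above (modulation factor, similarity factor, and threshold) is given below; the fitness function is a placeholder the user would supply (for example, the validation error of a trained eSNN), and the DE/rand/1/bin variant shown is a standard choice rather than the exact scheme of [66].

```python
import numpy as np

def de_search(fitness, bounds, pop_size=20, f=0.5, cr=0.9, generations=50,
              rng=np.random.default_rng(0)):
    """Standard DE/rand/1/bin loop minimising `fitness` over box `bounds`.
    A candidate could be (modulation, similarity, threshold) for eSNN."""
    dim = len(bounds)
    lo, hi = np.array(bounds).T
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    scores = np.array([fitness(x) for x in pop])
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = pop[rng.choice(pop_size, 3, replace=False)]
            mutant = np.clip(a + f * (b - c), lo, hi)      # mutation
            cross = rng.random(dim) < cr                   # binomial crossover
            cross[rng.integers(dim)] = True                # keep at least one gene
            trial = np.where(cross, mutant, pop[i])
            s = fitness(trial)
            if s < scores[i]:                              # greedy selection
                pop[i], scores[i] = trial, s
    best = np.argmin(scores)
    return pop[best], scores[best]

# toy usage with a placeholder fitness standing in for eSNN validation error
dummy_fitness = lambda p: (p[0] - 0.9) ** 2 + (p[1] - 0.6) ** 2 + (p[2] - 0.7) ** 2
print(de_search(dummy_fitness, bounds=[(0, 1), (0, 1), (0, 1)]))
```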

Although metaheuristic approaches are somewhat time-consuming and typically work with a single spiking scheme, they have many advantages that are not achievable using other optimisation approaches. Therefore, metaheuristic approaches need more exploration in order to develop an efficient learning algorithm compatible with SNN. Note that other powerful metaheuristics, such as the Genetic Algorithm (GA) [3, 26, 33] and Grey Wolf Optimisation (GWO) [52], have neither been explored nor successfully experimented with directly for training SNN in a supervised manner while providing a properly balanced trade-off between computational cost and biological plausibility.

It is found from the literature that the aforementioned supervised learning algorithms, irrespective of the approach used, did not explore the synapse model thoroughly. Although synaptic delays [40] are used in some algorithms such as [5, 77,78,79], they are constant synaptic delays, and wherever a mixture of excitatory and inhibitory neurons is used, mechanisms such as the GABA-switch [17, 45] are not appropriately implemented. Synaptic delays are significant where biological plausibility is concerned. In the GABA-switch mechanism, switching from an excitatory neuron to an inhibitory one, and vice versa, happens randomly. The robustness of an algorithm should be tested against noise, since in biological processes the presence of noise while sharing information among neurons is evident [15]. Therefore, a model should be robust to noise in order to be biologically plausible, which is less explored as far as SNN is concerned. Another important phenomenon observed in biological neurons is the spontaneous firing of spikes [27, 42], which is almost entirely neglected in the synapse models of most SNN architectures.

Moreover, almost all the aforementioned supervised learning algorithms developed to train an SNN topology lack a balanced trade-off between computational cost and biological plausibility. Such a trade-off is essential for SNN because, if the computational complexity is very high, it becomes difficult to handle high-dimensional datasets.

Tables 1 and 2 give a brief summary of the gradient- and STDP-based supervised learning algorithms and of the remote-supervision- and metaheuristic-based supervised learning algorithms, respectively.

Table 1 A brief summary of gradient and STDP-based supervised learning algorithms
Table 2 A brief summary of remote-supervision and metaheuristic-based supervised learning algorithms

3 Conclusion

The supervised learning algorithms discussed in this paper, irrespective of the approach used, did not explore the synapse model thoroughly, as found from the literature. Although synaptic delays [40] are used in some algorithms such as [5, 77,78,79], they are constant synaptic delays, and wherever a mixture of excitatory and inhibitory neurons is used, mechanisms such as the GABA-switch [17, 45] are not appropriately implemented. Synaptic delays are significant where biological plausibility is concerned. In the GABA-switch mechanism, switching from an excitatory neuron to an inhibitory one, and vice versa, happens randomly. The robustness of an algorithm should be tested against noise, since in biological processes the presence of noise while sharing information among neurons is evident [15]. A model should be robust to be biologically plausible, which is less explored in the case of SNN. Another important phenomenon observed in biological neurons is the spontaneous firing of spikes [27, 42], which is almost entirely neglected in the synapse models of most SNN architectures.

Although metaheuristic approaches are somewhat time-consuming and typically work with a single spiking scheme, they have many advantages that are not achievable using other optimisation approaches. Therefore, metaheuristic approaches need more exploration in order to develop an efficient learning algorithm compatible with SNN. Note that other powerful metaheuristics, such as the Genetic Algorithm (GA) [3, 26, 33] and Grey Wolf Optimisation (GWO) [52], have rarely been successfully explored and experimented with directly for training SNN in a supervised manner while providing a properly balanced trade-off between computational cost and biological plausibility, except in [35] and [36].