Abstract
Biological evidence shows that precisely timed spikes describe neuronal activity more accurately and can effectively transmit spatio-temporal patterns. However, triggering multiple precisely timed spikes in each layer of a multilayer spiking neural network (SNN) remains a core challenge, since the complexity of the learning targets increases significantly for multispike learning. To address this issue, we propose a novel supervised multispike learning method for multilayer SNNs, which can accomplish complex spatio-temporal pattern learning of spike trains. The proposed method derives the synaptic weight update rule from the Widrow-Hoff (WH) rule and then credits the network error simultaneously to preceding layers using backpropagation. The algorithm has been successfully applied to benchmark datasets from the UCI repository. Experimental results show that the proposed method achieves classification accuracy comparable to classical learning methods and a state-of-the-art supervised algorithm. In addition, the training framework effectively reduces the number of connections, thus improving the computational efficiency of the network.
This work is supported by the National Natural Science Foundation of China under Grant No. 61906126.
1 Introduction
Research evidence shows that precise spatio-temporal firing patterns of groups of neurons can convey relevant information [1], which enables time to be used as a communication and computing resource in spiking neural networks (SNNs). In recent years, learning methods focusing on how to process spatio-temporal spikes in a supervised way have also been explored [2]. Such schemes can train single-layer or multilayer networks to fire a required output spike train. For single-layer networks, different spike-timing-based learning rules have been developed [3, 4]. These rules adopt either an error function minimized by gradient descent or an analog of the Widrow-Hoff (WH) rule. The remote supervised method (ReSuMe) [5] stands out for its effectiveness: it uses a spike-timing-dependent plasticity (STDP) window and an anti-STDP window to drive the learning process. All of these existing single-layer algorithms can complete training successfully, but their efficiency is low, especially for complex tasks. Therefore, training hierarchical SNNs in a way closer to the brain is required.
To further improve learning performance, Quick Propagation, Resilient Propagation [6], and SpikeProp [7] have been studied. However, due to sudden jumps or discontinuities in the error function, gradient-based learning may fail. Another line of research uses revised versions of the WH learning rule for SNNs. ReSuMe was extended to the multilayer remote supervised learning method (Multi-ReSuMe) in [8], where multiple spikes are considered in each layer. The delay of spike propagation is a vital feature of real biological nervous systems [9]. Combining ReSuMe with delay learning, [10] further put forward a new algorithm for multiple neurons. Although many efforts have been made in SNN structure design and learning, most existing methods realize the transformation of relevant information using rate coding or a single spike per neuron [11] because of the discontinuous nature of neuronal spike timing. Thus, building an SNN that can learn such spike-pattern-to-spike-pattern transformations remains a challenging problem.
In this paper, a novel supervised learning method is presented that trains multilayer SNNs to transmit spatio-temporal spike patterns. The error function from the Widrow-Hoff (WH) rule, based on the difference between the actual and desired output spike trains, is first introduced to change the synaptic weights and is then applied to neurons triggering multiple spikes in each layer through a backpropagation learning rule. The main innovations of this method are: 1) extending the WH-rule-based PSD rule to learn spatio-temporal spike patterns in multilayer SNNs, and 2) effectively reducing the number of connections, thus improving the computational efficiency of the network. Finally, our method is evaluated thoroughly on benchmark datasets. Experimental results show that the algorithm achieves high learning accuracy and significantly improves the computational efficiency of the network.
2 Neuron Model
Firstly, we define a spike train as a series of impulses triggered by a specific neuron at its firing times, given in the following form: \(S(t) = {\sum }_f \delta (t-t^f)\), where \(t^f\) is the f-th firing time and \(\delta (x)\) is the Dirac function: \(\delta (x) = 1\) if x = 0 and 0 otherwise. A linear stochastic neuron model in continuous time is then introduced to construct a relation between the input and output impulse trains, as used in [8]. The instantaneous firing rate R(t) of a postsynaptic neuron i is the probability density of firing at time t and is determined by the instantaneous firing rates of its presynaptic neurons j: \(R_i(t) = \frac{1}{k}{\sum }_j w_{ij}R_j(t)\), where k is the number of presynaptic neurons. In a single run, we only observe a concrete spike train S(t) rather than R(t) directly. However, R(t) can be defined as the expectation over S(t) for an infinite number of trials:
\(R(t) = \langle S(t) \rangle = \lim _{M \rightarrow \infty } \frac{1}{M} {\sum }_{k=1}^{M} S_k(t)\), where M is the number of trials and \(S_k(t)\) is the concrete spike train of each trial. In this paper, we use R(t) to derive the learning method because of its smoothness. In a single run, R(t) is replaced by S(t) at a suitable point.
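To make the trial-average definition concrete, here is a minimal sketch (function name and parameter values are our own, not the paper's) that estimates R(t) by averaging simulated binary spike trains over many trials:

```python
import numpy as np

def estimate_rate(spike_trains, dt):
    """Estimate the instantaneous firing rate R(t) as the trial average
    of concrete spike trains S_k(t): R(t) ~ (1/M) * sum_k S_k(t)."""
    # spike_trains: shape (M trials, time bins), entries 0/1
    return spike_trains.mean(axis=0) / dt  # spikes per second

rng = np.random.default_rng(0)
dt = 0.001                                                    # 1 ms bins
trains = (rng.random((5000, 200)) < 20.0 * dt).astype(float)  # ~20 Hz Poisson
r_hat = estimate_rate(trains, dt)
print(round(r_hat.mean(), 1))                                 # close to 20 Hz
```

With M = 5000 trials the empirical average already approximates the smooth rate well, which is why the derivation can work with R(t) and substitute S(t) only at the end.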
For simplicity, the leaky integrate-and-fire (LIF) model [12] is considered. For a postsynaptic neuron, the input synaptic current is calculated as \(I(t) = {\sum }_i w_i I_{PSC}^i(t)\), where \(w_i\) is the synaptic efficacy of the i-th afferent neuron and \(I_{PSC}^i\) is the un-weighted postsynaptic current (PSC) from the corresponding afferent: \(I_{PSC}^i(t) = {\sum }_j K(t-t^j)H(t-t^j)\), where \(t^j\) is the time of the j-th impulse triggered by the i-th afferent neuron, H(t) is the Heaviside function, and K is the normalized kernel: \(K(t-t^j)=V_0 \cdot (\exp (\frac{-(t-t^j)}{\tau _m})-\exp (\frac{-(t-t^j)}{\tau _s}))\). \(V_0\) is the normalization factor, and \(\tau _m\) and \(\tau _s\) are the slow and fast decay constants, respectively; their ratio is fixed at \(\tau _m/\tau _s=4\). When the membrane potential \(V_{m}\) crosses the firing threshold \(\vartheta \), the neuron emits an output spike and the membrane potential is reset to \(V_\text {r}\).
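The kernel and threshold mechanics can be sketched as follows; the function names, the numerical computation of \(V_0\), and the simple subtractive reset are our own simplifications, not the paper's exact dynamics:

```python
import numpy as np

TAU_M, TAU_S = 10.0, 2.5          # slow / fast decay constants (ms)

def kernel(t):
    """Un-normalized double-exponential PSC kernel; zero for t < 0."""
    return (np.exp(-t / TAU_M) - np.exp(-t / TAU_S)) * (t >= 0)

# normalization factor V0 found numerically so the kernel peak equals 1
V0 = 1.0 / kernel(np.arange(0.0, 50.0, 0.01)).max()

def simulate_lif(input_spikes, weights, T=200.0, dt=0.1,
                 theta=1.0, v_reset=0.0):
    """Sum weighted, normalized PSCs and emit an output spike whenever
    the trace crosses theta, applying a subtractive reset afterwards."""
    ts = np.arange(0.0, T, dt)
    v = np.zeros_like(ts)
    for w, spikes in zip(weights, input_spikes):
        for tf in spikes:
            v += w * V0 * kernel(ts - tf)
    out = []
    for i in range(len(ts)):
        if v[i] >= theta:
            out.append(ts[i])
            v[i:] -= v[i] - v_reset   # subtractive reset of the trace
    return out

spikes = simulate_lif([[10.0], [12.0]], [0.8, 0.8])
print(spikes)   # one output spike shortly after the two inputs
```

Each weight of 0.8 alone peaks below the threshold \(\vartheta = 1\); only the coincidence of the two PSCs drives the neuron over threshold, which is the temporal-integration behavior the model relies on.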
2.1 Learning Algorithm
The instantaneous training error is computed from the difference between the actual instantaneous firing rate \(R_{o}^{a}(t)\) and the desired instantaneous firing rate \(R_{o}^{d}(t)\): \(E(t) = \frac{1}{2}(R_{o}^{d}(t) - R_{o}^{a}(t))^2\).
Our goal is to minimize the network error in triggering a required output spike pattern through gradient descent with respect to the synaptic weights, \(\Delta w_{oh} = -\eta \frac{\partial E(t)}{\partial w_{oh}}\),
where \(\eta \) is the learning rate. The derivative of the error function can be further expanded using the chain rule. Since R(t) can be replaced at a suitable point by its single-run estimate S(t), the weights are updated according to \(\Delta w_{oh}(t) = \eta (S_{o}^{d}(t) - S_{o}^{a}(t)) S_h(t)\), with the constant factor from the rate model absorbed into the learning rate.
Following the PSD learning rule derived from the WH rule, we replace the nonlinear product of spike trains with the spike convolution \(\tilde{s}_{h}(t)=s_{h}(t) * K(t)\). Hence, \(\Delta w_{oh}(t) = \eta (s_{d}(t) - s_{o}(t)) \tilde{s}_{h}(t)\).
In PSD, weight adaptation relies only on the current states, unlike rules involving STDP, where both the pre- and postsynaptic spiking times are stored and used for adaptation [13]. By combining Eq. 7, we finally get the total weight update: \(\Delta w_{oh} = \eta \int _0^T (s_{d}(t) - s_{o}(t)) \tilde{s}_{h}(t)\,dt\).
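The output-layer update above can be computed directly; the sketch below (names and the discrete-time handling are our own) exploits the fact that \(s_d\) and \(s_o\) are delta trains, so the integral reduces to a sum of filtered-trace values at the spike times:

```python
import numpy as np

def psd_delta_w(desired, actual, hidden, T=200.0, dt=0.1, eta=0.01,
                tau_m=10.0, tau_s=2.5):
    """Total PSD-style update dw = eta * int [s_d(t) - s_o(t)] * (s_h * K)(t) dt.
    Spike trains are given as lists of firing times in ms; since s_d and
    s_o are delta trains, the integral reduces to a sum of kernel values."""
    ts = np.arange(0.0, T, dt)
    def to_train(times):
        s = np.zeros_like(ts)
        if len(times):
            s[(np.asarray(times) / dt).astype(int)] = 1.0
        return s
    k = np.exp(-ts / tau_m) - np.exp(-ts / tau_s)
    k /= k.max()                                            # kernel peak = 1
    s_h_tilde = np.convolve(to_train(hidden), k)[:len(ts)]  # s_h * K
    err = to_train(desired) - to_train(actual)
    return eta * float(np.sum(err * s_h_tilde))

# a hidden spike 5 ms before a missing desired output spike -> potentiation
dw = psd_delta_w(desired=[40.0], actual=[], hidden=[35.0])
print(dw > 0)   # True
```

Swapping the `desired` and `actual` arguments flips the sign, i.e. an undesired output spike depresses the weight, mirroring the error term \(s_d - s_o\).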
The weight modifications for hidden-layer neurons are computed similarly. Applying the chain rule through the output layer, the weight modification formula for hidden neurons becomes \(\Delta w_{hi}(t) = \frac{\eta }{n_o} {\sum }_o (s_{d}^{o}(t) - s_{o}^{a}(t))\, w_{oh}\, \tilde{s}_{i}(t)\), where \(n_o\) is the number of output neurons.
To modify the synaptic weights in the same gradient direction, we use the modulus \(|w_{oh}|\) as mentioned in [8]: \(\Delta w_{hi}(t) = \frac{\eta }{n_o} {\sum }_o (s_{d}^{o}(t) - s_{o}^{a}(t))\, |w_{oh}|\, \tilde{s}_{i}(t)\).
The total weight change for the hidden neurons is \(\Delta w_{hi} = \frac{\eta }{n_o} \int _0^T {\sum }_o (s_{d}^{o}(t) - s_{o}^{a}(t))\, |w_{oh}|\, \tilde{s}_{i}(t)\,dt\).
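A hedged sketch of this hidden-layer update follows (function and variable names are ours): the output error of each output neuron is weighted by the magnitude \(|w_{oh}|\), combined with the kernel-filtered input train, and averaged over the \(n_o\) outputs:

```python
import numpy as np

def hidden_delta_w(desired_per_out, actual_per_out, w_out, input_times,
                   T=200.0, dt=0.1, eta=0.01, tau_m=10.0, tau_s=2.5):
    """Backpropagate the spike-train error through |w_oh| to a hidden
    synapse: dw_hi = (eta / n_o) * sum_o |w_oh| * int [s_d^o - s_a^o] * s_i~ dt."""
    ts = np.arange(0.0, T, dt)
    k = np.exp(-ts / tau_m) - np.exp(-ts / tau_s)
    k /= k.max()
    def to_train(times):
        s = np.zeros_like(ts)
        if len(times):
            s[(np.asarray(times) / dt).astype(int)] = 1.0
        return s
    s_i_tilde = np.convolve(to_train(input_times), k)[:len(ts)]
    dw = 0.0
    for d, a, w in zip(desired_per_out, actual_per_out, w_out):
        err = to_train(d) - to_train(a)
        dw += abs(w) * float(np.sum(err * s_i_tilde))  # delta-train integral
    return eta * dw / len(w_out)

# one output neuron missing its desired spike at 40 ms; input spiked at 35 ms
dw_h = hidden_delta_w([[40.0]], [[]], [0.5], [35.0])
print(dw_h > 0)   # |w_oh| keeps the update in the gradient direction
```

Using \(|w_{oh}|\) rather than \(w_{oh}\) means a negative output weight cannot invert the sign of the backpropagated error, which is the point of the modification from [8].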
The weights are further adjusted by synaptic scaling [8], \(w \leftarrow (1+f)\,w\),
where f is the scaling factor: \(f > 0\) when the firing rate \(r < r_{min}\), and \(f < 0\) when \(r > r_{max}\). Keeping the postsynaptic firing rate within a particular range \([r_{min},r_{max}]\) reduces the sensitivity of the network to its initial state. We introduce the van Rossum metric [13] with a filter function to measure the distance between two spike trains, written as \(D = \frac{1}{\tau } \int _0^{\infty } (f(t) - g(t))^2\,dt\),
where f(t) and g(t) are the filtered signals of the two pulse trains and \(\tau \) is a free parameter.
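The metric is straightforward to implement; this sketch (our own discretization, with the standard causal exponential as the filter) computes the distance used as the training error below:

```python
import numpy as np

def van_rossum_distance(train_a, train_b, T=200.0, dt=0.1, tau=10.0):
    """Van Rossum distance: filter each delta train with a causal
    exponential, then D = (1/tau) * int_0^T (f(t) - g(t))^2 dt."""
    ts = np.arange(0.0, T, dt)
    kern = np.exp(-ts / tau)
    def filtered(times):
        s = np.zeros_like(ts)
        if len(times):
            s[(np.asarray(times) / dt).astype(int)] = 1.0
        return np.convolve(s, kern)[:len(ts)]
    f, g = filtered(train_a), filtered(train_b)
    return float(np.sum((f - g) ** 2) * dt / tau)

print(van_rossum_distance([40.0], [40.0]))        # 0.0 for identical trains
print(van_rossum_distance([40.0], [80.0]) > 0)    # True
```

With this normalization, one extra (or missing) spike far from any other contributes a distance of about 0.5, so the value scales naturally with the number of mismatched spikes.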
3 Simulations
3.1 Learning Sequences of Spikes
There are N input neurons, each sending a random spike train drawn from a uniform distribution over a time interval T. The hidden layer contains H neurons, and the output layer contains M neurons. The default parameters in the following experiments are \(N = 100\), \(H = 200\), \(M = 1\), and \(T = 0.2\) s. The time step is \(dt = 0.01\). The initial synaptic weights are randomly drawn from a Gaussian distribution with mean 0 and standard deviation 0.1. The spike threshold is \(\vartheta = 1\), and the reset potential is 0. The refractory time is set to \(t_{ref} = 0\). We set \(\eta = 0.01\), \(\tau _m = 10\) ms, and \(\tau _s = 2.5\) ms. The target spike sequence is specified as [40, 80, 120, 160] ms. Each run trains for up to 500 epochs or until the distance reaches 0, and 20 independent runs are averaged for the reported results. Figure 2 shows the learning process. During the time window T, we use the van Rossum distance to measure the training error. Initially, the neuron may fire at arbitrary times, which yields a large distance. During training, the neuron is gradually taught to fire at the desired times, reflected in a decreasing distance. After 76 learning epochs, the firing times of the output spikes match the target spikes and the error is reduced to 0. This experiment shows that our method can successfully train a neuron to fire a target spike sequence within a few training epochs.
3.2 Classification on the UCI Dataset
Iris. A basic benchmark dataset for plant classification containing 3 types of iris plants, with 50 samples per class, each represented by 4 variables (150 instances in total). 50% of the samples of each class are chosen to build the training set, and the rest are used for testing. We use population coding, as described in [14], to convert the Iris features into spike times: each feature value is encoded by 6 identically shaped, overlapping Gaussian functions, yielding 4 \(\times \) 6 = 24 input spikes fed to 24 synapses. In addition, all patterns have 5 additional input synapses with spikes at fixed times [2, 4, 6.5, 7.5, 10] ms to ensure that the target spikes can be launched. The total number of input neurons is thus 4 \(\times \) 6 + 5 = 29. There are 50 hidden neurons and 3 output neurons. The total duration of the input pattern is \(T = 10\) ms. The network is trained to trigger a desired train of [6.5, 7.5] ms on the output neuron corresponding to the correct input category and to keep the other outputs silent.
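The Gaussian population-coding step can be sketched as follows; the receptive-field spacing, the width factor, and the response-to-latency mapping are assumptions for illustration, not the paper's exact parameters:

```python
import numpy as np

def population_encode(x, x_min, x_max, n=6, t_max=10.0):
    """Encode one feature value as n spike times via overlapping Gaussian
    receptive fields: a strong response maps to an early spike, a weak
    response to a late one (near t_max)."""
    centers = np.linspace(x_min, x_max, n)
    sigma = (x_max - x_min) / (n - 1) / 1.5       # assumed width factor
    resp = np.exp(-0.5 * ((x - centers) / sigma) ** 2)
    return (1.0 - resp) * t_max                   # high response -> early spike

# a feature value of 5.1 on an assumed feature range [4.3, 7.9]
times = population_encode(5.1, 4.3, 7.9)
print(times.round(2))   # 6 spike times in [0, 10] ms; neuron 1 fires earliest
```

Each of the 4 Iris features passes through such a bank of 6 fields, producing the 24 feature-driven input spike times within the 10 ms pattern window.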
As shown in Table 1, our approach achieves comparable or even higher accuracy (96 ± 1.3%) than traditional neural networks [15, 16], showing that our method succeeds in training temporal SNNs. Comparing with other spike-based methods in Table 1, SpikeProp [16], Xie et al. [18, 19], BP-STDP [20], and the proposed method achieve similarly high accuracy on the Iris dataset. However, whereas SpikeProp [16] requires 1000 epochs to converge, the proposed method needs only 120. Although Xie et al. [18, 19] improved training efficiency and reduced the training epochs from 18 to 2, their method is not a true multilayer SNN: only the synaptic weights from input to hidden neurons are adjusted, while all weights from hidden to output neurons are fixed at 1. BP-STDP and Multi-ReSuMe use about 75% of the Iris dataset per class for training, whereas we use only 50% and still improve classification performance significantly on the test set. In addition, unlike Taherkhani et al. [8, 10, 16], the proposed method does not need sub-connections and thus reduces the number of weight modifications.
4 Conclusion
This paper proposes a novel supervised multispike learning algorithm for multilayer SNNs, which can trigger multiple spikes at precise desired times in each layer. The proposed method derives the weight update rule from the WH rule and then credits the network error simultaneously to preceding layers using backpropagation. Experimental results show that our method achieves high learning accuracy with a significant improvement in the computational efficiency of the network.
References
Butts, D.A., et al.: Temporal precision in the neural code and the timescales of natural vision. Nature 449(7158), 92 (2007)
Knudsen, E.I.: Instructed learning in the auditory localization pathway of the barn owl. Nature 417(6886), 322 (2002)
Pfister, J.P., Toyoizumi, T., Barber, D., Gerstner, W.: Optimal spike-timing-dependent plasticity for precise action potential firing in supervised learning. Neural Comput. 18(6), 1318–1348 (2006)
Gardner, B., Grüning, A.: Supervised learning in spiking neural networks for precise temporal encoding. PLoS ONE 11(8), e0161335 (2016)
Ponulak, F., Kasiński, A.: Supervised learning in spiking neural networks with ReSuMe: sequence learning, classification, and spike shifting. Neural Comput. 22(2), 467–510 (2010)
McKennoch, S., Liu, D., Bushnell, L.G.: Fast modifications of the SpikeProp algorithm. In: The 2006 IEEE International Joint Conference on Neural Network Proceedings, pp. 3970–3977. IEEE (2006)
Shrestha, S.B., Song, Q.: Adaptive learning rate of SpikeProp based on weight convergence analysis. Neural Netw. 63, 185–198 (2015)
Sporea, I., Grüning, A.: Supervised learning in multilayer spiking neural networks. Neural Comput. 25(2), 473–509 (2013)
Taherkhani, A., Belatreche, A., Li, Y., Maguire, L.P.: DL-ReSuMe: a delay learning-based remote supervised method for spiking neurons. IEEE Trans. Neural Netw. Learn. Syst. 26(12), 3137–3149 (2015)
Taherkhani, A., Belatreche, A., Li, Y., Maguire, L.P.: A supervised learning algorithm for learning precise timing of multiple spikes in multilayer spiking neural networks. IEEE Trans. Neural Netw. Learn. Syst. 99, 1–14 (2018)
Wade, J.J., McDaid, L.J., Santos, J.A., Sayers, H.M.: SWAT: a spiking neural network training algorithm for classification problems. IEEE Trans. Neural Netw. 21(11), 1817–1830 (2010)
Gerstner, W., Kistler, W.M.: Spiking Neuron Models: Single Neurons, Populations, Plasticity. Cambridge University Press (2002)
Yu, Q., Tang, H., Tan, K.C., Li, H.: Precise-spike-driven synaptic plasticity: learning hetero-association of spatiotemporal spike patterns. PLoS ONE 8(11), e78318 (2013)
Snippe, H.P.: Parameter extraction from population codes: a critical assessment. Neural Comput. 8(3), 511–529 (1996)
Wang, J., Belatreche, A., Maguire, L., Mcginnity, T.M.: An online supervised learning method for spiking neural networks with adaptive structure. Neurocomputing 144, 526–536 (2014)
Bohte, S.M., Kok, J.N., La Poutre, H.: Error-backpropagation in temporally encoded networks of spiking neurons. Neurocomputing 48(1), 17–37 (2002)
Xu, Y., Zeng, X., Han, L., Yang, J.: A supervised multi-spike learning algorithm based on gradient descent for spiking neural networks. Neural Netw. 43, 99–113 (2013)
Xie, X., Qu, H., Liu, G., Zhang, M., Kurths, J.: An efficient supervised training algorithm for multilayer spiking neural networks. PLoS ONE 11(4), e0150329 (2016)
Xie, X., Qu, H., Yi, Z., Kurths, J.: Efficient training of supervised spiking neural network via accurate synaptic-efficiency adjustment method. IEEE Trans. Neural Netw. Learn. Syst. 28(6), 1411–1424 (2017)
Tavanaei, A., Maida, A.: BP-STDP: approximating backpropagation using spike timing dependent plasticity. Neurocomputing 330, 39–47 (2019)
© 2020 Springer Nature Switzerland AG
Xiao, R., Geng, T. (2020). A Supervised Learning Algorithm for Learning Precise Timing of Multispike in Multilayer Spiking Neural Networks. In: Yang, H., Pasupa, K., Leung, A.CS., Kwok, J.T., Chan, J.H., King, I. (eds) Neural Information Processing. ICONIP 2020. Communications in Computer and Information Science, vol 1333. Springer, Cham. https://doi.org/10.1007/978-3-030-63823-8_55
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63822-1
Online ISBN: 978-3-030-63823-8