Abstract
In this paper, we propose a model of ESNs that eliminates critical dependence on hyper-parameters, resulting in networks that provably cannot enter a chaotic regime and, at the same time, exhibit nonlinear behaviour in phase space characterised by a large memory of past inputs, comparable to that of linear networks. Our contribution is supported by experiments corroborating our theoretical findings, showing that the proposed model displays dynamics rich enough to approximate many common nonlinear systems used for benchmarking.
Although the use of Recurrent Neural Networks (RNNs) in machine learning is growing rapidly, also as effective building blocks for deep learning architectures, a comprehensive understanding of their working principles is still missing [4, 26]. Of particular relevance are Echo State Networks (ESNs), introduced by Jaeger [13] and independently by Maass et al. [16] under the name of Liquid State Machine (LSM), which stand out among RNNs for their training simplicity. The basic idea behind ESNs is to create a randomly connected recurrent network, called the reservoir, and feed it with a signal so that the network encodes the underlying dynamics in its internal states. The desired, task-dependent output is then generated by a readout layer (usually linear) trained to map the states to the desired outputs. Despite the simplified training protocol, ESNs are universal function approximators [10] and have proven effective in many relevant tasks [2, 3, 7, 19,20,21,22].
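To make the architecture concrete, the following is a minimal NumPy sketch of a standard ESN (not the model proposed here): a fixed random reservoir driven by the input, with only a linear ridge readout trained. The network size, spectral radius of 0.9, input scaling, and regularization strength are illustrative choices, not values taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 100, 1

# Fixed random reservoir, rescaled to a chosen spectral radius; fixed random input weights.
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
W_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))

def run_reservoir(u):
    """Drive the reservoir with the input sequence u and collect the internal states."""
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W @ x + W_in @ np.atleast_1d(u_t))
        states.append(x.copy())
    return np.array(states)

# Toy task: learn y_t = u_{t-1} with a linear (ridge) readout -- the only trained part.
u = rng.uniform(-1, 1, size=500)
X = run_reservoir(u)[50:]          # drop an initial washout of 50 steps
y = u[49:-1]                       # target: the previous input
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)
```

Only `W_out` is learned; reservoir and input weights stay at their random initialization, which is what makes ESN training so simple.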
These networks are known to be sensitive to the setting of hyper-parameters like the Spectral Radius (SR), the input scaling and the degree of sparseness [13], which critically affect their behaviour and, hence, the performance at task. Fine-tuning the hyper-parameters requires cross-validation or ad-hoc criteria for selecting the best-performing configuration. Experimental evidence and some theoretical results show that ESN performance is usually maximised within a very narrow region of hyper-parameter space called the Edge of Chaos (EoC) [1, 6, 14, 15, 23,24,25, 30]. Beyond such a region, however, ESNs behave chaotically, resulting in useless and unreliable computations. At the same time, it is anything but trivial to configure the hyper-parameters so that the network lies on the EoC while still guaranteeing non-chaotic behaviour. A very important property for ESNs is the Echo State Property (ESP), which asserts that the network behaviour should depend only on the signal driving it, regardless of initial conditions [32]. Despite being at the foundation of theoretical results [10], the ESP in its original formulation raises some issues, mainly because it does not account for multi-stability and is not tightly linked to properties of the specific input signal driving the network [17, 31, 32].
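The ESP can be probed empirically: two trajectories started from very different initial states and driven by the same input should converge, so that the state eventually depends only on the input history. A minimal sketch follows; the spectral radius of 0.9 and the sequence length are illustrative, and a spectral radius below 1 does not guarantee the ESP in general, it merely tends to produce it in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100

# Random reservoir rescaled so its spectral radius is 0.9.
W = rng.normal(size=(n, n))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-0.5, 0.5, size=n)

def drive(x, inputs):
    """Iterate the reservoir map from initial state x under the given input sequence."""
    for u in inputs:
        x = np.tanh(W @ x + w_in * u)
    return x

inputs = rng.uniform(-1, 1, size=500)
x_a = drive(np.ones(n), inputs)    # two very different initial conditions,
x_b = drive(-np.ones(n), inputs)   # washed out by the same driving signal
```

After the washout the two states essentially coincide: the initial condition has been forgotten and only the input echoes remain.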
In this context, the analysis of the memory capacity (as measured by the ability of the network to reconstruct or remember past inputs) of input-driven systems plays a fundamental role in the study of ESNs [8, 9, 12, 27]. In particular, it is known that ESNs are characterized by a memory–nonlinearity trade-off [5, 11, 28], in the sense that introducing nonlinear dynamics in the network degrades memory capacity. Moreover, it has been recently shown that optimizing memory capacity does not necessarily lead to networks with higher prediction performance [18].
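As a rough illustration of how this kind of memory capacity is measured, the sketch below trains one linear readout per delay k to reconstruct the input u(t-k) from the current state, and sums the squared correlations over delays. The network size, small input scaling (which keeps the reservoir near its linear regime), and the delay cutoff of 40 are arbitrary illustrative choices, not the protocol of the cited works.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, washout = 100, 2000, 200

# Reservoir at spectral radius 0.9, weak input drive.
W = rng.normal(size=(n, n))
W *= 0.9 / max(abs(np.linalg.eigvals(W)))
w_in = rng.uniform(-0.1, 0.1, size=n)

# Collect states driven by i.i.d. uniform input, discarding a washout.
u = rng.uniform(-1, 1, size=T)
x = np.zeros(n)
X = np.empty((T, n))
for t in range(T):
    x = np.tanh(W @ x + w_in * u[t])
    X[t] = x
X = X[washout:]

def mc_at_delay(k):
    """Squared correlation between the linear reconstruction of u(t-k) and its true value."""
    y = u[washout - k : T - k]
    w = np.linalg.solve(X.T @ X + 1e-8 * np.eye(n), X.T @ y)
    return np.corrcoef(X @ w, y)[0, 1] ** 2

mc = sum(mc_at_delay(k) for k in range(1, 41))  # truncated memory capacity
```

Increasing the input scaling pushes the neurons into their nonlinear regime and visibly lowers `mc`, which is the memory–nonlinearity trade-off mentioned above.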
In a recent paper [29], we proposed an ESN model that eliminates critical dependence on hyper-parameters, resulting in models that cannot enter a chaotic regime. In addition to this major outcome, we showed that such networks exhibit nonlinear behaviour in phase space characterised by a large memory of past inputs (see Fig. 1): the proposed model generates dynamics rich enough to approximate nonlinear systems typically used as benchmarks. Our contribution was based on a nonlinear activation function that normalizes neuron activations onto a hyper-sphere. We showed that the spectral radius of the reservoir, the most important hyper-parameter for controlling ESN behaviour, plays a marginal role in the stability of the proposed model, although it does affect the capability of the network to memorize past inputs. Our theoretical analysis demonstrates that this property derives from the impossibility for the system to display chaotic behaviour: in fact, the maximum Lyapunov exponent is always zero. An interpretation of this very important outcome is that the network always operates on the EoC, regardless of the setting chosen for its hyper-parameters.
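A minimal sketch of a self-normalizing update in the spirit of [29] follows: the state is projected back onto the unit hyper-sphere after each driven step, so its norm is constant by construction and activity can neither explode nor die out, whatever the scaling of the reservoir matrix. The plain projection used here is a simplification for illustration; the exact activation function in [29] may differ.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100

W = rng.normal(size=(n, n)) / np.sqrt(n)
w_in = rng.uniform(-0.5, 0.5, size=n)

def step(x, u):
    """Linear drive followed by projection back onto the unit hyper-sphere."""
    a = W @ x + w_in * u
    return a / np.linalg.norm(a)

# Start from a random point on the sphere and drive the network.
x = rng.normal(size=n)
x /= np.linalg.norm(x)
for u in rng.uniform(-1, 1, size=200):
    x = step(x, u)
# The dynamics live on the sphere: the state norm stays exactly 1 at every step,
# removing the blow-up/die-out behaviour that the spectral radius normally controls.
```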
References
Bertschinger, N., Natschläger, T.: Real-time computation at the edge of chaos in recurrent neural networks. Neural Comput. 16(7), 1413–1436 (2004). https://doi.org/10.1162/089976604323057443
Bianchi, F.M., Scardapane, S., Løkse, S., Jenssen, R.: Reservoir computing approaches for representation and classification of multivariate time series. arXiv preprint arXiv:1803.07870 (2018)
Bianchi, F.M., Scardapane, S., Uncini, A., Rizzi, A., Sadeghian, A.: Prediction of telephone calls load using echo state network with exogenous variables. Neural Netw. 71, 204–213 (2015). https://doi.org/10.1016/j.neunet.2015.08.010
Ceni, A., Ashwin, P., Livi, L.: Interpreting recurrent neural networks behaviour via excitable network attractors. Cogn. Comput. (2019). https://doi.org/10.1007/s12559-019-09634-2
Dambre, J., Verstraeten, D., Schrauwen, B., Massar, S.: Information processing capacity of dynamical systems. Sci. Rep. 2 (2012). https://doi.org/10.1038/srep00514
Gallicchio, C.: Chasing the echo state property. arXiv preprint arXiv:1811.10892 (2018)
Gallicchio, C., Micheli, A., Pedrelli, L.: Comparison between DeepESNs and gated RNNs on multivariate time-series prediction. arXiv preprint arXiv:1812.11527 (2018)
Ganguli, S., Huh, D., Sompolinsky, H.: Memory traces in dynamical systems. Proc. Nat. Acad. Sci. 105(48), 18970–18975 (2008). https://doi.org/10.1073/pnas.0804451105
Goudarzi, A., Marzen, S., Banda, P., Feldman, G., Teuscher, C., Stefanovic, D.: Memory and information processing in recurrent neural networks. arXiv preprint arXiv:1604.06929 (2016)
Grigoryeva, L., Ortega, J.P.: Echo state networks are universal. Neural Netw. 108, 495–508 (2018). https://doi.org/10.1016/j.neunet.2018.08.025
Inubushi, M., Yoshimura, K.: Reservoir computing beyond memory-nonlinearity trade-off. Sci. Rep. 7(1), 10199 (2017). https://doi.org/10.1038/s41598-017-10257-6
Jaeger, H.: Short term memory in echo state networks, vol. 5. GMD-Forschungszentrum Informationstechnik (2002)
Jaeger, H., Haas, H.: Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication. Science 304(5667), 78–80 (2004). https://doi.org/10.1126/science.1091277
Legenstein, R., Maass, W.: Edge of chaos and prediction of computational performance for neural circuit models. Neural Netw. 20(3), 323–334 (2007). https://doi.org/10.1016/j.neunet.2007.04.017
Livi, L., Bianchi, F.M., Alippi, C.: Determination of the edge of criticality in echo state networks through Fisher information maximization. IEEE Trans. Neural Netw. Learn. Syst. 29(3), 706–717 (2018). https://doi.org/10.1109/TNNLS.2016.2644268
Maass, W., Natschläger, T., Markram, H.: Real-time computing without stable states: a new framework for neural computation based on perturbations. Neural Comput. 14(11), 2531–2560 (2002). https://doi.org/10.1162/089976602760407955
Manjunath, G., Jaeger, H.: Echo state property linked to an input: exploring a fundamental characteristic of recurrent neural networks. Neural Comput. 25(3), 671–696 (2013). https://doi.org/10.1162/NECO_a_00411
Marzen, S.: Difference between memory and prediction in linear recurrent networks. Phys. Rev. E 96(3), 032308 (2017). https://doi.org/10.1103/PhysRevE.96.032308
Palumbo, F., Gallicchio, C., Pucci, R., Micheli, A.: Human activity recognition using multisensor data fusion based on reservoir computing. J. Ambient Intell. Smart Environ. 8(2), 87–107 (2016)
Pathak, J., Hunt, B., Girvan, M., Lu, Z., Ott, E.: Model-free prediction of large spatiotemporally chaotic systems from data: a reservoir computing approach. Phys. Rev. Lett. 120(2), 024102 (2018)
Pathak, J., Lu, Z., Hunt, B.R., Girvan, M., Ott, E.: Using machine learning to replicate chaotic attractors and calculate Lyapunov exponents from data. Chaos: Interdisc. J. Nonlinear Sci. 27(12), 121102 (2017). https://doi.org/10.1063/1.5010300
Pathak, J., et al.: Hybrid forecasting of chaotic processes: using machine learning in conjunction with a knowledge-based model. Chaos: Interdisc. J. Nonlinear Sci. 28(4), 041101 (2018). https://doi.org/10.1063/1.5028373
Rajan, K., Abbott, L.F., Sompolinsky, H.: Stimulus-dependent suppression of chaos in recurrent neural networks. Phys. Rev. E 82(1), 011903 (2010). https://doi.org/10.1103/PhysRevE.82.011903
Rivkind, A., Barak, O.: Local dynamics in trained recurrent neural networks. Phys. Rev. Lett. 118, 258101 (2017). https://doi.org/10.1103/PhysRevLett.118.258101
Sompolinsky, H., Crisanti, A., Sommers, H.J.: Chaos in random neural networks. Phys. Rev. Lett. 61(3), 259 (1988). https://doi.org/10.1103/PhysRevLett.61.259
Sussillo, D., Barak, O.: Opening the black box: low-dimensional dynamics in high-dimensional recurrent neural networks. Neural Comput. 25(3), 626–649 (2013). https://doi.org/10.1162/NECO_a_00409
Tiňo, P., Rodan, A.: Short term memory in input-driven linear dynamical systems. Neurocomputing 112, 58–63 (2013). https://doi.org/10.1016/j.neucom.2012.12.041
Verstraeten, D., Dambre, J., Dutoit, X., Schrauwen, B.: Memory versus non-linearity in reservoirs. In: IEEE International Joint Conference on Neural Networks, pp. 1–8. IEEE, Barcelona (2010)
Verzelli, P., Alippi, C., Livi, L.: Echo state networks with self-normalizing activations on the hyper-sphere. arXiv preprint arXiv:1903.11691 (2019)
Verzelli, P., Livi, L., Alippi, C.: A characterization of the edge of criticality in binary echo state networks. In: 2018 IEEE 28th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6. IEEE (2018). https://doi.org/10.1109/MLSP.2018.8516959
Wainrib, G., Galtier, M.N.: A local echo state property through the largest Lyapunov exponent. Neural Netw. 76, 39–45 (2016). https://doi.org/10.1016/j.neunet.2015.12.013
Yildiz, I.B., Jaeger, H., Kiebel, S.J.: Re-visiting the echo state property. Neural Netw. 35, 1–9 (2012). https://doi.org/10.1016/j.neunet.2012.07.005
Cite this paper
Verzelli, P., Alippi, C., Livi, L. (2019). Hyper-spherical Reservoirs for Echo State Networks. In: Tetko, I., Kůrková, V., Karpov, P., Theis, F. (eds) Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions. ICANN 2019. Lecture Notes in Computer Science(), vol 11731. Springer, Cham. https://doi.org/10.1007/978-3-030-30493-5_9