Abstract
This chapter presents the area of Cybernetics and its relationship to Machine Learning (ML), Learning Automata (LA), Reinforcement Learning (RL), and Estimator Algorithms, all of which are considered topics within Artificial Intelligence. In particular, Learning Automata are probabilistic finite state machines that have been used to model how biological systems learn. The structure of such a machine can be fixed, or it can change with time. A Learning Automaton can also be implemented using action-probability updating rules, which may or may not depend on estimates obtained from the Environment being investigated.
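As a concrete illustration of the action-probability updating rules mentioned above, the following is a minimal sketch of the classical linear reward-inaction (L_RI) scheme for a two-action automaton in a stationary random environment. The reward probabilities, learning rate, and function name here are illustrative assumptions, not taken from the chapter.

```python
import random

def l_ri(reward_probs, learning_rate=0.01, steps=50000, seed=0):
    """Linear reward-inaction (L_RI) automaton with two actions.
    reward_probs[i] is the (hypothetical) probability that the
    Environment rewards action i."""
    rng = random.Random(seed)
    p = [0.5, 0.5]  # action probabilities, initially uniform
    for _ in range(steps):
        a = 0 if rng.random() < p[0] else 1      # choose an action
        rewarded = rng.random() < reward_probs[a]
        if rewarded:
            # Reward: move probability mass toward the chosen action.
            for j in range(2):
                if j == a:
                    p[j] += learning_rate * (1.0 - p[j])
                else:
                    p[j] *= (1.0 - learning_rate)
        # Penalty: do nothing -- the "inaction" part of L_RI.
    return p

# Action 0 is rewarded more often, so p[0] will typically approach 1.
p = l_ri([0.8, 0.4])
print(p)
```

The update preserves the total probability mass exactly, and ignoring penalties is what makes the scheme absorbing: as the learning rate shrinks, the probability of converging to the better action approaches unity (epsilon-optimality).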
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this chapter
Oommen, B.J., Yazidi, A., Misra, S. (2023). Cybernetics, Machine Learning, and Stochastic Learning Automata. In: Nof, S.Y. (eds) Springer Handbook of Automation. Springer Handbooks. Springer, Cham. https://doi.org/10.1007/978-3-030-96729-1_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96728-4
Online ISBN: 978-3-030-96729-1
eBook Packages: Intelligent Technologies and Robotics (R0)