Abstract
Automata learning techniques automatically generate systemmodels fromtest observations. Typically, these techniques fall into two categories: passive and active. On the one hand, passive learning assumes no interaction with the system under learning and uses a predetermined training set, e.g., system logs. On the other hand, active learning techniques collect training data by actively querying the system under learning, allowing one to steer the discovery ofmeaningful information about the systemunder learning leading to effective learning strategies. A notable example of active learning technique for regular languages is Angluin’s \(L^*\)-algorithm. The \(L^*\)-algorithm describes the strategy of a student who learns the minimal deterministic finite automaton of an unknown regular language \(L\) by asking a succinct number of queries to a teacher who knows \(L\).
In this work, we study \(L^*\) -based learning of deterministic Markov decision processes, a class of Markov decision processes where an observation following an action uniquely determines a successor state. For this purpose, we first assume an ideal setting with a teacher who provides perfect information to the student. Then, we relax this assumption and present a novel learning algorithm that collects information by sampling execution traces of the system via testing.
Experiments performed on an implementation of our sampling-based algorithm suggest that our method achieves better accuracy than state-of-the-art passive learning techniques using the same amount of test obser vations. In contrast to existing learning algorithms which assume a predefined number of states, our algorithm learns the complete model structure including the state space.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Aspnes, J., Herlihy, M.: Fast randomized consensus using shared memory. J Algorithms 11(3), 441–461 (1990)
Aichernig, B.K., Mostowski, W., Mousavi, M.R., Tappler, M., Taromirad, M.: Model learning and model-based testing. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) Machine learning for dynamic software analysis: potentials and limits–international Dagstuhl seminar 16172, Dagstuhl Castle, Germany, April 24–27, 2016, revised papers. lecture notes in computer science, vol. 11026, pp. 74–100. Springer (2018)
Angluin, D.: Learning regular sets from queries and counterexamples. Inf Comput 75(2), 87–106 (1987)
Aichernig BK, Tappler M (2017) Learning from faults: mutation testing in active automata learning. In: Barrett C, Davies M, Kahsai T (eds) NASA formal methods—9th international symposium, NFM 2017, Moffett Field, CA, USA, May 16–18, 2017, proceedings, volume 10227 of lecture notes in computer science, pp 19–34
Aichernig, B.K., Tappler, M.: Probabilistic black-box reachability checking. In: Lahiri, S.K., Reger, G. (eds.) Runtime verification–17th international conference, RV 2017, Seattle, WA, USA, September 13–16, 2017, proceedings. lecture notes in computer science, pp 50–67, vol. 10548. Springer (2017)
Aichernig Bernhard, K., Martin, T.: Efficient active automata learning via mutation testing. J Autom Reason 63(4), 1103–1134 (2019)
Aichernig BK, Tappler M (2019) Probabilistic black-box reachability checking (extended version). Form Methods Syst Des
Bacci G, Bacci G, Guldstrand LK, Mardare R. MDPDist library. http://people.cs.aau.dk/~giovbacci/tools/bisimdist.zip. Accessed on 04 Nov 2019
Bacci G, Bacci G, Larsen KG, Mardare R (2013) The BisimDist library: efficient computation of bisimilarity distances for Markovian models. In: Joshi KR, Siegle M, Stoelinga M, D'Argenio PR (eds) Quantitative evaluation of systems—10th international conference, QEST 2013, Buenos Aires, Argentina, August 27–30, 2013. Proceedings, volume 8054 of lecture notes in computer science. Springer, pp 278–281
Bacci G, Bacci G, Larsen KG, Mardare R (2013) Computing behavioral distances, compositionally. In: Chatterjee K, Sgall J (eds) Mathematical foundations of computer science 2013—38th international symposium, MFCS 2013, Klosterneuburg, Austria, August 26–30, 2013. Proceedings, volume 8087 of lecture notes in computer science. Springer, pp 74–85
Baier, C., Katoen, J.P.: Principles of model checking. MIT Press, Cambridge (2008)
Bergadano, F., Varricchio, S.: Learning behaviors of automata from multiplicity and equivalence queries. SIAM J Comput 25(6), 1268–1280 (1996)
Castro, J., Gavalda, R.: Learning probability distributions generated by finite-state machines. In: Heinz, J., Sempere, J.M. (eds.) Topics in grammatical inference, pp. 113–142. Springer, Berlin (2016)
Cassel, S., Howar, F., Jonsson, B., Steffen, B.: Active learning for extended finite state machines. Formal Aspects Comput 28(2), 233–263 (2016)
Chow, T.S.: Testing software design modeled by finite-state machines. IEEE Trans Software Eng 4(3), 178–187 (1978)
Chen Y, Nielsen TD (2012) Active learning of Markov decision processes for system verification. In: 11th international conference on machine learning and applications, ICMLA, Boca Raton, FL, USA, December 12–15, 2012. Volume 2. IEEE, pp 289–294
Carrasco RC, Oncina J (1994) Learning stochastic regular grammars by means of a state merging method. In: Carrasco RC, Oncina J (eds) Grammatical inference and applications, second international colloquium, ICGI-94, Alicante, Spain, September 21–23, 1994. Proceedings, volume 862 of lecture notes in computer science. Springer, pp 139–152
Carrasco, R.C., Oncina, J.: Learning deterministic regular grammars from stochastic samples in polynomial time. Theor Inf Appl 33(1), 1–20 (1999)
de la Higuera, C.: Grammatical inference: learning automata and grammars. Cambridge University Press, New York (2010)
Feng L, Han T, Kwiatkowska MZ, Parker D (2011) Learning-based compositional verification for synchronous probabilistic systems. In: Bultan T, Hsiung P-A (eds) Automated technology for verification and analysis, 9th international symposium, ATVA 2011, Taipei, Taiwan, October 11–14, 2011. Proceedings, volume 6996 of lecture notes in computer science. Springer, pp 511–521
Forejt V, Kwiatkowska MZ, Norman G, Parker D (2011) Automated verification techniques for probabilistic systems. In: Bernardo M, Issarny V (eds) Formal methods for eternal networked software systems—11th international school on formal methods for the design of computer, communication and software systems, SFM 2011, Bertinoro, Italy, June 13–18, 2011. Advanced lectures, volume 6659 of lecture notes in computer science. Springer, pp 53–113
C Ghezzi et al (2014) Mining behavior models from user-intensive web applications. In: Jalote P, Briand LC, van der Hoek A (eds) 36th international conference on software engineering, ICSE'14, Hyderabad, India—May 31–June 07, 2014. ACM, pp 277–287
Hungar H, Niese O, Steffen B (2003) Domain-specific optimization in automata learning. In: Hunt Jr. WA, Somenzi F (eds) Computer aided verification, 15th international conference, CAV 2003, Boulder, CO, USA, July 8–12, 2003. Proceedings, volume 2725 of lecture notes in computer science. Springer, pp 315–327
Hoeffding, W.: Probability inequalities for sums of bounded random variables. J Am Stat Assoc 58(301), 13–30 (1963)
Howar, F., Steffen, B.: Active automata learning in practice–an annotated bibliography of the years 2011 to 2016. In: Bennaceur, A., Hähnle, R., Meinke, K. (eds.) Machine learning for dynamic software analysis: potentials and limits–international Dagstuhl seminar 16172, Dagstuhl Castle, Germany, April 24–27, 2016, revised papers. lecture notes in computer science, vol. 11026, pp. 123–148. Springer (2018)
Isberner M, Howar F, Steffen B (2014) The TTT algorithm: a redundancy-free approach to active automata learning. In: Bonakdarpour B, Smolka SA (eds) Runtime verification—5th international conference, RV 2014, Toronto, ON, Canada, September 22–25, 2014. Proceedings, volume 8734 of lecture notes in computer science. Springer, pp 307–322
Kearns, M.J.: Efficient noise-tolerant learning from statistical queries. J ACM 45(6), 983–1006 (1998)
Kwiatkowska Marta, Z., Gethin, N., David, P.: Analysis of a gossip protocol in PRISM. SIGMETRICS Perform Eval Rev 36(3), 17–22 (2008)
Kwiatkowska MZ, Norman G, Parker D (2011) PRISM 4.0: verification of probabilistic real-time systems. In: Gopalakrishnan G, Qadeer S (eds) Computer aided verification—23rd international conference, CAV 2011, Snowbird, UT, USA, July 14–20, 2011. Proceedings, volume 6806 of lecture notes in computer science. Springer, pp. 585–591
Kwiatkowska MZ, Parker D (2013) Automated verification and strategy synthesis for probabilistic systems. In: Van Hung D, Ogawa M (eds) Automated technology for verification and analysis—11th international symposium, ATVA 2013, Hanoi, Vietnam, October 15–18, 2013. Proceedings, volume 8172 of lecture notes in computer science. Springer, pp 5–22
Komuravelli A, Pasareanu CS, Clarke EM (2012) Learning probabilistic systems from tree samples. In: Proceedings of the 27th annual IEEE symposium on logic in computer science, LICS 2012, Dubrovnik, Croatia, June 25–28, 2012. IEEE Computer Society, pp 441–450
Khalili A, Tacchella A (2014) Learning nondeterministic Mealy machines. In: Clark A, Kanazawa M, Yoshinaka R (eds) Proceedings of the 12th international conference on grammatical inference, ICGI 2014, Kyoto, Japan, September 17–19, 2014, volume 34 of JMLR workshop and conference proceedings. JMLR.org, pp 109–123
Mao H, Chen Y, Jaeger M, Nielsen TD, Larsen KG, Nielsen B (2011) Learning probabilistic automata for model checking. In: Eighth international conference on quantitative evaluation of systems, QEST 2011, Aachen, Germany, 5–8 September, 2011. IEEE Computer Society, pp 111–120
Mao H, Chen Y, Jaeger M, Nielsen TD, Larsen KG, Nielsen B (2012) Learning Markov decision processes for model checking. In: Fahrenberg U, Legay A, Thrane CR (eds) Proceedings quantities in formal methods, QFM 2012, Paris, France, 28 August 2012., volume 103 of EPTCS, pp 49–63
Mao, H., Chen, Y., Jaeger, M., Nielsen, T.D., Larsen, K.G.: Learning deterministic probabilistic automata from a model checking perspective. Mach Learn 105(2), 255–299 (2016)
Margaria T, Niese O, Raffelt H, Steffen B (2004) Efficient test-based model generation for legacy reactive systems. In: Ninth IEEE international high-level design validation and test workshop 2004, Sonoma Valley, CA, USA, November 10–12, 2004. IEEE Computer Society, pp 95–100
Nerode, A.: Linear automaton transformations. Proc Am Math Soc 9(4), 541–544 (1958)
Nouri A, Raman B, Bozga M, Legay A, Bensalem S (2014) Faster statistical model checking by means of abstraction and learning. In: Bonakdarpour B, Smolka SA (eds) runtime verification—5th international conference, RV 2014, Toronto, ON, Canada, September 22–25, 2014. Proceedings, volume 8734 of lecture notes in computer science. Springer, pp 340–355
Norman, G., Shmatikov, V.: Analysis of probabilistic contract signing. J Comput Secur 14(6), 561–589 (2006)
Pferscher A, Aichernig BK (2020) Learning abstracted non-deterministic finite state machines. In: Casola V, De Benedictis A, Rak M (eds) Testing Software and Systems—32nd IFIP WG 6.1 international conference, ICTSS 2020, Naples, Italy, December 9–11, 2020. Proceedings, volume 12543 of lecture notes in computer science. Springer, pp 52–69
Puterman ML (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley series in probability and statistics. Wiley
Rivest, R.L., Schapire, R.E.: Inference of finite automata using homing sequences. Inf Comput 103(2), 299–347 (1993)
Sokolova, A., de Vink, E.P.: Probabilistic automata: system types, parallel composition and comparison. In: Baier, C., Haverkort, B.R., Hermanns, H., Katoen, J.-P., Siegle, M. (eds.) Validation of stochastic systems–a guide to current research. lecture notes in computer science, vol. 2925, pp. 1–43. Springer (2004)
Shahbaz M, Groz R (2009) Inferring mealy machines. In: Cavalcanti A, Dams D (eds) FM 2009: formal methods, second world congress, Eindhoven, The Netherlands, November 2–6, 2009. Proceedings, volume 5850 of lecture notes in computer science. Springer, pp 207–222
Steffen B, Howar F, Merten M (2011) Introduction to active automata learning from a practical perspective. In: Bernardo M, Issarny V (eds) Formal methods for eternal networked software systems—11th international school on formal methods for the design of computer, communication and software systems, SFM 2011, Bertinoro, Italy, June 13–18, 2011. Advanced lectures, volume 6659 of lecture notes in computer science. Springer, pp 256–296
Segala, R., Lynch, N.: Probabilistic simulations for probabilistic processes. Nord J Comput 2(2), 250–273 (1995)
Stoelinga, M.: An introduction to probabilistic automata. Bull. EATCS 78, 176–198 (2002)
Tappler M, Aichernig BK, Bloem R (2017) Model-based testing IoT communication via active automata learning. In: 2017 IEEE international conference on software testing, verification and validation, ICST 2017, Tokyo, Japan, March 13–17, 2017. IEEE Computer Society, pp 276–287
Tappler M, Aichernig BK, Bacci G, Eichlseder M, Larsen KG (2019) \({L}^*\)-based learning of Markov decision processes. In: ter Beek MH, McIver A, Oliveira JN (eds) Formal methods—the next 30 years—third world congress, FM 2019, Porto, Portugal, October 7–11, 2019. Proceedings, volume 11800 of lecture notes in computer science. Springer, pp 651–669
Tappler M (2019) Learning-based testing in networked environments in the presence of timed and stochastic behaviour. PhD thesis, Graz University of Technology
Tappler M (2020) Evaluation material for \({L}^*\)-based learning of Markov decision processes. https://doi.org/10.6084/m9.figshare.7960928.v2. Accessed on 06 Mar 2020, updated for extended version
Tretmans, J.: Test generation with inputs, outputs and repetitive quiescence. Softw Concepts Tools 17(3), 103–120 (1996)
Tretmans, J.: Model based testing with labelled transition systems. In: Hierons, R.M., Bowen, J.P., Harman, M. (eds.) Formal methods and testing, an outcome of the FORTEST network, revised selected papers. lecture notes in computer science, vol. 4949, pp. 1–38. Springer (2008)
Wen-Guey, T.: Learning probabilistic automata and Markov chains via queries. Mach Learn 8, 151–166 (1992)
Vaandrager Frits, W.: Model learning. Commun ACM 60(2), 86–95 (2017)
Valiant, L.G.: A theory of the learnable. Commun ACM 27(11), 1134–1142 (1984)
Volpato M, Tretmans J (2015) Approximate active learning of nondeterministic input output transition systems. ECEASST, 72
Willemse, T.A.C.: Heuristics for ioco-based test-based modelling. In: Brim, L., Haverkort, B.R., Leucker, M., van de Pol, J. (eds.) Formal methods: applications and technology, 11th international workshop, FMICS 2006 and 5th international workshop pdmc 2006, Bonn, Germany, August 26–27, and August 31, 2006, revised selected papers. lecture notes in computer science, vol. 4346, pp. 132–147. Springer (2006)
Wang J, Sun J, Qin S (2016) Verifying complex systems probabilistically through learning, abstraction and refinement. CoRR, abs/1610.06371
Acknowledgements
The work of B.Aichernig, M. Eichlseder and M. Tappler has been carried out as part of the TU Graz LEAD project "Dependable Internet of Things inAdverse Environments". The work of K. Larsen andG. Bacci has been supported by the Advanced ERC Grant no. 867096 (LASSO).
Funding
Open access funding provided by Graz University of Technology.
Author information
Authors and Affiliations
Corresponding author
Additional information
Annabelle McIver, Maurice ter Beek and Cliff Jones
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Tappler, M., Aichernig, B.K., Bacci, G. et al. \(L^*\)-based learning of Markov decision processes (extended version). Form Asp Comp 33, 575–615 (2021). https://doi.org/10.1007/s00165-021-00536-5
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00165-021-00536-5