Abstract
Many critical services are now provided by complex distributed systems built by reusing and integrating a large number of components. Because these components are meant to be used in many contexts, they are generally not designed to achieve high dependability on their own, and their behavior in the presence of faults can vary widely. Nevertheless, it is paramount for such systems to survive failures of individual components, as well as attacks and intrusions, even if with degraded functionality. To provide control over unanticipated events, we focus on fault-handling strategies, and in particular on system reconfiguration. The paper describes a framework that provides fault tolerance for component-based applications by detecting failures through monitoring and recovering through system reconfiguration. The framework is based on Lira, an agent-based distributed infrastructure for remote control and reconfiguration, and on a decision maker that selects suitable new configurations. Lira allows monitoring and reconfiguration at the component and application levels, while decisions are driven by feedback from the evaluation of statistical Petri net models.
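The detect-then-reconfigure cycle outlined in the abstract can be illustrated with a small sketch. It is not Lira's API or the paper's decision maker: the names `Configuration`, `detect_failures`, and `choose_configuration`, the component names, and the numeric scores are all hypothetical, and the scoring function is only a placeholder for the model-based evaluation the abstract mentions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Configuration:
    """Hypothetical candidate configuration: the components it relies on."""
    name: str
    components: FrozenSet[str]


def detect_failures(heartbeats: Dict[str, bool]) -> FrozenSet[str]:
    """Monitoring step: return the components whose last probe failed."""
    return frozenset(c for c, alive in heartbeats.items() if not alive)


def choose_configuration(
    candidates: Iterable[Configuration],
    failed: FrozenSet[str],
    score: Callable[[Configuration], float],
) -> Optional[Configuration]:
    """Decision step: best-scoring candidate that uses no failed component."""
    usable = [c for c in candidates if not (c.components & failed)]
    return max(usable, key=score, default=None)


full = Configuration("full", frozenset({"frontend", "db", "cache"}))
degraded = Configuration("degraded", frozenset({"frontend", "db"}))

# Placeholder scores (assumption); a real decision maker would derive
# them from evaluating a dependability model of each configuration.
scores = {"full": 0.99, "degraded": 0.90}

failed = detect_failures({"frontend": True, "db": True, "cache": False})
selected = choose_configuration([full, degraded], failed, lambda c: scores[c.name])
print(selected.name if selected else "no usable configuration")
```

Here the failed `cache` component rules out the `full` configuration, so the sketch falls back to `degraded`, mirroring the idea of surviving a component failure with reduced functionality.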
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Porcarelli, S., Castaldi, M., Di Giandomenico, F., Bondavalli, A., Inverardi, P. (2004). A Framework for Reconfiguration-Based Fault-Tolerance in Distributed Systems. In: de Lemos, R., Gacek, C., Romanovsky, A. (eds) Architecting Dependable Systems II. Lecture Notes in Computer Science, vol 3069. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-25939-8_8
DOI: https://doi.org/10.1007/978-3-540-25939-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23168-4
Online ISBN: 978-3-540-25939-8
eBook Packages: Springer Book Archive