Abstract
In this paper, we use software rejuvenation as a preventive and proactive fault-tolerance technique to maximize the reliability of continuous, safety-critical systems. We take into consideration both transient faults caused by software aging and network transmission faults, and mathematically analyze the optimal software rejuvenation period that maximizes system reliability. The theoretical result is verified through empirical studies.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Guo, C., Wu, H., Hua, X., Ren, S., Nogiec, J.M. (2015). Maximize System Reliability for Long Lasting and Continuous Applications. In: Rocha, A., Correia, A., Costanzo, S., Reis, L. (eds) New Contributions in Information Systems and Technologies. Advances in Intelligent Systems and Computing, vol 353. Springer, Cham. https://doi.org/10.1007/978-3-319-16486-1_59
DOI: https://doi.org/10.1007/978-3-319-16486-1_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16485-4
Online ISBN: 978-3-319-16486-1
eBook Packages: Computer Science (R0)