Abstract
Davies and Wakerly show that Byzantine fault tolerance can be achieved by a cascade of broadcasts and middle value select functions. We present an extension of the Davies and Wakerly protocol, the unified protocol, and its proof of correctness. We prove that it satisfies validity and agreement properties for communication of exact values. We then introduce bounded communication error into the model. Inexact communication is inherent for clock synchronization protocols. We prove that validity and agreement properties hold for inexact communication, and that exact communication is a special case. As a running example, we illustrate the unified protocol using the SPIDER family of fault-tolerant architectures. In particular we demonstrate that the SPIDER interactive consistency, distributed diagnosis, and clock synchronization protocols are instances of the unified protocol.
This work was supported by the National Aeronautics and Space Administration under NASA Contract No. NAS1-97046 while the second author was in residence at the National Institute of Aerospace, Hampton, VA 23666, USA.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Azadmanesh, M.H., Kieckhaferm, R.M.: Exploiting omissive faults in synchronous approximate agreement. IEEE Transactions on Computers 49(10), 1031–1042 (2000)
Caspi, P., Salem, R.: Threshold and bounded-delay voting in critical control systems. In: Joseph, M. (ed.) FTRTFT 2000. LNCS, vol. 1926, pp. 70–81. Springer, Heidelberg (2000)
Davies, D., Wakerly, J.F.: Synchronization and matching in redundant systems. IEEE Transactions on Computers 27(6), 531–539 (1978)
Geser, A., Miner, P.S.: A new on-line diagnosis protocol for the SPIDER family of Byzantine fault tolerant architectures. Technical Memorandum NASA/TM-2003-212432, NASA Langley Research Center, Hampton, VA (December 2003) (in print)
Hoyme, K., Driscoll, K.: SAFEbusTM. In: 11th AIAA/IEEE Digital Avionics Systems Conference, Seattle, WA, October 1992, pp. 68–73 (1992)
Kopetz, H.: Real-Time Systems. Kluwer Academic Publishers, Dordrecht (1997)
Kieckhafer, R.M., Walter, C.J., Finn, A.M., Thambidurai, P.M.: The MAFT architecture for distributed fault tolerance. IEEE Transactions on Computers 37(4), 398–405 (1988)
Latronico, E., Miner, P., Koopman, P.: Quantifying the reliability of proven SPIDER group membership service guarantees. In: Proceedings of the International Conference on Dependable Systems and Networks (June 2004)
Lynch, N.A.: Distributed Algorithms. Morgan Kaufmann, San Francisco (1996)
Miner, P.S.: Verification of fault-tolerant clock synchronization systems. NASA Technical Paper 3349, NASA Langley Research Center, Hampton, VA (November 1993)
Miner, P.S., Malekpour, M., Torres-Pomales, W.: Conceptual design of a Reliable Optical BUS (ROBUS). In: 21st AIAA/IEEE Digital Avionics Systems Conference DASC, Irvine, CA (October 2002)
Owre, S., Rushby, J., Shankar, N., von Henke, F.: Formal verification for fault-tolerant architectures: Prolegomena to the design of PVS. IEEE Transactions on Software Engineering 21(2), 107–125 (1995)
Pike, L., Maddalon, J., Miner, P., Geser, A.: Abstractions for fault-tolerant distributed system verification. In: Proceedings of Theorem-Proving in Higher-Order Logics (TPHOLs). Theorem Proving in Higher-Order Logics, TPHOLs (2004), Accepted. Available at http://shemesh.larc.nasa.gov/fm/spider/spider_pubs.html
Rosenlicht, M.: Introduction to Analysis. Dover Publications, Inc., New York (1968)
Rushby, J.: A comparison of bus architectures for safety-critical embedded systems. Technical Report NASA/CR-2003-212161, NASA Langley Research Center, Hampton, VA (March 2003)
Shankar, N.: Mechanical verification of a generalized protocol for Byzantine fault-tolerant clock synchronization. In: Vytopil, J. (ed.) FTRTFT 1992. LNCS, vol. 571, pp. 217–236. Springer, Heidelberg (1991)
SPIDER homepage, NASA Langley Research Center, Formal Methods Team, Available at http://shemesh.larc.nasa.gov/fm/spider/
Schwier, D., von Henke, F.: Mechanical verification of clock synchronization algorithms. In: Ravn, A.P., Rischel, H. (eds.) FTRTFT 1998. LNCS, vol. 1486, pp. 262–271. Springer, Heidelberg (1998)
Thambidurai, P., Park, Y.-K.: Interactive consistency with multiple failure modes. In: 7th Reliable Distributed Systems Symposium, October 1988, pp. 93–100 (1988)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Miner, P., Geser, A., Pike, L., Maddalon, J. (2004). A Unified Fault-Tolerance Protocol. In: Lakhnech, Y., Yovine, S. (eds) Formal Techniques, Modelling and Analysis of Timed and Fault-Tolerant Systems. FTRTFT FORMATS 2004 2004. Lecture Notes in Computer Science, vol 3253. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30206-3_13
Download citation
DOI: https://doi.org/10.1007/978-3-540-30206-3_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23167-7
Online ISBN: 978-3-540-30206-3
eBook Packages: Springer Book Archive