×

Programming simultaneous actions using common knowledge. (English) Zbl 0646.68031

This work applies the theory of knowledge in distributed systems to the design of efficient fault-tolerant protocols. We define a large class of problems requiring coordinated, simultaneous action in synchronous systems, and give a method of transforming specifications of such problems into protocols that are optimal in all runs: these protocols are guaranteed to perform the simultaneous actions as soon as any other protocol could possibly perform them, given the input to the system and faulty processor behavior.
This transformation is performed in two steps. In the first step we extract, directly from the problem specification, a high-level protocol programmed using explicit tests for common knowledge. In the second step we carefully analyze when facts become common knowledge, thereby providing a method of efficiently implementing these protocols in many variants of the omissions failure model.
In the generalized omissions model, however, our analysis shows that testing for common knowledge is NP-hard. Given the close correspondence between common knowledge and simultaneous actions, we are able to show that no optimal protocol for any such problem can be computationally efficient in this model.
The analysis in this paper exposes many subtle differences between the failure models, including the precise point at which this gap in complexity occurs.

MSC:

68N25 Theory of operating systems
Full Text: DOI

References:

[1] J. Burns and N. A. Lynch, The Byzantine firing squad problem, MIT Technical Report MIT/LCS/TM-275, 1985.
[2] Chandy, K. M.; Misra, J., How processes learn, Distrib. Comput., 1, 1, 40-52 (1986) · Zbl 0602.68026 · doi:10.1007/BF01843569
[3] B. Coan, A communication-efficient canonical form for fault-tolerant distributed protocols,Proceedings of the Fifth PODC, 1985, pp. 63-72.
[4] B. Coan, D. Dolev, C. Dwork, and L. Stockmeyer, The distributed firing squad problem,Proceedings of the Seventeenth STOC, 1985, pp. 335-345. · Zbl 0679.68051
[5] D. Dolev, R. Reischuk, and H. R. Strong, Eventual is earlier than immediate,Proceedings of the 23th FOCS, 1982, pp. 196-203.
[6] Dwork, C.; Moses, Y.; Halpern, J. Y., Knowledge and common knowledge in a Byzantine environment: The case of crash failures, 149-170 (1986), Los Altos, CA: Morgan Kaufmann, Los Altos, CA
[7] M. J. Fisher, The consensus problem in unreliable distributed systems (a brief survey), Yale University Technical Report YALEU/DCS/RR-273, 1983.
[8] Fischer, M. J.; Immerman, N.; Halpern, J. Y., Foundations of knowledge for distributed systems, 171-185 (1986), Los Altos, CA: Morgan Kaufmann, Los Altos, CA
[9] Fischer, M. J.; Lynch, N. A., A lower bound for the time to assure interactive consistency, Infor. Process. Lett., 14, 4, 183-186 (1982) · Zbl 0493.68026 · doi:10.1016/0020-0190(82)90033-3
[10] Garey, M. R.; Johnson, D. S., Computers and Intractability: A Guide to the Theory of NP-Completeness (1979), San Francisco: W. H. Freeman and Company, San Francisco · Zbl 0411.68039
[11] V. Hadzilacos, A lower bound for Byzantine agreement with fail-stop processors, Harvard University Technical Report TR-21-83.
[12] J. Y. Halpern and R. Fagin, A formal model of knowledge, action, and communication in distributed systems,Proceedings of the Fourth PODC, 1985, pp. 224-236.
[13] J. Y. Halpern and Y. Moses, Knowledge and common knowledge in a distributed environment. Version of August 1987 is available as IBM Research Report RJ 4421. Early versions appeared inProceedings of the Third PODC, 1984, pp. 50-61; and as IBM Research Report RJ 4421, 1984 and 1986.
[14] J. Y. Halpern and Y. Moses, A guide to the modal logic of knowledge and belief,Proceedings of the Ninth IJCAI, 1985, pp. 480-490.
[15] Hintikka, J., Knowledge and Belief (1962), Ithaca, NY: Cornell University Press, Ithaca, NY
[16] Hopcroft, J. E.; Ullman, J. D., Introduction to Automata Theory, Languages, and Computation (1979), Reading, Massachusetts: Addison-Wesley, Reading, Massachusetts · Zbl 0426.68001
[17] L. Lamport and M. J. Fischer, Byzantine generals and transaction commit protocols, SRI Technical Report Op. 62, 1982.
[18] R. Michel, Attaining common knowledge in synchronous distributed networks, unpublished manuscript, 1986.
[19] C. Mohan, H. R. Strong, and S. Finkelstein, Methods for distributed transaction commit and recovery using Byzantine agreement within clusters of processors,Proceedings of the Second PODC, 1983, pp. 89-103.
[20] Y. Moses, Knowledge in a distributed environment, Ph.D. Thesis, Stanford University Technical Report STAN-CS-1120, 1986.
[21] R. Parikh and R. Ramanujam, Distributed processes and the logic of knowledge (preliminary report),Proceedings of the Workshop on Logics of Programs, 1985, pp. 256-268. · Zbl 0565.68025
[22] Pease, M.; Shostak, R.; Lamport, L., Reaching agreement in the presence of faults, J. Assoc. Comput. Mach., 27, 2, 228-234 (1980) · Zbl 0434.68031
[23] Perry, K.; Toueg, S., Distributed agreement in the presence of processor and communication faults, IEEE Trans. Software Engrg, 12, 3, 477-482 (1986) · Zbl 0587.68026
[24] M. O. Rabin, Efficient solutions to the distributed firing squad problem, private communication.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.