skip to main content
research-article

Raft Refloated: Do We Have Consensus?

Published: 20 January 2015 Publication History

Abstract

The Paxos algorithm is famously difficult to reason about and even more so to implement, despite having been synonymous with distributed consensus for over a decade. The recently proposed Raft protocol lays claim to being a new, understandable consensus algorithm, improving on Paxos without making compromises in performance or correctness.
In this study, we repeat the Raft authors' performance analysis. We developed a clean-slate implementation of the Raft protocol and built an event-driven simulation framework for prototyping it on experimental topologies. We propose several optimizations to the Raft protocol and demonstrate their effectiveness under contention. Finally, we empirically validate the correctness of the Raft protocol invariants and evaluate Raft's understandability claims.

References

[1]
CoreOS website. http://coreos.com. Accessed on 02/09/2014.
[2]
W. J. Bolosky, D. Bradshaw, R. B. Haagens, N. P. Kusters, and P. Li. Paxos replicated state machines as the basis of a highperformance data store. In Proceedings of the 8th USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2011.
[3]
M. Burrows. The Chubby lock service for loosely-coupled distributed systems. In Proceedings of the 7th Symposium on Operating Systems Design and Implementation (OSDI), pages 335--350, 2006.
[4]
T. D. Chandra, R. Griesemer, and J. Redstone. Paxos made live: An engineering perspective. In Proceedings of the 26th ACM Symposium on Principles of Distributed Computing (PODC), pages 398--407, 2007.
[5]
F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach, M. Burrows, T. Chandra, A. Fikes, and R. E. Gruber. BigTable: A distributed storage system for structured data. ACM Transactions on Computer Systems (TOCS), 26(2):4, 2008.
[6]
B. Clark, T. Deshane, E. Dow, S. Evanchik, M. Finlayson, J. Herne, and J. N. Matthews. Xen and the art of repeated research. In Proceedings of the USENIX Annual Technical Conference, pages 135--144, 2004.
[7]
C. Collberg, T. Proebsting, G. Moraila, A. Shankaran, Z. Shi, and A. M. Warren. Measuring reproducibility in computer systems. Technical report, University of Arizona, 2014.
[8]
G. Delzanno, M. Tatarek, and R. Traverso. Model Checking Paxos in Spin. ArXiv e-prints, Aug. 2014.
[9]
S. Ghemawat, H. Gobioff, and S.-T. Leung. The Google file system. ACM SIGOPS Operating Systems Review, 37(5):29--43, 2003.
[10]
M. P. Herlihy and J. M. Wing. Linearizability: A correctness condition for concurrent objects. ACM Transactions on Programming Languages and Systems, 12(3):463--492, 1990.
[11]
H. Howard. ARC: Analysis of Raft Consensus. Technical Report UCAM-CL-TR-857, University of Cambridge, Computer Laboratory, July 2014.
[12]
P. Hunt, M. Konar, F. P. Junqueira, and B. Reed. ZooKeeper: wait-free coordination for internet-scale systems. In Proceedings of the 2010 USENIX Annual Technical Conference (USENIX ATC), volume 8, pages 145--158, 2010.
[13]
J. Kovacevic. How to encourage and publish reproducible research. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume 4, pages 1273--1276, April 2007.
[14]
L. Lamport. The part-time parliament. ACM Transactions on Computer Systems (TOCS), 16(2):133--169, 1998.
[15]
L. Lamport. Paxos made simple. ACM SIGACT News 32.4, pages 18--25vi, 2001.
[16]
L. Lamport. Fast Paxos. Distributed Computing, 19(2):79--103, 2006.
[17]
L. Lamport and M. Massa. Cheap Paxos. In Proceedings of the International Conference on Dependable Systems and Networks, pages 307--314, 2004.
[18]
B. Liskov and J. Cowling. Viewstamped replication revisited. Technical Report MIT-CSAIL-TR-2012-021, MIT Computer Science and Artificial Intelligence Laboratory, 2012.
[19]
A. Madhavapeddy. Creating high-performance statically type-safe network applications. PhD thesis, University of Cambridge, 2006.
[20]
A. Madhavapeddy. Combining static model checking with dynamic enforcement using the statecall policy language. In Proceedings of the 11th International Conference on Formal Engineering Methods: Formal Methods and Software Engineering, pages 446--465, 2009.
[21]
D. Mazieres. Paxos made practical. http://www.scs.stanford.edu/~dm/home/papers/paxos.pdf. Accessed on 02/09/2014.
[22]
B. M. Oki and B. H. Liskov. Viewstamped replication: A new primary copy method to support highly-available distributed systems. In Proceedings of the 7th annual ACM Symposium on Principles of Distributed Computing (PODC), pages 8--17, 1988.
[23]
D. Ongaro. Consensus: Bridging Theory and Practice. PhD thesis, Stanford University, 2014.
[24]
D. Ongaro and J. Ousterhout. In search of an understandable consensus algorithm (extended version). http:// ramcloud.stanford.edu/raft.pdf. Accessed on 13/09/2014.
[25]
D. Ongaro and J. Ousterhout. Raft: A consensus algorithm for replicated logs (user study). http://www.youtube.com/watch?v=YbZ3zDzDnrw. Accessed on 02/09/2014.
[26]
D. Ongaro and J. Ousterhout. In search of an understandable consensus algorithm. In Proceedings of the USENIX Annual Technical Conference, 2014.
[27]
M. Pease, R. Shostak, and L. Lamport. Reaching agreement in the presence of faults. Journal of the ACM (JACM), 27(2):228--234, 1980.
[28]
R. Van Renesse. Paxos made moderately complex. http://www.cs.cornell.edu/courses/cs7412/2011sp/paxos.pdf, 2011. Accessed on 02/09/2014.
[29]
P. Vandewalle, J. Kovacevic, and M. Vetterli. Reproducible research in signal processing. Signal Processing Magazine, IEEE, 26(3):37--47, 2009.
[30]
A. Varga et al. The OMNeT++ discrete event simulation system. In Proceedings of the European Simulation Multiconference, volume 9, page 185, 2001.
[31]
J. Yang, T. Chen, M. Wu, Z. Xu, X. Liu, H. Lin, M. Yang, F. Long, L. Zhang, and L. Zhou. Modist: Transparent model checking of unmodified distributed systems. In Proceedings of the 6th USENIX Symposium on Networked Systems Design and Implementation (NSDI), pages 213--228, 2009.

Cited By

View all
  • (2024)Raft Protocol for Fault Tolerance and Self-Recovery in Federated LearningProceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems10.1145/3643915.3644093(110-121)Online publication date: 15-Apr-2024
  • (2024)A Scalable Secure Fault Tolerant Aggregation for P2P Federated Learning2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW63119.2024.00060(222-231)Online publication date: 27-May-2024
  • (2024)RaftOptima: An Optimised Raft With Enhanced Fault Tolerance, and Increased Scalability With Low LatencyIEEE Access10.1109/ACCESS.2024.343597812(105974-105989)Online publication date: 2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM SIGOPS Operating Systems Review
ACM SIGOPS Operating Systems Review  Volume 49, Issue 1
Special Issue on Repeatability and Sharing of Experimental Artifacts
January 2015
155 pages
ISSN:0163-5980
DOI:10.1145/2723872
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 January 2015
Published in SIGOPS Volume 49, Issue 1

Check for updates

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)105
  • Downloads (Last 6 weeks)9
Reflects downloads up to 19 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Raft Protocol for Fault Tolerance and Self-Recovery in Federated LearningProceedings of the 19th International Symposium on Software Engineering for Adaptive and Self-Managing Systems10.1145/3643915.3644093(110-121)Online publication date: 15-Apr-2024
  • (2024)A Scalable Secure Fault Tolerant Aggregation for P2P Federated Learning2024 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)10.1109/IPDPSW63119.2024.00060(222-231)Online publication date: 27-May-2024
  • (2024)RaftOptima: An Optimised Raft With Enhanced Fault Tolerance, and Increased Scalability With Low LatencyIEEE Access10.1109/ACCESS.2024.343597812(105974-105989)Online publication date: 2024
  • (2024)TLS Guard for TLS 1.3 zero round-trip time (0-RTT) in a distributed environmentJournal of King Saud University - Computer and Information Sciences10.1016/j.jksuci.2023.10179735:10Online publication date: 4-Mar-2024
  • (2024)A survey of fault tolerant consensus in wireless networksHigh-Confidence Computing10.1016/j.hcc.2024.1002024:2(100202)Online publication date: Jun-2024
  • (2023)Cost-Effective Strong Consistency on Scalable Geo-Diverse Data ReplicasIEEE Transactions on Cloud Computing10.1109/TCC.2022.316129711:2(1764-1776)Online publication date: 1-Apr-2023
  • (2023)RAFT Consensus Reliability in Wireless Networks: Probabilistic AnalysisIEEE Internet of Things Journal10.1109/JIOT.2023.325740210:14(12839-12853)Online publication date: 15-Jul-2023
  • (2023)Consensus algorithms for Opt-in/Opt-out and proximity marketing context2023 International Conference on Computing, Networking and Communications (ICNC)10.1109/ICNC57223.2023.10074503(207-213)Online publication date: 20-Feb-2023
  • (2023)MRTOM: Mostly Reliable Totally Ordered Multicast, a Network Primitive to Offload Distributed Systems2023 IEEE 43rd International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS57875.2023.00022(638-648)Online publication date: Jul-2023
  • (2023)Barriers to blockchain-based decentralised energy trading: a systematic reviewInternational Journal of Sustainable Energy10.1080/14786451.2023.217141742:1(41-71)Online publication date: 13-Feb-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media