Reliable Distributed Algorithms I and II

This two-part course gives you a comprehensive introduction to the theory and practice of distributed algorithms for designing scalable, reliable services. Part II continues where Part I leaves off.

Both Part I and Part II of this course are now archived. The teachers no longer participate actively in the classroom, but you can still access some of the material.

The course requires intermediate programming skills, basic knowledge of computer science, and mathematical skills. The programming assignments use the Scala programming language.

This course is the first in a series of two. Both courses provide a solid foundation in reliable distributed computing, including the main concepts, results, models and algorithms in the field.

Today's global IT infrastructures are distributed systems, from the Internet to the cloud data centers that fuel the current revolution in global IT services. At the core of these services are distributed algorithms. These algorithms run on multiple computers and communicate only by sending and receiving messages. The services built on them must keep working 24/7 even if some of the computers fail or some of the messages are lost in transit. This is the subject of reliable distributed algorithms in computer science.

ID2203.1x covers models of distributed algorithms based on input/output automata; specifications of fault-tolerant abstractions and failure detectors; specific distributed abstractions and fault-tolerant algorithms, including reliable broadcast and causal broadcast; key-value stores and consistency models; and single-value consensus and the Paxos algorithm.

To complete the course with a full grade (100%), students must answer the weekly graded quizzes and complete the programming assignments.

Learning outcomes

After completing the course you will be familiar with:

  • Event-driven concurrent programming of distributed algorithms
  • Formal models of asynchronous systems using input/output automata
  • Failure detectors and equivalence between various distributed abstractions
  • Specifications and algorithms for reliable and causal-order broadcast
  • Distributed shared memory and consistency models
  • Single-value consensus and related consensus algorithms, including Paxos
