Language-based Approaches to Safe and Efficient Distributed Programming
Time: Tue 2021-02-23 08.30
Subject area: Computer Science
Doctoral student: Xin Zhao , Teoretisk datalogi, TCS
Opponent: Professor Wolfgang De Meuter, Vrije Universiteit Brussel
Supervisor: Philipp Haller, Teoretisk datalogi, TCS; Roberto Guanciale, Teoretisk datalogi, TCS
Distributed systems address the increasing demand for fast access to resources and fault tolerance for data. Strong consistency ensures that all clients observe consistent data updates atomically on all servers in a distributed system, and it is widely used in systems such as relational databases. However, such systems need to sacrifice availability when synchronization occurs. Due to scalability requirements, software developers need to trade consistency for performance, which introduces new challenges. First, there is a lack of comprehensive understanding of how to choose a suitable consistency model. Second, consistency guarantees for specific data may be weakened if the application correctness is unaffected, but it is hard to reason the safety of mixed usage of weakly and strongly consistent data. Third, for an operation sequence such as a transaction, the co-existence of mixed consistency data complicates the reasoning about the implementation correctness.
This thesis explores how to aid distributed program developers of mixed consistency systems through language-based approaches. First, we propose a new consistency protocol, the observable atomic consistency protocol (OACP), to make write-dominant applications as fast as possible and as consistent as needed. OACP combines the advantages of (1) mergeable data types, specifically, convergent replicated data types, to reduce synchronization and (2) reliable total order broadcast, to provide on-demand strong consistency. We also provide a high-level programming interface to improve the efficiency and correctness of distributed programming.
Second, we propose lattice-based consistency types for replicated data (CTRD), a higher-order static consistency-typed language with replicated data types. The type system of CTRD supports shared data among multiple clients and statically enforces noninterference between data types with weaker consistency and data types with stronger consistency. The language can be applied to many distributed applications and guarantees that weakly-consistent data updates can never affect strongly-consistent data.
Finally, we propose a type-based approach for statically verifying transaction correctness. A standard property for correctness is serializability, which means that the actual execution of transactions is equivalent to a corresponding serial execution with equivalent behavior in which the instructions of each transaction are not interleaved with instructions from other transactions. We formalize an information-flow type system, SerialT, to keep track of data consistency levels and use the noninterference between consistent and available data to postpone available operations within a transaction. Moreover, we use location information generated from type checking to enforce serializability properties for transactions, and we also introduce two additional commutative operations to reduce unnecessary locks.