As PROPHET concludes at the end of 2016, we are very happy to announce that two students whose research was fully (Kirill Bogdanov) or partially (Georgios Katsikas) funded by PROPHET have successfully defended their licentiate theses (the licentiate is a KTH degree halfway to a PhD). We are very grateful to Prof. Gerald Q. Maguire Jr. for a fantastic job co-advising Kirill and Georgios. Their theses are available online:
In our recent journal article, we analyzed NFV service chains to identify their traffic classes and associate each traffic class with a newly synthesized network function. By doing so, we eliminated I/O and processing redundancy and achieved 40 Gbps throughput with low latency on a single machine with 8 CPU cores. The full abstract is as follows:
In this paper we introduce SNF, a framework that synthesizes (S) network function (NF) service chains by eliminating redundant I/O and repeated elements, while consolidating stateful cross layer packet operations across the chain. SNF uses graph composition and set theory to determine traffic classes handled by a service chain composed of multiple elements. It then synthesizes each traffic class using a minimal set of new elements that apply single-read-single-write and early-discard operations. Our SNF prototype takes a baseline state of the art network functions virtualization (NFV) framework to the level of performance required for practical NFV service deployments. Software-based SNF realizes long (up to 10 NFs) and stateful service chains that achieve line-rate 40 Gbps throughput (up to 8.5x greater than the baseline NFV framework). Hardware-assisted SNF, using a commodity OpenFlow switch, shows that our approach scales at 40 Gbps for Internet Service Provider-level NFV deployments.
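The core idea of composing a chain's classifiers into per-traffic-class constraints can be illustrated with a minimal sketch. This is a hypothetical toy, not SNF's actual code (SNF uses graph composition over real packet-processing elements): here a traffic class is just a dictionary of header-field constraints, and composing two chained NFs means merging their constraint sets. Contradictory constraints yield an empty class that can be discarded early; a satisfiable merged class needs only one classification pass instead of one per NF.

```python
# Hypothetical sketch of traffic-class composition along a service chain.
# A "class" is a dict {header_field: required_value}; merging the classes
# of consecutive NFs either produces one combined class (single read,
# single classification) or proves no packet can match (early discard).

def compose(class_a, class_b):
    """Merge two constraint dicts; return None if they contradict
    (the composed class matches no packets, so it can be discarded)."""
    merged = dict(class_a)
    for field, value in class_b.items():
        if field in merged and merged[field] != value:
            return None  # contradictory constraints: no packet matches both
        merged[field] = value
    return merged

# Toy chain: a firewall admitting TCP, followed by a NAT acting on dst port 80.
fw_class = {"proto": "tcp"}
nat_class = {"proto": "tcp", "dst_port": 80}

print(compose(fw_class, nat_class))          # {'proto': 'tcp', 'dst_port': 80}
print(compose({"proto": "udp"}, nat_class))  # None: early discard
```

In the real framework the equivalent step operates over ranges and wildcards with set operations rather than exact-value constraints, but the redundancy-elimination intuition is the same.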
We are very happy to announce that Maciej Kuzniar and Peter Peresini have successfully defended their PhDs at EPFL during the summer of 2016. They are the first PhD graduates of the PROPHET project. We are very grateful to Willy Zwaenepoel for being their advisor at EPFL, and to EPFL for providing a superb research environment. Their theses are available at the links below:
Maciej: Measuring and Managing Switch Diversity in Software Defined Networks
Peter: Simplifying Development and Management of Software-Defined Networks
At CoNEXT ’15 in Heidelberg, Peter presented our paper on Monocle, our system for dynamic monitoring of OpenFlow switch dataplanes. Monocle is capable of fine-grained monitoring for the majority of rules, and it can identify a rule suddenly missing from the data plane or misbehaving in a matter of seconds. Also, during network updates Monocle helps controllers cope with switches that exhibit transient inconsistencies. The paper itself is available here.
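The monitoring idea behind a system like Monocle can be sketched as follows. This is a simplified hypothetical illustration, not Monocle's code (Monocle solves a constraint problem to craft probes that disambiguate overlapping rules): pick a probe packet matching the rule under test, inject it, and compare the observed forwarding action against the rule's expected action. A rule silently missing from the data plane shows up as a mismatch.

```python
# Hypothetical sketch of rule verification via probe packets.
# A rule is (priority, match_dict, action); a packet is a header dict.

def lookup(rules, pkt):
    """Priority-ordered flow-table lookup: highest-priority match wins."""
    for prio, match, action in sorted(rules, key=lambda r: -r[0]):
        if all(pkt.get(f) == v for f, v in match.items()):
            return action
    return "drop"  # table-miss behavior assumed to be drop

def rule_present(dataplane_rules, rule, probe):
    """Inject a probe matching `rule`; does the observed action agree?"""
    prio, match, action = rule
    assert all(probe.get(f) == v for f, v in match.items()), "probe must match rule"
    return lookup(dataplane_rules, probe) == action

expected = (10, {"dst": "10.0.0.1"}, "port1")
probe = {"dst": "10.0.0.1"}

healthy = [expected, (5, {}, "port2")]
print(rule_present(healthy, expected, probe))  # True: rule is installed

broken = [(5, {}, "port2")]                    # rule silently missing
print(rule_present(broken, expected, probe))   # False: mismatch detected
```

The hard part in practice, which this sketch omits, is choosing a probe that matches the target rule but no higher-priority rule, and doing so for many rules without disturbing production traffic.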
At the Symposium on Cloud Computing conference (SoCC) 2015, Kirill Bogdanov presented our work on performance debugging of replica selection algorithms in geo-distributed storage systems. We found bugs in widely-used systems, such as Cassandra and MongoDB.
Kirill entered the Student Research Competition at SIGCOMM 2015 and described the work in this three-minute video.
The full abstract is as follows:
Modern distributed systems are geo-distributed for reasons of increased performance, reliability, and survivability. At the heart of many such systems, e.g., the widely used Cassandra and MongoDB data stores, is an algorithm for choosing a closest set of replicas to service a client request. Dynamically changing network conditions pose a significant problem, with suboptimal replica choices resulting in reduced performance due to increasing response latency. In this paper we present GeoPerf, a tool that tries to automate the process of systematically testing the performance of replica selection algorithms for geo-distributed storage systems. At the core of our approach is a novel technique that combines symbolic execution and lightweight modeling to generate a set of inputs that can expose weaknesses in replica selection. As part of our evaluation, we analyzed network round trip times between geographically distributed regions of the Amazon EC2 cloud, and compute how often the order of nearest replicas changed per day from any given region’s perspective. We tested Cassandra and MongoDB using our tool, and found bugs in each of these systems. Finally, we use our collected Amazon EC2 latency traces to study the behavior of these buggy replica selection algorithms under realistic circumstances. We find that significant time was lost due to programming errors. For example due to the bug in Cassandra, the median wasted time for 10% of all requests is above 50 ms.
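The failure mode GeoPerf targets can be illustrated with a minimal sketch. This is a hypothetical toy, not Cassandra's or MongoDB's actual selection logic: a client ranks replicas by measured round-trip time, and a transient latency change flips the "closest" ordering, which is exactly the dynamic behavior a buggy selector can mishandle.

```python
# Hypothetical sketch of latency-aware replica selection.
# rtts maps replica name -> most recent measured RTT in milliseconds.

def closest_replicas(rtts, k):
    """Return the k replicas with the lowest measured RTT, nearest first."""
    return sorted(rtts, key=rtts.get)[:k]

# Two consecutive RTT snapshots from one client region (ms).
t0 = {"us-east": 12.0, "eu-west": 85.0, "ap-se": 210.0}
t1 = {"us-east": 95.0, "eu-west": 82.0, "ap-se": 210.0}  # transient spike

print(closest_replicas(t0, 2))  # ['us-east', 'eu-west']
print(closest_replicas(t1, 2))  # ['eu-west', 'us-east'] -- ordering flipped
```

GeoPerf's contribution is to test the real, far more complex selectors systematically (via symbolic execution plus lightweight modeling) rather than relying on such hand-picked inputs.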