
Presentation at CoNEXT ’15 on dynamic, fine-grained monitoring of OpenFlow switch data planes

At CoNEXT ’15 in Heidelberg, Peter presented our paper on Monocle, a system for dynamic monitoring of OpenFlow switch data planes. Monocle provides fine-grained monitoring for the majority of rules, and it can identify a rule that suddenly goes missing from the data plane, or starts misbehaving, within seconds. During network updates, Monocle also helps controllers cope with switches that exhibit transient inconsistencies. The paper itself is available here.

[db-video id="fm00hs8o"]
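For readers curious how probe-based monitoring works in principle, here is a minimal Python sketch (with hypothetical names and a toy data plane, not Monocle’s actual code): to check a rule, craft a probe packet that matches it and verify that the data plane forwards the probe to the port the controller installed.

```python
from dataclasses import dataclass

@dataclass
class Rule:
    match: dict    # header fields the rule matches, e.g. {"ip_dst": ...}
    out_port: int  # port the controller expects the rule to forward to

def send_probe(dataplane, packet):
    """Stand-in for injecting a crafted probe packet and observing the
    port it emerges on; None means the probe was dropped."""
    return dataplane.get(tuple(sorted(packet.items())))

def verify_rule(rule, dataplane):
    """Craft a probe matching this rule and check that the data plane
    forwards it to the expected port. (Real Monocle must also ensure
    the probe cannot hit any other installed rule.)"""
    probe = dict(rule.match)
    return send_probe(dataplane, probe) == rule.out_port

# Toy data plane: the rule for 10.0.0.2 was acknowledged by the switch
# but never made it into hardware, so its probe is dropped.
dataplane = {(("ip_dst", "10.0.0.1"),): 1}
rules = [Rule({"ip_dst": "10.0.0.1"}, 1), Rule({"ip_dst": "10.0.0.2"}, 2)]
for r in rules:
    print(r.match, "ok" if verify_rule(r, dataplane) else "missing/misbehaving")
```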

Presentation at SoCC ’15 on performance debugging of replica selection algorithms in geo-distributed storage systems

At the Symposium on Cloud Computing (SoCC) 2015, Kirill Bogdanov presented our work on performance debugging of replica selection algorithms in geo-distributed storage systems. We found bugs in widely used systems such as Cassandra and MongoDB.

Kirill entered the Student Research Competition at SIGCOMM 2015 and described the work in this three-minute video:

[db-video id="04o6v4e3"]


The full abstract is as follows:

Modern distributed systems are geo-distributed for reasons of increased performance, reliability, and survivability. At the heart of many such systems, e.g., the widely used Cassandra and MongoDB data stores, is an algorithm for choosing the closest set of replicas to service a client request. Dynamically changing network conditions pose a significant problem, with suboptimal replica choices resulting in reduced performance due to increased response latency. In this paper we present GeoPerf, a tool that tries to automate the process of systematically testing the performance of replica selection algorithms for geo-distributed storage systems. At the core of our approach is a novel technique that combines symbolic execution and lightweight modeling to generate a set of inputs that can expose weaknesses in replica selection. As part of our evaluation, we analyzed network round-trip times between geographically distributed regions of the Amazon EC2 cloud, and computed how often the order of the nearest replicas changed per day from any given region’s perspective. We tested Cassandra and MongoDB using our tool and found bugs in each of these systems. Finally, we used our collected Amazon EC2 latency traces to study the behavior of these buggy replica selection algorithms under realistic circumstances. We find that significant time was lost due to programming errors. For example, due to the bug in Cassandra, the median wasted time for 10% of all requests is above 50 ms.
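As a small illustration of the measurement described in the abstract (this is not GeoPerf itself, and the latency values below are made up), the following Python snippet counts how often the nearest-replica ordering flips as round-trip times change:

```python
def nearest_order(rtts):
    """Replica names sorted by current RTT, nearest first."""
    return tuple(sorted(rtts, key=rtts.get))

# One RTT sample per epoch from a client region to three replicas (ms).
samples = [
    {"us-east": 12, "eu-west": 80, "ap-south": 190},
    {"us-east": 15, "eu-west": 78, "ap-south": 185},
    {"us-east": 95, "eu-west": 76, "ap-south": 188},  # transient spike
    {"us-east": 14, "eu-west": 81, "ap-south": 186},
]

# Count consecutive epochs where the nearest-replica order changed.
changes = sum(
    nearest_order(a) != nearest_order(b)
    for a, b in zip(samples, samples[1:])
)
print(f"nearest-replica order changed {changes} times over {len(samples)} epochs")
```

A replica selection algorithm that reacts too slowly (or incorrectly) to such reorderings keeps sending requests to a replica that is no longer the closest, which is exactly the class of bug GeoPerf is designed to expose.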

Presentation at PAM ’15 on OpenFlow switch differences and performance characteristics

At the Passive and Active Measurement Conference (PAM) 2015 in New York, Maciej presented our paper on detailed measurements of OpenFlow switch differences and performance characteristics.

The abstract is as follows:

SDN deployments rely on switches that come from various vendors and differ in terms of performance and available features. Understanding these differences and performance characteristics is essential for ensuring successful deployments. In this paper we measure, report, and explain the performance characteristics of flow table updates in three hardware OpenFlow switches. Our results can help controller developers make their programs efficient. Further, we also highlight differences between the OpenFlow specification and its implementations that, if ignored, pose a serious threat to network security and correctness.
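The following toy Python sketch conveys the flavor of such a measurement: install a rule, then poll the data plane with probes until the rule actually takes effect. The helper functions and the simulated 50 ms hardware lag are stand-ins, not a real OpenFlow API.

```python
import time

def install_rule(switch, rule):
    """Stand-in for sending a flow_mod to the switch; the hardware
    table only reflects the rule after a simulated 50 ms lag."""
    switch["rule"] = rule
    switch["active_at"] = time.monotonic() + 0.05

def rule_active(switch):
    """Stand-in for a data-plane probe matching the new rule."""
    return time.monotonic() >= switch.get("active_at", float("inf"))

switch = {}
start = time.monotonic()
install_rule(switch, {"ip_dst": "10.0.0.1", "out_port": 1})
while not rule_active(switch):   # poll with probes until the rule appears
    time.sleep(0.001)
print(f"rule took effect after {(time.monotonic() - start) * 1000:.1f} ms")
```

The interesting result in the paper is that this data-plane delay varies widely across switches and workloads, and can be much longer than the control channel’s acknowledgment suggests.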

Our upcoming CoNEXT ’14 paper: “Providing Reliable FIB Update Acknowledgments in SDN”

Our paper titled “Providing Reliable FIB Update Acknowledgments in SDN” will appear at CoNEXT 2014 in Sydney. Maciej Kuzniar will present the work. Here is the abstract:

In this paper, we first show that transient but grave problems, such as violations of security policies, can occur with real switches even when using consistent updates to Software Defined Networks. Next, we present techniques that are effective in ameliorating this problem. Our key insight is in creating a transparent layer that relies on control and data plane measurements to confirm rule updates only when the rule is visible in the data plane.
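A minimal sketch of that insight, using hypothetical helper functions rather than our actual implementation: a shim between controller and switch forwards the rule update, then reports success upward only once a data-plane probe confirms the rule is in effect.

```python
import time

def reliable_install(switch_install, probe_confirms, rule,
                     timeout=1.0, interval=0.01):
    """Install `rule`, then poll the data plane; acknowledge (return
    True) only when the rule is observed in the data plane, or give
    up after `timeout` seconds."""
    switch_install(rule)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe_confirms(rule):
            return True   # safe to acknowledge the FIB update
        time.sleep(interval)
    return False          # switch may have claimed success, but the
                          # rule never appeared in the data plane

# Toy switch whose hardware lags 30 ms behind the control channel.
state = {}
def switch_install(rule):
    state["ready_at"] = time.monotonic() + 0.03
def probe_confirms(rule):
    return time.monotonic() >= state.get("ready_at", float("inf"))

print("acknowledged:", reliable_install(switch_install, probe_confirms,
                                        {"ip_dst": "10.0.0.1"}))
```

Deferring the acknowledgment this way means a controller performing a consistent update only proceeds to the next step once the previous rule is genuinely forwarding traffic, closing the window in which security policies could be transiently violated.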