TCS Seminar Series
Please tell Kilian Risse if you want to give a seminar in the series. They are ideally held Mondays after lunch, but exceptions are possible.
TCS Seminar Series Fall 2019
24 Jan 2020 at 11:15 in 4423
Software Engineering ReThought - future challenges and empirical software engineering
(Tony Gorscheck, Blekinge Institute of Technology)
Software Engineering ReThought is a new research project with the aim to take on the next generation challenges facing companies developing software-intensive systems and products. We as an engineering lab are working on introducing 3:rd generation empirical software engineering – denoting close co-production of pragmatic problem-solving in close collaboration with our industrial partners as we perform engineering research. The focus is on how to affect change, today.
17 Dec 2019 at 12:00 in 1537
Towards Optimal Algorithms For Single-Source Shortest Paths in Dynamic Graphs
(Maximilian Probst Gutenberg, University of Copenhagen)
In this talk, I am reviewing the State-of-the-Art for Single-Source Shortest Paths (SSSP) and discuss very recent advances in high-level that make promising progress on different variations of the problem. In particular, I will give a brief introduction to the framework that was used to obtain the currently best deterministic algorithm for SSSP in undirected graphs. We then outline progress in the directed setting that still seems to be rather poorly understood. Finally, I will talk about future directions towards solving the problem (near-)optimally.
16 Dec 2019 at 13:15 in 1537
Circular proofs with names for the hybrid mu-calculus
(Sebastian Enqvist, Stockholm University)
I present ongoing work on developing a circular proof system for the hybrid mu-calculus, in the style of Stirling's proof system for the modal mu-calculus which uses names for fixpoint unfoldings. Stirling's system has recently been used by Afshari and Leigh to provide a cut-free complete sequent system for the mu-calculus, and I am hoping to extend their result to the hybrid mu-calculus in future work. The hybrid mu-calculus extends the modal mu-calculus with nominals referring to points in a model and corresponding satisfaction operators that describe which formulas are true at the value of a nominal. While this is a relatively gentle extension of the mu-calculus, it presents some new challenges, and could provide a stepping stone towards proof theory for even richer languages like hybrid mu-calculus with converse modalities or guarded fixpoint logic.
28 Nov 2019 at 11:15 in 1537
Engineering a Safer World: introduction to Prover Technology
(Iago Abal, Prover Technology)
Prover Technology provides software products and services for development and V&V of railway signaling systems. Our customer base includes railway infrastructure managers and railway signaling system suppliers around the world, as well as tool developers that license our model-checking technology.
In this talk we will take a walk through our product offering, the technical challenges we are facing, and some of our current R&D projects.
22 Nov 2019 at 10:00 in 1537
DRaGen: a tool for generating random values for abstract data types
(Alejandro Russo, Chalmers)
In random testing, it is challenging to control random data generators' distributions---specially when it comes to complex data like user-defined algebraic data types (ADT). In this talk, we show how to adapt results from an area of mathematics known as branching processes, and show how they help to analytically predict (at compile-time) the expected number of generated constructors, even in the presence of mutually recursive or composite ADTs. Using our probabilistic formulas, we design heuristics capable of automatically adjusting probabilities in order to synthesize generators whose distributions are aligned with users' demands. We provide a Haskell implementation of our mechanism in a tool called DRaGen and perform case studies with real-world applications. Furthermore, we will also show that sometimes becomes necessary to generate structurally richer ADTs values in order to test deeper software layers. For that, we will leverage static information found in the codebase as a manner to improve the generation process. Namely, our generators are capable of considering how programs branch on input data as well as how ADTs values are built via interfaces---while still providing compile-time guarantees and heuristic that tries to adjust their distributions to users demands. When generating random values, DraGen's synthesized random generators show improvements in code coverage when compared with those automatically derived by state-of-the-art tools.
This talk is based on the following articles: http://www.cse.chalmers.se/~russo/publications_files/haskell2018.pdf (HASKELL 2018)
http://www.cse.chalmers.se/~mista/assets/pdf/ast19.pdf (AST@ICSE 2019)
20 Nov 2019 at 10:15 in 1537
Language support for consistency in distributed systems
Large-scale distributed systems replicate data in order to tolerate faults such as network partitions. However, frequent replica updates (e.g., due to write-heavy workloads) as well as geo-replication (i.e., geographically-distributed replicas) make replica synchronization very expensive. This challenge suggests two research directions: (1) lowering the cost of synchronization while satisfying given consistency requirements; and (2) building systems that weaken consistency requirements when possible.
In this talk, I will present our results which make progress in both of these directions. (1) A consistency model and a corresponding protocol which integrate conflict-free replicated data types with a strongly consistent system to reduce synchronization cost while providing observable atomic consistency guarantees. (2) A type-and-effect system for a higher-order distributed language which enables the safe use of different consistency levels within the same program.
11 Nov 2019 at 13:15 in 4523
30% Seminar: Secure and Trustworthy Configurations of DMA Devices
I will give a presentation about the research I have done so far and my future research plans.
I will describe my contributions to a formally verified secure embedded Linux system with Internet connectivity. This system is secure in the sense that only Linux code (kernel and applications) with certain fingerprints (hash values) can be executed. Linux is executed as a guest on top of a formally verified hypervisor that isolates its guests, and alongside another formally verified guest that checks that executable Linux code has valid hash values. To make the system more usable, Internet connectivity is added by enabling Linux to use a network interface controller (NIC). However, since the NIC can read and write arbitrary memory regions, if Linux is in direct control of the NIC, Linux could use the NIC to inject untrusted code, modify the database of valid hash values, or modify the code of the hypervisor or the page tables to get access to all system resources. Therefore the NIC must be protected by the hypervisor so that the NIC can only access certain memory regions.
My contribution concerns the protection of the NIC. The NIC is protected by a NIC monitor of about 1000 lines of C code located in the hypervisor. The NIC monitor is invoked each time Linux attempts to reconfigure the NIC. The NIC monitor checks that the NIC reconfiguration satisfies a security invariant of the NIC, and only if so applies the reconfiguration. The NIC monitor code has not been verified. However, the security invariant of the NIC has been verified in the interactive theorem prover HOL4. This verification involves three components: A detailed formal model of the NIC that describes all memory accesses the NIC can perform; the definition of the security invariant stating what NIC states are secure; and the proof of that the invariant implies that each NIC operation preserves the invariant and that the NIC can access only certain memory regions.
Making a model of the NIC, identifying the security invariant, and prove that the invariant is correct is a time consuming task. Therefore, I plan to investigate how this work can be generalized and automated to include any I/O device that can perform direct memory accesses (DMA). This would involve identifying and formally describing the common operations of DMA devices, and automatically proving that their operations preserve a "universal security invariant." Furthermore, it would be helpful to be able to automatically generate models of hardware based on its formal design (written in e.g. VHDL or Verilog). This would speed up the verification process, and significantly enhance the trustworthiness of the verification.
11 Nov 2019 at 11:15 in 4523
On Benchmarks of Bugs for Studies in Automatic Program Repair
Benchmarks play an important role in any research for evaluating competing methods and tools. In the context of automatic program repair, benchmarks of bugs are the mean to investigate the performance of repair tools in fixing bugs. Despite their big importance in empirical studies, benchmarks of bugs have received little attention in the field. In this talk, I will first discuss on the importance of the adoption of publicly available benchmarks of bugs for evaluating repair tools, followed by the presentation of the state-of-the-art of benchmarks of bugs for the field. Second, I will present the recently defined benchmark overfitting problem, and its investigation in the context of 11 repair tools and the existing five benchmarks of bugs for the Java language. Finally, I will discuss on random unexplored topics on benchmarks, which are ongoing works or opportunities for future research.
06 Nov 2019 at 11:15 in 4523
Supporting Continuous Engineering with Automated Log Clustering and Diagnosis
(Leon Moonen, Simula Research Laboratory)
Continuous engineering (CE) practices, such as continuous integration and continuous deployment, have become key to modern software development. They are characterized by short automated build and test cycles that give developers early feedback on potential issues. CE practices help to release software more frequently, and reduces risk by increasing incrementality. However, effective use of CE practices in industrial projects requires making sense of the vast amounts of data that results from the repeated build and test cycles.
In this talk, we will explore to what extent these data can be treated more effectively by automatically grouping logs of runs that failed for the same underlying reasons, and what effort reduction can be achieved. Our proposed approach builds on earlier work for system log clustering, and we empirically investigate how choices in the clustering pipeline affect the grouping of continuous deployment logs provided by our industrial collaborator. We conclude by discussing a log diagnosis technique that builds on event spectra can be used to highlight which events in a failure log are most likely to indicate the causes of the failure.
04 Nov 2019 at 10:15 in 4523
Deniability, Anonymity and Verifiability: Towards Democracy-Enhancing Technologies
(Daniel Bosk, KTH)
Democracy is a decentralized process, each individual participates in a common-knowledge formation process. There are several requirements for this to work, one of them is free speech. In this talk, we will discuss some technologies to support this process, "democracy-enhancing technologies".
Totalitarian regimes usually perform surveillance to, among other things, limit free speech by self-censorship to disrupt this common-knowledge formation. We introduce a messaging protocol that provides deniability against a stronger than usual adversary. We also investigate how to distribute messages to their recipients while hiding who communicates with whom and how to do robust anonymous communication without a reliable infrastructure.
However, providing channels for free speech with strong privacy properties is not enough. They can also be abused and turned against democracy: the adversary can "poison" the public space through disinformation campaigns, e.g. spreading "fake news" using these channels. Some cases of this problem can be mitigated by introducing mechanisms that provide verifiable statements. We introduce a protocol to verifiably ascertain the participation counts of protests, but in a privacy-preserving manner to counteract surveillance.
From a scientific point of view, these problems pose an interesting interplay between privacy (deniability, anonymity) on the one hand and verifiability on the other. We can achieve the desired properties by developing our own protocols and, whenever possible, modifying existing cryptographic protocols to work with the weaker assumptions (e.g. on the limits of the adversary) required in the context of state surveillance.
28 Oct 2019 at 13:15 in 4523
Automatic Unit Test Amplification for DevOps
(Benjamin Danglot, PhD candidate at Inria Lille)
In recent years, unit testing has become an essential part of any serious software project to verify its proper functioning. However, the tests are tedious and expensive for the industry: it is difficult to evaluate the return on investment. To overcome this problem, the research is investigating the automation of test creation. Although these automatically generated test suites are of high quality, there are still barriers to the industry adoption of such techniques. This is because of the difficulty in understanding, integrating, and managing the generated test suite. The objective of this thesis is to overcome these difficulties by proposing a new technique that produces new tests based on existing tests. This technique is called "test amplification". By construction, these new tests are close to the tests written by humans, and therefore are easy to understand, integrate and manage. This approach is implemented as a tool called DSpot.
21 Oct 2019 at 11:15 in 1537
(30% Seminar) Security and privacy of smart home-base IoT systems
(Md Sakib Nizam Khan, KTH)
In this talk, I will discuss two of our research works concerning the security and privacy of smart home-based IoT systems.
The first work that I will present is titled ‘chownIoT: Enhancing IoT Privacy by Automated Handling of Ownership Change’. With the increasing deployment of smart home IoT devices, their ownership is likely to change during their life-cycle. IoT devices, especially those used in smart home environments, contain privacy-sensitive user data, and any ownership change of such devices can result in privacy leaks. In this work, we present a system called chowIoT, capable of securely handling ownership change of IoT devices. This work is published at IFIP International Summer School 2018 on Privacy and Identity Management. For more information, please follow the link to the full paper https://link.springer.com/chapter/10.1007/978-3-030-16744-8_14
The second work that I will discuss is our upcoming survey paper on smart home-based Ambient Assisted Living (AAL) or elderly health monitoring systems and their security and privacy considerations and implications. In this work, we performed a systematic literature review on smart home/IoT based AAL systems and their privacy and security aspects. Besides literature, we also performed a survey on commercially available AAL systems to understand the gap in terms of privacy and security between the state of the art research and industry. I will discuss the method of our systematic literature review and also our findings.
Lastly, I will also shed some light on our future research directions concerning the privacy and security of IoT systems.
11 Oct 2019 at 11:15 in 4523
Failure-Driven Program Repair
(Davide Ginelli, PhD student at the University of Milano-Bicocca)
Program repair techniques can dramatically reduce the cost of program debugging by automatically generating program fixes. Although program repair has been already successful with several classes of faults, it also turned out to be quite limited in the complexity of the fixes that can be generated. This Ph.D. thesis addresses the problem of cost-effectively generating fixes of higher complexity by investigating how to exploit failure information to directly shape the repair process. In particular, this thesis proposes Failure-Driven Program Repair, which is a novel approach to program repair that exploits its knowledge about both the possible failures and the corresponding repair strategies, to produce highly specialized repair tasks that can effectively generate non-trivial fixes.
10 Oct 2019 at 13:15 in 4523
(50% seminar) Trustworthy Binary Analysis
(Andreas Lindner, KTH)
The safety and security of a computer system simply hinges on its hardware and the interactions with it. While binary program code is the software that describes these interactions, typical examples for critical software in this context are the implementations of device drivers and isolation abstractions like virtual memory. More advanced challenges emerge when software implementations are supposed to cover security flaws in the hardware design by limiting their hardware interactions in very peculiar ways. In this talk, I will present some of our research work related to theorem-prover-powered binary analysis. The main topics will be a summary of our previous work on our binary verification framework HOLBA, a generalization of Hoare triples for binary code and worst case execution time analysis using symbolic execution.
07 Oct 2019 at 13:15 in 4523
TripleAgent: Monitoring, Perturbation and Failure-obliviousness for Automated Resilience Improvement in Java Applications
(Long Zhang, KTH)
In this talk, I would like to present TripleAgent, a novel resilience improvement system for Java applications. The unique feature of this system is to combine automated monitoring, automated perturbation injection, and automated resilience improvement. The latter is achieved thanks to the failure-oblivious computing, a concept introduced in 2004 by Rinard and colleagues. We design and implement the system as agents for the Java virtual machine. We evaluate the system on two real-world applications: a file transfer client and an email server. Our results show that it is possible to automatically improve the resilience of Java applications with respect to uncaught or mishandled exceptions. Paper if you feel interested: https://arxiv.org/pdf/1812.10706.pdf
17 Sep 2019 at 10:15 in 1537
(50% seminar) d-Galvin Families and Separating Arithmetic Circuit Classes
I will present results from my paper "d-Galvin Families" (with Johan Håstad and Guillaume Lagarde). The Galvin problem asks for the minimum size of a family F of sets from [n] so that for any set A of size n/2 we can find S in F that is balanced on A. We generalize this, finding a polynomial size F so that for any A we can partition [n] into d pieces, each balanced on A.
This generalization is motivated by an approach to separating arithmetic circuit classes. Specifically, it rejects an approach to a super-polynomial separation between the size of syntactically multilinear arithmetic branching programs and arbitrary arithmetic circuits. In this talk I also present questions about identity testing for arithmetic circuits that I have considered in my research.
One approach to checking if a circuit evaluates to 0 is to check its value at certain (often random) inputs. I show a lower bound on the size of matrix inputs that are required to find a nonzero point of a polynomial. I also discuss questions related to derandomizing deterministic identity testing.
09 Sep 2019 at 13:15 in 4523
mCoq: Mutation Proving for Analysis of Verification Projects
(Karl Palmskog, The University of Texas at Austin)
Mutation analysis, which introduces artificial defects into software systems, is the basis of mutation testing, a technique widely applied to evaluate and enhance the quality of test suites. However, despite the deep analogy between tests and formal proofs, mutation analysis has seldom been considered in the context of deductive verification. We propose mutation proving, a technique for analyzing verification projects that use proof assistants. We implemented our technique for the Coq proof assistant in a tool dubbed mCoq. mCoq applies a set of mutation operators to Coq definitions of functions and datatypes, inspired by operators previously proposed for functional programming languages. mCoq then checks proofs of lemmas affected by operator application. To make our technique feasible in practice, we implemented several optimizations in mCoq such as parallel proof checking. We applied mCoq to several medium and large scale Coq projects, and recorded whether proofs passed or failed when applying different mutation operators. We then qualitatively analyzed the mutants, and found several examples of weak and incomplete specifications. For our evaluation, we made many improvements to serialization of Coq code and even discovered a notable bug in Coq itself, all acknowledged by developers. We believe mCoq can be useful both to proof engineers for improving the quality of their verification projects and to researchers for evaluating proof engineering techniques.
Joint work with Ahmet Celik, Marinela Parovic, Emilio Jésus Gallego Arias, and Milos Gligoric.
05 Sep 2019 at 11:15 in 1537
Neural machine translation for automatic software repair: the Transformer architecture
In neural machine translation (NMT), state of the art developments are impacted by the wide application of the Transformer architecture, which outperforms older LSTM-based models. This architecture has also been used more recently to achieve significant results in other natural language tasks by utilizing transfer learning. In automatic software repair, the process of automatically fixing software bugs, LSTM-based sequence-to-sequence learning models have been applied to correct buggy lines in a manner similar to the machine translation task. This presentation introduces the recent advances in the area of NMT, discusses their application on automatic software repair, presents some preliminary results and proposes ideas for future steps that can improve the effectiveness of NMT methods for software repair.
04 Sep 2019 at 15:15 in 4523
Bisimulations for logics of strategic reasoning
(Catalin Dima, Université Paris-Est Créteil)
The notion of bisimulation is of central importance in modal logics, as it frequently allows characterizing the expressive power of a logic within a class of Kripke models. We present some recent work towards the characterization of the expressive power of some logics of strategic reasoning in multi-agent systems with perfect or imperfect information with the aid of appropriate notions of bisimulations. The talk will also provide a variety of counterexamples of systems related by some type of bisimulation but not satisfying the same formulas in one of the strategy logics under consideration. If time allows, we will also provide some applications in the verification of coercion-freeness properties in a simple voting system or as the statement of the elimination of dominated strategies in multi-player games as a bisimulation property.
21 Aug 2019 at 11:01 in 4523, Lindstedtsvägen 5 - floor 5
Dependencies Gone Wild: Tales from automating 10,000+ project's dependency updates
Renovate Bot is a tool used to automate dependency updates across dozens of package managers, in use by 10's of thousands of active projects. This presentation will focus on lessons learned working with dependencies in the wild: vulnerabilities, different approaches across languages, and the use and misuse of semver and other versioning approaches.
20 Aug 2019 at 14:30 in Room 4523
(50% seminar) Algebraic Data-Structures for Distances in Graphs
(Jan van den Brand)
I will present results from my two upcoming FOCS papers "Dynamic Approximate Shortest Paths and Beyond: Subquadratic and Worst-Case Update Time" (with Danupon Nanongkai) and "Sensitive Distance and Reachability Oracles for Large Batch Updates" (with Thatchaphol Saranurak). In both papers we use algebraic techniques to create new data-structures which can efficiently maintain distances in a graph that undergoes changes to its edges. We answer several open questions: * When we want to maintain distances from a single source node, a fundamental question is to beat trivially calling Dijkstra's algorithm whenever the graph changes, taking O(n^2) time on dense graphs. We answer this question by giving an O(n^1.823) upper bound. * We also obtain the first data-structure with subquadratic update time for nearly 1.5-approximating the diameter. * When we want to maintain distances between all-pairs of nodes explicitly, the O(n^2) amortized update time by Demetrescu and Italiano [STOC'03] already matches the trivial Omega(n^2) lower bound. A fundamental question is whether it can be made worst-case. We answer this question by giving an O(n^2) upper bound for undirected unweighted graphs. * We give the first failure sensitive distance oracle that can maintain the distance in a graph for any number of edge failures in subquadratic time.
19 Aug 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Testing Small Vertex Connectivity in Near-Linear Queries.
(Sorrachai Yingchareonthawornchai, Aalto University, Finland)
Vertex connectivity is a classic extensively-studied problem. Given an integer k, its goal is to decide if an n-node m-edge graph G can be disconnected by removing k-1 vertices (otherwise, G is k-connected). In the study of testing graph properties, we want to distinguish between when G is k-connected and when it is "far" from having such property. It has been open for many years whether the bounds from [Goldreich, Ron'02], [Orenstein, Ron'11], [Yoshida, Ito'11, and '12] which are exponential in k can be made polynomial (this was asked in e.g. [Orenstein, Ron'11]). In this talk, we give a testing algorithm that queries edges that is near-linear in k. The key technique is from the recent local vertex connectivity algorithm [Nanongkai, Saranurak, Yingchareonthawornchai'19] in an appropriate form for property testing. This talk is based on joint work with Sebastian Forster, Danupon Nanongkai, Thatchaphol Saranurak, and Liu Yang.
Note: Sorrachai will be visiting Danupon between August 19-30
14 Aug 2019 at 13:15 in 4523, Lindstedtsvägen 5 - floor 5
Fast Approximate Shortest Paths in the Congested Clique
(Michal Dory, Technion, Israel)
I will discuss our recent shortest paths algorithms in the distributed congested clique model. Our first contribution is a (2+epsilon)-approximation algorithm for all-pairs shortest paths (APSP) requiring only polylogarithmic number of rounds. This is the first sub-polynomial constant approximation for APSP in this model. We also signiﬁcantly improve upon the state-of-the-art for approximate multi-source shortest paths and diameter, as well as exact single-source shortest paths (SSSP). Our main techniques are new distance tools that are obtained via improved algorithms for sparse matrix multiplication, which we leverage to construct eﬃcient hopsets and shortest paths. Joint work with Keren Censor-Hillel, Janne H. Korhonen and Dean Leitersdorf.
Note: Michal will be visiting Danupon between Aug 12-30
Previous years' seminars.