Effective use of networked resources requires the ability to solve complex large-scale optimization problems fast while accounting for many input variables and performance requirements, such as end-to-end latency. Advancing beyond heuristic approaches, we begin by surveying the current state of applied machine learning for solving complex combinatorial optimization problems over networks. In our IEEE Access article titled “Learning Combinatorial Optimization on Graphs: A Survey with Applications to Networking”, we qualitatively analyse existing learning approaches and applications in the networking domain. The full abstract is as follows:
“Existing approaches to solving combinatorial optimization problems on graphs suffer from the need to engineer each problem algorithmically, with practical problems recurring in many instances. The practical side of theoretical computer science, such as computational complexity, then needs to be addressed. Relevant developments in machine learning research on graphs are surveyed for this purpose. We organize and compare the structures involved with learning to solve combinatorial optimization problems, with a special eye on the telecommunications domain and its continuous development of live and research networks.”
This work was done by Natalia Vesselinova (RISE), Rebecca Steinert (RISE), Daniel Felipe Perez-Ramirez (RISE) and Magnus Boman (KTH).
At USENIX ATC 2020, Alireza presented our paper titled “Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks”. Full materials (video, slides, PDF) are available at the USENIX site. The paper abstract is below. This is joint work with Alireza Farshin, Amir Roozbeh, Gerald Q. Maguire Jr., and Dejan Kostić.
Memory access is the major bottleneck in realizing multi-hundred-gigabit networks with commodity hardware, hence it is essential to make good use of cache memory that is a faster, but smaller memory closer to the processor. Our goal is to study the impact of cache management on the performance of I/O intensive applications. Specifically, this paper looks at one of the bottlenecks in packet processing, i.e., direct cache access (DCA). We systematically studied the current implementation of DCA in Intel® processors, particularly Data Direct I/O technology (DDIO), which directly transfers data between I/O devices and the processor’s cache. Our empirical study enables system designers/developers to optimize DDIO-enabled systems for I/O intensive applications. We demonstrate that optimizing DDIO could reduce the latency of I/O intensive network functions running at 100Gbps by up to ~30%. Moreover, we show that DDIO causes a 30% increase in tail latencies when processing packets at 200Gbps, hence it is crucial to selectively inject data into the cache or to explicitly bypass it.
Having published a NOMS 2018 paper on reliable distributed control planes, we continued working on this important problem and added an angle of guaranteed performance. Besides filing a patent application, the work culminated in an IEEE Access article titled “Fast Deployment of Reliable Distributed Control Planes with Performance Guarantees”. The full abstract is as follows:
Current trends strongly indicate a transition towards large-scale programmable networks with virtual network functions. In such a setting, deployment of distributed control planes will be vital for guaranteed service availability and performance. Moreover, deployment strategies need to be completed quickly in order to respond flexibly to varying network conditions. We propose an effective optimization approach that automatically decides on the needed number of controllers, their locations, control regions, and traffic routes into a plan which fulfills control flow reliability and routability requirements, including bandwidth and delay bounds. The approach is also fast: the algorithms for bandwidth and delay bounds can reduce the running time at the level of 50x and 500x, respectively, compared to state-of-the-art and direct solvers such as CPLEX. Altogether, our results indicate that computing a deployment plan adhering to predetermined performance requirements over network topologies of various sizes can be produced in seconds and minutes, rather than hours and days. Such fast allocation of resources that guarantees reliable connectivity and service quality is fundamental for elastic and efficient use of network resources.
The work was done at RISE by Shaoteng Liu, Rebecca Steinert, Natalia Vesselinova, and Dejan Kostić.
On November 22, 2019, we had our second meeting with the Industrial Advisory Board (IAB). Present for part or most of the day were:
- Azimeh Sefidcon and Björn Skubic, Ericsson
- Ulrik Janusson, Scania
- Shahryar Khan, Telia
We presented our status, published work, and short- and long-term plans. The IAB feedback was positive, and we are very excited to explore further ways our work can be of use to Swedish industry!
Below are the images from the poster session held at RISE:
Yesterday was a superb day for us, with the KTH graduation taking place at the Stockholm Concert Hall, followed by the banquet at the Stockholm City Hall. Kirill and Georgios received their PhD degrees, and here is how we looked!