High-Performance I/O Programming Models for Exascale Computing
Time: Thursday 2019-02-14, 11:15 - 12:15
Speaker: Sergio Rivas Gomez, CST/EECS/KTH
Location: Room 4423, Lindstedtsvägen 5, KTH, Stockholm
The success of Exascale supercomputers remains largely dependent on novel breakthroughs that effectively solve some of the challenges faced by current Petascale supercomputers. For instance, while the concurrency of upcoming HPC clusters is expected to increase 100-1000x over the next few years, the bandwidth and access latency of the I/O subsystem are projected to remain roughly constant in comparison. Furthermore, the integration of deep-learning and data-analytics applications on HPC increases the likelihood of unexpected failures at Exascale, making fault tolerance a major requirement for such use cases.
To overcome some of these limitations, upcoming large-scale systems will feature a variety of Non-Volatile RAM (NVRAM) technologies alongside traditional hard disks and conventional DRAM. Compute nodes thus become heterogeneous, which provides several advantages for HPC applications (e.g., data locality). Nonetheless, this technological transformation increases programming complexity and poses additional challenges for de facto standard interfaces, such as MPI. Consequently, we observe an inherent need to provide a seamless transition between existing and future programming models for HPC.
In this presentation, we address the challenge of adapting MPI to the changes in the memory and storage hierarchies of Exascale supercomputers. We present the concept of MPI storage windows, an extension to the MPI one-sided communication model that provides a single, unified interface for programming both memory and storage. We then illustrate how this concept can be integrated into a novel MapReduce framework for HPC that features a decoupled strategy and cutting-edge fault-tolerance support during highly parallel executions. The presentation will conclude by outlining current and future work on this topic.