Can networking applications achieve suitable performance with the IOMMU enabled at high rates? Our recent PeerJ CS article answers this question by characterizing the performance implications of the IOMMU and its cache (the IOTLB) on recent Intel Xeon Scalable and AMD EPYC processors at 200 Gbps. Our study shows that enabling the IOMMU at high rates can reduce throughput by up to 20 percent due to excessive IOTLB misses. Moreover, we present potential mitigation techniques that recover the throughput lost to this “IOTLB wall” by using hugepage-backed buffers in the Linux kernel. This is joint work with Alireza Farshin (KTH), Luigi Rizzo (Google), Khaled Elmeleegy (Google), and Dejan Kostic (KTH). Follow the links for PDF and code.
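To give a rough intuition for why hugepage-backed buffers relieve IOTLB pressure, the back-of-the-envelope sketch below counts how many page translations are needed to cover a buffer pool with 4 KiB pages versus 2 MiB hugepages. The 64 MiB pool size and the helper name `pages_needed` are illustrative assumptions, not figures or code from the article.

```python
KIB = 1024
MIB = 1024 * KIB


def pages_needed(buffer_bytes: int, page_bytes: int) -> int:
    """Number of pages (and thus distinct translations) covering a buffer."""
    # Ceiling division using integers only.
    return -(-buffer_bytes // page_bytes)


# Hypothetical buffer pool size, chosen only for illustration.
pool_bytes = 64 * MIB

small_pages = pages_needed(pool_bytes, 4 * KIB)  # regular 4 KiB pages
huge_pages = pages_needed(pool_bytes, 2 * MIB)   # 2 MiB hugepages

print(f"4 KiB pages: {small_pages} translations")
print(f"2 MiB pages: {huge_pages} translations")
print(f"reduction factor: {small_pages // huge_pages}x")
```

Fewer distinct translations for the same memory means each cached IOTLB entry covers more of the buffer pool, which is the intuition behind using hugepages to reduce misses.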