Modern datacenters offer an abundance of online services to billions of daily users. The most demanding online services come with tight response latency requirements, forcing service providers to keep their massive datasets memory-resident, distributed across the datacenter’s servers. Every user request accesses the memory of hundreds of servers; therefore, fast access to the aggregate memory pool is crucial for service quality. The entity binding all the memory resources together is the datacenter network. Unfortunately, despite the dramatic evolution of datacenter fabrics over the past decade, networking still adds a latency overhead of tens of microseconds to every remote memory access, making memory pooling impractical. In this talk, I will propose a holistic network-centric system redesign to enable memory pooling in future datacenters, which includes (i) a specialized lightweight network stack; (ii) on-chip integration of the network interface logic; and (iii) new network operations with richer, end-to-end semantics. I will highlight the role and demonstrate the effect of each of these three key features using systems built throughout my dissertation.
Alexandros (Alex) Daglis is a sixth-year PhD student at EPFL, advised by Prof. Babak Falsafi and Prof. Edouard Bugnion. His research interests lie in rack-scale computing and datacenter architectures. Alex advocates tighter integration and co-design of network and compute resources as a necessary approach to tackling the performance overheads associated with inter-node communication in scale-out architectures. He is a founding member of Scale-Out NUMA, an architecture, programming model, and communication protocol for low-latency, distributed in-memory processing. Scale-Out NUMA has been prototyped and awarded a US patent. As an intern at HP Labs, Alex worked on the design of The Machine’s unique memory subsystem. More details can be found on his CV.