Re-architecting Next-Gen Computing Systems
Traditional datacenters, facilities that house and manage central computer systems, are built using servers. Each server tightly integrates a small amount of central processing unit (CPU), memory, and storage resources onto a single motherboard. As gains in processor speed have slowed, several fundamental limitations of such server-centric architectures have surfaced. Consequently, a new computing paradigm is emerging: a disaggregated architecture where each resource type, such as CPU and memory, is built as a standalone blade. A network fabric, the system by which data is passed between components, interconnects the resource blades. Disaggregated architectures have the potential to increase resource capacity by 10 to 100 times compared with server-centric architectures.
While beneficial architecturally, resource disaggregation alters several fundamental assumptions that once guided the design and optimization of existing networks, systems, and applications. Capitalizing on the benefits will thus require re-architecting legacy systems and networks.
Rachit Agarwal, Computer Science, is leading a team that includes Cornell researchers Christina Delimitrou, Electrical and Computer Engineering, and Hakim Weatherspoon, Computer Science, along with University of California, Berkeley researchers Sylvia Ratnasamy and Scott Shenker. The team's goal is to co-design the network, storage, and compute fabrics for this emerging computing paradigm. Multiple industry collaborators, including Google, Microsoft, Intel, and Snowflake, have also joined the team.
The team is designing ultra-fast network fabrics, including a new set of network software components that incorporates congestion control, failure tolerance, and scheduling mechanisms. The co-design of network and storage fabrics will lead to new memory and storage management software systems and a resource manager that provides essential guarantees across multiple applications sharing disaggregated storage and network fabrics. The team is also building new distributed programming frameworks and re-architecting existing applications to operate efficiently and correctly on disaggregated architectures. The project will provide solutions to some of the most difficult and important technical questions surrounding this emerging computing paradigm.