Server CPUs are terrific for executing a wide range of workloads and tasks, but they are not the optimal answer to every computing problem. CPUs trade performance and efficiency for universality. Custom hardware delivers better performance with lower power consumption when the task at hand is well defined or encapsulated, or when the task's definition has gelled to the point where it won't change. Many such tasks are currently executed by CPU-based servers in the data center, including IPsec processing and Open vSwitch (OVS) packet switching. That's likely to change for next-generation servers if AMD's, Broadcom's, Intel's, Marvell's, and Nvidia's plans for infrastructure offload engines, referred to as SmartNICs, Infrastructure Processing Units (IPUs), and Data Processing Units (DPUs), are realized.
The problem being addressed here is very real. Data centers are maxed out on power consumption as a result of both the electrical power that servers consume and the infrastructure required to power and cool those servers. Data center service expansion relies on either becoming more efficient within the existing data center's power and cooling footprint or building another data center. To date, cloud service providers (CSPs), communications service providers (CoSPs), and enterprise data centers have relied on both strategies. CPU vendors continue to provide more efficient server CPUs by adding more processor cores to each new generation of server CPU chip, produced on increasingly smaller semiconductor process nodes, while CSPs, CoSPs, hyperscalers, and enterprises continue to build new data centers.
Alternatively, the entire data center architecture can be rethought by adding dedicated processing engines for specific tasks. While architectural extensions are increasingly common for specific external workloads like graphics and AI, they are also appropriate for many of the traditional infrastructure overhead tasks like network management, storage, and security. The idea of transferring tasks from CPUs to more efficient infrastructure offload engines isn't new. The key FPGA players, AMD/Xilinx and Intel, and other semiconductor vendors including Broadcom, Marvell, and Nvidia, have been offering programmable SmartNICs capable of offloading certain infrastructure tasks for several years. AMD/Xilinx offers a line of FPGA-based Alveo adaptable accelerator cards for this purpose. Intel has migrated from SmartNICs to FPGA- and ASIC-based IPUs to offload infrastructure tasks. Nvidia purchased Mellanox in 2020 and offers a line of BlueField DPUs for "offloading, accelerating, and isolating a broad range of advanced networking, storage, and security services." These cards are all PCIe boards designed to plug into data center servers to help accelerate infrastructure tasks while reducing the power required to execute those tasks.
Nvidia recently published a white paper titled “DPU Power Efficiency” that quantifies the power-saving benefits of its DPUs. The white paper’s introductory paragraph neatly sums up why you might want to pay some attention to this SmartNIC/IPU/DPU trend:
“One of the best ways to improve efficiency is to use a Data Processing Unit (DPU) or SmartNIC to offload and accelerate networking, security, storage or other infrastructure functions and control-plane applications, which reduces server power consumption up to 30%. The amount of power savings increases as server load increases and can easily save $5.0 million in electricity costs for a large data center with 10,000 servers over the 3-year lifespan of the servers, plus additional savings in cooling, power delivery, rack space, and server capital costs.”
Nvidia’s white paper contains several detailed technical examples based on trials by Ericsson, Nvidia, and other companies that demonstrate the power savings and consequent ROI that can be achieved with infrastructure offload engines.
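The quoted savings figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below uses assumed values for average per-server power draw and electricity price, neither of which appears in the quote; with those assumptions, a 30% reduction across 10,000 servers over a 3-year lifespan lands in the neighborhood of the white paper's figure.

```python
# Back-of-the-envelope check of the quoted electricity savings.
# AVG_SERVER_POWER_KW and PRICE_PER_KWH are illustrative assumptions,
# not figures from Nvidia's white paper.
NUM_SERVERS = 10_000          # data center size from the quote
AVG_SERVER_POWER_KW = 0.5     # assumed average draw per server (kW)
OFFLOAD_SAVINGS = 0.30        # "up to 30%" reduction from the quote
PRICE_PER_KWH = 0.12          # assumed industrial electricity rate (USD/kWh)
HOURS = 3 * 365 * 24          # 3-year server lifespan from the quote

saved_kw_per_server = AVG_SERVER_POWER_KW * OFFLOAD_SAVINGS
total_kwh_saved = NUM_SERVERS * saved_kw_per_server * HOURS
savings_usd = total_kwh_saved * PRICE_PER_KWH

print(f"Energy saved: {total_kwh_saved:,.0f} kWh")
print(f"Electricity savings: ${savings_usd:,.0f}")
```

Under these assumptions the script reports roughly $4.7 million in electricity savings alone, consistent with the quoted $5 million once cooling and power-delivery overhead (a PUE above 1.0) is factored in.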
The question is not whether these types of accelerators will see use, because their economics are undeniable. Rather, the question is which company's vision for these infrastructure accelerators will prevail. Currently, Nvidia envisions a straightforward offloading of infrastructure tasks by adding infrastructure accelerators to servers. VMware's Project Monterey and Red Hat's OpenShift Container Platform already incorporate the means to offload infrastructure tasks to supported infrastructure offload engines. The Linux Foundation is following suit with the launch of the Open Programmable Infrastructure (OPI) project. Nvidia does not expect upgrades to be achieved by plugging infrastructure accelerators into existing servers. Instead, the company expects that next-generation servers installed in the next round of data center server upgrades will arrive with infrastructure accelerator cards pre-integrated. TIRIAS Research believes that infrastructure acceleration, with its accompanying power efficiencies, will appear starting with the next wave of server upgrades.
While Intel is happy to see its IPUs integrated into data center servers in this same manner, the company has broader goals for the future. During the company's Architecture Day 2021, Intel's Guido Appenzeller, then the CTO of the Intel Data Platforms Group, laid out a vision that moves IPUs to the core of the data center. In that role, IPUs are responsible for managing the construction and teardown of composable systems, using server CPUs, memory, and storage as interchangeable parts in the dynamic assembly/disassembly process. This somewhat heretical vision requires a major overhaul of data center architecture and a major rewrite of data center operating systems.
Appenzeller is no longer working at Intel and the company really hasn't elaborated on this concept since his departure, so there's no telling when or whether the company will ultimately take this path. However, TIRIAS Research thinks this is an intriguing concept, well worth exploring in conjunction with enabling technologies such as the Compute Express Link (CXL) interface for coherent, high-speed processor-to-processor and processor-to-memory connections. Meanwhile, in the here and now, there are performance gains and power efficiencies to be had simply by using infrastructure offload engines – whether they be called SmartNICs, DPUs, or IPUs – within the boundaries of today's operating systems.