Yes, dedicated network fabric is a foundational component of the NVIDIA Vera Rubin platform's architecture. The platform is positioned as a full-stack AI supercomputing system built for the complex, multi-step workflows of agentic AI, and NVIDIA integrates several networking technologies to provide high-speed, low-latency communication both within and between its computational units, treating networking as a core element rather than a secondary layer. The stated goal of this approach is to eliminate the communication and memory-movement bottlenecks that would otherwise limit the scaling of AI training and inference.
At the heart of this fabric are the sixth-generation NVLink and NVLink Switch, which unify up to 72 NVIDIA Rubin GPUs into a single high-speed interconnect domain. Sixth-generation NVLink delivers 3.6 terabytes per second (TB/s) of bandwidth per GPU, amounting to roughly 260 TB/s of aggregate connectivity within a rack, at latencies low enough for rapid GPU-to-GPU data exchange. The platform also incorporates ConnectX-9 SuperNICs and BlueField-4 data processing units (DPUs) for advanced networking, data processing, and security offload, alongside Spectrum-6 Ethernet switches for high-performance Ethernet traffic. Together these elements form the integrated networking infrastructure underpinning the platform's performance.
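The per-rack figure follows directly from the per-GPU bandwidth, as a quick back-of-the-envelope check shows (the rounding to 260 TB/s is NVIDIA's):

```python
# Sanity check on the NVLink figures quoted above.
gpus_per_rack = 72            # one NVLink domain (an NVL72-class rack)
nvlink_bw_per_gpu_tbps = 3.6  # sixth-gen NVLink, TB/s per GPU

aggregate_tbps = gpus_per_rack * nvlink_bw_per_gpu_tbps
print(f"Aggregate NVLink bandwidth per rack: {aggregate_tbps:.1f} TB/s")
# 72 * 3.6 = 259.2 TB/s, which NVIDIA rounds to 260 TB/s
```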
Beyond the GPU interconnect, the platform features specialized fabrics such as the Scalable Coherency Fabric for the Vera CPU and NVLink-C2C for high-bandwidth, coherent CPU-GPU memory access. A notable addition is the Context Memory eXchange (CMX), also referred to as ICMS: a separate, purpose-built network dedicated to managing context memory, which becomes increasingly important as AI workloads grow to millions of tokens and high degrees of workload parallelism. For scaling beyond a single rack, the platform supports NVIDIA Quantum-X800 InfiniBand and Spectrum-X Ethernet, sustaining high compute utilization across large GPU clusters. This layered, deeply integrated approach positions networking as a central factor in the platform's throughput, latency, and energy-efficiency goals for next-generation AI workloads.
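The two-tier structure described above, with NVLink as the scale-up fabric inside a 72-GPU rack and Quantum-X800 InfiniBand or Spectrum-X Ethernet as the scale-out fabric between racks, can be sketched with a simple model. This is an illustrative assumption about topology, not an NVIDIA API; the helper function and domain-size constant are hypothetical:

```python
# Illustrative model of the scale-up vs. scale-out fabric split:
# GPUs in the same 72-wide NVLink domain talk over NVLink; traffic
# between domains crosses the inter-rack fabric. Hypothetical sketch.
NVLINK_DOMAIN_SIZE = 72  # GPUs unified per rack by sixth-gen NVLink

def fabric_for(gpu_a: int, gpu_b: int) -> str:
    """Return which fabric tier carries traffic between two GPU indices."""
    if gpu_a // NVLINK_DOMAIN_SIZE == gpu_b // NVLINK_DOMAIN_SIZE:
        return "NVLink (scale-up, intra-rack)"
    return "Quantum-X800 InfiniBand / Spectrum-X Ethernet (scale-out)"

print(fabric_for(0, 71))   # same 72-GPU domain -> NVLink
print(fabric_for(0, 72))   # crosses the rack boundary -> scale-out fabric
```

The design point this illustrates is that the highest-bandwidth path is confined to the rack, so cluster-level schedulers benefit from placing tightly coupled workers within one NVLink domain.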