
What is Vera Rubin?

NVIDIA Vera Rubin is a full-stack AI supercomputing platform, unveiled at GTC 2026 and engineered for agentic AI, that is, complex, multi-step autonomous workflows. Rather than a single chip, it is an integrated system of seven new chips designed to work together: the NVIDIA Vera CPU, NVIDIA Rubin GPU, NVIDIA NVLink™ 6 Switch, NVIDIA ConnectX®-9 SuperNIC, NVIDIA BlueField®-4 DPU, NVIDIA Spectrum™-6 Ethernet switch, and NVIDIA Groq 3 LPU. This codesign across compute, networking, and storage lets the platform operate as a single AI supercomputer that covers every phase of AI development, from large-scale pre-training and post-training to real-time inference for AI agents.

The architecture targets two core demands of agentic AI: multi-step problem-solving and very long-context workflows at scale. The NVIDIA Vera CPU, for instance, is purpose-built for agentic AI and reinforcement learning, delivering twice the efficiency and 50% faster performance than traditional rack-scale CPUs in these workloads. The quoted platform-level figures include up to 10x higher inference throughput per watt, the potential for 10x more revenue opportunity for trillion-parameter models, a target of up to 15x token generation, and support for 10x larger models for richer multi-agent interactions. This level of integration and performance matters most in workloads that combine large-scale data processing with sophisticated reasoning, such as advanced recommendation systems or complex simulations. In such pipelines, a vector database like Milvus can store and retrieve the high-dimensional embeddings that the models generate.
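To make the embedding-retrieval step concrete, here is a minimal sketch of what a vector database does conceptually: rank stored embeddings by similarity to a query vector and return the closest matches. It is a brute-force, pure-Python stand-in, not the Milvus API. The names `cosine_similarity`, `top_k`, and `EMBEDDINGS`, as well as the three-dimensional toy vectors, are hypothetical and chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k=2):
    """Brute-force search: score every stored vector, keep the k best."""
    scored = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy embeddings; real model embeddings typically have
# hundreds or thousands of dimensions.
EMBEDDINGS = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.8, 0.6, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}

results = top_k([1.0, 0.1, 0.0], EMBEDDINGS, k=2)
for doc_id, score in results:
    print(f"{doc_id}: {score:.4f}")
```

A linear scan like this costs time proportional to the number of stored vectors on every query, which is why production systems generally rely on approximate nearest-neighbor indexes instead of exhaustive comparison.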

Vera Rubin reflects a shift toward tightly integrated, full-stack AI infrastructure that is optimized for scalability and efficiency as a whole system rather than part by part. It also includes third-generation NVIDIA Confidential Computing, which extends security to full-rack scale by establishing a unified trusted execution environment across the platform's CPUs, GPUs, and NVLink fabric. Products based on the platform are slated for availability in the second half of 2026, and major cloud providers such as Amazon Web Services, Google Cloud, and Microsoft Azure, along with global system manufacturers, are expected to offer Vera Rubin systems. That breadth of planned adoption points to an industry move toward specialized, high-performance computing environments built for AI.
