Launched at GTC 2026, the NVIDIA Vera Rubin platform is a full-stack AI supercomputing platform designed for complex, multi-step agentic AI workflows. Provisioning compute on Vera Rubin means working through NVIDIA’s integrated hardware and software stack, typically via cloud partnerships or direct deployment in AI factories. The platform integrates several key components: the NVIDIA Vera CPU, NVIDIA Rubin GPU, NVIDIA NVLink™ 6 Switch, NVIDIA ConnectX®-9 SuperNIC, NVIDIA BlueField®-4 DPU, and NVIDIA Spectrum™-6 Ethernet switch, all engineered to function as a unified AI supercomputer. For developers, this means accessing a system built for massive-scale pretraining, post-training, test-time scaling, and real-time agentic inference.
Resources on Vera Rubin will be provisioned primarily in two ways: through cloud service providers or by deploying dedicated NVIDIA-based infrastructure. Major cloud providers such as Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure are listed as partners that will offer Vera Rubin-based products starting in the second half of 2026, meaning developers will likely provision resources through the established interfaces and APIs of these cloud platforms by selecting Vera Rubin instances or services. NVIDIA Cloud Partners including CoreWeave, Crusoe, Lambda, Nebius, Nscale, and Together AI will also make the platform available. For organizations building their own AI factories, the Vera Rubin DSX AI Factory reference design provides a blueprint for co-designed infrastructure that optimizes tokens per watt and overall throughput while improving system resiliency. The platform’s software stack, including DSX Max-Q for dynamic power provisioning and DSX Flex for grid-flexible assets, further supports efficient resource utilization and deployment.
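In practice, provisioning a cloud instance backed by this class of hardware reduces to an API call that names a machine type and an attached accelerator. The sketch below assembles such a request as a plain payload dictionary; the machine type and accelerator type names (`vera-rubin-ultra`, `nvidia-rubin-gpu`) are hypothetical placeholders, since the actual SKUs will be defined by each cloud provider’s catalog.

```python
# Sketch: constructing a cloud instance-creation request for a
# GPU-accelerated VM. The machine and accelerator type names are
# hypothetical placeholders, not real provider SKUs.
import json


def build_instance_request(name: str, zone: str, machine_type: str,
                           accelerator_type: str,
                           accelerator_count: int) -> dict:
    """Assemble a provider-style instance request payload."""
    return {
        "name": name,
        "zone": zone,
        "machineType": machine_type,
        "guestAccelerators": [
            {"acceleratorType": accelerator_type,
             "acceleratorCount": accelerator_count},
        ],
        # GPU instances generally cannot live-migrate, so scheduling
        # is commonly set to terminate on host maintenance.
        "scheduling": {"onHostMaintenance": "TERMINATE"},
    }


request = build_instance_request(
    name="agentic-inference-node-0",
    zone="us-central1-a",
    machine_type="vera-rubin-ultra",      # placeholder SKU
    accelerator_type="nvidia-rubin-gpu",  # placeholder SKU
    accelerator_count=8,
)
print(json.dumps(request, indent=2))
```

A request like this would then be submitted through the provider’s instance-creation API or an infrastructure-as-code tool; the payload shape loosely follows Compute Engine conventions but is provider-agnostic in spirit.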
Developers provisioning and managing these resources will interact with NVIDIA’s software ecosystem, chiefly the NVIDIA AI Enterprise software suite, which provides tools for streamlining the development and deployment of AI software. For instance, the NVIDIA AI Enterprise marketplace offer on Google Cloud includes a Virtual Machine Image (VMI) that provides a standardized runtime for the software, enabling deployment of GPU-accelerated containers and AI workloads. NVIDIA Run:ai adds resource management on top of Kubernetes clusters, letting administrators assign GPU, CPU, and memory quotas to projects and users so that hardware such as the Vera CPU and Rubin GPU is allocated and utilized efficiently. This fine-grained control over computational resources is crucial for complex AI projects, including those that use vector databases such as Milvus for similarity search and data retrieval in large-scale AI applications.
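To make the quota idea concrete, the sketch below builds standard Kubernetes `ResourceQuota` manifests per project namespace. The namespace names and quota values are illustrative, and a real Run:ai deployment would manage equivalent limits through its own projects abstraction rather than raw manifests; the extended-resource key `requests.nvidia.com/gpu` is the standard form for quota-limiting GPUs exposed by the NVIDIA device plugin.

```python
# Sketch: per-project GPU/CPU/memory quotas expressed as plain
# Kubernetes ResourceQuota manifests. Namespace names and values are
# illustrative; Run:ai manages equivalent quotas via its own project
# abstraction rather than raw manifests.
def make_resource_quota(namespace: str, gpus: int,
                        cpus: str, memory: str) -> dict:
    """Build a ResourceQuota manifest capping a project's resources."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota",
                     "namespace": namespace},
        "spec": {
            "hard": {
                # Extended-resource quota key used by the NVIDIA
                # device plugin's 'nvidia.com/gpu' resource.
                "requests.nvidia.com/gpu": str(gpus),
                "requests.cpu": cpus,
                "requests.memory": memory,
            }
        },
    }


# Example: split a cluster between a pretraining and an inference project.
quotas = [
    make_resource_quota("pretraining", gpus=64, cpus="512", memory="4Ti"),
    make_resource_quota("inference", gpus=16, cpus="128", memory="1Ti"),
]
for q in quotas:
    print(q["metadata"]["namespace"],
          q["spec"]["hard"]["requests.nvidia.com/gpu"])
```

Serialized to YAML and applied with `kubectl`, each manifest would cap the aggregate resource requests of all pods in its namespace, which is the mechanism quota-based schedulers build on.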