NVIDIA Rubin Platform Launches with 6 New Chips, Promising 10x Inference Cost Reduction Over Blackwell
Tags Infrastructure · AI

NVIDIA's Rubin platform comprises six new chips, all now in full production. NVIDIA says the platform delivers up to a 10x reduction in inference token cost and a 4x reduction in the number of GPUs needed to train mixture-of-experts models, both compared to the Blackwell platform. The Rubin GPU provides 50 petaflops of NVFP4 compute for AI inference, and the Vera Rubin NVL72 is the first rack-scale platform with NVIDIA Confidential Computing. Microsoft's Fairwater AI superfactories will deploy Vera Rubin NVL72 rack-scale systems. The platform will be available in the second half of 2026 from partners including AWS, Google Cloud, Microsoft, and OCI. If the claimed gains hold in deployment, Rubin could substantially lower the cost of both serving and training large AI models.