Rubin CPX GPU: NVIDIA’s Leap into Massive-Context AI

At the AI Infra Summit 2025, NVIDIA unveiled the Rubin CPX GPU, a processor purpose-built for massive-context AI inference. Designed for workloads involving millions of tokens, Rubin CPX targets software development, generative video, and long-context reasoning.

Rubin CPX is a specialized GPU within NVIDIA’s Rubin architecture, optimized for long-context AI tasks. NVIDIA describes it as the first CUDA GPU built specifically for inference workloads in which a model must reason across very long sequences of data, such as an entire codebase or an hour-long video.

Key Specifications:

  • Compute Power: Up to 30 petaflops using NVFP4 precision
  • Memory: 128GB of GDDR7 memory
  • Architecture: Monolithic die design for cost-efficiency and high density
  • Video Capabilities: Integrated hardware video encoders (NVENC) and decoders (NVDEC) for on-chip video processing

Rubin CPX operates within the Vera Rubin NVL144 CPX platform, which combines:

  • Rubin CPX GPUs
  • Standard Rubin GPUs
  • Vera CPUs

This hybrid system delivers:

  • 8 exaflops of AI compute
  • 100TB of fast memory
  • 1.7 petabytes/sec of bandwidth

Rubin CPX is tailored for industries pushing the boundaries of AI:

1. AI Coding Assistants

Rubin CPX enables tools to comprehend and optimize entire software repositories, transforming simple code generators into intelligent development partners.

2. Generative Video

An hour of video can take up to 1 million tokens to process. Rubin CPX combines video decoding, video encoding, and long-context inference on a single chip, which supports video search, editing, and generation workloads.

3. AI Economics

NVIDIA claims that Rubin CPX can generate up to $5 billion in token revenue for every $100 million invested, making it a cornerstone of future AI factories.
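The quoted ratio works out to a 50x revenue multiple. A minimal sketch of that arithmetic follows, assuming the ratio scales linearly with the amount invested; that linearity is an assumption made here for illustration, not something NVIDIA states, and the function name `claimed_token_revenue` is hypothetical.

```python
# NVIDIA's claimed ratio, taken from the paragraph above.
CLAIMED_REVENUE_USD = 5e9      # $5 billion in token revenue
REFERENCE_INVESTMENT_USD = 100e6  # per $100 million invested


def claimed_token_revenue(investment_usd: float) -> float:
    """Scale the claimed ratio linearly (an illustrative assumption)."""
    return investment_usd / REFERENCE_INVESTMENT_USD * CLAIMED_REVENUE_USD


if __name__ == "__main__":
    investment = 100e6
    revenue = claimed_token_revenue(investment)
    print(f"Revenue multiple: {revenue / investment:.0f}x")
```

This only restates the vendor's figure; it says nothing about operating costs, utilization, or token pricing, all of which would affect real returns.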

Rubin CPX plays a critical role in NVIDIA’s disaggregated serving model:

  • Context Phase: Handled by Rubin CPX for prefill and reasoning
  • Generation Phase: Managed by standard Rubin GPUs for output creation

This separation is reported to boost throughput by nearly 50%, improving both performance and cost efficiency.
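The two phases can be illustrated with a toy sketch. This is not NVIDIA's software stack or any real serving API: the `Request` class and the `context_phase`, `generation_phase`, and `serve` functions are hypothetical names, and the "model" simply emits placeholder tokens. The point is only to show the handoff, in which one worker builds the key-value (KV) cache from the whole prompt and a different worker extends it one token at a time.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    prompt_tokens: list[str]
    max_new_tokens: int
    kv_cache: list[str] = field(default_factory=list)  # stand-in for the real KV cache
    output: list[str] = field(default_factory=list)


def context_phase(req: Request) -> Request:
    """Prefill: process the entire prompt in one pass and build the cache.

    In a disaggregated setup this step would run on the context-optimized
    pool (Rubin CPX in the article's description).
    """
    req.kv_cache = list(req.prompt_tokens)
    return req


def generation_phase(req: Request) -> Request:
    """Decode: emit one token per step, appending each to the cache.

    In a disaggregated setup this step would run on the generation pool
    (standard Rubin GPUs in the article's description).
    """
    for step in range(req.max_new_tokens):
        token = f"tok{step}"  # placeholder for a sampled token
        req.output.append(token)
        req.kv_cache.append(token)
    return req


def serve(req: Request) -> Request:
    """Hand the request from the context worker to the generation worker."""
    return generation_phase(context_phase(req))


if __name__ == "__main__":
    done = serve(Request(prompt_tokens=["def", "main", "(", ")"], max_new_tokens=3))
    print(done.output)
    print(len(done.kv_cache))
```

In a real deployment the cache built during the context phase has to be transferred between devices, so the benefit of the split depends on that transfer being cheap relative to the work saved.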

AI companies including Cursor, Runway, and Magic are exploring how Rubin CPX could accelerate their applications:

  • Cursor: Accelerating code generation and developer insights
  • Runway: Empowering creators with cinematic-quality generative video
  • Magic: Enabling autonomous software agents with million-token context windows

Rubin CPX isn’t just another GPU. It’s a strategic pivot toward AI systems that think deeper, remember longer, and create with greater fidelity. If NVIDIA’s predictions hold true, Rubin CPX could be the catalyst that propels the company toward a $10 trillion valuation.