NVIDIA Announces Blackwell GPUs with 208 Billion Transistors, including GB200 System Supporting 72 Blackwell GPUs and 13.5 TB of HBM3e Memory

NVIDIA has officially launched its next-generation Blackwell platform, and with it comes a set of new and elaborate hardware for fueling the next stage in the AI craze, including some that the company says are powerful enough to enable “trillion-parameter-scale AI models.” These would include the GB200 NVL72, a new exascale computer that can deliver up to 1,440 PFLOPS and 3,240 TFLOPS of performance thanks in part to its 70+ Blackwell GPUs—new GPUs based on TSMC’s 4NP process that feature 208 billion transistors.

Blackwell announcements include:

GB200 Grace Blackwell Superchip
B200 and B100 Tensor Core GPUs
HGX B200 and HGX B100 boards
GB200 NVL72 rack-scale system
Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms

New renders of Blackwell-powered hardware:

NVIDIA on how the Blackwell platform comprises six technologies:

World’s Most Powerful Chip — Packed with 208 billion transistors, Blackwell-architecture GPUs are manufactured using a custom-built 4NP TSMC process with two-reticle limit GPU dies connected by 10 TB/second chip-to-chip link into a single, unified GPU.
Second-Generation Transformer Engine — Fueled by new micro-tensor scaling support and NVIDIA’s advanced dynamic range management algorithms integrated into NVIDIA TensorRT-LLM and NeMo Megatron frameworks, Blackwell will support double the compute and model sizes with new 4-bit floating point AI inference capabilities.
Fifth-Generation NVLink — To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest iteration of NVIDIA NVLink delivers groundbreaking 1.8TB/s bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
RAS Engine — Blackwell-powered GPUs include a dedicated engine for reliability, availability and serviceability. Additionally, the Blackwell architecture adds capabilities at the chip level to utilize AI-based preventative maintenance to run diagnostics and forecast reliability issues. This maximizes system uptime and improves resiliency for massive-scale AI deployments to run uninterrupted for weeks or even months at a time and to reduce operating costs.
Secure AI — Advanced confidential computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols, which are critical for privacy-sensitive industries like healthcare and financial services.
Decompression Engine — A dedicated decompression engine supports the latest formats, accelerating database queries to deliver the highest performance in data analytics and data science. In the coming years, data processing, on which companies spend tens of billions of dollars annually, will be increasingly GPU-accelerated.

HGX B200/B100 specs:

	HGX B200	HGX B100
GPUs	HGX B200 8-GPU	HGX B100 8-GPU
Form factor	8x NVIDIA B200 SXM	8x NVIDIA B100 SXM
HPC and AI compute (FP64/TF32/FP16/FP8/FP4)	40TF/18PF/36PF/72PF/144PF	30TF/14PF/28PF/56PF/112PF
Memory	Up to 1.4TB	Up to 1.4TB
NVIDIA NVLink	Fifth generation	Fifth generation
NVIDIA NVSwitch	Fourth generation	Fourth generation
NVSwitch GPU-to-GPU bandwidth	1.8TB/s	1.8TB/s
Total aggregate bandwidth	14.4TB/s	14.4TB/s

GB200 Series specs:

	GB200 NVL72	GB200 Grace Blackwell Superchip
Configuration	36 Grace CPU : 72 Blackwell GPUs	1 Grace CPU : 2 Blackwell GPU
FP4 Tensor Core	1,440 PFLOPS	40 PFLOPS
FP8/FP6 Tensor Core	720 PFLOPS	20 PFLOPS
INT8 Tensor Core	720 POPS	20 POPS
FP16/BF16 Tensor Core	360 PFLOPS	10 PFLOPS
TF32 Tensor Core	180 PFLOPS	5 PFLOPS
FP64 Tensor Core	3,240 TFLOPS	90 TFLOPS
GPU Memory \| Bandwidth	Up to 13.5 TB HBM3e \| 576 TB/s	Up to 384 GB HBM3e \| 16 TB/s
NVLink Bandwidth	130TB/s	3.6TB/s
CPU Core Count	2,592 Arm Neoverse V2 cores	72 Arm Neoverse V2 cores
CPU Memory \| Bandwidth	Up to 17 TB LPDDR5X \| Up to 18.4 TB/s	Up to 480GB LPDDR5X \| Up to 512 GB/s

NVIDIA on the GB200 NVL72:

It combines 36 Grace Blackwell Superchips, which include 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVLink. Additionally, GB200 NVL72 includes NVIDIA BlueField-3 data processing units to enable cloud network acceleration, composable storage, zero-trust security and GPU compute elasticity in hyperscale AI clouds. The GB200 NVL72 provides up to a 30x performance increase compared to the same number of NVIDIA H100 Tensor Core GPUs for LLM inference workloads, and reduces cost and energy consumption by up to 25x.

Jensen Huang, founder and CEO of NVIDIA said:

For three decades we’ve pursued accelerated computing, with the goal of enabling transformative breakthroughs like deep learning and AI. Generative AI is the defining technology of our time. Blackwell is the engine to power this new industrial revolution. Working with the most dynamic companies in the world, we will realize the promise of AI for every industry.

Source

Join the discussion in our forums...

NVIDIA Announces Blackwell GPUs with 208 Billion Transistors, including GB200 System Supporting 72 Blackwell GPUs and 13.5 TB of HBM3e Memory

Recent News

INDUSTRIA Is Free to Grab from the Epic Games Store until May 2nd

Amazon Prime Japan Has Announced That It Will Begin Streaming Godzilla Minus One on May 3rd

GIGABYTE BETA BIOS with Intel Baseline on Z790/B760 Motherboards Launched for ‘Superior Stability’

MSI Has Quietly Stepped Back From Producing AMD Graphics Cards and Says Its GPU Focus Is Now on GeForce RTX Models

Fallout 4 Next-Gen Update Quality Mode on Xbox Series X|S Confirmed as Nonfunctional as Other Console Players Report Missing Textures