NVIDIA GeForce RTX 4090 GPU Family Specifications

GPU Family Basics
GPU Manufacturer: NVIDIA
GPU Generation: RTX 40 Series
GPU Architecture Name: Ada Lovelace
GPU Die Code: AD102
Default TGP: 450 W
Launch Price: $1,599

GPU Family Memory
Default Memory Bandwidth: 1008 GB/s
Default Memory Size: 24 GB
Memory Bus Width: 384-bit
Memory Type: GDDR6X

GPU Family Chip Details
Transistors: 76.3 billion
Die Size: 608.5 mm²
CUDA Cores: 16,384
RT Cores: 128
ROPs: 176
TMUs: 512

GPU Family Clock Speeds
Default GPU Base Clock:
Default GPU Boost Clock: 2520 MHz
Default Memory Clock: 21 Gbps (effective data rate)
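
The memory figures in the table are related by simple arithmetic. The following is a minimal sketch that reproduces the listed bandwidth, assuming the memory clock value of 21 is the effective per-pin data rate in Gbps; the function name and parameters are illustrative, not part of any NVIDIA tool.

```python
def memory_bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s.

    Each bus pin moves `data_rate_gbps` gigabits per second, so the
    total is pins * rate, divided by 8 to convert bits to bytes.
    """
    return bus_width_bits * data_rate_gbps / 8


# Values from the spec table: 384-bit bus, 21 Gbps effective (assumed per pin).
bandwidth = memory_bandwidth_gb_s(bus_width_bits=384, data_rate_gbps=21)
print(f"{bandwidth:.0f} GB/s")  # 1008 GB/s
```

This is a theoretical peak; it says nothing about the bandwidth a given workload actually achieves.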

The NVIDIA Ada Lovelace architecture powers the GeForce RTX 40 Series graphics cards.  It all starts with the new manufacturing process: GeForce RTX 40 Series GPUs are built on TSMC’s 4N process, a node customized for NVIDIA.  This allows AD102, the GPU powering the GeForce RTX 4090, to pack 76.3 billion transistors into a die smaller than the previous generation’s.  The full AD102 includes 12 Graphics Processing Clusters (GPCs), 72 Texture Processing Clusters (TPCs), 144 Streaming Multiprocessors (SMs), 18,432 CUDA Cores, 144 RT Cores, 576 Tensor Cores, and 576 Texture Units.  Note that these are not the RTX 4090’s specs: it has one GPC and a few additional SMs disabled, leaving 128 of the 144 SMs and 16,384 CUDA Cores.  We’ll go over the RTX 4090 specifically on the next page.

Each GPC includes a dedicated raster engine, two raster operations (ROP) partitions with eight individual ROP units each, and six TPCs.  Each TPC includes one PolyMorph engine and two SMs.  Each SM contains 128 CUDA Cores, one third-generation RT Core, four fourth-generation Tensor Cores, four Texture Units, a 256KB register file, and 128KB of L1 cache.  Total L1 cache, L2 cache, and register file capacity have all increased in the Ada architecture.  With 128KB of L1 cache per SM, the full AD102 has 18,432KB of L1 cache in total, and its L2 cache has grown to 98,304KB, 16x the L2 capacity of GA102.
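
The totals quoted for the full AD102 follow directly from the per-GPC, per-TPC, and per-SM counts. As a rough sketch, the arithmetic can be checked like this; the constant and function names are made up for illustration, and the RTX 4090's SM count is inferred here from its 16,384 CUDA Cores rather than taken from an official figure.

```python
# Hierarchy and per-SM resources as described in the text.
GPCS = 12
TPCS_PER_GPC = 6
SMS_PER_TPC = 2
PER_SM = {
    "cuda_cores": 128,
    "rt_cores": 1,
    "tensor_cores": 4,
    "texture_units": 4,
    "l1_cache_kb": 128,
}


def totals_for_sms(sm_count: int) -> dict:
    """Scale every per-SM resource by the number of enabled SMs."""
    return {name: per_sm * sm_count for name, per_sm in PER_SM.items()}


# Full AD102: 12 GPCs x 6 TPCs x 2 SMs = 144 SMs.
full_sms = GPCS * TPCS_PER_GPC * SMS_PER_TPC
full_ad102 = totals_for_sms(full_sms)

# RTX 4090: infer the enabled SM count from its 16,384 CUDA Cores (assumption).
rtx4090_sms = 16384 // PER_SM["cuda_cores"]
rtx4090 = totals_for_sms(rtx4090_sms)

# The 16x L2 claim implies the GA102 L2 size in KB.
ga102_l2_kb = 98304 // 16

print(full_sms, full_ad102)
print(rtx4090_sms, rtx4090)
print(ga102_l2_kb)
```

For the cut-down RTX 4090, the same per-SM scaling gives 128 RT Cores and 512 Texture Units, which agrees with the spec table.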