Intel Xe3 GPU

This might get confusing, so let’s focus on this slide before we move on to the architecture. The integrated graphics in Intel Panther Lake’s GPU tile use the newer Xe3 architecture, compared to Xe2 in Intel Lunar Lake and Intel Battlemage discrete GPUs. However, Intel is going to keep the Xe3 architecture under the Intel Arc B-Series branding. What this means is that Intel’s Xe architecture generations exist somewhat independently of the Arc graphics branding.
As you can see, the first Xe architecture is found in the Intel Arc A-Series and Intel Arc Graphics, in both the discrete Alchemist GPU and the integrated graphics of Meteor Lake and Arrow Lake CPUs. Intel Lunar Lake, while using the newer Xe2 architecture, remained part of the Intel Arc Graphics family naming. Xe3 will exist within the Intel Arc B-Series branding alongside the Battlemage discrete GPU, and apparently Panther Lake is the only place we will see Xe3. There will be a ‘next generation’ architecture called Xe3P, which will be the next Intel Arc family, or what you might hear called Celestial. Just keep in mind that Xe3 is a newer architecture, even if it is under the Intel Arc B-Series branding.

The Intel Xe3 architecture is an evolution of the Intel Xe2 architecture, with optimizations and enhancements; in essence, it is Xe2 scaled up. Quite literally, Xe3 scales up the render slice by adding more cores per slice. In the Xe2 architecture, a render slice had 4 Xe cores and 4 ray tracing units. The Xe3 render slice can go up to 6 Xe cores and 6 ray tracing units. The specifications scale upwards as you’d expect. Going from a 4 Xe core configuration to the full 12 Xe core configuration, we move from 32 XMX engines up to 96 XMX engines, 4MB of L2 cache up to 16MB, 1 geometry pipeline up to 2, 4 samplers up to 12, 4 RTUs up to 12, and 2 pixel backends up to 4. The full 12 Xe3 core version is what we see in the 16-core CPU configuration.
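The scaling figures above are simple multiplication, which a short sketch can make explicit. This is only an illustration: the 8 XMX engines per Xe core is the figure quoted for the 3rd gen Xe core, while the per-core and per-slice ratios, the two-slice layout of the 12-core part, and all the names in the code are assumptions inferred from the totals, not something Intel has stated in this form.

```python
from dataclasses import dataclass

# Per-core and per-slice ratios. 8 XMX engines per Xe core is the quoted
# figure; the other ratios are assumptions inferred from the quoted totals.
XMX_PER_CORE = 8
SAMPLERS_PER_CORE = 1
RTUS_PER_CORE = 1
GEOMETRY_PIPELINES_PER_SLICE = 1
PIXEL_BACKENDS_PER_SLICE = 2


@dataclass(frozen=True)
class GpuConfig:
    """A hypothetical description of a GPU built from render slices."""
    name: str
    render_slices: int
    cores_per_slice: int

    def totals(self) -> dict:
        cores = self.render_slices * self.cores_per_slice
        return {
            "xe_cores": cores,
            "xmx_engines": cores * XMX_PER_CORE,
            "samplers": cores * SAMPLERS_PER_CORE,
            "rtus": cores * RTUS_PER_CORE,
            "geometry_pipelines": self.render_slices * GEOMETRY_PIPELINES_PER_SLICE,
            "pixel_backends": self.render_slices * PIXEL_BACKENDS_PER_SLICE,
        }


# Assumed layouts: one 4-core slice versus two 6-core slices.
small = GpuConfig("4 Xe cores", render_slices=1, cores_per_slice=4)
large = GpuConfig("12 Xe cores", render_slices=2, cores_per_slice=6)

for cfg in (small, large):
    print(cfg.name, cfg.totals())
```

The L2 cache is left out on purpose, since the jump from 4MB to 16MB does not follow the core count.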
The 3rd gen Xe core has eight 512-bit vector engines, eight 2048-bit XMX engines, and an increase in shared L2 cache. The Xe vector engine improves utilization with up to 25% more threads and, importantly, variable register allocation, and it adds FP8 support. The Xe XMX engines are capable of up to 120 TOPS. The ray tracing unit also gets some work, with dynamic ray management for async ray tracing. There is a new URB manager in play, along with up to 2x the anisotropic filtering and stencil test rates. The vertex and pixel rates stay the same, but many microarchitecture features have increased performance. The end result is either 50% greater performance than Lunar Lake at the same power, or 40% better performance per watt than Arrow Lake-H.
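To see why variable register allocation can raise thread counts, here is a rough occupancy sketch. Every number in it is hypothetical: the register file size, the fixed allocation modes, the allocation granule, and the thread caps are made up to illustrate the general idea that fewer wasted registers per thread leaves room for more resident threads. It is not a description of how Xe3 actually allocates registers.

```python
# All numbers below are hypothetical, chosen only to illustrate the idea.
REGISTER_FILE_REGS = 1024   # assumed registers per vector engine
FIXED_MODES = (128, 256)    # assumed fixed per-thread allocation sizes
GRANULE = 32                # assumed step size for variable allocation


def fixed_allocation(needed: int) -> int:
    """Round a kernel's register need up to the smallest fixed mode that fits."""
    for mode in sorted(FIXED_MODES):
        if needed <= mode:
            return mode
    raise ValueError("kernel needs more registers than any fixed mode offers")


def variable_allocation(needed: int) -> int:
    """Round a kernel's register need up to the next multiple of GRANULE."""
    return -(-needed // GRANULE) * GRANULE


def resident_threads(regs_per_thread: int, thread_cap: int) -> int:
    """Threads that fit in the register file, limited by a hardware thread cap."""
    return min(thread_cap, REGISTER_FILE_REGS // regs_per_thread)


for needed in (96, 160):
    fixed = resident_threads(fixed_allocation(needed), thread_cap=8)
    variable = resident_threads(variable_allocation(needed), thread_cap=10)
    print(f"{needed} registers: fixed={fixed} threads, variable={variable} threads")
```

With these made-up numbers, a kernel needing 96 registers goes from 8 to 10 resident threads, which happens to line up with the “up to 25% more threads” figure, though the real hardware limits may well differ.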

Intel has also been working on improving its software stack. The entire compiler has been updated to support the new variable register allocation and a faster scheduler with direct preemption. Intel is also supporting DirectX’s new Cooperative Vectors feature as well as Neural Radiance Fields.
NPU 5

We won’t go into detail here, but Panther Lake does introduce Intel’s new NPU 5, with new features and improved performance. Importantly, it now supports INT8 and FP8. You can check out the slides above if interested.
Gaming Performance

In these slides, Intel is showing how some of the scalable updates in the Intel Xe3 architecture reduce per-frame render time, measured in milliseconds. For example, key features like variable register allocation have produced sizeable improvements in render times, and increasing the L2 cache to 16MB makes a difference.

Not exactly a trend we are excited about, but Frame Generation and more AI hybrid rendering techniques are being pushed. Like it or not (and we don’t), Frame Generation is here to stay, for now. Intel is very bold, stating, “With XeSS All Pixels on Screen Are Generated.” That is about as bold as it gets, and we can see where Intel is headed with its technologies.
Intel is introducing XeSS-MFG for Multi-Frame Generation. This means you can now have your GPU insert up to three AI-generated frames between each pair of rendered frames, increasing ‘smoothness’ by 4x. Note that this does not make a 30FPS game feel like a 120FPS game; you still need a high base framerate to get a good experience. A frame-generated 120FPS will still ‘feel’ like 30FPS gameplay if your base framerate is 30FPS. Frame Generation is not a performance feature; it’s a smoothing feature, so make sure you are clear on that, and don’t fall for the marketing speak about it. Intel’s technology does use Optical Flow Reprojection, so motion vectors are calculated, making it more accurate.
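The arithmetic behind that warning can be sketched in a few lines. This is a simplified model, not Intel’s implementation: the function name and its fields are hypothetical, and it assumes the game only responds to input on rendered frames while ignoring any extra latency that frame generation itself adds.

```python
def frame_generation(base_fps: float, generated_per_rendered: int) -> dict:
    """Model displayed smoothness versus responsiveness under frame generation.

    Simplifying assumption: input is only reflected on rendered frames, so
    responsiveness follows the base framerate, not the displayed one.
    """
    displayed_fps = base_fps * (1 + generated_per_rendered)
    return {
        "displayed_fps": displayed_fps,
        "displayed_frame_time_ms": 1000.0 / displayed_fps,
        "rendered_frame_time_ms": 1000.0 / base_fps,
    }


# A 30FPS base with three generated frames per rendered frame.
result = frame_generation(base_fps=30, generated_per_rendered=3)
print(f"Displayed: {result['displayed_fps']:.0f} FPS "
      f"({result['displayed_frame_time_ms']:.2f} ms per displayed frame)")
print(f"Rendered frame interval: {result['rendered_frame_time_ms']:.2f} ms")
```

In this model, a 30FPS base displays 120 frames per second, but a newly rendered frame still arrives only about every 33 ms, which is why the game keeps the feel of 30FPS.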

Intel is very committed to reducing stutter and loading times in games. With a new precompiled shader distribution system, Intel can lower launch times and reduce stuttering on first game launch. Shaders are compiled in the cloud, and your game downloads the precompiled shaders, saving you the time of compiling them at game launch.
Intel is also optimizing its Intelligent Bias Control v2, which helps balance power and frequency based on load. Intel is claiming improvements to 0.1% and 1% lows. It biases scheduling toward the E-Cores first and smooths out power distribution, with fewer power spikes overall on Panther Lake.
