AMD Vega GPU architectural analysis
Published: 6th January 2017 | Source: AMD Radeon Technologies Group
Next Generation Geometry and Compute Engine Design
With Vega, AMD has not limited their architectural changes to just their memory, announcing several improvements to both their geometry pipeline and their compute units.
Vega now supports primitive shaders, a new type of low-level shader in which developers specify all of the shading stages that they want, allowing them to run at a higher rate than shaders that follow the traditional DirectX shader pipeline model. In an ideal scenario, developers will perform these optimisations themselves, though AMD can also use their graphics driver to deliver pre-defined cases for specific games, where several DirectX shaders are replaced by a single primitive shader for improved performance.
AMD has also improved their geometry pipeline to deliver much higher peak geometry throughput. AMD's R9 Fury X offers 4 geometry engines with a peak throughput of 4 polygons per clock, while Vega comes with 4 geometry engines that can handle up to 11 polygons per clock, a quoted 2.6x increase (the raw per-clock ratio, 11/4, works out to 2.75x).
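The per-clock comparison above is simple arithmetic, and a minimal sketch makes it easy to check. The polygons-per-clock figures come from the text; the clock speed is a hypothetical placeholder, used only to show how a per-clock figure turns into a per-second one, and is not a Vega or Fury X specification.

```python
# Peak geometry throughput figures quoted in the article.
FURY_X_POLYS_PER_CLOCK = 4
VEGA_POLYS_PER_CLOCK = 11

# Hypothetical clock speed, for illustration only (not an AMD specification).
ASSUMED_CLOCK_HZ = 1_000_000_000


def per_clock_speedup(old_polys_per_clock, new_polys_per_clock):
    """Ratio of new to old peak polygons per clock."""
    return new_polys_per_clock / old_polys_per_clock


def peak_polygons_per_second(polys_per_clock, clock_hz):
    """Peak rate if every clock cycle delivered the maximum polygon count."""
    return polys_per_clock * clock_hz


speedup = per_clock_speedup(FURY_X_POLYS_PER_CLOCK, VEGA_POLYS_PER_CLOCK)
vega_peak = peak_polygons_per_second(VEGA_POLYS_PER_CLOCK, ASSUMED_CLOCK_HZ)
print(f"Per-clock speedup: {speedup}x")
print(f"Peak polygons per second at the assumed clock: {vega_peak:,}")
```

These are peak figures, so real geometry workloads would be expected to land below them.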
With their Next Generation Compute Units (NCU), AMD seeks to improve gaming performance by offering higher levels of mixed precision compute performance, greater 16-bit and 8-bit compute performance and higher clock speeds.
On the compute side, AMD's NCU will also be capable of calculating 8-bit, 16-bit and 32-bit operations with perfect scaling, meaning that halving the operand width doubles the number of operations completed per clock. This is ideal for workloads that need more throughput at lower precision. It also opens up the option of running operations at several of these precision levels at the same time, giving the NCU mixed precision compute capabilities.
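As a rough illustration of what "perfect scaling" implies, the sketch below assumes that throughput is inversely proportional to operand width relative to a 32-bit baseline. That reading, the function names and the baseline figure of 128 operations per clock are all assumptions made for the example, not values taken from this article.

```python
BASELINE_BITS = 32  # widest operand size discussed in the text


def relative_rate(operand_bits):
    """Throughput multiplier versus 32-bit, assuming perfect scaling."""
    if operand_bits not in (8, 16, 32):
        raise ValueError("only 8-, 16- and 32-bit operands are modelled")
    return BASELINE_BITS // operand_bits


def ops_per_clock(base_32bit_ops_per_clock, operand_bits):
    """Operations per clock at a given precision under the same assumption."""
    return base_32bit_ops_per_clock * relative_rate(operand_bits)


# Hypothetical 32-bit baseline, for illustration only.
ASSUMED_32BIT_OPS_PER_CLOCK = 128

for bits in (32, 16, 8):
    rate = ops_per_clock(ASSUMED_32BIT_OPS_PER_CLOCK, bits)
    print(f"{bits}-bit: {rate} ops per clock")
```

Under this model a workload that can tolerate 16-bit precision would do twice the work per clock of its 32-bit equivalent, and an 8-bit one four times as much.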
With a new feature called "rapid packed math", AMD can now pack 16-bit and 8-bit math together to increase the number of compute operations that the GPU can complete per clock. In theory, developers could optimise some of their code to run as 16-bit or 8-bit operations to increase GPU performance, which is something that developers are currently exploring on the PS4 Pro.
Lastly, AMD's new compute unit design will be better suited to running at higher clock speeds, which will allow AMD to deliver more performance both by completing more clock cycles per second and by increasing the work done per clock.
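The general idea behind packed math can be sketched in software: two 16-bit values share one 32-bit word, and a single packed operation updates both lanes at once. The example below is a conceptual model using integer lanes with hypothetical helper names. It is not AMD's hardware design or instruction set, and real packed math on a GPU also covers floating-point formats.

```python
LANE_BITS = 16
LANE_MASK = 0xFFFF  # keeps a value within one 16-bit lane


def pack2x16(lo, hi):
    """Place two 16-bit values into the low and high halves of a 32-bit word."""
    return ((hi & LANE_MASK) << LANE_BITS) | (lo & LANE_MASK)


def unpack2x16(word):
    """Return the (low, high) 16-bit lanes of a 32-bit word."""
    return word & LANE_MASK, (word >> LANE_BITS) & LANE_MASK


def packed_add16(a, b):
    """Add both lanes independently, wrapping each lane on overflow.

    Masking each lane stops a carry out of the low lane from
    leaking into the high lane.
    """
    lo = ((a & LANE_MASK) + (b & LANE_MASK)) & LANE_MASK
    hi = (((a >> LANE_BITS) & LANE_MASK) + ((b >> LANE_BITS) & LANE_MASK)) & LANE_MASK
    return (hi << LANE_BITS) | lo


x = pack2x16(1, 2)
y = pack2x16(10, 20)
print(unpack2x16(packed_add16(x, y)))
```

One call to `packed_add16` here stands in for a single packed instruction: it produces two 16-bit results from one 32-bit-wide operation, which is where the per-clock gain comes from.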