ATI Radeon R600 Page: 1
http://www.overclock3d.net/gfx/articles/2007/05/15150008899l.jpg

ATI Radeon R600

The long-awaited and hugely anticipated ATI card is finally here. We have been given access to information and media documenting ATI's first DirectX 10 graphics card. The technical specifications look promising, with some features never before seen on a GPU, making these cards innovative and one of a kind.

This article is designed to run through a little bit of the technical information without confusing the hell out of you. It is a short and sweet introduction to R600 and its architecture, and is meant to cover it in an understandable way.

The HD 2900 Series has a massive transistor count of 700 million, slightly more than the 681 million of nVidia's G80. The R600 die is manufactured on an 80nm process, slightly smaller than the 90nm process used for G80. There are already plans to possibly reduce this to 65nm, which would mean lower power consumption and, more importantly for the overclockers, lower temperatures.


Interesting Extra Features

Before we launch into the technical jargon, let's take a look at the extra features that ATI have implemented on the 2x00 cards: each card in the 2x00 series has both HDMI connectivity and 5.1 audio.

A lot of people have been questioning the 5.1 audio, but it really makes sense if you want to build an HTPC, as it makes the card an all-in-one solution.


Clock Speed, numbers and more

Let's take a brief look at the actual clock speeds of the 2x00 family. ATI have entered with three initial SKUs, giving quite a breadth to their releases. The 2900XT is the card that tops the range, with the 2400 and 2600 filling in the gaps. I'm sure we will see more from AMD/ATI in the future, however.

http://www.overclock3d.net/gfx/articles/2007/05/15182846450s.jpg

Now some tech specs in an easy to read format:


Technical Specification


R600 GPUs

GPU        Availability  Core   Die   Clocks (MHz) Core/Mem  Texture Units  SPUs/Shaders  Memory  Memory Interface  Memory Type
HD 2900XT  May 2007      R600   80nm  750 / 1650             16             320 / 64      512 MB  512-bit           GDDR3
HD 2600XT  May 2007      RV630  65nm  800 / 2200             8              120 / 24      256 MB  128-bit           GDDR4
HD 2600    May 2007      RV630  65nm  -                      8              120 / 24      -       128-bit           GDDR3
HD 2400XT  May 2007      RV610  65nm  700 / 1600             4              40 / 8        256 MB  64-bit            GDDR3
HD 2400    May 2007      RV610  65nm  -                      4              40 / 8        -       64-bit            DDR2

A dash marks a value that was not listed.

I'm sure a lot of people have already read articles about these cards as well as seen the current benchmarks. I think we will see benchmark results and general performance improve as new drivers are released.




Value and energy efficiency seem quite prominent in ATI's design and marketing of the 2400 and 2600 cards, which require only 25W and 45W respectively. The built-in audio also makes them an all-in-one solution for surround sound gaming and multimedia. ATI are well behind nVidia in terms of release date, but the wait may be worthwhile.

There are a few physical differences with the card. Most noticeably, there is an 8-pin power header (you can use 2x 6-pin connectors if your PSU does not have the new 8-pin connector). The cooler is quite similar to previous models, with the same heat exhaust style system, only with the added improvement of heatpipes. Another, less prominent difference is that there actually seem to be 2 fan headers on the card.
http://www.overclock3d.net/gfx/articles/2007/05/15192612621l.jpg

The ATI Radeon™ HD 2900 unleashes the awesome power of DirectX® 10 for an immersive HD gaming experience

  • Powers today’s titles and introduces new technology for DirectX® 10 games
  • 320 unified stream processors
  • 512-bit memory
  • HDMI and 5.1 surround audio
  • CrossFire™ scalability
  • Delivers powerful graphics performance, improved stability, and an immersive HD gaming experience for Windows Vista™
  • Shader Model 4.0 Support
  • Up to 24x Custom Filter Anti-Aliasing (CFAA) for improved quality
  • Native CrossFire™ upgradeability makes it easy to scale up graphics performance
  • 700 million transistors

The 2900XT really looks like a force to be reckoned with. I guess we will see in good time. Stay tuned, as I'm sure we will have more info available very soon.


Technical Aspects

Now onto the technical bits that make up the R600. As I said earlier, this is going to be a brief run-through and not a university dissertation on R600.

Unified Shader Architecture

The R600's design is similar to the Xbox 360's GPU, as it is based on the Unified Superscalar Shader Architecture.

The Unified Shader Architecture, coupled with Microsoft's new DirectX 10 API, allows the shaders to be 'unified'.
What this means is that instead of having a fixed number of shaders assigned to certain tasks (e.g. X1900: 8 vertex shader processors and 48 pixel shader processors), they can be assigned depending on the job at hand, meaning all of the shaders can be kept in use.

As a result, the GPU is more fully utilised and performance is enhanced, with gains of up to 25%.
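
To make the fixed versus unified comparison concrete, here is a minimal Python sketch. It assumes an idealised scheduler, made-up workload numbers and a hypothetical unified pool of 56 units (simply 8 + 48, to keep the comparison like for like). The function names are illustrative and none of this is ATI's actual scheduling logic.

```python
# Illustrative model only: the 8/48 split is the X1900 example from the text,
# the workloads are invented, and scheduling overhead is ignored.

def fixed_frame_cycles(vertex_work, pixel_work, vertex_units=8, pixel_units=48):
    """Cycles per frame when each pool can only run its own shader type."""
    vertex_cycles = vertex_work / vertex_units
    pixel_cycles = pixel_work / pixel_units
    # The frame is finished when the slower pool finishes; the other pool idles.
    return max(vertex_cycles, pixel_cycles)


def unified_frame_cycles(vertex_work, pixel_work, units=56):
    """Cycles per frame when any unit can take any job (ideal load balancing)."""
    return (vertex_work + pixel_work) / units


balanced = (80, 480)         # matches the 8:48 split exactly
geometry_heavy = (400, 480)  # far more vertex work than the fixed split expects

for name, (v, p) in [("balanced", balanced), ("geometry heavy", geometry_heavy)]:
    print(name, fixed_frame_cycles(v, p), round(unified_frame_cycles(v, p), 2))
```

In this toy model the two designs tie when the workload happens to match the fixed split, and the unified pool pulls ahead as soon as the mix shifts, because no units are left waiting for work of the "wrong" type.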

http://www.overclock3d.net/gfx/articles/2007/05/15182846450s.jpg

Unified Superscalar Shader Architecture

  • 320 stream processing units
    • Dynamic load balancing and resource allocation for vertex, geometry, and pixel shaders
    • Common instruction set and texture unit access supported for all types of shaders
    • Dedicated branch execution units and texture address processors
  • 128-bit floating point precision for all operations
  • Command processor for reduced CPU overhead
  • Shader instruction and constant caches
  • Up to 80 texture fetches per clock cycle
  • Up to 128 textures per pixel
  • Fully associative multi-level texture cache design
  • DXTC and 3Dc+ texture compression
  • High resolution texture support (up to 8192 x 8192)
  • Fully associative texture Z/stencil cache designs
  • Double-sided hierarchical Z/stencil buffer
  • Early Z test, Re-Z, Z Range optimization, and Fast Z Clear
  • Lossless Z & stencil compression (up to 128:1)
  • Lossless color compression (up to 8:1)
  • 8 render targets (MRTs) with anti-aliasing support
  • Physics processing support

It goes without saying that the 2X00 series supports DX10 fully, with full SM 4.0 support. This is ATI's first card supporting the new Microsoft API, and they have followed their Xenos Xbox 360 chip in making R600. That said, R600 takes Xenos that bit further with a completely unified shader architecture, something nVidia have also done with G80.

The R600 consists of 320 independent stream processing units, made up of 64 superscalar shader processors, each a five-way unit. These all support 32-bit floating point (FP32) precision in mathematical operations. Added in are 16 texture units (TMUs) and 16 ROPs (render backends). The 2900 XT has a 512-bit memory interface on a ring-bus architecture, carried forward from R580, which can break through 100GB/sec; the same as the G80 can, but at a more interesting price-point.

ATI have also added a "programmable tessellation unit". This is notable because the current DX10 specification does not include one, although Microsoft is said to have plans to include it at a later stage.

Shader Units

ATI have gone a different route to nVidia as far as their shader units are concerned. The difference is not so much in the actual number (though they do differ) as in the way each stream processor is laid out. R600 uses a 5-way superscalar shader processor, with 5 parts and up to 5 instructions per clock. These sit in clusters of 16 shaders (80 stream processing units per cluster), with a branch execution unit added to them. There is also a 64KB memory read/write cache which can be accessed by any shader cluster. This means that under the DX10 architecture you can avoid going through the render backend and write straight to memory.
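
The numbers in the paragraph above fit together as simple arithmetic, sketched below in Python. The first half uses only figures from the text. The `lane_utilisation` helper is a hypothetical, heavily simplified model (an assumption for illustration, not how the hardware or its compiler works) of why a 5-way unit only reaches its peak when five independent operations are available each clock.

```python
import math

LANES_PER_SHADER = 5        # each shader processor is a 5-way unit
SHADERS_PER_CLUSTER = 16    # shader processors per cluster
TOTAL_STREAM_UNITS = 320    # headline figure for the 2900XT

units_per_cluster = LANES_PER_SHADER * SHADERS_PER_CLUSTER   # 80
clusters = TOTAL_STREAM_UNITS // units_per_cluster           # 4
shader_processors = clusters * SHADERS_PER_CLUSTER           # 64


def lane_utilisation(independent_op_groups):
    """Share of lanes doing useful work for a hypothetical instruction stream.

    Each entry is the number of scalar operations that do not depend on each
    other and could issue together. Assumption: a group never shares a clock
    with the next group.
    """
    clocks = sum(math.ceil(n / LANES_PER_SHADER) for n in independent_op_groups)
    ops = sum(independent_op_groups)
    return ops / (clocks * LANES_PER_SHADER)


print(units_per_cluster, clusters, shader_processors)
print(lane_utilisation([5, 5]))        # every lane busy on every clock
print(lane_utilisation([4, 1, 5, 3]))  # some lanes idle on most clocks
```

This is one reason the raw stream processor count is hard to compare directly with nVidia's figure: how many of the five lanes are busy depends on the shader code being run.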

[Image: R600 shader]

So that's the big difference in architecture of the stream processing units as I see it. ATI have added a lot of additional features, but as this is a brief article, I will leave you to click on the links I have provided below for a more in-depth analysis of the architecture. Now onto what some of this means.



Custom Filter Anti-Aliasing

The R600 range has a new type of anti-aliasing called Custom Filter Anti-Aliasing (CFAA), featuring edge detect, narrow tent and wide tent filters to reduce texture shimmering, along with higher compression ratios for Multi-Sample Anti-Aliasing (MSAA).
CFAA differs from standard MSAA in that, instead of dividing the screen into a grid of boxes and taking samples only from inside each box, it uses an array of larger circular areas, which can overlap and give better coverage.

CFAA supports programmable sample patterns and filters. It is also upgradable through driver updates, which is one of the reasons why we could see some improvements in performance with each driver release.
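
The difference between a box resolve and a tent resolve can be sketched in a few lines of Python. This is a generic textbook illustration, not ATI's implementation: the sample positions, the filter radius and the function names are all assumptions made for the example.

```python
import math


def box_weight(dx, dy, half_width=0.5):
    """Box filter: samples inside the pixel count equally, others not at all."""
    return 1.0 if abs(dx) <= half_width and abs(dy) <= half_width else 0.0


def tent_weight(dx, dy, radius):
    """Tent filter: weight falls off linearly with distance from the pixel centre."""
    distance = math.hypot(dx, dy)
    return max(0.0, 1.0 - distance / radius)


def resolve(samples, weight_fn):
    """Weighted average of (dx, dy, value) samples, offsets in pixel units."""
    total_weight = sum(weight_fn(dx, dy) for dx, dy, _ in samples)
    weighted_sum = sum(weight_fn(dx, dy) * value for dx, dy, value in samples)
    return weighted_sum / total_weight


# Two white samples inside the pixel, two black samples in neighbouring pixels.
samples = [(-0.25, -0.25, 1.0), (0.25, 0.25, 1.0),
           (0.75, 0.25, 0.0), (-0.75, -0.25, 0.0)]

box_result = resolve(samples, box_weight)
tent_result = resolve(samples, lambda dx, dy: tent_weight(dx, dy, radius=1.0))
print(box_result, round(tent_result, 3))
```

The box resolve ignores the neighbouring samples entirely, while the tent resolve blends them in with a lower weight. A wider radius pulls in more neighbours, which smooths edges further at the cost of some sharpness.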

http://www.overclock3d.net/gfx/articles/2007/05/15182846450s.jpg

It's worth noting that ATI's CFAA implementation has quite a few good points (such as speeding up per-pixel samples on limited memory), but the technique could actually blur the image rather than sharpen it and reduce jaggies. nVidia's implementation shows its strengths here, with a more precise approach in its QAA.

Tessellation

This is another advance in 3D hardware that was developed on the Xbox 360's GPU. Tessellation makes life easier for developers by providing a means of hardware smoothing. Developers can supply a model which might look coarse and not very detailed, and the GPU will then do most of the work, so the end product looks much more detailed and life-like. This also works for terrain and helps in animation, converting primitive data into something beautiful.
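
As a rough picture of what "the GPU doing the work" means, the Python sketch below splits each triangle of a coarse mesh into four smaller ones. It is a generic midpoint subdivision chosen for illustration, not the R600 tessellator's actual algorithm, and the function names are hypothetical. A real tessellator would also move the new vertices onto a smooth surface or displacement map, which is where the extra detail comes from.

```python
def midpoint(a, b):
    """Point halfway between two vertices given as coordinate tuples."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def subdivide(triangle):
    """Split one triangle into four by connecting its edge midpoints."""
    a, b, c = triangle
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]


def tessellate(triangles, levels):
    """Apply the four-way split `levels` times to every triangle."""
    for _ in range(levels):
        triangles = [small for tri in triangles for small in subdivide(tri)]
    return triangles


coarse = [((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))]
fine = tessellate(coarse, levels=3)
print(len(coarse), "triangle in,", len(fine), "triangles out")
```

The point of doing this in hardware is that the developer only has to store and send the coarse mesh; the triangle count grows fourfold per level on the GPU itself.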


http://www.overclock3d.net/gfx/articles/2007/05/15182846450s.jpg


512-bit memory interface

As mentioned earlier, the Radeon 2900XT has the world's first 512-bit memory interface, giving it a huge bandwidth boost. Compared with the X1900XT, for example, the 2900XT has over 2x the bandwidth (X1900XT: 46.4GB/second).
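
The bandwidth figures follow from bus width multiplied by memory transfer rate, as this small Python sketch shows. It assumes the 1650MHz memory clock listed for the HD 2900XT in the specification table is the effective (data) rate, and it assumes a 256-bit bus at 1450MHz effective for the X1900XT. That last pair is not stated in this article; it is used here because it reproduces the 46.4GB/second quoted above.

```python
def bandwidth_gb_per_s(bus_width_bits, effective_mem_mhz):
    """Peak memory bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    # bytes per transfer * million transfers per second = MB/s; /1000 gives GB/s
    return bytes_per_transfer * effective_mem_mhz / 1000


hd2900xt = bandwidth_gb_per_s(512, 1650)
x1900xt = bandwidth_gb_per_s(256, 1450)   # assumed X1900XT figures, see above

print(hd2900xt, x1900xt, round(hd2900xt / x1900xt, 2))
```

Under these assumptions the 2900XT comes out at a little over 100GB/second and roughly 2.3 times the X1900XT, which is consistent with the "over 2x" claim.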

The ATI Radeon™ HD 2900 Memory Controller
  • World's first 512-bit memory interface
  • Over 100GB/sec memory bandwidth
  • 512-bit 8-channel GDDR3/4 memory interface
  • Ring Bus Memory Controller
    • Fully distributed design with 1024-bit internal ring bus for memory reads and writes
    • Optimized for high performance HDR (High Dynamic Range) rendering at high display resolutions

http://www.overclock3d.net/gfx/articles/2007/05/15182846450s.jpg

ATI have talked about the ring-bus architecture before, but this time they've given their card 512 bits to play with, a world's first. The ring-bus is a bi-directional approach to memory architecture, meaning data gets to where it needs to go by the shortest route, which makes it highly efficient. It's all about bandwidth here, and that's what GPUs really benefit from.
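
The "shortest route" benefit of a bi-directional ring is easy to show with a toy model in Python. The ring of 8 stops used below is a hypothetical number picked for the example; the article does not say how many stops the R600 ring has, and the function names are illustrative.

```python
def one_way_hops(src, dst, stops):
    """Hops needed if data can only travel in one direction around the ring."""
    return (dst - src) % stops


def ring_hops(src, dst, stops):
    """Hops needed on a bi-directional ring: take whichever way is shorter."""
    clockwise = (dst - src) % stops
    return min(clockwise, stops - clockwise)


def average_hops(hop_fn, stops):
    """Average hop count over every pair of distinct stops."""
    pairs = [(s, d) for s in range(stops) for d in range(stops) if s != d]
    return sum(hop_fn(s, d, stops) for s, d in pairs) / len(pairs)


STOPS = 8  # hypothetical ring size for illustration
print(one_way_hops(0, 7, STOPS), ring_hops(0, 7, STOPS))
print(average_hops(one_way_hops, STOPS), round(average_hops(ring_hops, STOPS), 2))
```

Going from stop 0 to stop 7 takes seven hops one way round but only one hop the other way, and across all pairs the bi-directional ring cuts the average trip by over 40% in this example.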




HD - the buzzword

Nowadays it's all about High Definition and getting the best possible image from your hardware. ATI have upgraded the HD performance of their latest cards, and HDCP over two monitors is now supported (something the 8800 series doesn't do). As mentioned before, an HDMI converter is included with the 2900 XT. ATI have gone with HDMI version 1.2, which does not include Dolby Digital Plus, DTS-HD Master Audio or Dolby TrueHD. However, if you're an HTPC enthusiast you are going to have a decent discrete APU (sound card to you and me - Ed) for your audio, so I see the 5.1 sound card as a pretty neat, if slightly useless, addition to the 2X00 series.


AVIVO


The new AMD Theater 200 chip gives the R600 cards a great boost. Not only does it support 5.1 sound (stereo in previous versions), but it is also a UVD (Universal Video Decoder) and an AVP (Advanced Video Processor).
UVD provides direct hardware decoding of the HD DVD and Blu-ray standards. It also provides extra features for better picture quality when watching videos: frequency transform, pixel prediction and deblocking, and bitstream processing/entropy decode.

Because the video and sound are not separated, the card conforms to the protected content output path, which also makes it compliant with Windows Vista Premium Logo requirements.
The audio controller supports 32kHz, 44.1kHz and 48kHz 16-bit PCM stereo, and also supports AC3 compressed multi-channel audio (allowing Dolby Digital and DTS).
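
For a sense of scale, the data rate of each uncompressed PCM stereo format listed above is just sample rate times bit depth times channel count. The short Python sketch below works that out; the function name is illustrative and the defaults simply mirror the 16-bit stereo figures in the text.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=2):
    """Uncompressed PCM data rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


for rate in (32_000, 44_100, 48_000):
    print(rate, "Hz ->", pcm_bitrate_kbps(rate), "kbit/s")
```

Even the highest of these rates is tiny next to the video stream, so carrying audio over the same HDMI link costs very little bandwidth.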

ATI Avivo™ HD Video and Display Platform
  • Two independent display controllers
    • Drive two displays simultaneously with independent resolutions, refresh rates, color controls and video overlays for each display
    • Full 30-bit display processing
    • Programmable piecewise linear gamma correction, color correction, and color space conversion
    • Spatial/temporal dithering provides 30-bit color quality on 24-bit and 18-bit displays
    • High quality pre- and post-scaling engines, with underscan support for all display outputs
    • Content-adaptive de-flicker filtering for interlaced displays
    • Fast, glitch-free mode switching
    • Hardware cursor
  • Two integrated dual-link DVI display outputs
    • Each supports 18-, 24-, and 30-bit digital displays at all resolutions up to 1920x1200 (single-link DVI) or 2560x1600 (dual-link DVI)
    • Each includes a dual-link HDCP encoder with on-chip key storage for high resolution playback of protected content
  • Two integrated 400 MHz 30-bit RAMDACs
    • Each supports analog displays connected by VGA at all resolutions up to 2048x1536
  • HDMI output support
    • Supports all display resolutions up to 1920x1080
    • Integrated HD audio controller with multi-channel (5.1) AC3 support, enabling a plug-and-play cable-less audio solution
  • Integrated AMD Xilleon™ HDTV encoder
    • Provides high quality analog TV output (component/S-video/composite)
    • Supports SDTV and HDTV resolutions
    • Underscan and overscan compensation
  • HD decode acceleration for H.264/AVC, VC-1, DivX and MPEG-2 video formats
    • Flawless DVD, HD DVD, and Blu-ray™ playback
    • Motion compensation and IDCT (Inverse Discrete Cosine Transformation)
  • HD video processing
    • Advanced vector adaptive per-pixel de-interlacing
    • De-blocking and noise reduction filtering
    • Edge enhancement
    • Inverse telecine (2:2 and 3:2 pull-down correction)
    • Bad edit correction
    • High fidelity gamma correction, color correction, color space conversion, and scaling
  • MPEG-2, MPEG-4, DivX, WMV9, VC-1, and H.264/AVC encoding and transcoding
  • Seamless integration of pixel shaders with video in real time
  • VGA mode support on all display outputs
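
The list above mentions spatial/temporal dithering to give 30-bit colour quality on 24-bit displays. The Python sketch below shows the basic idea of temporal dithering for one colour channel: a 10-bit value is shown as a sequence of 8-bit values whose average matches it. This is the textbook technique with a hypothetical function name, not a description of what the Avivo hardware actually does.

```python
def temporal_dither(value_10bit, frames=4):
    """Spread a 10-bit channel value (0-1023) over several 8-bit frames."""
    base, remainder = divmod(value_10bit, 4)   # 10-bit = 8-bit level + 2 extra bits
    output = []
    for frame in range(frames):
        # Show the next level up in `remainder` out of every 4 frames.
        level = base + 1 if (frame % 4) < remainder else base
        output.append(min(level, 255))         # clamp at the top of the 8-bit range
    return output


sequence = temporal_dither(514)
print(sequence, sum(sequence) / len(sequence))
```

A 10-bit value of 514 sits halfway between 8-bit levels 128 and 129, so the display alternates between them and the eye averages the result. Spatial dithering applies the same trick across neighbouring pixels instead of across frames.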

ATI have added a Universal Video Decoder to R600 that handles the entire decode process, offloading the decode from the CPU onto the GPU. As we have discussed in earlier articles, the power of the GPU is only now coming to the fore and hardware manufacturers are realising its potential. ATI enhances what is already out there by getting R600 to do it all:

[Images: R600 decoding; R600 decode 1; R600 decode comparison]

ATI have stepped up in this department and it will be interesting to see how this measures up against G80 in real terms. It is good to see that decoding and playing HD is getting easier for everyone.


Conclusion

In conclusion, AMD/ATI have implemented a very clean and well-rounded DX10 card, and we think it's about time too. The review of the 2900 XT is on its way soon, but already we can see that the 2X00 series are a good competitor for the 8800GTS cards. The only thing that I am a little disappointed about is that they haven't come out all guns blazing with a card that blows nVidia out of the water, but hopefully that's to come.

The High-Definition aspects of the 2X00 series are excellent, and as we progress we will see more computers moving into the living room. AMD have made it clear with 690G that this is their intention, and it's great to see the top-end following this through too.

Would I buy one...you'll have to wait for the full review...


Additional Links

Bit-Tech have an awesome write-up of R600 and its architecture, so have a good read if you're interested.
techPowerUp! have a nice read too
AMD/ATI have lots of info on their site