DirectX 10 & the Future of Gaming

How will DirectX 10 and its Unified Architecture benefit gamers? What will the gamer need in order to take advantage of it? We recently sat down with ATI and talked about DirectX 10 and how their next-generation desktop GPU will benefit from it.

DirectX 10 Introduction

The new API will be officially named “DirectX 10.” Over the past few years it has gone by many names, which has created some confusion. The API has been known as DirectX Next, Windows Graphics Foundation 1.0 and 2.0, as well as by its common nickname, DX10. So now that we know what it is called, where can we get it?

DirectX 10 will be available to Windows Vista users only at its introduction. You will not find DirectX 10 being released for the Windows XP operating system. DirectX 10 is deeply embedded into Windows Vista, and we currently know of no plans by Microsoft to allow Windows XP to officially support the new API. Also embedded into Windows Vista is DirectX 9.0L, which provides compatibility with DirectX 9 components. Think of it as two separate DirectX systems: DirectX 9.0L for DirectX 9 hardware and DirectX 10 for DirectX 10 capable hardware. If you want DirectX 10, you will have to go with Windows Vista as your OS. Because of this, we will see an expensive upgrade path associated with the DirectX 10 experience. You will need Windows Vista, DirectX 10 hardware and, of course, some DirectX 10 coded games.

The obvious question that arises for the gamer is, “Will this terribly expensive and arduous upgrade path positively impact my gaming experience enough to justify the cost?” That remains to be seen and can only be answered by games we have yet to play. We can, however, discuss some of the capabilities of DirectX 10 with a unified architecture and how it can potentially benefit gamers.

DirectX 9 Limitations

Before we talk about what is new with DirectX 10, we first need to understand what DirectX 9 is doing. While DirectX 9 was a huge leap over DirectX 8, it still has its limitations.

The Small Batch Problem

One of the biggest limitations is API object overhead. In fact, game content developers are currently being bottlenecked by this overhead. Of all the improvements that could be pushed into DirectX 10, the issue at the top of the list for most game developers was lessening API object overhead.

What we mean by API object overhead is that the API uses CPU cycles to carry out the tasks needed for rendering before anything is sent to the video card for drawing. When rendering a game, the application first has to call the API, and then the API calls the driver, before anything ever reaches your video card’s GPU. These calls are all handled by the CPU, using valuable resources and creating a potential bottleneck.

Within those procedures are limits on how many objects you can show onscreen at one time in one frame. An object can be anything in the game, a character or a tree for example. The current limit is around 500 objects in one frame. Go beyond that and you can run into severe CPU bottlenecks. The game content developer has to carefully balance the game so that it does not get bottlenecked by this software and CPU limitation. This puts a huge limit on the immersion you experience in a game. Take trees, for example. Right now a forest is mostly built by taking one tree and making copies of it, maybe changing the color and the amount of leaves. The limitation means that thousands of unique trees cannot be displayed in one frame in real time. DirectX 10 will relieve this limitation.

In the slides above you can see how the object is passed from the application to the API and then to the driver. As it goes through that path, it picks up some overhead. This overhead increases the execution time, which means lower performance. This problem is commonly referred to as the “Small Batch Problem.”
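
To make the Small Batch Problem concrete, here is a minimal Python sketch of the arithmetic behind it. It assumes one draw call per object and a fixed CPU cost per call. The 20 microsecond overhead and the 10 ms CPU budget are hypothetical numbers, chosen only so the result lands near the roughly 500-object figure quoted above; they are not measurements of DirectX 9, and the function names are ours.

```python
import math


def cpu_submit_time_ms(num_draws, overhead_us_per_draw):
    """CPU time (ms) spent pushing draw calls through the API and driver."""
    return num_draws * overhead_us_per_draw / 1000.0


def max_draws_per_frame(cpu_budget_ms, overhead_us_per_draw):
    """Largest number of draw calls that fits in the CPU budget for one frame."""
    return int(cpu_budget_ms * 1000.0 // overhead_us_per_draw)


def draws_needed(num_objects, objects_per_draw=1):
    """Draw calls required if several objects can share a single call."""
    return math.ceil(num_objects / objects_per_draw)


if __name__ == "__main__":
    OVERHEAD_US = 20.0    # hypothetical per-call CPU cost, in microseconds
    CPU_BUDGET_MS = 10.0  # hypothetical CPU time available for submission

    print("object limit:", max_draws_per_frame(CPU_BUDGET_MS, OVERHEAD_US))
    for objects in (100, 500, 2000):
        cost = cpu_submit_time_ms(draws_needed(objects), OVERHEAD_US)
        print(objects, "objects, one per call:", cost, "ms of CPU time")

    # Packing 10 objects into each call cuts the number of calls tenfold.
    batched = cpu_submit_time_ms(draws_needed(2000, 10), OVERHEAD_US)
    print("2000 objects, 10 per call:", batched, "ms of CPU time")
```

In this model there are only two ways to get more objects on screen: make each call cheaper, or put more work into each call. Lowering the per-call cost is the item the developers quoted above put at the top of their DirectX 10 wish list.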

Constraints of a Fixed Pipeline Shader Architecture

The next constraint with DirectX 9 and current GPUs is the nature of the fixed pipeline path. In a GPU, vertex processing and pixel processing are kept separate, with a fixed number of processors for each. For example, the current X1900 XTX has a fixed 8 vertex units and a fixed 48 pixel processing units. With this fixed architecture you can run into problems where a scene needs a lot of vertex processing power but, with only 8 vertex units, ends up bottlenecked.

What is becoming more and more typical in games these days is a scene that has heavy vertex AND pixel processing requirements. With the limitations of current hardware, the vertex engine may become fully loaded (in a high vertex scenario) while the pixel shader processors just sit there idle! Or the other way around: while a highly complex, pixel shader intensive scene is being rendered, the vertex units just sit there idle! This creates an unbalanced processing engine. With future games we are going to see more and more scenes that demand a high amount of both vertex and pixel shader processing.
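
The imbalance described above can be sketched with a toy model in Python. It is a rough illustration under loud assumptions, not a description of any real GPU scheduler: every unit finishes one item of work per time step, vertex and pixel work overlap perfectly, and the "unified" pool of 56 units is a hypothetical design that simply merges the X1900 XTX's 8 vertex and 48 pixel units. The workload numbers are made up.

```python
def fixed_frame_time(vertex_work, pixel_work, vertex_units=8, pixel_units=48):
    """Time steps per frame when each pool can only run its own kind of work.

    The slower pool sets the pace while the other pool waits.
    """
    return max(vertex_work / vertex_units, pixel_work / pixel_units)


def unified_frame_time(vertex_work, pixel_work, total_units=56):
    """Time steps per frame when any unit can take vertex or pixel work."""
    return (vertex_work + pixel_work) / total_units


def utilization(vertex_work, pixel_work, frame_time, total_units=56):
    """Fraction of all unit-time-steps that did useful work."""
    return (vertex_work + pixel_work) / (total_units * frame_time)


if __name__ == "__main__":
    # Hypothetical workloads: (vertex items, pixel items) per frame.
    scenes = {
        "balanced for 8:48": (80, 480),
        "vertex heavy": (800, 480),
        "pixel heavy": (80, 4800),
    }
    for name, (v, p) in scenes.items():
        fixed = fixed_frame_time(v, p)
        unified = unified_frame_time(v, p)
        print(name)
        print("  fixed:   time", round(fixed, 1),
              "utilization", round(utilization(v, p, fixed), 2))
        print("  unified: time", round(unified, 1),
              "utilization", round(utilization(v, p, unified), 2))
```

When the workload happens to match the 8:48 split, the two designs tie. The further a scene drifts from that ratio, the more units in the fixed design sit idle, which is exactly the situation described above.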