Deferred Shading.

One very popular rendering technique these days is deferred shading.  It’s a very simple concept, and it comes with plenty of benefits.  It’s generally very easy to implement, and it avoids a whole slew of problems on some platforms.  To top it all off, it has been successfully used in several high profile games!

Due to its list of benefits, its drawbacks are often overlooked.  However, I personally think it’s important to examine both sides of any technology, good and bad, to determine if it’s the best fit for a given situation.  That said, I’d like to present to you my observations on deferred shading.

For people who are not familiar with deferred shading, I’m going to give a brief overview.  The premise is that you render out scene details, rather than fully shaded/lit colors.  The only details that you render are the ones needed for your shading model.  For instance, you need surface normals for lighting, so you render out surface normals to a buffer (typically called a g-buffer, which is short for “geometry buffer”).  Because modern graphics cards can render to multiple buffers simultaneously, it becomes very simple to render out all of the relevant scene details in a single pass.  Once this data is available, the scene is lit in a second stage that reads from those buffers instead of reprocessing the geometry.  Now that we’ve covered the idea behind deferred shading, I can talk about its pros and cons.
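To make the two stages concrete, here is a minimal CPU-side sketch in Python.  It is not how a real engine does it (that work happens in shaders on the GPU); the `GBufferTexel` layout, the `PointLight` type, and the plain Lambert diffuse term are all assumptions chosen for illustration.  The point is only that the lighting stage reads per-pixel data and never touches geometry.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class GBufferTexel:
    """Scene details stored for one pixel (hypothetical layout)."""
    position: Vec3  # surface position, in the same space as the lights
    normal: Vec3    # unit-length surface normal
    albedo: Vec3    # unlit surface color


@dataclass
class PointLight:
    position: Vec3
    color: Vec3


def dot(a: Vec3, b: Vec3) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalize(v: Vec3) -> Vec3:
    length = dot(v, v) ** 0.5
    if length == 0.0:
        return v  # degenerate vector, leave it alone
    return (v[0] / length, v[1] / length, v[2] / length)


def shade(texel: GBufferTexel, lights: List[PointLight]) -> Vec3:
    """Lighting stage for one pixel: Lambert diffuse summed over all lights."""
    total = [0.0, 0.0, 0.0]
    for light in lights:
        to_light = normalize((
            light.position[0] - texel.position[0],
            light.position[1] - texel.position[1],
            light.position[2] - texel.position[2],
        ))
        n_dot_l = max(0.0, dot(texel.normal, to_light))
        for i in range(3):
            total[i] += texel.albedo[i] * light.color[i] * n_dot_l
    return (total[0], total[1], total[2])


def lighting_pass(gbuffer: List[Optional[GBufferTexel]],
                  lights: List[PointLight]) -> List[Optional[Vec3]]:
    """Shade every covered pixel using only g-buffer data."""
    return [None if texel is None else shade(texel, lights) for texel in gbuffer]


# The geometry pass would normally fill this; here it is written by hand.
gbuffer = [
    GBufferTexel(position=(0.0, 0.0, 0.0), normal=(0.0, 0.0, 1.0),
                 albedo=(1.0, 0.5, 0.25)),
    None,  # background pixel, nothing was rendered here
]
lights = [PointLight(position=(0.0, 0.0, 2.0), color=(1.0, 1.0, 1.0))]
image = lighting_pass(gbuffer, lights)
```

Adding another light only adds an entry to `lights`; the g-buffer stays the same, which is the property the rest of this post leans on.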

On the positive side for deferred shading, it solves several key problems:

- On some platforms, such as the PC, draw calls are very expensive.  A separate draw call is required for objects that have different shaders, different textures, or different shader parameters.  By using deferred shading, you separate texturing from shading.  Because the two are broken up into distinct stages, the number of parameters that need to change on a per object basis decreases.  Therefore, more objects can be rendered in the same draw call.  This is one of deferred shading’s biggest wins!

- Geometry doesn’t have to be processed multiple times.  In multi-pass renderers (renderers that draw an object once per light; Doom 3 is a good example of this), you have to render an object once for every light that affects it, which means its geometry is retransformed for each of those lights.  Deferred shading transforms the geometry only once.  On platforms that have unified shading units (such as the Xbox 360 or any DX10 hardware), the units that are no longer busy with vertex work become available for pixel work during the lighting phase.  Not only do you save time processing geometry, you actually gain more processors for use with shading!

- It helps with small triangles.  Because triangles only have to be rendered once, small triangles won’t hit your framerate as hard as they would in a multipass or forward renderer.

- It’s simple.  It’s very easy to write a deferred shading engine.  You don’t have to write complex batching systems or figure out how to manage shaders; you can simply avoid those things entirely.  Those take time and energy, and they’re typically not very fun to write.
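The geometry point above can be put into rough numbers.  The sketch below is only a counting model, with made-up objects and light counts: it assumes a multipass renderer submits an object once per light that affects it (with at least one base pass), and it ignores the small amount of geometry a deferred renderer draws for the lights themselves.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SceneObject:
    name: str
    vertex_count: int
    lights_affecting: int  # how many lights touch this object


def multipass_vertices(objects: List[SceneObject]) -> int:
    """Vertices transformed when each object is drawn once per affecting light."""
    return sum(o.vertex_count * max(1, o.lights_affecting) for o in objects)


def deferred_vertices(objects: List[SceneObject]) -> int:
    """Vertices transformed when each object is drawn once into the g-buffer."""
    return sum(o.vertex_count for o in objects)


# Hypothetical scene, for illustration only.
scene = [
    SceneObject("crate", 500, 3),
    SceneObject("statue", 12000, 4),
    SceneObject("floor", 4, 8),
]

multipass_total = multipass_vertices(scene)  # 49532
deferred_total = deferred_vertices(scene)    # 12504
```

In this toy scene the multipass renderer transforms roughly four times as many vertices.  The real ratio depends entirely on how many lights overlap each object.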

Now then, deferred shading isn’t all rainbows and ponies… it’s time to cover the problems!

- On the Xbox 360, the 10 MB of fast framebuffer RAM isn’t enough to hold several high-precision render targets at once.  Therefore, you have to put in all kinds of hacks to make the most of the space you have.  The list of these hacks is quite extensive, but it generally means that you lose colored specular (or that you end up with incorrect specular lighting colors).

- You have to transform normals into view-space or world-space.  If you use tangent-space normal maps, then you have to pass a rotation matrix from your vertex shader into your pixel shader.  In multipass lighting engines, you can simply perform lighting directly in tangent space.  This means that you only need to transform lighting vectors in the vertex shader, which can then be passed directly to the pixel shader for lighting (this is quite efficient).  Also, simply interpolating the rotation matrix from the vertex shader to the pixel shader has a cost.  Interpolation of variables is not free!

- You have to render to multiple render targets.  This is one of *the* biggest bottlenecks.  This requires a high write bandwidth (imagine a simple low-precision case of 4 RGBA buffers… that’s 16 bytes per pixel, plus an additional 4 bytes for depth/stencil) and the video card can have trouble keeping up.  This means that rendering to your g-buffers is not cheap!  Reading the g-buffers back is not cheap either: the data is uncompressed, so it requires much more bandwidth than the compressed textures that would typically be used.  Note that on some platforms, requiring a high write throughput is not a problem.  For instance, the Xbox 360 has 10 MB of very fast RAM that acts as the framebuffer.  Thanks to that high speed RAM, it’s very unlikely that you will become framebuffer-write limited.

- You lose the ability to change lighting models for objects.  Because you render all of your scene data to g-buffers, in order to get different lighting models on objects, you have to encode the required lighting model into your g-buffer somewhere.  In a shader that calculates lighting, you would have to branch based on the lighting model for a given pixel.  Because branching is prohibitively expensive in a case like this, it’s almost always avoided.

- When shadowing a scene, you have to transform individual pixels into shadow space to determine whether or not they should be shadowed.  This increases the cost of shadowing considerably.
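The framebuffer-space and write-bandwidth points above are easy to sanity-check with a little arithmetic.  The sketch below uses the 4 RGBA buffers plus depth/stencil case from the list; the 1280x720 resolution, the 60 frames per second, and reading “10 MB” as 10 × 1024 × 1024 bytes are my assumptions, and the result is a lower bound because it ignores overdraw.

```python
BYTES_PER_RGBA8 = 4          # one low-precision RGBA render target
BYTES_PER_DEPTH_STENCIL = 4  # e.g. 24-bit depth + 8-bit stencil


def gbuffer_bytes_per_pixel(color_targets: int) -> int:
    """Bytes written per pixel for the color targets plus depth/stencil."""
    return color_targets * BYTES_PER_RGBA8 + BYTES_PER_DEPTH_STENCIL


def gbuffer_bytes_per_frame(width: int, height: int, color_targets: int) -> int:
    """Minimum bytes written per frame, assuming every pixel is written once."""
    return width * height * gbuffer_bytes_per_pixel(color_targets)


per_pixel = gbuffer_bytes_per_pixel(4)              # 16 + 4 = 20 bytes
per_frame = gbuffer_bytes_per_frame(1280, 720, 4)   # 18,432,000 bytes
bytes_per_second = per_frame * 60                   # at an assumed 60 fps

FAST_FRAMEBUFFER_BYTES = 10 * 1024 * 1024           # assumed reading of "10 MB"
fits = per_frame <= FAST_FRAMEBUFFER_BYTES          # False
```

Under those assumptions the g-buffers alone need about 18.4 million bytes, well over the 10 MB of fast framebuffer RAM, and writing them costs at least about 1.1 GB per second before any lighting has happened.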

To sum everything up, deferred shading cuts down a lot of the management complexity, but it typically does so at a cost to fill rate.  In an environment where bandwidth is at a high premium, I wouldn’t necessarily recommend going with a deferred shading engine.  However, it really all depends on your requirements.  If there is a chance that you’re going to be draw call limited, then deferred shading is a very practical way of circumventing that entire problem.  If you’re going to be fill rate limited and you have programmer time to burn, then you might want to consider a multipass renderer.

Not many people realize this, but I worked on a shipped title that was of the “Builder” genre that used a multipass renderer.  Thanks to a very sophisticated batching system, it could handle a whole lot of lights, more than most deferred shading engines could achieve!  That said, I personally believe you can achieve better performance and better quality through a multipass renderer in almost all cases.  However, implementing a multipass renderer can require a great deal of infrastructure and frankly, it’s sometimes just better to put your time towards something else such as special effects, particles, or other eye candy.


~ by ebray99 on June 28, 2010.
