Color and Lighting

A big part of creating a believable and compelling 3D environment for gaming lies in the creative and realistic use of color, light, and shadow. Because shadows are themselves 2D projections of 3D objects, the human visual system can use them as additional cues to aid in understanding a scene. The end result is a picture with far more feeling of "depth" than if shadows are omitted.
Notice how the shadows make the "floating" crate appear more in the foreground in the second image. Also notice how the natural shading of corners reinforces the separation of the floor and walls, and emphasizes the subtle angles of the diagonal walls. (You may need to click through the thumbnails and compare the full size 640x480 images to see these effects clearly.) The rest of this article will discuss techniques for getting from A to B without sacrificing rendering performance.

Color Models and Modes

Computers represent and reproduce color (which is by nature a continuous phenomenon) by quantizing it to some number of discrete levels. The more levels used in the quantizing process, the more accurate and realistic the resulting color images are when displayed.

One common model for quantizing color is 24 bit RGB. In this color model, 8 bits are used to represent each of three primary colors (red, green, and blue), which are then mixed by summation to produce the final color (e.g. taupe or moss). This color model allows for the representation of 2^24 (more than 16 million) colors, which approaches the color resolution of mainstream media such as 35 mm photographic film or professional video.

A second important color model is HLS, or Hue, Luminance, and Saturation. In this model, the Hue number represents the color on a scale of red - orange - yellow - green - cyan - blue - magenta - red. The Saturation represents the "purity" of the color in terms of how much white or grey has been added, and the Luminance represents how bright or dark the color is. Colors can be converted from RGB to HLS using the following function:
hls.L = (BYTE) (240.0 * (Max + Min) / 510.0);
if (Max == Min) {            // achromatic -> r==g==b
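The excerpt above covers only the luminance term and the achromatic case; a complete conversion on the same 0..240 Windows-style HLS scale might look like the sketch below. The struct layout and helper names are assumptions, not the original article's code.

#include <algorithm>

typedef unsigned char BYTE;

// Hypothetical color structs; components are 0..255 for RGB, 0..240 for HLS.
struct RGBColor { BYTE r, g, b; };
struct HLSColor { BYTE H, L, S; };

HLSColor RGBtoHLS(const RGBColor& rgb)
{
    HLSColor hls;
    double r = rgb.r, g = rgb.g, b = rgb.b;
    double Max = std::max(r, std::max(g, b));
    double Min = std::min(r, std::min(g, b));

    // luminance: average of the color extremes, rescaled to 0..240
    hls.L = (BYTE) (240.0 * (Max + Min) / 510.0);

    if (Max == Min) {                        // achromatic -> r==g==b
        hls.S = 0;
        hls.H = 0;                           // hue is undefined; use 0
    } else {
        double delta = Max - Min;

        // saturation: relative to how light or dark the color is
        if (Max + Min <= 255.0)
            hls.S = (BYTE) (240.0 * delta / (Max + Min));
        else
            hls.S = (BYTE) (240.0 * delta / (510.0 - Max - Min));

        // hue: 0..240 around the color wheel, 40 units per sector
        double h;
        if (Max == r)
            h = (g - b) / delta;             // between yellow and magenta
        else if (Max == g)
            h = 2.0 + (b - r) / delta;       // between cyan and yellow
        else
            h = 4.0 + (r - g) / delta;       // between magenta and cyan
        h *= 40.0;
        if (h < 0.0) h += 240.0;
        hls.H = (BYTE) h;
    }
    return hls;
}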
To represent an image in 24 bit RGB, at least three bytes are needed for each pixel of the image. This is rather expensive in terms of disk space for long term image storage and video memory for image display, so different video modes have been created as a compromise between space and color fidelity. For gaming, we are mainly concerned with 16 bit (sometimes only 15 bit) color and 8 bit color.

Video modes for 16 and 15 bit color use only two bytes per pixel, but can represent significantly fewer colors than the full 24 bit color model allows. These modes compromise by using fewer bits per primary color (usually 5-6-5 RGB for 16 bit or 5-5-5 RGB for 15 bit, although some hardware uses other encodings). To convert a 24 bit RGB value to 16 bit color, one simply shifts each component right by 2 or 3 bits (discarding the low order bits), and then merges the results into a 16 bit word. Although 16 bit video modes can only represent 65,536 colors (32,768 for 15 bit), this is usually sufficient to convey a realistic image with only minor color banding artifacts.

Video modes for 8 bit color use only one byte per pixel, but can only represent 256 colors at a time. This is such a small number of colors that most images cannot be displayed accurately. Fortunately, many interesting images have concentrations of colors in certain small ranges, with other color values being under-represented. This allows for optimizing image quality with a color palette. A color palette is a limited collection of 24 bit color values that will be used to color the pixels of an image. In 8 bit video modes, each pixel of an image must be colored with one of the 256 colors in the current palette. If a forest scene is to be rendered, then a palette consisting of mostly greens and browns may be devised to maximize the fidelity of the image.

Converting a 24 bit RGB color into an 8 bit color is a matter of searching the current color palette for the closest available match. This is usually done using a distance minimization algorithm, where the distance between two 24 bit colors is defined as:
double r = (c1.r - c2.r)/256.0;
double g = (c1.g - c2.g)/256.0;
double b = (c1.b - c2.b)/256.0;
double d = sqrt(r*r + g*g + b*b);
The same distance can also be measured in HLS space, where differences in hue are weighted more heavily:

double h = (c1.h - c2.h)/240.0 * 2.0;
double l = (c1.l - c2.l)/240.0;
double s = (c1.s - c2.s)/240.0;
double d = sqrt(h*h + l*l + s*s);
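Putting the RGB distance metric to work, a minimal palette-matching sketch looks like the following; the struct and function names here are illustrative, not from the original code.

#include <cmath>

struct RGBColor { unsigned char r, g, b; };

// Return the index of the palette entry closest to the desired color,
// using the Euclidean RGB distance defined above.
int NearestPaletteIndex(const RGBColor palette[256], const RGBColor& want)
{
    int    best  = 0;
    double bestD = 1.0e9;
    for (int i = 0; i < 256; i++) {
        double r = (palette[i].r - want.r) / 256.0;
        double g = (palette[i].g - want.g) / 256.0;
        double b = (palette[i].b - want.b) / 256.0;
        double d = std::sqrt(r*r + g*g + b*b);
        if (d < bestD) { bestD = d; best = i; }
    }
    return best;
}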
Most games (including the Alpha engine) use an 8 bit video mode for rendering 3D scenes. The remainder of this article will concentrate on 8 bit color, although the techniques described can be used in any color mode.

The Lighting Problem

The basic lighting problem is determining, for each pixel of each visible polygon, what color to display, given the source texel and the light level for the pixel. In other words:

screen[j][i] = light_function( texture[v][u], light_level );

This problem can be solved using the HLS color model and a simple model of how light affects colored surfaces. If we accept a range for the light level input of zero (total darkness) to sixteen (normal brightness) to thirty-one (over saturated), we can define a simple mapping from an input color to an output color. The mapping should preserve colors at the "normal brightness" setting, so:

light_function( x, 16 ) == x

As the light level is reduced from "normal", the input color is modified by reducing the Luminance component in the HLS color model. In addition, since the human eye has reduced color perception in low light settings, the Saturation component should be reduced accordingly. If the light level is increased from "normal", the Luminance of the input color is increased. When the light level reaches maximum saturation, the input color is mapped to almost pure white. Finally, the resulting color must be mapped to the appropriate color model. In an 8 bit video mode, this means that the nearest match for the desired output color must be selected from the color palette. The Alpha engine employs the following light model:
clr_out.L = clr_in.L * light/16.0;
if (light < 16)
    clr_out.S = clr_in.S / (2.0 - light/16.0);

Color Palette and Look Up Table

The first step in lighting a game world is to establish a palette and color look up table, or CLUT. The Alpha engine can use any palette and CLUT, but all the sample images on these pages are rendered using the palette and CLUT shown in Figure 3.
The palette appears as the middle row of colors in the CLUT rectangle. It was designed to offer ranges of light-to-dark variations of several common base colors: steel blue, gold, grey, green, tan, olive, red, pure blue, yellow, and so forth. The color look up table is used to implement the light_function described above. For each input color (X) and incident light level (Y), the CLUT contains the index of the color in the palette which most closely matches the desired output of the light_function. These values are pre-calculated by a program, lighttable, which implements the light model computations defined in the previous section. Use of a CLUT reduces the previous light calculation to:
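Presumably the per-pixel lighting collapses to a single table lookup:

screen[j][i] = CLUT[ light_level ][ texture[v][u] ];

Off-line, a tool like lighttable can fill the table once for every palette color and light level. The sketch below shows roughly how; the [light][color] index order follows the rendering equation used later, and the declared helpers stand in for routines described elsewhere in the article.

struct RGBColor { unsigned char r, g, b; };
struct HLSColor { unsigned char H, L, S; };

// Helpers described in the text (declarations only, assumed signatures):
HLSColor RGBtoHLS(const RGBColor& rgb);
RGBColor HLStoRGB(const HLSColor& hls);
HLSColor light_function(HLSColor clr_in, int light);            // the Alpha light model above
int NearestPaletteIndex(const RGBColor palette[256], const RGBColor& want);

unsigned char CLUT[32][256];                                     // [light level][palette index]

// For every palette color and light level, store the palette index that
// best matches the lit color.
void BuildCLUT(const RGBColor palette[256])
{
    for (int light = 0; light < 32; light++)
        for (int color = 0; color < 256; color++) {
            HLSColor lit = light_function(RGBtoHLS(palette[color]), light);
            CLUT[light][color] = (unsigned char) NearestPaletteIndex(palette, HLStoRGB(lit));
        }
}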
Lighting Techniques

There are several different possible methods for computing the light level of each pixel in a game scene. Each has its advantages and disadvantages, as we shall see.

Flat Shading

This is the simplest of all shading methods (except for no shading at all). In this technique, a single light level is computed for each polygon, and then used for every pixel in the polygon. I can't think of a case where this has been used in a game, but it's probably been done somewhere. Its main advantage is simplicity, and its overwhelming disadvantage is that it looks pretty lame.

Flat Run Shading

This is what DOOM, Dark Forces, Duke Nukem, and most other 1PP shooters have used. This technique calculates a single light level for each horizontal or vertical strip of pixels as they are texture mapped. The light level is usually a base ambient value, minus some amount for the distance the current run is from the viewer. This technique relies on constant-z texture mapping to produce a simple depth cueing effect.

Flat run shading looks acceptable, and is even perspective correct if you don't look up or down. This technique was a good compromise at the time, but doesn't really work in true 3D 6 DOF environments.
Interpolated Shading

This technique (also known as Gouraud shading) calculates a light level at each polygon vertex and linearly interpolates those values across the pixels of the polygon. Interpolated shading looks pretty good, but is not perspective correct. Its main advantage is that it is supported in hardware on most 3D accelerators, and that it is reasonably easy to compute, even in software.

Light Mapping

This is what Quake, Prey, and most other new games are using. This technique calculates an accurate light level every n pixels using ray tracing or radiosity, and then stores those values in a light map array that is analogous to a texture map. This approach allows very accurate and expensive processing to be performed at level build time, with only a simple linear interpolation performed at run time.

Light mapping generally gives the best looking results. It is perspective correct, and allows for accurate placement of shadows. Diffuse and spot light sources can be simulated because the processing is performed off-line. The major disadvantages of light mapping are the increased memory requirements and the difficulty of dynamically modifying the light maps. One additional disadvantage is that most 3D accelerators do not provide direct support for light map based shading.

Figure 4 shows a sample light map from the scene rendered in Figure 1. The light map shown is from the wall directly "behind" the floating crate in the scene. The two shadows on the light map are cast by the crate, from two different point light sources.
Rendering with Light Maps

The basic process of rendering with both a light map and a texture map is to compute perspective correct indices into each map, and then use the independent light and texture values (lumels and texels) as inputs to the light_function. Thus, our rendering equation evolves into:

screen[j][i] = CLUT[ lightmap[v][u] ][ texture[v][u] ];

This form of the rendering equation implies that both the texture and the light map are the same size and resolution. In general, neither of these assumptions is true. Usually the texture is somewhat smaller than the surface being mapped, and it needs to be smoothly "tiled" (i.e. replicated) across the surface. Often the light map is computed at a lower resolution than the texture map. Both of these facts mean that the rendering equation cannot be implemented exactly as described above.

In fact, even if both rendering assumptions were true in a given circumstance, it still would not be a good idea to implement the inner loop of the renderer using the code above. This form of the code is very inefficient, as it unnecessarily repeats operations from frame to frame. For each pixel, the code implements the following algorithm:
1. Find the texel at index [v][u] in the texture array.
2. Find the lumel at the same index in the light map array.
3. Find the color at index (texel, lumel) in the look up table.
4. Draw the color on the screen.

A better approach is to combine the texture and light map ahead of time, in a surface cache. Happily, this also solves the problems caused when the texture and light map are at different sizes and resolutions.

Surface Caching

The idea behind using a surface cache is to divide the work of the rendering equation into two phases: combining the light map and the texture map into a single surface map; and copying pixels from the surface map onto the screen. This algorithm also greatly simplifies the code for the inner loop of the renderer, which can completely ignore the existence of the lighting algorithm.

A surface cache is just an array of pre-lit, pre-tiled textures that are properly aligned for fast rendering. In the Alpha engine, each surface cache entry is a 256x256 pixel array aligned on a 64 Kbyte boundary. (Note: it is possible to make better use of memory than this, with some complication of the texture mapping code. In practice, Windows NT's virtual memory manager seems to do an acceptable job, so this optimization has been considered low priority.) In addition to the surface arrays themselves, the surface cache contains the index information needed to implement the least-recently-used (LRU) replacement scheme. Each polygon (or surface) in the world must have a unique id that can be used to locate its surface information in the surface cache. Alpha uses a 32 bit integer for this purpose. In addition, the surface cache must store height, width, and mip number information about each surface, as well as its age in the cache.

When the rendering pipeline needs to draw a polygon on the screen, it asks the surface cache for a pointer to the polygon's texture array, given the polygon's unique id:
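A hypothetical sketch of that lookup follows; the exact Get() signature and parameter names are assumptions, since the surrounding text implies only that the first argument is the polygon's id and three more describe the data to load.

// Hypothetical reconstruction of the surface cache lookup.
BYTE* surface = surf_cache.Get(poly->id,        // unique 32 bit surface id (cache key)
                               poly->texture,   // texture to tile into the surface on a miss
                               poly->lightmap,  // light map to apply via the CLUT on a miss
                               mip_number);     // mip level chosen by the renderer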
The next three arguments to the Get() function describe the data to be loaded if the entry is not already present. The cache finds the light map and the appropriate mip map of the desired texture in the world database. These structures contain both texture data and the width and height for the texture arrays. The loading process begins by finding the larger of the texture and light map dimensions, and using these for the surface dimensions. Then the texture is tiled into the surface array within the boundaries of the surface dimensions. Finally, the light map is applied to the surface data using the color look up table as described above. If the light map is of lower resolution than the texture map, then an intermediate light map is constructed by interpolating the sparse light map values. Using the surface of Figure 4 as an example, the cache finds mip map number 1 of the texture, and the appropriate light map for the surface:
Since the light map is larger, the cache uses it for the surface dimensions which are 128x128. The light map compiler always creates light maps which have dimensions that are powers of two. The texture is tiled into the cache entry, and the light map is applied to it:
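A sketch of that cache-fill step, assuming power-of-two texture dimensions and a light map already interpolated up to the surface resolution (the function and parameter names are illustrative):

// Tile the texture across the surface, then light each pixel through the CLUT.
// Assumes 8 bit palettized texels and power-of-two texture dimensions so the
// tiling can use a bit mask.
void BuildSurface(unsigned char* surface, int surf_w, int surf_h,
                  const unsigned char* texture, int tex_w, int tex_h,
                  const unsigned char* lightmap,          // one lumel (0..31) per surface pixel
                  const unsigned char CLUT[32][256])
{
    for (int v = 0; v < surf_h; v++)
        for (int u = 0; u < surf_w; u++) {
            unsigned char texel = texture[(v & (tex_h - 1)) * tex_w + (u & (tex_w - 1))];
            unsigned char lumel = lightmap[v * surf_w + u];
            surface[v * surf_w + u] = CLUT[lumel][texel];
        }
}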
One additional complication arises when mip mapped textures are used. If the surface being rendered is distant from the view point, a larger mip number is selected by the rendering pipeline. The mip mapped version of the texture is of smaller dimensions than the original and it will therefore use less memory in the cache. The problem is that the light maps are not also mip mapped. This means that the texture map and light map are no longer at equivalent resolutions. The cache deals with this by sampling the light map every nth lumel to scale the light map down to the appropriate resolution. This can cause some visible artifacts when the sampling interacts with a dithered light map:
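That down-sampling step might look something like the following sketch (plain point sampling; the names are illustrative):

// Sample every nth lumel of the full-resolution light map to match a
// mip-mapped surface (this is the simple decimation that can interact badly
// with a dithered light map, as noted above).
void DecimateLightmap(const unsigned char* in, int in_w, int in_h, int n,
                      unsigned char* out)    // out is (in_w/n) x (in_h/n)
{
    int out_w = in_w / n;
    for (int v = 0; v < in_h / n; v++)
        for (int u = 0; u < out_w; u++)
            out[v * out_w + u] = in[(v * n) * in_w + (u * n)];
}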
In practice, this still tends to look better than the "sparkling" effects of not mip mapping at all.

Light Map Calculation

Now that we have seen how light maps are applied during the rendering process, we will examine how the light maps are created in the first place. Most of the light map calculation is done once, by the level editor (or an associated program such as Quake's light) when the level is compiled. The basic process is quite simple:

The level editor starts by creating a lightmap for each polygon. Each lightmap is a rectangular array large enough to paint the associated polygon. The level editor may trade off image quality for reduced memory needs by computing a light level only every nth pixel.

For each lumel in the lightmap, the level editor calculates a light level using a Lambertian illumination equation for each visible light source. The Lambertian (or diffuse) illumination equation is:

I = Ia + fatt Ip kd (N • L)

where Ia is the ambient light level, Ip is the point light source intensity, fatt is the light-source attenuation factor, kd is the diffuse-reflection coefficient, N is the surface normal of the polygon, and L is the normalized vector from the lumel to the point light source. This equation basically says that the reflected light at any point on a polygon is equal to the (reflected) ambient light in the room, plus the reflected directional light. The amount of directional light reflected decreases according to the obliqueness of the incident ray; the more "straight on" the light ray falls, the more light will be reflected by the surface. Furthermore, the amount of reflected light decreases according to the attenuation function as the distance between the light source and the reflecting surface increases.

In reality, light is attenuated proportional to the square of the distance from the light source. However, using a true square-law attenuation is problematic in an 8 bit color model. Therefore it is common to use an attenuation function of the following form:

fatt = 1 / ( a dL^2 + b dL + c )

where dL is the distance from the lumel to the light source. The attenuation coefficients a, b, and c are chosen to produce images that "look good". The Alpha level builder uses a = 0, b = 0.5, and c = 0.1. In C++, this calculation looks like:
// compute L and distance:
// check for ray-polygon intersection
// clamp light to reasonable values
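A fuller sketch of the per-lumel calculation outlined by those comments follows; the vector type, helper names, and the 0..31 clamp are assumptions consistent with the light scale used earlier in the article.

#include <cmath>

struct Vector { double x, y, z; };

static Vector operator-(const Vector& a, const Vector& b) { return Vector{ a.x - b.x, a.y - b.y, a.z - b.z }; }
static double Dot(const Vector& a, const Vector& b)        { return a.x*b.x + a.y*b.y + a.z*b.z; }
static double Length(const Vector& v)                      { return std::sqrt(Dot(v, v)); }

struct Light { Vector pos; double intensity; };

// Occlusion test: is the ray from the light to the lumel blocked by a polygon?
bool blocked(const Vector& light_pos, const Vector& lumel_pos);

// Compute the light level (0..31, 16 == normal) for one lumel.
double LightLumel(const Vector& lumel, const Vector& normal,
                  const Light lights[], int nlights, double ambient)
{
    const double kd = 1.0;                      // diffuse-reflection coefficient (assumed)
    double I = ambient;

    for (int i = 0; i < nlights; i++) {
        // compute L and distance:
        Vector L = lights[i].pos - lumel;
        double d = Length(L);
        L = Vector{ L.x / d, L.y / d, L.z / d };

        double NdotL = Dot(normal, L);
        if (NdotL <= 0.0)
            continue;                           // light is behind the surface

        // check for ray-polygon intersection
        if (blocked(lights[i].pos, lumel))
            continue;                           // lumel is in shadow

        // fatt = 1 / (a dL^2 + b dL + c), with a = 0, b = 0.5, c = 0.1
        double fatt = 1.0 / (0.5 * d + 0.1);

        I += fatt * lights[i].intensity * kd * NdotL;
    }

    // clamp light to reasonable values
    if (I < 0.0)  I = 0.0;
    if (I > 31.0) I = 31.0;
    return I;
}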
To correctly compute shadow boundaries, this step must check the visibility
of each light source for each lumel by determining whether a light ray
cast from the light source to the lumel is occluded by any polygon in the
level. This last step requires a moderately expensive 3D ray-polygon intersection test, performed once for every combination of lumel, light source, and polygon. In the
sample code above, this operation is handled by the blocked()
function. This routine iterates through all the polygons in the level and
tests for ray-polygon intersection in 3D.
Since most polygons will not occlude the light ray, we can improve
performance by using a series of progressively more expensive and accurate
tests. The first test checks whether the ray-plane intersection point falls
between the two endpoints of the light ray. If it does, the second test
checks whether the light ray pierces the cubic bounding volume of the polygon.
If this test also passes, the third test checks to see whether the ray-plane
intersection point lies within the polygon boundaries. A major performance
improvement comes from using the PVS to ignore both light sources and polygons
that are known not to be visible from the polygon being illuminated.
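A sketch of that progressive test follows, with small vector helpers defined inline; the level database globals and the final point-in-polygon routine are assumed helpers.

struct Vector  { double x, y, z; };
struct Polygon {
    Vector normal;                 // unit plane normal
    double dist;                   // plane equation: Dot(normal, p) == dist
    Vector bbmin, bbmax;           // cubic bounding volume
    // ... vertex data used by ContainsPoint()
};

// Assumed level database and point-in-polygon helper:
extern const Polygon* level_polys;
extern int            num_polys;
bool ContainsPoint(const Polygon& poly, const Vector& p);

static Vector Sub(const Vector& a, const Vector& b) { return Vector{ a.x - b.x, a.y - b.y, a.z - b.z }; }
static double Dot(const Vector& a, const Vector& b) { return a.x*b.x + a.y*b.y + a.z*b.z; }

// Does any polygon in the level occlude the light ray from 'light' to 'lumel'?
// (A PVS lookup would normally prune both the lights and the polygons tested.)
bool blocked(const Vector& light, const Vector& lumel)
{
    Vector dir = Sub(lumel, light);

    for (int i = 0; i < num_polys; i++) {
        const Polygon& poly = level_polys[i];

        // Test 1: does the ray-plane intersection fall between the endpoints?
        double denom = Dot(poly.normal, dir);
        if (denom == 0.0)
            continue;                                  // ray is parallel to the plane
        double t = (poly.dist - Dot(poly.normal, light)) / denom;
        if (t <= 0.0 || t >= 1.0)
            continue;

        Vector hit = Vector{ light.x + t * dir.x,
                             light.y + t * dir.y,
                             light.z + t * dir.z };

        // Test 2: does the intersection point lie inside the bounding volume?
        if (hit.x < poly.bbmin.x || hit.x > poly.bbmax.x ||
            hit.y < poly.bbmin.y || hit.y > poly.bbmax.y ||
            hit.z < poly.bbmin.z || hit.z > poly.bbmax.z)
            continue;

        // Test 3: does the intersection point lie within the polygon boundaries?
        if (ContainsPoint(poly, hit))
            return true;                               // the ray is occluded
    }
    return false;
}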
Figure 8 shows the effects of using a lower resolution light map with
simple linear interpolation. This is approximately equivalent to the light
mapping style used in Quake.
Figure 9 shows the results up to this point in the lighting algorithm
with high resolution lightmaps. Each lumel in each lightmap is computed
by finding the nearest integer to the true light level (which is computed
as a double precision value). This results in visible and distracting "banding"
artifacts where the light level suddenly changes from one integral value
to another. This technique only produces acceptable results when used with
"noisy" textures (like the carpeted floor), which mask the banding.
The last two steps in the lighting algorithm greatly improve the image
quality with only a minimal performance penalty at level compile time:
The first step is to pass the lightmap array through a digital smoothing
filter. This blurs the edges of the shadows resulting in a softer, more
natural look. Finally, the lightmap is converted from double precision
to integral values by using alternating Floyd-Steinberg error diffusion.
This step "dithers" the boundaries between adjacent light levels to produce
smooth gradations of illumination.
The smoothing filter blurs the edges of the shadows in a cheap simulation
of area light sources. Correctly computing soft edges with area light sources
in the illumination algorithm above means extra computation on each lumel
to determine if it is partly shaded by a polygon. For this to be feasible,
we should probably compute shadow boundaries in object precision using
Weiler-Atherton or a "shadow volume" approach.
The smoothing filter is a simple 3x3 digital filter with the following
matrix:
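As a sketch, applying a 3x3 smoothing pass to a lightmap might look like the following; the kernel weights used here are a typical choice, not necessarily the matrix the Alpha level builder actually uses.

#include <vector>

// Blur a double-precision lightmap with a 3x3 weighted-average kernel
// (assumed weights:  1 2 1 / 2 4 2 / 1 2 1, divided by 16).  Edge lumels
// are left untouched for simplicity.
void SmoothLightmap(const std::vector<double>& in, int width, int height,
                    std::vector<double>& out)
{
    static const int k[3][3] = { { 1, 2, 1 }, { 2, 4, 2 }, { 1, 2, 1 } };
    out = in;

    for (int y = 1; y < height - 1; y++)
        for (int x = 1; x < width - 1; x++) {
            double sum = 0.0;
            for (int dy = -1; dy <= 1; dy++)
                for (int dx = -1; dx <= 1; dx++)
                    sum += k[dy + 1][dx + 1] * in[(y + dy) * width + (x + dx)];
            out[y * width + x] = sum / 16.0;
        }
}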
The key visual enhancement comes from using a simple error diffusion
to dither the light level boundaries. When a double precision light level
(e.g. 12.48) is quantized to an integral value (e.g. 12), a fractional
light level is "lost" in the process. This fractional light level is the
error between the true value and what can be represented in a limited number
of quanta. Error diffusion algorithms work by saving this error value and
spreading it to neighboring pixels, thereby minimizing the error over a
larger region.
The Alpha level builder uses an alternating direction Floyd-Steinberg
error diffusion to dither the lightmaps. The Floyd-Steinberg algorithm
spreads the error to adjacent pixels according to the following diffusion
matrix:
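In the standard Floyd-Steinberg scheme, 7/16 of the error goes to the next pixel in the scan direction, and 3/16, 5/16, and 1/16 go to the three neighbors on the row below; alternating the scan direction on each row avoids directional streaking. A sketch of this applied to a lightmap (the array layout and the 0..31 light range are assumptions):

#include <cmath>
#include <vector>

// Quantize a double-precision lightmap to integer light levels using
// Floyd-Steinberg error diffusion with an alternating ("serpentine") scan.
// The input lightmap is modified in place as the error is spread.
void DitherLightmap(std::vector<double>& light, int width, int height,
                    std::vector<unsigned char>& out)
{
    out.resize((size_t)width * height);

    for (int y = 0; y < height; y++) {
        int step = (y % 2 == 0) ? 1 : -1;            // alternate direction each row
        int x    = (step > 0) ? 0 : width - 1;

        for (int n = 0; n < width; n++, x += step) {
            int    i    = y * width + x;
            double want = light[i];
            int    got  = (int)std::floor(want + 0.5);
            if (got < 0)  got = 0;
            if (got > 31) got = 31;
            out[i] = (unsigned char)got;

            // Standard Floyd-Steinberg weights (relative to the scan direction):
            //        .      x     7/16
            //      3/16   5/16    1/16
            double err  = want - got;
            int    next = x + step;                  // ahead of x in scan order
            int    prev = x - step;                  // behind x in scan order

            if (next >= 0 && next < width)
                light[i + step] += err * 7.0 / 16.0;
            if (y + 1 < height) {
                int below = i + width;
                if (prev >= 0 && prev < width) light[below - step] += err * 3.0 / 16.0;
                light[below] += err * 5.0 / 16.0;
                if (next >= 0 && next < width) light[below + step] += err * 1.0 / 16.0;
            }
        }
    }
}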
The use of high resolution lighting with accurate shadow volumes and
error diffusion combine to create high impact images that heighten the
sense of place and realism. The net effect is to draw the player more deeply
into the game world and to produce a more satisfying experience.