OTMPOLY.DOC - Complete HOW TO of polygons
released 12-01-94
by Voltaire/OTM (Zach Mortensen)
email -
mortens1@nersc.gov
INTRODUCTION
After receiving numerous requests to do so, I have
compiled a HOW TO doc on polygon filling. It seems that a lot
of people out there are a lot like myself, they really dislike
using other people's code because it is extremely difficult to
figure out, especially if it is highly optimized. Sometimes
text files are the answer, often times however they do more
harm than good. When I was writing my 3d engine a few
erroneous text files caused me to waste several days debugging,
and in the end I wound up deriving everything on my own.
Hopefully I have learned from my painful experiences and will
make my explanations clear and concise, yet still offer ideas
for optimization. My polygon routines are among the fastest I
have seen on the PC, but that is only because I am a
perfectionist obsessed with being the best ;)
I have attempted to arrange the sections of this
document in a somewhat logical fashion; that is, you should
feel confident in one area before attempting to move on to the
next. This is the order in which I did things, so I figured
what worked for me will most likely work for everyone else as
well. For my code examples I will use C++, and I will give
ideas for assembler optimizations of these routines.
Everything here applies to triangles only, although it
is possible to adapt the techniques used here to four sided
polygons (quadrangles?) as well. Why use triangles? The
simple and straightforward answer: triangles are the most
mathematically correct. Take any three points in space, ANY
three, and you can make a two dimensional plane out of them.
If you were to take four points on the other hand, it is very
likely that you will end up with a 3 dimensional shape instead.
Triangles also make vector operations (dot and cross product)
much simpler.
SCAN CONVERSION
The process of determining the coordinates along the
edges of a polygon is known as SCAN CONVERSION. The name comes
from the fact that most people use the horizontal scanlines on
the screen as the basis for doing this. You will need to scan
convert each edge of every polygon you draw, and to do this you
need two points (the endpoints of the edge) P1 (X1, Y1) and P2
(X2, Y2)
P1 . (X1, Y1)
\
\
\
\
\
\
\
P2 \. (X2, Y2)
Now it's time to take a little trip back in time to the Algebra
you suffered through in Junior High (if you ever thought you
could be a coder without using any math you may as well turn
off your computer right now and throw it out the window). The
equation
y = mx + b
should be indelibly etched in your mind. To refresh the memory
those who were asleep in class, m is the slope (change in
Y/change in X or dy/dx if you are a calculus type), and b is
the y intercept (value of y when x = 0). A variation of this
formula you learned in your Algebra II class is
y - Y1 = m(x - X1)
which is much more useful in our case, because we really don't
care about the value of y at (x = 0).
What I am about to say will confuse a lot of you, but
I'm going to say it anyway. When filling polygons, we need to
switch all the Xs and Ys in the above equation. WHY??? The
reason is simple: the above equation is solved for y in terms
of x, which means "you feed me an x value, and I'll tell you
the y value at that point." When filling polygons, we would
much rather draw horizontal lines than vertical lines, because
horizontal line drawing is MUCH easier and much faster.
Therefore, we want an equation that will give us the x value at
a given y:
x - X1 = m(y - Y1)
That's simple enough, but there is a catch. The m in the above
equation is no longer dy/dx, instead it is dx/dy. This is
purely logical, dy/dx can be read as "change in y with respect
to change in x." Since we are changing y instead of x in the
above equation, it is logical to have m = "change in x with
respect to change in y."
When you begin scan converting an edge, you will only
know X1, Y1, X2, and Y2. Therefore, you need to calculate m.
The formula for this is
(X2 - X1)
m = ---------
(Y2 - Y1)
Once we have a value for m, we can determine the x
value of any given y. We will find x values for all integer y
values between Y1 and Y2 in order to determine the edges of our
polygon, so we know where to start and stop drawing horizontal
lines when we fill it. One equation which will give us these
values is
x = m(y - Y1) + X1
Now, you could solve this equation for every integer y
value between Y1 and Y2, but this is extrememly slow. If you
remember that we are starting at y = Y1 and we only interested
in the value of x at each scanline (we are using integers for
y), you can do a little trick to simplify this equation:
point 0 -> x = m((Y1 + 0) - Y1) + X1
point 1 -> x = m((Y1 + 1) - Y1) + X1
point 2 -> x = m((Y1 + 2) - Y1) + X1
...
point n -> x = m((Y1 + n) - Y1) + X1
It is obvious that ((Y1 + n) - Y1) is simply n, so we arrive at
the general equation
x = m(n) + X1
Now, if we take the difference between two consecutive x
values (x values at scanlines n + 1 and n), we find that
(m(n + 1) + X1) - (m(n) + X1)
= mn + m + X1 - mn - X1
= m
WOW, that makes life easy. The difference between the x values
of consecutive y values is simply m. That means that we now
have a recursive definition for x (recursive means that the
value of each term is based on the value of the previous one)
Xn = X(n-1) + m
So by adding m to the x value of the previous scanline, we can
determine the x value at the current scanline. Of course we
also know the starting point of this sequence, X1. Some C++
code that implements this might look like
float x, m, edge[200]; // use edge to store the x
int count; // values at each scanline,
// there are a maximum of 200
// lines on the screen, so we
// need room for 200 x values
m = (X2 - X1)/(Y2 - Y1);
x = X1;
for (count = Y1; count < Y2; count++)
{
edge[count] = x;
x += m;
}
There is another problem here, we are dealing with floating
point numbers, which are incredibly slow in calculations if
you don't have a coprocessor. The solution to this problem is
to use fixed point integer math. In fixed point, you multiply
each number by a scaling factor. When you perform any
calculations, divides in particular, the precision of your
answer is accurate to a fixed number of decimal places, hence
the name fixed point.
The most common implementations of fixed point scale by
factors of either 256 or 65536 because multiplying and dividing
by these numbers can be accomplished by shifting bits, and each
of these numbers is equivalent to half a register in assembler.
In order to use 16 bit or 8.8 fixed point, scale by 256 (you
now have 8 bits for the integer part and 8 bits for the decimal
part), and scale by 65536 in order to use 32 bit or 16.16 fixed
point (16 integer bits, 16 decimal bits). When you are done
with all your calculations and need an integer answer, you
simply use the top 8 or 16 bits of your fixed point integer,
depending on what your scale factor was.
The above example converted to fixed point would look something
like this
int x, m, count, edge[200]; // if you are using 16 bit
// code, these should be of
// type long rather than int
m = (X2 - X1) << 16; // dx scaled to 16.16 fixed
// point, I'm assuming you are
// using 32 bit code
m /= (Y2 - Y1) // notice that you do NOT scale
// the denominator, if you did
// you would lose all precision
// in your answer
x = X1 << 16; // scale the starting x value
for (count = Y1; count < Y2; count++)
{
edge[count] = x >> 16; // store only the integer part
x += m;
}
Now that you have an incredibly fast way to scan
convert a single edge, you must do this for all the edges in
your polygon. Here's another place where triangles are better
than quadrangles: on any triangle, there is a top, middle, and
bottom point. The side connecting the top and bottom points is
ALWAYS the longest. Therefore, the best strategy for scan
converting an entire triangle is as follows:
1. Set up two edge lists, one for the right and one for the
left side of each scanline.
2. Scan convert the longest edge and store it in the left edge
list.
3. Scan convert the remaining two edges and store them in the
right edge list.
4. Well...FILL IT OF COURSE!
FLAT FILLING
Flat filling is the simplest and least impressive form
of polygon filling. The entire polygon is filled with one
color. In order to flat fill a polygon, you need to write two
routines
1. A routine to scan convert an entire polygon
2. A routine to draw a horizontal line
THAT'S IT!
Horizontal line drawing is very simple in chained
(packed pixel) video modes such as VGA mode 13h. For the sake
of simplicity, I will use mode 13h as the example here.
Your horizontal line routine should definitely be in
assembler, because it is going to get called a LOT.
Preferably, it will be integrated into your poly filler so you
don't have to waste time pushing arguments and calling another
procedure.
To draw a horizontal line you need to know 4 things:
the starting and ending x values (X1, X2), the y value (Y), and
the color (C).
The first thing to do is make sure that X1 < X2. If
not, switch them before you continue.
I'm not going to cover mode 13h graphics basics here
because they are too basic. If you don't know mode 13h yet,
you have no business writing anything until you learn it! The
easiest way to draw a horizontal line in mode 13h is to
1. Determine the starting memory address of the line (A000h +
(320 * Y) + X1)
2. Determine the length of the line in pixels
3. Store (length) bytes of value (color) starting at the
starting memory address.
In order to make use of the 32 bit processor in your
machine, you will want to store doublewords instead of bytes.
This makes flat filling in mode 13h almost as fast as flat
filling in mode x. Too store doublewords, you need to
1. Perform steps 1 and 2 listed above
2. Convert your byte color value into a dword (just make it 4
bytes in a row of your original color)
3. Store (length / 4) doublewords
4. Store (length % 4) bytes (a quick way to do (length % 4)
is (length & 3))
Once you have your horizontal line routine up and
running, you need to integrate it into your scan converter.
The easiest way to do this is to make a loop from Y1 to Y2
where you draw horizontal lines between the edges of the
polygon. Here's some pseudo code
scanLongSide();
scanMidSide();
scanShortSide();
for (count = Y1; count < Y2; count++)
hline(lEdge[count], rEdge[count], count, color);
Easy as pi...
CLIPPING
We quicly run into a problem in mode 13h, images drawn
off the screen wrap around to the next scanline. This is not
very aesthetically pleasing to say the least. In protected
mode, this poses an even nastier problem. Any memory writes
outside the 64k window reserved for the VGA produce a general
protection fault, very nasty. The answer to these problems is
to "stay inside the lines" when we are drawing, to "clip" our
drawings so we only draw what is physically on screen.
When incorporated in a poly filler, the most
rudimentary form of clipping checks y values at scan conversion
time and x values when drawing horizontal lines.
Scan conversion C++ code that implements y clipping would look
somewhat like this
for (count = Y1; count < Y2; count++)
{
if ((count >= 0) && (count < 200))
edge[count] = x;
x += m;
}
Notice that you change the x value every time through the loop,
but you only store the edges that are on the screen. This
assures that the x values you get later in the loop are
accurate.
Clipping in the x direction is best if done in the
horizontal line routine, and is even easier to implement than
y clipping in the scan conversion. Once you make sure
that X1 < X2, all you need to do is substitute 0 for X1
if X1 is smaller than 0, and 319 for X2 if X2 is greater
than 0. Also, if X1 > 319 or X2 < 0 you don't need to
draw anything, the line is entirely off screen.
Some more complicated clipping that will improve your
performance significantly involves checking to see if the
polygon is on screen before you draw. Check your vertices to
see if the maximum x value is less than 0, the minimum x value
is greater than 319, the maximum y value is less than 0, or the
minimum y value is greater than 199. If any of these are true,
your polygon is completely off screen and you have no need to
scan convert or fill.
GOURAUD SHADING
If you have never heard of or seen gouraud shading,
I pity you. It is the easiest way to make your boring flat
shaded polygons come to life. In order to explain gouraud
shading, I ask you to bear with my while I digress and explain
a bit about 3d math.
According to Lambert's law, the intensity of light
falling on a plane is directly related to the angle made when
the light vector intersects the normal vector of a plane.
Lambert shading a polygon involves taking the dot product of
the normal vector and the vector of the light intersecting it,
which gives the cosine of the angle made by the intersection.
Based on this value, the color of a plane can be calculated.
I'm not going to give any further explanation than this because
I don't want to be up all night typing.
Gouraud shading is a simple extension of Lambert
shading. Instead of finding the intensity of light falling on
a plane, you determine the average intensity of light at a
vertex based on a normalized average of the normal vectors of
all the planes that share that point. Once again, I'm not
going to give any further explanation than this of the math
behind Gouraud shading. Suffice to say that Gouraud shading
uses a separate color value for each vertex of each polygon
rather than a single color for the entire plane.
After mathematically determining the color of each
vertex, the color of each point in the plane can be
approximated using linear interpolation. First, the color
values along the edges are interpolated between the color
values at each vertex. Then color values along each scanline
are interpolated between the color values at the edges of the
line
Start with /16 interpolate 16/16 interpolate 16/16
color values / / edge E/ /E scanline E/E/E
at vertices / / colors C/ /D colors C/CD/D
/ / A/ /B A/AAB/B
/ / 8/ /A 8/899A/A
/ / 6/ /8 6/67778/8
/ / 4/ /7 4/455667/7
/ / 2/ /5 2/2333445/5
/--------/ 0/--------/4 0/--------/4
0 4 001122334 001122334
step 1 step 2 step 3
Scan conversion is a form of linear interpolation in
which x values are interpolated between two known points.
Scan conversion for Gouraud shading very similar, but instead
of interpolating only x values, we interpolate x values and
colors. A Gouraud polygon filler must be given more
information than a flat filler. For each vertex, we need to
know (X, Y, C) instead of just (X, Y). As I said before, we
also need to trace color values while scan converting. This
makes the scan conversion more complicated, however, color is
traced (interpolated) in exactly the same way as x
mx = (X2 - X1) << 16;
mx /= (Y2 - Y1);
mc = (C2 - C1) << 16;
mc /= (Y2 - Y1);
x = X1 << 16; // scale the starting x value
c = C1 << 16; // scale the starting c value
for (count = Y1; count < Y2; count++)
{
edge[count] = x >> 16; // store only the integer part
color[count] = c >> 16; // store only the integer part
x += mx;
c += mc;
}
The resulting color values are stored in color lists
corresponding with the existing edge lists. Now we have two x
values and two color values for each scanline. The color
values for each scanline are not equal, so we need to
interpolate colors along each scanline as we draw. This is
done almost exactly as scan conversion is done, with the
sole exception that we use dc/dx instead of dx/dy.
The Gouraud horizontal line routine needs to be passed
x values for the start and end of each scanline (X1, X2) as
well as color values (C1, C2) and a y value (Y). Once
again, you need to make sure that X1 < X2. If you switch
X1 and X2, be sure to switch C1 and C2 as well! Here is some
C++ code for a gouraud hline routine
mc = (C2 - C1) << 8;
mc /= (X2 - X1);
c = C1 << 8;
for (count = X1; count < X2; count++)
{
setPixel(count, y, c >> 8); // set pixel at (count, y) to
// color c
c += mc;
}
Notice that I used 8.8 fixed point here. This is because you
are GUARANTEED that you will not have any color greater than
255 when you are using mode 13h, so you can shift the value
left 8 bits without any word overflow. The reason you want to
use 8.8 fixed point is so you can do a little trick when you
convert to assembler that will allow you to eliminate the shift
right
; first, get the starting memory address of the line in edi
; and the number of pixels to draw in ecx
mov edx, mc ; (dc/dx * 256)
mov ebx, C1 ; starting color
shl ebx, 8 ; C1 * 256
ghlLoop:
mov [edi], bh ; draw the upper 8 bits (integer part)
add ebx, edx ; c += mc
inc edi ; go to next screen location
dec ecx ; pixels to draw --
jnz ghlLoop
That inner loop is VERY FAST, it should almost run in one
memory wait state, which means that you get all the cpu cycles
for free while you are waiting to be able to write to memory
again.
Clipping here is implemented exactly the same as for flat
polygons.
Z-BUFFERING
Ah yes, the crux of the biscuit indeed. Z-buffering is
a technique used by more advanced 3d systems, it allows you to
do several things. First, z-buffering speeds up 3d code by
eliminating plane sorting. You can draw planes in any order,
and they will come out looking right. This is a big
advantage when drawing objects with many faces, because sorting
routines take exponentially longer to sort larger data sets.
Second, z-buffering correctly draws intersecting polygons
WITHOUT having to calculate where they intersect, which
produces some impressive effects and doesn't require any extra
calculation.
Z-buffering accomplishes these feats by interpolating z
values between vertices and scanline edges much in the same way
Gouraud shading interpolates color. The resulting z values for
each pixel on the screen are stored in a 'z-buffer' containing
(screenWidth * screenHeight) elements (64000 in mode 13h).
Before a new pixel is drawn, the z-buffer value for that
pixel is examined. If the new pixel's z-value is less than
(closer to the viewer than) the z-buffer value, the new z value
is stored in the z-buffer and the pixel is drawn to the screen.
Otherwise, the z-buffer and the screen remain unchanged.
Through this process, the pixel at each screen location which
is closest to the viewer (and therefore not obscured by
anything else) is always displayed.
Because we don't want to have a 3d world which is only
128 pixels deep, the z-buffer elements cannot be bytes; we need
to store at least 16 bits of z data in each array element.
This allows us to have a world which is 32768 pixels deep,
adequate for most applications. Right away, we see a big
problem with z-buffering, MEMORY! It takes a LOT of memory to
store all those z values, 128k for a 16 bit z-buffer in mode
13h. In my opinion, the advantages of z-buffering disappear
when you are in real mode due to the incredible amount of
memory required, so switch to protected mode before attempting
to implement z-buffering.
As I said before, z-buffering is almost identical to
Gouraud shading, with the exception that z values are
interpolated instead of color values. Of course, you need to
pass some more information to your z-buffered poly routine,
namely the z values of each vertex. When scan converting, store
z values in z lists the same way you store color values in
color lists. Here's some C++ code
mx = (X2 - X1) << 16;
mx /= (Y2 - Y1);
mz = (Z2 - Z1) << 16;
mc /= (Y2 - Y1);
x = X1 << 16; // scale the starting x value
z = Z1 << 16; // scale the starting z value
for (count = Y1; count < Y2; count++)
{
edge[count] = x >> 16; // store only the integer part
zval[count] = z >> 16; // store only the integer part
x += mx;
z += mz;
}
When you are tracing horizontal lines, the z values are
interpolated in exactly the same way as they are with Gouraud
shading. After you determine a z value, check it against the z
value stored in the z-buffer for the current screen location to
see if the current pixel is visible. If it is, draw it;
otherwise skip it and go on. Here's some C++ again
mz = (Z2- Z1) << 16;
mz /= (X2 - X1);
z = Z1 << 16;
for (count = X1; count < X2; count++)
{
if ((z >> 16) < zBuffer[y * 320 + x])
{
setPixel(count, y, c);
zBuffer[y * 320 + x] = z >> 16;
}
z += mz;
}
Of course, you don't perform the slow z-buffer address
calculation every time through the loop. Just find the
starting offset into the screen, use that to find the starting
offset into the z-buffer, and then increment the z-buffer
pointer by 2 (for words) every time through the loop. Also
make sure you use 16.16 fixed point here.
Don't forget to clear the z-buffer every time you clear
the screen. This is accomplished by filling it with 32767
(7fffh), or whatever your maximum depth is.
By now you should know how to clip, it's exactly the
same for z-buffered polygons as it is for all others.
Notice that all the above examples deal with flat
shaded polygons. This is because once you have written a
Gouraud poly filler, it will take you about a half hour to
convert it to a flat z-buffered filler if you were smart about
how you wrote your code. If you are feeling REALLY brave, you
should try writing a Gouraud shaded z-buffered poly routine;
but let me warn you - YOU _WILL_ RUN OUT OF REGISTERS! Besides
that, the code will be huge. My gouraud shaded z-buffered poly
routine was well over 1200 lines of assembler, about twice as
long as this file! But it's by no means impossible, and it's a
lot of fun just sitting and watching two gouraud shaded objects
intersecting each other when you are done.
CLOSING COMMENTS
WOW, I didn't intend to write this entire text in one
night. Oh well, at least it's done. Hopefully I have been of
some assistance to just about everyone who reads this file. I
apologize if some of the basics were too simple or if some of
the more complex points were not covered in enough detail, but
that is what you risk by catering to such a wide group of
people.
If you need any tips on optimization feel free to mail
me (mortens1@nersc.gov). The source to my flat and Gouraud
poly fillers has already been released as part of my 3d engine
(V3DT090.ZIP availible via ftp at hornet.eng.ufl.edu
/demos/code/library/graph/), and my z-buffering code will be
released shortly (as soon as I clean it up and document it so
that it is READABLE ;))
GREETS
Siri - I LOVE YOU I LOVE YOU I LOVE YOU
All of OTM - What can I say...we rule ;)
Hurricane/OTM - TED LIVES!
Zilym Limms/OTM - I can't wait to convert OP to pmode ;)
Alex Chalfin/OTM - we need to get you a handle ;)
Patrick Aalto - for explaining Gouraud shading to me
Arnold -_
Hans -_- Larry Liverwurst Natural Lavatory
Lothar -
Tran & DareDevil - for PMODE/W
All my #coders buddies...(no particular order)
Bone_
Trinel
ShadowH
bri_acid
OC
Addict
doom
StarScrm
MainFrame
BEAn_dIP
GodHead
Moomin
Zep
Druggie
Stimpee
pel
Opium
PhulAdder
Shades
BarryE
MindRape