4.1 Z-Sorting
The idea of z-sorting is that we sort polygons to an
order determined by the average of their z coordinates. It's easy to use
quicksort to achieve this, as in the following pseudo code:
- function quicksort(left, right)
- if left<right
- q = partition(left,right)
- quicksort(left,q)
- quicksort(q+1,right)
- endif
- endf
- function partition(left, right)
- - crd is the table of the rotated vertices
- - face is the polygon table
- x = crd[face[left,0],2] + crd[face[left,1],2] +
crd[face[left,2],2]
- a = left-1
- b = right+1
- loop forever:
- decrement b by one as long as (crd[face[b,0],2] +
- crd[face[b,1],2] + crd[face[b,2],2]) is less than x
- increment a by one as long as (crd[face[a,0],2] +
- crd[face[a,1],2] + crd[face[a,2],2]) is greater than x
- if a<b
- else
- break looping and return b
- endif
- endloop
- endf
Now we just draw the polygons straight from the
table.
Z-sort is really not a perfect sorting technique but it's enough for many
purposes (for example many asm97-demos used z-sort). Anyway, I suggest
learning also at least Z-buffer and maybe BSP-tree and S-buffer if you have
time and interest.
4.2 Z-buffer
Z-buffer is the easiest (but by no means the fastest) way
of doing objects which can intersect each other. Contrary to BSP-tree,
Z-buffer doesn't require splitting polygons when they intersect. BSP-tree is
unbeatable for stable scenes, though. As the name indicates, we use a buffer
(a table), into which we save the z and color values of the points. The size
of this table should be the same as the size of the screen, for example in the
MCGA mode 64000 at least 16bit elements. Z-buffer looks like this in C:
Before we start drawing anything
to the z-buffer, we need to set the buffer to the maximum value it can reach.
Then, when drawing polygons, we interpolate also z. In the inner loop we
perform a compare, and if the current z is less than the z in the z-buffer, we
set the current z as the z-buffer value and draw the pixel. If not, we
continue with the next pixel.
This check slows the routine down dramatically, but luckily we can
interpolate z just like x or anything else: only one add per per pixel (plus
the check). Finally, when all the points of all the polygons have been
checked, we draw the z-buffer into the screen.
A short piece of pseudo code from the hline:
- z = z1
- for x=x1 -> x2
- if (z < zbuffer[y*320+x])
- zbuffer[y*320+x] = z
- plot(x,y,color)
- endif
- z = z + kz ; interpolate z
- endfor
About optimizing: you don't need to reset the
z-buffer so often (never I'd say) if you interpolate 1/z and perform all the
operations above but change the compare from '<' to '>' :)
How to implement it, you ask. Easy: 1/z is always at the range 0..1, so if
we substract 1 from each value of the z-buffer, it's 'reset' (every possible
value of 1/z is greater than this value which is at the range -1..0). The best
way is to use this trick in the inner loop:
- 1st frame:
- if (new_1/z_value + 1) > zbuffer[].z
- 2nd frame:
- if (new_1/z_value + 2) > zbuffer[].z
etc.
It's strongly recommended to interpolate 1/z instead of z even if you
didn't use the trick above -- in the screen space, z isn't actually linear
when 1/z is (see 3.3.1
Perspective correction), so it's is interpolated right only when using 1/z
interpolation. So the pseudo code above changes a bit (opz = one per zed):
- opz = opz1
- for x=x1 -> x2
- if ( (opz+cur_frame) < zbuffer[y*320+x] )
- zbuffer[y*320+x] = opz
- plot(x,y,color)
- endif
- opz = opz + k_opz ; interpolate z
- endfor
4.3 BSP-Tree
BSP-tree (Binary Space Partitioning tree) is a very
hard-to-implement but great sorting technique. I'm not describing how to code
linked lists, so if you can't use them, find a good programming book and read
about it. The special case of linked lists, a binary tree, means that
every element of the tree is attached to two other elements below it in the
hierarchy:
Etcetc.
Implementing this should be straightforward so I don't believe you'll have any
problems.
4.3.1 The main idea
BSP-tree is as easy use in 2-space as in
3-space. It's easier to draw a 2D world so I'll describe the main idea using
one. Here are six lines:
(X = camera
location) These should be drawn in the right order using BSP-tree. First
we'll create the tree:
We start with the line 1. We check on which side of it the rest of the
lines are, and make a binary tree in which we put the lines depending on the
side they're located. If some lines are partly at the left, partly at the
right side of the line 1, they're split, and the two pieces are placed one
to the left, one to the right side. Then we take the line 2 and perform the
same thing with the lines 3, 4, 5, and 6, and go on like this until all the
lines have been arranged. If we want, we can stop in some certain point and
sort the remaining lines with some else sorting technique.
When we're ready to draw the lines, we perform the next operation: We
check, on which side of line 1 the camera is located. If it's on the left,
we check if there are lines at right and if there are some, we go
down the right branch. When the right branch is ready, we draw the line one
and go down the left branch. REMEMBER to draw the line 1 in the place
mentioned.
In our example, we perform the following splits:
The binary
tree looks like this:
The drawing
order of the lines in respect to the camera point in the picture is so the
following:
- 1) We're on the right side of 1. 2a is drawn first, then 1, then we go
down the right branch.
- 2) We're also on the right side of 2b, so we go down the left branch.
- 3) We're on the left of 3a, so we draw first 3a, and then go down the
left branch (no lines on the right).
- 4) We're on the right of 4. So the order is 5c, 4, 5b.
- 5) We return back to 2b and draw it, and go down the right branch
towards 3b.
- 6) We're on the left of 3b so we draw 6a and 3b and go down the right
branch.
- 7) Related to 5a we're at left, so we draw 6b and finally 5a.
So the final order is 2a,1,3a,5c,4,5b,2b,6a,3b,6b,5a. Seems to be
quite rational. I suggest testing the functioning of BSP with a paper. I
myself drew a 18-line 'world' just to clarify the idea of BSP-tree. An A4
full of strange lines and symbols X)
4.3.2 Required formulas
Neat thing that, but how to implement it?
Here I'll describe the use of all the needed formulas.
The equation of a plane is the following:
Nx * (X-ax) + Ny * (Y-ay) + Nz * (Z-az) = 0,
where N is the
normal vector of the plane, X, Y, and Z arbitrary variables, and (ax,ay,az)
one point on the plane. Now we should derive the equation of the plane going
through all the vertices of our polygon. To do it, we need the normal
vector, which can be calculated by creating two vectors from the polygon
vertices and by taking their cross product. Now we place the coordinates of
the desired point into the equation to the places of X, Y, and Z. If we get
zero as a result, the point is on the plane. If the result is negative, the
point is located on the one side of the plane, and if it's positive, it's on
the other side. This technique is used to determine on which side of a plane
the points are.
If all the vertices of a polygon aren't at the same side of a plane, the
plane and the polygon intersect each other. In this kind of situation we
need to calculate the intersecting points and with the help of those form
two new polygons which are on different sides of the plane. To do this, the
equation of a line in 3-space is required:
- X = X1 + (X2-X1)t
- Y = Y1 + (Y2-Y1)t
- Z = Z1 + (Z2-Z1)t
This line goes through the points
(X1,Y1,Z1) and (X2,Y2,Z2) t getting all real values. When we place these X,
Y, and Z to the places of X, Y, and Z in the plane equation, we've get the
coordinates of the intersection point:
- Nx*(X1+(X2-X1)t-ax) +
- Ny*(Y1+(Y2-Y1)t-ay) +
- Nz*(Z1+(Z2-Z1)t-az) = 0
We solve t:
- Nx*(X2-X1)t + Ny*(Y2-Y1)t + Nz*(Z2-Z1)t =
- Nx*(X1-ax) + Ny*(Y1-ay) + Nz*(Z1-az)
- t * ( Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1) ) =
- Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
- Nx*(ax-X1) + Ny*(ay-Y1) + Nz*(az-Z1)
- t = --------------------------------------
- Nx*(X2-X1) + Ny*(Y2-Y1) + Nz*(Z2-Z1)
Luckily we need to use this only in the init part... X)
4.3.3 Hints
1) Calculating the plane equation is so slow it'd maybe
be better to calculate the normals just at the beginning of the program and
rotate them among vertices.
2) The easiest way is seldom the fastest. It would maybe be better to
find the alternative with the least number of polygon splits -- the number
of polygons can easily be reduced to a half which speeds up the thing
dramatically.
4.4 S-buffer
S-buffer (or scanline buffer or segmented buffer or span
buffer or... :) is an enhanced -- and remarkably faster -- version of
z-buffer, where we sort scanlines rather than single pixels. In the form I'm
describing here, it was developed by Paul Nettle of Hot Wax Software (_-_ at
#coders), but some people had used kind of an s-buffer many years before
Nettle developed his version. In any case, we can get rid of a great amount of
calculations by using s-buffer; in most of the horizline calls we need only
compare the ending coordinates. So it looks like a very interesting way of
sorting polygons.
So we have a table or a linked list into which we save the data of
every hline call. We compare them to each other, act accordingly, and finally
draw the visible scanlines (or parts of them). For example the following type
of table works (for texturemapping, pmode):
- typedef struct
- {
- short xb,xe,zb,ze; // x_begin, x_end, ...
- byte ub,ue,vb,ve;
- long kz,ku,kv; // kz = (ze-zb)*65536/(xe-xb) etc
- unsigned char used; // if 0 -> not used
- } sbuf_t;
- sbuftype sbuf[200][100]; // 200 lines in mode 320x200,
- // max. 100 segments / line
Note! The segments are
ordered by the size of the x coordinates from left to right. This kind of
table consumes memory 200*100*28=560000 bytes (due to alignment), and the more
complicated the world is, the greater should the maximum number of segments
be. I recommend using a linked list instead of a table; there's no maximum
number of segments, and adding a segment between two other segments is far
easier.
An s-buffered polygon routine takes also the z coordinates of the points
(which have already been transformed into 2D) as parameters. In the outer
loop, we interpolate also z coordinates. An sbuf-hline doesn't actually draw
anything but it adds a new scanline into the s-buffer, and not until the
s-buffer is ready the whole lot is drawn into the screen, just like with
z-buffer.
Segments on the same horizontal line can be located in many ways related to
each other, so they must be compared and act accordingly. There are six ways
in which segments can be located related to each other (old line = v, new line
= u):
With these six cases we can present all the
possible locations related to each other, also special cases (like one pixel
segments and segments located to equal (x,y) coordinates). Humm... Paul
mentioned in #coders that his own s-buffer requires only two or three compares
in the hline! I would be interested to see the implementation. Paul has
written a document describing his technique but he hasn't released it yet (so
as not to annoy his employees in some weird way...) So everyone flood his
email until he gives up and finally releases the doc! ;) [addition: the
technique is quite probably this: he uses BSP-tree for stabile parts of the
scene, then he inserts these pre-sorted parts into a s-buffer front->back,
then uses normal s-buffer for moving objects and moving parts of the scene.
The bsp-tree->s-buffer insertion routine can be optimized ultra fast; a
scanline is added if and only if there's no segment in the s-buffer already in
those coordinates.]
In the case 1 we just add the segment into the buffer before the
next segment and return to the main polygon routine (all the segments to the
left have already been checked; the list or table is ordered left ->
right).
In the case 2 we try the next old segment if it exists. If not, we add the
segment into the buffer as the last one.
In the case 3 we split the segment at the end of the old segment (remember
to calculate the new values of z, u, and v), compare the left part to the old
segment, and add it into the buffer if needed. Then we continue with the right
part and the next old segment.
In the case 4 we split the segment at the beginning of the old segment, add
the left part into the buffer before the old segment, and act with the right
part like above with the left part.
In the case 5 we split the segment at both ends of the old scanline and act
like in the cases 3 and 4.
In the case 6 we act directly like with the left part in the case 3.
Here are routines to empty and flip a s-buffer using the sbuf table above.
- - tmap is 256x256-sized bitmap
- function flip_sbuf
- integer i,j,k
- integer du,dv
- for i=0 -> 199
- j = 0
- while sbuf[i][j].used<>0 and j<100
- du = 0
- dv = 0
- for k=sbuf[i][j].xb -> sbuf[i][j].xe
- putpixel(k,i,tmap[sbuf[i][j].ub+du/65536+
- (sbuf[i][j].vb+dv/65536)*256])
- du = du + sbuf[i][j].ku
- dv = dv + sbuf[i][j].kv
- endfor
- j = j + 1
- endwhile
- endfor
- endf
- function clear_sbuf()
- integer i,j
- for i=0 -> 199
- j = 0
- while sbuf[i][j].used<>0 and j<200
- sbuf[i][j].used=0
- j = j + 1
- endwhile
- endfor
- endf