Tiled Texture Mapping for pow2 Texture Sizes
---------------------
by
TheGlide/SpinningKids
Milan, Italy - June 1st, 1998
INTRODUCTION:
-------------
I assume here you know the basics of texture mapping, as explained in
fatmap and fatmap2 docs by MRI/Doomsday.
This doc is about texture mapping with texture maps stored as tiles,
namely 8x8 pixel tiles. Storing the maps this way can greatly improve
cache behaviour. Most of the time we traverse the texture along
non-horizontal lines, and this causes many cache misses. The worst case
is a vertical traversal: each texel we access lies on a different row, so
every access forces the processor to load a whole new cache line. And this
is very slow.
Storing the texture in 8x8 tiles ensures that every tile fits in two 32-byte
cache lines (on the Pentium), so as we traverse the texture we have a
greater chance of reading from the same cache line for a longer time.
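To put a number on this, here is a small sketch that counts how many distinct 32-byte cache lines a vertical walk touches in each layout. It is only an illustration: the function names are made up, the map is assumed to start on a cache line boundary, and the tiles are taken in plain row-major order (the actual tile orders are discussed in the next sections).

```c
#include <stddef.h>

/* Offset of texel (u,v) in a plain row-major 256x256 map. */
static size_t linear_offset(unsigned u, unsigned v) {
    return (size_t)v * 256 + u;
}

/* Offset of texel (u,v) when each 8x8 tile occupies 64 consecutive bytes.
   Tiles are taken in row-major order here, purely for illustration. */
static size_t tiled_offset(unsigned u, unsigned v) {
    size_t tile = (size_t)(v >> 3) * 32 + (u >> 3);
    return tile * 64 + (v & 7) * 8 + (u & 7);
}

/* Count the distinct 32-byte cache lines touched by a vertical walk of
   `count` texels (count <= 256) starting at (u,v), wrapping at 256. */
static unsigned lines_touched(size_t (*offset)(unsigned, unsigned),
                              unsigned u, unsigned v, unsigned count) {
    size_t seen[256];
    unsigned nseen = 0;
    for (unsigned i = 0; i < count && i < 256; i++) {
        size_t line = offset(u, (v + i) & 255) / 32;
        unsigned j;
        for (j = 0; j < nseen; j++)
            if (seen[j] == line) break;
        if (j == nseen) seen[nseen++] = line;   /* first visit of this line */
    }
    return nseen;
}
```

Walking 8 texels down column 5 from the top row, lines_touched() reports 8 lines for the linear layout and 2 for the tiled one.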
Let's assume for the moment that you have 256x256 textures.
So the u and v coordinates take up 8 bits.
u : xxxxxxxx
v : xxxxxxxx
TILING - METHOD 1:
------------------
The first way to tile the map in 8x8 tiles is this one:
---------------------------------
| 0 | 1 | 2 | 3 | 4 | ....
---------------------------------
| 32 | 33 | 34 | 35 | 36 | ....
---------------------------------
| 64 | 65 | ....
---------------------------------
where numbers 0... indicate the order by which the 8x8 tiles are stored in
memory.
This way we can go from the original u v coordinates to the ones in
the tiled map with the following:
u : xxxxxXXX -> u' = 00000xxxxx000XXX
v : xxxxxXXX -> v' = xxxxx00000XXX000
That is, the lower 3 bits of both u and v (XXX) are used to address the texel
inside a single tile, whereas the 5 upper bits are used to select the
tile. The C code to convert normal texture coordinates (u,v) to
tiled-texture coordinates is the following:
u' = (u&0x7)|((u<<3)&0x7c0);
v' = ((v<<3)&0x38)|((v<<8)&0xf800);
This code enables us to convert a straight texture to a tiled texture:
tiledtmap [u'+v'] = tmap [u+v*256]
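As a minimal sketch, the conversion above could be wrapped up like this. The names tile1_offset and tile1_convert are mine, and I assume 8-bit texels and two separate 65536-byte buffers.

```c
/* Method 1 offset: tiles stored row by row, 32 tiles per row.
   u and v must be in the range 0..255. */
static unsigned tile1_offset(unsigned u, unsigned v) {
    unsigned ut = (u & 0x7) | ((u << 3) & 0x7c0);
    unsigned vt = ((v << 3) & 0x38) | ((v << 8) & 0xf800);
    return ut + vt;
}

/* Copy a straight 256x256 map into a tiled one.
   Both buffers hold 65536 bytes and must not overlap. */
static void tile1_convert(unsigned char *tiledtmap, const unsigned char *tmap) {
    for (unsigned v = 0; v < 256; v++)
        for (unsigned u = 0; u < 256; u++)
            tiledtmap[tile1_offset(u, v)] = tmap[u + v * 256];
}
```

For instance texel (8,0) lands at offset 64, the start of tile 1, and texel (0,8) lands at offset 2048, the start of tile 32.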
TILING - METHOD 2 - THE BETTER METHOD:
--------------------------------------
But there's another way to tile a texture map. This one:
---------------------------------
| 0 | 32 | 64 | 96 | ....
---------------------------------
| 1 | 33 | 65 | 97 | ....
---------------------------------
| 2 | 34 | ...
---------------------------------
| 3 | ....
---------------------------------
With this tiling method, the mapping from the u v of the original
map to the u' v' of the tiled map is:
u : xxxxxXXX -> u' = xxxxx00000000XXX
v : xxxxxXXX -> v' = 00000xxxxxXXX000
The corresponding C code is:
u' = (u&0x7)|((u<<8)&0xf800);
v' = (v<<3);
and as before it can be readily plugged in a converter from straight
textures to tiled textures.
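Here is what such a converter might look like for the second method, plus a quick sanity check that every tiled offset is hit exactly once. Again just a sketch: the function names are mine, and 8-bit texels in 65536-byte buffers are assumed.

```c
#include <string.h>

/* Method 2 offset: tiles stored column by column, 32 tiles per column.
   u and v must be in the range 0..255. */
static unsigned tile2_offset(unsigned u, unsigned v) {
    unsigned ut = (u & 0x7) | ((u << 8) & 0xf800);
    unsigned vt = v << 3;
    return ut + vt;
}

/* Copy a straight 256x256 map into a tiled one (buffers must not overlap). */
static void tile2_convert(unsigned char *tiledtmap, const unsigned char *tmap) {
    for (unsigned v = 0; v < 256; v++)
        for (unsigned u = 0; u < 256; u++)
            tiledtmap[tile2_offset(u, v)] = tmap[u + v * 256];
}

/* Returns 1 if the mapping is a permutation of 0..65535, 0 otherwise. */
static int tile2_is_permutation(void) {
    static unsigned char hit[65536];
    memset(hit, 0, sizeof hit);
    for (unsigned v = 0; v < 256; v++)
        for (unsigned u = 0; u < 256; u++) {
            unsigned o = tile2_offset(u, v);
            if (hit[o]) return 0;       /* two texels share one offset */
            hit[o] = 1;
        }
    return 1;
}
```

Here texel (0,8) lands at offset 64 (tile 1 sits below tile 0) and texel (8,0) at offset 2048 (tile 32 sits to the right of tile 0).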
The code really 'looks better' than the first. It is easier and faster to
convert from v to v'. That's why we will choose this second method.
Now, we could simply take our usual tmap scanline filler, put those relations
inside the inner loop, and see the result. Slooow.
At the expense of a little setup overhead, we can get an inner loop that is
really small and tight. So how can we use u' and v' directly in the loop,
together with the corresponding du' and dv', and read from the tiled texture?
We convert all of our starting u and v, and the corresponding deltas (du,dv),
that are calculated in the tmapper before entering the inner loop:
(all quantities are in 8.16 fixed point format; xxx is the integer part,
XXX is the fractional part, and the lowercase x that ends the fractional
part is its lowest bit, which gets dropped):
u  : xxxxxxxx,XXXXXXXXXXXXXXXx ->  u' = xxxxx00000000xxx,0XXXXXXXXXXXXXXX
v  : xxxxxxxx,XXXXXXXXXXXXXXXx ->  v' = 00000xxxxxxxx000,0XXXXXXXXXXXXXXX
du : xxxxxxxx,XXXXXXXXXXXXXXXx -> du' = xxxxx11111111xxx,1XXXXXXXXXXXXXXX
dv : xxxxxxxx,XXXXXXXXXXXXXXXx -> dv' = 00000xxxxxxxx111,1XXXXXXXXXXXXXXX
We have to fill the gaps in du'/dv' with 1 because when we add them to the
current u'/v' values we must propagate the carry from the lower bits to the
bits that lie after the gap. After the addition we must not forget to mask
out the 1s from the u'/v' we obtain.
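The carry trick is easier to see on the integer part alone. The sketch below (names are mine, fractional bits left out for simplicity) steps a packed 16-bit u' by a gap-filled du' and checks that the result matches packing the plain sum.

```c
/* Pack an integer u (0..255) into the 16-bit u' layout xxxxx00000000XXX. */
static unsigned pack_u(unsigned u) {
    return (u & 0x7) | ((u << 8) & 0xf800);
}

/* One step: fill the 8-bit gap of du' with 1s, add, then clear the gap. */
static unsigned step_u(unsigned up, unsigned du) {
    unsigned dup = pack_u(du) | 0x07f8;   /* bits 3-10 set to 1 */
    return (up + dup) & 0xf807;           /* keep only the two u' fields */
}

/* Returns 1 if stepping in packed form agrees with (u+du)&255 for all inputs. */
static int gap_fill_matches_plain_add(void) {
    for (unsigned u = 0; u < 256; u++)
        for (unsigned du = 0; du < 256; du++)
            if (step_u(pack_u(u), du) != pack_u((u + du) & 255))
                return 0;
    return 1;
}
```

Take u = 7 and du = 1: with the gap filled we get 0x0007 + 0x07f9 = 0x0800, which is exactly pack_u(8). Without the 1s the sum would be 0x0008, the carry would die in the gap, and the mask would leave 0.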
Of the 16-bit fractional part we keep only the upper 15 bits. There's a
valid reason for this: when calculating the offset to access the texel
we add u' and v' and shift right by 16. If we kept all of the fractional
bits, a carry out of the sum of the two fractional parts would propagate
into the integer part, thus corrupting the offset value. By keeping only
the upper 15 bits of the fractional part, and putting a 1-bit gap between
the fractional and integer parts, the problem solves itself. If this
explanation seems obscure, look at the 'picture' of u'/v' above.
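A tiny worked case, as a sketch (texel_offset is a name I made up for the offset computation of the inner loop):

```c
#include <stdint.h>

/* Texel offset as computed in the inner loop: add both packed values
   and drop the low 16 bits. */
static uint32_t texel_offset(uint32_t up, uint32_t vp) {
    return (up + vp) >> 16;
}
```

With two 15-bit fractions at their maximum, 0x7fff + 0x7fff = 0xfffe still fits below bit 16, so the offset stays 0. With two full 16-bit fractions, 0xffff + 0xffff = 0x1fffe, and the offset would wrongly become 1 although both integer parts are 0.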
Now, a hypothetical tiled tmap scanline filler would look like:
void tiledtmapline (unsigned int u, unsigned int v,
                    unsigned int du, unsigned int dv,
                    int run, unsigned char * vid, const unsigned char * tmap) {
    // on entry u,v,du,dv are in 8.16 format; they are taken as unsigned so
    // that the shifts and the wrap-around of the additions are well defined
    u  = (( u<<8)&0xf8000000)|( u&0x70000)|(( u>>1)&0x7fff);
    du = ((du<<8)&0xf8000000)|(du&0x70000)|((du>>1)&0x7fff)|0x7f88000;
    v  = (( v<<3)&0x07f80000)|(( v>>1)&0x7fff);
    dv = ((dv<<3)&0x07f80000)|((dv>>1)&0x7fff)|0x78000;
    vid+=run;                      // point past the span, index with run<0
    for (run=-run;run;run++) {
        *(vid+run) = tmap [(u+v)>>16];
        u =(u+du)&0xf8077fff;      // addition + masking out the 1s in the gaps
        v =(v+dv)&0x07f87fff;      // same as above
    }
}
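One way to gain some confidence in the bit juggling is to compare the tiled filler against a slow reference that steps plain 8.16 coordinates and converts them per pixel. The harness below is only a sketch under my own assumptions: all names are made up, everything is done on uint32_t, and the texture is filled with an arbitrary pattern. Since the lowest fractional bit is dropped, an exact match is only guaranteed when that bit is 0 in u, v, du and dv.

```c
#include <stdint.h>
#include <string.h>

/* Method 2 offset of texel (u,v) in a 256x256 map, u and v in 0..255. */
static unsigned tiled_index(unsigned u, unsigned v) {
    return ((u & 0x7) | ((u << 8) & 0xf800)) + (v << 3);
}

/* Reference filler: straight 8.16 stepping, conversion done per pixel. */
static void ref_line(uint32_t u, uint32_t v, uint32_t du, uint32_t dv,
                     int run, unsigned char *vid, const unsigned char *tmap) {
    for (int i = 0; i < run; i++) {
        vid[i] = tmap[tiled_index((u >> 16) & 255, (v >> 16) & 255)];
        u += du;
        v += dv;
    }
}

/* Tiled filler: same arithmetic as in the text, written on uint32_t. */
static void tiled_line(uint32_t u, uint32_t v, uint32_t du, uint32_t dv,
                       int run, unsigned char *vid, const unsigned char *tmap) {
    u  = ((u  << 8) & 0xf8000000u) | (u  & 0x70000u) | ((u  >> 1) & 0x7fffu);
    du = ((du << 8) & 0xf8000000u) | (du & 0x70000u) | ((du >> 1) & 0x7fffu)
         | 0x07f88000u;
    v  = ((v  << 3) & 0x07f80000u) | ((v  >> 1) & 0x7fffu);
    dv = ((dv << 3) & 0x07f80000u) | ((dv >> 1) & 0x7fffu) | 0x00078000u;
    for (int i = 0; i < run; i++) {
        vid[i] = tmap[(u + v) >> 16];
        u = (u + du) & 0xf8077fffu;   /* add, then clear the 1s in the gaps */
        v = (v + dv) & 0x07f87fffu;
    }
}

/* Returns 1 if both fillers draw the same 64 pixels for the given span. */
static int fillers_agree(uint32_t u, uint32_t v, uint32_t du, uint32_t dv) {
    static unsigned char tmap[65536];
    unsigned char a[64], b[64];
    for (unsigned i = 0; i < 65536; i++)
        tmap[i] = (unsigned char)(i * 31 + (i >> 8));   /* arbitrary pattern */
    ref_line(u, v, du, dv, 64, a, tmap);
    tiled_line(u, v, du, dv, 64, b, tmap);
    return memcmp(a, b, 64) == 0;
}
```

A negative delta is passed as its two's complement value, e.g. 0xffff4000 for -0.75; wrapping then comes for free.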
EXTENDING TO POW2 TEXTURES:
---------------------------
Now comes the cool part. We will extend all the formulas we have developed
to other texture dimensions (actually always power of 2). Let's look at the
u' and v' formats:
                     111111
                     5432109876543210
u : xxxxxXXX -> u' = xxxxx00000000XXX
v : xxxxxXXX -> v' = 00000xxxxxXXX000
Bits 0-2 of u' and bits 3-5 of v' are the coordinates inside a single
8x8 tile. Since we always use 8x8 tiles, those fields won't change in
bit width. Let's look at the remaining 5 bits of u' (bits 11-15) and
v' (bits 6-10). 5 bits are needed to address 32 tiles, and
32 tiles * 8 pixels = 256 pixels.
It takes only a minute to see that by varying the number of those bits we
can account for different texture sizes. With 4 bits we get 16 tiles, that
is a 16*8 = 128 pixel wide/high texture. Here are a couple of cases to
make everything clearer:
128x128 tiled map ( = 16tiles x 16 tiles):
u' = 00xxxx0000000XXX
v' = 000000xxxxXXX000
64x64 tiled map ( = 8tiles x 8tiles):
u' = 0000xxx000000XXX
v' = 0000000xxxXXX000
and so on.
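The bit pictures above can be condensed into one integer-only helper. This is a sketch of mine, not code from a real mapper; the parameter bits is the number of bits used to select a tile along one axis (5 for 256x256, 4 for 128x128, 3 for 64x64 and so on).

```c
/* Tiled offset of texel (u,v) in a square map of (8<<bits) texels per side,
   using the second tiling method. Coordinates wrap at the map size. */
static unsigned tiled_offset_pow2(unsigned u, unsigned v, unsigned bits) {
    unsigned size_mask = (8u << bits) - 1;
    u &= size_mask;
    v &= size_mask;
    /* low 3 bits of u stay in place, the tile bits of u go above all of v' */
    unsigned ut = (u & 0x7) | ((u >> 3) << (6 + bits));
    unsigned vt = v << 3;
    return ut + vt;
}
```

For a 128x128 map, texel (8,0) lands at offset 1024, that is one full column of 16 tiles of 64 bytes each.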
So how can we handle all those cases in the formulas we wrote above ? Easy:
we simply need a parameter that tells us the number of bits for the
'inter-tile' addressing, and the corresponding mask. In formulas this will
look like:
// u,v,du,dv 16.16 fixed point quantities
// bits = tile addressing bits
// mask = tile addressing bit mask
ushift = (3+bits);
umask = (mask<<(16+6+bits));
vmask = (mask<<(16+6))|0x380000;
dumask = vmask|0x8000;
u = (( u<<ushift)&umask)|( u&0x70000)|(( u>>1)&0x7fff);
du = ((du<<ushift)&umask)|(du&0x70000)|((du>>1)&0x7fff)|dumask;
v = (( v<<3)&vmask)|(( v>>1)&0x7fff);
dv = ((dv<<3)&vmask)|((dv>>1)&0x7fff)|0x78000;
and that's all.
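As a quick check, the setup can be wrapped in a helper and compared against the constants of the 256x256 case. The struct and function names below are hypothetical, and I derive mask from bits as (1<<bits)-1.

```c
#include <stdint.h>

/* Hypothetical container for the per-texture constants. */
struct tile_setup {
    unsigned ushift;
    uint32_t umask, vmask, dumask;
};

/* bits = number of tile addressing bits per axis (0..5). */
static struct tile_setup make_tile_setup(unsigned bits) {
    struct tile_setup s;
    uint32_t mask = (1u << bits) - 1;
    s.ushift = 3 + bits;
    s.umask  = mask << (16 + 6 + bits);
    s.vmask  = (mask << (16 + 6)) | 0x380000u;
    s.dumask = s.vmask | 0x8000u;
    return s;
}

/* Pack a fixed point u coordinate; a delta additionally needs |dumask. */
static uint32_t pack_u_fixed(uint32_t u, unsigned bits) {
    struct tile_setup s = make_tile_setup(bits);
    return ((u << s.ushift) & s.umask) | (u & 0x70000u) | ((u >> 1) & 0x7fffu);
}

/* Pack a fixed point v coordinate; a delta additionally needs |0x78000. */
static uint32_t pack_v_fixed(uint32_t v, unsigned bits) {
    struct tile_setup s = make_tile_setup(bits);
    return ((v << 3) & s.vmask) | ((v >> 1) & 0x7fffu);
}
```

With bits = 5 this gives ushift = 8, umask = 0xf8000000, vmask = 0x07f80000 and dumask = 0x07f88000, the same constants used in the 256x256 filler.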
Here are the correct bits & mask values for the different texture sizes:
          bits   mask
256x256    5     0x1f
128x128    4     0xf
64x64      3     0x7
32x32      2     0x3
16x16      1     0x1
8x8        0     0
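If you prefer not to hardcode the table, both values can be derived from the texture size. A small sketch, with helper names of my own choosing and square textures assumed:

```c
/* Tile addressing bits for a square texture of `size` texels per side.
   Returns -1 unless size is a power of 2 between 8 and 256. */
static int tile_bits_for_size(unsigned size) {
    if (size < 8 || size > 256 || (size & (size - 1)) != 0)
        return -1;
    int bits = 0;
    while ((8u << bits) < size)
        bits++;
    return bits;
}

/* Tile addressing mask for a given number of bits (0..5). */
static unsigned tile_mask_for_bits(int bits) {
    return (1u << bits) - 1;
}
```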
The inner loop then looks like:
innerumask = umask|0x77fff;
innervmask = vmask|0x07fff;
vid+=run;
for (run=-run;run;run++) {
*(vid+run) = tmap [((unsigned int)(u+v)>>16)];
u =(u+du)&innerumask;
v =(v+dv)&innervmask;
}
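Putting setup and inner loop together, a complete general filler might look like the sketch below, checked against a slow per-pixel reference. All names are mine, the arithmetic is done on uint32_t, and the test texture is an arbitrary pattern. As before, an exact match with the reference is only guaranteed when the lowest fractional bit of u, v, du and dv is 0, because that bit is dropped.

```c
#include <stdint.h>
#include <string.h>

/* General tiled filler for a square map of (8<<bits) texels per side;
   u, v, du, dv are 16.16 fixed point, bits is in the range 0..5. */
static void tiled_line_pow2(uint32_t u, uint32_t v, uint32_t du, uint32_t dv,
                            int run, unsigned char *vid,
                            const unsigned char *tmap, unsigned bits) {
    uint32_t mask       = (1u << bits) - 1;
    unsigned ushift     = 3 + bits;
    uint32_t umask      = mask << (16 + 6 + bits);
    uint32_t vmask      = (mask << (16 + 6)) | 0x380000u;
    uint32_t dumask     = vmask | 0x8000u;
    uint32_t innerumask = umask | 0x77fffu;
    uint32_t innervmask = vmask | 0x07fffu;

    u  = ((u  << ushift) & umask) | (u  & 0x70000u) | ((u  >> 1) & 0x7fffu);
    du = ((du << ushift) & umask) | (du & 0x70000u) | ((du >> 1) & 0x7fffu)
         | dumask;
    v  = ((v  << 3) & vmask) | ((v  >> 1) & 0x7fffu);
    dv = ((dv << 3) & vmask) | ((dv >> 1) & 0x7fffu) | 0x78000u;

    for (int i = 0; i < run; i++) {
        vid[i] = tmap[(u + v) >> 16];
        u = (u + du) & innerumask;
        v = (v + dv) & innervmask;
    }
}

/* Reference: plain 16.16 stepping, tiled offset computed per pixel. */
static void ref_line_pow2(uint32_t u, uint32_t v, uint32_t du, uint32_t dv,
                          int run, unsigned char *vid,
                          const unsigned char *tmap, unsigned bits) {
    unsigned size_mask = (8u << bits) - 1;
    for (int i = 0; i < run; i++) {
        unsigned iu = (u >> 16) & size_mask, iv = (v >> 16) & size_mask;
        vid[i] = tmap[((iu & 7) | ((iu >> 3) << (6 + bits))) + (iv << 3)];
        u += du;
        v += dv;
    }
}

/* Returns 1 if both fillers draw the same 100 pixels for this map size. */
static int pow2_fillers_agree(unsigned bits) {
    static unsigned char tmap[65536];
    unsigned char a[100], b[100];
    unsigned texels = (8u << bits) * (8u << bits);
    for (unsigned i = 0; i < texels; i++)
        tmap[i] = (unsigned char)(i * 7 + (i >> 6));    /* arbitrary pattern */
    ref_line_pow2(0x00051234u, 0x00fe8000u, 0x0001c000u, 0xfffe8000u,
                  100, a, tmap, bits);
    tiled_line_pow2(0x00051234u, 0x00fe8000u, 0x0001c000u, 0xfffe8000u,
                    100, b, tmap, bits);
    return memcmp(a, b, 100) == 0;
}
```

In a real mapper the seven constants at the top would of course be computed once per texture and not once per span.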
And there you have it! That's a tiled texture mapper ready to handle any
power of 2 texture size, subdivided in 8x8 tiles. Obviously ushift, umask,
vmask, innerumask and innervmask do not need to be recalculated at each
scanline, as they depend solely on the dimensions of the texture. A little
overhead still remains, though: the conversion of u, v, du and dv has to be
done for every span, which matters especially when you use this scanline
filler in a perspective correct tmapper that linearly interpolates every
16 pixels.
One last thing to note is that wrapping is still allowed with this method.
MORE EXTENSIONS:
----------------
An obvious limit of the method I presented is that you can apply it to
textures with a maximum dimension of 256x256 texels. Extending beyond this
limit is not a problem: you only have to trade some bits from the fractional
part, so they can be used to address more texels :)
GREETS:
-------
.MRI / Doomsday:
because I was introduced to this subject from his fatmapX docs.
.Crossbone / Suburban Creations:
for patiently beta-testing this doc, since I wrote it even before
actually writing the code :)
.Vipa / Purple
Some Italian greets now:
.Pan / SpinningKids:
fine, tiling may not be trendy, but it looks really cool :)
.Junta / SpinningKids:
now you see why I never write... I'm busy writing big coding
articles and making a fool of myself around the world :)
.Ghe & Blade / Absurd:
code and make yourselves heard!
BYE BYE:
--------
I would like to hear your comments, suggestions and, most of all, corrections
to this document.
That's all for now.
Ciao,
<> Luca Gerli
<> TheGlide / SpinningKids
<> email: gerli@ipeca8.elet.polimi.it
<> email: luca.gerli@usa.net (preferred after July '98)
--End of Doc--