Maggie 26: Falcon Programming
DSP M-Registers
This article is in response to No of Escape who wanted some info about
how the M-registers worked on the DSP, particularly when using ring-
buffers (that is, a buffer that automatically loops around when you
reach the end).
Once you get the idea, using the registers is pretty easy, so I'll
launch straight in. Then I'll introduce some code to demonstrate the
idea.
M-Registers: Basics
According to Motorola, "M-Register" stands for "modifier register". An
m-register's job is to take the address that an instruction calculates
and automatically "modify" it, so that a different address is the one
that is actually used.
There are 8 registers, named m0 to m7. Each one is coupled with the
respective r-register, so m0 refers to r0 and so on. This is the same
as the offset 'n'-registers. Each m-register is a 16-bit value.
There are 6 modes of addressing that an address register can use,
which are affected by the m-register. Here they are:-
Type                     Syntax    Address fetched   New value of r0
                                   (if using move)   (after pipelining)
Postincrement by 1       (r0)+     r0                r0 + 1
Postdecrement by 1       (r0)-     r0                r0 - 1
Postincrement by offset  (r0)+n0   r0                r0 + n0
Postdecrement by offset  (r0)-n0   r0                r0 - n0
Indexed by offset        (r0+n0)   r0 + n0           r0
Predecrement by 1        -(r0)     r0 - 1            r0 - 1
Each instruction therefore calculates two addresses. The third column is
the effective address where data is fetched from; the fourth column is
the value left in r0 after the instruction has executed and the
pipelining has taken effect.
When an m-register holds one of the special values described below, it
changes how both of these addresses are calculated.
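As a cross-check of the table, here is a minimal Python sketch (a model
to play with on a desktop machine, not DSP code) of the six modes with
the m-register left at its linear value. The name address_mode and the
sample values r0 = 100, n0 = 3 are assumptions made for illustration.

```python
MASK = 0xFFFF  # address registers are 16 bits wide

def address_mode(syntax, r0, n0=0):
    """Return (address fetched, new r0) for one mode, linear modifier only."""
    table = {
        "(r0)+":   (r0,      r0 + 1),
        "(r0)-":   (r0,      r0 - 1),
        "(r0)+n0": (r0,      r0 + n0),
        "(r0)-n0": (r0,      r0 - n0),
        "(r0+n0)": (r0 + n0, r0),
        "-(r0)":   (r0 - 1,  r0 - 1),
    }
    fetched, new_r0 = table[syntax]
    # Keep both results inside the 16-bit address range.
    return fetched & MASK, new_r0 & MASK

for syntax in ("(r0)+", "(r0)-", "(r0)+n0", "(r0)-n0", "(r0+n0)", "-(r0)"):
    print(syntax, address_mode(syntax, r0=100, n0=3))
```

Note that only the indexed and predecrement modes fetch from somewhere
other than r0 itself, and only the indexed mode leaves r0 alone.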
M-Registers: Linear Operation
Normally an m-register has the value of -1, or $FFFF. This means that it
leaves all effective addresses unchanged. This is called the "linear
modifier" by Motorola.
M-Registers: Modulo Operation
This is the mode used for ring buffers. Here the m-register holds a value
between 1 and 32767. All effective addresses are then forced to lie
between a lower and an upper boundary address.
Calculating the bound addresses
Let us assume that we want a ring buffer of size M, where M = 21.
Value in m-register = (M - 1) = 21 - 1 = 20
Lower Boundary
The lower boundary must be a base address L whose lowest k bits are all
zero.
'k' is the lowest value for which 2^k >= M.
Another way of thinking of this is to take the lowest value in the
sequence 2,4,8,16,32,64,128,256...32768 which is greater than or equal
to M.
So for our example 32 is the first value that is at least 21 (so k = 5).
This means that the lower boundary of our range must be a multiple of 32,
for example 0,32,64,96,128 etc.
Upper Boundary
The upper boundary is now (L + M - 1), since the base address is L and the
size must be M.
Setting the boundaries
Once we have set the size of the ring buffer, the lower boundary is taken
from the address "r"-register: it is the value of the r-register with its
lowest k bits cleared.
Let's say that we want our ring buffer to start at address 96.
        move    #20,m0          ;ring buffer size 21
        move    #96,r0          ;start of buffer is now 96
However (and this is important) our buffer still starts at 96 if we
point r0 somewhere inside it, since 100 with its lowest 5 bits cleared
is 96:
        move    #20,m0          ;ring buffer size 21
        move    #100,r0         ;r0 = 100, but the buffer still starts at 96
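The boundary arithmetic can be checked away from the DSP with a short
Python sketch. This is only a model of the rules given above, and the
function name ring_bounds is an assumption made for illustration.

```python
def ring_bounds(r, m_reg):
    """Return (k, lower boundary, upper boundary) for pointer r."""
    size = m_reg + 1                  # buffer size M is the m-register plus 1
    k = (size - 1).bit_length()       # lowest k for which 2**k >= size
    lower = r & ~((1 << k) - 1)       # clear the lowest k bits of r
    upper = lower + size - 1
    return k, lower, upper

print(ring_bounds(96, 20))    # (5, 96, 116)
print(ring_bounds(100, 20))   # (5, 96, 116)
```

Both calls give the same buffer, 96 to 116, because only the upper bits
of r0 decide where the buffer starts.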
For example, the in-built sine table has 256 entries and exists at
address Y:$100:
        move    #$ff,m0         ;ring buffer size 256
        move    #$100,r0        ;r0 points to the start of the sine table
In addition, the equivalent cosine table starts at $140, runs to $1ff
and then "wraps round" back to $100 to end at $13f. We can handle the
wrapping part automatically using:-
        move    #$ff,m1         ;ring buffer size 256
        move    #$140,r1        ;r1 starts a quarter of the way in
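To see the cosine wrap happen, here is a rough Python model of reading
all 256 entries through (r1)+ with m1 = $ff. The helper name
post_increment and the list visited are assumptions for illustration;
the DSP itself does all of this in hardware.

```python
def post_increment(r, m_reg):
    """Model the new pointer value after (rN)+ under modulo addressing."""
    size = m_reg + 1
    k = (size - 1).bit_length()       # lowest k for which 2**k >= size
    lower = r & ~((1 << k) - 1)       # lower boundary of the ring buffer
    return lower + (r - lower + 1) % size

r1 = 0x140                            # cosine starts a quarter of the way in
visited = []
for _ in range(256):
    visited.append(r1)                # address that would be read
    r1 = post_increment(r1, 0xFF)

print(hex(visited[0]), hex(visited[0xBF]), hex(visited[0xC0]), hex(visited[-1]))
# 0x140 0x1ff 0x100 0x13f
```

After 256 reads r1 is back at $140, ready for the next pass.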
Effective address calculation
Let us assume that an effective address of "ea" is calculated. Using
modulo-modification, the new address will be:
Lower Boundary + ((ea - Lower Boundary) MOD buffersize)
where "buffersize" is the value in the m-register plus 1.
This works even when the "ea" is a value *lower* than the Lower
Boundary. The value wraps round to the top of the buffer.
MEMORY MAP:

effective address:
                     <---x---->
LB                   UB       EA
|--------------------|--------V------------...

resultant address:
<---x---->
LB       EA2         UB
|--------V-----------|---------------------...
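The wrap-around formula can be tried out with a few lines of Python.
This is a sketch of the formula as written above, not of the DSP
hardware, and the name wrap is an assumption for illustration. It uses
the size-21 buffer at 96 from earlier.

```python
def wrap(ea, lower, m_reg):
    """Apply: Lower Boundary + ((ea - Lower Boundary) MOD buffersize)."""
    buffersize = m_reg + 1
    # Python's % always gives a result from 0 to buffersize - 1, even when
    # (ea - lower) is negative, which is the behaviour the formula needs.
    return lower + (ea - lower) % buffersize

LOWER, M_REG = 96, 20                 # the size-21 buffer at 96..116
for ea in (100, 116, 117, 120, 95):
    print(ea, "->", wrap(ea, LOWER, M_REG))
# 100 -> 100, 116 -> 116, 117 -> 96, 120 -> 99, 95 -> 116
```

If you port this to C, remember that C's % gives a negative result for
a negative left-hand side, so the "below the Lower Boundary" case needs
an extra correction there.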
IMPORTANT NOTE:
If an n-register is used to create an effective address and Nn is
greater than the buffer size M, then the results are unpredictable and
unreliable!
The exception to this is where Nn is a multiple of the 2^k value that
was mentioned before, eg. our buffer size is 21 (so 2^k = 32) and
n0 = 32.
In that case the (r0)+n0 addressing mode simply increases the value of
r0 by n0 with no wrapping, and (r0)-n0 does the opposite.
This is useful for making the address "jump" to the same position in
another ring buffer somewhere else in memory!
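Here is a small Python sketch of that "jump", using the size-21 buffer
again. It only models the special case where n0 is a whole multiple of
2^k; the names locate and jump are assumptions for illustration.

```python
def locate(r, m_reg):
    """Return (buffer base address, offset into buffer) for pointer r."""
    size = m_reg + 1
    k = (size - 1).bit_length()       # lowest k for which 2**k >= size
    base = r & ~((1 << k) - 1)
    return base, r - base

def jump(r, n, m_reg):
    """Model (r0)+n0 when n0 is a multiple of 2**k: a plain 16-bit add."""
    size = m_reg + 1
    block = 1 << (size - 1).bit_length()
    if n % block != 0:
        raise ValueError("this sketch only covers n0 as a multiple of 2**k")
    return (r + n) & 0xFFFF

r0 = 100
print(locate(r0, 20))                 # (96, 4)
r0 = jump(r0, 32, 20)
print(r0, locate(r0, 20))             # 132 (128, 4)
```

The pointer keeps its offset of 4 but now sits in the buffer based at
128 instead of the one based at 96.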
Reverse-Carry Modifier
This is in operation when Mn = 0. This is a complex operation used in
things such as FFT generation.
Reverse carry means that the "carry" value used in addition is
propagated (ie. passed on) from the Most Significant Bit (MSB) down to
the Least Significant Bit (LSB).
Imagine a normal binary addition, let's say %1111+%0001. We start by
adding the two LSBs: 1 and 1. This gives us 2, or %10. We write "0" in
our answer column and keep 1 as the "carry". Now we add the next two bits
up, plus our carry, and so on. The carry "propagates" upwards.
In "reverse carry" the opposite happens. Assume that we add r0 and n0
using reverse carry. We can picture it as reversing all the bits of
both r0 and n0, adding normally, then reversing all the bits of the
result. Not very useful?
Now, here's the interesting bit. If Nn = 2^(k-1), ie. half the size of a
table with 2^k entries, then the reverse carry addition is equivalent to
reversing the lowest k bits of r0, incrementing (adding 1) and then
re-reversing the lowest k bits of r0 again. Apparently this is *very*
useful when doing things like "twiddle factors" with FFTs.
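The "reverse, add, reverse" description is easy to try in Python. What
follows is a sketch of that description only (the DSP does it in
hardware in one go); the names reverse_bits and reverse_carry_add are
assumptions for illustration. It steps through a small 8-entry table
using an increment of 4, half the table size.

```python
def reverse_bits(value, width=16):
    """Mirror the lowest `width` bits of value (MSB becomes LSB)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

def reverse_carry_add(r, n, width=16):
    """Add r and n with the carry travelling from MSB down to LSB."""
    total = reverse_bits(r, width) + reverse_bits(n, width)
    total &= (1 << width) - 1         # a carry off the end is thrown away
    return reverse_bits(total, width)

r0, sequence = 0, []
for _ in range(8):
    sequence.append(r0)
    r0 = reverse_carry_add(r0, 4)
print(sequence)                       # [0, 4, 2, 6, 1, 5, 3, 7]
```

Each entry is the 3-bit count 0,1,2,3... with its bits written
backwards, which is the "reverse, increment, re-reverse" behaviour.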
Interestingly(?), if we consider a table of 1024 entries, so that
Nn = 512, using reverse carry repeatedly with the following code:
        move    #output_buffer,r1
        move    #0,r0
        move    #0,m0           ; select reverse-carry
        move    #512,n0         ; our reverse carry "increment"
        do      #100,rc_loop
        move    r0,x:(r1)+      ; store the current value of r0
        lua     (r0)+n0,r0      ; r0 = r0 + n0 using reverse carry
        nop                     ; wait for pipeline
rc_loop:
... produces the following sequence:
0, 512, 256, 768, 128, 640, 384 ... or in binary:
0000000000
1000000000
0100000000
1100000000
0010000000
1010000000
0110000000
This may look strange, but when an FFT is produced the data comes out
"scrambled" in exactly this order. In the produced table, value 0 is at
position 0, value 1 is at 512, value 2 at 256, and so on...
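The positions can be double-checked by writing each value number
backwards in 10 bits. Here is a short Python sketch of that check; the
name bit_reverse and the choice of the first seven values are
assumptions for illustration.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    return int(format(index, "0{}b".format(bits))[::-1], 2)

BITS = 10                             # 1024-entry table
for value in range(7):
    position = bit_reverse(value, BITS)
    print(value, position, format(position, "010b"))
# positions: 0, 512, 256, 768, 128, 640, 384
```

Reversing twice gets you back where you started, so the same routine
both scrambles and unscrambles a table.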
Steven Tattersall