Maggie 26: Falcon Programming

                           DSP M-Registers
                                   
This article is in response to No of Escape who wanted some info about 
how the M-registers worked on  the  DSP, particularly when using ring-
buffers (that is, a buffer  that  automatically  loops around when you 
reach the end).

Once you get the idea,  using  the  registers  is pretty easy, so I'll 
launch straight in. Then I'll  introduce  some code to demonstrate the 
idea.

                         M-Registers: Basics

According to Motorola, "M-register" stands for "modifier register". An
m-register's job is to take the effective address that an instruction
calculates, then "modify" it automatically to produce the address that
is actually used.

There are 8 registers, named m0  to  m7.  Each one is coupled with the 
respective r-register, so m0 refers to r0  and so on. This is the same 
as the offset 'n'-registers. Each m-register is a 16-bit value.

There are 6 modes  of  addressing  that  an  address register can use, 
which are affected by the m-register. Here they are:-

Type                      syntax   address fetched     new value of r0
                                   if using move       after pipelining

Postincrement by 1        (r0)+    r0                  r0 + 1
Postdecrement by 1        (r0)-    r0                  r0 - 1
Postincrement by offset   (r0)+n0  r0                  r0 + n0
Postdecrement by offset   (r0)-n0  r0                  r0 - n0
Indexed by offset         (r0+n0)  r0 + n0             r0
Predecrement by 1         -(r0)    r0 - 1              r0 - 1

There are two sets of  effective  addresses calculated by the instruction. 
The third column indicates  the  effective  address  where data is fetched 
from; the fourth column indicates the value of r0 after the instruction is 
executed and the pipelining has taken effect.

The m-register modifies both of these values; exactly how depends on
the value it holds, as described in the sections below.
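
To make the table concrete, here is a small Python model of the six
modes with no modification applied. It is only a sketch, not DSP code:
the name address_modes and the 16-bit mask are made up for
illustration, and it models the results only, not the pipeline timing.

```python
# Model of the six update modes with the address left unmodified.
# Address registers are 16 bits wide, so results are masked to $FFFF.
MASK = 0xFFFF

def address_modes(r0, n0):
    """Return {syntax: (address fetched, new value of r0)}."""
    return {
        "(r0)+":   (r0, (r0 + 1) & MASK),
        "(r0)-":   (r0, (r0 - 1) & MASK),
        "(r0)+n0": (r0, (r0 + n0) & MASK),
        "(r0)-n0": (r0, (r0 - n0) & MASK),
        "(r0+n0)": ((r0 + n0) & MASK, r0),
        "-(r0)":   ((r0 - 1) & MASK, (r0 - 1) & MASK),
    }

for syntax, (fetched, new_r0) in address_modes(r0=0x100, n0=4).items():
    print(f"{syntax:8} fetch ${fetched:04X}   r0 becomes ${new_r0:04X}")
```

Note that only the last two modes fetch from somewhere other than the
address currently in r0.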


                      M-Registers: Linear Operation

Normally an m-register has the value of  -1,  or $FFFF. This means that it 
leaves all effective  addresses  unchanged.  This  is  called  the "linear 
modifier" by Motorola.

                      M-Registers: Modulo Operation

This is the mode used for ring buffers. Here the m-register has a value
between 1 and 32767. This forces every effective address to fall between
a lower and an upper boundary address, wrapping round when it would step
outside them.

Calculating the bound addresses

Let us assume that we want a ring buffer of size M, where M =  21.

Value in m-register = (M - 1) = 21 - 1 = 20

Lower Boundary

The lower boundary must have a base  address  of L, where the lower k bits 
of L are all zero.

'k' is calculated by finding the lowest value where 2^k >= M.

Another way of thinking of this is to consider the lowest value in the
sequence 2,4,8,16,32,64,128,256...32768 which is greater than or equal
to M.

So for our example 32 is the first  value greater than 21. This means that 
the lower boundary of our  range  must  be  a  multiple of 32, for example 
0,32,64,96,128 etc.

Upper Boundary

The upper boundary is now (L + M - 1), since the base address is L and the 
size must be M.

Setting the boundaries

Once we have set the size of the ring buffer, the lower boundary is taken
from the address "r"-register: it is the value of the r-register with its
lower k bits cleared.

Let's say that we want our ring buffer to start at address 96.

        move #20,m0             ;ring buffer size 21
        move #96,r0             ;start of buffer is now 96

However (and this is important) our  buffer  still  starts at 96 if we 
use the following:

        move #20,m0             ;ring buffer size 21
        move #100,r0            ;r0 = 100, but buffer still starts at 96
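
The boundary rules can be checked with a few lines of Python. This is
a sketch of the arithmetic only, assuming the rules given above; the
function name ring_bounds is hypothetical.

```python
def ring_bounds(r, m_reg):
    """Return (lower, upper) boundary for address r and m-register value."""
    size = m_reg + 1                 # M, the buffer size
    k = (size - 1).bit_length()      # smallest k where 2**k >= size
    lower = r & ~((1 << k) - 1)      # clear the lower k bits of r
    upper = lower + size - 1
    return lower, upper

for r0 in (96, 100):
    print(r0, ring_bounds(r0, 20))
# prints:
# 96 (96, 116)
# 100 (96, 116)
```

Both starting values land in the same buffer, because 96 and 100 only
differ in their lower 5 bits.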

For example, the in-built sine  table  has  256  entries and exists at 
address Y:$100:

        move #$ff,m0
        move #$100,r0

In addition, the equivalent cosine table  starts at $140, runs to $1ff 
and then "wraps round" back to $100 to  end at $13f. We can handle the 
wrapping part automatically using:-

        move #$ff,m1
        move #$140,r1
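
As a rough check of the cosine trick, this Python sketch steps a
pointer through the table the way (r1)+ would with m1 = $ff. It models
the addresses only, not the table contents, and post_increment is a
made-up helper name.

```python
def post_increment(r, m_reg):
    """Model (r)+ with a modulo modifier: step by one, wrap at the top."""
    size = m_reg + 1
    k = (size - 1).bit_length()      # smallest k where 2**k >= size
    lower = r & ~((1 << k) - 1)      # lower boundary of the buffer
    return lower + (r - lower + 1) % size

r1 = 0x140
visited = []
for _ in range(256):
    visited.append(r1)
    r1 = post_increment(r1, 0xFF)

print(hex(visited[0]), hex(visited[0xBF]), hex(visited[0xC0]), hex(visited[-1]))
# prints: 0x140 0x1ff 0x100 0x13f
```

After 256 steps the pointer is back at $140, ready for the next cycle.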


                    Effective address calculation

Let us assume that an effective  address  of "ea" is calculated. Using 
modulo-modification, the new address will be:

        Lower Boundary + ((ea - Lower Boundary) MOD buffersize)

where "buffersize" is the value in the m-register plus 1, and MOD
always gives a result from 0 to buffersize-1. This works even when the
"ea" is a value *lower* than the Lower Boundary: the value wraps round
to the top of the buffer.


MEMORY MAP:

effective address:            <---x---->
        LB                    UB      EA
         |--------------------|--------V------------...

resultant address:
         <---x---->
        LB       EA2          UB
         |--------V-----------|---------------------...
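
The formula is easy to try out in Python, whose % operator already
gives a result that is never negative. This is a sketch of the formula
as written above rather than of the silicon, and wrap_address is a
hypothetical name.

```python
def wrap_address(ea, lower, m_reg):
    """Apply Lower Boundary + ((ea - Lower Boundary) MOD buffersize)."""
    buffersize = m_reg + 1
    return lower + (ea - lower) % buffersize

# Buffer of 21 words from 96 to 116 (m-register = 20)
print(wrap_address(117, 96, 20))   # one past the top wraps to 96
print(wrap_address(95, 96, 20))    # one below the bottom wraps to 116
print(wrap_address(100, 96, 20))   # inside the buffer: unchanged, 100
```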



IMPORTANT NOTE:
If an n-register is used to create an effective address and Nn > M
(the buffer size), then the results are unpredictable and unreliable!

The exception to this is where Nn is a multiple of the 2^k that was
mentioned before, eg. our buffer size is 21, and n0 = 32.

In that case the (r0)+n0 addressing mode simply increases the value of
r0 by n0, and (r0)-n0 decreases it by n0, with no wrapping.
This is useful for making the address "jump" to the same position in
another ring buffer somewhere else!
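
Here is a small Python sketch of that special case. It only models the
permitted jump (a plain add) and refuses anything else; the names
jump_blocks and offset_in_buffer are made up for the example.

```python
def block_size(m_reg):
    """Return 2**k, the alignment of a buffer of size m_reg + 1."""
    return 1 << m_reg.bit_length()   # same as smallest 2**k >= m_reg + 1

def jump_blocks(r, n, m_reg):
    """Model (r)+n when n is a multiple of 2**k: a plain 16-bit add."""
    if n % block_size(m_reg) != 0:
        raise ValueError("n must be a multiple of 2^k for this case")
    return (r + n) & 0xFFFF

def offset_in_buffer(r, m_reg):
    """Distance of r from the lower boundary of its buffer."""
    return r & (block_size(m_reg) - 1)

r0 = jump_blocks(100, 32, 20)        # buffer size 21, so 2^k = 32
print(r0, offset_in_buffer(100, 20), offset_in_buffer(r0, 20))
# prints: 132 4 4
```

The pointer moves from the buffer at 96 to the one at 128, but stays 4
words in from the start.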


                        Reverse-Carry Modifier

This is in operation when Mn = 0.  This is a complex operation used in 
things such as FFT generation.

Reverse carry  means  that  the  "carry"  value  used  in  addition is 
propagated (ie. passed on) from the Most Significant Bit (MSB) down to 
the Least Significant Bit (LSB).

Imagine a normal binary addition, let's say %1111+%0001. We start by
adding the two LSBs: 1 and 1. This gives us 2, or %10. We write "0" in
our answer column and keep 1 as the "carry". Now we add the next two
bits up, plus our carry, and so on. The carry "propagates" upwards.

In "reverse carry" the opposite happens. Assume  that we add r0 and n0 
using reverse carry. We can make it  easy by reversing all the bits of 
both r0 and n0, adding, then  reversing  all  the bits again. Not very 
useful?

Now, here's the interesting bit. If Nn = 2^(k-1) where k is any number
of bits, then the reverse carry addition is equivalent to reversing the
last k bits of r0, incrementing (adding 1) and then re-reversing the
last k bits of r0 again. Apparently this is *very* useful when doing
things like "twiddle factors" with FFTs.
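
Both descriptions of reverse carry can be compared in Python. This is
a sketch under my own assumptions: a 16-bit address, a field of 10
bits, and n0 = 512 (the top bit of that field). The helper names are
hypothetical.

```python
MASK16 = 0xFFFF

def reverse_bits(value, width):
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

def reverse_carry_add(r, n):
    """Add n to r with the carry running from MSB down to LSB (16 bits)."""
    total = reverse_bits(r, 16) + reverse_bits(n, 16)
    return reverse_bits(total & MASK16, 16)

def shortcut(r, field_bits):
    """Reverse the low field_bits of r, add 1, reverse them back."""
    low_mask = (1 << field_bits) - 1
    low = (reverse_bits(r & low_mask, field_bits) + 1) & low_mask
    return (r & ~low_mask) | reverse_bits(low, field_bits)

r0, sequence = 0, []
for _ in range(8):
    sequence.append(r0)
    r0 = reverse_carry_add(r0, 512)
print(sequence)
# prints: [0, 512, 256, 768, 128, 640, 384, 896]
```

For every starting value, shortcut(r, 10) gives the same answer as
reverse_carry_add(r, 512).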

Interestingly(?), if we consider a setting where Nn = 512 (half of a
1024-entry table), using reverse carry repeatedly with the following
code:

        move    #output_buffer,r1
        move    #0,r0
        move    #0,m0           ; select reverse-carry
        move    #512,n0         ; our reverse carry "increment"
        do      #100,rc_loop
         move   r0,x:(r1)+
         lua    (r0)+n0,r0
         nop                    ; wait for pipeline
rc_loop:

... produces the following sequence:

0, 512, 256, 768, 128, 640, 384 ... or in binary:

        0000000000
        1000000000
        0100000000
        1100000000
        0010000000
        1010000000
        0110000000

This may look  strange,  but  when  an  FFT  is  produced  the data is 
"scrambled". In the produced table, value  0  is  at  0, value 1 is at 
512, value 2 at 256, and so on...
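
To see the scrambling and unscrambling in miniature, here is a Python
sketch using an 8-entry table instead of 1024 so the output is
readable. The table size, the names and the 16-bit address width are
assumptions for the example; it models the addressing only, not an
actual FFT.

```python
def reverse_bits(value, width):
    """Reverse the lowest `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result

def reverse_carry_add(r, n):
    """16-bit add with the carry running from MSB down to LSB."""
    total = reverse_bits(r, 16) + reverse_bits(n, 16)
    return reverse_bits(total & 0xFFFF, 16)

POINTS = 8                  # tiny table; the article's example uses 1024
N0 = POINTS // 2            # reverse-carry "increment", half the table

# Store value 0, 1, 2... at the scrambled positions.
scrambled = [0] * POINTS
position = 0
for value in range(POINTS):
    scrambled[position] = value
    position = reverse_carry_add(position, N0)

# Walk the table with the same pointer to read it back in order.
unscrambled = []
position = 0
for _ in range(POINTS):
    unscrambled.append(scrambled[position])
    position = reverse_carry_add(position, N0)

print(scrambled)      # prints: [0, 4, 2, 6, 1, 5, 3, 7]
print(unscrambled)    # prints: [0, 1, 2, 3, 4, 5, 6, 7]
```

So a pointer stepped with reverse carry visits the scrambled table in
its natural order, with no extra sorting pass.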



Steven Tattersall