LAUGHTON ELECTRONICS

The KimKlone: Bride of Son of Cheap Video

X-Indirect-Y Addressing Using the W Register

Cycle  Addr Bus     Data Bus   Comment
1      PC           CBh/A1h    Substitution results in LDA (ind,X) OpCode
2      +1           0          inst. operand
3                              (65C02 dead cycle)
4      X            ptr low    ptr low --> W low
5      +1           ptr high   ptr high --> W high
6      ptr=W=TOS    ?          result low
Timing for LDA_X)_PTR>W (0,X). In the code example below, this special instruction moves the first byte, using X-Indirect mode. It also does the setup for using Indirect-Y.


Cycle  Addr Bus     Data Bus   Comment
1      PC           B1h        LDA (ind),Y OpCode
2      +1           Wreg       inst. operand
3      Wreg         W low
4      +1           W high
5      W+Y=TOS+1    ?          result high
To get the other byte(s) simply use Indirect-Y. This is a perfectly ordinary 65xx mode. Wreg is a text equate for the address in Z-Pg where W can be read.

Another use for W, besides its role in the NEXT instruction, is to give the 65C02 an extra address mode (more or less). Of course 65xx chips already have Indexed Indirect address mode and Indirect Indexed address mode, but the W register lets us emulate Indexed Indirect Indexed mode, i.e., addressing which is indexed both before and after the fetch of the indirect pointer.

This oh-so-esoteric capability is actually startlingly useful — especially in the Forth context, where it's very common for an address on stack to point to a multi-byte structure. This implies two steps. We want to index via X into the Forth data stack to find the pointer to the structure. Then, since the pointer only indicates the base of the structure, we index from there to access the other byte(s).
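The two indexing steps can be sketched as a small Python model of the effective-address calculation. This is only an illustration, not anything from the KimKlone itself: the function names, the stack index 70h and the structure address 3000h are all hypothetical.

```python
def read16(mem, addr):
    """Little-endian 16-bit read of a pointer; the pointer wraps within Zero Page."""
    lo = mem[addr & 0xFF]
    hi = mem[(addr + 1) & 0xFF]
    return lo | (hi << 8)

def x_indirect_y(mem, operand, x, y):
    """Effective address: indexed by X before the pointer fetch, by Y after it."""
    pointer = read16(mem, operand + x)   # step 1: find the pointer on the data stack
    return (pointer + y) & 0xFFFF        # step 2: index from the base of the structure

mem = bytearray(65536)
x = 0x70                                 # hypothetical data-stack index
mem[x], mem[x + 1] = 0x00, 0x30          # TOS holds the (hypothetical) address 3000h

ea_base = x_indirect_y(mem, 0, x, 0)     # address of the structure's first byte
ea_next = x_indirect_y(mem, 0, x, 1)     # address of the byte after it
print(hex(ea_base), hex(ea_next))
```

With Y=0 the result is the structure's base address; with Y=1 it is the following byte, which is all that a 16-bit fetch needs.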

Here is a simple example. The Forth word @ (pronounced "fetch") treats the top-of-stack value TOS as an address, returning the value "at" that address. The value at the address is simply a 16-bit number — a basic instance of a multi-byte structure.

Classic version of @
(36 cycles typical)

LDA (0,X)      ;get byte at base of structure
PHA            ;stash byte
INC 0,X        ;overwrite the pointer itself!
BNE IncDone
INC 1,X
IncDone:
LDA (0,X)      ;get byte at base+1
STA 1,X        ;return result hi-byte
PLA            ;un-stash
STA 0,X        ;return result lo-byte

KimKlone @
  (21 cycles; 19 cycles if Y can be assumed=1)

LDA_X)_PTR>W (0,X) ;get byte at base. Copy ptr to W.
STA 0,X            ;return result low byte
LDY #1             ;omit this if Y=1 by convention
LDA (Wreg),Y       ;Get byte at base+1
STA 1,X            ;return result high byte

"X-Indirect / pointer-to-W"

Here's what happened. The instruction LDA_X)_PTR>W (0,X) is the same as LDA (0,X) but with the added feature that, on cycles 4 and 5 when the Zero Page pointer is accessed, the two pointer bytes are copied to the W register. (See the first timing chart, above.) The copy operation is equivalent to the following imaginary sequence, except that no extra cycles are consumed and A is undisturbed:

LDA 0,X
STA Wreg
LDA 1,X
STA Wreg+1

Thereafter the top-of-stack value is accessible without any need for X indexing, because KK address decoding provides access to W at Wreg, a fixed pair of addresses in Zero Page. That opens the door for subsequent indirect-Y accesses via Wreg, as in the example. (Incidentally, KK Forth eliminates the LDY #1 instruction. Instead Y is initialized to 1 on startup, and by convention any routine altering Y will subsequently reset it to 1.)
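The behaviour just described can be modelled in a few lines of Python. This is a rough sketch under stated assumptions, not a description of the actual hardware: the value chosen for Wreg (EEh), the class and method names, and the test addresses are all hypothetical, and only reads of W are modelled.

```python
WREG = 0xEE   # hypothetical Zero Page address at which W can be read

class KKModel:
    def __init__(self):
        self.mem = bytearray(65536)
        self.w = 0                       # 16-bit W register

    def read(self, addr):
        # Address decoding: reads of Wreg and Wreg+1 return W rather than RAM
        if addr == WREG:
            return self.w & 0xFF
        if addr == WREG + 1:
            return self.w >> 8
        return self.mem[addr]

    def lda_x_ptr_w(self, operand, x):
        """LDA_X)_PTR>W (operand,X): an LDA (operand,X) that also copies the pointer to W."""
        lo = self.read((operand + x) & 0xFF)
        hi = self.read((operand + x + 1) & 0xFF)
        self.w = lo | (hi << 8)
        return self.read(self.w)

    def lda_ind_y(self, zp, y):
        """Ordinary LDA (zp),Y."""
        base = self.read(zp) | (self.read((zp + 1) & 0xFF) << 8)
        return self.read((base + y) & 0xFFFF)

def kk_fetch(m, x):
    """Model of the KimKlone version of @, assuming Y=1 by convention."""
    a = m.lda_x_ptr_w(0, x)      # LDA_X)_PTR>W (0,X)
    m.mem[x] = a                 # STA 0,X   (overwrites the pointer's low byte)
    a = m.lda_ind_y(WREG, 1)     # LDA (Wreg),Y
    m.mem[x + 1] = a             # STA 1,X

m = KKModel()
x = 0x70                                     # hypothetical data-stack index
m.mem[0x3000], m.mem[0x3001] = 0x34, 0x12    # 16-bit value 1234h stored at 3000h
m.mem[x], m.mem[x + 1] = 0x00, 0x30          # TOS = 3000h
kk_fetch(m, x)
result = m.mem[x] | (m.mem[x + 1] << 8)
print(hex(result), hex(m.w))
```

Note that STA 0,X destroys the pointer on the stack before the second byte is fetched. In this model that is harmless, because the second fetch goes through the copy held in W.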

To match LDA_X)_PTR>W, the KK also has a STA_X)_PTR>W instruction, which for example is handy for accelerating Forth's ! ("store") operation. But LDA and STA aren't the only ops that can use the new address mode. There's always the option of using a "dummy" LDA_X)_PTR>W or STA_X)_PTR>W for the sole purpose of rapidly copying a pointer to W. Then subsequent code can use ADC or EOR or whatever Indirect-Y instructions it needs to get the job done.
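As an illustration of how STA_X)_PTR>W might speed up !, here is a hedged Python model of one possible store routine. The instruction sequence shown in the comments is a guess for illustration only, not KK Forth's actual code. It assumes the usual Forth stack effect ( value addr -- ), with the address at 0,X and 1,X and the value at 2,X and 3,X, and it assumes Y=1 by convention. The function name and test values are hypothetical.

```python
def kk_store(mem, x):
    """Model of a possible ! ( value addr -- ) built around STA_X)_PTR>W."""
    a = mem[x + 2]                   # LDA 2,X              value low byte
    w = mem[x] | (mem[x + 1] << 8)   # STA_X)_PTR>W (0,X)   pointer copied to W ...
    mem[w] = a                       #                      ... and A stored at base
    a = mem[x + 3]                   # LDA 3,X              value high byte
    mem[(w + 1) & 0xFFFF] = a        # STA (Wreg),Y         store at base+1 (Y=1)
    return (x + 4) & 0xFF, w         # INX x4               drop both stack cells

mem = bytearray(65536)
x = 0x70                             # hypothetical data-stack index
mem[x], mem[x + 1] = 0x00, 0x30      # TOS: address 3000h
mem[x + 2], mem[x + 3] = 0xEF, 0xBE  # next on stack: value BEEFh
new_x, w = kk_store(mem, x)
stored = mem[0x3000] | (mem[0x3001] << 8)
print(hex(stored), hex(new_x))
```

Unlike @, the store never writes to the stack cells it is reading, so the benefit here comes from avoiding the pointer increment rather than from preserving an overwritten pointer.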

X-Indirect-Y addressing is a significant asset for applications (including Forth) which need to index both before and after the fetch of an indirect pointer. In our example the performance boost is 89% (36 cycles versus 19). The main limitation is that W can only address one item at a time. But the impact of that is slight, given that W loads (and can be reloaded) at such high speed: 6 cycles. And in many cases the one-item-at-a-time limitation has no impact at all.
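The cycle totals and the 89% figure can be checked with a short tally. The per-instruction counts below are the standard 65C02 timings, assuming the typical case in which the BNE is taken (so INC 1,X is skipped) and no page boundary is crossed. The 6-cycle figure for LDA_X)_PTR>W comes from the timing chart.

```python
# (instruction, cycles) for the typical path through the classic version of @
CLASSIC = [
    ("LDA (0,X)", 6), ("PHA", 3), ("INC 0,X", 6), ("BNE IncDone", 3),
    ("LDA (0,X)", 6), ("STA 1,X", 4), ("PLA", 4), ("STA 0,X", 4),
]

# (instruction, cycles) for the KimKlone version of @
KIMKLONE = [
    ("LDA_X)_PTR>W (0,X)", 6), ("STA 0,X", 4), ("LDY #1", 2),
    ("LDA (Wreg),Y", 5), ("STA 1,X", 4),
]

def total(listing):
    return sum(cycles for _, cycles in listing)

classic = total(CLASSIC)
kk = total(KIMKLONE)
kk_y1 = kk - 2                              # LDY #1 omitted when Y=1 by convention
boost = (classic / kk_y1 - 1) * 100         # percent more throughput than classic

print(classic, kk, kk_y1, round(boost))     # 36 21 19 89
```

The 89% boost therefore applies when Y can be assumed to be 1. With the LDY #1 included (21 cycles), the same calculation gives roughly 71%.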


copyright notice (Jeff Laughton)