The KimKlone: Bride of Son of Cheap Video

Op-Code Mapping - the old instructions and the new

This section examines the binary encodings for 65C02 instruction opcodes — particularly NOPs — and how KimKlone instructions succeed in functioning within that framework. Among the new KK opcodes, some were freely and arbitrarily assigned; others exploit specific encodings that allow KK to benefit from the C02's cast-in-silicon peculiarities.

The original NMOS 6502 featured 151 op-codes, leaving more than a hundred op-codes undefined. (NMOS undefined op-codes were eventually documented by hobbyists and experimenters, although not by the chip manufacturers. The undefined results are bizarre, ranging from marginally useful to dangerous.) When Western Design Center's W65C02 was created, and later Rockwell's R65C02, the designers added several new instructions as well as a new addressing mode. In doing so, they raised the number of legitimate op-codes to 210 and reduced the number of undefined op-codes to 46.

In addition, they "fixed" the 65C02 so undefined op-codes would execute harmlessly. They wouldn't crash the chip, nor even alter any registers. Table 7-1 of the W65C02S datasheet lists the bytes and cycles consumed by each undefined instruction, noting that, "All are NOP's." But this is easily misunderstood. At the hardware level, some of the undefined instructions are not NOPs. My experiments with the 65C02 revealed two categories.

32 of the undefined op-codes merely increment the Program Counter. I call these the "boring" NOPs. They don't seem to be good for much — they are one-byte do-nothings. (On the plus side, these guys execute in just a single cycle... which makes them capable of doing nothing at a very high rate of speed!)

The 14 "interesting" NOPs execute by fetching one or two operand bytes following the op-code... and some of them then generate a memory access using those operands! This is not the behavior we expect from a bunch of purported NOPs. However, I noticed the memory accesses were benign (always a read, never a write), and no processor registers were changed other than the PC advancing.

No change in the registers means data from the memory access didn't go anywhere. In other words it was fetched and then thrown away — loaded and discarded. I have accordingly chosen "LDD" as the mnemonic for this undocumented "Load And Discard" operation. The immediate mode of LDD ignores its operand byte, but otherwise LDD operand bytes are used to form an address. It's the data subsequently fetched from that address which is discarded.

LDD address modes are already familiar: they are modes that are also used by normal instructions. But the op-code mapping is somewhat redundant since, as shown below, thirteen LDD op-codes express only four address modes. There are:

  • 2 op-codes for LDD Absolute (DC FC)
  • 1 op-code   for LDD Zero-Pg   (44)
  • 3 op-codes for LDD Zero-Pg,X (54 D4 F4)
  • 7 op-codes for LDD Immediate (02 22 42 62 82 C2 E2)

The address mode indicates what the instruction length and bus behavior (including the number of cycles) will be. For example, the instruction length and bus behavior for LDD using Absolute address mode are the same as those for LDA using Absolute address mode. Although LDD is "only going through the motions," it does indeed fetch a byte from memory, and the motions are the same ones which other instructions go through in doing so. The only truly irregular undefined op-code is 5C, which scarcely seems to have an address mode; see below.
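The grouping above can be written down as a small lookup table. The Python sketch below is only an illustration: the names LDD_MODES and ldd_info are made up here, and the byte and cycle counts are assumed to equal those of LDA in the same address mode, as the text describes. Op-code 5C is left out because it fits no mode.

```python
# Hypothetical lookup table for the 13 regular Load-And-Discard op-codes.
# Byte and cycle counts are assumed to match LDA in the same address mode.
LDD_MODES = {
    "Immediate": {"opcodes": (0x02, 0x22, 0x42, 0x62, 0x82, 0xC2, 0xE2),
                  "bytes": 2, "cycles": 2},
    "Zero-Pg":   {"opcodes": (0x44,),
                  "bytes": 2, "cycles": 3},
    "Zero-Pg,X": {"opcodes": (0x54, 0xD4, 0xF4),
                  "bytes": 2, "cycles": 4},
    "Absolute":  {"opcodes": (0xDC, 0xFC),
                  "bytes": 3, "cycles": 4},
}


def ldd_info(opcode):
    """Return (mode, bytes, cycles) for an LDD op-code, or None if it is not one."""
    for mode, entry in LDD_MODES.items():
        if opcode in entry["opcodes"]:
            return mode, entry["bytes"], entry["cycles"]
    return None


total = sum(len(entry["opcodes"]) for entry in LDD_MODES.values())
print(total)            # 13
print(ldd_info(0x44))   # ('Zero-Pg', 2, 3)
print(ldd_info(0x5C))   # None (5C is the irregular one)
```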

I knew that the Load-and-Discard operations could be put to use. How obliging of the 'C02 to generate addresses then stand aside as KK circuitry uses the memory accesses for its own purposes! As for the "boring" NOP's, although they themselves are almost useless, at least they acted as placeholders. There's huge potential in the fact that their op-codes do not appear in pre-existing software.

KimKlone Op-Code Mapping

Obviously the design challenge with the KimKlone was to give meanings to the undefined codes — to associate them with useful operations. Here's an example:

In KK terms, the two-byte instruction 44 11 means Load register K2 with whatever's in zero-page location 11. Upon execution the CPU fetches the 44h op-code, the 11h operand, and then, on the third cycle, obediently fetches and discards the contents of zero-page location 11. But location 11's contents appear on the data bus, and microcode tickles a control line that copies whatever's on the data bus into the K2 register.

This satisfies the job description for the LDK2 instruction, z-pg mode. The example is an instance of giving new meaning to a previously undefined op-code. I used the other Load-And-Discards as well, including those with other addressing modes.
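The three bus cycles of that example can be walked through in a few lines of Python. This is a rough sketch rather than a model of the real hardware: the function name run_ldk2_zp, the trace format, the load address 0200 and the test value 5A are all invented for illustration.

```python
# Sketch of "44 11" (LDK2, zero-page mode) as seen on the bus.
# All names and test values here are hypothetical.

def run_ldk2_zp(memory, pc):
    """Walk the three bus cycles and return (k2, trace of (address, data))."""
    trace = []

    opcode = memory[pc]                  # cycle 1: op-code fetch
    trace.append((pc, opcode))
    assert opcode == 0x44, "this sketch only handles op-code 44"

    operand = memory[pc + 1]             # cycle 2: operand fetch
    trace.append((pc + 1, operand))

    data_bus = memory[operand]           # cycle 3: CPU reads zero page, then discards
    trace.append((operand, data_bus))
    k2 = data_bus                        # microcode copies the data bus into K2

    return k2, trace


memory = [0] * 0x10000
memory[0x0200] = 0x44                    # the instruction, at an arbitrary address
memory[0x0201] = 0x11
memory[0x0011] = 0x5A                    # arbitrary test value in zero page

k2, trace = run_ldk2_zp(memory, 0x0200)
print(hex(k2))                           # 0x5a
```

The CPU's registers never appear in the sketch, which is the point: the CPU only supplies the addresses, and the value ends up in K2 because of what happens on the data bus.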

This explains the general procedure for KK loading and storing its new registers to and from memory. The CPU's job is simply to generate the address. Microcode determines what'll happen on the data bus.

(It would have been easy to use the LDD's — phantom loads — to implement stores, just by asserting data on the bus and having microcode force a low onto the R/W line going to memory. This would be a good way to write my new registers to memory. But the only writes I wanted were pushes, so I chose a different tactic, one that would adjust the stack pointer. The PHK0, PHK1, PHK2 and PHK3 instructions are seen by the CPU as $08, or PHP. On the final cycle it is Register File A' — not the CPU — that supplies memory with the data to be written.)

KimKlone Op-Code Re-Mapping

Figure: op-code substitution logic. Schematic excerpt (simplified) showing how certain op-codes are intercepted and replaced as they are passed along to the 65C02. Microcode is fed from upstream of the substitution, and thus has full knowledge of the true op-code.

Making use of the boring NOPs requires a slightly fancier technique. Luckily the boring NOP's are plentiful (there are 32 of them), but the problem is that they are NOPs (the CPU doesn't even do a Load-And-Discard). If they're given straight to the CPU "as is," they do nothing. But... (and here the plot thickens...)

The KimKlone uses a brilliantly sneaky trick I learned from two classic Don Lancaster publications, The Cheap Video Cookbook and Son Of Cheap Video. (See Cheap Video à la Lancaster.) It's really easy to lie to the CPU when it's trying to fetch an op-code from memory! By "lying" I mean hiding what's actually there and presenting a substitute in its place. Just create a momentary disconnect between memory and the CPU, such that the byte on the memory data bus isn't reproduced at the CPU. (See the schematic excerpt above.)

The 32 boring NOP's are very easily recognized. That's because they all conform to the binary pattern xxxxx011. (Each "x" character shows the position of a "don't care" bit.) Notice that the upper five bits can vary, but the "boring NOP" op-codes always have binary 011 in the three least-significant bits. KimKlone hardware recognizes the xxxxx011 patterns and uses them to trigger a substitution.
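In software terms the recognition is a three-bit mask and compare. The snippet below is a minimal sketch of that test (the function name is_boring_nop is made up here); the real KimKlone does this in hardware.

```python
def is_boring_nop(opcode):
    """True when the three least-significant bits are 011 (pattern xxxxx011)."""
    return (opcode & 0b00000111) == 0b011


boring = [op for op in range(256) if is_boring_nop(op)]
print(len(boring))                          # 32
print([f"{op:02X}" for op in boring[:4]])   # ['03', '0B', '13', '1B']
```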

During op-code fetch cycles the bus will be disconnected if the byte being fetched matches the xxxxx011 pattern. A 32 by 8 TTL PROM (74S288) takes the high 5 bits (the xxxxx) and reads out a corresponding substitute op-code chosen to suit my purpose. In this way, each of the 32 boring NOPs has an associated op-code that gets fed to the CPU. (KK RAMs are fast enough to leave a margin for the substitution delay; alternatively a wait state could have been provided.)
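The substitution path can be sketched as a table lookup. Note the assumptions: the PROM contents below are not the real KimKlone table. Only the D3 to AD entry comes from this page (it is the K2_LDA abs example discussed in this section); every other slot is filled with EA, the standard NOP, purely as a placeholder. The names prom and byte_seen_by_cpu are likewise made up.

```python
# Sketch of the op-code substitution path.
# PROM contents are placeholders except for the one documented entry.
PLACEHOLDER = 0xEA                  # standard NOP, used only as filler here
prom = [PLACEHOLDER] * 32           # 32 x 8, the organization of the 74S288
prom[0xD3 >> 3] = 0xAD              # D3 (K2_LDA abs) is shown to the CPU as LDA abs


def byte_seen_by_cpu(memory_byte, opcode_fetch):
    """Return what the CPU reads for a byte arriving from memory."""
    if opcode_fetch and (memory_byte & 0x07) == 0x03:
        return prom[memory_byte >> 3]    # high 5 bits address the PROM
    return memory_byte                   # otherwise the bus passes straight through


print(f"{byte_seen_by_cpu(0xD3, True):02X}")    # AD
print(f"{byte_seen_by_cpu(0xD3, False):02X}")   # D3 (not an op-code fetch)
print(f"{byte_seen_by_cpu(0xA9, True):02X}")    # A9 (pattern does not match)
```

The second call shows why the op-code-fetch qualifier matters: a D3 that arrives as an operand or as data must reach the CPU unaltered.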

Here's an example of a KK instruction with a substituted op-code. D3h is the KimKlone op-code for K2_LDA abs. It's the familiar LDA using absolute address mode, except that K2 says what bank to fetch from. If K2 holds 12h, say, then the three-byte instruction D3 56 34 means LDA from memory at 123456h. (Note the 24-bit address!)

This is a four-cycle instruction. On the first cycle the D3 is read from memory. Because it's an op-code-fetch cycle, and D3 in binary matches the xxxxx011 pattern, the D3 never reaches the CPU bus. The disconnect prevails and the PROM sends ADh to the CPU instead, which is simply the 65C02 op-code for LDA abs.

The second and third cycles fetch the operand bytes 56h and 34h to the CPU without incident; then on the fourth cycle the memory read occurs. The CPU places 3456h on the low 16 address lines, and microcode selects K2 to drive the high 8 address lines, making them equal 12h.

Memory responds with the contents of 123456h, and the CPU inputs that byte from the data bus to the Accumulator. That concludes the instruction; the next cycle will be an op-code fetch.
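The address arithmetic on that fourth cycle is simple enough to check by hand. Here is a small sketch, with effective_address as a made-up helper name, that combines the bank value from K2 with the CPU's 16-bit address:

```python
def effective_address(bank_register, cpu_address):
    """Combine an 8-bit bank value with the CPU's 16-bit address."""
    return ((bank_register & 0xFF) << 16) | (cpu_address & 0xFFFF)


# Operand bytes arrive little-endian: low byte first, then high byte.
operand_lo, operand_hi = 0x56, 0x34
cpu_address = (operand_hi << 8) | operand_lo    # 0x3456, driven by the CPU

addr = effective_address(0x12, cpu_address)     # K2 holds 12h
print(f"{addr:06X}")                            # 123456
```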

Op-Code Mapping Summary

To recap, of 256 possible codes...

 • 210 are standard 65C02 codes. These execute normally (except memory reference instructions which have been preceded by a prefix).

 • 14 are "interesting" NOP's — the Load-And-Discards noted above. KK uses three of the LDD Immediates to load Bank Registers and two simply to trigger control pulses (for SINC and DINC instructions, discussed later). Two LDD Immediates (62 and 82) are unused. Of the other seven LDDs, six use conventional address modes other than Immediate. Microcode can finagle any of these into a read or write of one of the new registers. Finally, op-code 5C consumes 3 bytes and 8 cycles but conforms to no known address mode; it remains interesting but useless. I tested the instruction "5C 1234h" (stored little-endian as 5Ch 34h 12h) as an example, and observed the following: 3 cycles fetching the instruction, 1 cycle reading FF34, then 4 cycles reading FFFF.

 • 32 are "boring" xxxxx011 codes — the 1-byte, 1-cycle NOPs which are subject to substitution.

The xxxxx011 codes granted a lot of flexibility, because I had the design freedom to alias them arbitrarily. I had the PROM translate some of them to standard 65C02 codes and others to Load-and-Discards. But it's always the unique, un-substituted version of an op-code that determines what the microcode will do.

The KK goings-on that accompany the xxxxx011 ops get pretty gnarly, as you will see in the following section.


copyright notice (Jeff Laughton)