BugsForth by Myron Plichota

BugsForth is a dialect tailored for the Bugs18 Forth CPU running at 80 MHz on the author's (obsolete) Xilinx Spartan3A FPGA evaluation board. The data and return stacks are register arrays (not in memory) of 16 members each, with no hardware provision to detect stack overflow or underflow. There is also a program counter and an invisible instruction packet register, both of which are explained below.

Code, data, and I/O are all in one flat memory space:

{00000..000D7}  USB/UART serial bootloader with recyclable procedures

{000D8..here-1} BugsForth

{here..04FFF}   free dictionary to the end of physical RAM on the FPGA

{05000..0FFFF}  maximum reach of the program counter, for reference only

{10000..1FFFF}  undefined

{20000..3FFFF}  I/O, includes 18x18 multiplier and carry from addition
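
The memory map above can be checked against the signon line in the transcript later in this document (free cells 045B4, here 00A4C). What follows is a minimal host-side sketch in Python, not part of BugsForth; the function and constant names are illustrative only, and the boundaries are copied from the map.

```python
# Region boundaries copied from the memory map; the names are illustrative.
BOOT_END = 0x000D7    # last cell of the serial bootloader
RAM_END = 0x04FFF     # last cell of physical RAM on the FPGA
PC_END = 0x0FFFF      # maximum reach of the program counter
IO_START = 0x20000    # first I/O address
ADDR_MAX = 0x3FFFF    # largest 18-bit address


def region(addr, here):
    """Name the memory-map region that holds addr, given the current here."""
    if not 0 <= addr <= ADDR_MAX:
        raise ValueError("address does not fit in 18 bits")
    if addr <= BOOT_END:
        return "bootloader"
    if addr < here:
        return "BugsForth"
    if addr <= RAM_END:
        return "free dictionary"
    if addr <= PC_END:
        return "program counter reach only"
    if addr < IO_START:
        return "undefined"
    return "I/O"


def free_cells(here):
    """Cells left between here and the end of physical RAM."""
    return RAM_END + 1 - here


print(f"{free_cells(0x00A4C):05X}")   # free cells for the signon value of here
print(region(0x3FFFE, 0x00A4C))       # 3FFFE is the audio dac address in isine.bf
```

With here = 00A4C, free_cells gives 05000 - 00A4C, which agrees with the 045B4 reported at signon.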

Bugs18 memory cells, instruction packets, and most registers are 18 bits. There is no byte addressing. ASCII strings are packed 2 chars per cell by convention.
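
To make the 2-chars-per-cell convention concrete, here is a hedged Python sketch. The text does not say how the two characters are arranged within a cell, so the layout below is purely an assumption: 9 bits per character, first character in the upper half, and a zero pad for odd-length strings. The names pack and unpack are hypothetical.

```python
CELL_BITS = 18
CHAR_BITS = CELL_BITS // 2           # assumption: each char gets half a cell
CHAR_MASK = (1 << CHAR_BITS) - 1


def pack(text):
    """Pack ASCII text two chars per cell, first char in the upper half (assumed)."""
    codes = list(text.encode("ascii"))
    if len(codes) % 2:
        codes.append(0)              # assumption: pad an odd-length string with 0
    return [(hi << CHAR_BITS) | lo for hi, lo in zip(codes[0::2], codes[1::2])]


def unpack(cells, length):
    """Recover length chars from cells produced by pack()."""
    codes = []
    for cell in cells:
        codes.append((cell >> CHAR_BITS) & CHAR_MASK)
        codes.append(cell & CHAR_MASK)
    return bytes(codes[:length]).decode("ascii")


cells = pack("sin")
print([f"{cell:05X}" for cell in cells])
print(unpack(cells, 3))
```

Whatever the real layout is, the point stands that a string of n characters occupies about n/2 cells, and that there is no way to address a single character directly.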

The program counter addresses complete instruction packets containing 1 or 4 instructions, not the individual instructions within a packet. After an instruction packet fetch, the program counter is incremented to the address of the next packet, or of the inline literal that satisfies an occurrence of the # instruction within the current packet. The # instruction is the sole means of turning a compile-time expression into a run-time constant.

A Bugs18 instruction packet is divided into fields referred to as slots {0..4}. If slot 0 = 0 (not a jump or call) then slot 1 is the current instruction. When slot 1 is retired, slot 0 retains its value and the rest of the instruction packet register is shifted so:

slot 1 <- slot 2 <- slot 3 <- slot 4 <- nop

Eventually slots {2..4} are filled with nops. If the final instruction in slot 1 does not access memory, the next packet is fetched in parallel. Otherwise a shifted-in nop appears in slot 1, preserving the CPU registers while the next packet is fetched in sequence, and the process repeats.

If slot 0 <> 0 then slots {1..4} contain the jump target address. The entire packet is retired without any of the above slot shifting, and another packet will be immediately fetched.
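
The slot shifting described above can be modelled in a few lines of Python. This is a sketch of the register behaviour only, not of the hardware timing. It takes the field widths from the diagram that follows (slot 0 is the top 2 bits, slots 1 to 4 are 4 bits each); the function names are hypothetical.

```python
NOP = 0x0


def split(packet):
    """Split an 18-bit packet into (slot 0, [slot 1, slot 2, slot 3, slot 4])."""
    slot0 = (packet >> 16) & 0x3
    rest = [(packet >> shift) & 0xF for shift in (12, 8, 4, 0)]
    return slot0, rest


def retire(packet):
    """Retire slot 1: keep slot 0, move slots 2..4 up, shift a nop into slot 4."""
    return (packet & 0x30000) | ((packet << 4) & 0xFFFF) | NOP


def instruction_order(packet):
    """Opcodes in the order slot 1 presents them, for a packet with slot 0 = 0."""
    slot0, _ = split(packet)
    if slot0 != 0:
        raise ValueError("slot 0 <> 0: slots 1..4 hold a jump target, not opcodes")
    order = []
    for _ in range(4):
        order.append(split(packet)[1][0])   # slot 1 is the current instruction
        packet = retire(packet)
    return order


packet = 0x045B3
for _ in range(5):
    print(f"{packet:05X}")
    packet = retire(packet)
```

The loop prints the packet register after each retirement; after four shifts only nops remain. The model always steps through all four slots, whereas the hardware need not spend cycles on trailing nops.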

     .----- slot 0: 2 bits, qualifies slots {1..4}

     |.---- slot 1: 4 bits, 1st instruction or digit 3 of absolute address

     ||.--- slot 2: 4 bits, 2nd instruction or digit 2 of absolute address

     |||.-- slot 3: 4 bits, 3rd instruction or digit 1 of absolute address

     ||||.- slot 4: 4 bits, 4th instruction or digit 0 of absolute address

     |||||

nop  00000 do nothing ( -- ) 0|1|2 cycles Note 1

#    01111 inline literal ( -- n ) program counter increments, 2 cycles

@    02222 fetch ( addr -- n ) 2 cycles

!    03333 store ( n addr -- ) 1 cycle

dup  04444 duplicate top ( n -- n n ) 1 cycle Note 2

over 05555 duplicate next ( n1 n2 -- n1 n2 n1 ) 1 cycle Note 2

drop 06666 discard top ( n -- ) 1 cycle Note 2

>r   07777 move top to return stack ( n -- ) ( r: -- n ) 1 cycle Note 2

r>   08888 move return stack to top ( -- n ) ( r: n -- ) 1 cycle Note 2

r@   09999 copy return stack to top ( -- n ) ( r: n -- n ) 1 cycle Note 2

2/   0AAAA signed shift right ( n -- n>>1 ) 1 cycle Note 2

+    0BBBB add ( n1 n2 -- n1+n2 ) 1 cycle Note 2

&    0CCCC and ( n1 n2 -- n1&n2 ) 1 cycle Note 2

|    0DDDD or  ( n1 n2 -- n1|n2 ) 1 cycle Note 2

^    0EEEE xor ( n1 n2 -- n1^n2 ) 1 cycle Note 2

;    0FFFF return ( -- ) ( r: addr -- ) 2 cycles

jz   1aaaa jump if zero ( flag -- ) 2 cycles

j    2aaaa jump ( -- ) 2 cycles

p    3aaaa call procedure ( -- ) ( r: -- addr ) 2 cycles



Note 1:

0 cycles - when filling the trailing slots of a packet that was compiled

           not-full because a jump target at the next address is required

1 cycle  - when an occurrence is followed by any non-nop(s) in the packet

2 cycles - when a packet full of nops has been fetched



Note 2:

These instructions may execute in parallel with a 2-cycle packet fetch.

A random read of synchronous block RAM requires 2 cycles. The first cycle

provides the read address, making the read data available in the second

cycle.
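
Since each opcode is a single hex digit, the instruction table lends itself to a tiny disassembler for reading hex dumps. The Python sketch below copies the mnemonics from the table; the function name disassemble and its output format are illustrative choices, and no attempt is made to count cycles or to treat the cell after a # as a literal.

```python
# Mnemonics indexed by opcode, copied from the instruction table.
MNEMONICS = ["nop", "#", "@", "!", "dup", "over", "drop", ">r",
             "r>", "r@", "2/", "+", "&", "|", "^", ";"]
BRANCHES = {1: "jz", 2: "j", 3: "p"}


def disassemble(packet):
    """Return a readable form of one 18-bit instruction packet."""
    if not 0 <= packet <= 0x3FFFF:
        raise ValueError("packet does not fit in 18 bits")
    slot0 = packet >> 16
    low16 = packet & 0xFFFF
    if slot0:
        # Slots 1..4 are the four hex digits of the absolute target address.
        return f"{BRANCHES[slot0]} {low16:04X}"
    opcodes = [(low16 >> shift) & 0xF for shift in (12, 8, 4, 0)]
    return " ".join(MNEMONICS[op] for op in opcodes)


for word in (0x04B30, 0x0F000, 0x200D8):
    print(f"{word:05X}  {disassemble(word)}")
```

The three sample packets are made up for illustration; they are not taken from the BugsForth image.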

Why 18 bits?
1) Many competing FPGA vendors provide 18-bit block RAM and multiplier fabric, making it a de-facto standard.
2) 18-bit integers are 4*more better than 16 bits.
3) The instruction packet format is simple and consistent, reducing the pain of viewing hex dumps when ya jist gotsta know.

Unlike most Forths, BugsForth is case-sensitive so that dictionary searches are as simple and fast as possible. The author thinks twice before defining a new word whose name differs from an existing word only by case.

The BugsForth prompt is a new line without the traditional ok. No news is good news. Hexadecimal is the only implemented numeric text conversion base. Redefinition of existing words is forbidden.

All words in the dictionary contain an executable flag in their headers. Code is flagged executable, data is flagged not executable.

BugsForth is subroutine-threaded, with no inner interpreter. In conventional Forth terms, there is no distinction between compiling and interpreting states. All executable words may be regarded as IMMEDIATE, running whenever encountered by the outer interpreter. Certain parsing words that consume the next text word are required to create new words or yield the pointer to an existing executable word. Examples:

: a-new-code-word

jz_ an-existing-code-word

j_ an-existing-code-word

p_ an-existing-code-word

create a-new-data-word

This involves a novel contract with the dictionary find boilerplate:

: find ( -- |address executable-flag true|false| )

\ returns false if not found, else 3 cells on stack

In gforth Bugs18 cross-compiler syntax, the read/evaluate/print loop is:

  begin, \ repl

    begin, \ get a non-empty line

      p_ __cr p_ _getline \ update line-limit and >in

    until, \ at least 1 char before carriage return

    begin, \ words loop

      20 #, dup, p_ _delimit \ update word-start, word-limit, and >in

      if, \ success

        p_ _pack-word

        p_ _find \ -- |addr cflag true|false|

        if, \ found

          if, \ code

            p_ _execute ( ? -- ? )

          then, \ -- may be addr of data or results of executed word

          dup, dup, ^, \ continue flag

        else, \ not found, attempt number conversion

          p_ _number \ -- |n true|false|

          if, \ hex number

            dup, dup, ^, \ continue flag

          else, \ make a fuss about it

            p_ _where

            p_ __cr _name$ #, p_ _.$ char ? #, p_ __emit

            ALL-BITS #, \ terminate flag

          then,

        then,

      else,

        ALL-BITS #, \ terminate flag

      then,

    until, \ line terminated

  again,
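
For readers who do not read cross-compiler syntax fluently, the words loop above can be paraphrased in Python. This is a behavioural sketch under stated assumptions, not a translation of the author's code: the dictionary is modelled as a Python dict, executable words are Python callables, hex digits are accepted in either case, converted numbers are wrapped to 18 bits, and the address given to theta is made up.

```python
MASK18 = 0x3FFFF
HEX_DIGITS = set("0123456789ABCDEFabcdef")   # assumption: either case accepted


def number(token):
    """Hex-only conversion: return the value, or None if token is not a number."""
    if token and all(ch in HEX_DIGITS for ch in token):
        return int(token, 16) & MASK18       # assumption: wrap to 18 bits
    return None


def interpret(line, dictionary, stack):
    """Process one line; return True if every word on it was handled.

    dictionary maps a name to (executable_flag, payload). The payload is a
    callable taking the stack for code, or an address for data.
    """
    for token in line.split():
        entry = dictionary.get(token)        # case-sensitive lookup
        if entry is not None:
            executable, payload = entry
            if executable:
                payload(stack)               # code runs as soon as it is found
            else:
                stack.append(payload)        # data leaves its address
            continue
        value = number(token)
        if value is None:
            print(f"{token}?")               # make a fuss, abandon the line
            return False
        stack.append(value)
    return True


def plus(stack):
    b, a = stack.pop(), stack.pop()
    stack.append((a + b) & MASK18)


words = {"+": (True, plus), "theta": (False, 0x00A00)}   # 00A00 is hypothetical
stack = []
interpret("3FFFF 2 + theta", words, stack)
print([f"{cell:05X}" for cell in stack])
```

Note the order of the tests: the dictionary is searched first, so a defined word whose name happens to consist of hex digits is never mistaken for a number.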

The annotated transcript below shows that the BugsForth outer interpreter does not crash when the data stack overflows, and provides an example of what to expect when heavily-nested conditionals (which produce and consume housekeeping data at compile-time) are in the source.

The return stack is not so heavily loaded, and offers some relief via >r, r@, and r>, within a given procedure. Return stack overflow/underflow is fatal. The words rsnap and rdump are provided to examine the return stack.

[start transcript]
[Bugs18 FPGA board serial bootloader prompt]

Bugs18 is ready to receive code

[host terminal does file transfer of BugsForth.app binary without echo]
[BugsForth signon]

BugsForth free cells, here are hex 045B4 00A4C

[human input, BugsForth response]

words

-app repeat while again until begin then else if p_ j_ jz_ ; #^, ^, #|, |,

#&, &, #+, +, 2/, r@, r>, >r, drop, over, dup, #!, !, #@, @, #, nop, :

create $` parse \ char words <= > >= < 0< <> 0<> = 0= - negate invert swap

nip ^! |! &! +! error rdump rsnap dump spaces space cr .free .here here ?

.$ d. . emit key key? d+ m* cos sin ^ | & + 2/ drop over dup ! @ allot ,

[host terminal does file transfer of isine.bf source, BugsForth echoes]

16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 \ diagnostic: fill the stack



\ Interactive sine oscillator, BugsForth translation

\ The audio dac is 12 bits, left-justified, signed.

\ The 10 msbs of an oscillator's phase indexes a 1024-entry sine lookup table.

\ The 8 lsbs are the fractional part.

\ Fs = 35156.25 Hz

\ Kf = 262144/Fs

\ omega = f*Kf \ the phase increment per sample given desired f Hz



\ decimal

\ 262144e0 35.15625e3 f/

\ fdup fconstant Kf \ 1 Hz in floating-point

\ 440e0 f* f>d drop constant A440 \ 440 Hz in Bugs18 integer

\ hex



\ BugsForth constants are implemented as procedures

: A440    0CD0 #, ; \ initial frequency

: ARX    3FFF5 #, ; \ read rx char

: PSTAT  3FFF6 #, ; \ read peripheral status

: ADAC-READY 8 #, ; \ audio dac mask

: ADAC   3FFFE #, ; \ write audio dac



create theta    0 , \ phase

create omega A440 , \ phase increment per sample



: .omega! \ ( n -- ) update omega and print

  dup, omega #!, j_ . \ eliminate tail call



: command \ ( -- quit-flag ) handle keyboard commands

  p_ key?

  if

    p_ ARX @, \ ( -- c ) get rx byte

    p_ cr dup, p_ emit p_ space \ echo on a new line

    \ consider some cases

    dup, char i #^, if \ not i

    dup, char P #^, if \ not P

    dup, char p #^, if \ not p

    dup, char O #^, if \ not O

    dup, char o #^, if \ not o

    dup, char q #^, if \ not q

    dup, ^,                     ;          then \ none of the above, ignore

                                ;          then \ q, quit

    dup, ^, omega #@, 2/,       j_ .omega! then \ o, down 1 octave

    dup, ^, omega #@, dup, +,   j_ .omega! then \ O, up 1 octave

    dup, ^, omega #@, 3FFFF #+, j_ .omega! then \ p, down

    dup, ^, omega #@, 1 #+,     j_ .omega! then \ P, up

    dup, ^, p_ A440             j_ .omega!      \ i, A440

  then

  dup, dup, ^,

  p_ rsnap \ probe

  ;



create help$

  $` commands: i(nitialize) O(ctave+) o(ctave-) P(itch+) p(itch-) q(uit)`



: sine-osc

  p_ cr help$ #, p_ .$

  begin

    p_ PSTAT @, p_ ADAC-READY &,

    if \ audio dac ready

      theta #, dup, >r, @,

      dup, p_ sin p_ ADAC !, \ adac <- sin theta

      omega #@, +, r>, !,  \ theta <- theta + omega

    then

    p_ command

  until ; \ quit flag



. . . . . . . . . . . . . . . . \ diagnostic: show stack survivors  00001

00002 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003

00003 00003 00003

[human input, BugsForth response]

words

sine-osc help$[] command .omega! omega[] theta[] ADAC ADAC-READY PSTAT ARX

A440 -app repeat while again until begin then else if p_ j_ jz_ ; #^, ^,

#|, |, #&, &, #+, +, 2/, r@, r>, >r, drop, over, dup, #!, !, #@, @, #,

nop, : create $` parse \ char words <= > >= < 0< <> 0<> = 0= - negate

invert swap nip ^! |! &! +! error rdump rsnap dump spaces space cr .free

.here here ? .$ d. . emit key key? d+ m* cos sin ^ | & + 2/ drop over dup

! @ allot ,



sine-osc

commands: i(nitialize) O(ctave+) o(ctave-) P(itch+) p(itch-) q(uit)

i  00CD0

q

[a run-time rsnap probe is in the source for a factor of sine-osc, i.e. command]

rdump

00ABC 00AFD 00A24 00974 00000 00000 00000 00000 00000 00000 00000 00000

00000 00000 00000 00000

[resample the return stack from the BugsForth command line]

rsnap rdump

00A24 00974 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000

00000 00000 00000 00000

[end transcript]
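
The phase arithmetic in the isine.bf comments can be reproduced on a host. The Python sketch below follows the formulas given there (Kf = 262144/Fs, omega = f*Kf) and truncates the way the commented-out gforth f>d does; the function names are illustrative.

```python
FS = 35156.25                   # sample rate in Hz, from the isine.bf comments
PHASE_BITS = 18                 # the phase accumulator is one Bugs18 cell
MASK18 = (1 << PHASE_BITS) - 1
KF = (1 << PHASE_BITS) / FS     # phase counts per Hz, i.e. 262144/Fs


def omega_for(freq_hz):
    """Phase increment per sample, with the fractional part discarded."""
    return int(freq_hz * KF)


def table_index(theta):
    """The 10 msbs of the phase select one of the 1024 sine table entries."""
    return (theta >> 8) & 0x3FF


def step(theta, omega):
    """Advance the phase by one sample with 18-bit wraparound."""
    return (theta + omega) & MASK18


a440 = omega_for(440)
print(f"{a440:05X}")                     # compare with A440 in the source
print(f"{step(a440, 0x3FFFF):05X}")      # the p command: adding 3FFFF subtracts 1
```

The first value agrees with the 0CD0 used for A440 in the source and with the 00CD0 printed by the i command. The second shows why the p command can lower the pitch with 3FFFF #+, since adding all ones in 18-bit arithmetic is the same as subtracting 1.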

At the time of writing, the author has just begun to discover the new coding styles that the BugsForth dialect makes possible, codespace/speed/readability tradeoffs, and effective debugging and performance profiling practices.

The author is willing to share:
* the gforth source for BugsForth
* example BugsForth applications source
* the VHDL source for the FPGA

Future plans include rewriting the Bugs18 CPU and SoC project in Verilog and fitting it to an inexpensive and currently available FPGA evaluation board. The hardware target has not been selected at the time of writing. The criteria are:
* at least 16Kx18 bits of block RAM
* at least 1 18x18->36 signed multiplier
* board I/O headers use 0.1" pin spacing
* FPGA Verilog compiler, fitter, and USB configuration tools proven to work when hosted on Ubuntu Linux; acquiring a Windows computer to do the job is a reluctant second choice

Comments and suggestions (especially from those with similar experience) are welcome.

Finally, my thanks for your kind attention to this announcement.

Myron Plichota - 2018-01-20
myronplichota@gmail.com