BugsForth is a dialect tailored for the Bugs18 Forth CPU running at 80 MHz on the author's (obsolete) Xilinx Spartan3A FPGA evaluation board. The data and return stacks are register arrays (not in memory) of 16 entries each, with no hardware provision to detect stack overflow or underflow. There is also a program counter and an invisible instruction packet register, both explained below.
Code, data, and I/O are all in one flat memory space:
{00000..000D7} USB/UART serial bootloader with recyclable procedures
{000D8..here-1} BugsForth
{here..04FFF} free dictionary to the end of physical RAM on the FPGA
{05000..0FFFF} maximum reach of the program counter, for reference only
{10000..1FFFF} undefined
{20000..3FFFF} I/O, includes 18x18 multiplier and carry from addition
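The memory map can be modelled on a host machine as a rough sanity check. The sketch below is not part of BugsForth; the names region, free_cells, and RAM_END are hypothetical, and here is passed in as a parameter because it moves as the dictionary grows.

```python
# Host-side model of the memory map above. All names here are hypothetical
# and exist only for illustration; they are not BugsForth words.
RAM_END = 0x04FFF  # last cell of physical RAM on the FPGA


def region(addr, here):
    """Name the memory-map region that an 18-bit address falls in."""
    if not 0 <= addr <= 0x3FFFF:
        raise ValueError("address does not fit in 18 bits")
    if addr <= 0x000D7:
        return "bootloader"
    if addr < here:
        return "BugsForth"
    if addr <= RAM_END:
        return "free dictionary"
    if addr <= 0x0FFFF:
        return "program counter reach, no RAM"
    if addr <= 0x1FFFF:
        return "undefined"
    return "I/O"


def free_cells(here):
    """Cells left between here and the end of physical RAM."""
    return RAM_END + 1 - here


# With here = 00A4C, the value in the signon line of the transcript:
print(format(free_cells(0x00A4C), "05X"))  # prints 045B4
print(region(0x3FFFE, 0x00A4C))            # prints I/O
```

The first result agrees with the free-cell count that BugsForth reports at signon in the transcript.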
Bugs18 memory cells, instruction packets, and most registers are 18 bits. There is no byte addressing. ASCII strings are packed 2 chars per cell by convention.
The program counter addresses complete instruction packets containing 1 or 4 instructions, not the individual instructions within a packet. After an instruction packet fetch, the program counter is incremented to the address of the next packet or inline literal satisfying an occurrence of the # instruction within the current packet. The # instruction is the sole means to convey a compile-time expression to a run-time constant.
A Bugs18 instruction packet has fields within, referred to as slots {0..4}. If slot 0 = 0 (not a jump or call) then slot 1 is the current instruction. When slot 1 is retired, slot 0 retains its value and the rest of the instruction packet register is shifted so:
slot 1 <- slot 2 <- slot 3 <- slot 4 <- nop
Eventually slots {2..4} are filled with nops. If the final instruction in slot 1 does not access memory, then the next packet is fetched in parallel, otherwise a shifted-in nop will appear in slot 1, preserving CPU registers while the next packet is fetched in sequence and the process repeats.
If slot 0 <> 0 then slots {1..4} contain the jump target address. The entire packet is retired without any of the above slot shifting, and another packet will be immediately fetched.
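The slot shifting can be sketched in a few lines of host-side Python. This is a simplified model, not the VHDL: it assumes the packet layout of a 2-bit slot 0 followed by four 4-bit slots, ignores cycle counts and the parallel fetch, and the names retire_slot1 and trace are hypothetical.

```python
def retire_slot1(packet):
    """Shift slots 2..4 up one place, feed a nop (0) into slot 4, keep slot 0."""
    slot0 = packet & 0x30000            # top 2 bits are left untouched
    rest = (packet << 4) & 0x0FFFF      # slot 1 <- slot 2 <- slot 3 <- slot 4 <- nop
    return slot0 | rest


def trace(packet):
    """List the slot-1 opcodes executed until only trailing nops remain."""
    if packet >> 16:
        raise ValueError("slot 0 <> 0: jump or call, no slot shifting")
    executed = []
    while packet & 0xFFFF:
        executed.append((packet >> 12) & 0xF)
        packet = retire_slot1(packet)
    return executed


# Packet 04B60 holds opcodes 4, B, 6 and a trailing nop.
print([format(op, "X") for op in trace(0x04B60)])  # prints ['4', 'B', '6']
```

The model stops as soon as slots {1..4} are all nops, so it does not show the shifted-in nop that occupies slot 1 while a fetch follows a memory access.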
      .----- slot 0: 2 bits, qualifies slots {1..4}
      |.---- slot 1: 4 bits, 1st instruction or digit 3 of absolute address
      ||.--- slot 2: 4 bits, 2nd instruction or digit 2 of absolute address
      |||.-- slot 3: 4 bits, 3rd instruction or digit 1 of absolute address
      ||||.- slot 4: 4 bits, 4th instruction or digit 0 of absolute address
      |||||
nop   00000  do nothing                ( -- )                  0|1|2 cycles  Note 1
#     01111  inline literal            ( -- n )                program counter increments, 2 cycles
@     02222  fetch                     ( addr -- n )           2 cycles
!     03333  store                     ( n addr -- )           1 cycle
dup   04444  duplicate top             ( n -- n n )            1 cycle       Note 2
over  05555  duplicate next            ( n1 n2 -- n1 n2 n1 )   1 cycle       Note 2
drop  06666  discard top               ( n -- )                1 cycle       Note 2
>r    07777  move top to return stack  ( n -- ) ( r: -- n )    1 cycle       Note 2
r>    08888  move return stack to top  ( -- n ) ( r: n -- )    1 cycle       Note 2
r@    09999  copy return stack to top  ( -- n ) ( r: n -- n )  1 cycle       Note 2
2/    0AAAA  signed shift right        ( n -- n>>1 )           1 cycle       Note 2
+     0BBBB  add                       ( n1 n2 -- n1+n2 )      1 cycle       Note 2
&     0CCCC  and                       ( n1 n2 -- n1&n2 )      1 cycle       Note 2
|     0DDDD  or                        ( n1 n2 -- n1|n2 )      1 cycle       Note 2
^     0EEEE  xor                       ( n1 n2 -- n1^n2 )      1 cycle       Note 2
;     0FFFF  return                    ( -- ) ( r: addr -- )   2 cycles
jz    1aaaa  jump if zero              ( flag -- )             2 cycles
j     2aaaa  jump                      ( -- )                  2 cycles
p     3aaaa  call procedure            ( -- ) ( r: -- addr )   2 cycles
Note 1:
0 cycles - when filling the trailing slots of a packet that was compiled
not-full because a jump target at the next address is required
1 cycle - when an occurrence is followed by any non-nop(s) in the packet
2 cycles - when a packet full of nops has been fetched
Note 2:
These instructions may execute in parallel with a 2-cycle packet fetch.
A random read of synchronous block RAM requires 2 cycles. The first cycle
provides the read address, making the read data available in the second
cycle.
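As an aid for reading hex dumps, the encoding table above can be turned into a small host-side disassembler. This is a sketch only: the function name disassemble is hypothetical, and it assumes that inline literals following a # are handled separately, since a literal cell cannot be told apart from a packet by its bits alone.

```python
# Opcode mnemonics indexed by their 4-bit value, copied from the table above.
MNEMONICS = ["nop", "#", "@", "!", "dup", "over", "drop", ">r",
             "r>", "r@", "2/", "+", "&", "|", "^", ";"]
# Non-zero slot 0 values select a jump or call to a 16-bit absolute address.
JUMPS = {1: "jz", 2: "j", 3: "p"}


def disassemble(packet):
    """Render one 18-bit instruction packet as text."""
    if not 0 <= packet <= 0x3FFFF:
        raise ValueError("packet does not fit in 18 bits")
    slot0 = packet >> 16
    low16 = packet & 0xFFFF
    if slot0:
        return f"{JUMPS[slot0]} {low16:04X}"
    return " ".join(MNEMONICS[(low16 >> shift) & 0xF] for shift in (12, 8, 4, 0))


print(disassemble(0x04B60))  # prints dup + drop nop
print(disassemble(0x30A24))  # prints p 0A24
```

Because every field is either 2 or 4 bits wide, each hex digit of a dumped cell maps directly onto one slot, which is the point made under "Why 18 bits?" below.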
Why 18 bits?
1) Many competing FPGA vendors provide 18-bit block RAM and
multiplier fabric, making it a de-facto standard.
2) 18-bit integers are 4*more better than 16 bits.
3) The instruction packet format is simple and consistent,
reducing the pain of viewing hex dumps when ya jist gotsta know.
Unlike most Forths, BugsForth is case-sensitive so that dictionary searches are as simple and fast as possible. The author thinks twice before defining a new word whose name differs from an existing word only by case.
The BugsForth prompt is a new line without the traditional ok. No news is good news. Hexadecimal is the only implemented numeric text conversion base. Redefinition of existing words is forbidden.
All words in the dictionary contain an executable flag in their headers. Code is flagged executable, data is flagged not executable.
BugsForth is subroutine-threaded, with no inner interpreter. In conventional Forth terms, there is no distinction between compiling and interpreting states. All executable words may be regarded as IMMEDIATE, running whenever encountered by the outer interpreter. Certain parsing words that consume the next text word are required to create new words or yield the pointer to an existing executable word. Examples:
: a-new-code-word
jz_ an-existing-code-word
j_ an-existing-code-word
p_ an-existing-code-word
create a-new-data-word
This involves a novel contract with the dictionary find boilerplate:
: find ( -- |address executable-flag true|false| ) \ returns false if not found, else 3 cells on stack
In gforth Bugs18 cross-compiler syntax, the read/evaluate/print loop is:
begin, \ repl
  begin, \ get a non-empty line
    p_ __cr p_ _getline \ update line-limit and >in
  until, \ at least 1 char before carriage return
  begin, \ words loop
    20 #, dup, p_ _delimit \ update word-start, word-limit, and >in
    if, \ success
      p_ _pack-word
      p_ _find \ -- |addr cflag true|false|
      if, \ found
        if, \ code
          p_ _execute ( ? -- ? )
        then, \ -- may be addr of data or results of executed word
        dup, dup, ^, \ continue flag
      else, \ not found, attempt number conversion
        p_ _number \ -- |n true|false|
        if, \ hex number
          dup, dup, ^, \ continue flag
        else, \ make a fuss about it
          p_ _where
          p_ __cr _name$ #, p_ _.$ char ? #, p_ __emit
          ALL-BITS #, \ terminate flag
        then,
      then,
    else,
      ALL-BITS #, \ terminate flag
    then,
  until, \ line terminated
again,
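For readers unfamiliar with the cross-compiler syntax, the control flow of the words loop can be paraphrased in Python. This is a loose model under stated assumptions, not a translation: the dictionary is a plain dict, execute is a caller-supplied stand-in for _execute, only upper-case hex digits are accepted (an assumption), and the names find and interpret_line are hypothetical.

```python
HEX_DIGITS = set("0123456789ABCDEF")  # assumption: upper-case digits only


def find(dictionary, name):
    """Mimic the find contract: (addr, executable, True) if found, else (False,)."""
    entry = dictionary.get(name)
    if entry is None:
        return (False,)
    addr, executable = entry
    return (addr, executable, True)


def interpret_line(line, dictionary, stack, execute):
    """Process one line; return True if it completed, False if a word failed."""
    for word in line.split():
        result = find(dictionary, word)
        if result[-1]:                       # found
            addr, executable, _ = result
            if executable:
                execute(addr, stack)         # code runs as soon as it is met
            else:
                stack.append(addr)           # data leaves its address behind
        elif set(word) <= HEX_DIGITS:        # not found, try hex conversion
            stack.append(int(word, 16) & 0x3FFFF)
        else:                                # make a fuss and end the line
            print(word, "?")
            return False
    return True


# Toy dictionary: one code word and one data word, with made-up addresses.
toy = {"dup": (0x100, True), "theta": (0x200, False)}
stack = []
interpret_line("12 dup theta", toy, stack, lambda addr, s: s.append(s[-1]))
print([format(n, "05X") for n in stack])  # prints ['00012', '00012', '00200']
```

Note that the dictionary is searched before number conversion is attempted, so a word such as A440 or ADAC is found as a word even though its name is also valid hex.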
The annotated transcript below shows that the BugsForth outer interpreter does not crash when the data stack overflows, and provides an example of what to expect when heavily-nested conditionals (which produce and consume housekeeping data at compile-time) are in the source.
The return stack is not so heavily loaded, and offers some relief via >r, r@, and r>, within a given procedure. Return stack overflow/underflow is fatal. The words rsnap and rdump are provided to examine the return stack.
[start transcript]
[Bugs18 FPGA board serial bootloader prompt]
Bugs18 is ready to receive code
[host terminal does file transfer of BugsForth.app binary without echo]
[BugsForth signon]
BugsForth free cells, here are hex 045B4 00A4C
[human input, BugsForth response]
words -app repeat while again until begin then else if p_ j_ jz_ ; #^, ^, #|, |, #&, &, #+, +, 2/, r@, r>, >r, drop, over, dup, #!, !, #@, @, #, nop, : create $` parse \ char words <= > >= < 0< <> 0<> = 0= - negate invert swap nip ^! |! &! +! error rdump rsnap dump spaces space cr .free .here here ? .$ d. . emit key key? d+ m* cos sin ^ | & + 2/ drop over dup ! @ allot ,
[host terminal does file transfer of isine.bf source, BugsForth echoes]
16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 \ diagnostic: fill the stack
\ Interactive sine oscillator, BugsForth translation
\ The audio dac is 12 bits, left-justified, signed.
\ The 10 msbs of an oscillator's phase indexes a 1024-entry sine lookup table.
\ The 8 lsbs are the fractional part.
\ Fs = 35156.25 Hz
\ Kf = 262144/Fs
\ omega = f*Kf \ the phase increment per sample given desired f Hz
\ decimal
\ 262144e0 35.15625e3 f/
\ fdup fconstant Kf \ 1 Hz in floating-point
\ 440e0 f* f>d drop constant A440 \ 440 Hz in Bugs18 integer
\ hex
\ BugsForth constants are implemented as procedures
: A440 0CD0 #, ; \ initial frequency
: ARX 3FFF5 #, ; \ read rx char
: PSTAT 3FFF6 #, ; \ read peripheral status
: ADAC-READY 8 #, ; \ audio dac mask
: ADAC 3FFFE #, ; \ write audio dac
create theta 0 , \ phase
create omega A440 , \ phase increment per sample
: .omega! \ ( n -- ) update omega and print
dup, omega #!, j_ . \ eliminate tail call
: command \ ( -- quit-flag ) handle keyboard commands
p_ key?
if
p_ ARX @, \ ( -- c ) get rx byte
p_ cr dup, p_ emit p_ space \ echo on a new line
\ consider some cases
dup, char i #^, if \ not i
dup, char P #^, if \ not P
dup, char p #^, if \ not p
dup, char O #^, if \ not O
dup, char o #^, if \ not o
dup, char q #^, if \ not q
dup, ^, ; then \ none of the above, ignore
; then \ q, quit
dup, ^, omega #@, 2/, j_ .omega! then \ o, down 1 octave
dup, ^, omega #@, dup, +, j_ .omega! then \ O, up 1 octave
dup, ^, omega #@, 3FFFF #+, j_ .omega! then \ p, down
dup, ^, omega #@, 1 #+, j_ .omega! then \ P, up
dup, ^, p_ A440 j_ .omega! \ i, A440
then
dup, dup, ^,
p_ rsnap \ probe
;
create help$
$` commands: i(nitialize) O(ctave+) o(ctave-) P(itch+) p(itch-) q(uit)`
: sine-osc
p_ cr help$ #, p_ .$
begin
p_ PSTAT @, p_ ADAC-READY &,
if \ audio dac ready
theta #, dup, >r, @,
dup, p_ sin p_ ADAC !, \ adac <- sin theta
omega #@, +, r>, !, \ theta <- theta + omega
then
p_ command
until ; \ quit flag
. . . . . . . . . . . . . . . . \ diagnostic: show stack survivors 00001 00002 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003 00003
[human input, BugsForth response]
words sine-osc help$[] command .omega! omega[] theta[] ADAC ADAC-READY PSTAT ARX A440 -app repeat while again until begin then else if p_ j_ jz_ ; #^, ^, #|, |, #&, &, #+, +, 2/, r@, r>, >r, drop, over, dup, #!, !, #@, @, #, nop, : create $` parse \ char words <= > >= < 0< <> 0<> = 0= - negate invert swap nip ^! |! &! +! error rdump rsnap dump spaces space cr .free .here here ? .$ d. . emit key key? d+ m* cos sin ^ | & + 2/ drop over dup ! @ allot ,
sine-osc
commands: i(nitialize) O(ctave+) o(ctave-) P(itch+) p(itch-) q(uit)
i 00CD0
q
[a run-time rsnap probe is in the source for a factor of sine-osc, i.e. command]
rdump 00ABC 00AFD 00A24 00974 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000
[resample the return stack from the BugsForth command line]
rsnap rdump 00A24 00974 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000 00000
[end transcript]
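The phase arithmetic described in the isine.bf comments above can be checked on a host machine. The sketch below follows the formulas in those comments; the names omega_for, add18, and table_index are hypothetical, and truncation toward zero is assumed because it reproduces the 0CD0 constant in the source.

```python
FS = 35156.25                    # sample rate in Hz, from the source comments
PHASE_BITS = 18                  # one Bugs18 cell
MASK = (1 << PHASE_BITS) - 1     # 3FFFF
KF = (1 << PHASE_BITS) / FS      # phase increment per sample for 1 Hz


def omega_for(freq_hz):
    """Phase increment per sample for freq_hz, truncated to an integer."""
    return int(freq_hz * KF) & MASK


def add18(a, b):
    """18-bit wraparound addition, as the + instruction would give."""
    return (a + b) & MASK


def table_index(theta):
    """Split a phase into (10-bit sine table index, 8-bit fraction)."""
    return theta >> 8, theta & 0xFF


a440 = omega_for(440)
print(format(a440, "05X"))                  # prints 00CD0
print(format(add18(a440, 0x3FFFF), "05X"))  # prints 00CCF, the p (down) command
print(format(add18(a440, a440), "05X"))     # prints 019A0, the O (octave up) command
```

Adding 3FFFF is the same as subtracting 1 in 18-bit arithmetic, which is why the p command can step the pitch down using only the #+, word.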
At the time of writing, the author has just begun to discover the new coding styles that the BugsForth dialect makes possible, codespace/speed/readability tradeoffs, and effective debugging and performance profiling practices.
The author is willing to share:
* the gforth source for BugsForth
* example BugsForth applications source
* the VHDL source for the FPGA
Future plans include rewriting the Bugs18 CPU and SoC project in Verilog and fitting it to an inexpensive, currently available FPGA evaluation board.
The hardware target has not been selected at the time of writing.
The criteria are:
* at least 16Kx18 bits of block RAM
* at least 1 18x18->36 signed multiplier
* board I/O headers use 0.1" pin spacing
* Verilog compiler, fitter, and USB configuration tools for the FPGA that are proven to work when hosted on Ubuntu Linux; acquiring a Windows computer to do the job is a reluctant second choice
Comments and suggestions (especially from those with similar experience) are welcome.
Finally, my thanks for your kind attention to this announcement.
Myron Plichota - 2018-01-20
myronplichota@gmail.com