A Forth vocabulary for iteration

I recently wrote a small 16-bit Forth for 8086 PCs running DOS. I built the basic one-liner loop words that can trivially be built with just “branch if zero” and “goto”: begin, while, repeat, until, again. But I held off on implementing do / loop at first.

It didn't seem like too much of a hardship. In a previous Forth I'd built, I'd implemented do / loop using the return stack, but it was... ugly. The code to implement it was ugly, the code it generated was ugly (and large!), and I didn't find a lot of places where it was actually much nicer to use than explicit begin-based loops. I was able to implement an 8086 assembler and a Minesweeper game without bothering to build do / loop. I didn't really miss it, but I had a design percolating in the back of my mind that I wanted to try.

At some point I came across some writing that suggested that Forth had a “loop control stack”. Wouldn't it be nice if I could implement some kind of loop control stack that worked for all kinds of iteration?

The thing I built has blown me away with how flexible, composable, and useful it's turned out to be. It's way more powerful than I was expecting. And the code that leverages it is invariably much simpler and easier to read.

The Stacks

I added two loop control stacks – what I call the i-stack, and the next-stack. The i-stack contains the current value(s) being iterated over, and is read from with the i and j words like normal. The next-stack is where the magic happens.

When iterating, the top value of the next-stack is a pointer to a small structure called an iterator. It's a very simple structure, only two cells. The first cell contains the execution token of a word that will either update the current values on the i-stack and return true, or remove its state from both stacks and return false. The second cell points to a cancellation function that cleans up whatever state the iterator has kept on the two stacks without iterating further, and returns nothing.
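To make the two-cell layout concrete, here is a minimal sketch in Python (used only because the Forth words in this post run solely on the system described here). The names Iterator, iterate, cancel and count_up are hypothetical stand-ins, and the endless counter is an invented example, not something taken from the actual implementation.

```python
from dataclasses import dataclass
from typing import Any, Callable

i_stack: list[Any] = []      # current value(s) being iterated over
next_stack: list[Any] = []   # top entry is the active iterator

@dataclass
class Iterator:
    next: Callable[[], bool]    # cell 1: advance and return True, or clean up and return False
    cancel: Callable[[], None]  # cell 2: clean up without iterating further

def iterate() -> bool:
    """Run the next-word of the iterator on top of the next-stack."""
    return next_stack[-1].next()

def cancel() -> None:
    """Run the cancellation function of the iterator on top of the next-stack."""
    next_stack[-1].cancel()

# Hypothetical endless counter: its only state is one cell on the i-stack.
def _count_up_next() -> bool:
    i_stack[-1] += 1
    return True

def _count_up_cancel() -> None:
    i_stack.pop()      # remove the counter's state from the i-stack
    next_stack.pop()   # remove the iterator pointer from the next-stack

COUNT_UP = Iterator(next=_count_up_next, cancel=_count_up_cancel)

def count_up(start: int) -> None:
    """Prepare both stacks so that the first iteration yields `start`."""
    i_stack.append(start - 1)
    next_stack.append(COUNT_UP)
```

Because this counter never returns false on its own, the only way to end it is through cancel, which is exactly the job the second cell exists for.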

Iterators

I built some simple helpers for creating iterators. It took a few tries to nail down this design, but I'm happy with it now. defiter creates a “blank” iterator, which, when called, pushes itself to the next-stack. :iter does the same thing but allows you to write some code that accepts parameters and prepares the loop stacks first. :next defines a new anonymous “next-word” and assigns it to the most-recently defined iterator. :cancel does the same thing, but for cancellation.

On its own, this is already quite nice. I've got a page or so of basically-trivial iterators. Here's one:

:iter times ( n -- ) >i ;
:next <i dup 1- >i finish? ;
:cancel idrop nextdrop ;

times keeps its state in the i-stack – it initializes itself by pushing the number of times to repeat onto it. When fetching the next value, it pops the current value off the i-stack, decrements it, and pushes it back, leaving the old value on the data stack. finish? is a simple helper word that peeks at the top of the stack and runs the current cancellation function if it's false, or in this case, if we've already hit 0. Since cleaning up after an iterator is often the same job whether you're exiting early or not, this word is very handy. Explicitly defining cancellation for this iterator isn't actually necessary in my current implementation, because idrop nextdrop is common enough that I use it as the default.
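The stack traffic described above can be traced in a small Python model. This is a sketch under assumptions: the data stack, i-stack and next-stack are plain lists, an iterator is a dict with "next" and "cancel" entries, and collect_times is a hypothetical driver that stands in for a loop reading i.

```python
data: list[int] = []          # data stack
i_stack: list[int] = []       # i-stack
next_stack: list[dict] = []   # next-stack; each entry models a two-cell iterator

def finish() -> None:
    """finish? : peek at the flag on the data stack; if false, run the current cancel."""
    if not data[-1]:
        next_stack[-1]["cancel"]()

def times_next() -> None:
    """:next <i dup 1- >i finish? ;  (the old count stays on the data stack as the flag)"""
    n = i_stack.pop()        # <i
    data.append(n)           # dup: this copy is the one left behind
    i_stack.append(n - 1)    # 1- >i
    finish()                 # finish?

def times_cancel() -> None:
    """:cancel idrop nextdrop ;"""
    i_stack.pop()
    next_stack.pop()

TIMES = {"next": times_next, "cancel": times_cancel}

def times(n: int) -> None:
    """:iter times ( n -- ) >i ;  after which the iterator pushes itself."""
    i_stack.append(n)
    next_stack.append(TIMES)

def collect_times(n: int) -> list[int]:
    """Drive the iterator by hand and record what `i` would read on each pass."""
    seen = []
    times(n)
    while True:
        next_stack[-1]["next"]()
        if not data.pop():   # flag left by the next-word
            return seen
        seen.append(i_stack[-1])
```

Calling collect_times(5) walks the same path as the Forth definition: the old count doubles as the truth flag, so the pass that pops 0 is the one that triggers the cancellation and leaves both stacks empty.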

each / next

I can use these iteration words (within a compiled definition) like this:

5 times each i . next
( outputs: 4 3 2 1 0 )

All the common loop types are easy to build in this system, as well as some uncommon ones:

5 10 for each i . next ( outputs: 5 6 7 8 9 )
0 10 2 for+ each i . next ( outputs: 0 2 4 6 8 )
( pchars yields pointers to each byte in a zero-terminated string )
s" hello" pchars each i b@ emit next ( outputs: hello )

Generic cancellation, of course, allows us to trivially implement break; just cancel the iteration at the top of the stack, and then jump to the each loop exit point, after next. continue is even simpler, just jump back to the top of the loop.

5 times each i 3 < if break then i . next ( outputs: 4 3 )
5 times each i 2 % if continue then i . next ( outputs: 4 2 0 )

Under the hood, each just calls the “next-word” of the iterator and jumps to the end of the loop if it returns 0 – conceptually identical to begin iterate while, with next meaning the same thing as repeat. This allows for iterators that return no values.

0 times each i . next ( outputs: )
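The each / next expansion, together with break and continue, can be modelled as an ordinary while loop. The sketch below is in Python and rests on assumptions: times is a compact stand-in that produces the same values as the Forth word, and the loop body signals break or continue through hypothetical return values rather than compiled jumps.

```python
i_stack: list[int] = []
next_stack: list[dict] = []

def times(n: int) -> None:
    """Compact stand-in for the times iterator (counts n-1 down to 0)."""
    def nxt() -> bool:
        i_stack[-1] -= 1
        if i_stack[-1] < 0:
            cncl()           # exhausted: remove state from both stacks
            return False
        return True
    def cncl() -> None:
        i_stack.pop()
        next_stack.pop()
    i_stack.append(n)
    next_stack.append({"next": nxt, "cancel": cncl})

BREAK, CONTINUE = "break", "continue"

def each(body) -> None:
    """each ... next: loop while the top iterator's next-word returns true."""
    while next_stack[-1]["next"]():      # each: leave the loop on false
        if body() == BREAK:
            next_stack[-1]["cancel"]()   # break: cancel, then jump past `next`
            return
        # CONTINUE and reaching `next` both jump back to the top of the loop

def demo_break() -> list[int]:
    out = []
    def body():
        if i_stack[-1] < 3:
            return BREAK
        out.append(i_stack[-1])
    times(5)
    each(body)
    return out

def demo_continue() -> list[int]:
    out = []
    def body():
        if i_stack[-1] % 2:
            return CONTINUE
        out.append(i_stack[-1])
    times(5)
    each(body)
    return out
```

In this model break is the only construct that needs the cancellation cell; an iterator that runs to completion cleans up after itself from inside its next-word.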

Generators

That's nice, but it's not exactly setting the world on fire. In practice, it's a fair amount of work just to end up with a few different ways of writing “for” loops, which Forth systems have had forever anyway. Is it really worth the cost of this abstraction?

Turns out, absolutely, yes, it is, because you can also build generators on it, and that blows things wide open.

First, a simple example:

: 5-2-8 (( 5 yield 2 yield 8 yield )) ;
5-2-8 each i . next ( outputs: 5 2 8 )

(( defines the start of the generator, and )) defines the end (and pushes it onto the next-stack). Any valid Forth code goes in between. yield takes the top of the stack, pushes it onto the i-stack, and then suspends the generator until the next iteration. How does this work? Essentially, yield takes the top of the return stack and pushes it onto the next-stack, then pushes an iterator that pops it off the next-stack and pushes it back onto the return stack. The details get a little messier in order to support some more advanced use cases, but that's the simple idea at the core of it.

OK, neat trick, we've built ourselves a nice little coroutine-like system. But wait! It gets better! When yield resumes, it immediately removes all of its state from the iteration stacks. This means that generators can safely interact with any iterator that might be “underneath” them. They can iterate over things and yield in the middle! They can yield different things based on those values! We've accidentally built an extremely powerful, totally generic map/filter capability!

: doubled (( each i i + map next )) ;
5 times doubled each i . next ( outputs: 8 6 4 2 0 )
: odd (( each i 2 % filter next )) ;
5 times odd each i . next ( outputs: 3 1 )

map and filter are more yield-like words. It turns out that there are a number of these that you might want to implement, with different logic for suspending, resuming, and cancelling. map saves the top of the i-stack onto the next-stack and replaces it with the input, restoring the original value after resuming (necessary since the iterator underneath might be using that value as its state). filter conditionally suspends based on the top of the data stack but otherwise doesn't touch the i-stack, leaving whatever iterator is running underneath to provide the value. Both of these words push iterators with special cancel logic that knows that there is another iterator underneath, and can cancel again recursively once they've cleaned themselves up.
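The suspend, resume and cancel rules for yield, map and filter can be sketched in Python. Several things here are assumptions made for illustration: a Python generator object stands in for the saved return address, the generator body signals which yielding word it hit by yielding a (kind, value) pair, map's saved value lives in a closure instead of on the next-stack, and the record pushed by )) cancels only the generator itself.

```python
i_stack, next_stack = [], []

def iterate(): return next_stack[-1]["next"]()
def cancel(): next_stack[-1]["cancel"]()

def times(n):
    """Compact stand-in for the times iterator (counts n-1 down to 0)."""
    def nxt():
        i_stack[-1] -= 1
        if i_stack[-1] < 0:
            cncl()
            return False
        return True
    def cncl():
        i_stack.pop(); next_stack.pop()
    i_stack.append(n)
    next_stack.append({"next": nxt, "cancel": cncl})

def suspend(gen, on_resume, nested):
    """Push an iterator record for a suspended generator."""
    def nxt():
        next_stack.pop()   # resuming first removes the generator's own state...
        on_resume()        # ...including whatever it did to the i-stack
        return run(gen)
    def cncl():
        next_stack.pop()
        on_resume()
        gen.close()        # abandon the saved resume point
        if nested:
            cancel()       # map/filter: cancel the iterator underneath too
    next_stack.append({"next": nxt, "cancel": cncl})

def run(gen):
    """Run the body until it suspends (True) or falls off the end (False)."""
    for kind, arg in gen:
        if kind == "yield":          # push the value; drop it on resume
            i_stack.append(arg)
            suspend(gen, i_stack.pop, nested=False)
        elif kind == "map":          # replace top of i-stack; restore on resume
            saved, i_stack[-1] = i_stack[-1], arg
            suspend(gen, lambda: i_stack.__setitem__(-1, saved), nested=True)
        elif kind == "filter":       # suspend only when the flag is true
            if not arg:
                continue
            suspend(gen, lambda: None, nested=True)
        return True
    return False

def generator(body):
    """Model of (( ... )): push the not-yet-started generator."""
    suspend(body(), lambda: None, nested=False)

def five_two_eight():   # : 5-2-8 (( 5 yield 2 yield 8 yield )) ;
    def body():
        yield ("yield", 5); yield ("yield", 2); yield ("yield", 8)
    generator(body)

def doubled():          # : doubled (( each i i + map next )) ;
    def body():
        while iterate():
            yield ("map", i_stack[-1] + i_stack[-1])
    generator(body)

def odd():              # : odd (( each i 2 % filter next )) ;
    def body():
        while iterate():
            yield ("filter", i_stack[-1] % 2)
    generator(body)

def collect():
    out = []
    while iterate():
        out.append(i_stack[-1])
    return out
```

The key property is visible in nxt: by the time the generator body runs again, its record is gone from the next-stack, so a call to iterate inside the body reaches whatever iterator sits underneath.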

Generator state

This design can almost be made to work for generators that have extra state, but it's awkward and incomplete. You must ensure the data stack is clean whenever you yield, so you're forced to manually shuffle data to and from the next-stack. Consider a filter that only returns values that are divisible by a certain number:

: divisible-by ( n -- ) >next 
  (( <next each i over % 0 = swap >next filter <next next drop )) ;
5 divisible-by 21 times each i . next ( outputs: 20 15 10 5 0 )

This works, but there's so much stack noise! And it breaks down if you need to cancel, because filter has no idea that there's extra stuff on the next-stack that it needs to clear. Ideally there would be some automatic way of keeping the state of the generator on the data stack while it's running, and push it safely away when we suspend. Could there be some way to write divisible-by like this?

: divisible-by ( n -- ) >arg (( each i over % 0 = filter next drop )) ;

In fact, this code works in my implementation. The scheme to make this happen is a little bit subtle, but it can be done efficiently with a minimum of bookkeeping noise in most cases. I define a variable, gen-arg-count, that starts at zero. >arg is an immediate word that compiles a call to >next and increments that variable. Then, any time I compile a yielding word, I append the value of gen-arg-count to the instruction stream – much like lit. When suspending, the yielding word reads that value out of the instruction stream and transfers that many values from the data stack to the next-stack. Then it moves the pointer to the instruction stream from the return stack to the next-stack, and finally pushes the yielding iterator. That iterator then pulls the instruction pointer back off the next-stack to determine how many values to move from the next-stack back onto the data stack, as well as where to resume the instruction stream. Cancellation similarly can read the arg-count byte to know how many extra values to drop from the next-stack.
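The bookkeeping described above can be sketched in Python. This is a model under assumptions, not the real compiler: the instruction stream is a plain list, the inline count occupies one list slot right after the yielding word, YIELDER is a hypothetical placeholder for the yielding iterator's pointer, and the function names are invented.

```python
data: list = []         # data stack
next_stack: list = []   # next-stack
gen_arg_count = 0       # compile-time counter, starts at zero

YIELDER = "yielder-iterator"   # placeholder for the iterator the yielding word pushes

def to_arg(code: list) -> None:
    """>arg (immediate): compile a call to >next and bump the counter."""
    global gen_arg_count
    code.append(">next")
    gen_arg_count += 1

def compile_yielder(code: list, name: str) -> None:
    """Compile a yielding word followed by the current count, much like lit."""
    code.append(name)
    code.append(gen_arg_count)

def suspend(code: list, ip: int) -> None:
    """Run by a yielding word; code[ip] is the inline count that follows it."""
    for _ in range(code[ip]):
        next_stack.append(data.pop())   # move generator state off the data stack
    next_stack.append(ip)               # resume point (still points at the count)
    next_stack.append(YIELDER)          # finally, the yielding iterator

def resume(code: list) -> int:
    """Restore the state and return the address at which to continue."""
    next_stack.pop()                    # the yielding iterator
    ip = next_stack.pop()
    for _ in range(code[ip]):
        data.append(next_stack.pop())   # original order is restored
    return ip + 1                       # skip over the inline count

def cancel(code: list) -> None:
    """Drop the iterator, the resume point and the saved state."""
    next_stack.pop()
    ip = next_stack.pop()
    del next_stack[len(next_stack) - code[ip]:]
```

Because the count travels with the resume point, neither resume nor cancel needs any per-generator metadata; a generator with no arguments simply compiles a count of zero and pays nothing extra when it suspends.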

Generators need to ensure the data stack is empty before exiting at )). At one point I considered having )) compile the appropriate number of drop calls automatically, but in the end I decided that it's reasonable and idiomatic to expect a generator to exit with a clean stack, like any other Forth word would.

With this extension, it's trivial to write all kinds of new iterators – we could even do away with the base iterator system entirely and just express everything as generators. There are lots of nice one-line definitions of times:

( 1 ) : times ( n -- ) >arg (( begin dup while 1- dup yield repeat drop )) ;
( 2 ) : times ( n -- ) >next (( <next begin dup while 1- yield> repeat drop )) ;
( 3 ) : times ( n -- ) >arg (( -arg begin dup while 1- yield> repeat drop )) ;
( 4 ) ( suspend ) ' noop ( resume ) ' noop ( cancel ) ' idrop :yield iyield
: times ( n -- ) >i (( begin i while <i 1- >i iyield repeat idrop )) ;

Definition 1 doesn't use anything I haven't already explained. The state of the iterator is managed on the data stack, and automatically shuffled back and forth from the next-stack by yield.

Definition 2 adds a new word. yield> is a yielder that moves the yielded value from the i-stack back onto the data stack when it resumes, instead of dropping it. The state of the iterator starts on the next-stack but is moved to the i-stack once the iteration loop actually starts.

Definition 3 is virtually the same as 2, but demonstrates the ability to handle changes in the amount of state. -arg is an immediate word that generates no code, but decrements gen-arg-count so that you can express that you've consumed the argument and the next yield should preserve one less value on the data stack. (+arg is also defined, performing an increment, in case you generate more values on the stack than you started with.)

Definition 4 is built to keep all state on the i-stack from the beginning. Here we use :yield to define a new yielding word. I realized I hadn't built a yielder that left the i-stack alone when resuming, but would drop the value when cancelling, so I added one.

All of these options will correctly be cancelled if the code iterating over them calls break, with no special effort!

Final thoughts

With this scheme, generators always take up at least two spaces on the next-stack – one for the yielder's iterator, and one for the resume point. But if all iterators were defined as generators, and all yielding words had to be defined with :yield to ensure a uniform structure, we could just push the resume point. iterate and cancel could easily find the appropriate function pointer by looking next to the resume point for the address of the yielder and digging inside. I think this could be built in such a way that it would be basically as efficient as the existing scheme, at the cost of making the whole thing more complex to explain. It might be worth pursuing, because generators are so pleasant to read and write, and raw iterators are... less so. I basically never want to write a raw iterator besides the very basic ones that are built-in.

All the source for my Forth system is available online; the iteration system is defined in iter.jrt. There are some interesting examples of generators in embed.jrt, dialer.jrt and rick.jrt – some highlights: