Tiny Great Languages: MOUSE

This is part 3 from series “Tiny Great Languages”.

Let’s go Forth. A concatenative language available on early computers, a great example of how small, elegant languages can be both powerful and efficient.

I’ve already covered building a proper Forth from the ground up in an earlier post. Now, let’s explore a much smaller Forth-like language: MOUSE.

MOUSE, created in 1976, is often recommended as a good starting point for writing your first interpreter. It uses reverse Polish notation with single-letter operators and variables for its syntax. Like most languages from the 70s, MOUSE is compact, small enough to fit into just a few kilobytes of memory. But could the implementation be shorter than 50 lines of code?

The original MOUSE source code was published in the famous BYTE magazine, which taught an early generation of programmers how to make the most out of their micro- (now “retro-") computers. There is also a newer and extended version of MOUSE by the same author, but I’ll be using the version published in BYTE ‘79.

To get a feel for MOUSE, here’s a factorial program written in it:

' A simple factorial program in MOUSE language
N 10 =

I 2 =
F 1 =
(
    N. 1 + I. - ^
    F F. I. * =
    I I. 1 + =
)
N.! " => " F.! "!"

At first glance, it might look cryptic, but let’s break it down. In this code, N, I, and F are global variables. Unlike most Forth implementations, MOUSE relies heavily on named variables rather than stack manipulation (there’s no “dup”, “drop”, “swap”, etc.). There are 26 variables, named A to Z. The equals sign assigns a value to a variable, so N 10 = is like N := 10. In newer versions of MOUSE, they replaced the = with : for assignment. A dot after a variable fetches its value and pushes it onto the stack. So, N.1+I.- translates to N+1-I. An exclamation point is the print operator, so N.! is print(N).

Parentheses combined with a caret create a conditional loop: if the code to the left of the caret results in a non-positive value (0 or negative), the loop continues. Otherwise, it terminates. MOUSE also has a [ ... ] operator for “if/then” blocks, which are only executed if the current value on the stack is non-positive.

While MOUSE doesn’t have functions, it does support macros! A macro is defined using $name ... @ and is called with #name ... ;. Macro parameters, named %A, %B, etc., allow passing values into the macro and returning results. Other variables inside macros behave as local variables and are cleared when the macro exits. Macros can be nested and called recursively (your RAM is the limit, which in the 70s was often very low).

The syntax and logic of the language are simple enough to implement in a single function. We’ll also need a helper function, skip(), to find matching pairs of parentheses, brackets, and quotes.

import sys

# Returns a position in string "s" of the unmatched delimiter "r" (skipping nested l+r pairs)
#
#     skip("2+(3+(4+5)))+6", "(", ")") -> 12 ("+6")
#     skip("a [ b  [] [] c]] d []", "[", "]") -> 16 (" d []")
#
def skip(s, l, r): return next(i + 1 for i, c in enumerate(s) if (c == r and (s[:i].count(l) - s[:i].count(r)) == 0))

# A MOUSE interpreter in one function:
def mouse(s):
    # We need to store macro definitons (they don't shadow variables),
    # data stack, return stack and memory for variables.
    defs, ds, rs, data = {}, [], [], [0] * 200
    # First we loop over code and store starting addresses of all macros
    for i, c in enumerate(s):
        if c == "$": defs[s[i + 1]] = i + 2
    # "i" is a code pointer, offset if the start of the first local variable
    # "A" in current environment (environments can be nested, so inner
    # function's "A" becomes 26, the other inner starts with 52 etc.
    i, offset = 0, 0
    while i < len(s) and s[i] != "$":
        # Skip whitespace
        if s[i] in " \t\r\n]$": pass
        # Skip comments till the end of line
        elif s[i] == "'": i += s[i + 1 :].find("\n")
        # Parse numbers into integers
        elif s[i].isdigit():
            n = 0
            while s[i].isdigit(): n = n * 10 + ord(s[i]) - ord("0"); i = i + 1
            i = i - 1
            ds.append(n)
        # Put variable address on data stack
        elif s[i] in "ABCDEFGHIJKLMNOPQRSTUVWXYZ": ds.append(ord(s[i]) - ord("A") + offset)
        # Read user input as a number
        elif s[i] == "?": ds.append(num(input("> ")))
        # Print value from data stack as a number
        elif s[i] == "!": print(ds.pop(), end="")
        # Print a literal string, "!" is newline
        elif s[i] == '"': j = skip(s[i + 1 :], '"', '"'); print(s[i + 1 : i + j].replace("!", "\n"), end=""); i = i + j
        # Common arithmetics and comparison (note the lack of == and != operators
        elif s[i] == "+": ds.append(ds.pop() + ds.pop())
        elif s[i] == "-": ds.append(-ds.pop() + ds.pop())
        elif s[i] == "*": ds.append(ds.pop() * ds.pop())
        elif s[i] == "/": ds.append(int(1 / (ds.pop() / ds.pop())))
        elif s[i] == ">": ds.append(int(ds.pop() < ds.pop()))
        elif s[i] == "<": ds.append(int(ds.pop() > ds.pop()))
        # Fetch/dereference variable
        elif s[i] == ".": ds.append(data[ds.pop()])
        # Store value into a variable
        elif s[i] == "=": x = ds.pop(); data[ds.pop()] = x
        # If data stack has non-positive value: execute the block, otherwise skip it
        elif s[i] == "[": i += skip(s[i + 1 :], "[", "]") if ds.pop() <= 0 else 0
        # Start of the loop: store it in return stack
        elif s[i] == "(": rs.append(("LOOP", i, offset))
        # Loop condition: if condition is non-positive - continue, otherwise skip until pairing ")"
        elif s[i] == "^":
            if ds.pop() <= 0: _, i, _ = rs.pop(); i += skip(s[i + 1 :], "(", ")")
        # End of loop: return to its start (which is stored on return stack)
        elif s[i] == ")": _, i, offset = rs[-1]
        # Call a macro: save current code pointer and variable offset, jump to the start of the macro
        elif s[i] == "#":
            if s[i + 1] in defs:
                rs.append(("MACRO", i, offset))
                i = defs[s[i + 1]]
                offset = offset + 26
            else: i += skip(s[i + 1 :], "#", ";")
        # End of macro: return to the macro call address and ignore until the last parameter (";")
        elif s[i] == "@": _, i, offset = rs.pop(); i += skip(s[i + 1 :], "#", ";")
        # Macro parameter delimiter: return back to the macro and continue execution from there
        elif s[i] == "," or s[i] == ";": _, i, offset = rs.pop()
        # Macro parameter inside macro: jump to the matching value from the macro call
        elif s[i] == "%":
            pn = ord(s[i + 1])-ord("A")+1
            rs.append(("PARAM", i + 1, offset))
            pb = 1
            tmp = len(rs) - 1
            while pb:
                tmp = tmp - 1
                tag, nn, off = rs[tmp]
                if tag == "MACRO": pb = pb - 1
                elif tag == "PARAM": pb = pb + 1
            _, i, offset = rs[tmp]
            while i < len(s) and pn and s[i] != ";":
                i = i + 1
                if s[i] == "#": i += skip(s[i:], "#", ";")
                if s[i] == ",": pn = pn - 1
            if s[i] == ";": _, i, offset = rs.pop()
        i = i + 1


mouse("\n".join([line for line in sys.stdin]))

Not much more to add besides the comments. The code closely replicates what was suggested in BYTE magazine.

Macros must be defined at the end of the program, which is how the language was originally designed. This also means that the full program must be entered before execution (unlike BASIC, MOUSE is not a REPL).

Our interpreter first scans the code for macros, remembers their positions, and then executes the program byte by byte: arithmetics, conditionals, loops, input/output are just single-byte instructions that are executed immediately as they are found in code.

Without macros our MOUSE interpreter would have been half the size, but also half as useful. Macro implementation in MOUSE is rather interesting. Since the language limits itself to 26 variables - every “call frame” (a set of local variables per macro call) is also 26 cells long. So we keep track of an offset for the current call frame in memory and every macro can only reach out to the variables in that frame. We increase the offset as we enter the macro and restore it on return.

When a macro is called (#...;) the interpreter looks up its name in the definitions table. A missing macro is a no-op and does not halt the interpreter. If a macro is found - the interpreter jumps into its body (remembering the current address in code to return back) and keeps running from there.

Macro body always ends with @, at which point the interpreter pops up the return address and skips everything until the matching ; of the macro call. Macros without parameters are simple as that: store address when a macro is called, restore it at the end of the macro body, skip macro call until the ;.

Macro parameters make it more challenging. When a parameter is found in a macro (%A, %B, …) – the interpreter converts it to a number (position in the argument list during the call) and jumps back to the macro call to find the actual parameter value at that position. The value in macro call is being evaluated each time the parameter is used in a macro, which makes macros very different from functions. But overall it works very intuitively, and allows some nice tricks: writing to “outer” variables from the macro, using expressions as macro parameters, calling macros from inside macros and even calling them recursively.

Now that we understand how macros work - we can rewrite MOUSE factorial in a fancier and terser manner:

"10 => " #F,10;! "!"
$F F%A=1(F.^F.*FF.1-=) @

That’s it! MOUSE is an interesting little language and there is even a whole book written about it. For inspiration, check out FALSE, a similar but more modern and esoteric language. Or, if you’re up for a challenge, try building a full Forth system, especially one that runs on real hardware instead of a VM.

But for now, let’s move on! In the next chapter, we’ll take a closer look at Lisp. Stay tuned!

I hope you’ve enjoyed this article. You can follow – and contribute to – on Github, Mastodon, Twitter or subscribe via rss.

Sep 11, 2024

See also: Tiny Great Languages: BASIC and more.