Tiny Great Languages: BASIC

This is part 2 from series “Tiny Great Languages”.

Meet BASIC, the king of home computing in the late 1970s. Originally designed to promote computer literacy in schools, BASIC inspired a whole generation of professional software engineers.

BASIC typically combined a simple text editor with a command shell and interpreter. Lines starting with a number were added, deleted, or edited in the program space, while lines without numbers were executed immediately. For example:

NEW
20 B=1
10 A=5
30 PRINT A+B
20 B=7
RUN

Here, NEW and RUN are immediate commands – NEW clears the user program, and RUN executes it. Numbered lines are stored in order, and typing the LIST command would display the entire program:

10 A=5
20 B=7
30 PRINT A+B

The outer loop of our BASIC would be simple, we have a luxury to store lines of code in a hash map (unlike the old BASICs that had to run on just a few kilobytes of RAM and stored very compact tokenised input instead):

code = {}

# Cut a number from a string and return the rest,
# i.e. "10 A=3" -> (10, " A=3"), "LIST" -> (0, "LIST")
def num(s):
    n = 0
    while s and s[0].isdigit(): n, s = n * 10 + int(s[0]), s[1:]
    return n, s.strip()

def stmt(s):
    ....

for line in sys.stdin:
    lineno, line = num(line)
    if lineno: code[lineno] = line.strip()
    else: stmt(line)

Numbered lines are stored in code, empty lines effectively “delete” code. Otherwise, lines are executed by the stmt() function, which is the core of our BASIC interpreter. Simple commands can be handled with one-liners in Python – NEW is code.clear(), BYE is sys.exit(0), REM is a no-op and LIST is:

print("\n".join([f"{n:3} {ln}" for (n, ln) in sorted(code.items()) if ln]))

Although we are targeting an extremely tiny BASIC subset, we still need to support GOTO. Our stmt() function will handle jumping between the lines by keeping track of the “current line number”. For simplicity, we’ll introduce a variable table (dict), and make line number a special variable named # (because it’s a number, plus there were a few BASIC dialects that used this notation).

Now we can complete the stmt() function:

def stmt(s):
    def do_if(s):
        n, ln = expr(s)
        if n:
            stmt(ln)

    while s != None:
        cmd = s.split(None, 1)
        vars["#"] += 1 if vars["#"] else 0
        ops = {
            "rem": lambda args: None,
            "new": lambda args: code.clear(),
            "bye": lambda args: sys.exit(0),
            "list": lambda args: print(
                "\n".join([f"{n:3} {ln}" for (n, ln) in sorted(code.items()) if ln])
            ),
            "print": lambda args: print(
                args[1:-1] if args and args[0] == '"' else expr(args)[0]
            ),
            "input": lambda args: vars.update({args[0]: int(input("] "))}),
            "goto": lambda args: vars.update({"#": expr(args)[0]}),
            "if": lambda args: do_if(args),
            "run": lambda args: vars.update({"#": 1}),
        }
        if cmd and cmd[0].lower() in ops:
            ops[cmd[0].lower()](cmd[1] if len(cmd) > 1 else "")
        elif s:
            assign = s.split("=", 1)
            vars[assign[0]], _ = expr(assign[1])
        if vars["#"] <= 0:
            break
        vars["#"], s = next(
            ((n, ln) for (n, ln) in sorted(code.items()) if n >= vars["#"]), (0, None)
        )

Our BASIC only supports 4 commands: PRINT for output, INPUT for reading numbers from console, GOTO for loops and jumps, IF for conditional execution. The rest are rather editing commands and provide an interactive shell: REM is a comment, NEW clears the code space, BYE exits back to the OS shell (although in most cases BASIC was the OS shell), LIST prints full code on display.

Out stmt() function however relies on another expr() function for evaluating math expressions, which can be implemented as a recursive descent parser:

vars = {"#": 0}

def expr(s):
    res, s = term(s); op = ""
    while s and s[0] in "+-":
        op = s[0]
        n, s = term(s[1:])
        res += n if op == "+" else -n
    return res, s

def term(s):
    res, s = factor(s)
    while s and s[0] in "*/":
        op = s[0]
        n, s = factor(s[1:])
        res *= n if op == "*" else 1 / n if n != 0 else 0
    return res, s

def factor(s):
    if s and s[0] == "(": res, s = expr(s[1:]); return res, s[1:]
    elif s and s[0].isdigit(): return num(s)
    else:
        i = 0
        while i < len(s) and s[i].isalnum(): i += 1
        return vars.get(s[:i], 0), s[i:]

Adding more operations isn’t that hard (to add <, >, = and # as a not-equal operator), but is left as an exercise to the reader.

Putting all the code pieces together - we get a BASIC interpreter/shell in 53 lines of code!

import sys

code, vars = {}, {"#": 0}

def num(s):
    n = 0
    while s and s[0].isdigit(): n, s = n * 10 + int(s[0]), s[1:]
    return n, s.strip()

def expr(s):
    res, s = term(s); op = ""
    while s and s[0] in "+-":
        op = s[0]
        n, s = term(s[1:])
        res += n if op == "+" else -n
    return res, s


def term(s):
    res, s = factor(s)
    while s and s[0] in "*/":
        op = s[0]
        n, s = factor(s[1:])
        res *= n if op == "*" else 1 / n if n != 0 else 0
    return res, s


def factor(s):
    if s and s[0] == "(": res, s = expr(s[1:]); return res, s[1:]
    elif s and s[0].isdigit(): return num(s)
    else:
        i = 0
        while i < len(s) and s[i].isalnum(): i += 1
        return vars.get(s[:i], 0), s[i:]


def stmt(s):
    def do_if(s):
        n, ln = expr(s)
        if n: stmt(ln)

    while s != None:
        cmd = s.split(None, 1)
        vars["#"] += 1 if vars["#"] else 0
        ops = {
            "rem": lambda args: None,
            "new": lambda args: code.clear(),
            "bye": lambda args: sys.exit(0),
            "list": lambda args: print("\n".join([f"{n:3} {ln}" for (n, ln) in sorted(code.items()) if ln])),
            "print": lambda args: print(args[1:-1] if args and args[0] == '"' else expr(args)[0]),
            "input": lambda args: vars.update({args[0]: int(input("] "))}),
            "goto": lambda args: vars.update({"#": expr(args)[0]}),
            "if": lambda args: do_if(args),
            "run": lambda args: vars.update({"#": 1}),
        }
        if cmd and cmd[0].lower() in ops: ops[cmd[0].lower()](cmd[1] if len(cmd) > 1 else "")
        elif s: assign = s.split("=", 1); vars[assign[0]], _ = expr(assign[1])
        if vars["#"] <= 0: break
        vars["#"], s = next(((n, ln) for (n, ln) in sorted(code.items()) if n >= vars["#"]), (0, None))


for line in sys.stdin:
    lineno, line = num(line)
    if lineno: code[lineno] = line.strip()
    else: stmt(line)

Despite its size, it is surprisingly powerful and easy to extend. You can add more commands by simply dropping lambdas into the ops dictionary or enhance the expr/factor/term logic for better arithmetic support. Adding loops (FOR .. TO .. NEXT) is also possible and would require keeping track of loop start addresses and counter boundaries. Adding GOSUB and RETURN is even simpler and requires storing a global call stack (to remember return addresses when entering a subroutine).

Can our BASIC calculate a factorial, though?

REM
REM Factorial program (fac.bas)
REM

NEW

10  PRINT "Enter N:"
20  INPUT N
30  F=1
50  IF N GOTO 70
60  GOTO 100
70  F=F*N
80  N=N-1
90  GOTO 50
100 PRINT F

LIST
PRINT ""
PRINT "Try 10!..."
RUN

Running python3 basic.py < fac.bas gives us the correct factorials, try it yourself!

While BASIC has been criticized for its lack of structured programming features, I still believe it’s a fantastic language for beginners. I’ve seen 10-year-olds struggle with Python, JavaScript and other “modern” languages, yet they light up when creating simple games in BASIC on retrocomputer emulators!

As an inspiration for exploring BASICs I’d suggest implementing a full TinyBASIC dialect, then add strings and arrays, maybe named functions and of course run a “STARTREK.BAS” on it! Or try writing a BASIC in Assembly to get the real taste of how the original BASICs were made.

In the next part we’ll move on to concatenative languages, stay tuned!

I hope you’ve enjoyed this article. You can follow – and contribute to – on Github, Mastodon, Twitter or subscribe via rss.

Sep 10, 2024

See also: Tiny Great Languages: Assembly and more.