Tiny Great Languages: Lisp

This is part 4 from series “Tiny Great Languages”.

Done with the concatenative language MOUSE, we can now turn our attention to another small and elegant language from way back: Lisp. Lisp is famous for its minimalist syntax (similar to Forth, the parser is almost nonexistent) and its clear, logical evaluation rules.

Created by John McCarthy in 1958, Lisp introduced ideas that forever changed how we think about programming. The language focused on recursion and symbolic computation, showing that with just a few operators and lambdas (anonymous functions), you can easily implement almost anything.

Interestingly, the original Lisp syntax wasn’t the parentheses-heavy version we know today. It started with M-expressions: car[cons[A,B]], which looks more like how modern languages call functions. However, S-expressions – a simpler and more uniform syntax – quickly took over. That’s the Lisp syntax we use today, with all those parentheses.

S-expressions fit beautifully with Lisp’s core idea that “code is data”. In Lisp, everything follows the same structure and can be processed in the same way. This flexibility is why macros in Lisp can manipulate code just like any other data.

To build a Lisp, all you need is a parser for S-expressions and an eval() function to evaluate the resulting syntax tree (or list of lists).

Lisp 1.5

There is a famous reference from the Lisp 1.5 manual (page 13, for the curious) shows Lisp implemented in itself:

apply[fn;x;a] =
     [atom[fn] -> [eq[fn;CAR] -> caar[x];
                  eq[fn;CDR] -> cdar[x];
                  eq[fn;CONS] -> cons[car[x];cadr[x]];
                  eq[fn;ATOM] -> atom[car[x]];
                  eq[fn;EQ] -> eq[car[x];cadr[x]];
                  T -> apply[eval[fn;a];x;a]];
     eq[car[fn];LAMBDA] -> eval[caddr[fn];pairlis[cadr[fn];x;a]];
     eq[car[fn];LABEL] -> apply[caddr[fn];x;cons[cons[cadr[fn];
                               caddr[fn]];a]]]

eval[e;a] = [atom[e] -> cdr[assoc[e;a]];
     atom[car[e]] ->
      [eq[car[e],QUOTE] -> cadr[e];
      eq[car[e];COND] -> evcon[cdr[e];a];
      T -> apply[car[e];evlis[cdr[e];a];a]];

evcon[c;a] = [eval[caar[c];a] -> eval[cadar[c];a];
      T -> evcon[cdr[c];a]]

evlis[m;a] = [null[m] -> NIL;
      T -> cons[eval[car[m];a];evlis[cdr[m];a]]]

It relies on a few basic primitives (CAR, CDR, CONS, ATOM, EQ) and leaves out all arithmetic operations (which we’ll need for calculating factorials). Still, this structure is exactly what we need to start implementing a basic Lisp interpreter.

For our implementation, we’ll use Python lists as the core data structure. While original Lisp used pairs to build lists, Python’s built-in list utilities make it much easier (and save us a lot of time). We’ll treat an empty list [] as nil and use the string “t” as our equivalent of True (it will be used as a quoted symbol 't).

Let’s start with the atom() primitive. In our Lisp, an atom is anything that isn’t a list (or is an empty list):

def atom(x): return "t" if type(x) != type([]) or len(x) == 0 else []

We don’t need to explicitly define “car” and “cdr” functions—Python lets us access the first element of a list with list[0] and the rest of the list with list[1:]. Similarly, we can use Python’s native == for equality checks instead of defining “eq”. For “cons”, we’ll use Python’s + operator to concatenate lists.

This gives us enough to define the core apply function:

def apply(f, args, L):
    if f == "atom": return atom(args[0])
    elif f == "car": return args[0][0]
    elif f == "cdr": return args[0][1:]
    elif f == "cons": return [args[0]] + args[1]
    elif f == "eq": return "t" if atom(args[0]) and args[0] == args[1] else []
    elif f == "+": return args[0] + args[1]
    elif f == "-": return args[0] - args[1]
    elif f == "*": return args[0] * args[1]
    elif f == "/": return args[0] / args[1]
    elif f[0] == "lambda": return eval(f[2], pairlis(f[1], args, L))
    else: return apply(eval(f, L), args, L)

Unlike eval that operates of special Lisp forms, apply merely executes functions. Atom checks if the argument is an atom, car returns the first element, cdr returns the rest, cons combines arguments into a list, eq checks for equality (two lists are never equal, only atoms can be compared like this). The goes arithmetics and finally we compute lambdas or fall back to a generic user function evaluation.

Lambdas in our Lisp have the following syntax:

((lambda (x y z) (+ (+ x y) z)) 1 2 3)  ; returns 6

The lambda takes a list of parameters and a body. When evaluating a lambda, we substitute the parameters with their actual arguments in the current environment (without evaluating them yet). This is how pairlis does the substitution – it prepends pairs of (argument + value) to the environment list L:

def pairlis(x, y, L): return L if not x else [[x[0], y[0]]] + pairlis(x[1:], y[1:], L)

pairlis([3], [4], [])
# [[3, 4]]

pairlis([1, 2, 3], [4, 5, 6], [[7, 8]])
# [[1, 4], [2, 5], [3, 6], [7, 8]]

Our environment would be holding variables (symbols) and their values in pairs and lookup would happen left-to-right. So if after pairlis a global variable is “shadowed” by the local parameter – it would be found first during the lookup.

Now we can implement eval and the rest of the interpreter:

def evlis(x, L): return [eval(x[0], L)] + evlis(x[1:], L) if x else []

def evcon(x, L): return [] if len(x) == 0 else eval(x[0][1], L) if eval(x[0][0], L) else evcon(x[1:], L)

def assoc(x, L): return [] if not L else L[0][1] if L[0][0] == x else assoc(x, L[1:])

def eval(x, L):
    if x == "nil": return []
    elif x == "t" or type(x) == int: return x
    elif type(x) == str: return assoc(x, L)
    elif x[0] == "quote": return x[1]
    elif x[0] == "cond": return evcon(x[1:], L)
    elif x[0] == "label": L.insert(0, [x[1], x[2]]); return x[1]
    else: return apply(x[0], evlis(x[1:], L), L)

Short and simple, evlis calls eval recursively for every element in the listand returns each computed value. This is needed to evaluate lambda parameters before calling it. Another helper is evcon, which looks up for a “key” in the list of pairs and evaluates the “value” part of it. This is required for cond form, which is analogous to switch/case in other languages. Last thing is assoc which retuns a value by the key without evaluating it. We need it to resolve symbols (variables) in the environments.

Putting it all together with a minimal parser - and we have a Lisp that can calculate factorials!

import sys
def lex(code): return code.replace("(", " ( ").replace(")", " ) ").split()
def parse(tokens):
    t = tokens.pop(0)
    if t == "(":
        sexp = []
        while tokens[0] != ")": sexp.append(parse(tokens))
        tokens.pop(0)
        return sexp
    try: return int(t)
    except: return t
def pairlis(x, y, L): return L if not x else [[x[0], y[0]]] + pairlis(x[1:], y[1:], L)
def assoc(x, L): return [] if not L else L[0][1] if L[0][0] == x else assoc(x, L[1:])
def atom(x): return "t" if type(x) != type([]) or len(x) == 0 else []
def apply(f, args, L):
    if f == "atom": return atom(args[0])
    elif f == "car": return args[0][0]
    elif f == "cdr": return args[0][1:]
    elif f == "cons": return [args[0]] + args[1]
    elif f == "eq": return "t" if atom(args[0]) and args[0] == args[1] else []
    elif f == "+": return args[0] + args[1]
    elif f == "-": return args[0] - args[1]
    elif f == "*": return args[0] * args[1]
    elif f == "/": return args[0] / args[1]
    elif f[0] == "lambda": return eval(f[2], pairlis(f[1], args, L))
    else: return apply(eval(f, L), args, L)
def evcon(x, L): return [] if len(x) == 0 else eval(x[0][1], L) if eval(x[0][0], L) else evcon(x[1:], L)
def evlis(x, L): return [eval(x[0], L)] + evlis(x[1:], L) if x else []
def eval(x, L):
    if x == "nil": return []
    elif x == "t" or type(x) == int: return x
    elif type(x) == str: return assoc(x, L)
    elif x[0] == "quote": return x[1]
    elif x[0] == "cond": return evcon(x[1:], L)
    elif x[0] == "label": L.insert(0, [x[1], x[2]]); return x[1]
    else: return apply(x[0], evlis(x[1:], L), L)

G = []
[print(eval(parse(lex(line.split(";", 1)[0])), G)) for line in sys.stdin if line.strip()]

Out test factorial program (due to parsing limitation all top-level S-expressions must be one-liners):

(label fac (lambda (x) (cond ((eq x 0) 1) (1 (* x (fac (- x 1)))))))
(fac 1)
(fac 2)
(fac 3)
(fac 4)
(fac 5)
(fac 6)
(fac 7)
(fac 8)
(fac 9)
(fac 10)

It calculates factorials for each number from 1 to 10, and does it correctly!

For inspiration you may try yourself in implementing a Lisp in another language that has no garbage collector (which is an important part of a Lisp machine). Or implement a proper subset of Scheme, there are good implementations such as TinyScheme or Bit Scheme to refer to.

In the next chapter we give further into the world of mathematics and look at the array processing languages, stay tuned!

I hope you’ve enjoyed this article. You can follow – and contribute to – on Github, Mastodon, Twitter or subscribe via rss.

Sep 12, 2024

See also: Tiny Great Languages: MOUSE and more.