Lisp as the Maxwell’s equations of software

On my first day of physics graduate school, the professor in my class on electromagnetism began by stepping to the board, and wordlessly writing four equations:

$\nabla \cdot E = \frac{\rho}{\epsilon_0} \hspace{2cm} \nabla \times E+\frac{\partial B}{\partial t} = 0$ $\nabla \cdot B = 0 \hspace{1.1cm} \nabla \times B-\mu_0 \epsilon_0 \frac{\partial E}{\partial t} = \mu_0 j$

He stepped back, turned around, and said something like [1]: “These are Maxwell’s equations. Just four compact equations. With a little work it’s easy to understand the basic elements of the equations – what all the symbols mean, how we can compute all the relevant quantities, and so on. But while it’s easy to understand the elements of the equations, understanding all their consequences is another matter. Inside these equations is all of electromagnetism – everything from antennas to motors to circuits. If you think you understand the consequences of these four equations, then you may leave the room now, and you can come back and ace the exam at the end of semester.”

Alan Kay has famously described Lisp as the “Maxwell’s equations of software”. He describes the revelation he experienced when, as a graduate student, he was studying the LISP 1.5 Programmer’s Manual and realized that “the half page of code on the bottom of page 13… was Lisp in itself. These were “Maxwell’s Equations of Software!” This is the whole world of programming in a few lines that I can put my hand over.”

Here’s the half page of code that Kay saw in that manual:

What we’re going to do in this essay is understand what that half page of code means, and what it means that Lisp is the Maxwell’s equations of software. However, we won’t literally work through the half page of code above. Instead, we’ll do something much more informative: we’ll create a modern, fully executable equivalent of the code above. Furthermore, to make this essay accessible, I won’t assume that you know Lisp. Instead, I’ll teach you the basic elements of Lisp.

That perhaps sounds over-ambitious, but the good news is that it’s easy to learn the basic elements of Lisp. Provided you have a little facility with computer programming and comfort with mathematics, you can learn how Lisp works in just a few minutes. Frankly, it’s much easier than understanding the elements of Maxwell’s equations! And so I’ll start by explaining a subset of the Lisp programming language, and getting you to write some Lisp code.

But I won’t stop with just showing you how to write some Lisp. Once we’ve done that we’re going to write an interpreter for Lisp code. In particular, we’ll create a interpreter based on a beautiful Lisp interpreter written by Peter Norvig, which contains just 90 lines of Python code. Our interpreter will be a little more complex, due mostly to the addition of a few conveniences absent from Norvig’s interpreter. The code is still simple and easy to understand, provided you’re comfortable reading Python code. As we’ll see, the benefit of writing the interpreter is not just that it gives us a running interpreter (although that’s no small thing). It’s that writing an interpreter also deepens our understanding of Lisp. It does that by taking what would otherwise be some rather abstract concepts in our description of Lisp, and giving them concrete, tangible representations in terms of Python code and data structures. By making concrete what was formerly abstract, the code for our Lisp interpreter gives us a new way of understanding how Lisp works.

With our Python Lisp interpreter up and running, we’ll then write a modern equivalent to the code on the bottom of page 13 of the LISP 1.5 Programmer’s Manual. But while our code will be essentially the same as the code from page 13, it will have the considerable advantage that it’s also executable. We can, if we wish, play with the code, modify it, and improve it. In other words, it’s a living version of the Maxwell’s equations of software! Furthermore, with our new understanding it becomes an easy and fun exercise to understand all the details on page 13 of the LISP Manual.

This second part of the essay is based primarily on two sources: the first chapter of the LISP 1.5 Manual, of course, but also an essay by Paul Graham (postscript) in which he explains some of the early ideas behind Lisp. Incidentally, “LISP” is the capitalization used in the LISP Manual, but otherwise I’ll use the modern capitalization convention, and write “Lisp”.

The great Norwegian mathematician Niels Henrik Abel was once asked how he had become so good at mathematics. He replied that it was “by studying the masters, not their pupils”. The current essay is motivated by Abel’s admonishment. As a programmer, I’m a beginner (and almost completely new to Lisp), and so this essay is a way for me to work in detail through ideas from masters such as Alan Kay, Peter Norvig, and Paul Graham. Of course, if one takes Abel at his word, then you should stop reading this essay, and instead go study the works of Kay, Norvig, Graham, and the like! I certainly recommend taking the time to study their work, and at the end of the essay I make some recommendations for further reading. However, I hope that this essay has a distinct enough point of view to be of interest in its own right. Above all, I hope the essay makes thinking about Lisp (and programming) fun, and that it raises some interesting fundamental questions. Of course, as a beginner, the essay may contain some misunderstandings or errors (perhaps significant ones), and I’d welcome corrections, pointers, and discussion.

Some elements of Lisp

In this and the next two sections we’ll learn the basic elements of Lisp – enough to develop our own executable version of what Alan Kay saw on page 13 of the LISP Manual. We’ll call the dialect of Lisp we develop “tiddly Lisp”, or just tiddlylisp for short. Tiddlylisp is based on a subset of the programming language Scheme, which is one of the most popular modern dialects of Lisp.

While I can show you a bunch of examples of Lisp, you’ll learn much more if you type in the examples yourself, and then play around with them, modifying them and trying your own ideas. So I want you to download the file tiddlylisp.py to your local machine. Alternately, if you’re using git, you can just clone the entire code repository associated to this essay. The file tiddlylisp.py is the Lisp interpreter whose design and code we’ll work through later in the essay. On Linux and Mac you can start the tiddlylisp interpreter by typing python tiddlylisp.py from the command line. You should see a prompt:

tiddlylisp>

That’s where you’re going to type in the code from the examples – it’s an interactive Lisp interpreter. You can exit the interpreter at any time by hitting Ctrl-C. Note that the interpreter isn’t terribly complete – as we’ll see, it’s only 153 lines! – and one way it’s incomplete is that the error messages aren’t very informative. Don’t spend a lot of time worrying about the errors, just try again.

If you’re on Windows, and don’t already have Python installed, you’ll need to download it (I suggest Python 2.7 for this code), before starting the interpreter by running python tiddlylisp.py. Note that I’ve only tested the interpreter on Ubuntu Linux, not on Windows or the Mac.

As our first example, type the following code into the tiddlylisp interpreter:

tiddlylisp> (+ 2 3)
5

In the expression you typed in, + is a built-in procedure, which represents the addition operation. It’s being applied to two arguments, in this case the numbers 2 and 3. The interpreter evaluates the result of applying + to 2 and 3, i.e., of adding 2 and 3 together, returning the value 5, which it then prints.

That first example was extremely simple, but it contains many of the concepts we need to understand Lisp: expressions, procedures, arguments, evaluation, returning a value, and printing. We’ll see many more illustrations of these ideas below.

Here’s a second example, illustrating another built-in procedure, this time the multiplication procedure, *:

tiddlylisp> (* 3 4)
12

It’s the same basic story: * is a built-in procedure, this time representing multiplication, and is here being applied to the numbers 3 and 4. The interpreter evaluates the expression, and prints the value returned, which is 12.

One potentially confusing thing about the first two examples is that I’ve called + and * “procedures”, yet in many programming languages procedures don’t return a value, only functions do. I’m using this terminology because it’s the standard terminology in the programming language Scheme, which is the dialect of Lisp that tiddlylisp is based on. In fact, in some modern dialects of Lisp – such as Common Lisp – operations such as + and * would be called “functions”. But we’ll stick with Scheme’s useage, and talk only of procedures.

Another example, illustrating a slightly different type of procedure:

tiddlylisp> (< 10 20)
True

Here, the built-in procedure < represents the comparison operator. And so tiddlylisp prints the result of evaluating an expression which compares the constants 10 and 20. The result is True, since 10 is less than 20. By contrast, we have

tiddlylisp> (< 17 12)
False

since 17 is not less than 12.

Many Lisp beginners are initially confused by this way of writing basic numerical operations. We're so familiar with expressions such as 2 + 3 that the changed order in (+ 2 3) appears strange and unfamiliar. And yet our preference for the infix notation 2 + 3 over the prefix notation (+ 2 3) is more a historical accident than a consequence of anything fundamental about arithmetic. Unfortunately, this has the consequence that some people back away from Lisp, simply because they dislike thinking in this unfamiliar way.

In this essay we won't get deep enough into Lisp to see in a lot of concrete detail why the prefix notation is a good idea. However, we will get one hint: our tiddlylisp interpreter will be much simpler for using the same (prefix) style for all procedures. If you really strongly dislike the prefix notation, then I challenge you to rewrite tiddlylisp so that it uses infix notation for some operations, and prefix notation for others. What you'll find is that the interpreter becomes quite a bit more complex. And so there is a sense in which using the same prefix style everywhere makes Lisp simpler.

Another useful thing Lisp allows us to do is to nest expressions:

tiddlylisp> (< (* 11 19) (* 9 23))
False

To compute the value of this expression, tiddlylisp evaluates the nested expressions, returning 209 from (* 11 19), and 207 from (* 9 23). The output from the outer expression is then just the result of evaluating (< 209 207), which of course is False.

We can use tiddlylisp to define variables:

tiddlylisp> (define x 3)
tiddlylisp> (* x 2)
6

define is used to define new variables; we can also assign a value to a variable which has previously been defined:

tiddlylisp> (set! x 4)
tiddlylisp> x
4

One question you may have here is about the slightly unusual syntax: set!. You might wonder whether the exclamation mark means something special - maybe set! is some type of hybrid procedure or something. Actually, there's no such complexity: set! is just a single keyword, just like define, nothing special about it at all. It's just that tiddlylisp allows exclamation marks in keyword names. So there's nothing special going on indicated by the exclamation mark.

A deeper question is why we don't simply eleminate define, and make it so set! checks whether a variable has been defined already, and if not, does so. I'll explain a little bit later why we don't do this; for now, you should just note the distinction.

We can use define to define new procedures in a way similar to how we use it to define variables. Here's an example showing how to define a procedure named square. I'll unpack what's going on below, but first, here's the code,

tiddlylisp> (define square (lambda (x) (* x x)))
tiddlylisp> (square 7)
49

Ignore the first of the lines above for now. You can see from the second line what the procedure square does: it takes a single number as input, and returns the square of the number.

What about the first line of code above? We already know enough to guess quite a bit about how this line works: a procedure named square is being defined, and is assigned the value of the expression (lambda (x) (* x x)), whatever that value might be. What's new, and what we need to understand, is what the value of the expression (lambda (x) (* x x)) is. To understand this, let's break the expression into three pieces: lambda, (x), and (* x x). We'll understand the three pieces separately, and then put them back together.

The first piece of the expression - the lambda - simply tells the tiddlylisp interpreter that this expression is defining a procedure. I must admit that when I first encountered the lambda notation I found this pretty confusing [2] - I thought that lambda must be a variable, or an argument, or something like that. But no, it's just a big fat red flag to the tiddlylisp interpreter saying "Hey, this is a procedure definition". That's all lambda is.

The second part of the expression - the (x) - tells tiddlylisp that this is a procedure with a single argument, and that we're going to use the temporary name x for that argument, for the purposes of defining the procedure. If the procedure definition had started instead with (lambda (x y) ...) that would have meant that the procedure had two arguments, temporarily labelled x and y for the purposes of defining the procedure.

The third part of the expression - the (* x x) - is the meat of the procedure definition. It's what we evaluate and return when the procedure is called, with the actual values for the arguments of the procedure substituted in place of x.

Taking it all together, then, the value of the expresion (lambda (x) (* x x)) is just a procedure with a single input, and returning the square of that input. This procedure is anonymous, i.e., it doesn't have a name. But we can give it a name by using define, and so the line

tiddlylisp> (define square (lambda (x) (* x x)))

tells tiddlylisp to define something called square, whose value is a procedure (because of the lambda) with a single argument (because of the (x)), and what that procedure returns is the square of its argument (because of the (* x x)).

An important point about the variables used in defining procedures is that they're dummy variables. Suppose we wanted to define a procedure area which would return the area of a triangle, with arguments the base of the triangle and the height. We could do this using the following procedure definition:

tiddlylisp> (define area (lambda (b h) (* 0.5 (* b h))))
tiddlylisp> (area 3 5)
7.5

But what would have happened if we'd earlier defined a variable h, e.g.:

tiddlylisp> (define h 11)
tiddlylisp> (define area (lambda (b h) (* 0.5 (* b h))))

It probably won't surprise you to learn that inside the procedure definition, i.e., immediately after the lambda (b h), the h is treated as a dummy variable, and is entirely different from the h outside the procedure definition. It has what's called a different scope. And so we can continue the above with the following

tiddlylisp> (area 3 h)
16.5

that is, the area is just 0.5 times 3 times the value of h set earlier, outside the procedure definition. At that point the value of h was 11, and so (area 3 h) returns 16.5.

There's a variation on the above that you might wonder about, which is what happens when you use variables defined outside a procedure inside that procedure definition? For instance, suppose we have:

tiddlylisp> (define x 3)
tiddlylisp> (define foo (lambda (y) (* x y)))

What happens now if we evaluate foo? Well, tiddlylisp does the sensible thing, and interprets x as it was defined outside the procedure definition, so we have:

tiddlylisp> (foo 4)
12

What happens to our procedure if we next change the value of x? In fact, this changes foo:

tiddlylisp> (set! x 5)
tiddlylisp> (foo 4)
20

In other words, in the procedure definition lambda (y) (* x y) the x really does refer to the variable x, and not to the particular value x might have at any given point in time.

Let's dig down a bit more into how tiddlylisp handles scope and dummy variables. Let's look at what happens when we define a variable:

tiddlylisp> (define x 5)
tiddlylisp> x
5

The way the interpreter handles this internally is that it maintains what's called an environment: a dictionary whose keys are the variable names, and whose values are the corresponding variable values. So what the interpeter does when it sees the first line above is add a new key to the environment, "x", with value 5. When the interpreter sees the second line, it consults the environment, looks up the key x, and returns the corresponding value. You can, if you like, think of the environment as the interpreter's memory or data store, in which it stores all the details of the variables defined to date.

All this is pretty simple. We can go along defining and changing variables, and the interpreter just keeps consulting and modifying the environment as necessary. But when you define a new procedure using lambda the interpreter treats the variables used in the definition slightly differently. Let's look at an example:

tiddlylisp> (define h 5)
tiddlylisp> (define area (lambda (b) (* b h)))
tiddlylisp> (area 2)
10

What I want to concentrate on here is the procedure definition (lambda (b) (* b h)). Up to this point the interpreter had been chugging along, modifying its environment as appropriate. What happens when it sees the (lambda ...), though, is that the interpreter creates a new environment, an environment called an inner environment, as distinguished from the outer environment, which is what the interpreter has been operating in until it reached the (lambda ...) statement. The inner environment is a new dictionary, whose keys initially are just the arguments to the procedure - in this case a single key, b - and whose values will be supplied when the procedure is called.

To recap, what the interpreter does when it sees (lambda (b) (* b h)) is create a new inner environment, with a key b whose value will be set when the procedure is called. When evaluating the expression (* b h) (which defines the result returned from the procedure) what the interpreter does is first consult the inner environment, where it finds the key b, but not the key h. When it fails to find h, it looks instead for h in the outer environment, where the key h has indeed been defined, and retrieves the appropriate value.

I've described a simple example showing how environments work, but tiddlylisp also allows us to have procedure definitions nested inside procedure definitions, nested inside procedure definitions (and so on). To deal with this, the top-level tiddlylisp interpreter operates inside a global environment, and each procedure definition creates a new inner environment, perhaps nested inside a previously created inner environment, if that's appropriate.

If all this talk about inner and outer environments has left you confused, fear not. At this stage it really is important to have gotten the gist of how environments work, but you shouldn't worry if the details still seem elusive. In my opinion, the best way to understand those details is not through abstract discussion, but instead by looking at the working code for tiddlylisp. We'll get to that shortly, but for now can move onwards armed with a general impression of how environments work.

The procedure definitions I've described so far evaluate and return the value from just a single expression. We can use such expressions to achieve suprisingly complex things, because of the ability to nest expressions. Still, it would be convenient to have a way of chaining together expressions that doesn't involve nesting. A way of doing this is to use the begin keyword. begin is especially helpful in defining complex procedures, and so I'll give an example in that context:

tiddlylisp> (define area (lambda (r) (begin (define pi 3.14) (* pi (* r r)))))

This line is defining a procedure called area, with a single argument r, and where the value of (area r) is just the value of the expression

(begin (define pi 3.14) (* pi (* r r)))

with the appropriate value for r substituted. The way tiddlylisp evaluates the begin expression above is that it evaluates all the separate sub-expressions, consecutively in the order they appear, and then returns the value of the final sub-expression, in this case the sub-expression (* pi (* r r)). So, for example, we get

tiddlylisp> (area 3)
28.26

which is just the area of a circle with radius 3.

Now, in a simple example such as this you might argue that it makes more sense to avoid defining pi, and just put 3.14 straight into the later expression. However, doing so will make the intent of the code less clear, and I trust you will find it easy to imagine more complex cases where it makes even more sense to use multi-part begin expressions. This is especially the case since it is permissible to split Lisp expressions over multiple lines, so the above procedure definition could have been written:

(define area (lambda (r)
  (begin (define pi 3.14)
         (* pi (* r r)))))

Now, I should mention that if you enter the above into the tiddlylisp interpreter, you'll get errors. This is because the tiddlylisp interpreter is so stripped down that it doesn't allow multi-line inputs. However, tiddlylisp will allow you to load multi-line expressions from a file - try saving the above three lines in a file named area.tl, and then running that file with python tiddlylisp.py area.tl. Tiddlylisp will execute the code, defining the area procedure, and then leave you in the interpreter, where you can type things like:

tiddlylisp> (area 3)
28.26

A second advantage of using begin in the above code is that the variable pi is only defined in the inner environment associated to the lambda expression. If, for some reason, you want to define pi differently outside that scope, you can do so, and it won't be affected by the definition of area. Consider, for example, the following sequence of expressions:

tiddlylisp> (define pi 3)
tiddlylisp> pi
3
tiddlylisp> (define area (lambda (r) (begin (define pi 3.14) (* pi (* r r)))))
tiddlylisp> (area 1)
3.14
tiddlylisp> pi
3

In other words, the value of the final pi is returned from the outer (global) environment, not the inner environment created during the definition of the procedure area.

Earlier, we discussed define and set!, and wondered whether there was really any essential difference. Consider now the following example, where we modify area by using set! instead of define:

tiddlylisp> (define pi 3)
tiddlylisp> pi
3
tiddlylisp> (define area (lambda (r) (begin (set! pi 3.14) (* pi (* r r)))))
tiddlylisp> (area 1)
3.14
tiddlylisp> pi
3.14

As you can see by comparing the final line of this example to our earlier example, there really is a significant difference between define and set!. In particular, when we define area in this example, what set! does is check to see whether the inner environment contains a variable named pi. Since it doesn't, it checks the outer environment, where it does find such a variable, and that's what set! updates. If we'd used define instead it would have created a completely new variable named pi in the inner environment, while leaving pi in the outer environment untouched, so the final line would have returned 3 instead. So having both define and set! gives us quite a bit of flexibility and control over which environment is being used, at the expense of a complication in syntax.

As our final piece of Lisp before putting what we've learnt together to construct a nontrivial example, tiddlylisp includes a keyword if that can be used to test conditions, and conditionally return the value of different expressions. Here's an example showing the use of if to evaluate the absolute value of a variable:

tiddlylisp> (define x -2)
tiddlylisp> (if (> x 0) x (- 0 x))
2

We can use the same idea to define a procedure abs which returns the absolute value of a number:

tiddlylisp> (define abs (lambda (x) (if (> x 0) x (- 0 x))))
tiddlylisp> (abs -2)
2

The general form for if expressions is (if cond result alt), where we evaluate the expression cond, and if the value is True, return the value of result, otherwise return the value of alt. Note that in the expression (if cond result alt), the cond, result and alt of course shouldn't be read literally. Rather, they are placeholders for other expressions, as we saw in the absolute value example above. Through the remainder of this essay I will use use italics to indicate such placeholder expressions.

Let me conclude this section by briefly introducing two important Lisp concepts. The first is the concept of a special form. In this section we discussed several built-in procedures, such as + and *, and also some user-defined procedures. Calls to such procedures always have the form (proc exp1 exp2...), where proc is the procedure name, and the other arguments are expressions. Such expressions are evaluated by evaluating each individual expression exp1, exp2..., and then passing those values to the procedure. However, not all Lisp expressions are procedure calls. Consider the expression (define pi 3.14). This isn't evaluated by calling some procedure define with arguments given by the value of pi and the value of 3.14. It can't be evaluated this way because pi doesn't have a value yet! So define isn't a procedure. Instead, define is an example of what is known as a special form. Other examples of special forms include lambda, begin, and if, and a few more which we'll meet later. Like define, none of these is a procedure, but instead each special forms has its own special rule for evaluation.

The second concept I want to briefly introduce is that of a list. The list is one of the basic data structures used in Lisp, and even gives the language its name - Lisp is short for "list processing". In fact, most of the Lisp expressions we've seen up to now are lists: an expression such as (abs 2) is a two-element list, with elements abs and 2, delimited by spaces and parentheses. A more complex expression such as (define abs (lambda (x) (if (> x 0) x (- 0 x)))) is also a list, in this case with the first two elements being define and abs. The third element, (lambda (x) (if (> x 0) x (- 0 x))), is a sublist which in turn has sublists (and subsublists) of its own. Later we'll see how to use Lisp to do manipulations with such lists.

A nontrivial example: square roots using only elementary arithmetic

Let's put together the ideas above to do something nontrivial. We'll write a short tiddlylisp program to compute square roots, using only the elementary arithmetical operations (addition, subtraction, multiplication, and division). The idea behind the program is to use Newton's method. Although Newton's method is interesting in its own right, I'm not including this example because of its algorithmic elegance. Instead, I'm including it as a simple and beautiful example of a Lisp program. The example comes from Abelson and Sussman's book on the Structure and Interpretation of Computer Programs.

Here's how Newton's method for computing square roots works. Suppose we have a (positive) number x whose square root we wish to compute. Then we start by making a guess at the square root, guess. We're going to start by arbitrarily choosing 1.0 as our initial value for guess; in principle, any positive number will do. According to Newton's method, we'll get an improved guess by computing (guess+x/guess)/2, i.e., by taking the average of guess and x/guess. If we repeat this averaging process enough times, we'll converge to a good estimate of the square root.

Let's express these ideas in Lisp code. We'll do it from the top down, starting at a high level, and gradually filling in the details of all the procedures we need. We start at the absolute top level,

(define sqrt (lambda (x) (sqrt-iter 1.0 x)))

Here, (sqrt-iter guess x) is the value of a to-be-defined procedure sqrt-iter that takes a guess guess at the square root of x, and keeps improving that guess over and over until it's a good enough estimate of the square root. As I mentioned above, we start by arbitrarily choosing guess = 1.0.

The core of the algorithm is the definition of sqrt-iter:

(define sqrt-iter 
    (lambda (guess x) 
      (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))))

What sqrt-iter does is takes the guess and checks to see whether it's yet a good-enough? approximation to the square root of x, in a sense to be made precise below. If it is, then sqrt-iter is done, and returns guess, otherwise it applies sqrt-iter to the improved guess (improve guess x) supplied by applying Newton's method.

Incidentally, the way I've written sqrt-iter above it's another example of a multi-line tiddlylisp expression, and so it's not possible to enter in the tiddlylisp interpreter. However, you can enter it as part of a file, sqrt.tl, along with the definition of sqrt, and other procedures which we'll define below. We'll later use tiddlylisp to execute the file sqrt.tl.

To fill out the details of the above, we need to understand how good-enough? and improve work. Let's start with good-enough?, which simply checks whether the absolute value of the square of guess is within 0.00001 of x:

 
(define good-enough? 
    (lambda (guess x) (< (abs (- x (square guess))) 0.00001)))

Here, we have defined the absolute value procedure abs and the squaring procedure square as:

 
(define abs (lambda (x) (if (< 0 x) x (- 0 x))))
(define square (lambda (x) (* x x)))

That's all the code we need for good-enough?. For improve we simply implement Newton's method,

(define improve (lambda (guess x) (average guess (/ x guess))))

where average is defined in the obvious way:

(define average (lambda (x y) (* 0.5 (+ x y))))

We can write out the whole program as follows:

(define average (lambda (x y) (* 0.5 (+ x y))))
(define improve (lambda (guess x) (average guess (/ x guess))))
(define square (lambda (x) (* x x)))
(define abs (lambda (x) (if (< 0 x) x (- 0 x))))
(define good-enough? 
    (lambda (guess x) (< (abs (- x (square guess))) 0.00001)))
(define sqrt-iter 
    (lambda (guess x) 
      (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))))
(define sqrt (lambda (x) (sqrt-iter 1.0 x)))

Save these lines to the file sqrt.tl, and then run it using python tiddlylisp.py sqrt.tl. Tiddlylisp will execute the code, defining all the above procedures, and then leave you in an interpreter, where you can type:

tiddlylisp> (sqrt 2.0)
1.41421568627

This is, indeed, a pretty good approximation to the true square root: 1.4142135....

In writing out the overall program sqrt.tl, I've reversed the order of the lines, compared to my initial explanation. The reason I've done this is worth discussing. You see, the program sqrt.tl was actually the first piece of Lisp code I ever worked though in detail. The natural way to think through the problem of computing square roots with Newton's method is in the order I explained, working in a top-down fashion, starting from the broad problem, and gradually breaking our attack on that problem down into parts. But the first time I wrote the code out I assumed that I needed to explain these ideas to Lisp in a bottom-up fashion, defining procedures like average before improve and so on, so that procedure definitions only include previously defined procedures. That meant reversing the order of the code, as I've shown above.

Taking this approach bugged me. It doesn't seem like the most natural way to think about implementing Newton's method. At least for this problem a top-down approach seems more natural, and if you were just doing exploratory programming you'd start by defining sqrt, then sqrt-iter, and so on. As an experiment, I decided to reverse the order of the progam, so it reflects the natural "thinking order":

(define sqrt (lambda (x) (sqrt-iter 1.0 x)))
(define sqrt-iter 
    (lambda (guess x) 
      (if (good-enough? guess x) guess (sqrt-iter (improve guess x) x))))
(define good-enough? 
    (lambda (guess x) (< (abs (- x (square guess))) 0.00001)))
(define abs (lambda (x) (if (< 0 x) x (- 0 x))))
(define square (lambda (x) (* x x)))
(define improve (lambda (guess x) (average guess (/ x guess))))
(define average (lambda (x y) (* 0.5 (+ x y))))

Somewhat to my surprise, this ran just fine ! (Actually, my code was slightly different, since I was using Common Lisp at that point, not tiddlylisp, but tiddlylisp works as well.) It's apparently okay to define a procedure in terms of some other procedure which isn't defined until later. Now, upon close examination of the code for tiddlylisp it turns out that this makes perfect sense. But it came as a surprise to me. Later, when we examine the code for tiddlylisp, I'll set a problem asking you to explain why it's okay to re-order the code above.

I think this program for the square root is a very beautiful program. Of course, in most programming languages the square root is built in, and so it's not exactly notable to have an elegant program for it! Indeed, the square root is built in to most version of Lisp, and it's trivially possible to add the square root to tiddlylisp. But what I like about the above is that it's such a simple and natural expression of Newton's method. As Peter Deutsch has said, Lisp programs come close to being "executable mathematics".

Problems for the author

As a general point about programming language design it seems like it would often be helpful to be able to define procedures in terms of other procedures which have not yet been defined. Which languages make this possible, and which do not? What advantages does it bring for a programming language to be able to do this? Are there any disadvantages?

A few more elements of Lisp

I began my introduction to Lisp by focusing on elementary arithmetical operations, and how to put them together. I did that so I could give you a concrete example of Lisp in action - the square root example in the last section - which is a good way of getting a feel for how Lisp works. But if we're going to understand Lisp as "the Maxwell's equations of software" then we also need to understand more about how Lisp deals with expressions and with lists. I'll describe those operations in this section.

Before I describe those operations, I want to discuss a distinction that I haven't drawn much attention to until now. That's the distinction between an expression such as (+ 1 2) and the value of that expression, which in this case is 3. When we feed a (valid) tiddlylisp expression to the tiddlylisp interpreter, it evaluates the expression, and returns its value.

When I was first learning Lisp I'd often get myself in trouble because I failed to keep myself clear on the distinction between an expression and its value. Let me give you an example of the kind of confusion I'd get myself into. There is a built-in Lisp procedure called atom?, defined so that (atom? exp) returns True if the value of the expression exp is atomic - a number, for example, or anything which isn't a list - and returns False otherwise. (Note that exp is a placeholder expression, as we discussed earlier, and shouldn't be read literally as the identifier exp.) Now, I'd look at an example like (atom? 5) and have no trouble figuring out that it evaluated to True. But where I'd get into trouble is with an expression like (atom? (+ 1 2)). I'd look at it and think it must return False, because the expression (+ 1 2) is not atomic, it's a list. Unfortunately, while it's true that the expression (+ 1 2) is not atomic, it's irrelevant, because the value of (atom? exp) is determined by whether the value of exp is atomic - which it is, since the value of (+ 1 2)) is 3. You'll be a much happier Lisper if you always keep very clear on the distinction between expressions and the value of expressions.

Now, of course, this distinction between expressions and their values appears in most other programming languages. For this reason, I felt foolish when I understood what was causing my confusion. But what makes it so easy to make this kind of mistake is that in Lisp both expressions and the value of those expressions can be lists. And that's why when I write something like (atom? (+ 1 2)) there's a real question: is atom? checking whether the expression (+ 1 2) is atomic (no, it's not, it's a list), or whether the value of the expression - 3 - is atomic (yes, it is). So you need to be pretty careful to keep clear on which is meant. To help with this, I'll be pretty pedantic about the distinction in what follows, writing out "the value of exp" explicitly, to distinguish the value from the expression itself. Note, however, that it's quite common for people to be less pedantic and more informal, so that something like (+ x 2) may well be described as "adding the variable x to the variable y", not the more explicit "adding the value of the variable x to the value of the variable y."

Alright, with that admonishment out of the way, let's turn our attention to defining the final set of elementary operations we'll need in tiddlylisp.

Returning an expression as a value: (quote exp) returns the expression exp as its value, without evaluating exp. This is most clearly demonstrated by an example,

tiddlylisp> (quote (+ 1 2))
(+ 1 2)

as opposed to the value of (+ 1 2), which of course is 3. Another example, just to reinforce the point that quote returns an expression literally, without evaluating it,

tiddlylisp> (define x 3)
tiddlylisp> (quote x)
x

not the value of x, which of course is 3. Note also that quoting doesn't evaluate nested subexpressions, e.g.,

tiddlylisp> (quote ((+ 1 2) 3))
((+ 1 2) 3)

quote can also be used to return lists which aren't valid Lisp expressions at all, e.g.,

tiddlylisp> (quote (1 2 3))
(1 2 3)

It's worth emphasizing that (1 2 3) isn't a valid Lisp expression, since 1 is neither a procedure nor a special form - if you enter (1 2 3) into the tiddlylisp interpreter, it will give an error. But what quote lets us do is use the list (1 2 3) as data. For example, we could use it to store the (1 2 3) in a variable,

tiddlylisp> (define x (quote (1 2 3)))
tiddlylisp> x
(1 2 3)

In this way, quote lets us work with lists as data structures.

Why does Lisp have quote? Most other computer programming languages don't have anything like it. The reason Lisp has it is because Lisp allows you to treat code as data, and data as code. So, for example, an object such as (+ 1 2) can be treated as code, i.e., as a Lisp expression to be evaluated, which is what we've been doing up until the current discussion. But (+1 2) can also potentially be treated as data, i.e., as a list of three objects. This ability to treat code and data on the same footing is a wonderful thing, because it means you can write programs to manipulate programs. But it also creates a problem, which is that we need to be able to distinguish when an expression should be treated as data, and when it should be treated as code. quote is a way of distinguishing between the two. It's similar to the way most languages use escape characters to deal with special strings such as ". quote is a way of saying "Hey, what follows should be treated as data, not evaluated as Lisp code". And so quote lets us escape Lisp code, so we can take an expression such as (+ 1 2) and turn it into data: (quote (+ 1 2). In this way, quote allows us to define Lisp expressions whose values are arbitrary Lisp expressions.

Up until now we've been focused on using Lisp to do simple arithmetic operations, and as a result we haven't needed quote. But when we get to Lisp-as-Maxwell's-equations we're going to be increasingly focused on using Lisp to manipulate Lisp code, and as a result we'll be making frequent use of quote. For this reason it's helpful to introduce shorthand for quote. In most Lisps, the conventional shorthand is to use 'exp to denote (quote exp), i.e. if exp is some Lisp expression, then the value of 'exp is just the expression exp itself. Unfortunately, using the shorthand 'exp complicated the parsing in the tiddlylisp interpreter more than I wanted. So I decided to instead use a different (and, I emphasize, unconventional) shorthand for (quote exp), namely (q exp). So our examples of quote above can be shortened to,

tiddlylisp> (q (+ 1 2))
(+ 1 2)
tiddlylisp> (define x 3)
tiddlylisp> (q x)
x
tiddlylisp> (q ((+ 1 2) 3))
((+ 1 2) 3)

Testing whether the value of an expression is atomic: As I noted above, (atom? exp) returns True if the value of the expression exp is atomic, and otherwise returns False. We've already discussed the following example,

tiddlylisp> (atom? (+ 1 2))
True

but it's illuminating to see it in tandem with the following example, which illustrates also the use of quote,

tiddlylisp> (atom? (q (+ 1 2)))
False

As above, the first example returns True because while (+ 1 2) is not atomic, its value, 3, is. But the second example returns False because the value of (q (+ 1 2)) is just (+ 1 2), which is a list, and thus not atomic.

By the way, I should mention that atom? is not a built-in procedure in the dialect of Lisp that tiddlylisp based on, Scheme. I've built atom? into tiddlylisp because, as we'll see, the analogous operation is used many times in the code on page 13 of the LISP 1.5 Programmer's Manual. Of course, an atom? procedure is easily defined in Scheme, but for the purposes of understanding the code on page 13 it seemed most direct to simply include atom? as a built-in in tiddlylisp.

Testing whether two expressions both evaluate to the same atom (or the empty list): (eq? exp1 exp2) returns True if the values of exp1 and exp2 are both the same atom, or both the empty list, (). (eq? exp1 exp2) returns False otherwise. Note that if exp1 and exp2 have the same value, but are not atomic or the empty list, then (eq? exp1 exp2) returns False. For example,

tiddlylisp> (eq? 2 (+ 1 1))
True
tiddlylisp> (eq? 3 (+ 1 1))
False
tiddlylisp> (eq? (q (1 2)) (q (1 2)))
False

As for atom? my explanation of eq? is not quite the same as in standard Scheme, but instead more closely matches the function eq defined in the LISP 1.5 Programmer's Manual.

Getting the first item of a list: (car exp) returns the first element of the value of exp, provided the value of exp is a list. Otherwise (car exp) is undefined. For example,

tiddlylisp> (car (q (+ 2 3)))
+
tiddlylisp> (car (+ 2 3))

The first of these two behaves as expected: the value of (q (+ 2 3) is just the list (+ 2 3), and so car returns the first element, which is +. In the second, though, car is not defined, and tiddlylisp returns an error message, which I've elided. The reason it returns an error message is because the value of (+ 2 3) is 5, which is not a list, and so car is undefined. In a similar way, suppose we try

tiddlylisp> (car (1 2 3))

Again, we get an error message. The reason is that car is being applied to the value of (1 2 3), considered as a Lisp expression. And, of course, that value is undefined, since 1 is neither a procedure nor a special form. The right way to do the above is to quote the list first,

tiddlylisp> (car (q (1 2 3)))
1

Once again, we see how quote is used to make it clear to tiddlylisp that we're dealing with data, not code to be evaluated.

Getting the rest of a list: (cdr exp) returns a list containing all but the first element of the value of exp. Of course, the value of exp must be a list, otherwise (cdr exp) is undefined. For example,

tiddlylisp> (cdr (q (1 2 3)))
(2 3)

According to Wikipedia the names car and cdr have their origin in some pretty esoteric facts about the early implementations of Lisp. The details don't much matter here - car stands for "contents of address part of register", while cdr stands for "contents of decrement part of register" - I just wanted to make the point that you could reasonably be wondering "Where on Earth did those names come from?!" A mnemonic I find useful in distinguishing the two is to focus on the difference between the names of the two procedures, which of course is just the middle letter - a or d - and to keep in mind that a comes before d in the alphabet, just as car extracts the element of a list that comes before the remainder of the list, as given to us by cdr. Your taste in mnemonics may vary - if you don't like mine, it's still worth taking a minute or two to come up with some trick for remembering and distinguishing car and cdr. Of course, after a little practice you get used to them, and you won't need the mnemonic any more, but at first it's helpful.

Appending an item at the start of a list: Provided the value of exp2 is a list, then (cons exp1 exp2) returns a list containing the value of exp1 as its first element, followed by all the elements of the value of exp2. For example,

tiddlylisp> (cons 1 (q (2 3)))
(1 2 3)

Testing whether an expression evaluates to the empty list: (null? exp) returns True if exp evaluates to the empty list, (), and otherwise returns False. For example,

tiddlylisp> (null? (cdr (q (1))))
True

since (cdr (q (1))) returns the empty list.

Conditionals: (cond (p1 e1)...(pn en)): This starts by evaluating the expression p1. If p1 evaluates to True, then evaluate the expression e1, and return that value. If not, evaluate p2, and if it is True, return the value of e2, and so on. If none of the pj evaluates to True, then the (cond ...) expression is undefined.

That's all the Lisp we're going to need to write our version of Lisp-as-Maxwell's-equations, i.e., our version of the code on page 13 of the LISP Manual! In fact, as we'll see shortly, it's more than we need - I included a few extra features so that we could work through examples like sqrt, and also simply for fun, to make tiddlylisp a richer and more interesting language to play with. Of course, what I've described is merely a subset (with some variations) of our chosen dialect of Lisp (Scheme), and there are important missing ideas. To learn more about Lisp or Scheme, please consult the suggestions for further reading at the end of this essay.

An interpreter for Lisp

Now that we've worked through the basic elements of Lisp, let's write a simple Lisp interpreter, using Python. The interpreter we'll write is based on Peter Norvig's lispy interpreter, and I highly recommend you read through his explanation. I've given the program we'll discuss a separate name - tiddlylisp - so as to make it easy to refer to separately from Norvig's interpreter, but please keep in mind that most of the code we're discussing is Norvig's. However, we're going to examine the code from a different angle than Norvig: we're going to take a bit more of a bottom-up computer's-eye view, looking at how the code executes.

We'll look at the code piece by piece, before putting it all together. Let's start with the interactive interpreter, which is run when you start up. We implement this with a Python procedure called repl, meaning to read some input, evaluate the expression, print the result of the evaluation, and then loop back to the beginning. This is also know as the read-eval-print loop, or REPL. Here's the repl procedure, together with a procedure for handling errors when they occur:

import traceback

def repl(prompt='tiddlylisp> '):
    "A prompt-read-eval-print loop."
    while True:
        try:
            val = eval(parse(raw_input(prompt)))
            if val is not None: print to_string(val)
        except KeyboardInterrupt:
            print "\nExiting tiddlylisp\n"
            sys.exit()
        except:
            handle_error()

def handle_error():
    """
    Simple error handling for both the repl and load.
    """
    print "An error occurred.  Here's the Python stack trace:\n"
    traceback.print_exc()

The core of repl is in the try clause, and we'll get to how that works shortly. Before we look at that, note that if the user presses Ctrl-C, it raises the KeyboardInterrupt exception, which causes the program to exit. If an error occurs during the try block -- say, due to a syntax error in the Lisp expression being parsed, or due to a bug in tiddlylisp itself - then some other exception will be raised. Tiddlylisp doesn't deal very well with errors - it simply announces that an error has occurred, and prints the Python stack trace, which may give you a few hints about what's gone wrong, but which is obviously a long way short of truly informative error handling! After printing the stack trace, tiddlylisp simply returns to the prompt. This type of error handling could easily be improved, but we're not going to invest any effort in this direction.

Let's look at the try clause. It begins by taking input at the prompt, and then passing it to the function parse, which converts the string entered by the user into an internal representation, i.e., a data structure that's more convenient for our Python program to work with than a string. Here's an example which shows how it works:

parse("(* (+ 7 12) (- 8 6))") = ["*", ["+", 7, 12], ["-", 8, 6]]

In other words, nested Lisp lists become Python sublists, and things like procedures and numbers become elements in a Python list.

We'll look shortly at how parse is implemented. For now, let's finish understanding how repl works. The output of parse is passed to the function eval, which evaluates the internal representation of the expression entered by the user. Provided no error occurs, the result is returned in val. However, val is in the format of the internal representation, and so we need to convert it from that internal representation back into a printable Lisp expression, using to_string.

At this point, we've got three functions to understand the details of: parse, eval and to_string. I'll explain them out of order, starting with parse and to_string, since they're extremely similar. Then we'll get to eval.

Alright, let's understand how parse works. Without further ado, here's the code; for the explanation, see below:

Symbol = str

def parse(s):
    "Parse a Lisp expression from a string."
    return read_from(tokenize(s))

def tokenize(s):
    "Convert a string into a list of tokens."
    return s.replace('(',' ( ').replace(')',' ) ').split()

def read_from(tokens):
    "Read an expression from a sequence of tokens."
    if len(tokens) == 0:
        raise SyntaxError('unexpected EOF while reading')
    token = tokens.pop(0)
    if '(' == token:
        L = []
        while tokens[0] != ')':
            L.append(read_from(tokens))
        tokens.pop(0) # pop off ')'
        return L
    elif ')' == token:
        raise SyntaxError('unexpected )')
    else:
        return atom(token)

def atom(token):
    "Numbers become numbers; every other token is a symbol."
    try: return int(token)
    except ValueError:
        try: return float(token)
        except ValueError:
            return Symbol(token)

The parsing is easy to understand. tokenize first inserts spaces on either side of any parentheses, and then splits the string around spaces, returning a list of tokens, i.e., the non-space substrings. read_from takes that list and removes the parentheses, instead nesting sublists as indicated by the original parenthesis structure. And, finally, atom turns tokens into Python ints, floats or Symbols (strings, by definition), as appropriate.

That's all there is to parse. If tiddlylisp were a little more powerful then parse would need to be more complex. For example, if we allowed strings as first-class objects in the language, then it would not work to tokenize by splitting around spaces, since that would risk splitting a string into separate tokens. This is the kind of thing that'd be fun to include in an extended version of tiddlylisp (and I've included it as a problem later in this section), but which we don't need in a first version.

Let's look now at how to_string works. It's much simpler, quickly undoing the steps taken in parsing:

isa = isinstance

def to_string(exp):
    "Convert a Python object back into a Lisp-readable string."
    if not isa(exp, list):
        return str(exp)
    else:
        return '('+' '.join(map(to_string, exp))+')'

In other words, if the internal representation of the expression is not a list, then return an appropriate stringified version (this takes care of the fact that we may have ints, floats and, as we shall see, Booleans in the internal representation). If it is a list, then return whatever we get by applying to_string to all the elements of that list, with appropriate delimiting by whitespace and parentheses.

At this point, the main thing we need to complete tiddlylisp is the eval function. Actually, that's not quite true: as we discussed earlier, tiddlylisp also keeps track of a global environment (and possibly one or more inner environments), to store variable and procedure names, and their values. eval is going to make heavy use of the environments, and so it helps to look first at how environments are defined. They're pretty simple: an environment has a bunch of keys, representing the names of the variables and procedures in that environment, and corresponding values for those keys, which are just the values for the variables or procedures. An environment also keeps track of its outer environment, with the caveat that the global environment has an outer environment set to Python's None. Such environments are easily implemented as a subclass of Python dictionaries:

class Env(dict):
    "An environment: a dict of {'var':val} pairs, with an outer Env."

    def __init__(self, params=(), args=(), outer=None):
        self.update(zip(params, args))
        self.outer = outer

    def find(self, var):
        "Find the innermost Env where var appears."
        return self if var in self else self.outer.find(var)

As you can see, the only modifications of the dictionary class are that: (1) an environment also keeps track of its own outer environment; and (2) an environment can determine whether a variable or procedure name appears in its list of keys, and if it doesn't, then it looks to see if it's in its outer environment. As a result, the find method returns the innermost environment where the variable or procedure name appears.

Note, incidentally, that the environment doesn't distinguish between variable and procedure names. Indeed, as we'll see, tiddlylisp treats user-defined procedures and variables in the same way: a procedure is a variable which just happens to take the value of a lambda expression as its value.

Tiddlylisp starts off operating in a particular global environment, and this too must be defined by our program. We'll do this by creating an instance of the class Env, and calling a function to add some built-in procedure definitions and variables:

def add_globals(env):
    "Add some built-in procedures and variables to the environment."
    import operator
    env.update(
        {'+': operator.add,
         '-': operator.sub, 
         '*': operator.mul, 
         '/': operator.div, 
         '>': operator.gt, 
         '<': operator.lt, 
         '>=': operator.ge, 
         '<=': operator.le, 
         '=': operator.eq
         })
    env.update({'True': True, 'False': False})
    return env

global_env = add_globals(Env())

Incidentally, in tiddlylisp's version of add_globals I decided to strip out many of the built-in procedures which Norvig includes in lispy's global environment - it's instructive to look at Norvig's code for add_globals to see just how easy it is to add more built-in procedures to tiddlylisp. If you want to do some exploratory programming with tiddlylisp then you should probably copy some of Norvig's additional built-in procedures (and perhaps add some of your own). For us, though, the above procedures are enough.

One notable feature of the global environment is the variables named True and False, which evaluate to Python's Boolean True and False, respectively. This isn't standard in Scheme (or most other Lisps), but I've done it because it ensures that we can use the strings True and False, and get the appropriate internal representation.

With the global environment set up, we can now understand how eval works. The code is extremely simple, simply enumerating the different types of expressions we might be evaluating, and reading from or modifying the environment, as appropriate. It's worth reading (and rereading) the code in detail, until you understand exactly how eval works. I also have a few comments at the end. Here's the code:

isa = isinstance

def eval(x, env=global_env):
    "Evaluate an expression in an environment."
    if isa(x, Symbol):              # variable reference
        return env.find(x)[x]
    elif not isa(x, list):          # constant literal
        return x                
    elif x[0] == 'quote' or x[0] == 'q': # (quote exp), or (q exp)
        (_, exp) = x
        return exp
    elif x[0] == 'atom?':           # (atom? exp)
        (_, exp) = x
        return not isa(eval(exp, env), list)
    elif x[0] == 'eq?':             # (eq? exp1 exp2)
        (_, exp1, exp2) = x
        v1, v2 = eval(exp1, env), eval(exp2, env)
        return (not isa(v1, list)) and (v1 == v2)
    elif x[0] == 'car':             # (car exp)
        (_, exp) = x
        return eval(exp, env)[0]
    elif x[0] == 'cdr':             # (cdr exp)
        (_, exp) = x
        return eval(exp, env)[1:]
    elif x[0] == 'cons':            # (cons exp1 exp2)
        (_, exp1, exp2) = x
        return [eval(exp1, env)]+eval(exp2,env)
    elif x[0] == 'cond':            # (cond (p1 e1) ... (pn en))
        for (p, e) in x[1:]:
            if eval(p, env): 
                return eval(e, env)
    elif x[0] == 'null?':           # (null? exp)
        (_, exp) = x
        return eval(exp,env) == []
    elif x[0] == 'if':              # (if test conseq alt)
        (_, test, conseq, alt) = x
        return eval((conseq if eval(test, env) else alt), env)
    elif x[0] == 'set!':            # (set! var exp)
        (_, var, exp) = x
        env.find(var)[var] = eval(exp, env)
    elif x[0] == 'define':          # (define var exp)
        (_, var, exp) = x
        env[var] = eval(exp, env)
    elif x[0] == 'lambda':          # (lambda (var*) exp)
        (_, vars, exp) = x
        return lambda *args: eval(exp, Env(vars, args, env))
    elif x[0] == 'begin':           # (begin exp*)
        for exp in x[1:]:
            val = eval(exp, env)
        return val
    else:                           # (proc exp*)
        exps = [eval(exp, env) for exp in x]
        proc = exps.pop(0)
        return proc(*exps)

Mostly this is self-explanatory. But allow me to draw your attention to how Norvig deals with anonymous procedure definitions using lambda. When I first examined his code I wondered how he'd cope with this, and expected it would be quite complex. But as you can see, it is extremely simple: lambda expressions evaluate to the appropriate anonymous Python function, with a new environment modified by the addition of the appropriate variable keys, and their values. Beautiful!

Tiddlylisp is essentially complete at this point. It's convenient to finish off the program by providing two ways of running tiddlylisp: either in an interactive interpreter mode, i.e., the REPL, or by loading a tiddlylisp program stored in a separate file. To start the REPL, we'll simply run python tiddlylisp.py. To load and execute a file, we'll run python tiddlylisp.py filename. After execution, we'd like to be dropped into the REPL so we can inspect results and do further experiments. The main complication in doing this is the need to load tiddlylisp code which is split over multiple lines. We do this by merging lines until the number of opening and closing parentheses match. Here's the code - it's best to start at the bottom, with the code immediately after if __name__ == "__main__":

import sys

def load(filename):
    """
    Load the tiddlylisp program in filename, execute it, and start the
    repl.  If an error occurs, execution stops, and we are left in the
    repl.  Note that load copes with multi-line tiddlylisp code by
    merging lines until the number of opening and closing parentheses
    match.
    """
    print "Loading and executing 
    f = open(filename, "r")
    program = f.readlines()
    f.close()
    rps = running_paren_sums(program)
    full_line = ""
    for (paren_sum, program_line) in zip(rps, program):
        program_line = program_line.strip()
        full_line += program_line+" "
        if paren_sum == 0 and full_line.strip() != "":
            try:
                val = eval(parse(full_line))
                if val is not None: print to_string(val)
            except:
                handle_error()
                print "\nThe line in which the error occurred:\n
                break
            full_line = ""
    repl()

def running_paren_sums(program):
    """
    Map the lines in the list program to a list whose entries contain
    a running sum of the per-line difference between the number of '('
    parentheses and the number of ')' parentheses.
    """
    count_open_parens = lambda line: line.count("(")-line.count(")")
    paren_counts = map(count_open_parens, program)
    rps = []
    total = 0
    for paren_count in paren_counts:
        total += paren_count
        rps.append(total)
    return rps

if __name__ == "__main__":
    if len(sys.argv) > 1: 
        load(sys.argv[1])
    else: 
        repl()

That completes the code for tiddlylisp! A grand total of 153 lines of non-comment, non-whitespace code. Here it all is, in one big block (commented and slightly reordered), so you can see how the pieces fit together:

#### tiddlylisp.py
#
# Based on Peter Norvig's lispy (http://norvig.com/lispy.html),
# copyright by Peter Norvig, 2010.
#
# Adaptations by Michael Nielsen.  See
# https://michaelnielsen.org/ddi/lisp-as-the-maxwells-equations-of-software/

import sys
import traceback

#### Symbol, Env classes

Symbol = str

class Env(dict):
    "An environment: a dict of {'var':val} pairs, with an outer Env."

    def __init__(self, params=(), args=(), outer=None):
        self.update(zip(params, args))
        self.outer = outer

    def find(self, var):
        "Find the innermost Env where var appears."
        return self if var in self else self.outer.find(var)

def add_globals(env):
    "Add some built-in procedures and variables to the environment."
    import operator
    env.update(
        {'+': operator.add,
         '-': operator.sub, 
         '*': operator.mul, 
         '/': operator.div, 
         '>': operator.gt, 
         '<': operator.lt, 
         '>=': operator.ge, 
         '<=': operator.le, 
         '=': operator.eq
         })
    env.update({'True': True, 'False': False})
    return env

global_env = add_globals(Env())

isa = isinstance

#### eval

def eval(x, env=global_env):
    "Evaluate an expression in an environment."
    if isa(x, Symbol):              # variable reference
        return env.find(x)[x]
    elif not isa(x, list):          # constant literal
        return x                
    elif x[0] == 'quote' or x[0] == 'q': # (quote exp), or (q exp)
        (_, exp) = x
        return exp
    elif x[0] == 'atom?':           # (atom? exp)
        (_, exp) = x
        return not isa(eval(exp, env), list)
    elif x[0] == 'eq?':             # (eq? exp1 exp2)
        (_, exp1, exp2) = x
        v1, v2 = eval(exp1, env), eval(exp2, env)
        return (not isa(v1, list)) and (v1 == v2)
    elif x[0] == 'car':             # (car exp)
        (_, exp) = x
        return eval(exp, env)[0]
    elif x[0] == 'cdr':             # (cdr exp)
        (_, exp) = x
        return eval(exp, env)[1:]
    elif x[0] == 'cons':            # (cons exp1 exp2)
        (_, exp1, exp2) = x
        return [eval(exp1, env)]+eval(exp2,env)
    elif x[0] == 'cond':            # (cond (p1 e1) ... (pn en))
        for (p, e) in x[1:]:
            if eval(p, env): 
                return eval(e, env)
    elif x[0] == 'null?':           # (null? exp)
        (_, exp) = x
        return eval(exp,env) == []
    elif x[0] == 'if':              # (if test conseq alt)
        (_, test, conseq, alt) = x
        return eval((conseq if eval(test, env) else alt), env)
    elif x[0] == 'set!':            # (set! var exp)
        (_, var, exp) = x
        env.find(var)[var] = eval(exp, env)
    elif x[0] == 'define':          # (define var exp)
        (_, var, exp) = x
        env[var] = eval(exp, env)
    elif x[0] == 'lambda':          # (lambda (var*) exp)
        (_, vars, exp) = x
        return lambda *args: eval(exp, Env(vars, args, env))
    elif x[0] == 'begin':           # (begin exp*)
        for exp in x[1:]:
            val = eval(exp, env)
        return val
    else:                           # (proc exp*)
        exps = [eval(exp, env) for exp in x]
        proc = exps.pop(0)
        return proc(*exps)

#### parsing

def parse(s):
    "Parse a Lisp expression from a string."
    return read_from(tokenize(s))

def tokenize(s):
    "Convert a string into a list of tokens."
    return s.replace('(',' ( ').replace(')',' ) ').split()

def read_from(tokens):
    "Read an expression from a sequence of tokens."
    if len(tokens) == 0:
        raise SyntaxError('unexpected EOF while reading')
    token = tokens.pop(0)
    if '(' == token:
        L = []
        while tokens[0] != ')':
            L.append(read_from(tokens))
        tokens.pop(0) # pop off ')'
        return L
    elif ')' == token:
        raise SyntaxError('unexpected )')
    else:
        return atom(token)

def atom(token):
    "Numbers become numbers; every other token is a symbol."
    try: return int(token)
    except ValueError:
        try: return float(token)
        except ValueError:
            return Symbol(token)

def to_string(exp):
    "Convert a Python object back into a Lisp-readable string."
    if not isa(exp, list):
        return str(exp)
    else:
        return '('+' '.join(map(to_string, exp))+')'         

#### Load from a file and run

def load(filename):
    """
    Load the tiddlylisp program in filename, execute it, and start the
    repl.  If an error occurs, execution stops, and we are left in the
    repl.  Note that load copes with multi-line tiddlylisp code by
    merging lines until the number of opening and closing parentheses
    match.
    """
    print "Loading and executing 
    f = open(filename, "r")
    program = f.readlines()
    f.close()
    rps = running_paren_sums(program)
    full_line = ""
    for (paren_sum, program_line) in zip(rps, program):
        program_line = program_line.strip()
        full_line += program_line+" "
        if paren_sum == 0 and full_line.strip() != "":
            try:
                val = eval(parse(full_line))
                if val is not None: print to_string(val)
            except:
                handle_error()
                print "\nThe line in which the error occurred:\n
                break
            full_line = ""
    repl()

def running_paren_sums(program):
    """
    Map the lines in the list program to a list whose entries contain
    a running sum of the per-line difference between the number of '('
    parentheses and the number of ')' parentheses.
    """
    count_open_parens = lambda line: line.count("(")-line.count(")")
    paren_counts = map(count_open_parens, program)
    rps = []
    total = 0
    for paren_count in paren_counts:
        total += paren_count
        rps.append(total)
    return rps

#### repl

def repl(prompt='tiddlylisp> '):
    "A prompt-read-eval-print loop."
    while True:
        try:
            val = eval(parse(raw_input(prompt)))
            if val is not None: print to_string(val)
        except KeyboardInterrupt:
            print "\nExiting tiddlylisp\n"
            sys.exit()
        except:
            handle_error()

#### error handling

def handle_error():
    """
    Simple error handling for both the repl and load.
    """
    print "An error occurred.  Here's the Python stack trace:\n"
    traceback.print_exc()

#### on startup from the command line

if __name__ == "__main__":
    if len(sys.argv) > 1: 
        load(sys.argv[1])
    else: 
        repl()

Problems

Modify tiddlylisp so that the + procedure can be applied to any number of arguments, e.g., so that (+ 1 2 3) evaluates to 6.
Earlier we implemented a square root procedure in tiddlylisp. Can you add it directly to tiddlylisp, using the Python math module's sqrt function?
In our earlier implementation of the sqrt procedure we discussed the ordering of the lines of code, and whether it's okay to define a procedure in terms of some other yet-to-be-defined procedure. Examine the code for tiddlylisp, and explain why it's okay for a procedure such as sqrt to be defined in terms of a procedure such as sqrt-iter which isn't defined until later. Try doing the same thing with variables, e.g., try running (define x y) followed by (define y 1). Does this work? If so, why? If not, why not?
Modify tiddlylisp so that when applied to one argument the - procedure simply negates it, e.g., (- 2) returns -2, while - still computes differences when applied to two arguments.
Is it possible to write a pure tiddlylisp procedure minus so that (minus x) returns -x, while (minus x y) returns x-y?
In the discussion where we introduced cond I stated that (cond (p1 e1)...(pn en)) is undefined when none of the expressions p1...pn evaluate to True. What does Tiddlylisp return in this case? Can you think of a better way of dealing with this situation?
Can you add support for strings to tiddlylisp?

When I first examined Norvig's code for lispy, I was surprised by just how much I learned from his code. Of course, I expected to learn quite a bit - I am just a beginner at Lisp - but what I learned greatly exceeded my expectations. Why might writing an interpreter deepen our understanding of a programming language? I think the answer has to do with how we understand abstractions. Consider the way I first explained the concept of Lisp environments, early in this essay: I gave a general discussion of the concept, and then related it to several of the examples we were working through. This is the usual way we cope with abstractions when learning (or teaching) a language: we make those abstractions concrete by working through code examples that illustrates the consequences of those abstractions. The problem is that although I can show you examples, the abstraction itself remains ephemeral.

Writing an interpreter is a way of making a programming language's abstractions concrete. I can show you a million examples illustrating consequences of the Lisp environment, but none will have quite the same concrete flavour as the code for our Python Lisp interpreter. That code shows explicitly how the environment can be represented as a data structure, how it is manipulated by commands such as define, and so on. And so writing an interpreter is a way of reifying abstractions in the programming language being interpreted.

Problems

I gave the example of the environment as a Lisp abstraction which is made more concrete when you understand the code for the interpreter. Another example of an abstraction is errors in code. Can you improve tiddlylisp's error handling so that we get something more informative than a Python stack trace when something goes wrong? One suggestion for how to do this is to identify two (or more) classes of error that may occur in tiddlylisp programs, and to modify the interpreter so it catches and gracefully handles those error classes, printing informative error messages.

On Peter Norvig's webpage describing his interpreter, a few commenters take him to task for writing his interpreter in Python. Here's an example to give you the flavour of these comments:

This code looks very nice, but i think that implementing a Lisp Interpreter in Python is some kind of cheating. Python is a high-level language, so you get very much for free.

Norvig replies:

You are right -- we are relying on many features of Python: call stack, data types, garbage collection, etc. The next step would be to show a compiler to some sort of assembly language. I think either the Java JVM or the Python byte code would be good targets. We'd also need a runtime system with GC. I show the compiler in my PAIP [Paradigms of Artificial Intelligence] book.

The commenter and Norvig are right, in some sense. But there's also a sense in which the Python interpreter achieves something that would not be achieved by a program that compiled Lisp to the Java JVM, or to assembler, or some other target closer to the bare metal. That's because of all programming languages, Python is one of the closest to an ordinary human language. And so writing a Lisp interpreter in Python is an exceptionally clear way of explaining how Lisp works to a human who doesn't yet understand the core concepts of Lisp. Insofar as I can guess at Norvig's intention, I believe the code for his interpreter is primarily intended to be read by humans, and the fact that it can also be read by a computer is a design constraint, not the fundamental purpose [3].

It seems to me that the kind of comment above arises because there are really three natural variations on the question "How to explain Lisp?". All three variations are interesting, and worth answering; all three have different answers. The first variation is how to explain Lisp to a person who doesn't yet know Lisp. As I've just argued, a good answer to this question is to work through some examples, and then to write a simple Python interpreter. The second variation is how to explain Lisp to a machine. That's the question the commenter on Norvig's blog is asking, and to answer that question nothing beats writing a Lisp interpreter (or compiler) that works close to the bare metal, say in assembler, requiring you to deal explicitly with memory allocation, garbage collection, and so on.

But there's also a third variation on the question. And that's how best to explain Lisp to someone who already understands the core concepts of Lisp. That sounds paradoxical: doesn't such a person, by definition, already understand Lisp? But it's not paradoxical at all. Consider the following experience which many people have when learning (or teaching) mathematics. The best way to explain a mathematical idea to someone new to the idea is using their old language and their old way of looking at the world. This is like explaining Lisp by writing a Python interpreter. But once the person has grasped a transformative new mathematical idea, they can often deepen their understanding by re-examining that idea within their new way of looking at the world. That re-examination can help crystallize a deeper understanding. In a similar way, while writing a Lisp interpreter in Python may be a good way of explaining Lisp to a person who doesn't yet understand Lisp, someone who grasps the core ideas of Lisp may find the Python interpreter a little clunky. How should we explain Lisp within the framework of Lisp itself? One answer to that question is to use Lisp to write a Lisp interpreter. It's to that task that we now turn.

Lisp in Lisp

How should we write a Lisp interpreter in Lisp? Let's think back to what Alan Kay saw at the bottom of page 13 of the LISP Manual:

Although it's written in a different notation than we've used, this is Lisp code. In fact, it's the core of a Lisp interpreter written in Lisp: the procedure evalquote takes a Lisp expression as input, and then returns the value of that expression. In this section we're going to use tiddlylisp to write an analogue to evalquote (we'll change the name to eval). Of course, such a procedure is not really a full interpreter - we won't have a read-eval-print loop, for one thing - but it's not difficult to extend our code to a full interpreter (it requires a few additions to tiddlylisp, too). For this reason, in what follows I'll refer to our eval procedure as an "interpreter", even though it's more accurate to say that it's the core of an interpreter. I haven't made the extension to a full interpreter here, partly because I don't want to lengthen an already long essay, but mostly because I want to stick to the theme of the "Maxwell's equations of software". For the same reasons, I've also limited our eval to interpreting only a subset of tiddlylisp, omitting the arithmetical operations and concentrating instead on procedures for manipulating lists.

My treatment in this section is based on a beautiful essay (postscript) by Paul Graham, in which he explains what the original designer of Lisp, John McCarthy, was up to in the paper where he introduced Lisp. In his essay, Graham writes a fully executable Lisp interpreter in one of the modern dialects of Lisp, Common Lisp, and I've based much of my code on Graham's. Perhaps the main difference in my treatment is that while Graham's eval is written to be run under Common Lisp, our eval is executable in tiddlylisp, an interpreter for Lisp that we've written ourselves (with lots of help from Peter Norvig!) So even though the code is very similar, the perspective is quite diferent, and I think we gain something from this different perspective.

The code we'll write is longer than what you see on page 13 of the LISP Manual. The reason is that the code on page 13 was not actually self-contained, but made use of several procedures defined earlier in the LISP Manual, and we need to include those procedures. The final result is still only a little over a page of code. Let's start by defining a few of those helper procedures.

(not exp) returns True if the expression exp evaluates to False, and otherwise returns False. For example,

tiddlylisp> (not (atom? (q (1 2))))
True
tiddlylisp> (not (eq? 1 (- 2 1)))
False

Here's the tiddlylisp code for not:

(define not (lambda (x) (if x False True)))

(append exp1 exp2) takes expressions exp1 and exp2 both of whose values are lists, and returns the list formed by concatenating those lists. For example,

tiddlylisp> (append (q (1 2 3)) (q (4 5)))
(1 2 3 4 5)

Here's the tiddlylisp code for append:

(define append (lambda (x y)
		 (if (null? x) y (cons (car x) (append (cdr x) y)))))

(pair exp1 exp2) returns a two-element list whose elements are the value of exp1 and the value of exp2:

tiddlylisp> (pair 1 2)
(1 2)
tiddlylisp> (pair (+ 1 2) 1)
(3 1)

Here's the tiddlylisp code for pair:

(define pair (lambda (x y) (cons x (cons y (q ()) ))))

Note that my use of pair is somewhat unconventional - the more usual approach in Lisp is to use (list exp1 exp2 exp3...) to construct a list whose values are just the values of the respective expressions. The reason I haven't done this is because tiddlylisp doesn't allow us to define Lisp procedures with a variable number of arguments. Note also that the procedure pair that I've defined should not be confused with one of Scheme's standard procedures, pair?, which has a different purpose, and which we won't use in the current essay.

Problems

Can you modify tiddlylisp so that (list exp1 exp2 exp3...) does indeed return a list whose values are just the values of the respective expressions?

I'll now introduce a class of helper procedures which are concatenations of two or more applications of car or cdr. An example is the procedure cdar, which applies car first, followed by cdr, that is, (cdar exp) has the same value as (cdr (car exp)). The notation cdar is a mnemonic, whose key elements are the middle two letters, d and a, indicating that cdar is what you get when you apply (in reverse order) cdr and car. You might wonder why it's reverse order - the answer is that reverse order corresponds to the visual syntactic order, that is, the order from left-to-right that the procedures appear in the expression (cdr (car exp)).

As another example, the procedure caar is defined so that (caar exp) has the same value as (car (car exp)). In our eval it'll be helpful to use several such procedures:

(define caar (lambda (x) (car (car x))))
(define cadr (lambda (x) (car (cdr x))))
(define cadar (lambda (x) (cadr (car x))))
(define caddr (lambda (x) (cadr (cdr x))))
(define caddar (lambda (x) (caddr (car x))))

Our next helper procedure is called pairlis. (pairlis exp1 exp2) takes expressions exp1 and exp2 whose values are lists of the same length, and returns a list which is formed by pairing the values of corresponding elements. For example,

tiddlylisp> (pairlis (q (1 2 3)) (q (4 5 6)))
((1 4) (2 5) (3 6))

Here's the tiddlylisp code for pairlis:

(define pairlis 
    (lambda (x y)
      (if (null? x)
	  (q ())
	  (cons (pair (car x) (car y)) (pairlis (cdr x) (cdr y))))))

We'll call a list of pairs such as that produced by pairlis an association list. It gets this name from our final helper procedure, the assoc procedure, which takes an association list and treats it as a lookup dictionary. The easiest way to explain what this means is through an example,

tiddlylisp> (define a (pairlis (q (1 2 3)) (q (4 5 6))))
tiddlylisp> a
((1 4) (2 5) (3 6))
tiddlylisp> (assoc 2 a)
5

In other words, assoc looks for the key 2 as the first entry in one of the pairs in the list which is the value of a. Once it finds such a pair, it returns the second element in the pair.

Stated more abstractly, suppose the expression exp1 has a value which appears as the first entry in one of the pairs in the association list which is the value of exp2. Then (assoc exp1 exp2) returns the second entry of that pair.

After all that explanation, the code for assoc is extremely simple, simpler even than pairlis:

(define assoc (lambda (x y)
		(if (eq? (caar y) x) (cadar y) (assoc x (cdr y)))))

I won't explain how assoc works, but if you're looking for a good exercise in applying caar and similar procedures, then it's worth spending some time to carefully understand how assoc works.

With all these helper procedures in place, we can now write our equivalent to the code on page 13 of the LISP Manual. This includes both the core procedure, eval, together with a couple of extra helper procedures, evcon and evlis. Here's the code:

(define eval 
    (lambda (e a)
      (cond
	((atom? e) (assoc e a))
	((atom? (car e))
	 (cond
	   ((eq? (car e) (q car))   (car (eval (cadr e) a)))
	   ((eq? (car e) (q cdr))   (cdr (eval (cadr e) a)))
	   ((eq? (car e) (q cons))  (cons (eval (cadr e) a) (eval (caddr e) a)))
	   ((eq? (car e) (q atom?)) (atom? (eval (cadr e) a)))
	   ((eq? (car e) (q eq?))   (eq? (eval (cadr e) a) (eval (caddr e) a)))
	   ((eq? (car e) (q quote)) (cadr e))
	   ((eq? (car e) (q q))     (cadr e))
	   ((eq? (car e) (q cond))  (evcon (cdr e) a))
	   (True                   (eval (cons (assoc (car e) a) (cdr e)) a))))
	((eq? (caar e) (q lambda))
	 (eval (caddar e) (append (pairlis (cadar e) (evlis (cdr e) a)) a))))))

(define evcon 
    (lambda (c a)
      (cond ((eval (caar c) a) (eval (cadar c) a))
	    (True              (evcon (cdr c) a)))))

(define evlis 
    (lambda (m a)
      (cond ((null? m) (q ()))
	    (True     (cons (eval (car m) a) (evlis (cdr m) a))))))

Before we examine how eval works, I want to give you some examples of eval in action. If you want, you can follow along with the examples by first loading the program defining eval into tiddlylisp (the full source is below), and then typing the examples into the interpreter.

To understand how to use eval in examples, we need to be clear about the meaning of its arguments. e is a Lisp expression whose value is the Lisp expression that we want to evaluate with eval. And a is a Lisp expression whose value is an association list, representing the environment. In particular, the first element of each pair in a is the name of a variable or procedure, and the second element is the value of that variable or procedure. I'll often refer to a just as the environment.

Suppose, for example, that we wanted to use eval to evaluate the expression (car (q (1 2))). We'll assume that we're evaluating it in the empty environment, that is, no variables or extra procedures have been defined. Then we'd need to pass eval expressions with values (car (q (1 2))) and (). We can do this by quoting those values:

tiddlylisp> (eval (q (car (q (1 2)))) (q ()))
1

As you can see, we get the right result: 1.

I explained in detail how to build up the expression (eval (q (car...) evaluated above. But if we hadn't gone through that explanation, then the expression would have appeared quite of complicated, with lots of quoting going on. The reason is that eval is evaluating an expression which is itself the value of another expression. With so much evaluation going on it's no wonder there's many q's floating around! But after working carefully through a few examples it all becomes transparent.

Here's an example showing how to use variables in the environment:

tiddlylisp> (eval (q (cdr x)) (q ((x (1 2 3)))))
(2 3)

Unpacking the quoting, we see that it's evaluating the expression (cdr x) in an environment with a variable x whose value is (1 2 3). The result is, of course, (2 3).

Here's an example showing how to use a procedure which has been defined in the environment:

tiddlylisp> (eval (q (cddr (q (1 2 3 4 5)))) (q ((cddr (lambda (x) (cdr (cdr x)))))))
(3 4 5)

In other words, the environment stores a procedure cddr whose value is (lambda (x) (cdr (cdr x))), and eval returns the result of applying cddr to an expression whose value is (1 2 3 4 5). Of course, this is just (3 4 5).

We can also use eval to define and evaluate an anonymous procedure, in this case one that has the same effect as cadr:

tiddlylisp> (eval (q ((lambda (x) (car (cdr x))) (q (1 2 3 4)))) (q ()))
2

A significant drawback of eval is that it has a pretty limited Lisp vocabulary. You can see this by running:

tiddlylisp> (eval (q (eq? 1 1)) (q (())))

The first line looks like perfectly valid Lisp - in fact, it is perfectly valid Lisp. The problem is that eval doesn't recognize 1 - at the level of sophistication we're working it really only understands lists, variables, and procedures. So what it tries to do is treat 1 as a variable or procedure to look up in the environment, a. But 1 isn't in the environment, which is why there's an error message.

Fixing this problem by modifying eval isn't terribly difficult [4]. However, to stay close to the LISP Manual, I'll leave this as is. A kludge to get around this issue is to add 1 as a key in the environment. For example, we can use:

tiddlylisp> (eval (q (eq? 1 1)) (q ((1 1))))
True
tiddlylisp> (eval (q (eq? 1 2)) (q ((1 1) (2 2))))
False

This is exactly as expected. We didn't see this problem in our earlier examples of eval, simply because they involved list manipulations which didn't require us to evaluate numbers such as 1. Incidentally, here's an amusing variation on the above kludge:

tiddlylisp> (eval (q (eq? 1 2)) (q ((1 1) (2 1))))
True

In other words, if we tell our interpreter emphatically enough that 1 = 2 then it will start to believe it!

Just to put eval through its paces, let's add a bundle of tests of basic functionality. It's not an exhaustive test suite, but at least checks that the basic procedures are working as we expect. You don't need to read through the following test code in exhaustive detail, although you should read at least the first few lines, to get a feeling for what's going on. Note that in a few of the lines we need to add something like 1 or 2 to the environment, in order that eval be able to evaluate it, as occurred in the example just above.

(define assert-equal (lambda (x y) (= x y)))

(define assert-not-equal (lambda (x y) (not (assert-equal x y))))

(assert-equal (eval (q x) (q ((x test-value))))
	      (q test-value))
(assert-equal (eval (q y) (q ((y (1 2 3)))))
	      (q (1 2 3)))
(assert-not-equal (eval (q z) (q ((z ((1) 2 3)))))
		  (q (1 2 3)))
(assert-equal (eval (q (quote 7)) (q ()))
	      (q 7))
(assert-equal (eval (q (atom? (q (1 2)))) (q ()))
	      False)
(assert-equal (eval (q (eq? 1 1)) (q ((1 1))))
	      True)
(assert-equal (eval (q (eq? 1 2)) (q ((1 1) (2 2))))
	      False)
(assert-equal (eval (q (eq? 1 1)) (q ((1 1))))
	      True)
(assert-equal (eval (q (car (q (3 2)))) (q ()))
	      (q 3))
(assert-equal (eval (q (cdr (q (1 2 3)))) (q ()))
	      (q (2 3)))
(assert-not-equal (eval (q (cdr (q (1 (2 3) 4)))) (q ()))
		  (q (2 3 4)))
(assert-equal (eval (q (cons 1 (q (2 3)))) (q ((1 1)(2 2)(3 3))))
	      (q (1 2 3)))
(assert-equal (eval (q (cond ((atom? x) (q x-atomic)) 
			     ((atom? y) (q y-atomic)) 
			     ((q True) (q nonatomic)))) 
		    (q ((x 1)(y (3 4)))))
	      (q x-atomic))
(assert-equal (eval (q (cond ((atom? x) (q x-atomic)) 
			     ((atom? y) (q y-atomic)) 
			     ((q True) (q nonatomic)))) 
		    (q ((x (1 2))(y 3))))
	      (q y-atomic))
(assert-equal (eval (q (cond ((atom? x) (q x-atomic)) 
			     ((atom? y) (q y-atomic)) 
			     ((q True) (q nonatomic)))) 
		    (q ((x (1 2))(y (3 4)))))
	      (q nonatomic))
(assert-equal (eval (q ((lambda (x) (car (cdr x))) (q (1 2 3 4)))) (q ()))
	      2)

In tiddlylisp, perhaps the easiest way to use this test code is to append it at the bottom of the file where we define eval. Then, when we load that file into memory, the tests run automatically. If everything is working properly, then all the tests should evaluate to True.

How does eval work? Looking back at the code, we see that it's just a big cond statement, whose value is determined by which of various conditions evaluate to True. The cond statement starts off:

      (cond
	((atom? e) (assoc e a))
        ...

To understand what this accomplishes, it is helpful to remember that what we're most interested in is the value of e, not e itself. Let's use e' to denote the value of e, i.e., e' is the Lisp expression that we actually want to evaluate using eval. Then what the condition above does is check whether e' is atomic, and if so it returns the value of the corresponding variable or procedure in the environment, exactly as we'd expect.

Let's look at the next line in the big outer conditional statement:

	((atom? (car e))

At this stage, we know that e' isn't atomic, since we already checked for that, and so e' must be a list. This line checks to see whether the first element of e' is itself an atom. If it is, then there are multiple possibilities: it could be a special form, such as quote, or a built-in procedure, such as car, or else a procedure that's defined in the environment. To check which of these possibilities is the case, we evaluate another (nested) conditional statement. This just checks off the different cases, for instance the first line of the nested conditional checks to see if we're applying the procedure car, and if so proceeds appropriately,

	   ((eq? (car e) (q car))   (car (eval (cadr e) a)))

In other words, if the first symbol in e' is car, then extract whatever expression is being passed to car, using (cadr e), then evaluate that expression using (eval (cadr e) a), and finally extract the first element, using (car (eval...)). That's exactly what we'd expect car to do. Most of the rest of this nested conditional statement works along similar lines, as you can check yourself. The final line is interesting, and deserves comment:

	   (True                   (eval (cons (assoc (car e) a) (cdr e)) a))))

This line is evaluated when the expression e' does not start with a special form or built-in procedure, but instead starts with the name of a procedure defined in the environment. To understand what is returned, note that (car e) retrieves the name of the procedure, so (assoc (car e) a) can retrieve the procedure from the environment, and then (cons (assoc (car e) a) (cdr e)) appends the arguments to the procedure. The whole thing is then evaluated. It's all quite simple and elegant!

Moving back into the outer cond statement, the final condition is as follows:

	((eq? (caar e) (q lambda))
	 (eval (caddar e) (append (pairlis (cadar e) (evlis (cdr e) a)) a))))))

This occurs when evaluating a quoted expression of the form ((lambda (x...|) exp). The first line simply checks that we are, indeed, seeing a lambda expression. The caddar e extracts the expression exp from the body of the lambda expression. We evaluate this in the context of an environment which has modified by appending some new variable names (extracted with cadar e), using pairlis to pair them with their values, which are evaluated using evlis (which you can work through yourself). Once again, it's all quite simple and neat - a fact which speaks to the marvellous elegance of the design presented in the LISP Manual (and, ultimately, due to John McCarthy).

It won't have escaped your attention that our Lisp eval is very similar to the eval we wrote earlier in Python. Tiddlylisp is somewhat different to the dialect of Lisp our eval interprets, but the implementation is recognizably similar. It is a matter of taste, but I think the Lisp implementation is more elegant. It's true that the Lisp code is superficially a little more complex - it relies more on concepts outside our everyday experience, such as the procedures caar, cadar, and so on. But it makes up for that by possessing a greater conceptual economy, in that we are using concepts such as car, cdr and cond to write an interpreter which understands those very same concepts.

Here's the full code for our Lisp interpreter in tiddlylisp. You should append the test code given above, and save it all as a single file, eval.tl.

(define caar (lambda (x) (car (car x))))
(define cadr (lambda (x) (car (cdr x))))
(define cadar (lambda (x) (cadr (car x))))
(define caddr (lambda (x) (cadr (cdr x))))
(define caddar (lambda (x) (caddr (car x))))

(define not (lambda (x) (if x False True)))

(define append (lambda (x y)
		 (if (null? x) y (cons (car x) (append (cdr x) y)))))

(define pair (lambda (x y) (cons x (cons y (q ()) ))))

(define pairlis 
    (lambda (x y)
      (if (null? x)
	  (q ())
	  (cons (pair (car x) (car y)) (pairlis (cdr x) (cdr y))))))

(define assoc (lambda (x y)
		(if (eq? (caar y) x) (cadar y) (assoc x (cdr y)))))

(define eval 
    (lambda (e a)
      (cond
	((atom? e) (assoc e a))
	((atom? (car e))
	 (cond
	   ((eq? (car e) (q car))   (car (eval (cadr e) a)))
	   ((eq? (car e) (q cdr))   (cdr (eval (cadr e) a)))
	   ((eq? (car e) (q cons))  (cons (eval (cadr e) a) (eval (caddr e) a)))
	   ((eq? (car e) (q atom?)) (atom? (eval (cadr e) a)))
	   ((eq? (car e) (q eq?))   (eq? (eval (cadr e) a) (eval (caddr e) a)))
	   ((eq? (car e) (q quote)) (cadr e))
	   ((eq? (car e) (q q))     (cadr e))
	   ((eq? (car e) (q cond))  (evcon (cdr e) a))
	   (True                   (eval (cons (assoc (car e) a) (cdr e)) a))))
	((eq? (caar e) (q lambda))
	 (eval (caddar e) (append (pairlis (cadar e) (evlis (cdr e) a)) a))))))

(define evcon 
    (lambda (c a)
      (cond ((eval (caar c) a) (eval (cadar c) a))
	    (True              (evcon (cdr c) a)))))

(define evlis 
    (lambda (m a)
      (cond ((null? m) (q ()))
	    (True     (cons (eval (car m) a) (evlis (cdr m) a))))))

It's instructive to compare our eval to what Kay saw on page 13 of the LISP 1.5 Programmer's Manual:

Obviously, what we've written is longer than that half-page! However, as I mentioned earlier, that half-page omitted the code for helper procedures such as caar, append, and so on, which were defined earlier in the LISP Manual. A more direct comparison is to our code for the eval, evcon and evlis procedures.

If you compare our code to the LISP Manual, a few differences jump out. The most obvious is that the LISP Manual's evalquote, apply and eval have all been combined into one procedure. This is a form of organization I adopted from Paul Graham's eval, and it makes it much easier to see what is going on, in the outer cond. In particular, the outer cond has a very simple structure: (1) if the expression we're tring to evaluate is an atom, return its value; otherwise (2) the expression must be a list, so check to see if the first element is an atom, in which case it must be a special form or procedure, and should be evaluated appropriately (this is the inner cond); and otherwise (3) we must be dealing with a lambda expression.

Condition (3) is interesting. With the syntax we're using, the condition in step (3) could simply be expressed as True, not (eq? (caar e) (q lambda), since it's the only remaining possibility. This would, in some sense, simplify (and speed up) the code. However, it would also make it harder to understand the intent of the code.

Something that you may note was present in the LISP Manual but which is missing from our eval is the special form label. label was used in the LISP Manual to give names to procedures, and so that procedure definitions could refer recursively to themselves. It's only a couple of lines to add back in, but I haven't done so. If you'd like, it's a fun challenge to add this functionality back in, and so I've given this as a problem below.

Problems

How could you add a facility to eval so that procedure definitions can refer to themselves? If you're having trouble with this problem, you can get a hint by looking at the code from page 13 of the LISP Manual. A complete solution to the problem may be found in Paul Graham's essay about the roots of Lisp.

This is a nice little interpreter. However, it has many limitations, even when compared to tiddlylisp. It can't do basic arithmetic, doesn't cope with integers, much less more complicated data types, it doesn't even have a way of defineing variables (after all, it doesn't return a modified environment). Still, it already contains many of the core concepts of Lisp, and it really is an executable counterpart to what Alan Kay saw on page 13 of the LISP 1.5 Manual.

What can we learn from this interpreter for Lisp in Lisp? As an intellectual exercise, it's cute, but beyond that, so what? Let's think about the analogous question for Python, i.e., writing a Python function that can return the value of Python expressions. In some sense, solving this problem is a trivial one-liner, since Python has a built-in function called eval that is capable of evaluating Python expressions.

What if, however, we eliminated eval (and similar functions) from Python? What then? Well, a Python version of eval would be much more complicated than our Lisp eval. Python is much less regular language than Lisp, and this makes it much more complicated for it to deal with Python code. By contrast, Lisp has an extremely simple syntax, and is designed to manipulate its own code as data. This is all reflected in the simplicity of the interpreter above.

Beyond this, the code for eval is a beautiful expression of the core ideas of Lisp, written in Lisp. It's true that our eval implements a very incomplete version of Lisp, but with just a little elaboration we can add support for arithmetic, more advanced control structures, and so on - everything needed to make this an essentially complete basic Lisp. And so we need only a little poetic license to say that, just as with Maxwell's equations and electromagnetism, there is a sense in which if you can look at this compact little program and understand all its consequences, then you understand all that Lisp can do. And because Lisp is universal, that means that inside these few lines of code is all a computer can do - everything from Space Invaders to computer models of climate to the Google search engine. In that sense this elegant little program really is the Maxwell's equations of software.

Problems

Outline a proof that (eval (q (exp)) a) returns the value of exp in the environment a for all expressions exp and environments a if and only if the underlying Lisp interpreter is correct. This little theorem can be considered a formal way of stating that eval contains all of Lisp. The reason I ask for an outline proof only is that various elements in the statement aren't defined as well as they need to be to make this a rigorous result; still, a compelling outline proof is possible.
Extend the code given for eval so that you can implement a full read-eval-print loop. This will require you to extend tiddlylisp so that it can cope with input and output, and (perhaps) some sort of looping.
Having worked through eval.tl, it should now be easy to work through the first chapter of the LISP 1.5 Programmer's Manual. Download the LISP manual and work through the first chapter, including the code on page 13.

Problems for the author

Is it possible to modify the above Lisp-in-Lisp so that it interprets all of tiddlylisp? Note that this will require modification of tiddlylisp.

Acknowledgements

Thanks to Jen Dodd for many helpful discussions.

Footnote

[1] I'm paraphrasing, since this was 17 years ago, but I believe I've reported the essence of the comments correctly. I've taken one liberty, which is in supplying my own set of examples (antennas, motors, and circuits), since I don't recall the examples he gave. Incidentally, his comments contain a common error that took me several years to sort out in my own thinking: Maxwell's equations actually don't completely specify electromagnetism. For that, you need to augment them with one extra equation, the Lorentz force law. It is, perhaps, unfair to characterize this as an error, since it's a common useage to equate Maxwell's equations with electromagnetism, and I've no doubt my professor was aware of the nuance. Nonetheless, while the useage is common, it's not correct, and you really do need the Lorentz force law as well. This nuance creates a slight quandary for me in this essay. As the title of the essay suggests, it explores a famous remark made by Alan Kay about what he called "the Maxwell's equations of software", and I presume that in making this remark Kay was following the common useage of equating the consequences of Maxwell's equations with the whole of electromagnetism. My quandary is this: on the one hand I don't wish to perpetuate this useage; on the other hand I think Kay's formulation is stimulating and elegant. So I'll adopt the same useage, but with the admonishment that you should read "Maxwell's equations" as synonomous with "Maxwell's equations plus the Lorentz force law". That set of five equations really does specify the whole of (classical) electromagnetism.

[2] I first encountered lambda in Python, not Lisp, but I believe I would have been perplexed for the same reason even if I'd first encountered it in Lisp.

[3] With a hat tip to Abelson and Sussman, who famously wrote "programs must be written for people to read, and only incidentally for machines to execute".

[4] The simplest solutions I can think of are: (1) to give eval the ability to determine when some key is not in the environment; or (2) to give eval the ability to recognize numbers. Both approaches seem to also require making some modifications to tiddlylisp.

37 comments

Pingback: Interesting reads | Separallel
Misha says:

April 12, 2012 at 12:03 am

Book form
Jacques Chester says:

April 12, 2012 at 12:30 am

Consider publishing this as a Kindle book. I could easily see myself paying a dollar for the privilege of a well-laid out copy.
1. Michael Nielsen says:
  
  April 12, 2012 at 10:04 am
  
  Thanks for the suggestion. I hadn’t considered the idea before. I’ll certainly keep the idea in mind for future essays, and consider it for this one.
  1. stavros says:
    
    April 12, 2012 at 10:31 pm
    
    It’s really easy, http://www.leanpub.com will do it for you for free in about ten minutes. Give it a shot!
Nick Chaiyachakorn says:

April 12, 2012 at 4:32 am

He didn’t start by throwing copies of Jackson at the heads of people? I’m suspicious.
1. Michael Nielsen says:
  
  April 12, 2012 at 10:00 am
  
  No, he broke with tradition!
  
  (We did, in fact, use Jackson.)
Pingback: Lisp as the Maxwell’s equations of software « (blog 'irr)
Darius Bacon says:

April 12, 2012 at 5:09 am

A couple more problems you might enjoy:

1. Change your Lisp system so multiple people can use it to cooperate without undue vulnerability. Give each user a separate global environment, with some way to send Lisp values to each other. Can Alice run Bob’s code on Carol’s data without risking any party’s privacy or integrity? http://mumble.net/~jar/pubs/secureos/ explores this. I wrote some example interactions at https://github.com/darius/consp/blob/master/tests.scm

2. Modify lis.py or Tiddlylisp to represent data in a lower-level form, as a pair of a type tag and a value — for example, a cons pair might become (‘pair’, index) where index is an offset into an array of all pairs. I just did this problem now and I’ll post my answer after I add a simple garbage collector and clean it up. SICP chapter 5 also gets into this.
1. Michael Nielsen says:
  
  April 12, 2012 at 10:13 am
  
  These both seem like very nice problems.
  
  For problem 1, I wonder if the recent work on homomorphic encryption would be of any help?
  
  A solution to problem 2 would be a nice basis for improving tiddlylisp, e.g., by making the arithmetic more conformant to the Scheme standard, with notions such as exact arithmetic. At the moment tiddlylisp simply uses Python’s standard numerical types and operators, which don’t correspond very well with Scheme’s approach.
  1. Darius Bacon says:
    
    April 13, 2012 at 4:45 am
    
    About homomorphic encryption, I’d like to know! The answer I had in mind: the interpreter can obey some basic invariants that you can build higher-level invariants you care about on top of. Basically, calling a procedure can affect only the data in its environment plus the arguments passed; the properties you care about might be “this data structure never gets clobbered”, “money is conserved”, “each user can vote at most once”. It”s like exploiting conservation laws with Maxwell’s equations (and Hewitt and Baker called the above basic rule a ‘locality law’) — only hopefully easier to engineer. This approach hasn’t been followed much because in languages not designed for it, all code can access global mutable state by default, and you’re kind of screwed before you start. I hear EcmaScript Harmony is being designed with these ideas in mind, though, and I’m told similar ideas should work for the web (thinking of a URL as an object reference, etc.).
    
    So if I want Alice to vote at most once, I might mail her the value of
    
    (define vote (lambda (choice) (really-vote choice) (set! vote #f)))
    
    while keeping really-vote close to my vest.
    
    While this is excellent for protecting integrity, it’s less useful for privacy because adversaries typically can observe covert channels (messages encoded in resource usage). Maybe that’s where homomorphic encryption can come in?
    
    I didn’t have much time for problem 2 today, but I’ll post it tomorrow.
    1. Darius Bacon says:
      
      April 13, 2012 at 4:54 am
      
      Actually what I should send in that example is (lambda (choice) (vote choice)), oops. Posting secure-coding examples after midnight: good way to maximize their educational value…
  2. Darius Bacon says:
    
    April 14, 2012 at 5:38 am
    
    My version of the lower-level data representation, still needing work: https://github.com/darius/sketchbook/blob/master/lispy/lllis.py
Pingback: Functional Programming: Lisp as the Maxwell’s equations of software | James's Musings
Paul McJones says:

April 12, 2012 at 3:03 pm

The URL http://www.softwarepreservation.org/projects/LISP/book/LISP%201.5%20Programmers%20Manual needs to have a “.pdf” at the end.
1. Michael Nielsen says:
  
  April 12, 2012 at 3:09 pm
  
  Paul – Already caught it, thanks!
Arne Hanssen says:

April 14, 2012 at 6:52 am

Great post!
I really like the way you use a quote from Alan Kay in you post. I think it’s also worth mentioning that Allan Kay had problems with the concept of “special forms” in McCarthy Lisp evaluator.
About this Alan Kay says:

there where deep flaws in its logical foundations. By this, I mean that the pure language was supposed to be based on functions, but its most important components—such as lambda expressions quotes, and conds–where not functions at all, and instead where called special forms.

So in Lisp, Kay saw more of a promise of a metastructures than a actual universal model of computation. This led Alan Kay to the invention of Smalltalk a programing language with just 5 keywords(true, false, nil, super, self).
Fadi E. says:

April 15, 2012 at 10:20 am

I still have a problem while all of us are using a language other than LISP to program. It’s the most beautiful language ever – it’s the essence of programming languages – yet it’s spurned in favor of Java and others.

What a shame!
Enrico Spinielli says:

April 18, 2012 at 9:48 am

I see you generate your formulae from LaTeX, so I think you would benefit from using MathJax – http://www.mathjax.org/ – in your posts, i.e. no need to generate the images anymore and they will scale with the font of the page.

Also I really enjoyed your post and would signal another nice minimalist lisp implementation (http://piumarta.com/software/lysp/) from Ian Piumarta, a reasearcher at Kay’s VPRI. Also his work with Maru – http://piumarta.com/software/maru/ -develops and further goes down to the mathemathical essentials of software.

Thanks
Pingback: Lisp as the Maxwell’s equations of software | OMY
Gareth says:

May 10, 2012 at 9:50 am

Very nice work, Michael.

I’d add “The Little Schemer” to the suggested reading list. A peculiar but effective (for me, at least) introduction to thinking functionally.
Pingback: Implementing Programming Languages in Javascript, JsConf Argentina « Angel ”Java” Lopez on Blog
Ralph Moritz says:

May 21, 2012 at 3:30 am

A beautifully written essay which I’d love to have available as a PDF.
Pingback: Alexander Kruel · Computer Science and Programming: Books and Resources
Bystroushaak says:

June 17, 2012 at 5:33 pm

Thank you.

I’ve been reading this article about month and finally finished it today. Thanks to you, I’ve finally realized, what mathematicians means by “beauty”, when they talk about math.
1. Michael Nielsen says:
  
  June 17, 2012 at 5:52 pm
  
  @Bystroushaak: Thankyou, very much, for your generous, thoughtful compliment. I’m very glad you enjoyed my essay.
Al says:

August 4, 2012 at 3:05 pm

Hi Michael,
I am currently reading your book “Quantum computation and quantum communication” and enjoy reading it.
Why don’t you update the book with the latest and greatest findings in the field?

Also about Lisp:
I like Lisp but it is just a functional programming language. I don’t agree that it is the “Maxwell’s equations of software”. May be you can say Lambda Calculus which is the foundation of all functional (and to some degree non-functional languages” is the Maxwell’s equations of software. But even saying that is going too far. Software engineering is FAR FROM having a simple underlying theory. May be theory of computation (Turing machines etc) can be that.

By the way I think modern programming languages like Haskel are much better and more capable than lisp. Also I suggest you do a computational version of your quantum computing book using Mathematica or Maple (for symbolic computing). A package for Mathlab also would be great. I am a software engineer and currently taking Quantum Computation courses (online with Prof Vazirani, Coursra) and can cooperate with you in creating a symbolic package for QC and QCOMM.
thanks
Al
Glen Kaukola says:

November 9, 2012 at 10:09 am

So in repl(), shouldn’t the while loop go inside the try block? I’ve been mucking about with your code and it seems to me the keyboard exception isn’t always caught.
1. Glen Kaukola says:
  
  November 9, 2012 at 10:42 am
  
  Yeah, nevermind. It’s just because I’m filling in with stubs at the get go.
Pingback: Python: Links, News And Resources (6) « Angel ”Java” Lopez on Blog
Pingback: Interprete Lisp in Python | Ok, panico
Jan Stette says:

July 30, 2013 at 4:38 am

Just wanted to say thank you for a fantastic post, this is one of the best pieces of writing on software I have ever come across.

I’ve spent some time working through the points covered, and written up a summary on my blog at http://www.janvsmachine.net/2013/07/writing-simple-lisp-interpreter-in-scala.html, in case anyone’s interested.
1. Michael Nielsen says:
  
  July 30, 2013 at 7:53 am
  
  Thanks, Jan, for the kind words.
  
  I don’t know Scala, but that looks like a very nice implementation!
Pingback: Your Language is a Start-up | Late Bound
Franklin Chen says:

November 25, 2013 at 12:44 pm

Lisp was also a tremendous part of my education as a programmer, especially when I studied SICP, which altered my life.

I’d like to mention that after SICP, something else came along that also changed my entire mindset as a programmer: the discovery of statically typed functional languages that felt very much like Scheme but added a whole new level of expression for me. These were ML (in the dialects of Standard ML and OCaml) and Haskell.

I’m going to make a bold claim: if Lisp is Maxwell’s equations in 19th century formulation, ML/Haskell are Maxwell’s equations expressed using differential forms and therefore more fundamental. As a physicist, you know what I mean.
1. Michael Nielsen says:
  
  November 25, 2013 at 12:47 pm
  
  Thanks for the comment, Franklin. I’ve never used ML / Haskell etc. I really should try them out, they sound fascinating!
Konrad says:

November 25, 2013 at 3:16 pm

Very nice post Michael. All the best and keep writing! 🙂