Conditionals

(+ (* 4 5) 3)

(+ (print (* 4 5)) 3)

Pen and paper in pairs

(+ (* 2 3)
   (* 4 5))

(+ (print (* 2 3))
   (print (* 4 5)))

(if (= 3 4)
    (+ 5 6)
    (* 7 8))

(if (print (= 3 4))
    (print (+ 5 6))
    (print (* 7 8)))

parse-interpret.png

Material on Interpreters

In the Linked information on interpreters you can find information on

  • Our Tolk interpreter architecture
  • The Tolk implementation
  • The PLAI book upon which it is based
  • The Common Lisp libraries we use
  • Links to the course materials in classes

Interpreting programming languages

We begin by writing and interpreter for a very simple programming language for arithmetic. After that we will add features to the programming language, increasing its complexity step by step. At each step we will make an interpreter, often by extending the previous interpreter. Each interpreter will be in a package of its own to avoid naming clashes.

As we define each language, we carry out the following steps:

  1. Open a new file in the src/racket subdirectory to contain the functions and forms which will define the language, e.g. src/racket/arith.lisp
  2. Define a package there and export all symbols required to use the language.
  3. Add a (:file ...) clause inside the (:module "racket" ...) in the system definition in tolk.asd, e.g.

    (:file "arith")
    
  4. In the tests/racket subdirectory add a test file containing the checks of the language's parser, evaluator, etc.
  5. Add a (:file ...) clause inside the (:module "racket" ...) in the test system definition in tolk.asd, e.g.

    (:file "arith")
    
  6. Run all the tests using (asdf:test-system :tolk) or run a single test using, e.g. (5am:run! 'tolk/arith/tests::arith-execute)

Arithmetic

We begin with simple arithmetic with only two operations: addition and multiplication. We will limit ourselves to binary operators; each will take exactly two arguments. The expressions of this language will be of three kinds:

  1. a number
  2. the sum of two expressions of the language
  3. the product of two expressions of the language

Note the recursive nature of the description of the expressions of the language.

Here are some examples of the surface syntax of this language in Racket. The expressions can be arbitrarily deeply nested.

32
(+ 3 4)
(+ (* 1 2) (+ 2 3))

If we were interpreting Python, those same expressions might look like this:

32
3 + 4
(1 * 2) * (2 + 3)

As a first step to interpreting one of these expressions, to abstract away from the surface syntax, we read it into an Abstract Syntax Tree (AST) , which we define using CLOS classes as follows.

(defclass ast-node () ())

(defclass num (ast-node)
  ((n :type number :initarg :n)))

(defclass arith-binary-op (ast-node)
  ((op :type function :initarg :op)
   (l :type ast-node :initarg :l)
   (r :type ast-node :initarg :r)))

(defclass plus (arith-binary-op) ((op :initform '+)))

(defclass mult (arith-binary-op) ((op :initform '*)))

Note how there are three concrete classes (num, plus, mult) corresponding to the three kinds of expressions of the language above, plus an abstract base class arith-binary-op that captures the common structure of binary operations.

We parse an s-expression as follows.

(defgeneric parse (x)
  (:documentation "Parse into an AST form."))

(defmethod parse (sexp)
  (ematch sexp
    ((type number) (make-instance 'num :n sexp))
    (`(+ ,l ,r) (make-instance 'plus :l (parse l) :r (parse r)))
    (`(* ,l ,r) (make-instance 'mult :l (parse l) :r (parse r)))))

Note how we make use of the trivia library for pattern matching together with the quasiquote notation.

A further method for parsing from a pathname can be found in ./src/racket/arith.lisp and the corresponding test in ./tests/racket/arith.lisp passes. Note how one of the checks reads from a file.

(5am::explain! (5am:run 'tolk/arith/tests::arith-parse))

Note in particular in ./tests/object-equality.lisp that we define a function eqo (object equality) for comparing AST nodes during testing. Two AST nodes are equal under EQO when they are of the same class and their corresponding slot values are also EQO. For objects which are not AST nodes, EQO defaults to EQL.

Now that we can read and parse our arithmetic language into an AST, it's time to interpret the AST. We use generic functions with specialized methods for each AST node type and use the DEFMETHOD-BIND macro from ./src/utils.lisp to define a method with optionally further binding. The definitions here do not make use of this optional further binding; we will see it in action later.

(defgeneric interpret (astn))

(defmethod-bind interpret ((num n))
    ()
  n)

(defmethod-bind interpret ((arith-binary-op op l r))
    ()
  (funcall op (interpret l) (interpret r)))

Note how the arith-binary-op method handles both plus and mult through inheritance and the stored op function.

(5am::explain! (5am:run 'tolk/arith/tests::arith-interpret))

We tie it all together by writing execute which takes an s-expression or a pathname and first parses, then interprets.

(5am::explain! (5am:run 'tolk/arith/tests::arith-execute))

Note how we used the Lisp operators for addition and multiplication for our operators. But we could make other choices, e.g.

  • We could limit our numbers to 8 bits.
  • We could do arithmetic modulo 4.
(defgeneric interpret-mod-4 (astn))

(defmethod-bind interpret-mod-4 ((num n))
    ()
  (mod n 4))

(defmethod-bind interpret-mod-4 ((arith-binary-op op l r))
    ()
  (mod (funcall op (interpret-mod-4 l) (interpret-mod-4 r)) 4))
(interpret-mod-4 (parse '(+ (* 1 2) (+ 2 3))))

Subtraction

Now let's extend our language by adding (binary) subtraction.

First let us note two things.

  1. arith can deal with negative numbers just as well as positive ones.
  2. We can rewrite a subtraction as an addition using \(a - b = a + (-1) * b\). We say that the left-hand side is syntactic sugar for the right-hand side and wecall this rewriting technique desugaring.

We start with an AST for our subtraction language, extending the arithmetic classes with a new minus class.

(defclass minus (arith-binary-op) ((op :initform '-)))

Note that we give it an op but it will never be used since any minus expression will be rewritten.

As before we need a parser that takes an s-expression and returns an AST, now including the minus class.

(defgeneric parse (x)
  (:documentation "Parse into an AST form."))

(defmethod parse ((p pathname))
  (mapcar #'parse (read-racket-file p)))

(defmethod parse (sexp)
  (ematch sexp
    ((type number) (make-instance 'num :n sexp))
    (`(,op ,l ,r) (make-instance (ecase op
                                   (+ 'plus)
                                   (* 'mult)
                                   (- 'minus))
                    :l (parse l) :r (parse r)))))

Now, rather than writing an interpreter for the expressions of subtraction, we are going to desugar them by rewriting them into expressions of arith before using the interpreter arith:interpret that we wrote already.

(defgeneric desugar (astn))

(defmethod desugar ((astn num))
  astn)

(defmethod desugar ((astn minus))
  (bind (((:slots l r) astn))
    (make-instance 'plus
      :l (desugar l)
      :r (make-instance 'mult
           :l (make-instance 'num :n -1)
           :r (desugar r)))))

(defmethod-bind desugar ((arith-binary-op op l r))
    ()
  (make-instance (ecase op
                   (+ 'plus)
                   (* 'mult))
    :l (desugar l) :r (desugar r)))

Note how the default method returns the AST unchanged (for num, plus, mult), while the specialized method for minus transforms it into addition with a negated right operand.

We can see how this works for the simple subtraction expression (- 4 3)

(tolk/subtraction::desugar (tolk/subtraction::parse '(- 4 3)))

This output is the printed representation of (+ 4 (* -1 3))

We also run the test file.

(5am::explain! (5am:run 'tolk/subtraction/tests::subst-desugar))

Now, in order to interpret an expression with subtraction, we desugar it first and then interpret:

(defun interpret (ast)
  "Interpret an AST AST and return a number."
  (a::interpret (desugar ast)))

Note how we use the local nickname a for tolk/arith

(:local-nicknames (:a :tolk/arith))

We write a substitution:evaluate and test.

(5am::explain! (5am:run 'tolk/subtraction/tests::subst-execute))

Conditionals

We could continue to extend our language by equipping it with other arithmetic operators but now we turn our attention to a conditional: the if form.

(if (= 3 4)
    (+ 5 6)
    (* 7 8))

We note that if can not behave like a function. To evaluate a function application we first evaluate all of its arguments before applying the function. But this is not how we want if to work. We would like if to evaluate its first argument. The result should be some kind of Boolean. If the Boolean is true we then evaluate the second argument. However if the Boolean is false we then evaluate the third argument. In no case do we evaluate both the second and third arguments.

We also need a predicate that returns a Boolean so we add = to our language.

Let's leave aside subtraction and desugaring for now and build our language with conditionals on top of our first arithmetic language. We add two new classes to the AST hierarchy.

(defclass == (arith-binary-op)
  ((op :initform #'=)))

(defclass ift (ast-node)
  ((test :type ast-node :initarg :test)
   (then :type ast-node :initarg :then)
   (else :type ast-node :initarg :else)))

We extend the parser to parse if forms and = forms.

(defgeneric parse (x)
  (:documentation "Parse into an AST form."))

(defmethod parse ((p pathname))
  (mapcar #'parse (read-racket-file p)))

(defmethod parse (sexp)
  (ematch sexp
    ((type number) (make-instance 'num :n sexp))
    (`(+ ,l ,r) (make-instance 'plus :l (parse l) :r (parse r)))
    (`(* ,l ,r) (make-instance 'mult :l (parse l) :r (parse r)))
    (`(= ,l ,r) (make-instance '==   :l (parse l) :r (parse r)))
    (`(if ,test ,then ,else) (make-instance 'ift
                               :test (parse test)
                               :then (parse then)
                               :else (parse else)))))

Before now every value in our language has been a number but we have just introduced Booleans and we would like to be a little more careful with our types moving forward so we define value classes with a base value class and two concrete subclasses: numval and boolval.

(defclass value () ())

(defclass numval (value)
  ((n :type number :initarg :n)))

(defclass boolval (value)
  ((b :type boolean :initarg :b)))

Finally we need to write interpret. We need an interpreter that can deal with value objects and return them. For this reason we can no longer build on the arith:interpret function.

We use the if form of the Common Lisp language since it has identical semantics to the if form of the Racket language we wish to capture.

(defmethod-bind interpret ((== l r))
    (((:slots (nl n)) (interpret l))
     ((:slots (nr n)) (interpret r)))
  (make-instance 'boolval :b (= nl nr)))

(defmethod-bind interpret ((ift test then else))
    (((:slots b) (interpret test)))
  (if b
      (interpret then)
      (interpret else)))

Note how the interpreter will give an error if the value it is given is in the test position of if is anything other than a Boolean. We could also have taken a broader view using generalized Booleans to decide what we consider to be truthy and falsy values, as is done in Python, Lisp and other languages. How would we have written our interpreters in these cases?

The methods for num and arith-binary-op look like this.

(defmethod-bind interpret ((num n))
    ()
  (make-instance 'numval :n n))

(defmethod-bind interpret ((arith-binary-op op l r))
    (((:slots (nl n)) (interpret l))
     ((:slots (nr n)) (interpret r)))
  (make-instance 'numval :n (funcall op nl nr)))

We run further tests for the parser and executor.

(5am::explain! (5am:run 'tolk/conditional/tests::cond-parse))
(5am::explain! (5am:run 'tolk/conditional/tests::cond-execute))

etc

Author: Breanndán Ó Nualláin <o@uva.nl>

Date: 2026-05-11 Mon 12:22