Mutation & Recursion
Assignment 4
Our Racket interpreter is split over a number of Lisp packages. This was because we wrote a numbre of iterpreters for various mini-languages.
For your Python interpreter we will use just one package: :tolk/python. This
package is already defined in src/python/package.lisp so when you begin writing
your Python interpreter you only need to make a file called
src/python/interpreter.lisp, you only need to put the following line at the top:
(in-package :tolk/python)
You might then continue with:
(defgeneric interpret (node))
(defmethod interpret ((c constant))
(slot-value c 'value))
(defmethod interpret ((b binop))
(funcall (case (slot-value b 'op)
(:add '+))
(interpret (slot-value b 'left))
(interpret (slot-value b 'right))))
The first one interprets constant values and the second binary operators (only addition!).
Or if you prefer to use defmethod-bind then:
(defmethod-bind interpret ((constant value) env)
()
value)
(defmethod-bind interpret ((binop left op right) env store)
((left-val (interpret left env store))
(right-val (interpret right env store1)))
(ecase op
(:add (+ left-val right-val))))
Be sure to put a line like this in the tolk.asd file:
(:file "interpreter")
Now make a test file tests/python/interpreter.lisp and add tests similar to
those in tests/python/parsing.lisp Be sure to import the functions you defined
in src/python/interpreter.lisp into tests/python/suite.lisp
Let Over Lambda
In the previous class we saw how we can separate
- making a function value with
lambda, and - assigning that value to a variable with
define
(define (f x y)
(+ (* 2 x) y))
(define f (lambda (x y)
(+ (* 2 x) y)))
The first form is simply syntactic sugar for the second.
The second form is no different than
- making an integer value with
+, and assigning that value to a variable with
define(define seven (+ 3 4))But what use is this separation?
(define counter (let ((count 0)) (lambda () (set! count (+ 1 count)) count)))
Now we have
> (counter)
1
> (counter)
2
> (counter)
3
> (counter)
4
Note how this implements information hiding; the variable count cannot be
accessed in any way except through the counter function.
Here is an example of a bank with two accounts, for Jack and Jill.
(define bank
(let ((jack 100)
(jill 100))
(lambda (amount)
(set! jack (- jack amount))
(set! jill (+ jill amount))
(list jack jill))))
We can transfer money from Jack's account to Jill's or vice versa but the total amount remains constant.
> (bank 0)
'(100 100)
> (bank 20)
'(80 120)
> (bank -50)
'(130 70)
>
Database transactions need to satisfy the so-called ACID properties. The above is an example of transaction atomicity.
Mutation
In Python we use the same notation to mutate a (simple) value as to introduce it.
x = 1
x = 2
Here the first line is an introduction and the second a mutation.
In Racket we use different notations.
(define x 1)
(set! x 2)
The convention in Racket is to end the name of a mutation operator with an exclamation mark (pronounced "bang", as in "set bang"). The reason for this is that Racket is more often used for functional programming so it is helpful to draw attention to any explicit mutation.
- Boxes
To add mutation to our interpreter, we introduce a box, with three operations.
- we can put something into a box
- we can take out the contents of a box, and
- we can mutate the value in the box.
We will call these operations
box,unboxandset-box!.This will allow us to be explicit about locations in memory.
Because mutation is sensitive to the order in which they take place, we will also introduce an sequencing operator into our language. In Racket this operator is called
begin.Let's have a look at some Racket code using these newly introduced operators.
#lang racket (let ((b0 (box 0)) (b1 (box 1))) (let ((boxlist (list b0 b1))) (begin (set-box! (first boxlist) 2) (set-box! (second boxlist) 3) (list (unbox (first boxlist)) (unbox (second boxlist))))))Here are the steps in order:
- Make a box containing the number
0and store that box inb0 - Make a box containing the number
1and store that box inb1 - Make a list containing the boxes
b0andb1and store it inboxlist - Set the value in the box in the first element of the list to be
2 - Set the value in the box in the second element of the list to be
3 - Make a list of two elements, the first being the contents of the box which
is the first element of
boxlistand the second being the contents of the box which is the second element ofboxlist
Here the ordering of the two occurrences of
set-box!is not important but if they had both mutated the same box, it would have been. For example this ordering:(let ((b (box 0))) (begin (set-box! b 1) (set-box! b 2) (unbox b)))is different to this one:
(let ((b (box 0))) (begin (set-box! b 2) (set-box! b 1) (unbox b)))To add these operations to our interpreter, we extend the definition of the
functionsASTwith classes forbox,unbox,setboxandseq2.(defclass box (ast-node) ((arg :type ast-node :initarg :arg))) (defclass unbox (ast-node) ((arg :type box :initarg :arg))) (defclass setbox (ast-node) ((box :type box :initarg :box) (contents :type ast-node :initarg :contents))) (defclass seq2 (ast-node) ((s1 :type ast-node :initarg :s1) (s2 :type ast-node :initarg :s2)))Before continuing, we take an aside to discuss an interesting and useful construct.
- Let over Lambda
There is an interesting and powerful interaction between
letandlambda. So interesting that someone wrote a book with that title: Let Over LambdaFirst let's remind ourself that a function definition in Racket such as
(define (f n) (+ 3 n))is really just syntactic sugar for the form
(define f (lambda (n) (+ 3 n)))This latter form makes it explicit that we are first constructing a function and then assigning it to
f.But we could also construct the function in a slightly more complex way.
(define f (let ((a 3)) (lambda (n) (+ a n))))This has exactly the same effect.
But if we now bring mutation into the mix, we get interesting consequences.
(define counter (let ((n (box 0))) (lambda () (begin (set-box! n (+ 1 (unbox n))) (unbox n)))))Each time we call the
counterfunction, it updates the value in the box before returning that value.> (counter) 1 > (counter) 2 > (counter) 3 >
Note how it is important that the box is constructed outside the
lambda. If we has put it inside thelambda, a new box would have been constructed every time we call the function.(define bad-counter (lambda () (let ((n (box 0))) (begin (set-box! n (+ 1 (unbox n))) (unbox n)))))> (bad-counter) 1 > (bad-counter) 1 > (bad-counter) 1 >
We will use such a counter written in Lisp to generate addresses of memory locations.
- Interpreting mutation
Let's go to work to interpret our box operations.
We need to add a new variant of
valueto hold the location in memory where we store a box. We call itlocval(defclass locval (value) ((loc :type Location :initarg :loc)))Now, before writing our interpreter for boxes, let's examine a bit more closely how they work.
Here is an example.
(let ((b (box 0))) (begin (begin (set-box! b (+ 1 (unbox b))) (set-box! b (+ 1 (unbox b)))) (unbox b)))It evaluates to 2. But note that the two expressions with
set-box!are identical. But the value in the box has changed between them.How about this example?
(let ((b (box 0))) (+ (begin (set-box! b (+ 1 (unbox b))) (unbox b)) (begin (set-box! b (+ 1 (unbox b))) (unbox b))))Now the two
beginexpressions are identical but they produce different values! Perhaps we will need to change the environment? But the environment is not passed from the firstbeginexpression to the second. The environment is only passed to subexpressions not to sibling expressions. Do we need to change that?If we use the environment to pass information to siblings, we could get into trouble with an escape of binding from lexical scope. Something like this:
(+ (let ((b (box 0))) 1) b)This should lead to an error since the final occurrence of
bis unbound. Should we remove all bindings when sequencing? That won't work since we have closures. Here's an example.(let ((a (box 1))) (let ((f (lambda (x) (+ x (unbox a))))) (begin (set-box! a 2) (f 10))))Somehow we need to communicate the change made by the
set-box!to the call tof.This leads us to ask ourselves, is the environment even the place to keep information about mutations? Its purpose is to carry information about substitutions and to respect lexical scope.
We introduce a new data structure which we call the store. In the same way as the environment is responsible for managing lexical scoping, the store will be responsible for managing mutation. Like the environment, the store will be a partial map.
Looking up the value associated with a variable now becomes a two step process:
- We look for a location associated with the variable in the environment, then
We look for the value stored at that location in the store.
We will use the same underlying structures for our environments and stores but we will give them different names to distinguish them.
;;; Environment (alist: name → location or value) ;;; Defined in tolk/utils (defparameter empty-env nil) (defun extend (var val env) "Extend the environment ENV with a binding of VAR to VAL." (cons (cons var val) env)) (defun lookup (var env) "Lookup the value associated with VAR in ENV." (or (cdr (assoc var env)) (error "Environment does not contain ~a" var))) ;;; Store (alist: location → value) ;;; Defined in tolk/utils (deftype Location () '(integer 0 *)) (defparameter empty-store nil) (defun override (location val store) "Override the value at LOCATION in STORE to VAL." (cons (cons location val) store)) (defun fetch (location store) "Fetch the value at LOCATION in STORE." (cdr (assoc location store))) (let ((address 0)) (defun new-location () "Return a new location in the store." (incf address)))Now the
Valueassociated with a box will be aLocation.And we will need to pass around through our interpreter not only the environment but also the store. The environment will be updated to reflect the lexical scope and the store will be updated to reflect mutations.
Since the
interpretfunction will need to pass around not only values but the store too, we will use Common Lisp's facility to return not only one value from a function, but any number (up to a certain large limit). - Multiple values
An example of a built-in Common Lisp function that returns multiple values is floor.
> (floor 7/2) 3 1/2
It returns two values, in this case the whole and fractional parts of the number.
If we use
floorin a context where one value is expected, the first returned value is used.(+ 10 (floor 7/2))But if we want to use both returned values we can bind them with the
bindconstruction from the packagemetabang-bind(bind (((:values whole part) (floor 7/2))) (format nil "The whole part was ~a and the fractional part was ~a" whole part))bindalso works for single values and can refer to bindings previously made.(bind ((a 88) (b (+ a 800)) ((:values c d) (floor b 10))) (list a b c d))(88 888 88 8)
We will use
valuesto return two values fromtolk/mutation:interpret: the value itself and the updated store. Let's look at it case by case.- Numbers
To interpret a number we just return the numeric value itself and pass the store through unchanged.
(defmethod-bind interpret ((num n) env store) () (declare (ignore env)) (values (make-instance 'numval :n n) store)) - Lambdas
Similarly for a
lambda(defmethod-bind interpret ((lambda1 var body) env store) () (values (make-instance 'funval :parameter var :body body :env env) store)) - Variables
To interpret a variable, we look up its value using the two-stage process via environment and store.
(defmethod-bind interpret ((var v) env store) () (values (fetch (lookup v env) store) store)) - Sequencing
When we interpret the first of the two arguments in sequence, it may change the store so we had better pass along the output store when we are interpreting the second argument. On the way we discard the value we get when we interpret the first argument. We use the same environment for the interpretation of both arguments.
(defmethod-bind interpret ((seq2 s1 s2) env store) (((:values _ store1) (interpret s1 env store))) (interpret s2 env store1))Note the distinction between how we treat the environment and the store.
- We pass the environment down into subterms in a recursive descent fashion. It takes care of new bindings according to the lexical scoping rules of the language.
- We thread the store from one subterm through its siblings. This reflects any mutations made in bindings.
- Arithmetic operators
We will need to do something similar for addition.
(defmethod-bind interpret ((arith-binary-op op l r) env store) (((:values lval store1) (interpret l env store)) ((:values rval store2) (interpret r env store1))) ;; TBD This isn't quite right. Both should be nums or both bools. (multiple-value-ematch (values lval rval) (((or (numval :n xl) (boolval :b xl)) (or (numval :n xr) (boolval :b xr))) (values (make-instance 'numval :n (funcall op xl xr)) store2))))Remember that
bindmakes its bindings in order so the left argument toarith-binary-opis interpreted before the right.There is a slightly clumsy handling of binary operations here. The functions we have defined (
+,*and==) have different signatures. It would be good to separate these to allow us to add more functions. - Boxes
To interpret a box, we will need a new location to store its result. We first interpret its argument, then store that value at the new location in the store before returning the location value and the updated store.
(defmethod-bind interpret ((box arg) env store) (((:values val1 store1) (interpret arg env store)) (loc (new-location))) (values (make-instance 'locval :loc loc) (override loc val1 store1)))For unboxing, interpreting the argument should result in a location. We fetch what is at that location and return it together with the resulting store.
(defmethod-bind interpret ((unbox arg) env store) (((:values lv store1) (interpret arg env store)) ((:slots loc) lv)) (values (fetch loc store1) store1))And to set a box, we first interpret the box, which should result in a location. Then we interpret the value, threading the store as we go. We override the value at that location and return the value.
(defmethod-bind interpret ((setbox box contents) env store) (((:values boxval store1) (interpret box env store)) ((:values contentsval store2) (interpret contents env store1)) ((:slots loc) boxval)) (values contentsval (override loc contentsval store2))) - Function application
Previously, to interpret the application of a function we interpreted the function body in an environment extended with a binding of the function variable to the argument. How does the introduction of the store change this?
As before, we first interpret the function and then argument, but this time we thread the store from the former to the latter. Now we need a new location to store the binding of the function parameter.
We interpret the function body in the environment extended with a binding of the parameter to the new location and a store extended with a binding of that location to the value of the argument.
(defmethod-bind interpret ((apply1 fun arg) env store) (((:values funval store1) (interpret fun env store)) ((:values argval store2) (interpret arg env store1)) ((:slots parameter body env) funval) ; NB shadowing env ((:slots v) parameter) (loc (new-location))) (interpret body (extend v loc env) (override loc argval store2))) - Conditional
Interpreting
ifis straightforward. Note that either thethenbranch or theelsebranch gets interpreted but not both.(defmethod-bind interpret ((ift test then else) env store) (((:values testval store1) (interpret test env store))) (interpret (if testval then else) env store1)) - let
Since we can regard
letas just syntactic sugar for alambdaapplication, we rewrite it as such and interpret the result.(defmethod-bind interpret ((let1 var varex body) env store) () (interpret (make-instance 'apply1 :fun (make-instance 'lambda1 :var var :body body) :arg varex) env store))
- Numbers
Recursion
We have seen above that let does not support recursion because the
environment for the recursive call does not contain a binding for the function
itself. To address this, we introduce a letrec form that makes recursive
function definitions possible.
The letrec form has the same shape as let1 but uses a different evaluation
strategy: the value expression is interpreted in an environment that already
contains the binding for the variable being defined.
(letrec (fact (lambda n
(if (= n 0)
1
(* n (fact (- n 1))))))
(fact 5))
We add a letrec class to the AST in ./src/racket/recursive-functions.lisp.
(defclass letrec (ast-node)
((var :type var :initarg :var)
(value :type ast-node :initarg :value)
(body :type ast-node :initarg :body)))
The parser adds a letrec clause:
(`(letrec (,var ,value) ,body) (make-instance 'letrec
:var (make-instance 'var :v var)
:value (parse value)
:body (parse body)))
The interpretation of letrec uses the store machinery from the mutation
language to create a location for the new binding before interpreting the
value. This way the binding is in scope for recursive calls.
(defmethod-bind interpret ((letrec var value body) env store)
(((:slots (var-name v)) var)
(loc (new-location))
(new-env (extend var-name loc env))
((:values val store1) (interpret value new-env store)))
(interpret body new-env (override loc val store1)))
Note how new-env already contains the binding before value is interpreted,
and the store is updated afterwards with the actual computed val.
Tests for letrec include factorial, Fibonacci, and nested recursive
definitions.
(5am::explain! (5am:run 'tolk/tests/recursive-functions::letrec-factorial-direct))