UP | HOME

Parsing

Table of Contents

Definition

Parsing is the transformation of a character stream (surface syntax) into a structured internal representation (typically an AST). It is the first phase of an interpreter pipeline.

Parsing is a large and complex field (ambiguity, operator precedence, error recovery). PLAI deliberately minimises its treatment of parsing by exploiting Lisp's s-expression syntax.

The Bicameral Design (PLAI Ch.2)

PLAI's approach splits parsing into two levels:

  1. Lower level (reader): Racket's read / quotation converts character streams into s-expressions (nested lists of symbols, numbers, strings). This is unambiguous because input is fully parenthesised.
  2. Upper level (parser function): A parse function converts s-expressions into typed AST nodes, performing well-formedness checking.

In Common Lisp, the built-in reader plays the same role as Racket's read. The upper-level parser uses ematch with quasiquote patterns.

Tolk's Parser

(defmethod parse (sexp)
  "Upper-level parser: s-expression → AST"
  (ematch sexp
    ((type number) (make-instance 'num :n sexp))
    (`(+ ,l ,r) (make-instance 'plus :l (parse l) :r (parse r)))
    (`(* ,l ,r) (make-instance 'mult :l (parse l) :r (parse r)))))

(defmethod parse ((p pathname))
  "Pathname method: reads a .rkt file, returns list of parsed ASTs"
  (mapcar #'parse (read-racket-file p)))

The quasiquote patterns (+ ,l ,r) match a three-element list whose first element is the symbol +. This is the fare-quasiquote extension enabling destructuring in trivia patterns, activated by the named-readtables declaration in-readtable :fare-quasiquote at the top of each source file.

PLAI's Parser (plai-typed)

(define (parse [s : s-expression]) : ArithC
  (cond [(s-exp-number? s) (numC (s-exp->number s))]
        [(s-exp-list? s)
         (let ([sl (s-exp->list s)])
           (case (s-exp->symbol (first sl))
             [(+) (plusC (parse (second sl)) (parse (third sl)))]
             [(*) (multC (parse (second sl)) (parse (third sl)))]
             [else (error 'parse "invalid list input")]))]
        [else (error 'parse "invalid input")]))

Racket's plai-typed requires explicit casts (s-exp->number, s-exp->symbol etc.) because s-expressions have their own distinct type. Common Lisp requires no such casts.

Comparison

Aspect PLAI (plai-typed) Tolk (Common Lisp)
First phase Racket read / quotation CL reader / quotation
Second phase parse with cond + predicates parse with ematch + patterns
Type casting Required (s-exp->number etc.) Not needed
Pathname support No built-in Separate defmethod on pathname

Sources

Related