UP | HOME

Tolk Python AST

Table of Contents

Definition

The tolk/python AST (Abstract Syntax Tree) is a CLOS class hierarchy representing the structure of parsed Python programs. Every node class inherits from node, which splits into two principal subtrees:

expr
expressions that evaluate to values, and
stmt
statements that perform actions.

This page catalogues every AST class, its slots, and its role.

Class Hierarchy

node                            ← abstract base (no slots)
├── expr                        ← expression: evaluates to a value
│   ├── constant                ← literal value
│   ├── binop                   ← binary operator (+, -, *, /)
│   ├── unaryop                 ← unary operator (not, usub, uadd)
│   ├── compare                 ← comparison (==, !=, <, >, etc.)
│   ├── boolop                  ← boolean operator (and, or)
│   ├── ifexpr                  ← ternary conditional
│   ├── name                    ← variable reference or target
│   ├── lambda-expr             ← lambda expression
│   ├── call                    ← function call
│   ├── list-expr               ← list literal [...]
│   └── subscript               ← indexed access arr[i]
│
├── stmt                        ← statement: performs an action
│   ├── assign                  ← variable assignment
│   ├── fundef                  ← function definition (def)
│   ├── return-stmt             ← return statement
│   ├── if-stmt                 ← if/elif/else statement
│   └── subscript-assign        ← indexed assignment arr[i] = val
│
├── stmts                       ← block: sequence of statements
├── slice                       ← slice start:stop:step (not an expr)
│
└── (runtime values — not AST nodes)
    ├── closure                 ← function value at runtime
    └── builtin                 ← built-in function wrapper

Expression Nodes (expr)

All expression nodes inherit from expr, which inherits from node.

constant

A literal value.

Slot Type Description
:value integer / boolean / :none The literal value

Examples: 42constant with :value 42; Trueconstant with :value t; Noneconstant with :value :none.

binop

A binary operation. Used for +, -, *, /.

Slot Type Description
:left expr Left-hand operand
:op symbol :add, :sub, :mult, or :div
:right expr Right-hand operand

Operator precedence and associativity are encoded in the grammar, not in the node. The AST records the resolved tree shape: 2 + 3 * 4 produces a binop whose :right is itself a binop.

unaryop

A unary operation. Used for not, unary +, unary -.

Slot Type Description
:op symbol :not, :usub (unary minus), or :uadd (unary plus)
:operand expr The operand expression

compare

A comparison operation. Carries one operator and one comparator.

Slot Type Description
:left expr Left-hand side of comparison
:ops list A list of one comparison symbol: :eq, :noteq, :lt, :lte, :gt, :gte
:comparators list A list of one expr (the right-hand side)

:ops and :comparators are lists for compatibility with Python's ast.Compare AST (which supports chained comparisons like 0 < x < 10). However the tolk parser only ever produces one-element lists. Chained comparisons are not implemented.

boolop

A boolean operation with short-circuit semantics.

Slot Type Description
:op symbol :and or :or
:values list List of expr operands (at least 2)

and and or flatten to a single boolop node per operator (not binary nesting): A and B and C produces one boolop with three :values.

ifexpr

A ternary conditional expression (body if test else orelse).

Slot Type Description
:test expr The condition
:body expr Value when test is truthy
:orelse expr Value when test is falsy

Note the ordering. In Python surface syntax the body comes first (x if True else y), but the slot names follow Python's ast.IfExp: test, body, orelse.

name

A variable reference or assignment target.

Slot Type Description
:id symbol The variable name, uppercased (:X, :FOO_BAR, :_PRIVATE)
:ctx symbol :load (reading the variable) or :store (target of assignment)

This mirrors Python's ast.Name node. The :ctx slot distinguishes uses from definitions, essential for correct scope analysis.

lambda-expr

An anonymous function expression.

Slot Type Description
:params list List of parameter symbols (:X, :Y). Empty list for no params
:body expr The single-expression body

call

A function call.

Slot Type Description
:func expr The function being called (often a name)
:args list List of expr arguments (may be empty)

Calls can be chained via successive call nodes: f(1)(2) produces a call whose :func is itself a call.

list-expr

A list literal.

Slot Type Description
:elements list List of expr elements. nil for empty list.

subscript

An indexed access expression (arr[index] or arr[slice]).

Slot Type Description
:obj expr The collection being indexed
:key expr or slice The index expression or slice node

Statement Nodes (stmt)

All statement nodes inherit from stmt, which inherits from node.

assign

Variable assignment.

Slot Type Description
:target name A name node with :ctx :store (the variable being assigned)
:value expr The value expression

fundef

Function definition (def f(params): body).

Slot Type Description
:name symbol The function name (:F, :ADD, :HELLO)
:params list List of parameter symbols (:X, :Y)
:body stmt or expr The function body

The body may be a stmts block (multi-line body indented below def), a return-stmt, or a single expr. The parser determines which based on what follows the colon.

During execution, a fundef creates a closure runtime value and binds it to the function name in the current environment.

return-stmt

Return from a function.

Slot Type Description
:value expr The value to return

if-stmt

An if/elif/else statement.

Slot Type Description
:test expr The condition
:body stmts The then-branch (a block)
:orelse nil, stmts, or if-stmt The else/elif branch. nil means no else.

elif is desugared in the parser: each elif becomes a nested if-stmt in the previous branch's :orelse slot. This is semantically equivalent to Python's behaviour where elif is syntactic sugar.

subscript-assign

Indexed assignment (arr[i] = value).

Slot Type Description
:target name A name node (the array variable)
:index expr The index expression
:value expr The value to store at that index

Block Node (stmts)

A sequence of statements, used for indented blocks. Not a stmt itself but a node (it's neither an expression nor a single statement).

Slot Type Description
:statements list List of statement ASTs (stmt and expr nodes)

Created by the .block parser when it encounters INDENT … DEDENT tokens. Used as the body of if-stmt and fundef nodes.

Slice Node (slice)

Represents a slice in a subscript. Not an expr (it does not evaluate to a value on its own); it is a structural node used only within subscript.

Slot Type Description
:start nil or expr Start index. nil means "from beginning"
:stop nil or expr Stop index. nil means "to end"
:step nil or expr Step size. nil means "step by 1"

All three slots are optional, reflecting Python's flexible slice syntax: arr[:], arr[1:], arr[:5], arr[1:5:2], etc.

Runtime Values (Not AST Nodes)

These CLOS classes represent values at runtime, not parse trees.

closure

A user-defined function value created by def or lambda.

Slot Type Description
:params list Parameter symbols
:body node The function body (AST, evaluated at call)
:env alist The captured lexical environment

Closures are first-class: they can be passed as arguments and returned from other functions. The captured :env is the environment at the point of definition (lexical scoping).

builtin

A built-in function wrapping a Lisp lambda. Created by default-env.

Slot Type Description
:name string The Python name of the builtin (e.g. "len")
:func function A Lisp lambda implementing the builtin

Builtins are distinguished from closures in interpret: calls to a builtin invoke the stored lambda directly, passing all arguments as a list.

Key Design Choices

node / expr / stmt Trichotomy

Mirrors Python's own ast.AST / ast.expr / ast.stmt distinction. This enables method dispatch to distinguish expressions (which produce values) from statements (which produce effects), matching PLAI's approach where type-case discriminates on the variant tag.

:id Symbols Are Uppercased

All name, fundef, lambda-expr identifier symbols are uppercased (:X, :FOO_BAR). This is done by the .identifier parser function (intern ... :keyword with string-upcase). It means case sensitivity follows Common Lisp convention (case-insensitive but canonical uppercase), diverging from Python's case sensitivity.

Slots as Initargs

Every slot is accessible via :initarg keyword. Object construction uses make-instance, which is verbose but type-safe. The test suite uses eqo (structural equality) to compare constructed ASTs against parse results.

:ctx on name Nodes

The :load / :store distinction on name nodes is directly borrowed from Python's AST. It allows the interpreter to know from context alone whether a name occurrence is a read or a write, which matters for scope resolution (the store side creates bindings; the load side looks them up).

Pre-Allocation of Names

Before executing a program, collect-names scans all top-level assign and fundef nodes to extract names, then pre-allocate creates store slots for all of them. This implements the letrec pattern: all names are bound (to :uninitialized) before any body is evaluated, enabling mutual recursion and forward references.

AST vs. Python's ast Module

Aspect tolk/python Python ast module
constant One class for all literals Separate Num, Str, NameConstant etc.
binop Single binary operator class Separate BinOp, Add, Sub etc.
boolop Flat list of operands Binary nesting (left-recursive)
fundef Single body slot (expr/stmt/block) Separate body list in FunctionDef
Module wrapper List of statements Module(body[…])=
Location info None lineno, col_offset everywhere

The tolk AST is deliberately simpler. It omits source location tracking (lineno, col_offset), does not preserve surface-syntax details (e.g., Python's separate Add/Sub classes for each operator), and uses a single constant class for all literal values rather than separate classes per literal type.

Sources

  • sources/advprog/tolk/src/python/ast.lisp — AST class definitions
  • sources/advprog/tolk/src/python/parser.lisp — parser grammars that construct AST nodes
  • sources/advprog/tolk/src/python/execute.lispparse-program, collect-names
  • sources/advprog/tolk/tests/python/parsing.lisp — parser tests with AST construction

Related