Tolk Python AST
Table of Contents
Definition
The tolk/python AST (Abstract Syntax Tree) is a CLOS class hierarchy
representing the structure of parsed Python programs. Every node class
inherits from node, which splits into two principal subtrees:
expr- expressions that evaluate to values, and
stmt- statements that perform actions.
This page catalogues every AST class, its slots, and its role.
Class Hierarchy
node ← abstract base (no slots)
├── expr ← expression: evaluates to a value
│ ├── constant ← literal value
│ ├── binop ← binary operator (+, -, *, /)
│ ├── unaryop ← unary operator (not, usub, uadd)
│ ├── compare ← comparison (==, !=, <, >, etc.)
│ ├── boolop ← boolean operator (and, or)
│ ├── ifexpr ← ternary conditional
│ ├── name ← variable reference or target
│ ├── lambda-expr ← lambda expression
│ ├── call ← function call
│ ├── list-expr ← list literal [...]
│ └── subscript ← indexed access arr[i]
│
├── stmt ← statement: performs an action
│ ├── assign ← variable assignment
│ ├── fundef ← function definition (def)
│ ├── return-stmt ← return statement
│ ├── if-stmt ← if/elif/else statement
│ └── subscript-assign ← indexed assignment arr[i] = val
│
├── stmts ← block: sequence of statements
├── slice ← slice start:stop:step (not an expr)
│
└── (runtime values — not AST nodes)
├── closure ← function value at runtime
└── builtin ← built-in function wrapper
Expression Nodes (expr)
All expression nodes inherit from expr, which inherits from node.
constant
A literal value.
| Slot | Type | Description |
|---|---|---|
:value |
integer / boolean / :none |
The literal value |
Examples: 42 → constant with :value 42; True → constant with :value t;
None → constant with :value :none.
binop
A binary operation. Used for +, -, *, /.
| Slot | Type | Description |
|---|---|---|
:left |
expr |
Left-hand operand |
:op |
symbol | :add, :sub, :mult, or :div |
:right |
expr |
Right-hand operand |
Operator precedence and associativity are encoded in the grammar, not in
the node. The AST records the resolved tree shape: 2 + 3 * 4 produces a
binop whose :right is itself a binop.
unaryop
A unary operation. Used for not, unary +, unary -.
| Slot | Type | Description |
|---|---|---|
:op |
symbol | :not, :usub (unary minus), or :uadd (unary plus) |
:operand |
expr |
The operand expression |
compare
A comparison operation. Carries one operator and one comparator.
| Slot | Type | Description |
|---|---|---|
:left |
expr |
Left-hand side of comparison |
:ops |
list | A list of one comparison symbol: :eq, :noteq, :lt, :lte, :gt, :gte |
:comparators |
list | A list of one expr (the right-hand side) |
:opsand:comparatorsare lists for compatibility with Python'sast.CompareAST (which supports chained comparisons like0 < x < 10). However the tolk parser only ever produces one-element lists. Chained comparisons are not implemented.
boolop
A boolean operation with short-circuit semantics.
| Slot | Type | Description |
|---|---|---|
:op |
symbol | :and or :or |
:values |
list | List of expr operands (at least 2) |
and and or flatten to a single boolop node per operator (not binary
nesting): A and B and C produces one boolop with three :values.
ifexpr
A ternary conditional expression (body if test else orelse).
| Slot | Type | Description |
|---|---|---|
:test |
expr |
The condition |
:body |
expr |
Value when test is truthy |
:orelse |
expr |
Value when test is falsy |
Note the ordering. In Python surface syntax the body comes first (
x if True else y), but the slot names follow Python'sast.IfExp:test,body,orelse.
name
A variable reference or assignment target.
| Slot | Type | Description |
|---|---|---|
:id |
symbol | The variable name, uppercased (:X, :FOO_BAR, :_PRIVATE) |
:ctx |
symbol | :load (reading the variable) or :store (target of assignment) |
This mirrors Python's ast.Name node. The :ctx slot distinguishes uses from
definitions, essential for correct scope analysis.
lambda-expr
An anonymous function expression.
| Slot | Type | Description |
|---|---|---|
:params |
list | List of parameter symbols (:X, :Y). Empty list for no params |
:body |
expr |
The single-expression body |
call
A function call.
| Slot | Type | Description |
|---|---|---|
:func |
expr |
The function being called (often a name) |
:args |
list | List of expr arguments (may be empty) |
Calls can be chained via successive call nodes: f(1)(2) produces a call
whose :func is itself a call.
list-expr
A list literal.
| Slot | Type | Description |
|---|---|---|
:elements |
list | List of expr elements. nil for empty list. |
subscript
An indexed access expression (arr[index] or arr[slice]).
| Slot | Type | Description |
|---|---|---|
:obj |
expr |
The collection being indexed |
:key |
expr or slice |
The index expression or slice node |
Statement Nodes (stmt)
All statement nodes inherit from stmt, which inherits from node.
assign
Variable assignment.
| Slot | Type | Description |
|---|---|---|
:target |
name |
A name node with :ctx :store (the variable being assigned) |
:value |
expr |
The value expression |
fundef
Function definition (def f(params): body).
| Slot | Type | Description |
|---|---|---|
:name |
symbol | The function name (:F, :ADD, :HELLO) |
:params |
list | List of parameter symbols (:X, :Y) |
:body |
stmt or expr |
The function body |
The body may be a stmts block (multi-line body indented below def), a
return-stmt, or a single expr. The parser determines which based on what
follows the colon.
During execution, a fundef creates a closure runtime value and binds it
to the function name in the current environment.
return-stmt
Return from a function.
| Slot | Type | Description |
|---|---|---|
:value |
expr |
The value to return |
if-stmt
An if/elif/else statement.
| Slot | Type | Description |
|---|---|---|
:test |
expr |
The condition |
:body |
stmts |
The then-branch (a block) |
:orelse |
nil, stmts, or if-stmt |
The else/elif branch. nil means no else. |
elif is desugared in the parser: each elif becomes a nested if-stmt in
the previous branch's :orelse slot. This is semantically equivalent to
Python's behaviour where elif is syntactic sugar.
subscript-assign
Indexed assignment (arr[i] = value).
| Slot | Type | Description |
|---|---|---|
:target |
name |
A name node (the array variable) |
:index |
expr |
The index expression |
:value |
expr |
The value to store at that index |
Block Node (stmts)
A sequence of statements, used for indented blocks. Not a stmt itself but
a node (it's neither an expression nor a single statement).
| Slot | Type | Description |
|---|---|---|
:statements |
list | List of statement ASTs (stmt and expr nodes) |
Created by the .block parser when it encounters INDENT … DEDENT tokens.
Used as the body of if-stmt and fundef nodes.
Slice Node (slice)
Represents a slice in a subscript. Not an expr (it does not evaluate to a
value on its own); it is a structural node used only within subscript.
| Slot | Type | Description |
|---|---|---|
:start |
nil or expr |
Start index. nil means "from beginning" |
:stop |
nil or expr |
Stop index. nil means "to end" |
:step |
nil or expr |
Step size. nil means "step by 1" |
All three slots are optional, reflecting Python's flexible slice syntax:
arr[:], arr[1:], arr[:5], arr[1:5:2], etc.
Runtime Values (Not AST Nodes)
These CLOS classes represent values at runtime, not parse trees.
closure
A user-defined function value created by def or lambda.
| Slot | Type | Description |
|---|---|---|
:params |
list | Parameter symbols |
:body |
node |
The function body (AST, evaluated at call) |
:env |
alist | The captured lexical environment |
Closures are first-class: they can be passed as arguments and returned from
other functions. The captured :env is the environment at the point of
definition (lexical scoping).
builtin
A built-in function wrapping a Lisp lambda. Created by default-env.
| Slot | Type | Description |
|---|---|---|
:name |
string | The Python name of the builtin (e.g. "len") |
:func |
function | A Lisp lambda implementing the builtin |
Builtins are distinguished from closures in interpret: calls to a builtin
invoke the stored lambda directly, passing all arguments as a list.
Key Design Choices
node / expr / stmt Trichotomy
Mirrors Python's own ast.AST / ast.expr / ast.stmt distinction. This
enables method dispatch to distinguish expressions (which produce values)
from statements (which produce effects), matching PLAI's approach where
type-case discriminates on the variant tag.
:id Symbols Are Uppercased
All name, fundef, lambda-expr identifier symbols are uppercased (:X, :FOO_BAR).
This is done by the .identifier parser function (intern ... :keyword with
string-upcase). It means case sensitivity follows Common Lisp convention
(case-insensitive but canonical uppercase), diverging from Python's case
sensitivity.
Slots as Initargs
Every slot is accessible via :initarg keyword. Object construction uses
make-instance, which is verbose but type-safe. The test suite uses eqo
(structural equality) to compare constructed ASTs against parse results.
:ctx on name Nodes
The :load / :store distinction on name nodes is directly borrowed from
Python's AST. It allows the interpreter to know from context alone whether
a name occurrence is a read or a write, which matters for scope resolution
(the store side creates bindings; the load side looks them up).
Pre-Allocation of Names
Before executing a program, collect-names scans all top-level assign and
fundef nodes to extract names, then pre-allocate creates store slots for
all of them. This implements the letrec pattern: all names are bound (to
:uninitialized) before any body is evaluated, enabling mutual recursion
and forward references.
AST vs. Python's ast Module
| Aspect | tolk/python | Python ast module |
|---|---|---|
constant |
One class for all literals | Separate Num, Str, NameConstant etc. |
binop |
Single binary operator class | Separate BinOp, Add, Sub etc. |
boolop |
Flat list of operands | Binary nesting (left-recursive) |
fundef |
Single body slot (expr/stmt/block) |
Separate body list in FunctionDef |
| Module wrapper | List of statements | Module(body[…])= |
| Location info | None | lineno, col_offset everywhere |
The tolk AST is deliberately simpler. It omits source location tracking
(lineno, col_offset), does not preserve surface-syntax details (e.g.,
Python's separate Add/Sub classes for each operator), and uses a single
constant class for all literal values rather than separate classes per
literal type.
Sources
sources/advprog/tolk/src/python/ast.lisp— AST class definitionssources/advprog/tolk/src/python/parser.lisp— parser grammars that construct AST nodessources/advprog/tolk/src/python/execute.lisp—parse-program,collect-namessources/advprog/tolk/tests/python/parsing.lisp— parser tests with AST construction
Related
- Tolk ← cluster hub
- Tolk Arith — the arithmetic interpreter AST (s-expression based)
- Tolk Python Language — the Python fragment being parsed
- Abstract Syntax Tree — general concept
- Defmethod-bind — macro for methods on AST classes
- Python ast module documentation