@notjack
@mflatt @ryanc I'm attempting to give Resyntax more and better information from the macro expander, and to do so I'm taking a closer look at a lot of the
log-expand
events emitted by the expander. I'm aware of the "event lexer and parser" that the macro stepper uses[1], but I still don't understand a lot of the individual event types. And I definitely don't understand what the overall grammar is for event streams. Could I get a high-level explanation of how the expander's event stream is structured?
[1] https://github.com/racket/macro-debugger/blob/master/macro-debugger-text-lib/macro-debugger/model/deriv-tokens.rktin particular, what subsequence of events corresponds to a "single expansion step", in the sense of what
expand-once
does?
@ryanc
The events are designed to be enough information to reconstruct the macro expansion process as represented by the structures in "deriv-c.rkt". It's an internal protocol, likely to change when the expander changes or when the macro stepper changes. The derivation structures are also likely to change, but probably in more tractable ways.
@notjack
oh yes, I'm definitely aware I'm hacking on internals that are subject to change
so far I've been getting away with just plucking syntax objects out of'visit
events
@ryanc
You can experiment with the event traces and derivation structures with this:
(require macro-debugger/model/debug) (trace-verbose? #t) (trace #'(or 1 2) expand)
First line requires the necessary support. Second line prints out the type of event (but not the payload) as it is received. Third line calls the
expand
function on the given syntax, collects the events, and parses a derivation structure. It looks likeexpand-once
doesn't work as a second argument; typicallyexpand
orexpand/compile-time-evals
is used.
@notjack
I put together my own tool that's similar to that already, actually
(require racket/pretty rebellion/collection/vector/builder) (struct expansion-step (preceding-events visited-form) #:transparent) (struct expansion-event (signature value) #:transparent) (define (expansion-history stx) (define current-expand-observe (dynamic-require ''#%expobs 'current-expand-observe)) (define steps (make-vector-builder)) (define events-since-last-visit (make-vector-builder)) (define (observe-event! sig val) (cond [(equal? sig 'visit) (vector-builder-add steps (expansion-step (build-vector events-since-last-visit) val)) (set! events-since-last-visit (make-vector-builder))] [else (vector-builder-add events-since-last-visit (expansion-event sig val))])) (parameterize ([current-expand-observe observe-event!]) (expand stx)) (build-vector steps)) (pretty-print (expansion-history #'(module foo racket/base (+ 1 2 3))))
but it's definitely wrong because of things like local expansion. I can see that the event stream has a tree structure to it of some kind, it's not just a flat sequence.
@notjack
@ryanc
You can experiment with the event traces and derivation structures with this:
Okay the derivation structs here are helpful for trying to get the tree structure. The acronyms in the names make things... less helpful.
@ryanc
Right, the expander is like a big-step interpreter, and the derivation structures represent the derivation tree for a particular expansion.
@notjack
The macro stepper GUI has to step through that tree in a linear fashion, right? How do you get from the tree structure to something resembling a linear sequence of steps? And does the stepper skip over local expansion?
I'm only vaguely familiar with the big-step/little-step terminology in PL semantics. I kinda remember that from experimenting with redex like a decade ago, but I don't have much experience with it.
@ryanc
For example:
(trace #'(#%plain-lambda (x) (#%plain-app add1 x)) expand)
. There's some getting-started junk at the beginning, but line 6 is where the main internal expand function gets started. That visit event means "the macro expander is given this syntax as an expression/definition/body form to expand". Next is resolve (resolving the#%plain-lambda
identifier to a binding). Next is enter-prim; we're about to expand a primitive syntactic form. Next is prim-lambda, which says the primitive in question is lambda. The structure that follows is specific to the lambda primitive. First, the formals list is checked and a renaming scope is created and applied (the lambda-renames event). Then the rest of the term is treated as a block and expanded according to the block rules (from enter-block to block->list). That corresponds to pass 1 of blocks' two-pass expansion; then pass 2 is from the enter-list event to exit-list. There are recursive calls to the main expand function within here: events 14-16 are an expand call that stops early because the expander recognizes an expression form in pass 1, and events 20-34 are the expansion of the application expression in the body (in pass 2). After expanding the body, the lambda primitive reassembles the lambda term, maybe applies some syntax properties, and returns (exit-prim/return).Yes, the macro stepper parses the events into a derivation tree, then interprets the derivation tree as a sequence of updates to a term. When I took the Semantics class from Mitch Wand when I started grad school, we used a simple lambda-calculus variant (maybe PCF, maybe ISWIM, maybe some ad hoc unnamed language) and we gave a big-step (interpreter-like) semantics and a small-step (rewriting) semantics. Then you want to prove them equivalent. The macro stepper's derivation-to-reduction-sequence component follows the spirit of that proof. (It's more complicated when macro hiding gets involved.)
@notjack
Ah I see, so the calls back to the main expander happen during the expansion of the primitive lambda form. Do macro transformations also work like that? I notice that the macro stepper in drracket skips over the entire
(#%plain-lambda (x) (#%plain-app add1 x))
form in a file with just#lang racket/base (#%plain-lambda (x) (#%plain-app add1 x))
in it, even when I disable all macro hiding.
@ryanc
You can think of a simple macro as a function that just computes a new term, without invoking the macro expander. The macro expander receives the result and immediately expands it. If a macro calls
local-expand
, then that call re-enters the macro expander.
@notjack
ahhh, so they're different from primitives in that way
@ryanc
Yes. Primitives are like macros that call
local-expand
in predictable ways, and whose results don't get re-expanded.
@notjack
Okay excellent that's very useful to know
I might try and make my own parser for the logged event streams, since Resyntax has very different needs from the macro debugger. I'll need to figure out which of the event types are delimiters that signal the start or end of some group of events. Do you have a list of just those event types? The lexer definition in the macro debugger doesn't make that clear, so I'm just guessing based on the names and on what the logged event streams look like. (edited)
@ryanc
I don't think there are tokens that act reliably as delimiters. I would not recommend trying to parse event streams. I would recommend starting with the derivation tree and pruning away the information you don't need.
@notjack
hmm. alright, I'll try that. I'm going to have quite a few questions for you though since a lot of the names of the derivation tree structs and their fields are very unclear to me.
@ryanc
no problem
@notjack
@ryanc Okay some questions so far:
- Am I right to assume that in the root
node
struct, thez1
andz2
fields, when not false, are syntax objects andz1
expands intoz2
?,- What's a tagrule?,
- In the
ecte
andlift-deriv
structs, I see that there'sfirst
andsecond
fields in each containing subderivations. Why are there two? What's the difference between them?,- Why does
ecte
have two lists of local actions, one inlocals
and one inlocals2
?,- What's the
de2
field in thebase
derivation for?,
@ryanc
- Re
z1
andz2
: usually, but they can also be lists of syntax objects for block (bderiv
) and "list of expression" (lderiv
) nodes.|,- A
tagrule
represents making an implicit syntactic form explicit, such as(+ 1 2)
becoming(#%app + 1 2)
; expansion continues with the second form. Likewise for#%datum
and#%top
.,- The
expand/compile-time-evals
procedure does two-pass expansion to try to mimic module-level and block-level expansion. (The top level is still hopeless, but this makes it slightly less hopeless.) Thefirst
field holds the first expansion pass (head-expansion), and thesecond
field holds the second pass. The firstlocals
appears (?) unused now; the second catches the effects of running the compile-time code (in the case of, eg, adefine-syntaxes
form).,- The
lift-deriv
node represents an expansion that lifted definitions usingsyntax-local-lift-expression
. Thelift-deriv
node is where the lifted definitions were caught; the actual lifting happens somewhere within thefirst
derivation. Thesecond
field is the re-expansion of the term after lifts are converted to definitions (or nestedlet-values
forlift/let-deriv
).,de1
is the "disarmed" version of the initial expression. I believe it should always beeq?
toe1
, but it's possible that there are some early-stage syntax property adjustments that I'm forgetting about thatde1
was also catching.,
@benknoble
Any volunteers to archive this info somewhere?
https://discord.com/channels/571040468092321801/893314076346826852/1387829848462065696
Copied and edited from original on Racket Discord by @spdegabrielle with permission from authors @notjack @ryanc.