Finding srcloc for modules in fully-expanded code

TL;DR: I can't figure out how to get original srclocs for spliced module+ forms in fully-expanded code.


Background: Racket Mode has a run command much like DrRacket's: it runs the module for the file, and optionally some submodules like main and test.

It also has a command to "run the module at point", which could be the outermost file module or any arbitrarily nested submodule. So, you run/enter that module in the REPL (in the sense of module->namespace). This can be another handy way to run test or main, or any submodule. (Sometimes I have (module+ example __) as an alternative to commented-out code examples.)

Currently this works by identifying module s-expression surface syntax. So it doesn't work at all for non-sexp langs like Rhombus.


Next: So now I want to make it work "the right way", as with check-syntax, by walking fully-expanded code.

And this is straightforward for things like:

#lang racket/base
(module m racket/base
  (module mm racket/base))
(module n racket/base)

You can imagine saving in some data structure the syntax-position and syntax-span of each fully-expanded module form, to support a "position->module-name" lookup function.
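To make that concrete, here is a minimal sketch of such a lookup. The mod-span struct and the position->module-name function are hypothetical names for illustration, not anything Racket Mode or check-syntax actually provides, and the positions are hand-computed for the example file above rather than taken from a real expansion.

```racket
#lang racket/base
(require racket/list) ; for argmin

;; Hypothetical record: `path` is the nesting path of module names;
;; `pos` and `span` are what syntax-position and syntax-span would report.
(struct mod-span (path pos span) #:transparent)

;; Return the path of the innermost module whose extent contains `pos`,
;; or #f when no recorded module contains it.
(define (position->module-name spans pos)
  (define hits
    (filter (λ (m)
              (and (<= (mod-span-pos m) pos)
                   (< pos (+ (mod-span-pos m) (mod-span-span m)))))
            spans))
  (and (pair? hits)
       ;; The innermost enclosing module is the one with the smallest span.
       (mod-span-path (argmin mod-span-span hits))))

;; Illustrative positions only (assumed, not from a real expansion).
(define spans
  (list (mod-span '(m) 19 48)
        (mod-span '(m mm) 43 23)
        (mod-span '(n) 68 22)))

(position->module-name spans 50) ; inside both m and mm => '(m mm)
(position->module-name spans 70) ; inside n => '(n)
(position->module-name spans 5)  ; inside no submodule => #f
```

Picking the smallest containing span is just one simple way to get "innermost"; an interval tree would be the obvious upgrade if the linear scan ever mattered.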

But module+ is tricky. To support interleaving tests, it has a neat splicing feature. So something like:

(module+ test)
(module+ test)
(module+ test)

ends up fully expanding to a single (module* test #f) form. Unfortunately, its srcloc seems to be only for the first original (module+ test) in the source. So if the user had point in one of the others, we couldn't know it was the test submodule.
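For anyone who wants to poke at this, here is a rough sketch of how to look at the expansion. It assumes the three forms are wrapped in an explicit (module example racket/base ...) so the string can be read with plain read-syntax; find-module*-forms is a hypothetical helper that matches on the symbol module* rather than on the binding, which is good enough for a quick look but not for real tooling.

```racket
#lang racket/base

;; Source with three module+ test forms inside one enclosing module.
(define src
  (string-append "(module example racket/base\n"
                 "  (module+ test)\n"
                 "  (module+ test)\n"
                 "  (module+ test))\n"))

;; Read the source as syntax, with positions enabled on the port.
(define (read-example)
  (define in (open-input-string src))
  (port-count-lines! in)
  (read-syntax 'example in))

;; Fully expand in a fresh namespace.
(define expanded
  (parameterize ([current-namespace (make-base-namespace)])
    (expand (read-example))))

;; Collect every subform, at any depth, whose head is the symbol module*.
;; (Hypothetical helper; compares symbols, not bindings.)
(define (find-module*-forms stx)
  (define lst (syntax->list stx))
  (cond
    [(not lst) '()] ; atoms and improper lists have no subforms to visit
    [else
     (define here
       (if (and (pair? lst)
                (identifier? (car lst))
                (eq? (syntax-e (car lst)) 'module*))
           (list stx)
           '()))
     (append here (apply append (map find-module*-forms lst)))]))

;; Print the srcloc and 'origin property of each module* form found.
(for ([m (in-list (find-module*-forms expanded))])
  (printf "~s: position ~s span ~s\n"
          (syntax->datum (cadr (syntax->list m)))
          (syntax-position m)
          (syntax-span m))
  (printf "  origin: ~s\n" (syntax-property m 'origin)))
```

Comparing the printed position and span against the offsets of the three (module+ test) forms in src is how I'd check which of the originals the srcloc corresponds to.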

I was excited to notice an 'origin syntax-property on the (module* ___) forms, which is a list of syntax objects, some of which come from the original source. Unfortunately they are syntax objects for just the module+ identifiers in the original source -- not syntax objects for the complete (module+ __) original forms.

So... I'll think about it more, but for now I'm stumped.

  1. Is there some self-help approach?

  2. If not, could a new syntax property be added for the spliced single form, to give a list of syntax objects (or at least srclocs) for all the original pre-spliced forms?

p.s. Although I'm experimenting with this in my "extra analysis" pass of pdb, for now, my goal would be to contribute this to drracket/check-syntax.

This change to module+ seems to suffice. "It works", but does it seem correct?

Also not sure about:

The property name, spliced-module-srcloc?

The property value, a srcloc struct; even though I don't think it needs to be preserved (serialized), would a srcloc vector still be better?

modified   racket/collects/racket/private/submodule.rkt
@@ -27,7 +27,10 @@
                 (set-box! stxs-box
                           (cons (append (reverse (syntax->list (syntax-local-introduce #'(e ...))))
                                         (car (unbox stxs-box)))
-                                (cons (syntax-local-introduce stx) (cdr (unbox stxs-box))))))
+                                (cons (syntax-property (syntax-local-introduce stx)
+                                                       'spliced-module-srcloc
+                                                       (syntax-srcloc stx))
+                                      (cdr (unbox stxs-box))))))
               (syntax/loc stx (begin)))])]
         [else
          (raise-syntax-error #f

To make things easy in case it's OK as-is, I went ahead and submitted that as a pull request.