Check-syntax arrow with rename-in

Hovering on get-pure-port, two arrows are drawn to it:

  1. From importing net/url.

  2. From the rename-in of get-pure-port to a different name --- in an unrelated submodule m that isn't even in scope for the main, file module. (Also, xxx in that module is highlighted as related.)

I don't understand the second arrow. It seems to say, "This use of get-pure-port is a binding from two places at once". Is this as-expected for other people, and if so, can you help me understand why? Maybe I have a squishy idea what the arrows are supposed to mean in this case. (Otherwise, if it's just a bug, I'll move this to GitHub issues for DrRacket instead. I just wanted to get a sanity check, first.)

This looks like a bug to me!

Thanks for that quick feedback!

Trying some variations, it seems to be related to rename transformers, only.

For example the confusing extra arrow doesn't appear when the sub module m does (require net/url) or (require (only-in net/url get-port-port)).

But it does appear with (require net/url (rename-in [get-pure-port XXX])) (as in my screenshot), and also (require net/url (only-in [get-pure-port XXX])) (which IIRC expands to something similar).

I think it also seems to happen when the exporting file uses rename-out.

Perhaps when a rename transformer is involved, some extra scope gets applied... and either that isn't quite right, or, it's right but check-syntax needs to account for that, somehow.

I'll open an issue, and see how far I can get, myself.

@robby This seems to be introduced by commit 00d3092?

It seems that although add-binders is given mods, it doesn't use this to distinguish the bindings in separate modules, in this case? I'm not clear how/why it's a problem in this case but not generally.

I'm not sure if this is right, but it looks to me like this case (of the function that traverses the syntax) is getting at the issue. It looks like, for this slightly simpler version of the program from the start of this thread:

#lang racket/base
(require net/url)
(module+ m
  (require (rename-in net/url [get-pure-port ABC])))

hits that case with (syntax-e #'lang) being #f and thus we don't add module-name to the mods argument and thus everything inside the submodule is treated as if it were in the containing module.

So...that suggests to me that a bunch of stuff needs to change to fix all this, namely that we need to keep track of where the binders are and make sure that arrows go only from outer modules to inner one (or stay at the same level) and not from inner ones to outer ones.

I'm not sure, however, if this is right or if there is possibly a better way to do this, handing more work off to syntax objects somehow. I wonder what @mflatt thinks!

1 Like

Looking at this program (that doesn't export via the contract system but does have the same bad result):

#lang racket/base
(require racket/list)
(module+ m
  (require (rename-in racket/list [first ABC])))

in the macro stepper, it seems like the scopes on first should be differentiating things here. That is, in the fully expanded term I see that the first that's in the submodule has a "module" and "module m" scopes that the first in the outer module doesn't have, and the one in the outer module has a "local" and "intdef" scopes that the one in the inner module doesn't have.

All that's to say that probably check syntax's management of the syntax objects is wrong somehow and the mod argument that ends up in the level+tail+mod-loop function is probably fine. Not sure I'm any closer to a bug fix but I do think that this refutes my suggestion just above!

This seems wrong, naïvely, since it doesn't account for module (which, IIRC, can be required from the containing module). Perhaps you meant "only from outer modules to inner module*s"? But you seem to have landed on a different approach later, so ignore this if it isn't helpful.

That's a great idea to look at an example not wrapped by contracts; much simpler expanded code.

Thinking out loud here, sanity check from reading code I don't fully grok:

Check-syntax uses various syntax/id-table tables, which are hash-tables where the keys are compared using free-identifier=?. This knows about scopes. So generally it's fine to toss in any number of identifiers whose symbol value is the same (like "first"), and they'll be distinguished appropriately. Also you can toss in identifiers with differing symbol values, and if they actually refer to the same binding, they'll be treated the same.

Brief pause for someone to correct my terminology... or lack of any basic understanding whatsoever. :slight_smile:

Because the various things in this example have differing scopes -- and they do seem to, inspecting with the macro stepper -- we expect them to be distinguished appropriately in the table, and therefore to get disjoint sets of arrows. But... we're not. It should all "just work"... but it isn't.

Is that a good summary of the mystery so far?

I guess one question is, what's different about the identifiers participating in import/export rename transformers? Is the #%require sub-form rename contributing to this somehow??

Oh. Maybe this is as-intended for free-identifier=?:

Returns #t if a-id and b-id access the same local binding, module binding, or top-level binding—perhaps via rename transformers—at the phase levels indicated by a-phase-level and b-phase-level, respectively. A #f value for a-phase-level or b-phase-level corresponds to the label phase level.

“Same module binding” means that the identifiers refer to the same original definition site, and not necessarily to the same require or provide site. Due to renaming in require and provide, or due to a transformer binding to a rename transformer, the identifiers may return distinct results with syntax-e.

Maybe this could "just work" if check-syntax used a variant of identifier tables that uses some TBD predicate other than free-identifier=?.

Otherwise I guess check-syntax would need to do per-module partitioning on its own. Such as some additional level of hash-tables leading to the identifier table, like phase -o> module list -o> identifier table.

Yeah, I think you're right @greghendershott . I'll try to see if I can do something about it.

Okay, I started on this and (just in my own copy) have made changes so that the modules are tracked properly as check syntax is recurring through the term (the mods argument to level+tail+mod-loop was not getting extended when the submodule had #f as the language name before).

My thinking was that I would associate the list of enclosing modules with each binder and then refuse to add an arrow if the binder was in a more nested module than the occurrence of the variable. My question is: should I be doing this for every possible binder, or only those that appear in rename-in in modules? In this program:

#lang racket/base
(require racket/list)
(define f 1)
(module+ m
  (define f 1)
  (require (rename-in racket/list [first 1st])
  (module+ m
    (define f 1)
    (require (rename-in racket/list [first 1st])

there are extra arrows from the racket/lists that aren't in the rename-in and extra arrows from the 1sts. There are not extra arrows from the defines to the fs (indeed I cannot make arrows from define misbehave with submodules).

I'm thinking that require is special as it introduces an binding that's the same binding in multiple places, whereas things like define introduce new binders that shadow and that's why the current arrows are wrong.

But this doesn't tell me if I should specially treat require-based binding in this way or if I should treat all binding occurrences in this way (where "this way" means coupling the binding with the list of modules that enclose it and then ensure I don't go "upwards" in that list).

Any opinions?

Thanks for digging into this!

  1. The free-indentifier=? docs mention rename transformers as another special case (in addition to renaming imports and exports) so although I don't have an example yet I wonder about that, too?

  2. To be fair, I could see an argument that the arrows should report what free-identifier=? thinks. Presumably it works that way for good reasons, and maybe its perspective would be useful to people in editors (?). On the other hand, I lack that perspective and want the arrows to make sense "intuitively". To me, that means the rename-in behaves more like a let or define of a new name (as well as hiding the old name). So in something like (module m (require (rename-in racket/list [first 1st])) 1st) I would expect to see arrows between all of:

  • racket/list and first
  • the old/new parts of the rename sub-clause: first and 1st
  • the rename sub-clause's introduction of 1st, and uses of 1st based on module visibility rules

And I would not expect to see arrows that "violate" module boundaries.

To me, that's the most thorough and "intuitive" set of arrows.

(Also, it could let some tools work better, e.g. that help people do multi-file renames that, even in the face of arbitrarily long chains of renaming exports and imports, are properly limited to sub-graphs of the full "name introduction graph". Which is something I keep working on... but setting aside.)

TL;DR I'm not 100% sure but I think the right answer is that all binding arrows should respect module boundary nesting? (With a slight worry I don't appreciate why free-identifier=? behaves like it does.) What do you think?

I think you are totally right about arrows violating module boundaries. Here's an argument: if you take this program (which currently draws an arrow from the second require to the first) and remove the first require, you get a syntax error. So the upwards-pointing arrow that lands on first must be bogus and thus shouldn't be drawn.

#lang racket/base
(require racket/list)
(module+ m (require racket/list))

I guess this has me leaning towards adding a list of modules to every binder internally in Check Syntax so that if someone manages to write some clever macro that smuggles an identifier from one module to another and sets up some binding structure that matches what require is doing, then they won't get an arrow because, presumably, it doesn't really bind in the sense that the second require in the example in this message doesn't really bind.

PS: an example with rename transformers would also be super useful; I agree about that!

Okay, this seems to work:

but I've noticed that module+ isn't getting an arrow but module is, eg in this program:

#lang racket
(module m racket)
(module+ n)

I'll probably try to get that sorted out before pushing to master because a bunch of test cases'll change with that change.

This rename transformer example produces a variety of arrows and highlights, some of which I don't understand:

#lang racket/base
(module n racket/base
  (require (for-syntax racket/base))
  (define-syntax when-not (make-rename-transformer #'unless))
  (unless #f 42)
  (when-not #f 42))

In my version with the changes, I'm getting an additional (bogus) arrow from the last unless to the racket/base on the first line! So that's a timely test case! :slight_smile:

For the bindings from the quoted syntax object, @ryanc actually added that code to check syntax with the comment "connect every identifier inside a quote-syntax to each binder, at any phase, in any submodule" and I guess the idea is that someone can later grovel over the syntax object and make any of those purple arrows into blue ones somehow, if that syntax object makes its way into a macro transformer?

I'm not sure what to think about the green highlighting on both the unless and the when-not. That's basically saying that they are the same identifier (despite the different spelling). I see also that renaming changes all of them, not just the ones that are spelled the same as the one you invoke the renaming from.

OK, and that seems to correspond to the syncheck:add-arrow actual? argument.

Probably I need to pay attention to that, because in Emacs I add an annotation "defined locally" when require-arrow? is false. However in this case that results in an annotation for e.g. unless that it is "defined locally; imported from racket/base" -- which is a bit contradictory. :smile: I'll make use of actual?.

Well I'm sure we can figure this out easily; "equivalence" is one of the most straightforward concepts in computer science :wink:.

Seriously I'm not sure what to think here, except "it depends". For some purposes it's useful to think of renamed things as being the same, and for others, different.

At least the consumer (whether human or software tool) of these arrows can easily do a symbol compare, when they want to treat them as "different". So probably it's best to draw the arrow or make the highlight, in this example. (I think the main wonky thing is when the arrows cross module boundaries in non-intuitive ways, where untangling that later is not always so easy.)

As I've just been staring at this code, I believe that require arrow is not true when you use rename-in to bind the identifier. This is not completely without sense as the binder is actually written there in the code somehow, but if you want different information I can also add more information in this case somehow (or change things if that is better, I suppose!)

Looks like the module+ that gets left in the origin field isn't syntax-original?. Here's some code that demonstrates that.

#lang racket
(define p (open-input-string "#lang racket/base\n(module+ m)"))
(port-count-lines! p)
(define stx
  (parameterize ([current-namespace (make-base-namespace)]
                 [read-accept-reader #t])
   (read-syntax "file.rkt" p))))
(let loop ([stx stx])
    [(pair? stx)
     (loop (car stx))
     (loop (cdr stx))]
    [(identifier? stx)
     (when (equal? (syntax-e stx) 'module+)
       (printf "original? ~s ~s\n" (syntax-original? stx) stx))]
    [(syntax? stx)
     (loop (syntax-e stx))
     (loop (syntax-property stx 'origin))]))