Imports as completion candidates for hash-langs

Given

#lang rhombus
imp

with the cursor after "imp", if I either

  • In DrRacket: Press C-/.
  • In Emacs, with racket-xp-mode enabled: Press TAB (or trigger complete-symbol in some other manual or automatic way).

then the choices include

  • import: Good.
  • impersonate-box: Oops.

Using imported identifiers (from module and #%require in fully-expanded code) as a source of completion candidates has been, in my experience, reliable and useful for #lang racket. That's what racket-xp-mode is doing here. I assume similar in DrRacket?

But based on this example, it looks like this can't be the whole story for arbitrary hash-langs, including #lang rhombus.

Does a hash-lang need to be able to supply some kind of function to filter and/or transform the imported identifiers, to make them suitable as completion candidates? (Maybe omit some. Maybe transform some to be suitable, e.g. #{impersonate-box} IIUC for rhombus?)

Thoughts?


p.s. This program

#lang rhombus
#{impersonate-box}

gives the error impersonate-box: unbound identifier in: #{impersonate-box}. So maybe (as usual) I'm confused about what the problem even is here.

The fully expanded version of:

#lang rhombus
1

is

(module anonymous-module rhombus
  (#%module-begin
   (#%declare #:realm rhombus #:require=define)
   (#%app call-with-values (lambda () '1) print-values)
   (module configure-runtime racket/base
     (#%module-begin
      (module configure-runtime '#%kernel
        (#%module-begin
         (#%require racket/runtime-config)
         (#%app configure '#f)))
      (#%require rhombus/runtime-config)))))

That doesn't have any imports that involve racket/base, except in the submodules. So I don't know where either DrRacket or racket-mode is getting the relevant bindings.

OK I see now. The configure-runtime module uses racket/base and contributes these candidates.

The status quo approach in racket-xp-mode is to err on the side of more candidates. (The philosophy is to save typing. Not a perfect mad lib copilot.) So it throws candidates from all submodule langs and imports into one big pool.

Ignoring configure-runtime modules might be a reasonable tactical hack (for this example "it works").
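
To make that concrete, here is a rough sketch (not racket-xp-mode's actual code) that walks the fully expanded form above as a plain datum and lists which module languages and #%require specs would feed the pool. The name candidate-sources and the #:skip-configure-runtime? keyword are made up for illustration.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical helper, for illustration only: given a fully expanded
;; module as a plain datum, list the module languages and #%require
;; specs that would contribute completion candidates.
(define (candidate-sources form #:skip-configure-runtime? [skip? #f])
  (define (recur f)
    (candidate-sources f #:skip-configure-runtime? skip?))
  (match form
    ;; Optionally drop configure-runtime submodules entirely.
    [(list (or 'module 'module*) 'configure-runtime _ ...)
     #:when skip?
     '()]
    [(list (or 'module 'module*) _ lang (list '#%module-begin body ...))
     (define nested (apply append (map recur body)))
     ;; A module* form may have #f as its language; that adds nothing.
     (if lang (cons lang nested) nested)]
    [(list '#%require specs ...) specs]
    [_ '()]))

;; The expansion of the two-line rhombus program, as a datum.
(define expanded
  '(module anonymous-module rhombus
     (#%module-begin
      (#%declare #:realm rhombus #:require=define)
      (#%app call-with-values (lambda () '1) print-values)
      (module configure-runtime racket/base
        (#%module-begin
         (module configure-runtime '#%kernel
           (#%module-begin
            (#%require racket/runtime-config)
            (#%app configure '#f)))
         (#%require rhombus/runtime-config))))))

(candidate-sources expanded)
(candidate-sources expanded #:skip-configure-runtime? #t)
```

Without the flag, racket/base shows up as a source (hence impersonate-box). With it, only rhombus remains.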


The approach in pdb (still WIP) is a little smarter. It scopes candidates by submodule (and is module+ aware). Requests for completion candidates can supply a position. That avoids this.

Even so, pdb isn't currently attempting to limit candidates perfectly down to intra-module lexical scope.

And, when the file is in an error state, even pdb falls back on the "one big pool" approach -- you get all candidates from all modules, from the previous successful expansion. After all, if the file has an expansion error, strictly speaking we can't know in what submodule the user wants completion candidates.
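
Roughly, the lookup works like the sketch below. This is not pdb's real API: the mod-info struct and candidates-at are hypothetical names, and the sketch simplifies by returning only the innermost module's names (it ignores module* visibility of the enclosing module's bindings).

```racket
#lang racket/base
(require racket/list)

;; Hypothetical record: a module's source extent [start, end) and the
;; names imported into it.
(struct mod-info (name start end names) #:transparent)

;; pos is a source position, or #f when no usable position is known
;; (e.g. only a previous successful expansion is available).
(define (candidates-at mods pos)
  (define enclosing
    (filter (lambda (m)
              (and pos
                   (<= (mod-info-start m) pos)
                   (< pos (mod-info-end m))))
            mods))
  (cond
    ;; No enclosing module known: fall back to "one big pool".
    [(null? enclosing)
     (remove-duplicates (append-map mod-info-names mods))]
    ;; Otherwise use the innermost (smallest) enclosing module.
    [else
     (mod-info-names
      (argmin (lambda (m) (- (mod-info-end m) (mod-info-start m)))
              enclosing))]))

(define mods
  (list (mod-info 'top 0 100 '(import fun))
        (mod-info 'configure-runtime 50 60 '(impersonate-box))))

(candidates-at mods 10)  ; position inside the top module only
(candidates-at mods 55)  ; position inside the submodule
(candidates-at mods #f)  ; no position: everything
```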

Completion is weird this way -- the program is being edited and is almost always in a temporarily invalid state.

I think it might be better to ignore module subforms (and consider only module* submodules) for the purposes of providing completions outside the submodule.

Another probably good idea is to ignore submodules that have no source locations from the actual module.

I agree. pdb already does that -- tracks imports by module, understands module* nesting visibility, knows the srcloc extents of module forms (even module+ forms spliced from disjoint source spans).

The "tactical hack" idea is for racket-xp-mode's simplistic back end.

I'm trying to get to the point with pdb where racket-xp-mode can become a legacy thing used only by people who need to work with older versions of Racket.

(I'd prefer racket-xp-mode just change to use pdb on the back end. Unfortunately pdb needs some things added only in very recent versions of Racket. So for older Rackets, I need to keep the old racket-xp-mode back end for a while. I just don't want to invest too much time in enhancing it.)

p.s. I guess more broadly there's some entire other story when it comes to completion candidates and dotted names.

Looking at gui_demo.rhm we have things like math.max. OK. Let's make a simpler example:

#lang rhombus
math.max(1, 2)

If a user wanted to type "math." and get completions including max... I don't see how this is supposed to work.

First, the rhombus exports include what seems like a couple dozen phase+space variations. From looking at the result of

(parameterize ([current-namespace (make-base-namespace)])
  (namespace-require '(lib "rhombus"))
  (module->exports '(lib "rhombus")))

I see:

  • math is exported under phase+space (0 . rhombus/namespace). And FWIW check-syntax draws an arrow from it to the rhombus module language.

  • But max is not exported anywhere. And FWIW check-syntax doesn't draw an arrow for it.
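
For anyone who wants to poke at this, a small sketch that flattens the two values returned by module->exports into (phase+space . name) pairs. The helper name export-names is made up; I use racket/list as a stand-in so it runs without Rhombus installed, but passing '(lib "rhombus") should work the same way when it is.

```racket
#lang racket/base

;; Hypothetical helper: list every export of mod-path as a
;; (phase+space . name) pair, variables and syntax together.
(define (export-names mod-path)
  (define-values (vars stxs)
    (parameterize ([current-namespace (make-base-namespace)])
      ;; The module must be declared before module->exports can see it.
      (namespace-require mod-path)
      (module->exports mod-path)))
  (for*/list ([group (in-list (append vars stxs))] ; (cons phase+space entries)
              [entry (in-list (cdr group))])       ; (list name origins)
    (cons (car group) (car entry))))

(define names (export-names 'racket/list))
(length names)
```

A key like (0 . rhombus/namespace) would show up as the car of such a pair, so filtering by space is a one-liner from here.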

Is there some "rhombus-namespace->exports" function? If so, I could imagine discovering things like max that way. But even if so, is there some lang-independent mechanism?


p.s. The following is N/A for discovering completion candidates -- which needn't already be present in the user program, of course -- but just of note, in fully expanded code max is apparent but math isn't:

(module rhombus-math rhombus
  (#%module-begin
   (#%declare #:realm rhombus #:require=define)
   (#%app call-with-values (lambda () (#%app max (quote 1) (quote 2))) print-values)
   (module configure-runtime racket/base
     (#%module-begin
      (module configure-runtime '#%kernel
        (#%module-begin (#%require racket/runtime-config) (#%app configure (quote #f))))
      (#%require rhombus/runtime-config)))))

However max does have a syntax property 'rhombus-dotted-name with value 'math.max.
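
A tool could pick that property up with a plain walk over the expanded syntax. The sketch below fakes the property on a hand-built syntax object (so it runs without Rhombus); collect-prop is a hypothetical name, and the walk only descends into pairs, not vectors or boxes.

```racket
#lang racket/base

;; Hypothetical helper: collect (datum . property-value) pairs for every
;; syntax object under stx that carries the property key.
(define (collect-prop stx key)
  (define here
    (let ([v (syntax-property stx key)])
      (if v (list (cons (syntax->datum stx) v)) '())))
  (append here
          (let loop ([e (syntax-e stx)])
            (cond
              [(pair? e) (append (loop (car e)) (loop (cdr e)))]
              [(syntax? e) (collect-prop e key)]
              [else '()]))))

;; Stand-in for what the Rhombus expansion leaves behind.
(define max-id (syntax-property #'max 'rhombus-dotted-name 'math.max))
(define stx #`(#%app #,max-id '1 '2))

(collect-prop stx 'rhombus-dotted-name)
```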

I think ~0 thought has gone into APIs that could be used for completions in the rhombus context. One thing to note is that the expansion of

import rhombus/gui as G

expands to (roughly)

(for-meta
 0
 (portal
  G
  ((import (lib "rhombus/gui.rhm") (lib "rhombus/gui.rhm") mod-ctx) G G)))

which may be useful for finding identifiers to complete after G.

So IIUC there is some prefix mechanism for things like math.min.

And there is another prefix mechanism using portal syntax, with import as.

One confusion I have is that #%require documents portal as:

(portal portal-id content)

but there is no grammar for content, just this prose:

The portal form provides a way to define portal syntax at any phase level. A (portal portal-id content) defines portal-id to portal syntax with content effectively quoted to serve as its content.

On the one hand "quoted anything" seems super flexible for a lang to use, which is great.

On the other hand this is opaque to a tool -- how can it know what content means? If something like ([import (lib "rhombus/gui.rhm") (lib "rhombus/gui.rhm") mod-ctx] G G) isn't following a documented grammar, is the idea that some new lang-info function can cash this in for more information? Or some other idea?

(Beyond completion candidates: This seems discouraging because I'm now realizing this impacts pdb's goal to track names across files/modules to support multi-file find-references and renaming. I thought I had nearly all the bases covered but it seems like the onion has more layers. Or... maybe it's good news, once I understand it, because (#%require [prefix _]) relies on renaming, which forces using syntax props to retain the original pieces -- but maybe this new approach avoids that lossiness?)

I'm forgetting the pragmatics of portal syntax, but I think it is a way to put a definition directly into a provide which you can then access without actually running the corresponding module body. So it is a way to export something and let outsiders access it without having to run a bunch of code. I'm totally blanking on why it is there but surely it is something to do with rhombus.

So, while I don't have a helpful technical nugget to add here, I just wanted to say that I agree with @samth that there is some wide open design space here to figure out the Right Way to add information into the macro system for use with completion, and figuring something out seems like a fun design challenge. (Of course, academics are rewarded for doing that kind of thing and then publishing papers about it, so that colors my view of these things ...)

> I'm forgetting the pragmatics of portal syntax, but I think it is a way to put a definition directly into a provide which you can then access without actually running the corresponding module body. So it is a way to export something and let outsiders access it without having to run a bunch of code. I'm totally blanking on why it is there but surely it is something to do with rhombus.

And that aspect sounds wonderful for tools!

> So, while I don't have a helpful technical nugget to add here, I just wanted to say that I agree with @samth that there is some wide open design space here to figure out the Right Way to add information into the macro system for use with completion, and figuring something out seems like a fun design challenge. (Of course, academics are rewarded for doing that kind of thing and then publishing papers about it, so that colors my view of these things ...)

Understood about completion candidates, and for some users that's more of a nice-to-have than a must-have.

I do want to underline the thing buried at the end of my last message: For pdb I'm trying to track definition and use sites across files ("inter-file arrows").

So if a user program has math.min, I need to know where the math and min pieces come from. Likewise for imported or exported prefix.original.

I'd eventually managed to work that out for the #%require and #%provide forms' documented grammar, AFAIK, including making some PRs to preserve components' srclocs. But now... argh.

To me this feels more fundamental. (And given that information reliably, I think it's free or at least pretty short strokes to completion candidates.)

re the expansion of import in rhombus: is the information in the portal portion of the import part of some public information that follows a format tools are expected to use? Or is it internal information that rhombus is using? The documentation for portal would seem to suggest the former, which suggests that changes in rhombus are needed before tools are going to be able to autocomplete, I suppose? Is this also a correct expectation for Greg's inter-file arrows that pdb wants?

Currently, that format is intended as internal to Rhombus. Although some part of the format could be pinned down, that would still be Rhombus-specific.

I'm not sure of the right direction here to support tools. Maybe there should be something in a language that points to a tool for extracting this kind of information from a module expansion, similar to the way that a language says how to handle coloring or indentation on source. Or maybe there should be a designated submodule name, where a module expansion defines the submodule to expose this kind of information.

I feel like it's worth distinguishing two different issues here:

  1. How do we draw a check-syntax arrow for math in @greghendershott's example program? Maybe the answer here is as simple as using a 'disappeared-use in the resulting max identifier, but at no point in the macro expansion steps visible in the macro stepper is math even shown as an identifier that has a useful binding (although that may be a space issue?).

  2. Since Rhombus makes heavy use of a.b references, how do we do completion of such forms? Racket's completion support has never been as good as in other languages, for a variety of reasons, and we have a chance to do better here with Rhombus, but right now we're in a worse position instead.

(2) seems like a problem that will take a bunch of thought and iteration, but (1) seems like a more immediate regression wrt things that we expect Racket-based languages to support, and also seems like it will probably be easier to fix.

I just realized I got math and max backwards in this post: math currently gets an arrow, but max does not (because it is not syntax-original?), even though it appears in the fully-expanded code.

@samth's thinking makes sense to me but I'd be more ambitious on point 2. Since Rhombus has a different expander, maybe we should be asking "can a change to the way the expander works mean that we have completion information more readily available with less work on the part of the macro author?". Note that I have no idea how to get a positive answer to that!

Yes, two distinct issues. But I think maybe two sides of same coin?

IMHO any lang ought to expand exports and imports into the grammar for fully-expanded programs, which includes the grammars for #%require and #%provide. (But let's exclude the portal clause which is a kind of quote of undocumented content.)

If that grammar isn't optimal, then let's extend the grammar in a documented way?

It's currently not optimal for exporting/importing identifiers composed of multiple other identifiers (e.g. prefixes).

  • For example #%require has a prefix clause, which is great because it allows tracing the provenance of the prefix and prefixed identifiers. But it doesn't support nested prefix. (I added syntax properties for this, but only racket require and provide add those.)

  • #%provide lacks even non-nested prefix. (Ditto.)

What if we extend the grammar for #%require and #%provide with new clauses that expose the composite identifiers in a documented, principled way?

p.s. If a tool knows that Math.min, Math.max, etc. are the identifiers, then it can provide completions for all of those after the user types "Math.". Maybe it can even restrict choices based on originating from the same clause? Anyway, I understand there may be fancier completion validation and choice reduction tactics, and that's cool, and maybe research-y. But my simpler concern atm is (a) provenance for features including but not limited to completion and (b) basic completion.

p.p.s. I realize any program can smuggle things in using dynamic-require and things built on that like lazy-require or rackunit's require/expose (which also uses module->namespace). I don't see any great answer for that, except users will probably understand tools can't see through such tricks. I just think that "normal" langs should not work invisibly, if possible.

p.p.p.s. If #%require and #%provide are growing too baroque, or assume too much: There could also be a clean sheet of paper and new #%import and #%export forms?

All of this makes sense to me, and we can probably even make the change in a backwards compatible way if we use properties (although maybe that's icky).

On the dynamic-require point:

Sometimes the macros that generate those dynamic-requires are going to know the names (or know where to get them?), and they are there only to avoid pulling in the dependencies / running code at inopportune times. Maybe such things could communicate the information that a tool would want by leaving behind such information in properties too. (At this stage, maybe just keeping in mind that we might want to get that done is enough.)

I mentioned that just to show I understand that "expand to a documented grammar for composite identifiers" doesn't magically cover all possible cases. :)

I agree situations with dynamic-require aren't necessarily hopeless. I just feel it's more nice-to-have or "version 2".

Like, as a user, I might want a multi-file find-references or rename command to know about all instances of "foo" in (provide foo), (define foo), (dynamic-require 'mod 'foo). But I wouldn't be shocked if it missed the last one.
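
A tiny illustration of why that last case is hard for a tool: the name is just a runtime symbol, so nothing in the expanded code binds it. Here racket/list and first are stand-ins for 'mod and 'foo.

```racket
#lang racket/base

;; The name is an ordinary runtime value; it could just as well be
;; computed or read from a file, which is what hides it from tools.
(define name 'first)

;; Looks up the export at run time rather than via #%require.
(define f (dynamic-require 'racket/list name))

(f '(1 2 3))
```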

Whereas in situations like (provide (prefix-out pre- foo)) and in some other file (begin (require (prefix-in pre- mod)) pre-pre-foo) -- the last "foo" IMHO is a must-have. And so is each of the distinct "pre-"s. That ought to work for all langs that do prefixed imports/exports with whatever surface syntax.
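
For reference, a runnable version of that situation, squeezed into one file by using a submodule in place of the other file, and assuming the prefix is spelled pre- with the hyphen. The module name mod and the value 42 are arbitrary.

```racket
#lang racket/base

;; Stands in for "some other file": exports foo as pre-foo.
(module mod racket/base
  (provide (prefix-out pre- foo))
  (define foo 42))

;; Importing with a second prefix yields pre-pre-foo.
(require (prefix-in pre- 'mod))

pre-pre-foo
```

A rename of foo has to touch the definition, the prefix-out clause, and the tail of pre-pre-foo, while each "pre-" has its own separate origin.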

Right now, Rhombus is expanding into a program that certainly follows the grammar for fully-expanded programs. But it effectively works like this:

;; module M
#lang racket
(define-syntax (scope stx)
  (syntax-case stx (math string)
    [(_ math id) (with-syntax ([i (syntax-e #'id)])
                   #'(let () (local-require (only-in racket/math i)) i))]
    [(_ string id) (with-syntax ([i (syntax-e #'id)])
                     #'(let () (local-require (only-in racket/string i)) i))]))
(provide scope)
;; module N
#lang racket
(require M)
(scope math max)

Except that math and string in Rhombus are lifted out to be top-level definitions so they could be potentially renamed.

I don't think it could be changed to use an extension to require for hierarchical names, because those are too flexible in Rhombus -- the name before the dot has full control over the meaning of the dotted name, and that's used to implement everything from namespaces to import prefixes to struct field access.

But I also don't think we need to do that, at least to handle renaming. Instead, we can use the tools we already have, or extend them a little (eg, by providing syntax properties that talk about dotted names specifically and could also be used for syntax-parse), and continue to build things from the fully-expanded program.

As a proof of concept, the commit "Add disappeared-use for fields." (samth/rhombus-prototype@4c700bd on GitHub) improves things a bit (there's now a link to the docs for max), but it doesn't yet let you jump to the definition (and it doesn't do anything useful for math.pi).