I've been looking at the findMethodByExample feature from Smalltalk, that does just what it says. I've also been wondering about a Racket implementation. Because I recently witnessed some criticism about Racket's discoverability, with people stating that finding stuff in the libraries/docs wasn't that easy. So, would you care to chime in about:
Would a Racket findMethodByExample be a worthy feature?
How would one go about it?
Looking at the Pharo stuff, the initial filtering of the methods is done by walking the class tree, then by metainfo on the procedures (number of args, etc.). I think in the end, there's also some whitelisting going on to avoid unwanted side-effect execution. In my understanding, the Racket equivalent couldn't really do it just by traversing the dependency tree, though? I think some "hardcoded" search paths would be needed for core language features, since we are not class/method based and modules do not rely on inheritance. However, maybe we could leverage the contract system in some way to approximate this functionality, especially since contracts don't have side-effects?
The Haskell community has Hoogle, which allows searching for a function by its type: (a -> a) -> [a] -> [a] - Hoogle. I've always found this search capability incredibly useful and have relied heavily on it.
Not all functions in Racket (can) have types, but this may be a reasonable first step.
How does this work in Smalltalk or Pharo; do you have a better link than this?
I think the situation in Racket is difficult because:
Many useful functions do not have a Typed Racket type, or even a contract. Some have a "contract" that exists only in the form of a Scribble defproc. Some don't have documentation.
Many useful things aren't even functions -- they're macros. Depending on how the macro was defined, the grammar might be clear, but it might be completely opaque. Again, there might exist a grammar via a Scribble defform, but maybe not.
I can imagine some tool that tries to scrape all these sources, within some user-defined scope that includes some combination of packages and/or personal projects, and refreshes that metadata in a database accessible by editor/IDE tools.
I can imagine that, but... imagining it also sort of gives me a headache. Seriously this feels like a project that would be challenging in a "never-ending" way --- a source of perpetual bug reports where users have quite reasonable expectations that can never quite be satisfied?
TL;DR I think it sounds like a wonderful thing to have, and if someone built it I might use it, but I probably wouldn't want to be the person attempting to build and maintain it?
Thanks for the input, @greghendershott. I got to conclusions that are similar to yours, although I think there are some points that save the idea somewhat:
We could limit the scope of the project to those functions that do have a type/contract.
That would enable a progressive (although neverending indeed) expansion of the project scope, as we could add the missing contracts along the way.
Macros can expand to contracted functions, I think. So, is that really such a problem?
So basically, I think we could start very small and expand. But is that worth the effort? It indeed seems like something that would be troublesome to maintain for a single contributor. On the other hand, contracts make Racket uniquely positioned for such a feature in the current language landscape, IMO.
@scolobb I agree that something similar to Hoogle is very enticing. Are you really specifically speaking about types, though? Or are you including contracts in that? I'm certainly no expert on either design-by-contract or type theory, but I think that somewhat changes things. Especially since the fraction of typed functions in the lib is minuscule compared to the untyped fraction (I think).
As for the implementation, I found the blog you linked and also related:
For sure if there is some subset you think is satisfactory, I don't mean to discourage you. "The perfect is the enemy of the good".
Why I was primed to feel familiar with this space: Emacs has an "eldoc" feature. Given some function name, it tries to display its parameters (their order and names).
You could describe this as "function name -> parameters" --- the "opposite" of "parameters -> possible function names" (the latter being findMethodByExample).
Either way, there is the same set of challenges about discovering the number of parameters (much less the name and type/contract/syntax-class/kind of each).
And either way, if users are OK with some (pretty significant) limitations, there are useful things that could be done, for sure.
Indeed, I am not specifically speaking about types, and I agree with you that contracts or other tags may be used to do search. I do have the impression however that types would be easier to use than contracts, because the contract language is more expressive. For example, one would probably have to match list? with (listof number?) and with (list (listof number?)). On the other hand, there are indeed many more functions with contracts than functions with types, even though I'd qualify their proportion as "more than minuscule"
I felt sure I'd seen something like this before, but I think I'm thinking of this package for Emacs Lisp that does it:
ELisp doesn't have contracts or types, so the way it works is by curating a relevant list of functions, and then trying all of them with the user's example interface, returning the ones that match and don't raise an error.
Doing it this way in Racket would take some design effort -- which functions are relevant? Can we dynamically obtain the list of relevant functions from installed libraries (maybe using something like module->exports)? Are there security concerns with invoking these functions -- do we trust the installed libraries? Are there performance concerns with invoking a potentially large number of functions? How do we parse the user's query reliably?
Of course, relying on types and contracts would be preferable where available, but this approach could be a reasonable fallback.
There even don't have to be security issues per se. For example, I wouldn't like it if a suggestion feature would "randomly" try file system functions that modify the file system.
So I guess you'd need to be able to remove functions with (certain) side effects from the candidates to try before trying them.
suggest.el uses enumerative program synthesis. It tries your inputs (in any order) against every function in suggest-functions .
suggest-functions is a carefully chosen function list: they're all pure functions with a small number of arguments using only simple data types. We only include functions that users could 'stumble upon' with the right set of inputs.
People have done this kind of thing when writing low-level code generators, exhaustively trying out different combinations of machine instructions and finding unexpectedly efficient ways of doing things.
Of course the results are checked by people before any of them is used into a real compiler.
Concerning functions that have a documented contract but no actual contract: should they have a contract? That is: are these things we could improve by adding actual contracts? (Maybe not everything.)
I think the answer is probably, "No", for a few reasons:
Quite a few things documented by Scribble defprocs aren't actually defined when/where racket/contract is available. For instance primitives defined in C or in Scheme.
Quite a few things are macros documented by Scribble defforms.
Even when things are defined in Racket, and where racket/contract is available, there might be an unacceptable performance penalty to wrapping them in an actual contract. (Although there seems to be a lot of great progress in making contracts faster, as well as ideas like option contracts.)
We already have the notion of run time, compile time, and even documentation production time.
You could propose a fourth time: Something like "tool time" or "humans writing code time". It can use some of the results of the other times. For example "blue boxes" from doc production, as well as some sort of "crawling/scraping" of things like signatures/types/contracts resulting from compile time. (Insert prolonged, vigorous hand-waving here.)
Anyway, I think that is its own distinct "time", where (when?) things like findMethodByExample live.