Again, thanks to everyone.
I was surprised that the hash-table can match partial cases
It’s unclear what you mean by this. Are you talking about how one can use _ ...
to do partial matching? Or are you talking about how if keys are all literals, then it will do partial matching? If the latter, that’s a bug.
I like (hash a b c d h …), where h is a hash like the initial hash. I don’t expect too long list of key-values, and in that case I expect to use the vertical alignment.
In my prototype, hash
would not have grouping, but hash*
would. This is a compromise that we seem to reach.
Does the macro detect (hash a b c d _ …). and avoid the creation of the rest part? It looks very difficult to remove the creation of the unused rest part later in the optimization.
In my prototype, #:rest _
would avoid the creation of the rest part.
Do you feel strongly about the syntax choice of _ ...
over #:rest _
? Right now I’m pretty convinced by @shhyou’s and @soegaard’s argument that (1) ...
should be reserved for lists and (2) it’s awkward to write a match expander that generates ...
. But if you have a case for _ ...
, I’d be interested to hear.
What about removing the _
My personal opinion is a strong no. But again, if you have a case for it, I’d be interested to hear.
What happens with (hash x 10 y 10)? Can it bind (hash ‘a 10)?
That depends on whether or not you allow duplicates, and whether the key positions are expressions or patterns.
In my prototype, key positions are expressions and duplicates are allowed, so x
and y
wouldn’t bind anything, and it needs to be already bound. If they are both bound to 'a
, then the match would succeed, and fail otherwise.
What happens with weak hashes
Can you come up with a concrete example? But I’d say that without concurrency, matching a weak hash should work fine in my prototype.
why not simply introduce some syntactic sugar
Yes, that’s exactly what I have right now in my prototype. hash*
is the more general pattern (supporting the #:default
clause), and hash
is a syntactic sugar for hash*
. hash*
has grouping, but hash
doesn’t.
I think we are mostly in agreement. Though I want to note the following:
full match is just some additional size check
In my prototype, it turns out that full matching needs to be more complicated than that. Mainly because we allow duplicates (#:default
clause also adds complication, but that’s not too bad).
And we allow duplicates because it improves efficiency significantly for partial matching. If we disallow duplicates, we need to detect them. Since keys can be expressions in general (even with Ryan’s proposal to use ==
explicitly), we need to check for duplicates at run-time. This turns out to be substantially slower than no check. See https://racket.discourse.group/t/rfc-hash-table-pattern-matching/1686/16 for more information.
what would be a practical use case of …
For this feature, I’m mostly inspired by object destructuring in JavaScript, which is used extensively to e.g. manipulate React states.
let nextState = (state) => {
let {timestamp, ...xs} = state;
......
};
I wonder whether there is a need to differentiate between mutable/immutable hashes, hasheq, hasheqv, hash (equal) and hashalw.
Yes, the prototype preserves key-comparison method and weak/strong/mutable/immutable/ephemeron-ness.
So it may be not so strange that hash produce slow code.
I can argue in the opposite direction that, if I know that a pattern can produce inefficient code, without being clear under what condition it produces inefficient code, I might avoid using it altogether out of fear of being inefficient.
match
already has this problem. For example, I don’t know how efficient (list x xs ... y)
is. Optimally it can be done in linear time, but does match
do that? I don’t know, so I have been avoiding using it.
Not allowing key positions to be general patterns guarantees efficiency, allowing users to be able to use it without any concern.
My current prototype should address your first reply.
In analogy with list-rest I suggest introducing a hash-rest
It would be easy to do this.
However, I wrote below that hash*
to hash
is like struct*
to struct
and unlike list*
to list
, so perhaps adding list-rest
might mislead people to think that there’s an analogy to list{,*,-rest}
.
f an ellipsis-like notation is needed for hash, how about ht… instead of …?
I think that would be acceptable, though I personally would prefer #:rest
, since everyone already knows what a keyword is. ...
also usually can be used in a more liberal fashion (e.g. multiple ...
in the same pattern), so that might raise a false hope that ht...
can get used multiple times, etc, etc.
what about dict that could match any gen:dict?
That would be a good thing to support, though I think it’s orthogonal to the hash table matching. The dict operations on a hash table in particular can be much slower than hash operations on a hash table. So unless that gets improved, hash table patterns would still be useful in practice.
Right, we have hash*
with grouping for that. There’s also hash
for people who prefer no grouping, as a compromise.
it feels verbose needing to write #:default:
Yeah, #:default
is kinda long. My inspiration is from ~optional
syntax-parse pattern, which has #:defaults
.
If there is another sensible keyword, I would be happy to change it. But I don’t think #:else
or #:or
match the meaning well. I want to avoid not having any keyword at all, since this seems like an “extension” to me, and I want to accommodate for future extensions too. Keywords seem like the appropriate way to do that.
The mixture of expressions and patterns is confusing/suprising
A couple of points:
-
We already have a mixture of expressions and patterns in other patterns, like
(app <expr> <pat>)
and(? <expr> <pat>)
. So I personally don’t find expressions in key positions to be surprising. -
It would be good to make distinction between literals and expressions if that accomplishes something. But I don’t think it does much (besides enabling a very minor optimization that can push some computation on literals to compile-time — which doesn’t change asymptotic complexity). All we want for efficient matching is that we can
hash-ref
on a concrete key value, and expressions suffice for that. -
On the other hand, key pattern with these restrictions makes is not ergonomic to use, and even makes the documentation more complicated and difficult to understand. Compare:
<pat> ::= (hash [<key-pat> <pat>] ...)
<key-pat> ::= <literal>
| '<literal>
| (== <expr>)
with:
<pat> ::= (hash [<expr> <pat>] ...)
Let alone the fact that <key-pat>
would be documented pages down below, so users need to scroll up and down to understand the pattern.
- Minor thing, but it’s not possible right now to document
==
properly, since==
is a match expander which should be recognized by binding, unlike other match patterns. So https://github.com/racket/racket/pull/2145 probably is needed to be resolved first.
I strongly believe the default hash pattern should use partial matching
Per the discussion above, I made a compromise by implementing both hash
and hash*
. hash
to hash*
is like struct
to struct*
, in that hash
/struct
requires full matching, and hash*
/struct*
allows partial matching. So if you want partial matching, you can use hash*
.
In general, most Racket patterns require full matching, so I think it’s good to have at least one pattern for hash table matching that has full matching by default. hash
seems like a good candidate.
This will break whenever
The hash
pattern is a newly added pattern, so it couldn’t possibly break any existing code. Perhaps you meant to say people might misunderstand the meaning of the pattern? But that’s why we have documentation. We also are not trying to mislead people. The discussion so far seems to prefer hash
and hash*
precisely because they are consistent with other patterns. We are trying to reduce confusion!
FWIW, I am in agreement with you that partial matching seems more useful than full matching, so I definitely would use hash*
more than hash
. But I think it’s good to have both, with different matching mode, regardless.