Would it make sense to add a function for inverting a hash table to the standard library?

ddrake · July 23, 2025, 8:43pm

If you have a hash table, there's a pretty reasonable "inverse" to that table: in the original, you have keys mapping to values. The obvious inverse would be a hash table whose keys are the values in the original table, and the values are lists of keys in the original table with that value. (Or perhaps sets, or some other suitable data structure.)

Given that there are functions in the standard library for things like intersecting and unioning hash tables, it seems like this inverse operation would belong there, too.

FWIW, I wrote this today:

(define (invert-hash-table ht)
  (define values-to-key-list (make-immutable-hash
                  (for/list ([val (set->list (apply set (hash-values ht)))])
                    (cons val empty))))
  (for/fold ([acc values-to-key-list])
            ([pair (hash->list ht)])
    (let* ([val (cdr pair)]
           [key (car pair)]
           [keylist (hash-ref acc val)])
      (hash-set acc val (cons key keylist)))))

(1) is there a more standard, easy way to do this? Am I missing some existing library function or one-liner?

(2) If not, would it make sense to add the above function (or something equivalent) to the standard library?

(3) A question/feedback: it was counterintuitive to me to discover that I needed to use car and cdr for the pairs, and that first and second did not work.

That seems weird to me. I know car and cdr (and I even know their etymology as 'contents of the address/decrement register') but I find it more readable to see first and second. And my guess was that first and second were just syntactic sugar / aliases for car and cdr. Although I do see some ambiguity for "second" for a list, even a list of two elements: is "second" the second element, or the list containing just the second element (that is, the rest of the list)...?

soegaard · July 23, 2025, 9:01pm

To me it doesn't make sense to include operations the data structure can't implement efficiently. If I have a program that need an "inverse" I just use two hash tables.

Do you have concrete examples where an inverse is needed?

A question/feedback: it was counterintuitive to me to discover that I needed to use car and cdr for the pairs, and that first and second did not work.

The names are somewhat historical, but car and cdr signals that one works with "a data structure built with pairs" - whereas first and rest (and second`) signal that one works with lists (which is a subset of "a data structure built with pairs").

Before contracts the names first and rest was just a help for readers of the program.
After contracts were introduced, first and rest check whether they receive lists.

Note: second is the same as a cadr that accepts lists only.

shawnw · July 23, 2025, 10:21pm

For cases where the same equality and hash functions can be used for both the keys and values, and values are unique, I'd just use

(define (invert-hash-table ht)
  (hash-map/copy ht (lambda (k v) (values v k))))

Edit: One that lets you specify which kind of hash table to return and stores keys as a list per distinct value:

(define (invert-hash-table ht #:kind [kind 'equal])
  (for/fold ([new-ht (case kind
                       [(equal) (hash)]
                       [(eqv) (hasheqv)]
                       [(eq) (hasheq)]
                       [(equal-always) (hashalw)])])
            ([(k v) (in-hash ht)])
    (hash-update new-ht v (curry cons k) '())))

(hash-update makes folding over things while accumulating results in a hash table so easy)

notjack · July 24, 2025, 12:20am

There's a lot of variation in how this could work, to the point where I'm not sure it merits its own function. All of these questions could have different answers:

Do you want a mutable or immutable hash table?
How do you want key comparisons to work? equal?, eqv?, eq?, equal-always?, or something else?
How do you want to handle duplicates? Error? Take the first or last one? Merge them into a collection?
And if you do want to gather all duplicates into a collection, do you want a list? A vector? A set?

For that reason, I would reach for either for/hash to solve this problem in simple cases or transducers and reducers in complex cases. Your invert-hash-table definition can be implemented with transducers like this:

(define (invert-hash-table ht)
  (transduce (in-hash-entries ht)
             (bisecting entry-value entry-key)
             (grouping into-list)
             #:into into-hash))

; returns (hash 1 '(a b) 2 '(e c) 3 '(d))
(invert-hash-table (hash 'a 1 'b 1 'c 2 'd 3 'e 2))

benknoble · July 24, 2025, 10:16pm

Skip the extra set->list, you don't need it.

(for/list ([val (apply set (hash-values ht))]) …)

You could equally do something like (edit: GMail eats whitespace on outgoing messages):

(for/fold ([h (hash)])
          ([pair (hash->list ht)])
  (match-define (cons k v) pair)
  (hash-update h v (curry cons k) empty))

ddrake · July 25, 2025, 7:06pm

Those are good points. It does look like, since there are so many questions, and so many different answers, that any library function for inverting a hash table that attempted to serve all those different needs would at best be very complex, and possibly slow because of all the overhead of trying to be all things to all people. It could also just be difficult to use because anyone trying to use it for their specific situation would have to wade through all the documentation and examples for completely different situations.

ddrake · July 25, 2025, 7:09pm

Oh wow, transducers look pretty cool. That transduce expression is nicely readable.

ddrake · July 29, 2025, 11:25am

Are those tranducers and such the same sort of thing that Rich Hickey is talking about in this Strange Loop conference talk?

notjack · July 30, 2025, 12:24am

Yes. Slightly different API, same general idea.

Topic		Replies	Views
How to turn a list into a hash table? Questions & Answers question	2	585	October 31, 2022
A global hash table where values are sets General question	10	306	April 23, 2024
Can I reliably use integers directly as indices for immutable hash table iteration? Questions & Answers question , racket , hash-tables	16	493	March 21, 2023
RFC: hash table pattern matching General	68	1982	January 10, 2024
Hash-table name? General	2	283	February 13, 2023

Would it make sense to add a function for inverting a hash table to the standard library?

Related topics