RFC: hash table pattern matching

On second thought, I think I would prefer allowing duplicates, which makes partial matching significantly faster, while full matching is still as slow as it currently is.

In the current implementation of hash*, which disallows duplicates:

"partial match"
cpu time: 2124 real time: 2218 gc time: 27
"full match"
cpu time: 2152 real time: 2255 gc time: 8
"hash-table, which is buggy, but it allows partial match and duplications"
cpu time: 1126 real time: 1220 gc time: 1

But if hash* allows duplicates:

"partial match"
cpu time: 589 real time: 624 gc time: 22
"full match"
cpu time: 2052 real time: 2142 gc time: 7
"hash-table, which is buggy, but it allows partial match and duplications"
cpu time: 1096 real time: 1173 gc time: 0

Here’s the benchmark:

#lang racket

(require "hash-pattern.rkt")

(define N 10000000)

(define (test h)
  (match h
    [(hash* [1 v] [2 v2] [3 v3])
     (list v v2 v3)]))

(define (test2 h)
  (match h
    [(hash* [1 v] [2 v2] [3 v3] #:full)
     (list v v2 v3)]))

(define (test3 h)
  (match h
    [(hash-table [1 v] [2 v2] [3 v3])
     (list v v2 v3)]))

(define h (hash 1 2 2 3 3 4))

"partial match"
(time (for ([i (in-range N)])
        (test h)))

"full match"
(time (for ([i (in-range N)])
        (test2 h)))

"hash-table, which is buggy, but it allows partial match and duplications"
(time (for ([i (in-range N)])
        (test3 h)))

1 Like