I ported a Python version of Advent of Code Day 25 to this Racket version , and I was surprised by what appears to be the relatively poor performance of Racket's set implementation.
The Python version runs in about 1.5 seconds, and the Racket version runs in about 10 seconds. I'll try and put together a simpler example comparing the two, but if anyone has some insight into the Racket set performance, please share. I realize Python's set implementation is in C, but I would still expect Racket to be within a factor of 2 to 4, or so.
My main solution uses a vector and runs in 1.1 seconds, but I'm still interested in seeing if I can get the Racket set version closer to the Python speed.
If I remember correctly, there seemed to be a performance issue with Racket hash tables that gave some slow timings on some Computer Language Benchmark Game entries, so if the set is implemented with a hash table (likely?) then maybe it's related.
UPDATE: I found the Computer Benchmark Game program that mentions the performance issue with hash tables in the comments. And even with the issue, Racket was only 1.4x slower, despite using 1 core vs. Python using 4 cores.
One piece of low-hanging fruit would be, instead of equal?-based sets, using make-mutable-seteqv to get the much cheaper equality check. You could also try replacing in-set with in-mutable-set, which should get you more optimized (though less generic) iteration code.
(Of course, I’d really recommend using in-immutable-set, since the sets are never actually mutated after they are constructed, but that would be a different program, though it could be interesting to see a comparison.)
Using set! on module-level variables in move-all! will trigger various pessimizations.
Potentially a big contributor is that list-ref is O(n): using it inside a loop to treat a list as a random-access data-structure is never good. Why not try something like: (with apologies for any formatting errors from mobile)
(P.S. Sorry, I received the post by email, and in my email client it was not obvious that you already supplied a link to the code. Please disregard. Happy holidays!)
I ported a Python version of Advent of Code Day 25 to this Racket version , and I was surprised by what appears to be the relatively poor performance of Racket's set implementation.
I'm curious about your solution, can you post it?
I'm curious to see how you used sets.
My solution written in openlisp, a pure interpreter, runs in about 4 seconds, uses an array for the map and conses up an auxiliary list...
(I can post it later today if interested).
It appears fixing list-ref doesn’t help much. There are only ~100 lines, so list-ref is fast enough for this particular input. But if there are more lines, list-ref could definitely become the bottleneck.
FWIW, in Python, it’s idiomatic to use enumerate over range(len(...)). In Racket, we could similarly use in-indexed. Initializing the sets could then be written as:
I probably should've mentioned I only used list-ref because of being in "puzzle mode". In other words, I knew the impact was minimal, and it simplified the code somewhat My primary solution does beat the Python version (barely).
I appreciate the tip about using eqv - my original (median of 3 timings) was 10.3 seconds. Switching to mutable-seteqv reduced to 4.66 seconds, as you mentioned. I originally read your post incorrectly, I thought you got the total time down to 0.4 seconds with hasheqv, so I was pretty excited - then I realized you meant it just got 0.4 seconds faster. Mine was 4.07 seconds with hasheqv.
So, it looks like Racket is about 2.8x slower than Python when using hasheqv. I think this is probably acceptable, but I may investigate further because I think that data structure is pretty important to many programs.
One piece of low-hanging fruit would be, instead of equal? -based sets, using make-mutable-seteqv to get the much cheaper equality check.
Thanks for mentioning this - good tip.
I’d really recommend using in-immutable-set , since the sets are never actually mutated after they are constructed
Good point re: mutation. My original port of the Python version didn't create new "destination" sets, so when I switched to doing that, I left the mutable sets because it was easier, and I doubted that using immutable sets would be any faster.
I was extremely pleased with using Racket for Advent of Code. I think this set/hash performance was the only disappointment in quite a bit of coding!
Also, @dannypsnl , as you mentioned, you made a "small comparison" - I think it's too small to gain any performance insights. I'll try and create two simple versions later, one for Python, one for Racket, that demonstrate the performance difference with sets. I think the Advent of Code puzzle probably has roughly 10,000 elements.
An efficient solution would use define-sequence-syntax, which uses for*/stream when it’s not used directly in a for form, and uses :do-in when it appears directly in a for form. This should make it even faster than the for*/list variant.
The Racket and Chez Scheme compilers do very little optimization of variables assigned to with set! (preferring to expend their effort elsewhere), which can confuse people. Chez boxes all such variables on the heap (to facilitate a more simple and efficient implementation of closures for the common set!-free case), so even referencing such variables is a little more expensive. Racket quickly abandons inlining optimizations when set! gets involved, enough so that (set! proc proc) is an idiom to prevent proc from being inlined (for testing etc.), even though a compiler that wanted to could eliminate that assignment altogether. I am not certain, but I suspect that set! may also defeat at least some optimizations that depend on inferring the type of a variable, e.g. to eliminate redundant checks.
I think some of the optimizations from define-sequence-syntax do not occur with in-indexed, which is why I typically would write such code as:
where #:when #t serves to create a nested iteration context.
I think part of the cost of racket/set is that generic dispatch is relatively expensive and is less optimizable than calls to known functions, and there aren't versions of set-add/set-add!/etc. that are specialized to "hash sets", analogous to in-immutable-set/in-mutable-set for in-set. It would be interesting to put a number on the cost of generics, perhaps by copying the "hash set" implementation from racket/set to create a non-generic drop-in replacement.
in-indexed should work efficiently when it appears directly within a for loop. It actually compiles to the same code that you wrote, provided that we use (in-indexed (in-list lines)) and (in-indexed (in-string line)).