Racket has thread safe hashes. This is a problem for toy programs becasue in a toy program you can put each hash in a different place, so it is "better" to implement your own thread-unsafe hash, but the "game" ask for idiomatic solutions.
Strings are very slow in Racket BC. (I didn't try too many microbenchmarks in Racket CS.) As a rule of thumb, Racket is 5x faster than Python for number crushing, but x1 (i.e. equal) than Python for programs that use strings. I'm not sure about the detials here, but my guess is that Racket has Unicode support by default but other languages in the "game" use the fact that the data is just ASCII. Only ASCII is a good guess for technical files, but once you are nearby real humans, they love to add weird things arround letters. [¡Hi from Argentina! Don't ask why "ñ" can't be safetly replaced by "n".]
In vectors and other things, Racket can mix fixnum, flonums, structs, boxes. So a fixnums N is represented in BC as 2*N+1 and by 8N in CS. [I hope I remember the detaisl correctly.] So A+B is implemented as A+B-1 or (A-1)+(B-1)+1 or something like that, and each fixnum operations has a small overhead. It's worse for flonums because each flonum has it's own secret hidden flonum-only-box that must be alloccated and garbage-collected. Luckly in both BC and CS the flonum’s can be unboxed by the compiler (@mflatt wrote that part). If you chain flsomething operations, the compiler may skip the intermediate boxes and reduce the overhead.
By number crunching, do you mean programs using NumPy/Pandas for the numeric operations or programs where loops, conditions etc. are in Python? If the latter, I'm surprised if the factor to Racket isn't larger.
On the other hand, the builtin string operations in Python are all in C code "below" the Python API, so I'm not surprised they're equally fast as in Racket.
Python uses unicode strings by default, like Racket.
It really depends on what you're doing. It wouldn't make sense to decode and re-encode a, say, UTF-8 byte stream, if you just copy a file.
I think flvector and fxvector contain contiguous sequences of the low-level datatypes, so they would be equivalent to a C arrays of double or int (for example, depending on the size of a fixnum). Especially if you combine flvector and fxvector with their "corresponding" unsafe operations, you can get a nice speed-up.
If memory serves me - most of Numpy is C/C++ and has been heavily optimized over the years by virtue of being exposed to production stress by many organizations. I cannot speak to Pandas. Vanilla Python - I would not be surprised if it is slow but nobody uses vanilla Python to do number crunching in any serious production code.
Depends on the extent of number crunching. If it's an algorithm you can't easily implement with NumPy and the "number crunching" part isn't a big part of the program or can be done "offline" (during the night etc.), you may get away with it.
I agree that if you have relatively straightforward numerical code you can implement with NumPy or SciPy, plain Python would be much, much slower.
I expect less difference if the programs uses NumPy instead, but it's a lot of work to rewrite them. I only modified a few programs with NumPy for unrelated work, but I just followed the style of the previous code, searched in Stack Overflow and hopped that my coded didn't break the performance too much.
On strings specifically, CPython has multiple representations of strings, including more compact representations for ASCII-only strings. It is likely to be more efficient than Racket for such programs, since Racket uses UTF-32 for all strings in memory.
It's more complicated, because the inputs of the programs are fixed and public, so you could write
and call it a day.
So the rules ask to use the "same" algorithm, whatever that means when you are using very different languages. [My guess is that this is a huge source of nasty emails send to the organizer, becuase people claim their creative solution is not cheating or claim that other languages are cheating.]
I think that using byte string is fine, but IIRC it didn't change the time too much. It was weird, and perhaps I was doing something wrong or perhaps I'm misremembering. I don't have too much spare time now, so if someone likes these challenges, it's possible to try agian.