Language implementation design decisions and trade offs

spdegabrielle · May 12, 2022, 11:52am

Hi

@lexi_lambda wrote an interesting comment on language implementation trade-off’s:

I recently learnt in chat(discord); roughly the racket compiler produces safer executables at the cost of a little (runtime)speed. (Thank you @samth )

I’d be interested to know more about these sort of design decisions?

I’m sure there are thousands - I'm interested in knowing what the top 5 most important or interesting ones are for professional language designers.

Pointers to texts, courses(course notes), papers and dissertations also appreciated.

Bw

Stephen

gus-massa · May 12, 2022, 10:06pm

I have no formal notes, but my experience from trying to write solutions for The Computer Language Benchmarks Game:

Racket has thread safe hashes. This is a problem for toy programs becasue in a toy program you can put each hash in a different place, so it is "better" to implement your own thread-unsafe hash, but the "game" ask for idiomatic solutions.
Strings are very slow in Racket BC. (I didn't try too many microbenchmarks in Racket CS.) As a rule of thumb, Racket is 5x faster than Python for number crushing, but x1 (i.e. equal) than Python for programs that use strings. I'm not sure about the detials here, but my guess is that Racket has Unicode support by default but other languages in the "game" use the fact that the data is just ASCII. Only ASCII is a good guess for technical files, but once you are nearby real humans, they love to add weird things arround letters. [¡Hi from Argentina! Don't ask why "ñ" can't be safetly replaced by "n".]
In vectors and other things, Racket can mix fixnum, flonums, structs, boxes. So a fixnums N is represented in BC as 2*N+1 and by 8N in CS. [I hope I remember the detaisl correctly.] So A+B is implemented as A+B-1 or (A-1)+(B-1)+1 or something like that, and each fixnum operations has a small overhead. It's worse for flonums because each flonum has it's own secret hidden flonum-only-box that must be alloccated and garbage-collected. Luckly in both BC and CS the flonum’s can be unboxed by the compiler (@mflatt wrote that part). If you chain flsomething operations, the compiler may skip the intermediate boxes and reduce the overhead.

benknoble · May 13, 2022, 1:56am

RE: vectors and nums, I believe flvector and fxvector are relatively idiomatic for perf-sensitive code. They have (IIUC) more compact representations and operate faster in general?

sschwarzer · May 13, 2022, 10:09am

By number crunching, do you mean programs using NumPy/Pandas for the numeric operations or programs where loops, conditions etc. are in Python? If the latter, I'm surprised if the factor to Racket isn't larger.

On the other hand, the builtin string operations in Python are all in C code "below" the Python API, so I'm not surprised they're equally fast as in Racket.

Python uses unicode strings by default, like Racket.

It really depends on what you're doing. It wouldn't make sense to decode and re-encode a, say, UTF-8 byte stream, if you just copy a file.

Also, if you know that the input bytes are UTF-8 (not just ASCII), you can still safely run certain operations on them without decoding them. For example, you can split paths at / separators or split off drive letters at :. This works because UTF-8 multibyte sequences don't contain bytes that by themselves could be 7-bit ASCII characters.

I think flvector and fxvector contain contiguous sequences of the low-level datatypes, so they would be equivalent to a C arrays of double or int (for example, depending on the size of a fixnum). Especially if you combine flvector and fxvector with their "corresponding" unsafe operations, you can get a nice speed-up.

maketo · May 13, 2022, 11:26am

If memory serves me - most of Numpy is C/C++ and has been heavily optimized over the years by virtue of being exposed to production stress by many organizations. I cannot speak to Pandas. Vanilla Python - I would not be surprised if it is slow but nobody uses vanilla Python to do number crunching in any serious production code.

sschwarzer · May 13, 2022, 11:57am

Depends on the extent of number crunching. If it's an algorithm you can't easily implement with NumPy and the "number crunching" part isn't a big part of the program or can be done "offline" (during the night etc.), you may get away with it.

I agree that if you have relatively straightforward numerical code you can implement with NumPy or SciPy, plain Python would be much, much slower.

benknoble · May 13, 2022, 2:15pm

Pandas is considered slow and unscalable, I believe [citation needed].

gus-massa · May 13, 2022, 4:12pm

The rules of the "game" ask to use plain Python. Looking again at https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/racket-python3.html there are a few x10, an even x20 or x40. Sometimes the difference is in the paralelization.

I expect less difference if the programs uses NumPy instead, but it's a lot of work to rewrite them. I only modified a few programs with NumPy for unrelated work, but I just followed the style of the previous code, searched in Stack Overflow and hopped that my coded didn't break the performance too much.

samth · May 13, 2022, 6:37pm

On strings specifically, CPython has multiple representations of strings, including more compact representations for ASCII-only strings. It is likely to be more efficient than Racket for such programs, since Racket uses UTF-32 for all strings in memory.

soegaard · May 14, 2022, 7:51am

Is it cheating to use byte strings in the benchmarks?

spdegabrielle · May 14, 2022, 8:54am

I don’t think so. How can you cheat in an exercise where the goal is to cheat as much as possible?

gus-massa · May 14, 2022, 2:02pm

It's more complicated, because the inputs of the programs are fixed and public, so you could write

#lang racket
(display 42)

and call it a day.

So the rules ask to use the "same" algorithm, whatever that means when you are using very different languages. [My guess is that this is a huge source of nasty emails send to the organizer, becuase people claim their creative solution is not cheating or claim that other languages are cheating.]

I think that using byte string is fine, but IIRC it didn't change the time too much. It was weird, and perhaps I was doing something wrong or perhaps I'm misremembering. I don't have too much spare time now, so if someone likes these challenges, it's possible to try agian.

Topic		Replies	Views
Is there plans for improving the performance of the generated code from the racket compiler? Internals question	16	1047	May 28, 2024
Discussion of Racket's LOP abilities in r/ProgrammingLanguages Questions & Answers discussion	14	1238	February 26, 2022
Racket set performance General performance	11	550	December 27, 2021
Why does my code run significantly faster on BC than CS? General	18	438	February 17, 2024
Language sweet spots General	6	221	July 26, 2024

Language implementation design decisions and trade offs

Related topics