Are there plans for improving the performance of the generated code from the Racket compiler?

I found Matthew's talk "Incremental Parallelization of Dynamic Languages" (on Air Mozilla) useful context for futures. Unfortunately, it looks like Air Mozilla moved to a new CMS and didn't import old videos, and the Internet Archive page I linked to doesn't seem to have archived the actual video: maybe someone can find another source?

With respect to discussions about the limitations of futures that one might read elsewhere, a big change came with the move to Racket CS, and I think we are (certainly I am) still gaining experience with what is now possible.

Racket BC was originally single-threaded, and most operations ended up blocking futures from running in parallel. (This was explained well in the Mozilla talk. I particularly remember a picture of a bike covered with an extreme number of locks.) What worked in Racket BC was primarily carefully written numeric code.

In Racket CS, most primitives are now future-safe (by virtue of Chez Scheme's support for OS-level threads), basically the opposite of the Racket BC situation! I found the diff from the commit "guide: update discussion on futures for Racket CS" (racket/racket@4fcecee) an interesting view into what changed.
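For anyone who hasn't used futures, here is a minimal sketch of the basic pattern with `future` and `touch` from `racket/future`. The helpers `sum-range` and `parallel-sum` are hypothetical names I made up for illustration, and whether the two halves actually run in parallel depends on your machine and Racket build:

```racket
#lang racket/base
(require racket/future)

;; Hypothetical helper: sum the integers in [lo, hi).
(define (sum-range lo hi)
  (for/fold ([acc 0]) ([i (in-range lo hi)])
    (+ acc i)))

;; Split [0, n) into two halves and give each half its own future.
;; `touch` waits for a future's result (running it in the current
;; thread if it has not been run yet).
(define (parallel-sum n)
  (define mid (quotient n 2))
  (define f1 (future (lambda () (sum-range 0 mid))))
  (define f2 (future (lambda () (sum-range mid n))))
  (+ (touch f1) (touch f2)))
```

The result is the same as a sequential sum either way; futures only change how the work may be scheduled.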

That said, there is certainly also room for improvement, e.g. (as @samth wrote here) finding a way to make IO operations "synchronized" rather than "blocking".

Are you thinking of the --processes mode for e.g. raco setup? IIUC, places always use OS threads in the same OS process when they are able to run in parallel, and that mode exists to enable process parallelism when parallel places aren't supported (as was the case on non-x86{,_64} with Racket BC), but has to be implemented explicitly in raco setup. I don't think ordinary places ever run in separate OS processes.

(I say "ordinary places" because loci by Paulo Matos and racket/place/distributed exist, and prop:place-location provides some extensibility.)

(Tangentially, I wonder if the bottleneck from the OS page table described in the commit "docs: describe some limits of place scaling" (racket/racket@b223ce4) and this thread also applies with Racket CS. I don't have any machines with more than 16 cores to check.)

I agree with @samth on this, but, if someone wants to spend time making the benchmarks look faster, adding (#%declare #:unsafe) would probably give at least some benefit without requiring deep thought. (Unlike using (#%declare #:unsafe) in real life!)
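To show where the declaration goes, here is a small sketch of a module compiled in unsafe mode. The `dot` function is a hypothetical example of mine, not taken from any of the benchmarks, and I have not measured how much (if anything) it gains:

```racket
#lang racket/base
;; Module-level declaration: the compiler may assume that primitives
;; are always applied to valid arguments and skip the usual checks.
;; Passing bad arguments is then undefined behavior, not an error.
(#%declare #:unsafe)

;; Hypothetical numeric kernel: dot product of two flonum vectors.
(define (dot xs ys)
  (for/fold ([acc 0.0])
            ([x (in-vector xs)]
             [y (in-vector ys)])
    (+ acc (* x y))))
```

With valid inputs the results are unchanged; the difference only shows up as speed, or as misbehavior instead of an exception when the inputs are wrong.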
