Multi Process Racket (or even distributed)

I wrote this in response to another question, but also to get these things out of my head and written down. I am also curious about other interesting similar or related tools.

Personally I haven't done a lot of multi-process Racket in a practical sense, but I have looked into the options in the past because the topic interests me, so I thought I would mention them here as a sort-of overview.

rackety multi process

  • Parallelism with Places describes places; the same page also covers distributed places for multiple machines. [os-threads, multi-machine]
  • loci is a library by @pmatos that is similar to places but overcomes memory locking issues that can be observed with high core counts. [os-processes]
  • tcp client-server (This could make sense if you don't want to use one of the places variants)
  • Syndicate (classic) "Syndicate is an actor language where all communication occurs through a tightly controlled shared memory, dubbed the dataspace. [...]" It seems that a new version of syndicate is in development, but it currently doesn't have a working documentation link, so I am linking to the classic version which seems more stable.
  • Goblins: a transactional, distributed actor model environment "Goblins is a quasi-functional distributed object system, mostly following the actor model. Its design allows for object-capability security, allowing for safe distributed programming environments. [...]"
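
To give a feel for the places option from the list above, here is a minimal sketch of one worker place that squares a number it receives over its channel. The name `main` and the squaring work are just placeholders I made up for illustration. The place is created inside a function instead of at module top level, because the new place instantiates the enclosing module again.

```racket
#lang racket
;; Minimal places sketch: one worker place squares the number it receives.
;; `main` and the squaring job are hypothetical, chosen for illustration.
(provide main)

(define (main)
  ;; `place` starts a new Racket instance that runs in parallel;
  ;; `ch` is the worker's end of the place channel.
  (define p
    (place ch
      (define n (place-channel-get ch))
      (place-channel-put ch (* n n))))
  (place-channel-put p 7)               ; send the input to the worker
  (define result (place-channel-get p)) ; block until the worker answers
  (place-wait p)                        ; wait for the place to finish
  result)

(module+ main
  (displayln (main)))
```

Save it to a file and run it with `racket`, since the place needs to load the enclosing module by its path.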

In general I would say that true multi-process on a single machine is still easier than dealing with multiple machines, but you are already very close to multi-machine solutions, so I think you should consider whether you truly only need one machine or might also need several.

With one machine you don't have to worry about network latency, uptime and all that stuff. But when you build something that can run on multiple machines, you can often design it so that it also runs easily on one machine. The reverse is more difficult when distribution is added as an afterthought.

general tools

For example, ZeroMQ allows you to do messaging between applications running on the same or different machines, and also between applications written in different programming languages. Racket seems to have 3 different ZeroMQ packages. I haven't tested them, so I am not sure which works best.

I have some experience using redis (with other programming languages) and I liked it a lot! When your problem can be mapped to redis, I think you should really consider it, because I found it quite a lot of fun to use. There are redis packages for Racket (but if you really only need to merge message streams it may be a bit much).

Kafka also seems very interesting. So far I haven't had an excuse to use it or look deeper into it, but it seems @bogdan is working on a package for it: Kafka package


Overall I like the simplicity and flexibility I would get from just writing my own TCP server with Racket, especially the ability to add new message types when the need arises. ZeroMQ seems very valuable if you have to interact with a multitude of programming languages. Redis is a good choice if your problem fits it and you can make use of its features, for example as a cache (but it can do a lot more).
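
To make the "own TCP server" idea a bit more concrete, here is a small sketch under my own assumptions: messages are plain s-expressions sent with `write` and parsed with `read`, and the message types (`ping`, `add`) and the names `handle-message`, `serve` and `demo` are invented for this example. Adding a new message type means adding one `match` clause. Using `read` like this is only reasonable between processes you trust.

```racket
#lang racket
;; Sketch of a tiny s-expression protocol over TCP.
;; Message types and function names are hypothetical.

;; Dispatch on message shape; a new message type is one more clause.
(define (handle-message msg)
  (match msg
    [(list 'ping) '(pong)]
    [(list 'add (? number? a) (? number? b)) (list 'sum (+ a b))]
    [_ '(error "unknown message")]))

;; Accept a single client and answer its messages until it sends EOF.
(define (serve listener)
  (define-values (in out) (tcp-accept listener))
  (let loop ()
    (define msg (read in))
    (unless (eof-object? msg)
      (write (handle-message msg) out)
      (newline out)
      (flush-output out)
      (loop)))
  (close-input-port in)
  (close-output-port out))

;; Run server and client in one process, just to show the round trip.
(define (demo)
  ;; Port 0 asks the OS for a free port; tcp-addresses reports which one.
  (define listener (tcp-listen 0 4 #t "127.0.0.1"))
  (define-values (local-host port remote-host remote-port)
    (tcp-addresses listener #t))
  (define server (thread (lambda () (serve listener))))
  (define-values (in out) (tcp-connect "127.0.0.1" port))
  (write '(add 1 2) out)
  (newline out)
  (flush-output out)
  (define reply (read in))
  (close-output-port out) ; the server sees EOF and stops its loop
  (close-input-port in)
  (thread-wait server)
  (tcp-close listener)
  reply)

(module+ main
  (displayln (demo))) ; prints (sum 3)
```

In a real setup the server and client would of course live in separate processes or on separate machines; the single-process demo only keeps the example self-contained.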


There are also os-threads. I haven't used them myself, but I wonder whether the GC would make it hard to scale to many cores.

Currently I tend to start independent racket processes from the command line, but this requires one Racket VM per job, which is a lot (say with 10 processes, that's 10x one VM, about 2-4GB). Sharing some of this would be nice.


Another pretty interesting third party library: Distributed Places to run jobs on different physical machines via ssh. It seems to have lots of features.

I agree that it is interesting, but it is part of the main distribution and I saw it as the distributed variant of places. So I mainly wanted to say that it doesn't seem third party to me.


Marketplace: Network-Aware Programming
