Thanks to the amazing effort to get OS threads and thread pooling into Racket, this is now something to discuss. Non-blocking I/O is well on its way to being supported in most languages and is already the default in some (TypeScript, C#).
I think having support for it has educational value as well as having a positive impact on real world performance.
Of course, this requires a lot of thought as it has (potentially) more overall impact on the language than native thread support did.
It depends what you mean by non-blocking I/O. Racket threads already use non-blocking I/O APIs under the hood, similar to "green threads" in other languages that offer those constructs like Go, Java (via Project Loom), and others. For more promise/future-like APIs, there isn't anything out of the box in Racket but the synchronizable event system is more than capable of serving as a basis for such an abstraction.
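As a minimal sketch of how the synchronizable event system could serve as the basis for a promise-like abstraction: the `spawn` helper below is hypothetical (it is not part of Racket), and it only uses `thread`, a semaphore, and `wrap-evt` from `racket/base`. It makes no attempt to handle exceptions raised by the thunk; in that case the event would simply never become ready.

```racket
#lang racket/base

;; Hypothetical helper: run `thunk` in a Racket thread and return a
;; synchronizable event that becomes ready with the thunk's result.
(define (spawn thunk)
  (define result (box #f))
  (define done (make-semaphore 0))
  (thread (lambda ()
            (set-box! result (thunk))
            (semaphore-post done)))
  ;; semaphore-peek-evt does not decrement the semaphore, so the
  ;; returned event can be synchronized on any number of times.
  (wrap-evt (semaphore-peek-evt done)
            (lambda (_) (unbox result))))

(define p (spawn (lambda () (sleep 0.1) (* 6 7))))

(sync p) ; blocks this thread only, then returns 42
(sync p) ; already complete, returns 42 again
```

Because the result is an ordinary event, it composes with `sync/timeout`, `choice-evt`, and the rest of the event API without any new language-level construct.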
There are roughly 3 things I see people mean by "non-blocking IO":
1. Explicitly asynchronous APIs at the language level (like APIs that take callbacks in JS or async/await in C#).
2. Using epoll or similar to enable the runtime to multiplex IO calls under the hood.
3. Using new low-level asynchronous OS APIs such as io_uring on Linux, either at the API level or the implementation level.
In Racket, there's no particular need to add (1) to the language because everything can be written with lightweight threads. For example, the delay/thread form makes anything at all asynchronous in one way (other designs could be imagined and are also easy to implement). This is possible because Racket already implements (2) and always has.
There's so far nothing like (3) in Racket and it's not obvious what the best design would be, including whether that would involve just using the APIs inside the runtime or exposing more of it at the language level. More investigation here would be welcome.
The underlying I/O library (rktio) does use polling, but it is pretty inconsistent. Granted, a lot of that is dealing with Windows and trying to get that to play "nice" with Unix file descriptors, pipes and so on. This is completely understandable, but there are a lot of things that can be updated.
Agreed, not necessary, but from an educational standpoint, having async/await would be useful, as that is the model a lot of people will be using. Yes, I understand the debate around function coloring, but it is the most popular model and will be around for quite some time.
I think this is the main discussion.
One: Having a "fake" async/await model that meets pedagogical needs without low level changes might be good enough. However, that risks having a feature that should have a noticeable impact on throughput not having any real change in performance at all.
Two: Non-blocking, callback-based I/O (via io_uring, completion ports, etc.) is becoming the default, and not supporting it may have increasing performance and maintainability impacts. Modernizing this aspect of Racket would be useful future-proofing. However, this is a major change to a low-level part of Racket.
My opinions here don't matter, but I think creating a road-map item to implement the thread and I/O primitives via libuv (forked if absolutely needed) is the right thing to do long term. It gives Racket access to top-level performance and reliability in this area.
ETA: Also, having better low-level non-blocking I/O makes tools such as racket-language-server (and debug adapters) more responsive and much less likely to stall.
I continue to think you're confusing different levels here. It's not true that rktio is inconsistent in handling IO in a non-blocking way; there's nothing (modulo bugs) where one Racket-level thread waiting on IO at the rktio level causes other Racket-level threads to wait as well. This is not responsible for lack of responsiveness in racket-language-server or other problems.
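A small illustration of that behavior at the Racket level, as a sketch only: an in-memory pipe from `make-pipe` stands in for a slow socket or file here, so this does not exercise rktio itself, and the thread and variable names are made up for the example. One thread blocks in `read-line` while another keeps running.

```racket
#lang racket/base

;; An in-memory pipe stands in for a slow socket or file (assumption:
;; this shows Racket-level thread behavior, not rktio internals).
(define-values (in out) (make-pipe))

;; Reader thread: blocks in read-line until a full line is available.
(define reader
  (thread (lambda ()
            (define line (read-line in))
            (printf "reader got: ~a\n" line))))

;; Worker thread: keeps making progress while the reader is blocked.
(define counter 0)
(define worker
  (thread (lambda ()
            (for ([i (in-range 5)])
              (set! counter (add1 counter))
              (sleep 0.01)))))

(thread-wait worker)            ; worker finishes; reader is still waiting
(write-string "hello\n" out)    ; now supply the data the reader needs
(flush-output out)
(thread-wait reader)            ; prints "reader got: hello"
```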
A fake async/await implementation is easily doable as a library. Here's a simple implementation:
(define-syntax-rule (define/async (f x ...) body ...)
  (define f
    (lambda (x ...)
      ; could also use delay/sync for alternative semantics
      (delay/thread body ...))))
(define-syntax-rule (await e) (force e))
It's not obvious that adding this to the standard library is that useful but making a package for writing async/await style code could make sense.
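For anyone who wants to try the macros above, here is a usage sketch. The macros are repeated so the module is self-contained; `slow-square` is a hypothetical task invented for the example, and `racket/promise` is required because `delay/thread` and `force` are not in `racket/base`.

```racket
#lang racket/base
(require racket/promise) ; provides delay/thread and force

(define-syntax-rule (define/async (f x ...) body ...)
  (define f
    (lambda (x ...)
      (delay/thread body ...))))

(define-syntax-rule (await e) (force e))

;; Hypothetical task: pretend to do slow work, then return n squared.
(define/async (slow-square n)
  (sleep 0.1)
  (* n n))

;; Both calls return promises immediately; the bodies run in their own
;; Racket threads.
(define a (slow-square 3))
(define b (slow-square 4))

;; await blocks only the calling thread until each result is ready.
(+ (await a) (await b)) ; => 25
```

Since both tasks start before either is awaited, the two sleeps overlap rather than running back to back.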
For io_uring at a low level, adding support for that to rktio would be a good idea although it would have to be done with care. At a higher level, I don't know of any high-level languages that have added support yet and would be interested in hearing examples.
For libuv, I don't think switching from rktio is a feasible plan; the existing code is carefully written to maintain Racket's semantics and I'm skeptical that switching would be easier than improving the parts that need it (e.g., with io_uring).
Agreed that having async/await as a core part of Racket makes no sense. My apologies if I seemed to suggest that was the case. Indeed, it should just be a module.
There are differences between the Windows and Unix systems (of course) that do change how things work at a low level.
An example from rktio_file.c:
On Unix systems, files are opened with the O_NONBLOCK flag set.
On Windows systems, CreateFileW is called, but FILE_FLAG_OVERLAPPED is not set. So, all operations here are serialized at the OS level. How console input and output are handled is completely different as well.
Now that multiple native threads are available, I/O contention becomes a much more complex issue.
Node and Bun use io_uring when available. After all, libuv was created to support Node. TypeScript has its quirks for sure, but it is high level enough to count.
C# uses io_uring. For a while, .NET Core used libuv, until it was decided that a .NET-specific version was needed (versus overwhelming the libuv team).
Java (via Netty) uses it.
OCaml uses it via the eio library.
Haskell has a library to support it, and there was a project (Bachelor's thesis) to add support to the core GHC, but I can't see a roadmap item for io_uring support in the GHC core.
There were some security concerns, but the situation has greatly improved. Those issues did pause work on using it in Erlang (Erlang io-uring support - Chat / Discussions - Erlang Forums). There is still interest, as one project found a 30% reduction in CPU usage by using io_uring over kqueues. Of course, Erlang on Windows uses completion ports.
That certainly makes sense. I would suggest that looking at using libuv (or bits and bobs from it) under the hood for rktio has value. Use all the work they've done to get Windows, macOS and Linux under the same umbrella.
Finally, it seems best to leave discussion about any possible changes to the high-level Racket I/O model to later.
I am definitely in favor of exploring newer I/O APIs, though my interest has been more in copy_file_range/sendfile/fclonefileat than in io_uring, and I've never quite been in a position to actually devote time to working on implementation.
However, these APIs are not without problems. For one, last I checked there was no io_uring operation corresponding to copy_file_range.
Starting with libuv v1.45.0, some file operations on Linux are handed off to io_uring when possible. Apart from a (sometimes significant) increase in throughput there should be no change in observable behavior. Libuv reverts to using its threadpool when the necessary kernel features are unavailable or unsuitable. Starting with libuv v1.49.0 this behavior was reverted and Libuv on Linux by default will be using the threadpool again.
io_uring support was default-disabled because of numerous kernel bugs but those are all in the sqpoll (file i/o) parts of io_uring. Batching of epoll_ctl calls through io_uring works fine, is a nice optimization, and is therefore unconditionally enabled again. The UV_USE_IO_URING environment variable now only affects sqpoll, and only when the UV_LOOP_ENABLE_IO_URING_SQPOLL event loop flag is set.
The security concerns seem relatively debatable, especially for a runtime system that does not expose setuid(2), but it sounds like there have been significant problems on e.g. arm32 and powerpc64, and also (though I didn't investigate this in detail) corners with uneven performance. I am reminded that, to quote Kent Dybvig, “uniformity and continuity of performance are important.”
Matthew’s description of rktio in this talk stuck in my memory:
If you’re familiar with libuv, as used in Node, it’s that kind of thing, it just has exactly the right things that Racket needs and has been around for 20-or-so years …
Factoring out rktio from Racket BC was one of the first steps toward Racket CS.
If you were starting from a blank slate today, libuv would definitely be a good choice. As things are, borrowing code from other projects is absolutely valuable and welcome. One relatively recent example is the improvement of racket_get_self_exe_path with code from the LLVM project (suggestion, issue, patch).
Deep knowledge of Windows is definitely something our community can use lots more of.