What would it take to write an independent Racket interpreter?

The answers given so far cover things well, but I'll offer a few extra comments.

On the macro expander:

As @LiberalArtist and @gus-massa noted, the macro expander is implemented in Racket, and that implementation covers things like quote-syntax, phases, and the module system. The expander can't be a Racket library in the #lang sense, because it implements that library mechanism, but it can be a library in terms of an underlying library system that is potentially much simpler. In Racket CS, for example, the expander is wrapped as a Chez Scheme R6RS library.

Implementing a similar macro expander is maybe not that difficult, but building a separate and fully compatible macro expander is probably not practical. That would be similar to building an RnRS Scheme implementation that is not merely a drop-in replacement for another according to the standard, but that also works for every use of the other Scheme implementation. So, for the foreseeable future, any compatible and practical implementation of Racket is just going to use the main implementation of the macro system.

As @LiberalArtist noted, the "linklet" interface represents the macro expander's communication to an underlying compiler. Depending on how the underlying compiler works and how much performance is needed, the mapping through linklets can be fairly straightforward, but a lot of work went into this layer for Racket CS.
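
To make the idea of a linklet-style boundary a bit more concrete, here is a toy model in Python. It is only a sketch of the general shape (a unit of code with declared imports and exports that gets instantiated against other instances), not Racket's actual linklet API. The names `Instance`, `ToyLinklet`, and `instantiate` are made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Instance:
    """A named bag of variables, loosely modeled on a linklet instance."""
    name: str
    variables: Dict[str, object] = field(default_factory=dict)


@dataclass
class ToyLinklet:
    """Hypothetical stand-in for a linklet: imports, exports, and a body."""
    imports: List[List[str]]   # one list of names per imported instance
    exports: List[str]         # names the body promises to define
    body: Callable[[Dict[str, object]], Dict[str, object]]


def instantiate(linklet: ToyLinklet, name: str,
                import_instances: List[Instance]) -> Instance:
    """Run the body against the given imports and collect its exports."""
    if len(import_instances) != len(linklet.imports):
        raise ValueError("wrong number of import instances")
    env: Dict[str, object] = {}
    for names, inst in zip(linklet.imports, import_instances):
        for n in names:
            env[n] = inst.variables[n]  # KeyError if an import is missing
    defined = linklet.body(env)
    return Instance(name, {n: defined[n] for n in linklet.exports})


# A "runtime" instance supplies a primitive; the linklet builds on top of it.
runtime = Instance("runtime", {"add": lambda a, b: a + b})
doubler = ToyLinklet(
    imports=[["add"]],
    exports=["double"],
    body=lambda env: {"double": lambda x: env["add"](x, x)},
)
instance = instantiate(doubler, "doubler", [runtime])
print(instance.variables["double"](21))  # prints 42
```

In this picture, the expander's job would be to produce units like `doubler`, and the underlying compiler's job would be to run them efficiently. The real work lies in how well that second half performs.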

On the other primitives:

As @LiberalArtist and @gus-massa also noted, the 1700 "primitives" include things at different levels of primitiveness:

  • Some primitives are things that could be straightforwardly implemented as libraries, although there may be a performance advantage to building them into the kernel and directly supporting them in the compiler. This category is probably more than half of the primitives, and a big chunk is just operations on numbers (exact, inexact, complex, etc.).

  • Some primitives provide OS facilities like filesystem and network access. They could be implemented in Racket as libraries using a foreign-function interface, but since OS APIs are published in C, it's even easier to use a combination of Racket and C. The "rktio" library is the C half.

  • Finally, some primitives implement and reflect runtime facilities like the thread scheduler, delimited continuations, garbage collection, foreign-function interface, and access to compilation/evaluation. These parts make Racket a kind of OS itself (nested in a process within the host operating system). It's a bigger subset of the primitives than you might guess, and it's tricky to make it a layer that's isolated from the other kinds of primitives. For example, there's a connection between make-vector, which might seem purely a library entity, and memory limits as imposed on a custodian that manages the thread allocating a vector, which requires an extra check when the vector to allocate is large enough. Access to underlying OS facilities also interacts frequently with Racket-as-OS primitives. Racket CS layers this implementation somewhat (e.g., across the "thread" and "io" layers), but it's also the main source of special hooks that let lower layers call upper layers.
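
The make-vector example in the last item can be sketched as follows. This is a toy Python model, not Racket's implementation. The `Custodian` class, the `LARGE_ALLOCATION_BYTES` threshold, and the 8-byte word size are all assumptions chosen for illustration. The point is only that an allocator that looks like plain library code ends up consulting runtime-level state.

```python
from typing import List, Optional


class MemoryLimitExceeded(Exception):
    """Raised when an allocation would push a custodian past its limit."""


class Custodian:
    """Toy resource manager with an optional memory limit (hypothetical)."""

    def __init__(self, limit_bytes: Optional[int] = None):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0


WORD_BYTES = 8                 # assumed size of one vector slot
LARGE_ALLOCATION_BYTES = 1024  # assumed threshold for the extra check

current_custodian = Custodian()


def make_vector(size: int, fill: object = 0,
                custodian: Optional[Custodian] = None) -> List[object]:
    """Allocate a vector, checking the custodian's limit for large requests."""
    if custodian is None:
        custodian = current_custodian
    requested = size * WORD_BYTES
    # Small allocations skip the check entirely in this toy model.
    if requested >= LARGE_ALLOCATION_BYTES and custodian.limit_bytes is not None:
        if custodian.used_bytes + requested > custodian.limit_bytes:
            raise MemoryLimitExceeded(
                f"{requested} bytes requested, limit is {custodian.limit_bytes}")
    custodian.used_bytes += requested
    return [fill] * size


limited = Custodian(limit_bytes=4096)
small = make_vector(10, custodian=limited)       # 80 bytes, no check needed
try:
    make_vector(1_000_000, custodian=limited)    # 8 MB, exceeds the limit
except MemoryLimitExceeded as exc:
    print("refused:", exc)
```

Even in this small model, `make_vector` cannot live in a layer that knows nothing about custodians, which is the kind of cross-layer dependency described above.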

All of this could be implemented on top of a very small compiler and core runtime system. But making it perform well ends up blurring the lines, such as having the compiler recognize arithmetic operations to inline them with specialized number representations, or building support for continuation marks into the compiler.
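
As a rough illustration of that blurring, here is a toy expression compiler in Python that treats `+` specially instead of dispatching through a primitive table. It is a sketch under stated assumptions, not how Racket or Chez Scheme compiles arithmetic. The fixnum range, the `PRIMITIVES` table, and `compile_expr` are all invented for the example, and Python's own `+` stands in for what would be a single machine instruction.

```python
import operator
from fractions import Fraction

# Assumed fixnum range; real implementations pick this based on tag bits.
FIXNUM_MIN, FIXNUM_MAX = -(2 ** 60), 2 ** 60 - 1


def is_fixnum(v):
    """True for small exact integers (and not for booleans)."""
    return type(v) is int and FIXNUM_MIN <= v <= FIXNUM_MAX


def generic_add(a, b):
    """Slow path: handles every kind of number the host supports."""
    return a + b


# Primitives as an ordinary library table, looked up by name.
PRIMITIVES = {"+": generic_add, "*": operator.mul}


def compile_expr(expr):
    """Compile a nested-tuple expression into a closure over an environment."""
    if isinstance(expr, str):          # variable reference
        return lambda env: env[expr]
    if not isinstance(expr, tuple):    # literal constant
        return lambda env: expr
    op, *args = expr
    compiled = [compile_expr(a) for a in args]
    if op == "+" and len(compiled) == 2:
        # The compiler "knows" about +: emit a fixnum fast path inline.
        left, right = compiled

        def add_inlined(env):
            a, b = left(env), right(env)
            if is_fixnum(a) and is_fixnum(b):
                return a + b           # stand-in for a raw machine add
            return generic_add(a, b)   # fall back to the full numeric tower
        return add_inlined
    prim = PRIMITIVES[op]              # everything else: generic call
    return lambda env: prim(*(c(env) for c in compiled))


code = compile_expr(("+", "x", ("*", 2, 3)))
print(code({"x": 4}))                  # prints 10
print(code({"x": Fraction(1, 2)}))     # prints 13/2
```

Once the compiler contains a case like the one for `+`, the primitive is no longer just a library function, and the boundary between "compiler" and "primitives" is harder to draw.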
