Best way to integrate "schemesh", written in Chez Scheme, into the Racket ecosystem?

I would like to extend my program schemesh, currently written in Chez Scheme,
and integrate it into the Racket ecosystem.

A little background to explain the question:
schemesh is a Unix shell scriptable in Chez Scheme.

It provides a full Chez Scheme REPL, including the ability to create definitions,
load libraries, modules and C shared objects, and use the C FFI provided by Chez Scheme.

And of course, being a Unix shell, it can launch arbitrary external commands,
including pipelines, subshells, redirections, and most importantly job control.

Internally, it uses several Chez Scheme features that are not part of R6RS (see the list at the end of this post).

My question is: what's the best way to extend schemesh in order to integrate it into the Racket ecosystem?

This means:

  1. schemesh REPL must understand Racket syntax - at least the one corresponding to #lang racket - and must be able to (eval) it.
  2. schemesh must be able to access Racket packages
  3. optionally, add support for #lang schemesh to Racket and/or DrRacket

Some possible ways to proceed - this list is surely incomplete, more ideas are welcome:

a. do nothing and use Rash - The Reckless Racket Shell instead.
Rash does not have job control, and the author admitted having run out of steam.
See How does this project compare to RaSH: Racket shell? for Rash author's comments on the topic,
and Schemesh: Fusion between Unix shell and Lisp REPL | Hacker News for the surrounding discussion

b. rewrite schemesh in Racket
It would be painful (see the appendix below for the Chez Scheme extensions used, some of which are not available in Racket) and a lot of work, and it would create a fork that also needs to be maintained.

Such duplication would also slow down all future work on schemesh,
because the modifications would need to be implemented twice, both in the original version and in the Racket fork.

c. take the existing schemesh, compiled as a Chez Scheme library, and load it from Racket
No idea if that's even possible, if it can be implemented by extending Racket, etc.

d. add #lang chezscheme to Racket, and use it from Racket to compile schemesh sources.
Again, no idea if that's even possible, if it can be implemented by extending Racket, etc.

If schemesh was a pure R6RS program, one would just add #!r6rs to every file and load them in Racket.

Of course, this is not the case: it uses several Chez Scheme features that are not part of R6RS,
plus a small library written in C for low-level POSIX syscalls.

=============================================================
Appendix: ordered from the most critical to the least critical one, the used Chez Scheme features are:

(register-signal-handler) and (keyboard-interrupt-handler)
needed for installing signal handlers for POSIX signals SIGINT, SIGCHLD, SIGQUIT
and quickly reacting to them

($primitive $event)
if a POSIX signal was received, calls the corresponding signal handler.
By default, Chez Scheme periodically calls ($primitive $event), but I need to call it immediately after a C function fails with errno set to EINTR,
because that means some POSIX signal has been received and I need to call the corresponding signal handler
before retrying the C function, which may block for an arbitrarily long time. Examples: read() or write() on a pipe file descriptor.

(read-token) and (unread-char)
used to parse a single token of Scheme syntax - otherwise I would need to reimplement a Scheme syntax parser from scratch.
(read) is not a suitable alternative because it does not recognize the syntax extension tokens
added by schemesh for switching from Scheme syntax to shell syntax: #!shell { }

(interaction-environment) and (eval form environment)
the mutable Chez Scheme environment containing all top-level R6RS bindings plus Chez Scheme extensions,
and the (eval) procedure to implement a REPL.
Since schemesh is a REPL, expressions evaluated at REPL must be able to access top-level bindings, and may also create new ones.

(top-level-bound?) (top-level-value) (meta-cond) (library-exports)
used to check for some Chez Scheme bindings that are not always present, such as:
(make-thread-parameter) (make-flvector) (flvector-set!)

(foreign-procedure) (lock-object) and (unlock-object)
the core of Chez Scheme C FFI, schemesh also uses it for bidirectional exchange of Scheme objects with C functions
such as vectors, bytevectors and lists.

If I understand correctly, Racket C FFI can only exchange C types with C functions,
i.e. one needs to malloc(), copy a Racket string or byte string into the allocated memory,
and pass such memory to C functions. It may be enough, but the porting will be somewhat painful.

(environment-symbols)
used for autocompletion with the TAB key: needed to retrieve the top-level bindings present in (interaction-environment)
and match them against user-entered text.

(generate-temporaries)
used for hygienic macros that need to introduce a variable number of symbols into their expansion

The full list is longer, but the remaining procedures are less critical and this post is already long enough.

Thanks for any feedback!


Hi, thanks for the detailed post.

SCSH looks great.

Please forgive my ignorance, I have two questions:

  1. What advantage does integration between SCSH and the Racket REPL have over calling racket --repl from SCSH? See 24.1 Command-Line Tools

  2. Can SCSH call racket scripts? (As described in the guide: 21.2 Scripts )

Best regards

Stephen

Hi @spdegabrielle,

please don't confuse schemesh with scsh

The names are similar, but they have different approaches at merging a Unix shell with a Lisp REPL:

  • my schemesh is a shell scriptable in Chez Scheme: if you type ls and press ENTER, it executes the command ls. It also has job control and multiline editing, and shell commands can be managed from Scheme syntax too. The Scheme REPL is literally one character away: any Scheme expression, macro or definition typed inside parentheses does what you expect. Example: (display (+ 1 2))

  • scsh is a Scheme48 REPL: if you type ls and press ENTER, it tries to evaluate the top-level Scheme variable ls. It has additional facilities to start external commands, but among other things it lacks job control, and has no shell syntax.

About your questions:

What advantage does integration between SCSH and the Racket REPL have over calling racket --repl from SCSH? See 24.1 Command-Line Tools

The advantage is a single integrated environment, where shell commands and Lisp values can be freely mixed. You can store a shell command in a Lisp variable, then later start it, and collect its exit status from Lisp. Also, you can inject Lisp values into a shell command. A minimal example showing both sides of the integration is:

(define j {find (scheme-expression-returning-a-directory-as-string) -type f | wc -l})
(define str (sh-run/string j))
(display str)

If you run racket --repl from schemesh, then only strings, byte strings and files can be exchanged easily between schemesh and Racket:
you lose the ability to natively exchange arbitrary Scheme (or Racket) objects between the two programs.

Can SCSH call racket scripts? (As described in the guide: 21.2 Scripts )

Yes, of course it can: it can start them as OS-level processes.
But again, you lose the ability to natively exchange arbitrary Scheme (or Racket) objects between the two programs.

Regards,

Massimiliano


It would be wonderful to have a shell inside of Racket.

If you can live without job control, Rash is exactly that.

It's both a shell scriptable in Racket and a Racket REPL.


I know :slight_smile:

(1) I'd like someone to promise "maintenance" and (2) I want it all :slight_smile:

My apologies, and thank you for the explanation!

This sounds very exciting! I have a few off-hand thoughts.

Loading the existing schemesh from Racket (your option c) is definitely something you could use for all or part of the code. You can use the ffi/unsafe/vm library to access Chez Scheme functionality from Racket: there’s more discussion here on how to load additional Chez Scheme libraries.

However, as the name of ffi/unsafe/vm implies, Chez Scheme functionality corresponds to the “unsafe” layer of Racket. There are some documented caveats, particularly about potential uses of dynamic-wind and Racket procedures that may not be Chez Scheme procedures.
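As a rough illustration of the ffi/unsafe/vm route, here is a minimal sketch that hands a form to Chez Scheme's own eval. It assumes Racket CS; the name chez-sum is purely illustrative, and real schemesh forms would replace the toy expression.

```racket
#lang racket/base
;; Sketch only: evaluate a form with the underlying Chez Scheme evaluator.
(require ffi/unsafe/vm)

;; vm-eval reaches Chez Scheme only on Racket CS, so guard on the VM type.
;; On other VMs chez-sum is simply #f.
(define chez-sum
  (and (eq? (system-type 'vm) 'chez-scheme)
       (vm-eval '(+ 1 2))))
```

The documented caveats about dynamic-wind and Racket procedures still apply to anything evaluated this way.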

I think it would probably work out better to write a compatibility layer in Racket that would let most of your code be shared between both implementations.

To the extent you need them, ffi/unsafe/vm is probably the best way to get these. You could also consider the unix-signals package.

However, with respect to blocking C functions, note that Racket’s IO functions are non-blocking at the level of the OS process (though some block at the level of the green Racket thread). You might want to use Racket’s process control functionality instead of going through C. In particular, some variants can handle bridging the Racket-thread-specific current-directory parameter and wiring up arbitrary Racket ports to stdin/out/err.
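A minimal sketch of that process control functionality, assuming a Unix system with sh available. The helper name count-entries is hypothetical; it runs a pipeline in a chosen directory and captures its output as a string, using system from racket/system rather than any C code.

```racket
#lang racket/base
(require racket/port racket/system)

;; Run a shell pipeline as a subprocess and capture its standard output.
;; The subprocess inherits the working directory from the current-directory
;; parameter, and its stdout is wired to the Racket string port set up by
;; with-output-to-string.
(define (count-entries dir)
  (parameterize ([current-directory dir])
    (with-output-to-string
      (lambda () (system "ls | wc -l")))))

(define root-count (count-entries "/"))
```

Here root-count holds whatever wc printed, including any padding and the trailing newline.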

Racket’s reader is highly extensible: you can add dispatch and delimiter macros to the readtable to cover #!shell, {, }, and everything else. Extending the Racket reader can also help you cooperate with syntax coloring and REPL support that works with both DrRacket and Racket’s command-line expeditor (and maybe racket-mode in Emacs, and/or other editors via the LSP?).
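A minimal sketch of the readtable approach, under simplifying assumptions: { is mapped to a reader macro that collects raw text up to the first } and returns it wrapped in a (shell "...") form. The symbol shell and the procedure name read-shell-block are hypothetical, and nested braces are not handled.

```racket
#lang racket/base

;; Reader macro for `{`: collect characters up to the first `}`.
;; The four optional arguments are the source-location values that
;; the reader may pass along.
(define (read-shell-block ch in [src #f] [line #f] [col #f] [pos #f])
  (let loop ([chars '()])
    (define c (read-char in))
    (cond
      [(eof-object? c) (error 'read-shell-block "missing closing brace")]
      [(char=? c #\}) (list 'shell (list->string (reverse chars)))]
      [else (loop (cons c chars))])))

;; Extend the default readtable (#f) so that `{` triggers the macro.
(define shell-readtable
  (make-readtable #f #\{ 'terminating-macro read-shell-block))

(define parsed
  (parameterize ([current-readtable shell-readtable])
    (read (open-input-string "(define j {ls -l | wc -l})"))))
;; parsed is '(define j (shell "ls -l | wc -l"))
```

A real implementation would parse the shell syntax rather than keep it as a string, and a dispatch macro could cover #!shell in a similar way.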

The needed functionality definitely exists, and may be as simple as calling (read-eval-print-loop) after setting up an appropriate namespace and such. If you need finer-grained control, all of the pieces are also provided.
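A minimal sketch of the namespace setup, assuming racket/base is enough as the starting language. The names ns, answer and result are illustrative only.

```racket
#lang racket/base

;; A mutable top-level namespace, playing a role similar to
;; Chez Scheme's (interaction-environment).
(define ns (make-base-namespace))

;; Forms evaluated in ns can create top-level bindings ...
(eval '(define answer 41) ns)

;; ... and later forms can see them.
(define result (eval '(+ answer 1) ns))   ; result is 42

;; An interactive loop over the same namespace would be:
;; (parameterize ([current-namespace ns]) (read-eval-print-loop))
```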

There are ways to do conditional compilation and reflection in Racket, though you may need less of it: Racket always has flvectors, for example.

The Racket FFI is a layer on top of the Chez Scheme FFI, and vectors, bytevectors (a.k.a. byte strings), and lists are the same in Racket as in Chez Scheme (except IIRC for impersonated vectors, including chaperones), so all of this should work. In particular, a byte string is a Racket cpointer?. This might be an area where ffi/unsafe/vm would be useful.

If my suggestions above work out, the autocompletion in expeditor should do this for you. If you end up needing to reimplement more functionality, all the reflective operations you need exist, e.g. namespace-mapped-symbols.
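If completion does have to be reimplemented, a minimal sketch on top of namespace-mapped-symbols could look like this. The helper complete-symbol and the demo binding are hypothetical.

```racket
#lang racket/base
(require racket/string)

;; Return, sorted, the names bound in ns that start with prefix.
(define (complete-symbol prefix ns)
  (sort (for/list ([sym (in-list (namespace-mapped-symbols ns))]
                   #:when (string-prefix? (symbol->string sym) prefix))
          (symbol->string sym))
        string<?))

(define ns (make-base-namespace))
(eval '(define schemesh-demo-binding 1) ns)

(define matches (complete-symbol "schemesh-demo" ns))
```

Bindings created at the REPL show up alongside the imported ones, so matches includes "schemesh-demo-binding".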

We’ve got it: generate-temporaries

(Also, this one is standardized in (rnrs syntax-case).)
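For illustration, a small Racket macro that introduces a variable number of fresh identifiers with generate-temporaries. The macro name sum-once is hypothetical.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Bind each argument expression to its own fresh temporary,
;; then add up the temporaries.
(define-syntax (sum-once stx)
  (syntax-case stx ()
    [(_ e ...)
     (with-syntax ([(tmp ...) (generate-temporaries #'(e ...))])
       #'(let ([tmp e] ...)
           (+ tmp ...)))]))

(define total (sum-once 1 2 3))   ; total is 6
```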


Wow, that's a lot of useful and detailed information :smiley:
Thanks @LiberalArtist !

To the extent you need them, ffi/unsafe/vm is probably the best way to get these. You could also consider the unix-signals package.

The ffi/unsafe/vm library is exactly what I needed. Since it gives access to Chez Scheme's (register-signal-handler) and ($primitive ...), I probably won't need the unix-signals package.

However, with respect to blocking C functions, note that Racket’s IO functions are non-blocking at the level of the OS process (though some block at the level of the green Racket thread). You might want to use Racket’s process control functionality instead of going through C. In particular, some variants can handle bridging the Racket-thread-specific current-directory parameter and wiring up arbitrary Racket ports to stdin/out/err.

It did not occur to me that Racket's IO functions are non-blocking, but it's obvious in retrospect since it has green threads. I will have to use them instead of blocking C functions, and hope they do not cause too much havoc with (dynamic-wind) - see below.

The known issue with Chez Scheme's (dynamic-wind) not cooperating with Racket green threads is important, as schemesh internally uses (dynamic-wind) quite a lot.

I know Rash was built on top of Racket's process control package, and I will need to verify if that package is needed - and if it's sufficient - when loading schemesh inside Racket.

Also, the tip for fixing (import) when using ffi/unsafe/vm is essential too.

I think it would probably work out better to write a compatibility layer in Racket that would let most of your code be shared between both implementations.

Yes, definitely.

Racket’s reader is highly extensible: you can add dispatch and delimiter macros to the readtable to cover #!shell, {, }, and everything else. Extending the Racket reader can also help you cooperate with syntax coloring and REPL support that works with both DrRacket and Racket’s command-line expeditor (and maybe racket-mode in Emacs, and/or other editors via the LSP?).

For the syntax parsing, I do not yet know enough of Racket's internals and how to customize them in order to understand the best way to proceed. The extensible readtable you point out looks promising, and seems similar in spirit to Common Lisp's one - which incidentally, I have some experience with.

I will tackle the syntax coloring at a later time, and it's good to know Racket has a mechanism to customize it too.

(interaction-environment) and (eval form environment)

The needed functionality definitely exists, and may be as simple as calling (read-eval-print-loop) after setting up an appropriate namespace and such. If you need finer-grained control, all of the pieces are also provided.

Yes, schemesh will need access to Racket's (eval) or to something slightly higher level, such as the (read-eval-print-loop) you suggest.

The Racket FFI is a layer on top of the Chez Scheme FFI, and vectors, bytevectors (a.k.a. byte strings), and lists are the same in Racket as in Chez Scheme (except IIRC for impersonated vectors, including chaperones), so all of this should work. In particular, a byte string is a Racket cpointer?. This might be an area where ffi/unsafe/vm would be useful

Good to know :smiley: and in the worst case, now I know how to use Chez Scheme FFI from Racket: just load with ffi/unsafe/vm the Scheme sources that use Chez Scheme FFI.

If my suggestions above work out, the autocompletion in expeditor should do this for you. If you end up needing to reimplement more functionality, all the reflective operations you need exist, e.g. namespace-mapped-symbols.

We'll see. Can it also autocomplete file and directory names depending on the current syntax and inside quoted strings?

All considered, it seems a somewhat long and sometimes tricky approach - but definitely feasible.

Thanks again!