Chez for architectures without native backends

In light of this enhancement in Racket 8.5:

I'm trying to understand how to adapt Guix's package of Racket's Chez Scheme variant to support e.g. powerpc64le-unknown-linux-gnu, which is one of Guix's supported systems. (I think the same would apply for mips64el-unknown-linux-gnu, riscv64-unknown-linux-gnu, and i586-unknown-gnu, i.e. the Hurd.)

Currently, Guix bootstraps Racket's Chez Scheme by building Racket BC [3M], running racket rktboot/main.rkt to generate bootfiles for the inferred current system, and then doing essentially ./configure && make && make install: we pass flags like --prefix= and --threads, but we do not explicitly specify the machine type. The Guix package definition also does not yet deal with cross-compilation, though we'd like it to. Elsewhere, we have functions to convert from our ${arch}-${os} representation of system types (inherited from Nix, e.g. x86_64-linux) to Chez machine types and to report on the state of upstream Chez support for various systems (e.g. threading, whether bootfiles are checked in), but they handle some edge cases poorly and need more work.
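
To make the conversion step concrete, here is a minimal sketch of such a mapping written as a shell function. The function name guix_system_to_chez_machine is hypothetical (Guix's real helpers are written in Guile), and the table is illustrative rather than exhaustive: it lists only a few native backends and prints nothing for every other system, so a caller can fall back to a pb build.

```shell
#!/bin/sh
# Hypothetical helper: map a Guix system string to a Chez Scheme machine type.
# Only a few native backends are listed (illustrative, not exhaustive);
# any other system prints an empty line so the caller can fall back to pb.
guix_system_to_chez_machine() {
    system=$1      # e.g. x86_64-linux
    threaded=$2    # "yes" requests the threaded ("t"-prefixed) variant
    case $system in
        x86_64-linux)  machine=a6le ;;
        i686-linux)    machine=i3le ;;
        aarch64-linux) machine=arm64le ;;
        armhf-linux)   machine=arm32le ;;
        *)             machine= ;;
    esac
    if [ -n "$machine" ] && [ "$threaded" = yes ]; then
        machine=t$machine
    fi
    printf '%s\n' "$machine"
}

guix_system_to_chez_machine x86_64-linux yes
guix_system_to_chez_machine powerpc64le-linux yes
```

The second call prints an empty line, which is exactly the case this thread is about: there is no native machine type to report for powerpc64le.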

Is that approach supposed to work for platforms without native code generation?

From my reading of the Chez configure script, I thought supplying --pb (or maybe we should use --pbarch?) still required a Chez machine type to be either inferred or supplied via -m=, and I don't know how to translate these architectures to machine types.

Yes, the more I look at it, it's clear that the claim to support other platforms was premature. Racket CS can now work in principle, but some pieces need to be filled in.

For a start, the Chez Scheme configure script shouldn't require a machine to go with --pb. You can get past that obstacle by providing any machine type with -m=, even using a made-up machine name. A good generic choice might be a pbarch name like tpb64le. Still, there's no longer a reason to require a machine type.

For the corners where pb mode needs machine-specific configuration, the intent is that "version.h" detects the machine as needed. For example, "version.h" recognizes ppc64 to enable big-endian mode in the runtime system... but that's not right for ppc64le. No doubt other things need to be fixed or added in "version.h".
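
One way to check what "version.h" ought to conclude for a given target is to ask the cross-compiler for its predefined macros. The sketch below assumes a GCC- or Clang-style compiler that defines __BYTE_ORDER__; the function name endianness_from_macros is made up for this example, and it only classifies a macro dump read from standard input.

```shell
#!/bin/sh
# Classify a target's byte order from a compiler's predefined-macro dump
# read on standard input (GCC and Clang define __BYTE_ORDER__).
endianness_from_macros() {
    order=$(sed -n 's/^#define __BYTE_ORDER__ __ORDER_\([A-Z]*\)_ENDIAN__.*$/\1/p')
    case $order in
        LITTLE) echo little ;;
        BIG)    echo big ;;
        *)      echo unknown ;;
    esac
}

# Intended use with a real cross-compiler (not run here):
#   powerpc64le-linux-gnu-gcc -dM -E - </dev/null | endianness_from_macros
printf '%s\n' '#define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__' | endianness_from_macros
```

For a ppc64le toolchain this should report little, which is why keying big-endian mode on ppc64 alone gives the wrong answer.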

Meanwhile, Racket's configure script looks for specific architectures to pick a pbarch machine type, and it also still requires a non-pb machine type. We can fill in more cases, but there should also be a way to specify a pbarch variant directly.

Probably the way forward is to fix configure scripts to not require a non-pb machine type, and then see what happens when building for different platforms.

I gave this a try, cross-compiling from x86_64-linux-gnu to powerpc64le-linux-gnu, but configure did too much error checking:

starting phase `configure'
source directory: "/tmp/guix-build-pb-chez-9.5.7.6.drv-0/source/racket/src/ChezScheme" (relative from build: ".")
build directory: "/tmp/guix-build-pb-chez-9.5.7.6.drv-0/source/racket/src/ChezScheme"
configure flags: ("-m=tpb64le" "--pb" "--disable-x11" "--threads" "--installprefix=/gnu/store/24iibwyyjy6l0an61rm063cszpglaxwz-pb-chez-9.5.7.6" "--threads" "ZLIB=-lz" "LZ4=-llz4" "--libkernel" "--nogzip-man-pages")
Don't select pb using -m or --machine, because pb needs the
 machine as the kernel host machine. Instead, use --pb or --pbarch
 to select a pb (portable bytecode) build.
error: in phase 'configure': uncaught exception:
%exception #<&invoke-error program: "./configure" arguments: ("-m=tpb64le" "--pb" "--disable-x11" "--threads" "--installprefix=/gnu/store/24iibwyyjy6l0an61rm063cszpglaxwz-pb-chez-9.5.7.6" "--threads" "ZLIB=-lz" "LZ4=-llz4" "--libkernel" "--nogzip-man-pages") exit-status: 1 term-signal: #f stop-signal: #f> 
phase `configure' failed after 0.0 seconds
command "./configure" "-m=tpb64le" "--pb" "--disable-x11" "--threads" "--installprefix=/gnu/store/24iibwyyjy6l0an61rm063cszpglaxwz-pb-chez-9.5.7.6" "--threads" "ZLIB=-lz" "LZ4=-llz4" "--libkernel" "--nogzip-man-pages" failed with status 1

Using -m=ignored caused configure to complain that it wasn't a recognized machine type. I also tried -m=ta6osx as a valid but irrelevant machine type, which got a bit further, but ultimately failed with this error:

powerpc64le-linux-gnu-gcc  -m64 -O2 -Wpointer-arith -Wall -Wextra -Wno-implicit-fallthrough -c  -DPORTABLE_BYTECODE -I../boot/pb    pb.c
powerpc64le-linux-gnu-gcc  -m64 -O2 -Wpointer-arith -Wall -Wextra -Wno-implicit-fallthrough -c  -DPORTABLE_BYTECODE -I../boot/pb    main.c
cp -p main.o ../boot/pb/main.o
powerpc64le-linux-gnu-ar rc ../boot/pb/libkernel.a statics.o segment.o alloc.o symbol.o intern.o gcwrapper.o gc-011.o gc-par.o gc-ocd.o gc-oce.o number.o schsig.o io.o new-io.o print.o fasl.o vfasl.o stats.o foreign.o prim.o prim5.o flushcache.o schlib.o thread.o expeditor.o scheme.o compress-io.o random.o ffi.o pb.o 
:
powerpc64le-linux-gnu-gcc  -m64 -O2 -Wpointer-arith -Wall -Wextra -Wno-implicit-fallthrough  -o ../bin/pb/scheme ../boot/pb/main.o ../boot/pb/libkernel.a -lz -llz4  -liconv -lm -lncurses
powerpc64le-linux-gnu-ld: cannot find -liconv
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:19: ../bin/pb/scheme] Error 1
make[1]: *** [Makefile:21: build] Error 2
make: *** [Makefile:20: build] Error 2
error: in phase 'build': uncaught exception:
%exception #<&invoke-error program: "make" arguments: ("-j" "16") exit-status: 2 term-signal: #f stop-signal: #f> 
phase `build' failed after 4.2 seconds
command "make" "-j" "16" failed with status 2
note: keeping build directory `/tmp/guix-build-pb-chez-9.5.7.6.drv-5'
builder for `/gnu/store/3phmm6jsmw6bchxlb4pcsw1kycw42sz7-pb-chez-9.5.7.6.drv' failed with exit code 1
build of /gnu/store/3phmm6jsmw6bchxlb4pcsw1kycw42sz7-pb-chez-9.5.7.6.drv failed
View build log at '/var/log/guix/drvs/3p/hmm6jsmw6bchxlb4pcsw1kycw42sz7-pb-chez-9.5.7.6.drv.gz'.
guix build: error: build of `/gnu/store/3phmm6jsmw6bchxlb4pcsw1kycw42sz7-pb-chez-9.5.7.6.drv' failed

which could well be a symptom of some broader problem with cross-compilation (which I have not dealt with before even for fully-supported architectures), but it also made me question whether the -m=ta6osx is influencing things: do I recall correctly that -liconv is needed on Mac but not with glibc?

I should have reported back yesterday that I tried some of these things and ran into similar trouble. The configure script in Git HEAD now omits the check that rejected -m=tpb64le (but I miswrote, and it should be tpb64l without the e). You could try applying a patch to configure to see if it lets you get further.
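
As a sketch of the naming scheme implied by that correction: a pbarch name appears to be an optional "t" for threads, then "pb", the word size, and "l" or "b" for endianness. Only tpb64l is confirmed in this thread; the other combinations the helper can produce, and the helper name pbarch_name itself, are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: assemble a pbarch machine name from word size,
# endianness, and threading. Only "tpb64l" is confirmed by the thread;
# the other combinations are assumed to follow the same pattern.
pbarch_name() {
    bits=$1        # 32 or 64
    endian=$2      # little or big
    threaded=$3    # "yes" adds the "t" prefix
    case $bits in
        32|64) ;;
        *) echo "unsupported word size: $bits" >&2; return 1 ;;
    esac
    case $endian in
        little) suffix=l ;;
        big)    suffix=b ;;
        *) echo "unknown endianness: $endian" >&2; return 1 ;;
    esac
    prefix=
    if [ "$threaded" = yes ]; then
        prefix=t
    fi
    printf '%spb%s%s\n' "$prefix" "$bits" "$suffix"
}

pbarch_name 64 little yes
```

The call above prints tpb64l, the name to use for threaded powerpc64le.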

Thanks! I've rebased my Guix branch for Zuo on top of the updates for 8.5, so I think I will try it there—and probably also try to get cross-compilation working more generally, at least at the VM level—and then see if it makes sense for Guix to add a patch to 8.5 or just wait for 8.6.

Meanwhile, building Chez out of source did not turn out to be enough to fix the test failures. I'll try disabling parallel tests next.

@mflatt if you need access to an unsupported machine, the GCC Compile Farm has a few big machines, like a big Sparc64 with Debian. You can request an account there. Despite the name, they're open to any FLOSS project.

Thanks for that suggestion! I was granted an account, and so far I've used it to repair the build for non-threaded ppc64le (just needed some configure refinements) and sparc64 (deep alignment problems there). I haven't yet pushed the changes for those, but soon.

I tried this again today, picking ppc64le because it seems to be reasonably well supported on Guix's side.

I first tried to build via emulation, which worked for 3M, but the Chez build failed at some point with an unhelpful qemu error. I'll look into that again later.

I then tried cross-compiling (from x86_64-linux), which got further, but failed with a number of linker errors, e.g.:

powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_make_reference_bytevector':
prim5.c:(.text+0x91c): undefined reference to `pthread_getspecific'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_showalloc':
prim5.c:(.text+0x1820): undefined reference to `pthread_getspecific'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o):prim5.c:(.text+0x2c80): more undefined references to `pthread_getspecific' follow
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_mod':
prim5.c:(.text+0x573c): undefined reference to `fmod'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_exp':
prim5.c:(.text+0x5790): undefined reference to `exp'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_log':
prim5.c:(.text+0x57e4): undefined reference to `log'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_log2':
prim5.c:(.text+0x5840): undefined reference to `log'
powerpc64le-linux-gnu-ld: prim5.c:(.text+0x5850): undefined reference to `log'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_pow':
prim5.c:(.text+0x58b4): undefined reference to `pow'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_sqrt':
prim5.c:(.text+0x5908): undefined reference to `sqrt'
powerpc64le-linux-gnu-ld: tpb64l/boot/tpb64l/libkernel.a(prim5.o): in function `s_sin':
prim5.c:(.text+0x595c): undefined reference to `sin'

I've put the full build log at <philipmcgrath.com/tmp/05bkf12zbca2k6gn4w7y0cacrhxwri-chez-scheme-for-racket-9.5.9.2.drv.gz>.

I haven't tried cross-compiling for an architecture with a native backend, so the problem could be something with that, too.

The Chez Scheme configure script doesn't currently try to figure out flags for platforms that are not specifically supported. For Linux with threads, you'd probably need to supply CFLAGS="-O2 -pthread" and LIBS="-lm", at least.

If you run the Racket configure script, it's more likely to figure out these flags and propagate them to the Chez Scheme build. But I understand that building Racket with an automatic Chez Scheme build is not what you're trying to do.

Thanks! Using CFLAGS=-g -O2 -D_REENTRANT -pthread and LIBS=-lm -ldl -lncurses seems to have been enough to get the Chez Scheme build to succeed.
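
For reference, the flag set that worked can be collected in one place. This is a dry-run sketch: the function chez_pb_configure_args is a made-up name, it only prints arguments (one per line, so values containing spaces stay intact) and never invokes ./configure. The flags themselves are the ones quoted in this thread.

```shell
#!/bin/sh
# Dry run: print the Chez Scheme configure arguments for a threaded pb build
# on Linux, one per line. Nothing here actually runs ./configure.
chez_pb_configure_args() {
    machine=$1       # e.g. tpb64l
    toolprefix=$2    # e.g. powerpc64le-linux-gnu- (empty for a native build)
    printf '%s\n' "-m=$machine" "--threads"
    if [ -n "$toolprefix" ]; then
        printf '%s\n' "--toolprefix=$toolprefix"
    fi
    # configure does not guess these for platforms it does not specifically
    # support, so they are supplied explicitly.
    printf '%s\n' "CFLAGS=-g -O2 -D_REENTRANT -pthread" "LIBS=-lm -ldl -lncurses"
}

chez_pb_configure_args tpb64l powerpc64le-linux-gnu-
```

Printing one argument per line makes it straightforward to feed the list into a package definition without shell quoting problems.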

For building Racket, I was a bit unsure about this part of "racket/src/README.txt":

For Racket CS, an additional flag is required:

  • --enable-scheme=SCHEME, where SCHEME is a Chez Scheme executable
    executable that runs on the build platform; the executable must be
    the same version as used in Racket built for the target platform.

    Supplying --enable-scheme=DIR is also supported, where DIR is a
    path that has a "ChezScheme" directory where Chez Scheme is built
    for the host system (but not necessarily installed).

The --enable-racket=RACKET and --enable-scheme=SCHEME flags are
allowed for non-cross builds, too:

  • For Racket CS, supplying either selects a Racket or Chez Scheme
    implementation used to create boot files to the build platform.
    Suppling Chez Scheme is a much more direct path, but when Racket is
    supplied, its version does not have to match the version being
    built.

When it says that --enable-scheme=DIR should be built "for the host system", is this meant to be different than --enable-scheme=SCHEME (which "runs on the build platform"), or does it not mean "host" in the Autoconf build/host/target sense?

That "host" is supposed to be "build", the same as when a file path is supplied. I'll push a fix.

I'm a bit confused about (at least) one aspect of cross-compiling Chez Scheme. I think this is unrelated to whether the host has a native backend or not, so I'm going to try to put the scenario in concrete terms.

Let's say I have a ta6le system and I've already built and installed Racket's variant of Chez Scheme. Now I want to cross-compile Chez for, say, ti3nt, and I have a fresh checkout of the Racket Git repository. How am I supposed to start?

IIUC, the cross build needs to run my ta6le Chez to compile the ti3nt bootfiles, but there doesn't seem to be a way to supply my existing Scheme. Do I need to unpack the bootfiles I created with rktboot to build my installed scheme, but not supply my installed scheme itself? It looks like the files equates.h, gc-ocd.inc, gc-oce.inc, gc-par.inc, heapcheck.inc created with the bootstrap bootfiles don't get installed by make install, so it seems like there isn't some part of the ta6le installation I could just copy into place.

I don't know if it's feasible, but I think the ideal scenario for me would be to create the ti3nt bootfiles similarly to the way I do for ta6le—but using my existing scheme, rather than Racket BC—then proceed with the rest of the cross build.

I do see that there are --host-workarea and --host-scheme Zuo arguments used by the Racket CS build, but I haven't figured out how to use them when just building Chez Scheme.

It would also be possible to keep around more from the ta6le, maybe even the whole workarea, if that would make things easier.

I've also tried building under emulation for powerpc64le-linux-gnu again (as opposed to cross-compiling), and it fails when trying to run rktboot with Racket BC:

phase `patch-generated-file-shebangs' succeeded after 0.0 seconds
starting phase `build'
qemu: uncaught target signal 6 (Aborted) - core dumped
error: in phase 'build': uncaught exception:
%exception #<&invoke-error program: "/gnu/store/d31g0znpk0k6r12049mpzq6z9203rh1d-racket-vm-bc-8.5.0.8/opt/racket-vm/bin/racket" arguments: ("rktboot/main.rkt" "--machine" "tpb64l") exit-status: #f term-signal: 6 stop-signal: #f> 
phase `build' failed after 0.2 seconds
command "/gnu/store/d31g0znpk0k6r12049mpzq6z9203rh1d-racket-vm-bc-8.5.0.8/opt/racket-vm/bin/racket" "rktboot/main.rkt" "--machine" "tpb64l" failed with signal 6
note: keeping build directory `/tmp/guix-build-chez-scheme-for-racket-bootstrap-bootfiles-9.5.9.2.drv-6'
builder for `/gnu/store/3pllg3rngm6djqfhpasi1f04r4i1k806-chez-scheme-for-racket-bootstrap-bootfiles-9.5.9.2.drv' failed with exit code 1
build of /gnu/store/3pllg3rngm6djqfhpasi1f04r4i1k806-chez-scheme-for-racket-bootstrap-bootfiles-9.5.9.2.drv failed
View build log at '/var/log/guix/drvs/3p/llg3rngm6djqfhpasi1f04r4i1k806-chez-scheme-for-racket-bootstrap-bootfiles-9.5.9.2.drv.gz'.
cannot build derivation `/gnu/store/g32qimkk7hswak035m33khjm7xh711pa-chez-scheme-for-racket-9.5.9.2.drv': 1 dependencies couldn't be built
guix build: error: build of `/gnu/store/g32qimkk7hswak035m33khjm7xh711pa-chez-scheme-for-racket-9.5.9.2.drv' failed

When I tried to run the binary manually, I got a slightly more detailed error:

$ /gnu/store/d31g0znpk0k6r12049mpzq6z9203rh1d-racket-vm-bc-8.5.0.8/opt/racket-vm/bin/racket --version
Welcome to Racket v8.5.0.8 [bc].
SIGSEGV MAPERR si_code 1 fault on addr 0x40021cb3b8
qemu: uncaught target signal 6 (Aborted) - core dumped
Aborted (core dumped)

I don't know much about QEMU, but the SIGSEGV made me wonder if this might have to do with the GC write barrier on Racket BC. It's still surprising, though, because the raco setup for BC worked fine. Are there any known caveats to running BC under QEMU? Specifically, I'm using:

As you say, things are not really set up for that already, but it's not difficult to add. I'll add it.

That way is currently supported by make ti3le.bootquick in the ta6le workarea. But I can see how using an installed scheme would be easier for your setup.

I've gotten a little further with this, after figuring out an issue with my Guix recipe, but now the Chez build is trying to run my cross-compiled scheme partway through.

In more detail: I'm on x86_64-linux-gnu and trying to build tpb64l for powerpc64le-linux-gnu. With an existing ta6le build, I've used zuo makefiles/boot.zuo /my/installed/ta6le/scheme tpb64l (method 5) to generate a set of bootfiles. Then I've configured with these flags:

("--disable-x11" "--threads" "-m=tpb64l" "CFLAGS=-g -O2 -D_REENTRANT -pthread" "LIBS=-lm -ldl -lncurses" "--toolprefix=powerpc64le-linux-gnu-" "--installcsug=/gnu/store/zmkijfnk1nqsrwziflg861y57r21ddjc-chez-scheme-for-racket-9.5.9.2-doc/share/doc/chez-scheme-for-racket-9.5.9.2/csug" "--installreleasenotes=/gnu/store/zmkijfnk1nqsrwziflg861y57r21ddjc-chez-scheme-for-racket-9.5.9.2-doc/share/doc/chez-scheme-for-racket-9.5.9.2/release_notes" "--installprefix=/gnu/store/qs4rwfphpzj3aw50ij7v0iicby5p95gl-chez-scheme-for-racket-9.5.9.2" "CPPFLAGS=-DGUIX_RKTIO_PATCH_BIN_SH=/gnu/store/q9pidl3hg9l0qga88gsgjs8brv82qy0v-bash-minimal-5.1.8/bin/sh" "ZLIB=-lz" "LZ4=-llz4" "--libkernel" "--nogzip-man-pages")

The C build steps then succeed, from:

powerpc64le-linux-gnu-gcc -DGUIX_RKTIO_PATCH_BIN_SH=/gnu/store/q9pidl3hg9l0qga88gsgjs8brv82qy0v-bash-minimal-5.1.8/bin/sh -DPORTABLE_BYTECODE -Itpb64l/boot/tpb64l -Itpb64l/c -I../ChezScheme/c/ -g -O2 -D_REENTRANT -pthread -o tpb64l/c/statics.o -c ../ChezScheme/c/statics.c

but then we get to the problem:

powerpc64le-linux-gnu-gcc -g -O2 -D_REENTRANT -pthread -o tpb64l/bin/tpb64l/scheme tpb64l/boot/tpb64l/main.o tpb64l/boot/tpb64l/libkernel.a -lm -ldl -lncurses -lz -llz4
: tpb64l/bin/tpb64l/scheme
running tpb64l/bin/tpb64l/scheme to build tpb64l/s/cmacros.so
exec failed
failed
 in build-one
 in loop
 in module->hash
make: *** [Makefile:10: build] Error 1

Since tpb64l/bin/tpb64l/scheme was compiled for powerpc64le-linux-gnu, it doesn't run on x86_64-linux-gnu.

I'm not sure how to deal with this in general for cross-compiling Chez, but it seems especially likely for things to get confused when trying to cross-compile a pb variant that is also supported by the build machine. It seems like nothing in my configure flags communicates that this is supposed to be a cross build, and I'm not seeing any way to add that information, either.

Thanks to your fix for racket/racket issue #4377 ("Internal error during `zuo . install`"), I've made more progress on minimal Racket CS, but I've hit a new error: the Racket BC passed to configure's --enable-racket= complains "bad switch: --cross-compiler". I'll try using --enable-racket= with Racket CS instead; I'm not sure whether BC is intended to work there.

compiler/cm:   finish-compile: /tmp/guix-build-racket-vm-cs-8.5.900.drv-0/source/racket/collects/setup/unixstyle-install.rkt
Copying collects -> /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/collects
Copying share/pkgs -> /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/share/pkgs
  missing source path "share/pkgs", skipping...
Copying share -> /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/share
  missing source path "share", skipping...
Copying doc -> /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/doc
  missing source path "doc", skipping...
Copying etc -> /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/etc
/gnu/store/nnwmpjkgsrnk9a3hanjs40rkjs6bdrsw-racket-vm-bc-8.5.900/opt/racket-vm/bin/racket -MCR cs/c/compiled: --cross-compiler tpb64l cs/c -X /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/collects -G /gnu/store/ddcdg56xlgfmqx7v79hjl42qzal9hms5-racket-vm-cs-8.5.900/opt/racket-vm/etc -N raco -l- setup --no-user
/gnu/store/nnwmpjkgsrnk9a3hanjs40rkjs6bdrsw-racket-vm-bc-8.5.900/opt/racket-vm/bin/racket: bad switch: --cross-compiler
Use the --help or -h flag for help.
failed
 in build-one
 in loop
 in module->hash
make: *** [Makefile:16: install] Error 1
error: in phase 'install': uncaught exception:
%exception #<&invoke-error program: "make" arguments: ("install" "ZUO=/gnu/store/9li7zhdi70d7pjrk6dniknq2qib6m913-zuo-1.0-racket8.5.900-guix1/bin/zuo") exit-status: 2 term-signal: #f stop-signal: #f> 
phase `install' failed after 4.1 seconds
command "make" "install" "ZUO=/gnu/store/9li7zhdi70d7pjrk6dniknq2qib6m913-zuo-1.0-racket8.5.900-guix1/bin/zuo" failed with status 2

Cross compilation does require the same variant of Racket (BC vs. CS) as the target variant.
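
A package recipe could guard against this mismatch before starting the build. The sketch below relies on the "[bc]" or "[cs]" tag in the banner printed by racket --version (as in the log above); the function names are hypothetical, and the banner is passed in as a string so that the example does not need a Racket installation.

```shell
#!/bin/sh
# Extract the VM variant from a "racket --version" banner such as
# "Welcome to Racket v8.5.0.8 [bc]."
racket_variant_from_banner() {
    case $1 in
        *'[bc]'*) echo bc ;;
        *'[cs]'*) echo cs ;;
        *)        echo unknown ;;
    esac
}

# Succeeds only when the build-platform Racket matches the target variant.
variant_matches() {
    [ "$(racket_variant_from_banner "$1")" = "$2" ]
}

# Real use would pass "$(racket --version)" as the first argument.
if variant_matches "Welcome to Racket v8.5.0.8 [bc]." cs; then
    echo "ok: variants match"
else
    echo "mismatch: use a Racket CS for --enable-racket="
fi
```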

With that change, I have minimal Racket CS cross-compiling successfully for powerpc64le-linux-gnu/tpb64l! The only remaining problem is with building Chez directly.

Reading the BUILDING file again, I found that I could use make kernel when cross-compiling to "compile just the C sources to produce the executable so that running can use existing bootfiles." This seems to have worked! I think there's room for improvement to make things go more smoothly, but I think this should be enough to let Guix start distributing Racket's Chez Scheme for all of Guix's supported architectures with the Racket 8.6 release.
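
Putting the pieces from this thread together, the cross-build sequence that worked can be summarized as a dry run. The function chez_cross_plan is a made-up name and only prints the commands; the configure line is abbreviated to the flags that select the machine type and toolchain, so the CFLAGS, LIBS, and install-path flags shown earlier are left out.

```shell
#!/bin/sh
# Dry run: print the cross-build steps described in this thread.
chez_cross_plan() {
    build_scheme=$1   # Chez Scheme that runs on the build machine
    machine=$2        # target machine type, e.g. tpb64l
    toolprefix=$3     # cross-toolchain prefix, e.g. powerpc64le-linux-gnu-
    # 1. Generate bootfiles for the target using the existing scheme.
    printf '%s\n' "zuo makefiles/boot.zuo $build_scheme $machine"
    # 2. Configure for the target machine type and cross toolchain.
    printf '%s\n' "./configure -m=$machine --threads --toolprefix=$toolprefix"
    # 3. Compile only the C sources; a full build would try to run the
    #    target executable on the build machine.
    printf '%s\n' "make kernel"
}

chez_cross_plan /my/installed/ta6le/scheme tpb64l powerpc64le-linux-gnu-
```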

One particular snag along those lines is that make install-doc depends on make build, and would probably try to run the wrong scheme if it got that far.