In the interest of more broadly testing the FFI change, I've pushed the revised FFI implementation for Racket, letting Racket's copy of Chez Scheme temporarily diverge from the main Chez Scheme branch while a PR is considered there.
Here are some rough performance results. The "fix" column shows performance after repairing a broken check in v8.15 that disabled an intended fast path. The "new" column shows the latest Racket with the revised FFI implementation. The "ref-stress-test" group of results corresponds to the original post, but using `in-range`. The last four examples are about foreign calls as much as (or instead of) `ptr-ref` and `ptr-set!`. For example, "math.rkt" uses `bf+`, which is relevant because it uses MPFR and GMP bindings. Programs here.
CPU time in milliseconds

| | v8.15 | fix | new |
|---|---|---|---|
| ref-stress-test | | | |
| vector ref | 11 | 11 | 11 |
| _ulong ref | 37 | 38 | >> 17 |
| _ulong set! | 631 | >> 72 | >> 18 |
| _int ref | 37 | 37 | >> 16 |
| _uint ref | 37 | 37 | >> 16 |
| _double ref | 42 | 41 | >> 18 |
| _fixnum ref | 509 | 510 | >> 88 |
| _racket ref | 373 | 373 | >> 96 |
| from/to bytes | | | |
| _ulong ref | 29 | 29 | >> 11 |
| _ulong set! | 629 | >> 62 | >> 11 |
| struct ref | 171 | 170 | > 164 |
| struct set! | 660 | >> 180 | > 172 |
| math | 105 | > 84 | > 64 |
| plus | 29 | 28 | >> 13 |
| strlen | 19 | 19 | > 14 |
| draw | 228 | 227 | 213 |
There's still room for improvement in foreign calls. A Chez Scheme variant of `plus` runs in 3ms instead of 13ms, which reflects overhead still added by Racket's more dynamic FFI. The round of improvements here seems to be on the path to further gains, though.
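
For readers who haven't used these operations, here is a minimal sketch of the kind of loops being timed: a `ptr-set!` pass and a `ptr-ref` pass over raw memory driven by `in-range`, plus one foreign call. It is not one of the benchmark programs linked above. The element count `N` is an arbitrary assumption, and looking up `strlen` through `#f` assumes a platform where the C library is visible in the process's global namespace.

```racket
#lang racket/base
;; Illustrative sketch only; not the program behind the reported numbers.
(require ffi/unsafe)

(define N 1000000) ; assumed element count, chosen arbitrarily

;; Allocate N _ulong slots outside the GC heap; 'raw memory must be freed.
(define p (malloc N _ulong 'raw))

;; ptr-set! loop: store i into slot i.
(time
 (for ([i (in-range N)])
   (ptr-set! p _ulong i i)))

;; ptr-ref loop: read every slot back and sum the values.
(time
 (for/fold ([sum 0]) ([i (in-range N)])
   (+ sum (ptr-ref p _ulong i))))

(free p)

;; Foreign call: bind C's strlen and call it once.
;; Passing #f as the library searches the global namespace (an assumption
;; about the platform; the lookup can fail where libc is not exposed this way).
(define strlen
  (get-ffi-obj "strlen" #f (_fun _string -> _size)))

(strlen "hello")
```

Timings from a sketch like this depend on the machine and the Racket build, so they are not directly comparable to the figures reported above.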