A micro-benchmark

I haven't studied macros yet.
What makes these things macros?

In "vector-wraps.rkt" you will see:

(define-syntax-rule (define-vector-wraps <pattern> ...) <template>)

This defines define-vector-wraps as a macro.

In this case the macro is defined using a "rule".
When define-vector-wraps is used, the expander matches the macro call to the patterns.
It then constructs the code in the template by substituting syntax from the patterns.

In this case the template is rather long, since it consists of definitions of quite a few functions and macros. In the case of f64vector we get definitions of in-f64vector, unsafe-f64vector-copy!, for/f64vector, for*/f64vector, and f64vector-copy.
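To make the pattern/template idea concrete, here is a minimal sketch of a rule-based macro that, like define-vector-wraps, expands into several definitions at once. The macro define-sum-and-count and the names vec-sum and vec-count are made up for illustration; this is not the code in vector-wraps.rkt.

```racket
#lang racket/base

;; Hypothetical macro, for illustration only (not from vector-wraps.rkt).
;; Pattern: the macro name followed by four identifiers.
;; Template: a `begin` holding two definitions built from those identifiers.
(define-syntax-rule (define-sum-and-count sum-id count-id len-fn ref-fn)
  (begin
    ;; sum-id adds up all elements using the supplied length/ref procedures
    (define (sum-id v)
      (for/fold ([acc 0.0]) ([i (in-range (len-fn v))])
        (+ acc (ref-fn v i))))
    ;; count-id just reports the number of elements
    (define (count-id v)
      (len-fn v))))

;; One macro use defines both vec-sum and vec-count for plain vectors.
(define-sum-and-count vec-sum vec-count vector-length vector-ref)

(vec-sum (vector 1.0 2.0 3.0))   ; 6.0
(vec-count (vector 1.0 2.0 3.0)) ; 3
```

The expander matches the use against the pattern and substitutes vec-sum, vec-count, vector-length and vector-ref into the template, so a single line produces a family of related definitions.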

As already mentioned, it is in my extra-srfi-libs package. To install it, run

raco pkg install extra-srfi-libs

or do the equivalent in the DrRacket package manager window.


Should that have been (define-vector-wraps "f64vector")?

Yes! The "f64lvector" ought to be "f64vector". I think the string is used in error messages.

(I haven't followed this thread in detail so apologies if this has been covered)

If you can implement gen:sequence for the data structure, you can use the Generic Collections library's in and other generic sequence utilities provided there. For built-in Racket data structures that aren't already supported, it may be necessary to submit a pull request to the library to add a default implementation.

This looks great. I submitted a request to support in-VM benchmarking to hyperfine some time back, but they decided it was out of scope for the project. It'll be nice to have a visualization tool like this in Racket as @soegaard said.

Prompted by @sschwarzer's questions:

...I produced some variants in Chez Scheme (Racket's backend). My idiomatic baseline is flvector-sum:

The fl.vector library implements simple lazy allocation and manual loop unrolling, with fl.vector-sum and fl*vector-sum variants. Sample timings:

*Version*         *Timings in seconds for 1000 executions*
baseline           .19
unroll make        .14
unroll sum         .12
unroll both        .07
fl. +unroll        .07
fl. "lazy"         .000007

(all produce the correct result :wink: )
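For readers who have not seen manual unrolling, here is a rough sketch in Racket of what an "unroll sum" variant could look like. The procedure names and the 4-way unroll factor are my assumptions; the Chez Scheme code that was actually timed may well differ.

```racket
#lang racket/base
(require racket/flonum)

;; Straightforward sum: one addition per loop iteration.
(define (flvector-sum/simple v)
  (for/fold ([acc 0.0]) ([x (in-flvector v)])
    (fl+ acc x)))

;; Hypothetical 4-way unrolled sum: four elements per iteration,
;; each added into its own accumulator.
(define (flvector-sum/unrolled v)
  (define n (flvector-length v))
  (define limit (- n (remainder n 4))) ; largest multiple of 4 that is <= n
  (let loop ([i 0] [a 0.0] [b 0.0] [c 0.0] [d 0.0])
    (if (< i limit)
        (loop (+ i 4)
              (fl+ a (flvector-ref v i))
              (fl+ b (flvector-ref v (+ i 1)))
              (fl+ c (flvector-ref v (+ i 2)))
              (fl+ d (flvector-ref v (+ i 3))))
        ;; Combine the accumulators, then add the 0 to 3 leftover elements.
        (let tail ([j i] [acc (fl+ (fl+ a b) (fl+ c d))])
          (if (< j n)
              (tail (+ j 1) (fl+ acc (flvector-ref v j)))
              acc)))))

(flvector-sum/simple (make-flvector 10 1.5))   ; 15.0
(flvector-sum/unrolled (make-flvector 10 1.5)) ; 15.0
```

Note that separate accumulators change the order of the floating-point additions, so for arbitrary data the unrolled result can differ from the simple one in the last bits.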


Why is the lazy result so much faster than anything else we've seen so far?

The conventional Scheme way to allocate a vector is (make-vector length fill);
the fl.vector library just saves length and fill, so as long as nothing has been stored in the vector, fl.vector-sum can produce the sum by multiplication:

Allocation of the 800 KB vector is deferred to fl.vector-set!; some sample stats in flvector-bench:

...show 800 MB total memory allocation (as expected), but 32 KB for the "lazy" run.

fl.vector contains versions of the basic vector procedures (-ref, -length, etc), which
should produce correct results for either flvector or fl.vector arguments.
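The lazy-allocation idea described above can be sketched in Racket roughly as follows. All names here (lazy-flvec and friends) are invented for illustration and this is not the fl.vector library's code; the length of 100000 is my assumption, chosen because it gives 800 KB at 8 bytes per flonum.

```racket
#lang racket/base
(require racket/flonum)

;; Hypothetical lazy flvector: remembers length and fill, and only
;; allocates the real flvector on the first write.
(struct lazy-flvec (len fill [data #:mutable]))

(define (make-lazy-flvec len fill)
  (lazy-flvec len fill #f)) ; #f means "nothing allocated yet"

(define (lazy-flvec-length v)
  (lazy-flvec-len v))

;; Bounds checking is omitted in the unallocated case to keep this short.
(define (lazy-flvec-ref v i)
  (define data (lazy-flvec-data v))
  (if data
      (flvector-ref data i)
      (lazy-flvec-fill v)))

;; The first write forces the allocation.
(define (lazy-flvec-set! v i x)
  (unless (lazy-flvec-data v)
    (set-lazy-flvec-data! v (make-flvector (lazy-flvec-len v)
                                           (lazy-flvec-fill v))))
  (flvector-set! (lazy-flvec-data v) i x))

;; Unallocated: sum = length * fill. Allocated: ordinary loop.
(define (lazy-flvec-sum v)
  (define data (lazy-flvec-data v))
  (if data
      (for/fold ([acc 0.0]) ([x (in-flvector data)])
        (fl+ acc x))
      (fl* (->fl (lazy-flvec-len v)) (lazy-flvec-fill v))))

(define v (make-lazy-flvec 100000 1.5))
(lazy-flvec-sum v)        ; 150000.0, computed without allocating
(lazy-flvec-set! v 0 2.5) ; forces the real allocation
(lazy-flvec-sum v)        ; 150001.0, computed by looping
```

This also suggests why the "lazy" timing is so small: a benchmark that only creates the vector and sums it never reaches the allocating path, so it measures one multiplication rather than a loop over the data.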

It may be worth noting that the Chez Scheme compiler (and hence Racket) can inline
procedures from a user library, so calls to them perform comparably to calls to standard library procedures.