A micro-benchmark

Here's what gpt-4 says, based on using Racket docs as part of its copious training data:

In Racket, you can use the #:result keyword with for/fold to give a hint to the compiler to unbox the 
accumulator variable when possible. This can help improve performance, especially in tight loops 
where you are working with flonum values.


Here's a simple example to demonstrate the use of #:result to unbox a flonum accumulator when 
summing an flvector:

#lang racket

(require racket/flonum)

(define (flvector-sum vec)
  (for/fold ([sum 0.0])
            #:result (flonum? sum) ; Hint for the compiler to unbox the accumulator
            ([val (in-flvector vec)])
    (fl+ sum val)))
In this example, the #:result (flonum? sum) tells the compiler that the accumulator, sum, should be a 
flonum, giving a hint that it can be unboxed for better performance.


It's worth noting that this is just a hint, and the compiler might choose not to unbox the accumulator in 
certain situations. However, for performance-critical loops with flonum values, it can provide a 
noticeable improvement.

It didn't quite provide the ideal hint as suggested above in the thread: #:result (fl+ sum). Sparing you the copy/paste, gpt-4 acknowledged that this is even better because it gives two hints: that the type of sum is float and that the result of the for/fold loop is also float. Further, it "prefers" using the #:result argument over putting fl+ at the start of a form that encloses the for/fold form, because it keeps the loop in its more typical syntax and enables use of other for/fold named parameters. I agree.

Having been pointed to section 19 of the Guide, I have to say I would never have figured out the syntax for actually incorporating the hint in the code, as no example is provided.

Perhaps a good moment for someone to contribute such an example then. I'm sure it would help the next docs reader understand what's going on more quickly.


That code doesn't work. It produces a boolean rather than the sum. Beware of plausible-sounding BS…
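For reference, a version that does return the sum might look like the sketch below. It assumes a Racket version where for/fold accepts #:result inside the accumulator clause and where fl+ accepts a single argument; the #:result (fl+ sum) form is the hint suggested earlier in the thread.

```racket
#lang racket/base

(require racket/flonum)

;; Sum an flvector. The #:result clause sits inside the accumulator
;; clause and wraps the final value in fl+, which is the hint discussed
;; in this thread (assumes fl+ accepts one argument).
(define (flvector-sum vec)
  (for/fold ([sum 0.0]
             #:result (fl+ sum))
            ([val (in-flvector vec)])
    (fl+ sum val)))

(flvector-sum (flvector 1.0 2.0 3.0))
```

Unlike the gpt-4 version, the #:result expression here evaluates to the accumulated flonum, not to the boolean returned by flonum?.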


You are so right! ChatGPT probably won't take that many jobs away if doing the job right matters!

What's the best way? It seems like a (minor) contribution, not a bug to report, so something to submit on GitHub with a PR.

A PR is indeed likely the right way forward! There's even an ongoing Racket summer event focused on examples as well. :slightly_smiling_face:

In case you may not have encountered it yet, Racket docs will reveal their source path if you click on the section header. So for the page mentioned above, you can see that the main document is https://github.com/racket/racket/tree/master/pkgs/racket-doc/scribblings/guide/guide.scrbl, and then from there, a little searching / guessing reveals the part in question: https://github.com/racket/racket/blob/e9376c099a53825ef31f233ca80f42641ebaa26a/pkgs/racket-doc/scribblings/guide/performance.scrbl#L365


Btw - the corresponding thread on the Julia forum showcases a really nifty benchmarking tool. Makes for a neat project idea.


Woah that's really cool!

My personal favorite is hyperfine (sharkdp/hyperfine on GitHub), a command-line benchmarking tool.


My version is a small rewrite of the one posted before by @bogdan that has a similar speedup. My version is IMHO better as a hint to improve the compiler someday.

The idea is that flonums are usually stored in secret boxes. For example, in (vector (cons 1 2) 3.0) the vector has a pointer to the object that represents the pair and a pointer to the secret box with the flonum 3.0.

In this case, if you wrap the expression (for {something ...}) as (fl+ (for {something ...})) the compiler will notice that the internal variables of the loop are used only by fl-something functions and avoid creating the secret boxes. (The link by @sorawee has more detailed info.)
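A minimal sketch of that wrapping, assuming fl+ accepts a single argument (which I believe recent Racket versions allow). The name flvector-sum/wrapped is only illustrative and does not come from the thread:

```racket
#lang racket/base

(require racket/flonum)

;; The whole loop is wrapped in fl+, so the loop result is consumed
;; only by a flonum-specific operation.
(define (flvector-sum/wrapped vec)
  (fl+ (for/fold ([sum 0.0])
                 ([val (in-flvector vec)])
         (fl+ sum val))))

(flvector-sum/wrapped (flvector 0.5 0.25 0.25))
```

Whether the accumulator is actually unboxed depends on the compiler; the wrapper only makes it visible that the result is used as a flonum.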

I agree. The fastest code should be the most idiomatic and nicest code. Sometimes that's not possible, but it's a good objective. In this case, I think the compiler can be smart enough to make the fl+ wrapper unnecessary. I didn't write the flonum unboxing code, so I'm not sure of the details, but my guess is that it's possible, and I made a feature request. (I can try to take a look, but not now. Perhaps in a few months.)

Great, thanks. Does Racket have homogeneous vector types like sbcl?

  • Lewis

Here's a first step to that: bench.rkt · GitHub
It prints the time series of timings, not the histogram of log(frequency) but that should be an easy next step.

Example output:

> (bench (repeat vector-sum-for/fold 10 100000 0.5) 50)
โ–ˆโ–ˆโ–ˆโ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚โ–‚
โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ
>
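For readers who want to experiment without the gist, here is a rough single-row sketch of the same idea: time a thunk several times and draw the timings with block characters. It is not the code from bench.rkt; time-runs and sparkline are hypothetical helper names, and the scaling is deliberately simple.

```racket
#lang racket/base

;; Eight block characters, lowest to highest.
(define bars "▁▂▃▄▅▆▇█")

;; Run thunk n times and return the elapsed wall-clock time of
;; each run in milliseconds.
(define (time-runs thunk n)
  (for/list ([i (in-range n)])
    (define start (current-inexact-milliseconds))
    (thunk)
    (- (current-inexact-milliseconds) start)))

;; Map each value to one of the eight bars, scaled between the
;; minimum and maximum of the series.
(define (sparkline xs)
  (define lo (apply min xs))
  (define hi (apply max xs))
  (define span (- hi lo))
  (list->string
   (for/list ([x (in-list xs)])
     (define k
       (if (zero? span)
           0
           (inexact->exact (floor (* 7 (/ (- x lo) span))))))
     (string-ref bars k))))

(displayln (sparkline (time-runs (lambda () (for ([i (in-range 100000)]) (* i i))) 50)))
```

The first few runs often show up as taller bars, as in the output above.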

Yes. See:
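As a quick illustration (a sketch, not an exhaustive list), these are the homogeneous vector types I'm aware of in the main distribution: flvector from racket/flonum, fxvector from racket/fixnum, and the C-compatible vectors such as f64vector from ffi/vector.

```racket
#lang racket/base

(require racket/flonum   ; flvector: vectors of flonums
         racket/fixnum   ; fxvector: vectors of fixnums
         ffi/vector)     ; f64vector and friends: C-layout vectors

(define fv (flvector 1.0 2.0 3.0))
(define xv (fxvector 1 2 3))
(define cv (f64vector 1.0 2.0 3.0))

(flvector-ref fv 0)
(fxvector-ref xv 1)
(f64vector-ref cv 2)
```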

Once again, these functions are crippled by the lack of function overloading. This should be TOPS for making Racket a truly modern, functional language.

We don't have in-fl64vector. And we would need about 12 or so new functions to cover all the types of ffi based vectors.
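Without a dedicated sequence form, one workaround is to loop over indices and call f64vector-ref directly. This is only a sketch: f64vector-sum is an illustrative name, and I have not measured how it compares with the flvector version.

```racket
#lang racket/base

(require racket/flonum
         ffi/vector)

;; Sum an f64vector by index, since ffi/vector has no in-f64vector.
(define (f64vector-sum v)
  (for/fold ([sum 0.0]
             #:result (fl+ sum))
            ([i (in-range (f64vector-length v))])
    (fl+ sum (f64vector-ref v i))))

(f64vector-sum (f64vector 1.0 2.0 3.0))
```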

This problem crops up a lot in the entire Scheme family. It's partly the lack of function overloading and it's partly the lack of container polymorphism: functions on containers that don't depend on element types. This exists in spades for lists, but not for other container types (OK, maybe hash maps, which I haven't played with very much...).

And so we can't write performant code using C-style vectors. I'm trying to find the source of in-flvector to write a version for f64vector. Where would I find it?

Look in flonum.rkt and private/vector-wraps.rkt.

https://github.com/search?q=repo%3Aracket%2Fracket%20define-vector-wraps&type=code

I just added in-@vector functions to my SRFI-160 port.

(What I miss is support for these number types and their vectors baked into Typed Racket, and 32-bit floats, which Racket lost with the switch to Chez.)

@soegaard I can't make head nor tail of any of that code. Some of the files just contain (provide ...) forms to export functions. Then there must be dozens of calls to define-syntax-rule, which probably creates some global functions that are used elsewhere in the complex module. I can't find an actual definition of a function that begins with in-vector or in-flvector.

There are also calls to define-in-vector-like and make-in-vector-like. But, I have never encountered syntax like this:

(define-in-vector-like (in-fXvector* check-fXvector)
       fXvector-str fXvector? fXvector-length :fXvector-gen)

What does the preceding do? Is it a call to a function that defines something? What does it define? What is the name of what it defines? Assuming this is actually a call, what the heck does it return? How is the result of the call consumed, since it is at top level?

It is part of the larger definition for define-syntax-rule, which is very cryptic code. Its primary input is the result of a call to define-syntax-wraps, with many input arguments, which probably define some list of function names or functions that will be created on the fly. Since these are ephemeral, it's hard to see what their definitions are. It's all very functional and probably beautiful in Scheme, but seemingly impenetrable.

This whole mess provides only define-vector-wraps. So, this must be called somewhere else where the sought after functions are actually created.

One of the places where this function is used is flonum.rkt. This provides in-flvector, so this is where in-flvector is presumably defined.

Later in the file is more of the earlier peculiar syntax:

(define-vector-wraps "flvector"
  "flonum?" flonum?
  flvector? flvector-length flvector-ref flvector-set! make-flvector
  unsafe-flvector-ref unsafe-flvector-set! unsafe-flvector-length
  in-flvector*
  in-flvector
  for/flvector
  for*/flvector
  flvector-copy
  0.0
  check-flvector)

Again, is this a call to define-vector-wraps? Is the rest of the form arguments? Where do they get a value? Is the result a list of the definitions for all of these functions? Seems like that must be true. Where on earth do the values come from? Somewhere else these values/definitions must be supplied.

How does anyone follow so many layers of indirection? There are 28 files in all of the repos of Org Racket that refer to in-flvector. Most are uses of (references to) in-flvector.

One of the files contains a comment that in-flvector is defined in racket/flonum, which is a library, so that only narrows things down a little bit. So, I have to find that library. The repo racket/racket has a variety of directories that seem to contain libraries. Or is there a different repo for Org Racket that has these libraries? GitHub doesn't make it very easy to navigate very extensive repos.

If I search everywhere in repos of org Racket I get these, below:

Not racket/libs: these are compiled libs for various platforms.
math-lib/math/flonum.rkt: here we just find a bunch of requires and provides...
racket/racket/packages: nothing called flonum here...
racket/racket/src: doesn't look like it's here...
racket/racket/collects: maybe deeply nested here assuming collects refers to collections

  • vector-wraps.rkt is here, but that is just the mysterious code above
    racket/racket/collects/racket/flonum.rkt:

    • this file provides in-flvector, so there should be something here, but not really. Just a list of exported function names in a provide statement and the mysterious use of define-vector-wraps shown above, with no actual code that defines the function in-flvector.

So, pretty much a complete dead end. I am afraid I can't find any way to contribute to in-f64vector. I suppose I don't need to, as an flvector is double-precision floating point already, just not in C format, which I don't need since I am not using ffi or any package reliant on ffi.

@shawnw Where would I find this? I searched all of GitHub and only found srfi-160 for Gauche, chibi-scheme, gerbil, sagittarius-scheme and chicken-scheme.

Hi @lewisl

The idea is to share the code that defines in-vector, in-flvector, ... as well as for/vector, for/flvector, .... The macro define-vector-wraps simply uses the specialized functions and defines all the interesting macros at once.
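To make the pattern less mysterious, here is a toy macro in the same spirit. It is not the real define-vector-wraps; define-vector-helpers, my-vector-sum and my-vector-size are made-up names for illustration. The macro use at module level is not a function call that returns a value: it expands at compile time into a begin containing ordinary definitions, and the names passed in become the names being defined.

```racket
#lang racket/base

;; A toy "define several things at once" macro. The first two arguments
;; are the names to define; the last two are existing accessor functions.
(define-syntax-rule (define-vector-helpers sum-name size-name len ref)
  (begin
    (define (sum-name v)
      (for/fold ([acc 0])
                ([i (in-range (len v))])
        (+ acc (ref v i))))
    (define (size-name v)
      (len v))))

;; This use expands into definitions of my-vector-sum and my-vector-size.
(define-vector-helpers my-vector-sum my-vector-size vector-length vector-ref)

(my-vector-sum (vector 1 2 3))
(my-vector-size (vector 1 2 3))
```

The real macro takes many more arguments, but the reading is the same: some arguments are names to be defined and the rest are existing functions or values used inside the generated definitions.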

In the case of f64vector we can do as follows:

#lang racket

(require '#%flfxnum 
         racket/private/vector-wraps
         racket/unsafe/ops
         ffi/vector)

;; unsafe-f64vector-length does not seem to be provided anywhere,
;; so alias the safe version.
(define unsafe-f64vector-length f64vector-length)

(define-vector-wraps "f64vector"
  "flonum?" flonum?
  f64vector? f64vector-length f64vector-ref f64vector-set! make-f64vector
  unsafe-f64vector-ref unsafe-f64vector-set! unsafe-f64vector-length
  in-f64vector*
  in-f64vector
  for/f64vector
  for*/f64vector
  f64vector-copy
  0.0
  check-f64vector)

(define v (f64vector 1. 2. 3. 4.))
(for/list ([x (in-f64vector v)]) x)
(for/sum ([x (in-f64vector v)]) x)

Note that unsafe-f64vector-length seems to be undefined.
So there is still a side quest, to squeeze out a little more performance.

I haven't studied macros yet.

What makes these things macros?

The name place being occupied by a string?