How to speed up array indexing from untyped Racket

The arrays from the math library performs best when using operations that work on an entire axis (think row or column) at a time. Using array-ref and array-set! in a loop will be slower than using array-ref and array-set! from Typed Racket due to contract checking of the array.

But we can reduce the number of times the contract is checked.

(: array-getter (All (A) ((Array A) -> (Indexes -> A))))
(define (array-getter arr)
  (λ (js)
    (array-ref arr js)))

The idea is to use let array-getter getter check that the input is an array,
and return a function that only needs to check the type of the indexes.

It is used like this:

(define M   (mutable-array #[#[0 1] #[2 3]]))
(define M_  (array-getter M))
(M_ #(0 0))

As an extra speed boost unsafe-array-ref can be used instead.

The same approach works for array-set!.

#lang typed/racket/base
(require math/array)
(provide array-getter array-setter unsafe-array-getter unsafe-array-setter)

(: array-getter (All (A) ((Array A) -> (Indexes -> A))))
(define (array-getter arr)
  (λ (js) (array-ref arr js)))

(: array-setter (All (A) ((Settable-Array A) -> (Indexes A -> Void))))
(define (array-setter arr)
  (λ (js value) (array-set! arr js value)))

(: unsafe-array-getter (All (A) ((Array A) -> (Indexes -> A))))
(define (unsafe-array-getter arr)
  (λ (js) (unsafe-array-ref arr js)))

(: unsafe-array-setter (All (A) ((Settable-Array A) -> (Indexes A -> Void))))
(define (unsafe-array-setter arr)
  (λ (js value) (unsafe-array-set! arr js value)))

A small benchmark reveals that the speed up from untyped Racket is well worth using array-getter or unsafe-array-getter.

array-ref          cpu time: 4508 real time: 4531 gc time: 234
unsafe-array-ref   cpu time: 3856 real time: 3874 gc time: 186
array-getter       cpu time: 2091 real time: 2101 gc time: 85
unsafe-array-ref   cpu time: 1355 real time: 1362 gc time: 55

The benchmark for the setter version:

array-set!           cpu time: 5978 real time: 6007 gc time: 244
unsafe-array-set!    cpu time: 5316 real time: 5341 gc time: 223
array-setter!        cpu time: 2296 real time: 2305 gc time: 90
unsafe-array-setter  cpu time: 1563 real time: 1570 gc time: 61

The whole benchmark:

#lang racket/base

(require math/array
         "array-experiment.rkt")

(define M   (mutable-array #[#[0 1] #[2 3]]))

(define M_  (array-getter M))
(define M!  (array-setter M))
(define %M_ (unsafe-array-getter M))
(define %M! (unsafe-array-setter M))

(M_ #(0 0))
(M! #(0 0) 42)
(M_ #(0 0))

(define js (vector 0 0))

(time (for ([n (in-range 100000)])
        (array-ref M js)))

(time (for ([n (in-range 100000)])
        (unsafe-array-ref M js)))

(time (for ([n (in-range 100000)])
        (M_ js)))

(time (for ([n (in-range 100000)])
        (%M_ js)))

;; cpu time: 4508 real time: 4531 gc time: 234
;; cpu time: 3856 real time: 3874 gc time: 186
;; cpu time: 2091 real time: 2101 gc time: 85
;; cpu time: 1355 real time: 1362 gc time: 55


(time (for ([n (in-range 100000)])
        (array-set! M js 42)))

(time (for ([n (in-range 100000)])
        (unsafe-array-set! M js 42)))

(time (for ([n (in-range 100000)])
        (M! js 42)))

(time (for ([n (in-range 100000)])
        (%M! js 42)))


;; cpu time: 5978 real time: 6007 gc time: 244
;; cpu time: 5316 real time: 5341 gc time: 223
;; cpu time: 2296 real time: 2305 gc time: 90
;; cpu time: 1563 real time: 1570 gc time: 61

6 Likes

I forgot to include a base line.

(define v (vector (vector 0 1)
                  (vector 2 3)))

(time (for ([n (in-range 100000)])
        (vector-ref (vector-ref v 0) 0)))

This is fast:

cpu time: 1 real time: 2 gc time: 0

In so fast that the conclusion sadly is that the arrays in math/array are unsuable from standard Racket.

I must admit I don't understand, why the array-getter trick doesn't avoid overhead. My expectation was the the array was type checked once only.

Any ideas to fix this are appreciated.

You're imagining a different kind of implementation for checking than what actually happens. Checking the array ends up with wrappers, and there's a wrapper on the result of your array-getter function as well. It's the wrappers that really cost here, and you can't get to a reasonable speed without eliminating them somehow.

3 Likes

How long does the small benchmark program take in Typed Racket?

1 Like

@Gambiteer
Sorry, I missed the question.

If I change the line #lang racket to #lang typed/racket I get:

That doesn't say much, so I ran this program instead (x100).

(define v (vector (vector 0 1)
                  (vector 2 3)))

(time (for ([n (in-range 10000000)])
        (vector-ref (vector-ref v 0) 0)))

The time for Typed Racket:
cpu time: 30 real time: 30 gc time: 0
The time for Racket:
cpu time: 47 real time: 47 gc time: 0

I guess I'm curious how long this takes in Typed Racket.

Is this what you meant?

1 Like

I successfully confused myself and my post above was only the base line in Typed Racket.

Below is the whole benchmark in Typed Racket.

#lang typed/racket/base

(require math/array
         "array-experiment.rkt")

(define M   (mutable-array #[#[0 1] #[2 3]]))

(define M_  (array-getter M))
(define M!  (array-setter M))
(define %M_ (unsafe-array-getter M))
(define %M! (unsafe-array-setter M))

(M_ #(0 0))
(M! #(0 0) 42)
(M_ #(0 0))

(define: js : Indexes (vector 0 0))

(time (for ([n (in-range 100000)])
        (array-ref M js)))

(time (for ([n (in-range 100000)])
        (unsafe-array-ref M js)))

(time (for ([n (in-range 100000)])
        (M_ js)))

(time (for ([n (in-range 100000)])
        (%M_ js)))

;; cpu time: 8 real time: 8 gc time: 0
;; cpu time: 3 real time: 3 gc time: 0
;; cpu time: 9 real time: 10 gc time: 0
;; cpu time: 5 real time: 5 gc time: 0

(time (for ([n (in-range 100000)])
        (array-set! M js 42)))

(time (for ([n (in-range 100000)])
        (unsafe-array-set! M js 42)))

(time (for ([n (in-range 100000)])
        (M! js 42)))

(time (for ([n (in-range 100000)])
        (%M! js 42)))

;; cpu time: 9 real time: 9 gc time: 0
;; cpu time: 4 real time: 4 gc time: 0
;; cpu time: 44 real time: 46 gc time: 32
;; cpu time: 6 real time: 6 gc time: 0

@samth

You are right, I don't have a mental model of how Typed Racket wraps the values.
And thus the following question might be naïve:

Is it possible to make a new "type" UnsafeIn that could be used like this:

#lang typed/racket
(: sum (-> (UnsafeIn (Listof Integer))  Integer))
(define (sum xs) (apply + xs))
(sum (list 1 2 3))

Within the Typed Racket module it works exactly the same as:

#lang typed/racket
(: sum (-> (Listof Integer)  Integer))
(define (sum xs) (apply + xs))
(sum (list 1 2 3))

But if sum is called from untyped Racket, the type check of xs is simply omitted.

Alternatively, one could imagine an optional second argument:

(UnsafeIn <type> <cheaper-type>)

where is a type that leads to a cheaper contract.

In the example it could be used as (UnsafeIn (Listof Integer) Pair).
This would of course still lead to unsafe code.

Just a quick thought - is it gonna be just fixnums? Because if that is true, than I suggest developing the algorithm using Typed Racket and then just manually translate it to fxvectors using racket/unsafe/ops. Same applies for flonums. But definitely you have to handle multi-dimensionality yourself as those are one-dimensional by definition.
Or maybe unsafe-provide[1] is what you are looking for? It provides given provide spec without installing any contracts - so plain Racket code can use it without checking.

[1] 11 Unsafe Typed Racket operations

1 Like

The arrays from math/array can have type (Array A) where A is an arbitrary type.

@samth

Another idea:

Have a form bless used as (bless value A). It creates an instance of a bless structure that contains the blessed value and some kind of "certificate" that identifies the type A.
It will have the type (Bless A).

When a contract is made for a function with the type (-> (Bless A) Any) it will use the "certificate" and if it matches A the contract checker just accepts the value without further checks.

This approach stops accidently passing a value of a wrong type to typed code. But it's still unsafe, since on the untyped side the value could be extracted and mutated.

Is this feasible?

Sorry for resurrecting this ancient thread. With the release of shallow typed racket, I'd be interested to know if it could help with performance when using the math lib from untyped racket.

Perhaps I'm misunderstanding what shallow typed racket really is, but couldn't it avoid the checking of deeply nested contracts? In that case, would providing a shallow typed racket wrapper for the math lib be enough? I think it's such a shame that such a beautiful math library is limited by those existing performance issues! :frowning:

1 Like