Arrays proposal: array broadcasting, and adding new axes

I didn't know whether to start a new thread for each topic, or to stick with one thread. I started a new thread, but can do the other if more appropriate.

When composing this message, I didn't know how much background to give, so I probably erred on the long side.

First the background, then the proposal.

Brad

Background:

SRFI 231 does not have array broadcasting, either explicit or explicit, or a way to add new axes to an existing array.

Array broadcasting repeats an existing array or scalar as array "slices" along one or more new axes (or existing, length one, axes). Broadcasting is very common in array libraries---it's in Racket's math/array, Python's NumPy, Stan, Julia, Javascript stdlib, Matlab, etc.; links can be found here:

But I didn't see a need for broadcasting in SRFI 231 given that SRFI's other features.

For example, in Python broadcasting is used to implement both APL-style inner and outer products. (References are in the message linked above, and there are a few other comments in that thread.) SRFI 231 has APL-style array-inner-product and array-outer-product directly.

Or, broadcasting is used to simplify notation, to write A x s for an array A and a scalar s and have the value of s "broadcast" to an implicit array the same size and shape as A for component-wise multiplication.

Or, to use another common example, one could have a two-dimensional array A and a one-dimensional vector v, interpreted as a row vector, and write A-v to subtract v from each row of A.

But in SRFI 231 these examples are not so hard to do, because the SRFI has array-map (which deals directly with the first example) and array-curry, which decomposes an array by dimension; in the second example one would interpret the two-dimensional array A as a one-dimensional array of one-dimensional arrays, and then use array-map to subtract v from each of those one-dimensional rows (as arrays themselves). The array broadcasting approach would extend the vector v to a two-dimensional array of the right size and shape and the subtract that new array from A component-wise.

A further complication is that SRFI 231 arrays can have nonzero lower bounds in indices, while the other systems have (implicit) zero lower bounds in each dimension.

SRFI 231 does have a very general

(array-share old-array new-domain mapping-from-new-domain-to-old-domain)

but the SRFI requires that the affine mapping-from-new-domain-to-old-domain from new multi-indices to existing multi-indices be one-to-one, so that mutating one array entry does not change a different array entry. To write broadcasting in this way necessarily requires that the mapping be many-to-one. (By the way, it appears that checking whether the mapping is 1-1 is equivalent to the small lattice vector problem, which I hear is NP-hard.)

Proposal:

In the mail thread starting here:

I laid out some of the issues I found problematic when considering integrating array broadcasting into SRFI 231 and then made the following initial proposal for new functionality.

I invite comments about this, or any other, aspect of the array SRFI.

==============================

Procedure: (array<-scalar obj)

Returns (make-array (make-interval '#()) (lambda () obj)). (People seem
to like treating scalars like arrays, so let's make it easier to do so.)

Procedure: (array-add-axes array vector)

Sample calls:

(array-add-axes A '#(* 0 1 * 2 * 3))
(array-add-axes A (vector '* 0 1 '* 2 '* 3))
(array-add-axes A '#(* * *))      ;; if A is zero-dimensional

Assumes that the first argument A is an array with dimension d.

Assumes the second argument is a vector v whose elements are the exact
integers 0, 1, ..., d-1, in order, interspersed with zero or more
instances of the symbol *. [In a later message I note that there is no real reason to require that the integers be in order, in which case this routine would generalize array-permute.]

Returns a new array A* that shares its elements with A in the same
lexicographical order.

Any entry in v with a nonnegative integer j entry denotes an axis in the
domain of A* with the same lower and upper bounds as the jth axis of
domain of A.

Any entry in v with the symbol * denotes a new axis in the domain of A*
with lower bound 0 and upper bound 1.

[In a later message, I ask whether array-squeeze, which is a partial inverse to array-add-axes and is given in the SRFI document as an example, should be added to the SRFI if array-add-axes is added to the SRFI.]

Procedure: (array-broadcast list-of-As)

Assumes that list-of-As is a nonempty list of immutable arrays.

Let D be the maximum of the dimensions of the arrays in list-of-As.

A new list of arrays, list-of-A*s, is constructed by adding axes to the
left of existing axes of the arrays in list-of-As with array-add-axes so
that all arrays in list-of-A*s have dimension D.

The arrays in list-of-A*s are now required to have the following
properties, checked in order:

  1. If the domain of any A* in list-of-A*s has an axis j with nonzero
    lower bound, then the domains of all arrays A* in list-of-A*s must have
    the same lower and upper bounds for axis j.

  2. If the domain of any A* in list-of-A*s has an axis j with an
    upper-bound u_j different from 1, then axis j in the domain of all
    arrays A* in list-of-A*s must have either upper bound u_j or upper bound 1.

We then use the usual broadcast rules (which we have not yet specified
here) to form a new list-of-A**s containing arrays A** all with the same
domain.

1 Like

This is a detail, but I am curious about the choice of array<-scalar over scalar->array.
Am I missing some convention?

I can't recreate my thinking, it should really be scalar->array.

1 Like

Maybe scalar->array isn't needed, it's a one-line definition:

> (define (scalar->array obj) (list*->array 0 obj))
> (array->list* (scalar->array 42))
42

I guess the question is how would people want to use it.

I would include scalar->array since the operations matches a conceptual operation.
The other expression (list*->array 0 obj) requires switching to a more low-level way of thinking.

Yes, good.

But the operation wraps any value, not just a scalar and possibly another array, in an array of dimension zero, as in

heine:~/text/courses> /usr/local/racket/bin/racket
Welcome to Racket v8.16.0.2 [cs].
> (require math/array)
>  (array-shape (array (array #[0 1 2])))
'#()

So maybe object->array? Just ->array? (But the latter might be confusing, because people might think that (->array '#(0 1 2)) would process the vector into an array with three elements.)

But the operation wraps any value, not just a scalar ...
You are right, I didn't further than scalars.

I think object->array is a fine choice.

Should object->array have the calling sequence

(object->array object [ storage-class [ mutable? ] ] )

like the other conversion routines, or just the simpler

(object->array object)

which returns an immutable array with generic-storage-class?

The advantage of the second is that there is no error checking needed, no possible way to raise an exception, and I can't see the benefit of the first except for a certain uniformity of calling sequence.

Maybe I'm just tired, but I've been going back and forth on this while implementing it.