TIL: Inherited fields in structure subtypes are accessed with the accessor from the supertype

I thought you'd get at all fields in the subtype with accessors derived from the name of the subtype. No -- for supertype fields you use the supertype accessors.

For example, if:

  • A position is a pair of numbers: (struct posn (x y)); and
  • A 3d position is a kind of position, with an extra number: (struct 3d-posn posn (z))

Then the three components of (define p (3d-posn 10 20 30)) are:

(posn-x p), (posn-y p), and (3d-posn-z p)

not, say, (3d-posn-x p), as I had imagined.
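Here is a minimal sketch of that example as a runnable module. The struct definitions are the ones above; the posn? check at the end is an extra added for illustration.

```racket
#lang racket

(struct posn (x y))
(struct 3d-posn posn (z))

(define p (3d-posn 10 20 30))

;; Inherited fields are read with the supertype's accessors.
(posn-x p)    ; => 10
(posn-y p)    ; => 20

;; Only the field declared by the subtype gets a 3d-posn- accessor.
(3d-posn-z p) ; => 30

;; An instance of the subtype also satisfies the supertype's predicate.
(posn? p)     ; => #t
```

Referring to 3d-posn-x in this module fails with an unbound identifier error, because struct never defines that name.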

The docs explain this in section 5 of the Racket Guide, Programmer-Defined Datatypes.


I've bumped into this.

Although I often access struct members using match or match-define forms -- not the accessor functions -- the same thing crops up if you use the struct* match form.

That is, although you could get all the members by-position:

(match-define (3d-posn x y z) p)

You can't do this:

(match-define (struct* 3d-posn ([x x] [y y] [z z])) p)
; struct*: field name not associated with given structure type
;   at: x
;   in: (struct* 3d-posn ((x x) (y y) (z z)))

As documented. But it's kind of a shame because matching structs by-position isn't as maintainable in the face of changing the struct fields someday.




p.s. I guess you could do something like:

(match-define (and (struct* posn    ([x x] [y y]))
                   (struct* 3d-posn ([z z])))      p)

But that's a mouthful. Instead I'd probably just write out:

(define x (posn-x p))
(define y (posn-y p))
(define z (3d-posn-z p))

That's especially true when the alternative is a match-define. (If I needed the full match form's combination of conditionals and binding, and all the other match patterns were straightforward, I might be more tempted to use that (and ...) hairball. Maybe?)
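For the full match case, a sketch could look like the following. The function name posn->list is made up for illustration. The subtype clause has to come first, because a struct* posn pattern also matches 3d-posn instances.

```racket
#lang racket

(struct posn (x y))
(struct 3d-posn posn (z))

;; Hypothetical helper: list the coordinates of either kind of position.
(define (posn->list p)
  (match p
    ;; Inherited fields come from the posn pattern, z from the 3d-posn pattern.
    [(and (struct* posn    ([x x] [y y]))
          (struct* 3d-posn ([z z])))
     (list x y z)]
    ;; Plain 2D positions fall through to this clause.
    [(struct* posn ([x x] [y y]))
     (list x y)]))

(posn->list (3d-posn 10 20 30)) ; => '(10 20 30)
(posn->list (posn 10 20))       ; => '(10 20)
```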

I guess one moral of the story is that simple, explicit code is sometimes more tedious... and sometimes clearer. :smile:


I guess the example is mostly to illustrate the behavior, but in this particular example my preferred way is to simply avoid using sub/super-types for this.

Use explicit conversion instead

So instead I would do this:

(struct pos2 (x y))
(struct pos3 (x y z))

(define (pos2->pos3 p)
  (match-define (pos2 x y) p)
  (pos3 x y 0))

This way you just deal with two separate simple types and convert between them explicitly where needed. I like that this makes things more explicit.

False simplicity that leads to complicated code

Of course you could argue that subtypes are more convenient because you can just treat a 3D position as a 2D position, but I would argue that that isn't necessarily a good thing. Adding a (pos2 1 2) and a (pos3 1 2 3) and getting (pos2 2 4) as the result doesn't make much sense.
Instead I would expect it either to raise an error saying that adding two different types isn't implemented, or to return (pos3 2 4 3), which doesn't silently drop the 3D-ness of one operand. You don't get that without writing code that deals with the actual types. (Don't make it more generic than it needs to be.)

So I prefer when I am forced to explicitly write code that does what I want to happen, over code that just chooses one of the options that may or may not be appropriate.
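As a rough sketch of what "forced to be explicit" could look like: pos-add below is a hypothetical helper (not from any library) that accepts only operands of the same type, so the caller has to pick a conversion. The structs are marked #:transparent here only so results print and compare nicely.

```racket
#lang racket

(struct pos2 (x y) #:transparent)
(struct pos3 (x y z) #:transparent)

;; Explicit conversion: embed a 2D position at z = 0.
(define (pos2->pos3 p)
  (match-define (pos2 x y) p)
  (pos3 x y 0))

;; Hypothetical helper: mixed operands are an error, never a silent guess.
(define (pos-add a b)
  (match* (a b)
    [((pos2 ax ay) (pos2 bx by))
     (pos2 (+ ax bx) (+ ay by))]
    [((pos3 ax ay az) (pos3 bx by bz))
     (pos3 (+ ax bx) (+ ay by) (+ az bz))]
    [(_ _)
     (raise-arguments-error 'pos-add
                            "operands must have the same type"
                            "a" a
                            "b" b)]))

;; The caller states how the 2D operand becomes 3D.
(pos-add (pos2->pos3 (pos2 1 2)) (pos3 1 2 3)) ; => (pos3 2 4 3)
```

Calling (pos-add (pos2 1 2) (pos3 1 2 3)) directly raises an exn:fail:contract error instead of returning a 2D result.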

I also think that when going from 3D to 2D there are three variants that make sense: xy, xz, and yz (each ignoring one of the dimensions). So I don't think it is good to have one of these planes convert automatically while the others have to be converted through explicit functions. Why?
Because when you keep things symmetrical (independent of which axis or plane you are focusing on), you end up being able to write code that is generic and fast, instead of needing to special-case based on which axis you are dealing with. (For example, sometimes you can use a macro to generate all the cases, which means you type it once but it performs as if you had typed out all the repetitive low-level permutations for each axis/plane.)

Maybe there is even an argument for calling pos2->pos3 pos2->xy-pos3 instead, and having pos2->xz-pos3 and pos2->yz-pos3 too (and then you could define arbitrary planes within the 3D space that contain the point), but I am getting too off topic...
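To make the symmetry concrete, here is a sketch of the three projections and the three embeddings written out by hand. The pos3->..-pos2 names are invented for this example, and filling the missing coordinate with 0 is an assumption carried over from pos2->pos3.

```racket
#lang racket

(struct pos2 (x y) #:transparent)
(struct pos3 (x y z) #:transparent)

;; 3D -> 2D: each projection drops exactly one axis.
(define (pos3->xy-pos2 p) (pos2 (pos3-x p) (pos3-y p)))
(define (pos3->xz-pos2 p) (pos2 (pos3-x p) (pos3-z p)))
(define (pos3->yz-pos2 p) (pos2 (pos3-y p) (pos3-z p)))

;; 2D -> 3D: each embedding places the point on one plane,
;; with the missing coordinate set to 0.
(define (pos2->xy-pos3 p) (pos3 (pos2-x p) (pos2-y p) 0))
(define (pos2->xz-pos3 p) (pos3 (pos2-x p) 0 (pos2-y p)))
(define (pos2->yz-pos3 p) (pos3 0 (pos2-x p) (pos2-y p)))

(pos3->xz-pos2 (pos3 1 2 3)) ; => (pos2 1 3)
(pos2->yz-pos3 (pos2 1 2))   ; => (pos3 0 1 2)
```

None of the planes is privileged here: every conversion is a named function the caller has to choose.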

Where do I use subtypes / or something else?

APIs / Interfaces / Runtime structure / organization

I think subtypes have their uses, e.g. when you use them to create an API/interface that has different implementations which need their own specialized internal invariants, handles to resources, etc. You keep the generic stuff in the supertype and the specific stuff in the subtype. (Generics can be useful too.)

I also want to mention that sometimes explicit composition by just having a struct member that is another struct might be a good choice.
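A small sketch of the composition option, assuming a hypothetical pos3 that holds a pos2 in a field rather than inheriting from it:

```racket
#lang racket

(struct pos2 (x y) #:transparent)

;; Hypothetical: this pos3 *has* a pos2 rather than *being* one.
(struct pos3 (xy z) #:transparent)

(define p (pos3 (pos2 10 20) 30))

;; Reaching the 2D part is an explicit step.
(pos2-x (pos3-xy p)) ; => 10
(pos3-z p)           ; => 30

;; A pos3 is not accepted where a pos2 is expected.
(pos2? p)            ; => #f
```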

I don't use classes often, but they don't leak their class name into every method name, so depending on what you are actually implementing they might be the natural choice for polymorphic things. (similar with generics)

Low-level

For things like positions and other low-level mathy stuff, I want them to be as simple as possible, with the least amount of indirection, because those are more likely to have a lot of instances, and it is good when they can be easily transformed into a contiguous memory segment that is just pure data without pointers to other places. (Because then you can get good performance.)
So I keep those as plain structs, which are then transformed, one level lower, into buffers of data containing multiple positions.
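One way this transformation could look, as a sketch only: it assumes flvectors from racket/flonum as the buffer type and an interleaved x, y layout, neither of which is specified above. The name pos2s->flvector is made up.

```racket
#lang racket
(require racket/flonum)

(struct pos2 (x y))

;; Hypothetical helper: pack a list of pos2 into one flat buffer
;; laid out as x0 y0 x1 y1 ...
(define (pos2s->flvector ps)
  (define buf (make-flvector (* 2 (length ps))))
  (for ([p (in-list ps)]
        [i (in-naturals)])
    (flvector-set! buf (* 2 i)       (real->double-flonum (pos2-x p)))
    (flvector-set! buf (+ (* 2 i) 1) (real->double-flonum (pos2-y p))))
  buf)

(define buf (pos2s->flvector (list (pos2 1 2) (pos2 3 4))))
(flvector-length buf) ; => 4
(flvector-ref buf 2)  ; => 3.0
```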

I would avoid polymorphism for low-level stuff (and I prefer math to be low-level if possible).

Questions

This reminds me of this post, because I feel like there are some questions here that could be asked:

What are good use cases for structs with super-/subtypes?
Where do you like to use them and how does it help you?


This really seems like it's a bug in struct* that should be fixed.

To be fair the struct* documentation discloses this limitation, but I agree it would be nice to change.

As long as the hood is open, I'd suggest another, "ergonomic" improvement: Allow x as a shorthand for [x x]. (In my experience a common case is wanting the variable name to be the same as the field name.)

Combining both of these, the struct* pattern example above could be either of these, which are equivalent:

(struct* 3d-posn (x y z))
(struct* 3d-posn ([x x] [y y] [z z]))

@samth I guess one wrinkle is that super struct field names need not be unique. For example a field named x may exist both in a struct and its super struct:

(struct a (x))
(struct b a (x))
(define v (b 1 2))
(a-x v) ; => 1, the x field inherited from a
(b-x v) ; => 2, the x field declared by b

So struct* would need to let you specify which struct's "x" field was desired. Some syntax like a:x or a.x or other TBD?

Plus some behavior for just "x" when it's ambiguous, like use the most-immediate struct, or error, or other TBD?

Maybe when struct* was originally written, this didn't seem worth the bother, or it wasn't something to decide alone in a rush, so the author just stopped there and documented a limitation.

The Racket exceptions, exn, are an example of structs with super/subtypes.
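A quick sketch of how that hierarchy behaves, using a division by zero as the example error. The predicates and exn-message are standard Racket; the variable name caught is just for illustration.

```racket
#lang racket

;; Catch the exception raised by (/ 1 0) and keep the exception value.
(define caught
  (with-handlers ([exn:fail? (lambda (e) e)])
    (/ 1 0)))

;; The value is an instance of the most specific subtype...
(exn:fail:contract:divide-by-zero? caught) ; => #t
;; ...and therefore of every supertype up the chain.
(exn:fail:contract? caught)                ; => #t
(exn? caught)                              ; => #t

;; The message field is declared on exn, so the accessor is exn-message,
;; not something named after the subtype.
(string? (exn-message caught))             ; => #t
```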

However, I think that structs should be used only for very simple cases. Objects and classes are a more elegant solution to many of the struct related issues I have seen discussed here...


I also agree with you that pos2 and pos3 should be different types, but in my experience they should not be structs at all. There are two broad cases where I had to use positions:

  • either I have lots of positions, and it is faster and more convenient to store the coordinates for all the positions in a single vector next to each other -- I use this technique to display a few million data points on an interactive map (in Racket, no less :slight_smile: ). An alternative is to store the X coordinates in one vector and the Y coordinates in another.
  • or, I have a position which needs to be estimated from different measurements, in which case it is stored as 3 elements in a matrix together with other parameters such as orientation, speed, angular velocity, etc. as part of a Kalman filter.
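The two storage layouts from the first case can be sketched as follows. Using flvectors is an assumption (the post only says "vector"), and the accessor names are invented.

```racket
#lang racket
(require racket/flonum)

;; Layout 1: one vector, coordinates next to each other (x0 y0 x1 y1 ...).
(define xys (flvector 1.0 2.0 3.0 4.0))

;; Hypothetical accessor: return x and y of point i as two values.
(define (interleaved-ref v i)
  (values (flvector-ref v (* 2 i))
          (flvector-ref v (+ (* 2 i) 1))))

;; Layout 2: one vector per coordinate.
(define xs (flvector 1.0 3.0))
(define ys (flvector 2.0 4.0))

;; Hypothetical accessor for the split layout.
(define (split-ref xs ys i)
  (values (flvector-ref xs i)
          (flvector-ref ys i)))

(interleaved-ref xys 1) ; => 3.0 4.0
(split-ref xs ys 1)     ; => 3.0 4.0
```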

Alex.


I've used structure inheritance in the implementation of a lexer. In an alternative design, I could have added a type field to the token struct, but then I would have needed to define predicates for all the "subtypes" myself. Also, there was no requirement that the subtypes of token be that open of a set.


(struct token (srcloc value) #:transparent)
(struct identifier      token () #:transparent)
(struct binary-selector token () #:transparent)
(struct keyword         token () #:transparent)
(struct block-argument  token () #:transparent)
(struct delimiter       token () #:transparent)
(struct opener          token () #:transparent)
(struct closer          token () #:transparent)
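A short sketch of what this design buys, using three of the token types above. The describe function and the #f source locations are placeholders added for illustration.

```racket
#lang racket

(struct token (srcloc value) #:transparent)
(struct delimiter token () #:transparent)
(struct opener    token () #:transparent)
(struct closer    token () #:transparent)

;; #f stands in for a real source location in this sketch.
(define t (opener #f "("))

;; Each subtype gets its predicate for free.
(opener? t)     ; => #t
(closer? t)     ; => #f
(token? t)      ; => #t

;; Fields live on token, so the accessor is token-value for every subtype.
(token-value t) ; => "("

;; Hypothetical dispatch on the token kind via positional match patterns.
(define (describe tok)
  (match tok
    [(opener _ v) (format "open ~a" v)]
    [(closer _ v) (format "close ~a" v)]
    [(token _ v)  (format "token ~a" v)]))

(describe t)                  ; => "open ("
(describe (delimiter #f ",")) ; => "token ,"
```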
