A new way to type mutable data structures with opposite variance in typed racket

Hi everyone,

I have been experimenting with a way to define types with opposite variance in typed racket (see also Is it possible to make Mutable Types support subtype? · racket/typed-racket · Discussion #1129 · GitHub). The idea is to use a struct with two type arguments, one for the input type and one for the output type, and then use a Parameter type to constrain them. For example, here is how I defined a type for queue, a mutable data structure from data/queue:

#lang typed/racket/base

(require typed/racket/unsafe)

(struct (S T) Queueof ([_ : (Parameter S T)]))
(define-type (Queue S T) (Queueof S T)) ; avoid printing #(struct:Queueof ...)
;; TODO (Queue T) -> (Queue T T)
#;(struct (A ...) _ ([_ : (Parameter A ...)]) #:type-name Queue) ; not work well
(provide Queue)

(unsafe-require/typed/provide
 data/queue
 [make-queue (All (S T) (-> (Queue S T)))]

 [enqueue! (All (S) (-> (Queue S Any) S Void))]
 [enqueue-front! (All (S) (-> (Queue S Any) S Void))]
 [dequeue! (All (T) (-> (Queue Nothing T) T))]

 [queue->list (All (T) (-> (Queue Nothing T) (Listof T)))]
 [queue-length (-> (Queue Nothing Any) Index)]

 [queue? (pred (Queue Nothing Any))]
 [queue-empty? (-> (Queue Nothing Any) Boolean)]
 [non-empty-queue? (All (S T)
                        (case->
                         (-> (Queue S T) Boolean)
                         (-> Any Boolean : #:+ (Queue Nothing Any))))])

I think this way of defining types with opposite variance is interesting and useful for mutable data structures. I wonder if it is safe to use, and if it is, can we redefine some built-in mutable types in this way? For example, can we redefine mutable vector or mlist in this way?

4 Likes

I encapsulated the above code into the typed-data-queue-lab package and used it to implement typed/amb, and everything seems to be working fine.

I came across an interesting discovery. Typed Racket defines Nothing as the
subtype of all types. At first glance, it seems intuitive to assume that
(Listof Nothing) should be a subtype of all List types. Furthermore, since
Null is also a subtype of all List types, one might expect (Listof Nothing)
and Null to be the same type.

However, Typed Racket doesn't see it that way. According to the type checker,
Null is indeed a subtype of all List types, but (Listof Nothing) is not.
Here's a simple example to illustrate this behavior:

Welcome to Racket v8.15 [cs].
> (ann (ann '() Null) (Listof Nothing))
- : (Listof Nothing)
'()
> (ann (ann '() (Listof Nothing)) Null)
string:1:10: Type Checker: type mismatch
  expected: Null
  given: (Listof Nothing)
  in: (quote ())
 [,bt for context]

At first, I thought this was simply an implementation choice. But I realized I
was wrong! There are actual, concrete lists that have the type (Listof Nothing)
and are not null.

This distinction becomes clear when considering situations where Typed Racket
might support contravariant types. Imagine converting mutable data to immutable
data: the first step would be to treat the mutable data as read-only. But what
if the mutable data was write-only? Here's a concrete example:

Welcome to Racket v8.15 [cs].
> (require unsafe/typed/data/queue)
> (: q (Queueof Any Nothing))
> (define q (make-queue))
> (enqueue! q 1)
> (enqueue! q 2)
> (enqueue! q 3)
> (queue-length q)
- : Integer [more precisely: Index]
3
> (define l (queue->list q))
> (:print-type l)
(Listof Nothing)
> (length l)
- : Integer [more precisely: Index]
3
> (null? l)
- : Boolean
#f

Here, the queue q is write-only (Queueof Any Nothing), but after converting
it to a list l, we obtain a non-Null list with the type (Listof Nothing).

This behavior sheds light on a subtle but important aspect of Typed Racket's type
system: (Listof Nothing) and Null are fundamentally different because
(Listof Nothing) can represent lists containing elements that can never be read
or observed due to type constraints!

After digging deeper, I realized there's a problematic aspect to this behavior in
Typed Racket. Immutable data types like List in TR are always readable, so
attempting to access the elements of a list with the type (Listof Nothing) will
result in a type error that Typed Racket cannot detect. Here's an example:

> (list-ref l 0)
- : Any [more precisely: Nothing]
1
> (list-ref l 1)
- : Any [more precisely: Nothing]
2
> (list-ref l 2)
- : Any [more precisely: Nothing]
3

In this case, the list l is typed as (Listof Nothing), meaning it should not
contain any observable elements. However, list-ref allows access to the values
stored in it, which leads to a mismatch between the expected type (Nothing) and
the actual runtime behavior.

I found another issue related to (Listof Nothing). When we define a predicate
using define-predicate to check if a list is of type (Listof Nothing), the
predicate does not produce correct Boolean values. Here's an example:

> (define-predicate ls-of-none? (Listof Nothing))
> (ls-of-none? '())
- : True
#t
> (ls-of-none? '(1))
- : False
#f
> (ls-of-none? l)
- : True
#f

A simpler example:

Welcome to Racket v8.15 [cs].
> (require unsafe/typed/data/queue)
> (: q (Queueof Any Number))
> (define q (make-queue))
> (enqueue! q 123)
> (dequeue! q)
- : Number
123
> (enqueue! q 'hi)
> (dequeue! q)
- : Number
'hi

After further analysis, it seems the root of this issue lies in how Typed Racket's
type checker works. When a function is called, the type checker knows the types
of both the arguments and parameters, so it can verify that the input matches the
expected type. However, after the function call, the type checker only knows the
declared return type, not the actual runtime type, since functions are treated as
"black boxes" in this context.

The Parameterof in Typed Racket addresses this issue in two ways:

  1. Static Type Enforcement in Typed Modules:
    In a Typed Racket module, make-parameter enforces that the "write type" must
    always be a subtype of the "read type":
Welcome to Racket v8.15 [cs].
> (: p (Parameter Any Number))
> (define p (make-parameter 'hi))
string:1:10: Type Checker: Polymorphic function `make-parameter' could not be applied to arguments:
Types: a  -> (Parameterof a)
       b (-> a b)  -> (Parameterof a b)
Arguments: 'hi
Expected result: (Parameterof Any Number)
  in: (make-parameter (quote hi))
 [,bt for context]
> (:print-type make-parameter)
(All (a b) (case-> (-> a (Parameterof a)) (-> b (-> a b) (Parameterof a b))))

Here, make-parameter ensures that 'hi cannot be used as an initial value for
a (Parameter Any Number).

  1. Contracts for Untyped Modules:
    For parameters imported from an untyped module, Typed Racket adds a contract to
    enforce the type at runtime. This ensures that any violation of the declared type
    is caught dynamically:
> (module m racket/base
    (provide q)
    (define q (make-parameter 'hi)))
> (require/typed 'm
    [q (Parameter Any Number)])
> (q)
q: broke its own contract
  promised: number?
  produced: 'hi
  in: the parameter of
      (parameter/c any/c number?)
  contract from: (interface for q)
  blaming: (interface for q)
   (assuming the contract is correct)
  at: string:2:3
 [,bt for context]
> (q 123)
> (q)
- : Number
123
> (q 'hi)
> (q)
q: broke its own contract
  promised: number?
  produced: 'hi
  in: the parameter of
      (parameter/c any/c number?)
  contract from: (interface for q)
  blaming: (interface for q)
   (assuming the contract is correct)
  at: string:2:3
 [,bt for context]

Here, the contract ensures that 'hi is flagged as invalid because it does not
conform to the expected Number type.

These two mechanisms together ensure that Parameter behaves correctly in both
typed and mixed-typed contexts.

1 Like

It seems there are only three situations where unsafety can occur:

  1. Directly reading write-only data.

  2. Converting write-only data into another type:
    When write-only data is converted into another type, the type checker cannot ensure that the behavior of the target type matches expectations. Since write-only data cannot be read, converting it into a type that assumes readable behavior can lead to potential type errors.

  3. Writing to a value where the write type is a supertype of the read type, then reading it:
    In this case, the write type allows a broader range of values than the read type. Writing a value that doesn’t conform to the read type and then attempting to read it will result in a type mismatch at runtime.

I've recently uploaded the typed-racket-mutable-treelist-lab package, which extends this idea to Racket's new mutable-treelist data structure. Typed Racket has historically struggled with handling mutable types due to the challenges of ensuring type safety and variance. This new package leverages contravariant type variables to define a type-safe interface for mutable-treelist.

While this approach is not suitable for retrofitting existing mutable types, the introduction of mutable-treelist in Racket provided a good opportunity to experiment with defining contravariant types. I hope this package addresses some of the past difficulties with using mutable data structures in Typed Racket and encourages further exploration of this method.

I've also replaced queue with mutable-treelist in my implementation of amb. In terms of performance, the two are nearly identical, but mutable-treelist offers a far richer set of operations compared to queue.

Forgive me, I'm very confused. At first, I was thinking that you had discovered a serious problem with the soundness of typed racket. Then I looked more closely and saw that you were using something called unsafe/typed/data/queue. My assumption is that when you use an unsafe data structure, the responsibility is on you (the user of the package) to ensure that you do not violate the type constraints of the data structures you construct... but then you do exactly that, when you enqueue values that don't belong to the (uninhabited) type Nothing. At that point, I would say that all bets are off, you've "cheated", and you're going to suffer the consequences. Am I misreading your text?

Yes, you are correct. The types defined using this method are unsafe. This post is about sharing an informal way to define contravariant types, which is why I prefix packages that add types in this way with unsafe.

While this approach is unsafe, it addresses a significant limitation in TR: the lack of contravariant type support, which results in mutable types having no subtyping relationships. For example:

> (ann (vector 0) (Mutable-Vectorof Natural))
- : (Mutable-Vectorof Nonnegative-Integer)
'#(0)
> (ann (ann (vector 0) (Mutable-Vectorof Natural)) (Mutable-Vectorof Integer))
string:1:10: Type Checker: type mismatch
  expected: (Mutable-Vectorof Integer)
  given: (Mutable-Vectorof Nonnegative-Integer)
  in: (vector 0)
 [,bt for context]

Additionally, predicates for mutable types cannot be defined:

> (define-predicate p (Mutable-Vectorof Natural))
string:1:20: Type Checker: Error in macro expansion -- Type (Mutable-Vectorof Nonnegative-Integer) could not be converted to a predicate: required a flat contract but generated a chaperone contract
  in: (Mutable-Vectorof Natural)
 [,bt for context]

These factors make mutable types in TR very difficult to use. As shown in other posts (e.g., this one), I'm not the only one facing this problem, which seems to have persisted for quite some time. Years ago, I struggled with this issue and resorted to using untyped Racket, then importing unsafe types via unsafe-require, and carefully working with them. In this discussion, @AlexKnauth introduced the concept of contravariant contracts and proposed contravariant types as a potential solution.

Currently, the only contravariant type officially supported in TR is Parameterof:

> (ann (make-parameter 0) (Parameter Natural Natural))
- : (Parameterof Nonnegative-Integer)
#<procedure:parameter-procedure>
> (ann (ann (make-parameter 0) (Parameterof Natural Natural)) (Parameterof One Integer))
- : (Parameterof One Integer)
#<procedure:parameter-procedure>

I've summarized cases where using contravariant types defined with struct and Parameterof is unsafe, and I think these cases are relatively uncommon. A careful programmer can avoid them, making this type of unsafety one of the more manageable kinds. This approach introduces subtype polymorphism and read-only types, significantly alleviating the challenges of working with mutable types in TR.

Moreover, with Racket's recent addition of mutable-treelist, a new mutable data structure that has yet to see widespread use, I see an opportunity for programmers to experiment with these types in TR. Exploring contravariant types might also provide valuable insights and serve as a testbed for future developments in TR. For instance, some of the unsafe scenarios I've identified, such as write-only types and write types being supertypes of read types, are issues that would need to be addressed if TR were to officially support contravariant types. After all, types and contracts are fundamentally different, as evidenced by the fact that many TR types cannot be directly translated into corresponding contracts.

Lastly, I'd like to explain why it took me over a year to notice these unsafe scenarios :stuck_out_tongue: . When writing TR code for testing, the test code itself must be type-correct, which makes it difficult to write test cases that deliberately produce type errors. Since rackunit only captures runtime exceptions, I naturally ended up writing code that was type-correct (which is one of the reasons I love TR—it guides me to write correct code). This is likely why I missed what, in hindsight, seems like an obvious unsafe scenario for almost a year! :slight_smile:

Taking a step back, I think I now understand what you're saying. I think that my principal obstacle was your use of the term "contravariant type"; in reading the post that you link to, it appears to me that Alex does not use this term, and I would say that it's a confusing one; the term has an existing and different meaning. I'd be inclined to call them something like "split mutable types", "read-write paired mutable types", or something similar?

1 Like

I don't quite remember why I initially referred to it as contravariant type. Perhaps it was because the concept seemed somewhat analogous to contravariant functors in category theory?

Additionally, I noticed that the definition of contravariant on Wikipedia seems to align with this usage.

That said, if this terminology feels confusing, I agree that read-write paired types is an excellent alternative. Thank you for the suggestion!

(I only briefly read the replies, so sorry if I’m missing anything.)

Generally, we speak of contravariant/negative positions, where the subtyping rule is “in reverse”, in contrast to covariant/positive positions. For example, in a function, output positions are covariant, while input positions are contravariant. This is why a mutable type itself must be invariant: a mutable type t(?a) must support (let me use a Shplait-like notation)

  • a get operation get :: t(?a) -> ?a (item type ?a is in covariant position), and
  • a set operation set :: (t(?a), ?a) -> () (item type ?a is in contravariant position).

When you combine contravariance and covariance, bang, invariance!

In a capability-based approach, you can use get/set capabilities to limit what one can do with a mutable object and therefore allow contravariant/covariant types as the supertypes. This is the more conventional approach to the variance problem and is discussed in, for example, Types and Programming Languages Section 15.5. Coming back to the context of Typed Racket, we can imagine (Mutable-Vectorof A) being the subtype of (Writable-Vectorof A) (contravariant) and (Readable-Vectorof A) (covariant) (while (Immutable-Vectorof A) is only (Readable-Vectorof A)). But the covariant/contravariant types must have no relation (because they represent capabilities). I think what Alex proposed is similar in spirit to that, but it combines get/set capabilities into one type, which is nice.

The “unsafety” isn’t really an inherent problem. The issue here is that a mutable type constructor t, when given two positions, must be constrained such that the set type is always a subtype of the get type. This is easy to achieve: don’t allow “direct” formation at all. Instead, let t(?a-, ?a+) be a subtype of equivalent to t(?a), and contravariance/covariance make it that the set type is always a subtype of the get type. But it does require a more powerful type system than what Typed Racket offers now.

Also, I don’t think Parameterof does what you think it does. What the type actually says is that parameters can be read differently than be written, because parameters can be given guards that do conversion, but the guard isn’t applied to the initial value. For example, you can write

#lang typed/racket/base
(: p (Parameterof Any Number))
(define p (make-parameter 0 (lambda (v) 1)))

and it typechecks. So this is a different problem than variance.

If you want to more systematically study type theory, the mentioned Types and Programming Languages is a good book. The topics are relatively “classic”, in that it touches subtyping quite heavily, but that makes it exactly on topic.

1 Like

Thank you so much for your response! I've been meaning to take the time to study type theory systematically, but unfortunately, I just haven't had enough time. I've added Types and Programming Languages to my reading list. Since I haven't formally studied type theory, I realize that some of my questions may seem very basic, so I hope you can bear with me.

To express my confusion, I'd like to use the example of a mbox: for instance, if I define MBoxof using Parameterof, how does this mbox differ from a truly invariant box? For example, the following code defines MBoxof, and to me, it behaves similarly to both invariant types and Parameter types:

#lang typed/racket/base

(module mb racket/base
  (provide (all-defined-out))
  (define make-mbox
    (case-λ
      [(v) (make-mbox v (λ (v) v))]
      [(v f) (cons (box v) f)]))
  (define (unmbox mb)
    (define b (car mb))
    (define v (unbox b))
    v)
  (define (set-mbox! mb v)
    (define b (car mb))
    (define f (cdr mb))
    (set-box! b (f v)))
  (define (mbox? mb)
    (and (pair? mb)
         (box? (car mb))
         (procedure? (cdr mb)))))

(require typed/racket/unsafe)
(struct (w r) MBoxof ([_ : (Parameter w r)]))
(define-type (MBox w r) (MBoxof w r))
(provide MBox (rename-out [MBox MBoxof]))

(define-type MBoxTop (MBox Nothing Any))
(provide MBoxTop)

(unsafe-require/typed/provide 'mb
  [make-mbox   (∀ (w r) (case→ (→ w (MBox w w)) (→ r (→ w r) (MBox w r))))]
  [unmbox      (∀ (r) (→ (MBox Nothing r) r))]
  [set-mbox!   (∀ (w) (→ (MBox w Any) w Void))]
  [mbox?       (pred MBoxTop)])
Welcome to Racket v8.15 [cs].
> (require "unsafe/mbox.rkt")
> (: mb (MBoxof Any Number))
> (define mb (make-mbox 0 (lambda (v) 1)))
> (unmbox mb)
- : Number
0
> (set-mbox! mb "hi")
> (unmbox mb)
- : Number
1

It looks like, at least in the typed module, the issue of w being a supertype of r is unlikely to occur with mbox. This is because both make-mbox and make-parameter seem to follow a similar strategy. I hope I've managed to express my thoughts clearly here.

Uh, I finally see why Parameterof was mentioned! Yes, the two positions in Parameterof are indeed contravariant and covariant, respectively, so it is related to the variance problem. The difference is that, in a normal mutable object, we’re starting from (Mutable-Objectof A A), instead of (Mutable-Objectof A B), because the A -> B guard is missing. So I think your idea can indeed work safely, if you make the data constructor polymorphic in one type variable instead of two. (More concretely, try changing the type of make-queue to -> (Queueof A A)!) And I think the ability to make user-defined mutable objects work this way will be a great addition to Typed Racket.

This is a side point: Nothing is the bottom/empty type in Typed Racket, which means no value has the Nothing type. When the type system is behaving properly, only non-terminating terms can have the Nothing type, for example, an infinite loop (it can have any type, technically, but there’s nothing you can do with it :P). So you can’t really “read” a Nothing value, but you are able to get a Nothing “result” from queues by using the above proposed type: when you make a (Queueof Nothing Nothing), you can’t write anything into it (because there’s no Nothing value), and therefore you can’t read anything from it (because the way queues work is that you have to write something before you can read). That means that this is indeed a safe approach, because Nothing is behaving properly. (Puns intended!)

Edit: And there is one way to derive “write-only” (Queueof A Nothing) safely in this approach. Imagine we have an operation queue->write-only-queue that has type (Queueof A B) -> (Queueof A Nothing). This operation should wrap/chaperone the original queue such that when the queue is read, an error is raised. Raising an error is non-terminating and thus can have Nothing type totally fine, and the behavior also makes sense: nothing can be read from the queue by the virtue of raising. And you can also have a (Listof Nothing) term totally fine, except that it raises an error when you try to reduce it, so no value is ever read. Unfortunately, we cannot statically detect when a “write-only” queue is read, unlike in a genuine capability-based approach. In a capability-based approach, reading from a write-only queue wouldn’t typecheck at all. This also highlights the fact that Nothing isn’t a “constraint”—it’s a totally normal type that has no value, and you have to maintain that fact for type safety.

1 Like

First of all, thank you for pointing out the correct type for make-queue! If I understand correctly, as long as we ensure that w is a subtype of r at the source, i.e., in the data constructor, then using this strategy to define write-read pair types can be done safely.

Given this, I don't think it's necessary to use the unsafe- prefix for packages that adopt this strategy. Instead, I've opted to use -lab to give space for the eventual formal support for such types in TR.

it can have any type, technically, but there's nothing you can do with it :stuck_out_tongue:

I understand that, but I'd like to point out that sometimes it makes sense for a non-terminal to have a type other than Nothing. For example, in typed-amb, I define (in-amb* thk) to construct a sequence, where calling thk invokes a continuation previously captured by call/cc. Although the return type of thk might appear to be Nothing, we cannot use Nothing because we need to specify the exact type of the Sequence returned by in-amb*: (∀ (a ...) (→ (→ (values a ... a)) (Sequenceof a ... a))).

amb* works similarly by calling a continuation, but the return type of amb* also cannot be Nothing (e.g., the type of amb* is not (→ (Listof (→ Any)) Nothing)). This is because we need to specify the type for the function calling amb*. For example, with (in-amb* (λ () (amb* (list (λ () 1) (λ () 2) (λ () 3))))), the return type of amb* must be something other than Nothing. Currently, the type of amb* is (∀ (a ...) (case→ (→ Null Nothing) (→ (Listof (→ (Values a ... a))) (Values a ... a)))).