Signatures in student languages

Hi all,

I've been following HtDP to teach introductory programming at USI Lugano for several years now, and I really like the approach. The one thing that I have been missing from the student languages is some kind of support for type annotations, so as to be able to turn the data type definitions that are so central to HtDP's approach into something that is more than comments with (structured) natural language.

Recently, I actually found out that something like that has been available since Racket 8.9 (!), in the form of checked signatures. This is great: I definitely intend to integrate it in my teaching material and let the students systematically use these features.

While trying out checked signatures, and thinking how to apply them to the various topics of HtDP, I also found a few scenarios where it'd be helpful to have some additional features. I would like to discuss here whether others think these features could be generally useful, and whether they could even be considered for inclusion in the student languages.

Of course, any comments are welcome.

1. Alias mixed as one-of

Function mixed describes mixed data or, as they are sometimes
called in HtDP, itemizations:

(define StringOrInteger (signature (mixed String Integer)))

An itemization is usually described in HtDP with the structured
natural language form:

A StringOrInteger is one of:
  - a String
  - an Integer

Having one-of as a synonym of mixed would help make formal type
declarations closer to the structured natural language form used in
HtDP:

(define (StringOrInteger (signature (one-of String Integer))

2. Signature definitions without signature

We can use define to give a signature form a name:

(define WeekDay (signature (enum "Mon" "Tue" "Wed" "Thu" "Fry" "Sat" "Sun")))

Since such definitions correspond to data (type) definitions in HtDP,
it would be convenient to have a variant of define that does not
need an explicit signature. Thus, the above definition could be
written as:

(define-type WeekDay (enum "Mon" "Tue" "Wed" "Thu" "Fry" "Sat" "Sun"))

3. Predicate types in BSL

Function predicate allows one to define a type constrained by a
predicate, which returns #true precisely for the values that belong
to the type.

(define (nonzero-number? v) (and (number? v) (not (= v 0))))
(define NonZeroNumber (signature (predicate nonzero-number?)))

Even though predicate is defined in all student languages, it cannot
really be used in BSL or BSL+ where functions are not first class.

I'm unsure what's the best way of handling this. Removing predicate
from BSL is not a good idea, given that there are several examples of
data type definitions that are already used with BSL and BSL+ (e.g.,
interval data types in HtDP) but would require predicate to be
defined.

4. Posn and PosnOf

A define-struct with name struct also defines signature forms
Struct and StructOf. For example:

(define-struct name [first last])

; name-full inputs instances of `name` where both fields are strings
(: name-full ((NameOf String String) -> String))
(define (name-full n) (string-append (name-first n) " " (name-last n)))

; the constructor of `name` returns an instance of type `Name`
(: make-name (Any Any -> Name))

In the student languages, there is a predefined struct called
posn. However, the type names Posn and PosnOf are not defined.

5. Support type-annotated struct definitions

Often one wants to constrain the types of fields in a struct. This
is possible by providing a signature for the struct's constructor:

(define-struct name [first last])

(: make-name (String String -> (NameOf String String)))

While this works, Name would still refer to a struct with two
fields of any type, whereas the type NameOf String String denotes
the "constrained" type with two strings as fields.

It would be convenient if one could add type constraints to the fields
of a define-struct form, or anyway having a mechanism to easily
define a struct with certain type constraints on its fields.

Wow, great feedback!
The signatures come from the DeinProgramm languages (and have been in place much longer actually), which is where most of the experience and feedback come from.
Thus, the signatures in the HtDP languages would definitely benefit from some iterative improvement, driven by suggestions from people actually using them.
Would you mind pasting your post sections into issues on the HtDP repo?

This way, more people could chime in with the discussion, and we could make them actionable. (No promise anything will happen quickly, but I'll certainly do my best.)

How do these signatures differ from the types in Typed Racket?

-- hendrik

Thanks for the suggestion!

I have opened 5 issues in the HtDP GitHub repo:

I did not look into how they are implemented, but it seems that the signatures in the student languages are much simpler, do not support static checking, and do support only a limited form of runtime checking.

They fit very well the HtDP approach, though, by providing a direct means to express data type definitions and function signatures as code.

Indeed they are checked dynamically, not statically, and are integrated with the test framework. For details, see this paper:

https://dl.acm.org/doi/10.1145/1863543.1863576

Here are some observations about the pedagogy behind HtDP/2e and the (probably no longer relevant) path to HtDP/3e.

First, he teaching languages are intentionally untyped (dynamically typed).

  1. Northeastern had wonderful co-op faculty who systematically interacted with students on (6 mo) internships and after their return. The first co-op almost always involved an untyped scripting language. (I dont want to cite specific statistics but it used to be far more than 50%. We lost this ability to assess these factors some five, six years ago, so who knows what it is now.)

  2. To program in an untyped language properly requires the development of the mental discipline of (a) using a type-driven design a la HtDP and (b) live without a type checker.

(( 3. There is a small advantage to imagining a type language without having a checker breathing down our neck. We can go beyond what any fixed type system offers, for example, with true-union (not tagged) types or dependent types — without getting into the complications of the latter. ))

Second, eventually students should recognize that they lack the mental discipline to write type-correct code w/o a checker. At this point we could choose one of two tools:

  1. We could equip BSL/ISL with a type system that accommodates the idioms they already know. That’s why I asked Philippe Meunier in 1999 to investigate this idea and eventually Sam TH (on Discourse) to develop Typed Racket. We never got around to “lowering” this type system to BSL/ISL.

  2. I also concluded that injecting types first would be a mistake. Why?

A dynamically typed language such as ISL reports errors with concrete witness values from the inside of the machine. So when you write

(define (f x)
  (string-append “hell” x “world”))

(f 0)

the language explains

string-append: expects a string, given 0

BUT, if we were to write

(define/typed (f {x : Number}) : String 
     (string-append “hell” x “world”))

(f 0)

we’d an abstract answer (of type Number classhing with type String).

Novice programmers are much more likely to understand concrete values rather than abstract explanations.

  1. So our second choice is to deploy a run-time checking system, which are the checked signatures now available in htdp/*sl thanks to Mike (and earlier in his DeinProgram languages).

Dynamic checking continues to deliver concrete values that violate constraints. Critically it grants students the same power that the underlying virtual machine has, namely imposing constraints on primitive functions. Students can now impose constraints on their own function definitions, meaning errors are caught earlier.

(The implementation is non-trivial so that it doesn’t shift asymptomatic complexity and still delivers comprehensible error messages.)

Third, if we were to tackle HtDP/3e, I might include checked signatures and even some form of type system as long as I can still shrink the size of the book. (I imagine off-loading extended exercise section and outside topics (say parsing iTunes files) as external resources.) But there are a number of underdeveloped topics that I would tackle first, most importantly function/design composition.

Space permitting, I would then explain the trade-off between dynamic signatures (formulate arbitrary predicates as constraints, get witness values) and type systems (up-front checking, new language, strange error messages, especially if we were to thrown in module-level type inference).

So you see. all of this is a non-trivial addition and needs to take into account context.

Fourth. in support of teachers and instructors who do use checked signatures, I have recommended in the past and recommend again to

— collect ideas/code for what signatures should come with the various *SL levels
— create a common teachpack repo that we can distribute via pkgs.racket-lang.org
— have students use drracket to install this teachpack and include it in their programs.