Impersonate syntax transformer: cursed or not?

Problem: I want to create a "smart constructor" for a struct (e.g., one that supports default values for some fields) without losing the ability to use struct-copy or struct* in match. More generally, I don't want to lose the struct-info information in the transformer binding.

One possibility is to duplicate what struct is doing, but use my own smart constructor instead. This is a huge amount of work, and when struct from Racket changes, my implementation would need to sync up to be compatible, putting even more burden on me.

I have another solution that looks really cursed, but appears to do its job well: impersonating the syntax transformer generated by struct. Is it actually cursed? Does anyone have a better solution?

Here's a concrete example:

#lang racket

(module provider racket
  (require syntax/parse/define
           (for-syntax racket/struct-info))

  (provide (rename-out [node* node]))

  (struct node (id value) #:transparent)

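  ;; Bind new-id to the compile-time value bound to x (the struct info),
  ;; impersonated so that expansion produces constr instead.
  (define-syntax-parse-rule (repack x:id constr new-id:id)
    (define-syntax new-id
      (impersonate-procedure
       (syntax-local-value #'x)
       ;; y is the input syntax, passed unchanged to the original transformer.
       ;; The first returned value is a result wrapper: it receives the
       ;; original expansion and replaces it with a use of constr.
       (λ (y)
         (values (syntax-parser [(_ . args) #'(constr . args)]
                                [_:id #'constr])
                 y)))))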

  (define better-node
    (procedure-rename
     (λ (id #:value [value #f])
       (node id value))
     'node))

  (repack node better-node node*))

(require 'provider
         (for-syntax racket/struct-info))

node

(match (node 1)
  [(node id val) (println (list id val))])

(match (node 1 #:value 2)
  [(node id val) (println (list id val))])

(println (struct-copy node (node 1) [value 10]))

(define-syntax (get-field-names stx)
  #`'#,(struct-field-info-list (syntax-local-value #'node)))

(println (get-field-names))

In this exploit, I simply copy the content in the transformer binding entirely, so I get the support in struct*, struct-copy, etc. for free. The only change I made is that I impersonate the procedure so that it does something else. The syntax transformer originally expands to the use of the non-smart constructor, so I simply replace that with a syntax transformer that expands to the use of my smart constructor.

It looks almost perfect. I can give the original syntax transformer (i.e. (syntax-local-value #'x)) any input (by adjusting y) and can totally replace the output of the original syntax transformer with anything (by adjusting what goes to the syntax-parser). This essentially allows me to supplant the original syntax transformer with anything that I want. The only constraint is that the original syntax transformer must still be called and must return without an error. An original syntax transformer that always errors, for instance, would ruin this scheme.
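
For readers unfamiliar with impersonate-procedure, here is a minimal sketch of the same mechanism applied to an ordinary function rather than a syntax transformer. The names add-one and wrapped are hypothetical and exist only for illustration. The wrapper returns a result wrapper as its first value, followed by the (possibly adjusted) argument.

```racket
#lang racket/base

;; Hypothetical stand-in for the original procedure.
(define (add-one n) (+ n 1))

(define wrapped
  (impersonate-procedure
   add-one
   (lambda (n)
     (values (lambda (result) (* result 100)) ; replaces the original output
             (* n 2)))))                      ; adjusts the original input

;; add-one receives 10 and returns 11; the result wrapper turns that into 1100.
(wrapped 5)
```

Note that add-one is still called here, which is the constraint described above: if it raised an error, the result wrapper would never get a chance to run.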

So for this particular syntax transformer generated by struct, the exploit works well. But in general, it might not. Is there a solution to this? To put it another way, given a struct that may already have prop:procedure and a procedure, is there a way to replace prop:procedure with the procedure without the caveat I mentioned above?

2 Likes

cough, cough

https://docs.racket-lang.org/struct-plus-plus/index.html

#lang racket

(require struct-plus-plus)

(struct++ node
          ([(id (gensym "node-")) (or/c string? symbol?) (compose1 string->symbol ~a)]
           [value])
          #:transparent)
; This produces (struct node (id value)) along with a lot of
; supporting infrastructure.  Since it is at base a call to `struct`
; you can still do struct-copy and (match (struct* ...)) etc


;;  Examples:
;
; A node contains two fields:
;
;   - `id`, which can accept either a string or a symbol and will
;   coerce it to a symbol.  It defaults to the result of (gensym
;   "node-"), a guaranteed-unique symbol of the form 'node-4502929
;
;   - `value` which can be of any type

(node++ #:value 'x)  ; the ID field will default
;; => (node 'node-4528542 'x)

(define bob (node++ #:id 'bob #:value (hash 'age 17 'eyes 'brown)))
bob
;; => (node 'bob '#hash((age . 17) (eyes . brown)))

(node++ #:id "bob" #:value (hash 'age 17 'eyes 'brown)) ; id will be coerced to symbol
;; => (node 'bob '#hash((age . 17) (eyes . brown)))

(with-handlers ([any/c (lambda (e) (displayln (exn-message e)))])
  ; This throws an exception
  (node++ #:id 17 #:value '()))
;; =>
;; node++: contract violation
;;   expected: (or/c string? symbol?)
;;   given: 17
;;   in: the #:id argument of
;;       (->*
;;        (#:value any/c)
;;        (#:id (or/c string? symbol?))
;;        node?)
;;   contract from: (function node++)
;;   blaming: /Users/dstorrs/test.rkt
;;    (assuming the contract is correct)
;;   at: /Users/dstorrs/test.rkt

; The default `struct` constructor still exists.  It does not do type checking or handle coercions
(node '17 '())
;; => (node 17 '())

; Want a functional setter?
(set-node-value bob 'new-value)
;; => (node 'bob 'new-value)


; How about dotted accessors that make clear what is the struct name and what is the field name?
(node.id bob)
;; => 'bob

; match still works
(match bob
  [(struct* node ([id id] [value val])) (displayln (~a "id is: " id ", value is: " val))])
;; => id is: bob, value is: #hash((age . 17) (eyes . brown))

; Hey, look!  Full reflection data!
(match (force (struct++-ref bob))
    [(struct* struct++-info
              ([base-constructor base-constructor]
               [constructor constructor]
               [predicate predicate]
               [fields (and fields
                            (list (struct* struct++-field
                                           ([name     field-names]
                                            [accessor field-accessors]
                                            [contract field-contracts]
                                            [wrapper  field-wrappers]
                                            [default  field-defaults]))
                                  ...))]
               [rules (and rules
                           (list (struct* struct++-rule
                                          ([name rule-names]
                                           [type rule-types]))
                                 ...))]
               [converters converters]))
  
     (pretty-print
      (hash 'field-names     field-names
            'field-accessors field-accessors
            'field-contracts field-contracts
            'field-wrappers  field-wrappers
            'field-defaults  field-defaults
            'rule-names      rule-names
            'rule-types      rule-types
            'converters      converters
            'fields          fields
            'rules           rules))])
;; =>
;; '#hash((converters . ())
;;        (field-accessors . (#<procedure:node-id> #<procedure:node-value>))
;;        (field-contracts . (#<flat-contract: (or/c string? symbol?)> #<flat-contract: any/c>))
;;        (field-defaults . (node-4518296 no-default-given))
;;        (field-names . (id value))
;;        (field-wrappers . (#<procedure:composed> #<procedure:identity>))
;;        (fields . (#<struct++-field> #<struct++-field>))
;;        (rule-names . ())
;;        (rule-types . ())
;;        (rules . ()))

;;  See the docs for what else it can do.  There's a lot.

2 Likes

Also, struct-copy should probably be avoided: see the Stack Overflow question "How do you get struct-copy to create a struct of the same type as the original?", where lexi-lambda (aka Alexis King) sums it up as "struct-copy is hopeless and can't be fixed without major changes to how structs work."
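
The behavior named in that question's title can be reproduced with a short sketch. The point and point3d structs below are hypothetical examples, not taken from the linked question. As documented, struct-copy produces an immediate instance of the named type, even when given an instance of a subtype.

```racket
#lang racket/base

(struct point (x y) #:transparent)
(struct point3d point (z) #:transparent)

(define p (point3d 1 2 3))

;; The copy is a plain point: the z field and the point3d type are lost.
(define q (struct-copy point p [x 10]))

q            ; (point 10 2)
(point3d? q) ; #f
```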

1 Like

Are there any plans to support supertypes at some point? If not, what is/are the reason(s)?

I don't want to complain, but I'm curious. :slight_smile:

1 Like

That is truly unfortunate re: struct-copy; I just changed a number of (match … [(struct* … forms in my code to use struct-copy. Fortunately, I think I'm in one of the "safe" realms for its use…

1 Like

Pull requests extremely welcome, but I'm not intending to do it myself. Details are in the link I quoted, but the upshot is that once supertypes get involved things get very hairy and I didn't want to put in the work.

2 Likes

I've recently been looking for a way to create a functionally updated copy of a struct where most fields stay the same; struct-copy is what I found, so that is what I used. Is there a recommendation for what to use instead? I read the Stack Overflow discussion, but it seems to say only that struct-copy is kind of bad, without providing an alternative (or I've missed it).

Technically, I could of course write a function similar to struct-copy for my specific struct type, but that doesn't look like a better alternative to using struct-copy.
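
For concreteness, such a hand-written function might look like the sketch below. The node struct mirrors the one from the first post, and node-update is a hypothetical helper name. Each keyword argument defaults to the current field value, so only the fields being changed need to be mentioned.

```racket
#lang racket/base

(struct node (id value) #:transparent)

;; Hypothetical helper: functional update written by hand for one struct type.
(define (node-update n
                     #:id [id (node-id n)]
                     #:value [value (node-value n)])
  (node id value))

(define a (node 'a 1))

(node-update a #:value 10) ; (node 'a 10)
a                          ; (node 'a 1), the original is unchanged
```

The cost is that every struct type needs its own helper, and the helper has to be kept in sync with the field list by hand.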

Does it make sense to say, "if struct-copy seems to work for my use case, it's fine" or is there the possibility that there are non-obvious problems that will catch me later? I'm mostly thinking of runtime problems. I guess I can live with failures at compile time if I can find a workaround.

Sorry if my question is (too) vague.

1 Like

struct++ is cool. My question, however, is somewhat theoretical and more general than just struct handling.

Question: given a transformer binding and a syntax transformer, is it possible to create a new transformer binding that acts just like the input transformer binding, except that the given syntax transformer is used for expansion instead?

I alluded to “smart constructor” as one motivation why doing this is useful, but its application goes beyond struct handling.

1 Like

I had another part of my answer regarding struct-copy, but it got stripped out. Sigh.

There’s no reason why you should avoid struct-copy. The answer given in StackOverflow was in 2018. Since then, I have been fixing it.

To be clear, it still has many “limitations”, but these "limitations" can also be viewed as a good characteristic of struct. For example, fields are static and not dynamic, so you can't query them dynamically. On the flip side, because fields are static, field lookup is very efficient. If you want things to be more dynamic, you can for instance use hash to represent data instead of struct at the cost of lookup efficiency. So many of these criticisms sound like saying that lists are bad because they can't do random access. Does that mean we should always use vectors instead of lists everywhere? Absolutely not.
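
A small sketch of that trade-off, using a hypothetical node struct and an equivalent hash. With the struct, the field names are fixed at compile time; with the hash, keys are ordinary runtime values.

```racket
#lang racket/base

(struct node (id value) #:transparent)

;; Static fields: the field name in struct-copy is resolved at compile time.
(define n (node 'a 1))
(define n2 (struct-copy node n [value 10]))

;; Dynamic keys: the key can be computed at run time.
(define h (hash 'id 'a 'value 1))
(define key 'value)
(define h2 (hash-set h key 10))

(node-value n2)  ; 10
(hash-ref h2 key) ; 10
(hash-keys h)     ; the keys can be queried dynamically, in unspecified order
```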

The actual bugs, especially hygiene bugs, are mostly fixed. I don’t think “unhygienically pasting bits of structs together” in the Stack Overflow answer is a good characterization of the struct system anymore.

7 Likes

There’s no reason why you should avoid struct-copy. The answer given in StackOverflow was in 2018. Since then, I have been fixing it.

Very cool! Thank you for doing this.

2 Likes

I just discovered adjust-struct-info from the scramble library (a really cool library btw!). It additionally allows one to replace the match expander on the struct info, while my scheme cannot. However, this comes at the cost of having to sync up with the struct implementation whenever it changes.

1 Like

This is not a recommendation, because I haven't used struct-copy or lenses enough to say which I prefer.
Still, I want to point out lenses, because I think they are an interesting way of accessing and updating structures, or even deeply nested combinations of data structures.
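
To give a rough idea of the concept, here is a hand-rolled sketch. It does not use the lens package, and simple-lens, view, update, and lens-thrush are hypothetical names chosen for illustration; the real library has its own API. A lens here is just a getter paired with a functional setter, and two lenses compose into one that reaches into nested data.

```racket
#lang racket/base

;; A getter paired with a functional setter (hypothetical, not the lens package).
(struct simple-lens (getter setter))

(define (view l target) ((simple-lens-getter l) target))
(define (update l target new) ((simple-lens-setter l) target new))

;; Compose two lenses: focus with outer first, then with inner.
(define (lens-thrush outer inner)
  (simple-lens
   (lambda (t) (view inner (view outer t)))
   (lambda (t new) (update outer t (update inner (view outer t) new)))))

(struct node (id value) #:transparent)
(struct tree (root label) #:transparent)

(define tree-root-lens
  (simple-lens tree-root (lambda (t r) (struct-copy tree t [root r]))))
(define node-value-lens
  (simple-lens node-value (lambda (n v) (struct-copy node n [value v]))))

(define root-value-lens (lens-thrush tree-root-lens node-value-lens))

(define demo (tree (node 'a 1) "demo"))

(view root-value-lens demo)      ; 1
(update root-value-lens demo 10) ; (tree (node 'a 10) "demo")
```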

3 Likes

Thanks! I had read the term occasionally, but up to now didn't know what it meant. (I just read the Lens Guide.) I probably wouldn't use them for a single "struct level", but I hope to run into a more interesting application one day where I can try them out. :slight_smile:

1 Like