FFI define-cstruct recursive dependencies on types

I am trying to use define-cstruct (1.4 C Structs) to define many cstructs, some of which have pointers to other cstructs. My actual code is quite big, but I think it boils down to being able to define mutually recursive cstructs. In C this can be done quite easily with a forward declaration, but there seems to be no equivalent for this on the Racket side within the FFI.

Here is a simplified example:

(define-cstruct _Person
  ([name    _string]
   [friends _PersonList]))

(define-cstruct _PersonList
  ([next _PersonList-pointer/null]
   [data _Person-pointer]))

With this I get _PersonList: undefined; ... and if I swap the definitions around I get _Person-pointer: undefined; ... Is there some equivalent to how letrec works that can be used with define-cstruct?
Maybe there should be a (begin/rec ...) form that introduces the locations for all bindings before any of the expressions are evaluated, but that seems like it might be difficult to implement.
(Hmm, maybe this could be done with partial expansion? Expand the body, then rewrite defines and lets to use letrec instead?)

Currently it seems my only choice is to use _pointer (an untyped pointer) in a few places until I get a linear order of define-cstruct definitions. That would be quite disappointing: since this can be done in C, I would expect it to be possible in Racket too.
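For reference, the _pointer workaround could look roughly like the sketch below. It reuses the struct names from the example above; the PersonList-person helper is a made-up name for illustration, and the sketch assumes that casting the untyped field back to _Person-pointer on access is acceptable.

```racket
#lang racket/base
(require ffi/unsafe)

;; _PersonList goes first. Its data field is an untyped _pointer,
;; because _Person-pointer is not defined yet at this point.
(define-cstruct _PersonList
  ([next _PersonList-pointer/null]
   [data _pointer]))

;; _Person can now embed _PersonList by value.
(define-cstruct _Person
  ([name    _string]
   [friends _PersonList]))

;; Hypothetical helper: reinterpret the untyped data field as a
;; _Person-pointer so the generated Person-* accessors can be used on it.
(define (PersonList-person lst)
  (cast (PersonList-data lst) _pointer _Person-pointer))
```

The cost is that nothing checks that the data field really points at a _Person; that knowledge lives only in the helper.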

  • Has anyone had this problem with the FFI? What did you do to deal with it?
  • Is there some trick to evaluate the type expression lazily?
  • What would be a good way to make this possible (if it isn't already)?

ffi bindings / experience report / what I did

first attempt: manual bindings for freetype

In my particular case I wanted to use freetype from Racket via the FFI, but after a while I gave up on trying to create bindings for it directly. The freetype library has deeply nested structs, and it is very annoying to have to replicate all the type declarations. Eventually you get some type wrong somewhere, the struct layout no longer matches, and you get a segfault when you try to use your faulty bindings.

Overall this can be quite tedious and demotivating. On the C side you are confronted with types whose size you aren't always sure of, so sometimes you have to write a small piece of code that prints the size with sizeof. On the Racket side it is not always easy to find the matching Racket type. When there are a lot of typedefs or even preprocessor macros on the C side, you end up flipping through the (freetype) documentation from one place to the next and occasionally hunting things down in the header files. When you crash, you don't really know what is wrong, because you are dealing with structs with sub-structs with 20+ entries and you don't know which one you got wrong. I was almost starting to analyze all the structs interactively with a debugger, but then decided to go another route.
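One thing that can help narrow down a wrong layout is to print the size and field offsets from the Racket side and compare them with sizeof and offsetof on the C side. A minimal sketch, using a made-up struct (demo_entry is not a freetype type) and the ctype-sizeof and compute-offsets functions from ffi/unsafe:

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical struct, for illustration only:
;;   struct demo_entry { char tag; int value; };
(define-cstruct _demo_entry
  ([tag   _byte]
   [value _int]))

;; Total size, to compare with sizeof(struct demo_entry) in C.
(define entry-size (ctype-sizeof _demo_entry))

;; Byte offset of each field in declaration order, to compare with
;; offsetof(struct demo_entry, tag) and offsetof(struct demo_entry, value).
(define entry-offsets (compute-offsets (list _byte _int)))

(printf "size: ~a, offsets: ~a\n" entry-size entry-offsets)
```

If the numbers differ from what the C compiler reports, the first field whose offset is off is a good place to start looking.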

second attempt: c wrapper library to simplify api

I tried the manual binding creation described above for a bit, but then I switched gears.
I thought: if the library's API is difficult to create bindings for, why not write a small wrapper around the library that exposes a simple API for just what I actually want from it?

So I wrote a small piece of C code and compiled it as a dynamic library that is dynamically linked against the original freetype library. My wrapper makes the calls to freetype and gives me a few more constrained, easier to call functions. It was much easier to create bindings for the wrapper because it exposes simpler C types.

With that I relatively quickly had a working Racket program that used the FFI to call my wrapper and load glyphs from some font at font size x. Success.

That wrapper tries to do the minimum needed to load grayscale glyph data using freetype
(so far completely without complex hinting stuff, only the basic advance info that is attached to the glyph).
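To give an idea of what the Racket side of such a wrapper can look like, here is a rough sketch. The library name libftwrap and the function ftwrap_glyph_advance are invented for this example and are not the names from my actual wrapper. The #:fail and make-not-available parts are only there so that the module still loads when the library is missing.

```racket
#lang racket/base
(require ffi/unsafe ffi/unsafe/define)

;; Hypothetical wrapper library name; wrapper-lib is #f if it cannot be found.
(define wrapper-lib (ffi-lib "libftwrap" #:fail (lambda () #f)))

;; Functions that cannot be resolved are still defined, but raise an
;; error when they are called.
(define-ffi-definer define-ftwrap wrapper-lib
  #:default-make-fail make-not-available)

;; Assumed C signature (invented for this sketch):
;;   int ftwrap_glyph_advance(const char *font_path, int size_px,
;;                            unsigned int codepoint);
(define-ftwrap ftwrap_glyph_advance
  (_fun _path _int _uint32 -> _int))
```

The point is that the boundary only carries a path and a few integers, so there are no struct layouts to get wrong.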

native racket libraries

Except now I have a new problem: I have to build that wrapper on every platform where I want to use it. I don't think this is necessarily difficult; it is more of another annoyance.

For example, if I want to make it easy for other people to use the wrapper too, it is nicer to provide it pre-built and packaged as a Racket package. So far I don't have the setup / images / (virtual) machines to do that.


closing thoughts

For now I am closing this. I had a look at how define-cstruct is implemented, but apparently it predates syntax-parse, and I don't want to reverse engineer it to figure out how to build a version that lets you define mutually recursive structs.

My experience with freetype has also taught me that sometimes it doesn't make sense to create bindings 1 to 1, especially if you only use 5% of what a library offers. If you only need 5%, you can often greatly simplify how you cross the FFI boundary. That makes the FFI code easier to write and possibly also more performant, because you don't have to cross back and forth over the boundary as many times.

Another approach might be to use dynamic-ffi to generate bindings, but so far I haven't tried much in that direction. It seems you need a relatively old LLVM version, and I couldn't find those versions as packages for my system. I have the impression that LLVM is quite big and that compiling it yourself may be quite a task, but I haven't actually tried it, so I could be wrong.

Overall, being dependent on LLVM is something I am not particularly fond of, which is another reason why I have avoided that route so far.

If the insides of a struct are not needed (for instance, if the library provides constructor and accessor functions) and the program only deals with a pointer, it is much easier to just make an opaque pointer type and work from that.
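A minimal sketch of the opaque pointer approach, using define-cpointer-type from ffi/unsafe. The widget name and the widget_width function in the comment are made up, and a raw malloc stands in for a handle that a real library would return.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical opaque handle type. This defines _widget, _widget/null,
;; the predicate widget? and the tag widget-tag.
(define-cpointer-type _widget)

;; With a real library, bindings would only mention the handle type, e.g.
;; (function name invented for this sketch):
;;   (get-ffi-obj "widget_width" lib (_fun _widget -> _int))

;; Stand-in for a handle returned by the library: a raw allocation that is
;; tagged by hand so that it passes as a _widget.
(define raw (malloc 16 'raw))
(cpointer-push-tag! raw widget-tag)
(define handle-ok? (widget? raw))
(free raw)
```

No field layout is declared anywhere, so there is nothing to keep in sync with the C headers.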

Here's a _delay form that delays evaluation of the actual FFI type until it is used for conversion. But it needs a base type of the right size; currently it assumes it's a _pointer.

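(require ffi/unsafe)

;; (_delay type) wraps the type expression in a thunk, so it is not
;; evaluated until a value is actually converted.
(define-syntax-rule (_delay type)
  (_delay* (lambda () type)))

(define (_delay* get-type [base-type _pointer])
  ;; Cache for the real type once it has been forced.
  (define the-type #f)
  (define (type)
    (or the-type
        (let ([t (get-type)])
          ;; The real type must have the same size as the base type.
          (unless (= (ctype-sizeof t) (ctype-sizeof base-type))
            (error '_delay "type has wrong size: ~e" t))
          (set! the-type t)
          t)))
  ;; Racket -> C: reinterpret a value of the real type as the base type.
  (define (rkt->c v)
    (cast v (type) base-type))
  ;; C -> Racket: reinterpret a base-type value as the real type.
  (define (c->rkt v)
    (cast v base-type (type)))
  (make-ctype base-type rkt->c c->rkt))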

And here's a little example:

(require racket/struct) ; for make-constructor-style-printer

(define-cstruct _person
  ([id _int]
   [friends (_delay _personlist-pointer/null)])
  #:property prop:custom-write
  (make-constructor-style-printer
   (lambda (self) 'make-person)
   (lambda (self) (list (person-id self) (person-friends self)))))

(define-cstruct _personlist
  ([data _person-pointer]
   [next _personlist-pointer/null])
  #:property prop:custom-write
  (make-constructor-style-printer
   (lambda (self) 'make-personlist)
   (lambda (self) (list (personlist-data self) (personlist-next self)))))

(define alice (make-person 1 #f))
(define bob (make-person 2 #f))
(define chris (make-person 3 (make-personlist alice (make-personlist bob #f))))
chris

Note that printing depends on the friends field actually being converted as a _personlist-pointer/null, not just as a _pointer.

Wow, this is awesome. I hadn't expected to see such a solution, and I will definitely play around with it a bit.
I may end up using my wrapper anyhow because it simplifies the interface (for the freetype library) quite a bit, but I will have to revisit and evaluate whether this changes things for my use case.

For sure a good technique for this problem.