Using `'nonatomic` foreign memory

Consider this C type definition:

typedef struct git_strarray {
	char **strings;
	size_t count;
} git_strarray;

This called to mind the support added in https://github.com/racket/racket/commit/87196e0144226b44fd4c00fcca26965458ae159d for ctypes like (_list i _string) as function arguments, but I realized I'm not quite sure about all the details of how to use it properly.

As I understand it, writing something like:

(define-cstruct _git_strarray
  ([strings (_list i _string)]
   [count _size])
  #:malloc-mode 'nonatomic)

would not work, because the count field is not a pointer, and memory allocated in 'nonatomic or 'interior mode:

is treated by the garbage collector as holding only pointers … The memory is allowed to contain a mixture of references to objects managed by the garbage collector and addresses that are outside the garbage collector’s space.

Is that true?
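To make the concern concrete, here is a minimal C sketch of what sits in the `count` slot. The helper `count_slot_as_word` is hypothetical (it is not part of libgit2 or Racket), and the sketch assumes a platform where `size_t` and pointers have the same size. It only shows that a scanner treating every word of the struct as a pointer would read the count's raw bits as an address. How a particular collector reacts to such a word is not something this sketch demonstrates.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct git_strarray {
	char **strings;
	size_t count;
} git_strarray;

/* Assumption for this sketch: size_t is exactly one pointer-sized word. */
_Static_assert(sizeof(size_t) == sizeof(uintptr_t),
               "sketch assumes word-sized size_t");

/* Hypothetical helper: return the raw bits of the `count` field as an
   address-sized integer, i.e. what a word-by-word pointer scan would see. */
static uintptr_t count_slot_as_word(const git_strarray *a)
{
	uintptr_t word = 0;
	memcpy(&word, &a->count, sizeof word);
	return word;
}
```

With `count` set to 3, the second word of the struct is the integer 3, which is not a valid reference to anything.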

Does that mean I have to allocate the _git_strarray in 'atomic (or 'atomic-interior) mode and manually wrap things to manage the reference to the strings field (perhaps with ffi/unsafe/alloc and/or a hash table)?

Do I also have to allocate the memory for strings with 'interior, since the collector wouldn't update the reference in the git_strarray?

And does that mean that I'd need to do basically the same thing for C structs like:

typedef struct {
	unsigned int version;
	git_remote_callbacks callbacks;
	git_proxy_options proxy_opts;
	git_remote_redirect_t follow_redirects;
	git_strarray custom_headers;
} git_remote_connect_options;

that have a git_strarray in a field?
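A small sketch of why the enclosing struct raises the same question: a `git_strarray` field is embedded by value, so its `strings` pointer is stored inside the outer struct's own memory. The struct `demo_options` below is a simplified, hypothetical stand-in (the real `git_remote_connect_options` has more fields whose definitions are not shown here), and `strings_slot_offset` is likewise an illustrative helper.

```c
#include <stddef.h>

typedef struct git_strarray {
	char **strings;
	size_t count;
} git_strarray;

/* Hypothetical, cut-down stand-in for a struct with a git_strarray field. */
typedef struct demo_options {
	unsigned int version;
	git_strarray custom_headers;
} demo_options;

/* Byte offset of the embedded `strings` pointer within the outer struct. */
static size_t strings_slot_offset(const demo_options *o)
{
	return (size_t)((const char *)&o->custom_headers.strings
	                - (const char *)o);
}
```

Because the pointer lives inline, whatever keeps the string array alive has to be tied to the outer struct's lifetime too.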

The foreign library never retains a reference to a git_strarray (or any structure containing one) across function calls, but some of these structures you'd want to create once and use many times. It would be nice to give the collector freedom to do its thing and minimize the amount of manual memory management.
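For illustration, this is the kind of access pattern that paragraph describes: the callee reads the array only for the duration of the call and keeps no pointer afterwards. `strarray_total_length` is a hypothetical stand-in, not an actual libgit2 function.

```c
#include <stddef.h>
#include <string.h>

typedef struct git_strarray {
	char **strings;
	size_t count;
} git_strarray;

/* Hypothetical consumer: walks the `count` strings once and returns the
   sum of their lengths. Nothing from `a` is stored beyond the call. */
static size_t strarray_total_length(const git_strarray *a)
{
	size_t total = 0;
	for (size_t i = 0; i < a->count; i++)
		total += strlen(a->strings[i]);
	return total;
}
```

Under that pattern, the struct and the array it points to only need to stay valid and in place while a call is in progress, plus for as long as the Racket side wants to reuse them.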

I also have the sense that I'm overthinking this somehow …


One thing I've found somewhat confusing is that the documentation in various places (e.g. _gcpointer) uses the terms "argument type" and "result type", but doesn't define them. They make sense well enough with respect to _fun "callouts" and "callbacks", but it's not entirely clear how they interact with C struct fields and malloc modes.

The short answer is "yes, right" to everything before "overthinking". I would try something like this:

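(require ffi/unsafe)

(define-cstruct _git_strarray
  ([strings _pointer]
   [count _size]))

;; Weak table keyed by each struct: keeps the list pointer live
;; for as long as the struct that refers to it is reachable.
(define links (make-weak-hasheq))

(define (make-strarray strs)
  ;; interior mode, so the collector does not move the array
  (define p (cast strs (_list i _string interior) _gcpointer))
  (define r (make-git_strarray p (length strs)))
  (hash-set! links r p)
  r)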

Thanks!

Does it matter that:

(cpointer-gcable? (git_strarray-strings (make-strarray '("abc"))))

is #f? It seems possible to just avoid exposing an accessor.

Also, do I understand correctly that I'll need a similar wrapper and hash table for e.g. _git_remote_connect_options?

(In some ways I feel like I'm reinventing a lot of ffi/unsafe/cvector, and maybe I should figure out how to use it.)