FFI: Converting from _pointer to byte string

I have a C struct like so:

(define-cstruct _foo ([data _pointer] [len _int32]))

I then get an instance of this struct as an output parameter of a C function. data points to a non-null-terminated array of bytes, of length len.

How can I convert foo-data to a byte string? Or maybe a vector?

What would be the most efficient way to receive this value in Racket FFI?

Thanks!


Unfortunately, you can't create a byte string without copying. I think _array and array-ref is probably the simplest option, although I don't know what would be most efficient in your situation.

Thanks for the suggestion! I'm looking at the docs for _array, and it kinda looks like it's for arrays with known length? In my case, the length is only known at runtime and can vary from call to call: the callee fills in the len field of the struct to indicate the length of the array pointed to by the data field. Is this possible to handle with _array?

Yes, I think you want something like this (untested):

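;; View the foo-len bytes behind foo-data as a C array.
;; The element type is _byte because data points to bytes, not ints.
(define (convert f)
  (ptr-ref (foo-data f) (_array _byte (foo-len f)) 0))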
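For anyone trying this out, here is a minimal sketch of reading from such an array, assuming the elements are bytes as the question describes (so the element type is _byte). The helper name foo->array and the malloc'd test buffer are made up for illustration; in real code the struct would come back from the C function.

```racket
#lang racket/base
(require ffi/unsafe)

;; Stand-in for the struct from the question.
(define-cstruct _foo ([data _pointer] [len _int32]))

;; Hypothetical helper: view the bytes behind foo-data as a C array.
;; The array is backed by the C memory; nothing is copied here.
(define (foo->array f)
  (ptr-ref (foo-data f) (_array _byte (foo-len f))))

;; Fake an instance backed by manually managed memory.
(define buf (malloc 3 'raw))
(memcpy buf #"abc" 3)
(define f (make-foo buf 3))

(define arr (foo->array f))
(define n (array-length arr))                     ; 3
(define first-byte (array-ref arr 0))             ; 97, the byte for #\a
(define all (for/list ([b (in-array arr)]) b))    ; '(97 98 99)

;; arr must not be used after the underlying memory is freed.
(free buf)
```

Since the array only refers to the C memory, it is valid only as long as the buffer behind foo-data stays alive.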

Hm, OK, that gives me an _array! However, there doesn't seem to be much I can do with this array... I'm trying to deserialize a stream of data, and sometimes the array will just contain a string, other times multiple strings, each prefixed with their length (as a 32-bit integer), and so on.

The other day you told me about integer-bytes->integer, and I also found bytes->string/utf-8. I'd rather not have to reimplement these in terms of _array, especially not if I'd have to write them in C to get decent performance.

It seems like a byte string would be a good intermediate representation. How can I convert from _array to byte string? I see there's a sequence->bytes in an external package, and there's an in-array for creating a sequence over an array, but not only do I want to avoid external dependencies, but this seems really inefficient when it should just be a cast or at worst, maybe a memcpy.

To get a byte string you might just use ptr-ref and a loop that copies into a byte string. Unfortunately there's no way to convert to a byte string without copying because of how they're represented internally. I think there's also not a way to copy the bytes in bulk but that seems like something that could be added.
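The loop described above could look roughly like this. It is only a sketch: pointer->bytes/loop is a made-up name, and the malloc'd buffer just stands in for memory handed back by the C side.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical helper: copy len bytes starting at ptr into a fresh
;; byte string, one byte at a time.
(define (pointer->bytes/loop ptr len)
  (define bs (make-bytes len))
  (for ([i (in-range len)])
    ;; With _byte the offset i is counted in single bytes.
    (bytes-set! bs i (ptr-ref ptr _byte i)))
  bs)

;; Test buffer standing in for C-owned memory.
(define buf (malloc 5 'raw))
(for ([b (in-bytes #"hello")] [i (in-naturals)])
  (ptr-set! buf _byte i b))

(define copied (pointer->bytes/loop buf 5))
(free buf)   ; safe: copied no longer refers to buf
copied       ; #"hello"
```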

I would do it like this:

#lang racket

(require ffi/unsafe)

(define (convert data len)
  (cast data _pointer (_bytes o len)))

(convert #"abcdefghijklmn" 7) ;-> #"abcdefg"

(It might be surprising that cast does create a copy!)

You can also use functions from ffi/unsafe such as memcpy to copy the bytes in bulk.
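A minimal sketch of the memcpy route, assuming the struct from the question. The names pointer->bytes and foo->bytes are made up for illustration, and the test buffer stands in for memory filled by the C function.

```racket
#lang racket/base
(require ffi/unsafe)

;; Stand-in for the struct from the question.
(define-cstruct _foo ([data _pointer] [len _int32]))

;; Hypothetical helper: bulk-copy len bytes from ptr into a new byte string.
;; memcpy takes the destination first; a byte string is accepted there.
(define (pointer->bytes ptr len)
  (define bs (make-bytes len))
  (memcpy bs ptr len)
  bs)

(define (foo->bytes f)
  (pointer->bytes (foo-data f) (foo-len f)))

;; Test buffer standing in for C-owned memory.
(define buf (malloc 7 'raw))
(memcpy buf #"abcdefg" 7)
(define result (foo->bytes (make-foo buf 7)))
(free buf)
result   ; #"abcdefg"
```

The resulting byte string is an ordinary Racket value, so it can be passed to integer-bytes->integer or bytes->string/utf-8 as usual.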


For what it's worth, I would stick to working with the pointer instead of converting to a byte string, at least for most cases and where performance is a concern. For example, I'd use ptr-ref with _int (or another suitable integer type) instead of integer-bytes->integer. But there are cases where it's more convenient to work with byte strings.
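As a rough illustration of working with the pointer directly, here is a sketch that walks a buffer of length-prefixed strings like the ones described earlier in the thread. It assumes each record is a 32-bit length in native byte order followed by that many bytes of UTF-8, and that unaligned 32-bit reads are acceptable on the platform; read-prefixed-strings is a made-up name.

```racket
#lang racket/base
(require ffi/unsafe)

;; Hypothetical helper: read records of the form
;;   [int32 length][length bytes of UTF-8]
;; from ptr until total bytes have been consumed.
(define (read-prefixed-strings ptr total)
  (let loop ([offset 0] [acc '()])
    (cond
      [(>= offset total) (reverse acc)]
      [else
       ;; 'abs makes the offset a byte offset instead of a multiple
       ;; of the size of _int32.
       (define n (ptr-ref ptr _int32 'abs offset))
       (define bs (make-bytes n))
       (memcpy bs (ptr-add ptr (+ offset 4)) n)
       (loop (+ offset 4 n)
             (cons (bytes->string/utf-8 bs) acc))])))

;; Build a test payload: "hi" and "world", each with a 4-byte length prefix.
(define payload
  (bytes-append (integer->integer-bytes 2 4 #t) #"hi"
                (integer->integer-bytes 5 4 #t) #"world"))
(define buf (malloc (bytes-length payload) 'raw))
(memcpy buf payload (bytes-length payload))

(define strings (read-prefixed-strings buf (bytes-length payload)))
(free buf)
strings   ; '("hi" "world")
```

Only the string contents are copied here; the length fields are read straight from the C memory.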


Thanks a lot everyone for the helpful responses! I'll play around with cast, ptr-ref and the like and see how it goes :slightly_smiling_face:

(Looks like I can only mark one answer as the solution, so it's a bit arbitrary, sorry!)