Help with FFI errors (for #lang clingo integration)

kiranandcode · December 10, 2023, 7:35am

Heya!! This is a long shot, but I figured I'd post here to see if anyone could possibly provide some advice on an FFI integration that I'm currently building and running into some really mind-bending errors.

The repository is here: GitHub - Gopiandcode/clingo-lang: #lang clingo for Racket (WIP)

For context, this package implements a binding to the libclingo ASP solver. The file unsafe.rkt lists my fairly exhaustive FFI binding for all of the functions from clingo.h.

The file main.rkt lists my initial experiments with using the FFI API to make queries from Racket.

The core of my program is:

;; create a context
(define ctrl (clingo-control-new '[] #f #f 20))
(configure-to-enumerate-all-models ctrl)
;; add facts to database
(clingo-control-add ctrl "base" '[] "a :- not  b. b :- not a. - a :- b.")
(clingo-control-ground ctrl `[,(make-clingo-part "base" #f 0)] #f #f)

;; ask the solver to solve this database
(define solve-handle (clingo-control-solve ctrl 'clingo-solve-mode-yield '[] #f #f))

;; loop over solutions and print them
(define (loop)
  (clingo-solve-handle-resume solve-handle)
  (define model (clingo-solve-handle-model solve-handle))
  (when model
    (print-model model)
    (loop)))
(loop)

;; clean up the solver handle
(define solve-result (clingo-solve-handle-get solve-handle))
(println solve-result)
(clingo-solve-handle-close solve-handle)

This actually works pretty well, and I can send queries, and print out the results (using a function from the api to convert solutions to strings) and they match with what I expect.

Inside print-model, I then started writing some code to translate from clingo's internal representation back to a more racket friendly encoding, via a function symbol->term, where term is defined as:

(define term?
  (flat-rec-contract term?
   (or/c
    infinum?              ;; #inf
    supremum?             ;; #sup
    number?               ;; 1
    string?               ;; "hello"
    symbol?               ;; a
    (list/c '-
            (or/c
             symbol?
             (cons/c symbol? (listof term?)))) ;; (- a) or (- (f x y))
    (cons/c symbol? (listof term?))))) ;; (f x y)

My problem is that my function symbol->term is not behaving properly, and I'm getting really weird results.

(define/contract (symbol->term sym)
  (-> any/c term?)
  (define symbol-type (clingo-symbol-type sym))
  (match symbol-type
    ...
    ['clingo-symbol-type-function
     (define is-positive (clingo-symbol-is-positive sym))
     (define is-negative (clingo-symbol-is-negative sym))
     (define f (string->symbol (clingo-symbol-name sym)))
     (define args (map symbol->term (clingo-symbol-arguments sym)))
     (define expr (if (null? args) f (cons f args)))
     (if is-negative
         (list '- expr)
         expr)]))

Above I have the relevant snippet from symbol->term; the errors relate to the outputs of the function clingo-symbol-is-negative, and clingo-symbol-is-positive, which seem to sometimes return the "wrong" results (sometimes they both return #true, which is definitely wrong.

I double checked my bindings, and I can't think there's something wrong there:

(define-clingo clingo-symbol-is-positive
  (_fun _clingo-symbol (positive : (_ptr o _bool)) -> (res : _bool) ->
        (if res positive (raise-clingo-error))))

(define-clingo clingo-symbol-is-negative
  (_fun _clingo-symbol (negative : (_ptr o _bool)) -> (res : _bool) ->
        (if res negative (raise-clingo-error))))

I've looked at the source of the corresponding functions in the c-code, and they definitely should never return the same results at the very least:

extern "C" bool clingo_symbol_is_negative(clingo_symbol_t val, bool *sign) {
    GRINGO_CLINGO_TRY {
        clingo_expect(Symbol(val).type() == SymbolType::Fun);
        *sign = Symbol(val).sign();
    } GRINGO_CLINGO_CATCH;
}

extern "C" bool clingo_symbol_is_positive(clingo_symbol_t val, bool *sign) {
    GRINGO_CLINGO_TRY {
        clingo_expect(Symbol(val).type() == SymbolType::Fun);
        *sign = !Symbol(val).sign();
    } GRINGO_CLINGO_CATCH;
}

I assume I've wrapped something wrong, and now am getting some kind of stack corruption , but for the life of me, I don't know where...

Could anyone give some advice as to what I might be doing wrong? or how could I go about debugging this issue?

kiranandcode · December 10, 2023, 7:46am

Hm... If I call the same functions multiple times in a row, I end up getting different results sometimes:

     (println "start")
     (define is-positive (clingo-symbol-is-positive sym))
     (define is-negative (clingo-symbol-is-negative sym))
     (println [list is-positive is-negative])
     (set! is-positive (clingo-symbol-is-positive sym))
     (set! is-negative (clingo-symbol-is-negative sym))
     (println [list is-positive is-negative])
     (println "done")

which outputs:

"start"
'(#t #t)
'(#t #f)
"done"

which seems to suggest that the error isn't from the clingo-symbol-is-negative function side? because if it were stack corruption, I would have expected one of the other C-ffi functions to accidentally end up overwriting the values for is-positive and is-negative. Maybe I've messed something up and the garbage collector is causing this??

kiranandcode · December 10, 2023, 7:54am

hmmm.... changing (_ptr o _bool) to (_box _bool) seems to make the results consistent, but I'm pretty sure I haven't actually solved the problem here, just kicked it down the line a bit...

(define is-negative-result-box (box #false))
(define-clingo clingo-symbol-is-negative-unsafe
  (_fun _clingo-symbol (negative : (_box _bool)) -> (res : _bool) ->
        (if res negative (raise-clingo-error)))
  #:c-id clingo_symbol_is_negative)
(define (clingo-symbol-is-negative symbol)
  (clingo-symbol-is-negative-unsafe symbol is-negative-result-box)
  (unbox is-negative-result-box))

mflatt · December 10, 2023, 1:48pm

Does it help to use _stdbool instead of _bool?

kiranandcode · December 10, 2023, 2:01pm

Huh. Yeah, it does? (well, I return to getting consistent correct results when I changed to use (_ptr o _stdbool). Do you know why this was happening? What's the explanation? I would have thought the size of _bool and _stdbool would be the same?

mflatt · December 10, 2023, 2:20pm

The size of _stdbool is 1 byte (on all systems I know about), and the size of _bool is 4 bytes (everywhere that Racket runs).

Topic		Replies	Views
`#lang clingo` for integration with the Clingo ASP solver Show & Tell	2	183	December 12, 2023
Tips for building native libraries for Racket? Questions & Answers package , native-libraries	5	428	February 18, 2022
Help: Where to find ffi/examples Questions & Answers question	3	450	April 20, 2022
Why ffi's `_list` is bad syntax? Questions & Answers	1	216	August 30, 2022
Heads up: Qi 4 Release Incoming General qi	3	294	January 11, 2024

Help with FFI errors (for #lang clingo integration)

Related topics