Logging, custodians and larger applications

I have a racket CLI application that used to be a single file and now spans multiple files. I added logging to that app based on what I'd learnt (see Logging With Racket // Grinning Cat). So in my main.rkt I have the following:

  (define-logger agent)
  (define agent-logger-receiver (make-log-receiver agent-logger 'info 'agent))
  (current-logger agent-logger)

  (define (write-to-file destination content)
    (with-output-to-file destination
      (λ ()
        (displayln 
         content))
      #:mode 'text #:exists 'append))

  (define (write-to-port destination content)
    (displayln content destination))

  (define log-destinations
    (list (list write-to-file "pii_agent.log")
          (list write-to-port (current-error-port))))

  (define (send-log-content content destinations)
    (unless (null? destinations)
      (let ([ destination (car destinations)])
        ((car destination) (cadr destination) content)) 
      (send-log-content content (cdr destinations))))

  ; Listen for events on the log-receiver
  (void 
   (thread 
    (λ()(let loop ()
          (define log-vector (sync agent-logger-receiver))
          (define content (format "[~a] ~a\n"
                                  (vector-ref log-vector 0)
                                  (vector-ref log-vector 1)))
          
          (send-log-content content log-destinations)
          (loop)))))

Which roughly means that I have a logger setup to log to two places (a file and STDERR) and a thread that runs waiting for the logger to get called.

Then in another file I have a number of log statements like:

  (log-info "Run completed.")

A couple of times (i.e. not consistently) I've received the following error toward the end of the applications' execution:

open-output-file: the current custodian has been shut down
  context...:
   /Users/robertpostill/software/pii_spider/main.rkt:30:2: send-log-content
   /Users/robertpostill/software/pii_spider/main.rkt:39:4

Which makes me worry that I've setup a race condition and that the thread is terminated by the GC before that final log message is sent to it. Am I correct? If so what should I do? Do I set up a thread per file/module? Is spinning off a thread like this the right thing to do?

Thanks in advance for any advice or help.

1 Like

So the simple answer is, the main racket thread doesn't wait for threads you have started unless you tell it to do that, so if you don't and the main thread is done, the program exits even when there are still other threads running. I think this is what causes the error message you are seeing.

Here is an example program demonstrating this:

#lang racket

(define (background-task)
  (thread
   (thunk
    (for ([i (in-range 0 10)])
      (displayln i)
      (sleep 1)))))

(module+ main
  (void (background-task))

  ;; now lets get our work done
  (sleep 1)
  (displayln "not very much work todo...")
  (sleep 1)
  (displayln "just printing a bunch of lines...")
  (sleep 1)
  (displayln "done"))
0
not very much work todo...
1
just printing a bunch of lines...
2
done

We can change this program to explicitly wait for the background task:

#lang racket

(define (background-task)
  (thread
   (thunk
    (for ([i (in-range 0 10)])
      (displayln i)
      (sleep 1)))))

(module+ main
  (define bg-thread (background-task))

  ;; now lets get our work done
  (sleep 1)
  (displayln "not very much work todo...")
  (sleep 1)
  (displayln "just printing a bunch of lines...")
  (sleep 1)
  (displayln "done")

  (thread-wait bg-thread))

With this the program exits after the background-task is done counting.

Now if you have used e.g. #lang racket/gui you may say that there the program reaches the end of my (module+ main ...) but it doesn't exit.

Yes and there is an explanation for that, racket/gui installs an executable-yield-handler that waits for gui related things that are still running (open windows, timers, etc.). 12 Startup Actions

We can do a similar thing to make sure that our background task can finish.
This allows us to make using our background task more convenient for the user, the user doesn't have to manually call thread-wait:

#lang racket

(define (background-task)
  (thread
   (thunk
    (for ([i (in-range 0 10)])
      (displayln i)
      (sleep 1)))))

(define (register-yield-handler/wait-for thread)
  (define orig-handler (executable-yield-handler))
  (executable-yield-handler
   (λ (arg)
     (thread-wait thread)
     (orig-handler arg))))

(define (start-background-task)
  (register-yield-handler/wait-for (background-task)))

(module+ main
  (start-background-task)

  ;; now lets get our work done
  (sleep 1)
  (displayln "not very much work todo...")
  (sleep 1)
  (displayln "just printing a bunch of lines...")
  (sleep 1)
  (displayln "done"))

If you have to wait for many threads you maybe want to change register-yield-handler/wait-for to get a list of threads and make it loop over that and wait for all of them or maybe even to something using sync.

4 Likes

@simonls already answered the question and said it better than I was going to, so I've got nothing to add there. I thought I might offer one suggestion, which is to rewrite send-log-content like so:

(define (send-log-content content destinations)
  (match destinations
    ['() (void)]
    [(list (list func where) elsewhere ...)
     (func where content)
     (send-log-content content elsewhere)]))

I'm a big fan of using match instead of booleans and accessors to do your flow control and parsing. Not only does it combine them into one step but it makes the structure of the data clearer and it does implicit error checking for you by telling you that there's a case you didn't consider. Granted, you get an error message either way but I consider this:

; car: contract violation                                                                      
;   expected: pair?                                                                            
;   given: #<procedure:write-to-file> 

less helpful than this:

; match: no matching clause for '(#<procedure:write-to-file> "pii_agent.log")   

since it points to the correct issue -- there was a parsing case that I didn't consider, not simply a misplaced access. I grant that this one is very much a matter of preference.

3 Likes