Preventing unnoticed thread death

I'm expanding the majordomo2 task management module to accept a max-workers parameter so that the number of running tasks can be constrained, with newly added tasks queued until workers are available.

My initial thought on how to do this is to have a central thread that will listen on an async-channel for messages regarding new tasks being queued, workers finishing or dying, etc. Based on these signals it would start new workers as needed. My concern is that if the monitor thread stops working then the whole thing breaks down and new tasks will never be run. I'm looking for a way to guarantee that I can detect when the monitoring thread dies.
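For concreteness, here is a minimal sketch of the design described above. It is not majordomo2's actual code: `make-pool`, the returned submit procedure, and the `new-task` / `worker-done` message shapes are all hypothetical names chosen for illustration.

```racket
#lang racket/base
(require racket/async-channel racket/match)

;; Hypothetical sketch: returns a procedure that submits a thunk to the pool.
(define (make-pool max-workers)
  (define ch (make-async-channel))

  ;; Run one task in its own thread and always report completion,
  ;; even if the task raises.
  (define (start-worker thunk)
    (thread
     (lambda ()
       (with-handlers ([(lambda (e) #t) void])
         (thunk))
       (async-channel-put ch '(worker-done)))))

  ;; The monitor owns all state: the count of running workers
  ;; and the list of tasks waiting for a free slot.
  (define (monitor-loop running queue)
    (match (async-channel-get ch)
      [(list 'new-task thunk)
       (cond
         [(< running max-workers)
          (start-worker thunk)
          (monitor-loop (add1 running) queue)]
         [else
          (monitor-loop running (append queue (list thunk)))])]
      [(list 'worker-done)
       (cond
         [(null? queue)
          (monitor-loop (sub1 running) queue)]
         [else
          ;; A slot just freed up; hand it to the oldest queued task.
          (start-worker (car queue))
          (monitor-loop running (cdr queue))])]))

  (thread (lambda () (monitor-loop 0 '())))

  ;; Callers only ever see this procedure, not the channel or the thread.
  (lambda (thunk)
    (async-channel-put ch (list 'new-task thunk))))
```

Note that a worker killed with `kill-thread` never sends `worker-done` in this sketch, which is the same class of problem I'm worried about for the monitor itself.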

I am aware of 16.3 Wills and Executors but don't have experience with them. The Reference section is terse and there's nothing in the Guide on them, so it's not clear to me if they are a good fit here.

Any advice?

If the original, main thread exits, doesn't the whole Racket process terminate?

If so, what if you make that main thread do the monitoring?

In some programs the main thread has nothing interesting to do -- e.g. in a web server it might park forever in a (sync never-evt), just to keep the process from exiting -- in which case, yay, now it has a more-interesting role to play.

I think one problem (at least) with that answer is that it only works for an application. You're talking about a library, correct?

It's a library, yes.

An executor needs another "monitor" thread to execute wills. If "guarantee" means you need to handle that thread stopping unexpectedly, this seems like it's going to be an infinite regress.

So instead of noticing the bad thing happening, I'd probably try to prevent it: focus on keeping the monitor thread alive.

I'd start by making sure the monitor thread won't exit on its own. Make sure it catches every exception that would otherwise go unhandled, using with-handlers or call-with-exception-handler.
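Something like the following is what I have in mind, as a rough sketch only. `start-monitor` and `handle-message` are hypothetical names; the catch-all predicate `(lambda (e) #t)` is deliberately broad, so it also swallows breaks, which may or may not be what you want.

```racket
#lang racket/base
(require racket/async-channel)

;; handle-message is a hypothetical stand-in for the real dispatch logic.
(define (start-monitor ch handle-message)
  (thread
   (lambda ()
     (let loop ()
       ;; The handler wraps a single iteration, so a failure while
       ;; handling one message is logged and the loop carries on.
       (with-handlers ([(lambda (e) #t)
                        (lambda (e)
                          (log-error "monitor: ignoring failure: ~a"
                                     (if (exn? e) (exn-message e) e)))])
         (handle-message (async-channel-get ch)))
       (loop)))))
```

Putting the `with-handlers` inside the loop rather than around it matters: if it wrapped the whole loop, the first exception would end the loop and the thread would exit anyway.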

That leaves something else killing the thread, via kill-thread or shutting down its custodian. I think you can mostly guard against those by (a) not making that thread descriptor value available in most of your library (don't provide a variable holding its value) and (b) carefully creating the thread in its own custodian, and ditto not advertising that custodian value.
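A minimal sketch of (a) and (b) together, assuming a hypothetical helper called `spawn-private-thread`. It returns void on purpose, so neither the thread nor its custodian escapes to callers.

```racket
#lang racket/base

;; Hypothetical helper: runs thunk in a thread owned by a fresh custodian.
(define (spawn-private-thread thunk)
  ;; make-custodian creates a child of the current custodian by default.
  (define cust (make-custodian))
  (parameterize ([current-custodian cust])
    (thread thunk))
  ;; Return nothing useful: no thread descriptor, no custodian.
  (void))
```

This is only a "mostly" guard. The new custodian is still a child of whichever custodian was current when the helper ran, so shutting down that parent takes the monitor down with it.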


Strictly speaking, this answer is cheating by redefining the question from "preventing unnoticed thread death" to "preventing thread death", but that would be my advice. Other people probably have better advice!

So, essentially, "Threads aren't going to die without a specific reason so as long as their function is simple and robust then you should stop obsessing about irrelevant things." :>

That works. Thank you!

I will agree with this last part only because I think we share a sense of humor that includes a fair amount of self-deprecation. :smile: Seriously, though, you asked a really good question.

I'm not certain my advice is best, but it's what I'd do until/unless I learned something better.
