Why is scheduling this future from a thread slower than scheduling it from the main thread?

I'd like to have a synchronization event for work started in parallel via a future, so that I can combine it with other synchronization events. The first thing I tried was a new green thread that creates a future from a thunk representing an expensive computation. That thread can block by calling touch without blocking the thread that wishes to schedule the work. It then puts the result on a channel that the main thread can wait on via sync.

Feel free to let me know of a better solution, but I'm also curious why creating the future inside the thread, as in schedule-pure-work below, ends up about 2 seconds slower than schedule-pure-work2, which creates the future first and only then captures it in the thread.

Is scheduling the new thread taking so long that the future's work only begins running in parallel much later?

#lang racket
(require future-visualizer
         future-visualizer/trace)

(start-future-tracing!)

;; Schedule the work thunk to be executed in parallel and return a synchronization
;; event whose result is the result of the work
(define (schedule-pure-work work)
  (define worker-ch (make-channel))
  (thread (lambda ()
            (define future-result (future work))
            (define result (touch future-result))
            (channel-put worker-ch result)))
  worker-ch)

;; Why is this one multiple seconds faster? Is the future
;; work not started in parallel until much later if started
;; in a new thread?
(define (schedule-pure-work2 work)
  (define worker-ch (make-channel))
  (define future-result (future work))
  (thread (lambda ()
            (define result (touch future-result))
            (channel-put worker-ch result)))
  worker-ch)


(define (dumb-work n)
  (cond
    [(= n 1) 1]
    [else (* n (dumb-work (- n 1)))]))

(define hard-work-1 (lambda () (dumb-work 92499)))
(define hard-work-2 (lambda () (dumb-work 92499)))

(define x-future-sync-event (schedule-pure-work2 hard-work-1))
(define y (hard-work-2))
(define x (sync x-future-sync-event))


(stop-future-tracing!)
(show-visualizer)

In the case where you create the future and immediately touch it within the same thread, it turns out to be unlikely that the future gets picked up by the future scheduler before touch, and so the thread that calls touch just runs the future directly (i.e., in the green/coroutine thread).

When you create a future first and then create a thread to touch it, there's enough delay between the creation of the future and the moment the new thread reaches touch that the future has likely already been picked up by the future scheduler and is running in parallel. So the thread just waits for the result instead of running the future's computation itself.

It certainly seems like there's room for improvement in the scheduler heuristics here!
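As a minimal sketch of that ordering, here is a helper that creates the future before spawning the thread that touches it, and then combines the resulting channel with a timeout event. The names future-evt and sum-to and the 10-second timeout are made up for illustration; whether the future actually runs in parallel still depends on the scheduler and on the work being future-safe.

```racket
#lang racket/base
(require racket/future)

;; Hypothetical helper (not a library function): returns a channel that
;; delivers the result of running `work` in a future.
(define (future-evt work)
  ;; Create the future first, so the future scheduler has a chance to
  ;; pick it up before anything touches it.
  (define f (future work))
  (define ch (make-channel))
  ;; The green thread only waits on the future and forwards the result.
  (thread (lambda () (channel-put ch (touch f))))
  ch)

;; Illustrative workload using only fixnum arithmetic.
(define (sum-to n)
  (for/fold ([acc 0]) ([i (in-range n)])
    (+ acc i)))

(define evt (future-evt (lambda () (sum-to 1000000))))

;; Channels are synchronizable events, so the result can be combined
;; with other events, such as a timeout built from alarm-evt.
(define result
  (sync (wrap-evt evt (lambda (v) (list 'done v)))
        (wrap-evt (alarm-evt (+ (current-inexact-milliseconds) 10000))
                  (lambda (_) 'timed-out))))
```

If the timeout wins, the forwarding thread stays blocked in channel-put until the channel becomes unreachable, so a longer-lived program might prefer a different hand-off.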

Meanwhile, I can't help thinking that what you really want is parallel threads. If you'd like to try that route, see the post "Help test via snapshots: parallel threads".

That makes sense, thank you! Parallel threads would be perfect. I didn't realize they were being worked on, so I'll check them out.