HtDP speed curiosity

Hello,

I'm reading HtDP, a very interesting and well written book. I’m on chapter VI about accumulators.

The note on speed compares two different functions, one with an accumulator and one with natural recursion. Surprisingly the accumulator one is slower than the other one. So I developed an example on my own.

I developed three procedure that build a list of descending numbers:

(define mfirst mcar)
(define mrest mcdr)
(define set-mfirst! set-mcar!)
(define set-mrest! set-mcdr!)


(define (build n)
  (if (= n 0)
      '()
      (cons n (build (sub1 n)))))

(build 10)

=> '(10 9 8 7 6 5 4 3 2 1)


(define (build/a n [acc '()])
  (if (= n 0)
      (reverse acc)
      (build/a (sub1 n) (cons n acc))))


(define the-list '())
(define tail '())

(define (build/i n)
  (define t (mcons n '()))
  (cond 
    [(empty? the-list)
     (set! the-list t)
     (set! tail t)
     (build/i (sub1 n))]
    [(> n 0)
     (set-mrest! tail t)
     (set! tail t)
     (build/i (sub1 n))]))

     
(displayln "functional")
(time (for ([i (in-range 5000)])
        (build 1000)))
        
(displayln "functional acc")
(time (for ([i (in-range 5000)])
        (build/a 1000)))
        
(displayln "imperative")
(time (for ([i (in-range 5000)])
        (set! the-list '())
        (build/i 1000)))

=> functional
   cpu time: 47 real time: 47 gc time: 16

   functional acc
   cpu time: 87 real time: 87 gc time: 14

   imperative
   cpu time: 128 real time: 128 gc time: 24

Build uses natural recursion, build/a uses an accumulator and build/i uses the imperative paradigm.

I compiled this program into an executable and the result is incredible. The imperative paradigm is much slower than the functional paradigm.

I’m really curious about this. What could it be due to?

Matteo

  1. HtDP’s example is about the order of operations. A transformation from a natural style to an accumulator style, changes the order in which primitive operations build the results (+ for example). If this operation satisfies a certain mathematical property, you can align the orders. If not, too bad — the performance of accumulator variant may suffer. The big point is “don’t trust prejudice” (now called “consensus science”).

  2. Your imperative example shows that “C style” programming is not good for memory access and GC. The Java world discovered this problem around ’00. Programmers thought that “it looks like C++ so I can kind of program C++” but that ignored the underlying semantics and operation (GC). (The Java team tuned the GC eventually so that it wouldn’t perform too badly with C++ style programming.)

I wanted to understand this as well for Gambit, where I get similar timings with the following declarations:

(declare
  (standard-bindings)
  (extended-bindings)
  (block))

which I believe accords to assumptions that the Racket compiler makes.

For Gambit, build/a takes just about twice as long as build because (a) the build recursion isn't so deep as to trigger a GC and copy the stack to the heap and (b) build/a's reverse takes about as much time as building the list.

build/i takes almost three times as long as build because of the code inserted to check that the first argument to set-mrest! is truly a pair. If I add (not safe) to the list of declarations, then build and build/i take just about the same run time.