Differences between for/fold and explicit car/cdr recursion?

ddrake · August 9, 2025, 3:07pm

I solved one of the katas on codewars: it's called "are they the same?". The key thing is deciding if two lists have equal elements.

My solution used for/fold; one of the other solutions used first and rest and explicitly recursed. I'm wondering about the basic conceptual difference between these two functions:

(define (all-equal-via-fold xs ys)
       (for/fold ([all-equal #t])
                  ([x xs]
                   [y ys]
                   #:break (not all-equal))
         (and all-equal (equal? x y))))

(define (all-equal-via-cons-and-cdr xs ys)
  (if (equal? (first xs) (first ys))
      (all-equal-via-cons-and-cdr (rest xs) (rest ys))
      #f))

(Those aren't exactly what is used in the problem/kata, but they illustrate what I'm wondering; you can imagine replacing equal? in (equal? x y) and (equal? (first xs) (first ys)) with some other predicate.)

Is there a meaningful difference between those two? I know there are a lot of extra details involved in the for/fold, but I want to verify my mental model here: I think that the fold, with its #:break, will get compiled into something equivalent to the explicit first/rest recursion: for each thing over which we are folding (an x and y pair), it runs the predicate, uses the result to update the accumulator (or break out of the fold), and then proceeds to do the same thing on the rest of the thing over which we are folding.

That seems to match conceptually with the recursion: take the first x and y pair in the lists, run the predicate, and then either return (= break out of the fold) or recurse (= "do the same thing on the rest of the thing...").

On a basic conceptual level, am I right that these are pretty much equivalent?

If so, what differences are there between the two?

ddrake · August 9, 2025, 3:17pm

FWIW, some naive timing shows that the explicit recursion is about 40% faster than the fold version. So there's a practical difference here, which isn't so surprising. What else is relevant here?

(define (compare-times n)
  (let* ([length 1000]
         [xs (randlist length)]
         [ys (map sqr xs)])
    (printf "fold:\n")
    (time (for ([_ (range n)]) (all-equal-via-fold xs ys)))
    (printf "recurse:\n")
    (time (for ([_ (range n)]) (all-equal-via-cons-and-cdr xs ys)))))

(randlist just makes a big list of random floats)

And I get:

> (foo 100000000)
fold:
cpu time: 9912 real time: 9912 gc time: 0
recurse:
cpu time: 6026 real time: 6026 gc time: 2
>

LiberalArtist · August 9, 2025, 5:22pm

Both versions seem broken, and, even if this counterexample breaks what was supposed to be an unchecked invariant, the two implement different behavior:

> (all-equal-via-fold '(1 2) '(1 2 3))
#t
> (all-equal-via-cons-and-cdr '(1 2) '(1 2 3))
first: contract violation
  expected: (and/c list? (not/c empty?))
  given: '()
 [,bt for context]

(Your original question is a good one, I just didn't want to “spoil” a puzzle by jumping over this way the two differ.)

greghendershott · August 9, 2025, 7:31pm

As for performance: When you know you'll be operating on a specific type of sequence -- like a list -- you want to give for that "hint" -- like by using in-list. See Iteration Performance.

I think the only time I'd omit that is when the code is supposed to be "generic" for any type of sequence (which in my experience is rare), or, when I'm casually exploring or experimenting with short sequences in the REPL and performance doesn't matter.

With in-list I'd expect the for/fold code to expand into something much like the handwritten recursion (but corrected for details like @LiberalArtist pointed out).

FWIW I'd probably use for/and unless that would be cheating for the kata.

(To really really cheat, I'd give as the solution just (define comp equal?) IIUC the problem correctly. But seriously, it's interesting to think about how you would define equal? for real, over a variety of values.)

ddrake · August 9, 2025, 9:51pm

Ah, that's an artifact of me trying to extract just the relevant piece of the full solution -- the assumption is that the lists are of the same length. In the actual recursive solution, that is handled.

My own solution -- which was accepted as correct on the codewars site -- has that bug, because for/fold terminates on the shorter input. I commented on the solution and suggested they add tests to exercise that corner case. Thanks for that!

So, for the moment, let's just assume that the lists are the same length.

The differing behavior, I think, is a good example of a meaningful difference between the fold and the recursion: the fold just automatically terminates on the shorter input; the recursion blows up. In some situations, the former behavior may be exactly what you want. But not here...

ddrake · August 9, 2025, 11:00pm

Indeed, that eliminates the performance difference -- I just ran the same timing, and there's no significant difference.

ddrake · August 9, 2025, 11:03pm

It's not, and I didn't know about for/and -- it's precisely what I was aiming to do with my fold.

ddrake · August 10, 2025, 6:55pm

BTW, is there a nice idiomatic way to get for/fold and friends to somehow signal or handle the situation in which one of the iteration sequences is exhausted, but the other isn't? It looks like for's essential behavior is to just terminate on the shorter sequence, and I don't see a reasonable way to detect or signal that situation.

I asked a couple chatbots for some code, and both landed on something sequence->generator -- but when I really looked at the code, it seemed to be (for this problem, at any rate) just a really fancy complex version of that simplistic first/rest recursion.

greghendershott · August 10, 2025, 7:57pm

No, I don't think so. The iteration/comprehension forms like for/fold follow the example of fold and map in this regard.

If I needed special behavior, I would probably just write out a recursive function by hand.

For example, as a first attempt:

(define (strict-map f as bs)
  (cond
    [(and (pair? as) (pair? bs))
     (cons (f (car as) (car bs))
           (strict-map f (cdr as) (cdr bs)))]
    [(and (null? as) (null? bs))
     null]
    [else
     (error 'strict-map "lists are different lengths")]))

Or, using racket/match something like:

(define (strict-map f as bs)
  (match* [as bs]
    [[(cons a as) (cons b bs)] (cons (f a b) (strict-map f as bs))]
    [[(list)      (list)     ] null]
    [[_           _          ] (error 'strict-map "lists are different lengths")]))

When writing recursive functions by hand I tend to prefer pattern matching. Because it reduces the car and cdr noise:signal ratio. And because it looks like writing out the various cases in a "by example" style. And, as with for/fold, it tends to expand to equivalently efficient code.

p.s. That strict-map could be rewritten to be fully variadic just like map.

EmEf · August 10, 2025, 8:59pm

Here is a way to get you started on a "cheaply constructed but expensive" for/fold variant that protects you from sequences of unequal length:

#lang racket

(require (for-syntax syntax/parse))

(define-syntax (for/fold/protected stx)
  (syntax-parse stx
    [(_ accu ([x xx] ... clauses ...) body ...)
     #:with (determine-length ...) (protect #'(xx ...)) 
     #'(cond
         [(not (= (determine-length xx) ...))
          (error 'for/fold/protect "expected sequences of equal length, given ~a" `[xx ...])]
         [else
          (for/fold accu ([x xx] ... clauses ...) body ...)])]))

(define-for-syntax (protect syn-lists)
  (define lists (syntax->list syn-lists))
  (for/list ([l lists])
    (syntax-parse l
      [((~literal in-list) l) #'stream-length]
      [((~literal in-vector) v) #'sequence-length]
      ;; the next line is wrong 
      [_ #'length])))

(define ll '(1 2 3))
(define kk '#(a b c d))
(define mm '(1 2 3))

(for/fold/protected
    ([r '()] #:result (reverse r))
  ([l (in-list ll)] [m mm] [k (in-vector kk)] #:do [ (printf "~a ~a\n" l k) ])
  (cons (list m l k) r))

You'd have to write a generic length-for to get the "wrong" case done properly.

The "cheap" refers to the macro part.
The "expensive" refers to the upfront checking part, which duplicates the traversal.

When I program, I use fold when I don't know the length and want an "inexpensive" way of failing (terminating the program execution) and for/fold when I am okay with abandoning left-over elements of a sequence (or when I know they are guaranteed to be of the same length).

ddrake · August 11, 2025, 12:28pm

"Learn about Racket's pattern matching" has been on my radar, thank you for prompting me to learn a bit about it.

This approach is lovely! It's so short and so readable. The pattern matching style really appeals to me; as you say, it's nice to describe cases via "shapes" or examples or declaratively, rather than interrogatively with predicates.

Here, it's a perfect description of what I want: the first pattern says "the shape is that of two lists that have at least one element"; the second says "two empty lists"; and the third says "in any other situation, we have an empty and a nonempty list".

And not only that, but when I use that for my solution to the kata, it's 35% faster than the explicit car/cdr solution! I have no idea why. Here's my solution:

(define (comp%% xs ys)
  (let recurse ([xs (sort (map sqr xs) <)]
                [ys (sort ys <)])
    (match* [xs ys]
      [[(cons a as) (cons b bs)] (if (equal? a b)
                                     (recurse as bs)
                                     #f)]
      [[(list)      (list)     ] #t]
      [[_           _          ] #f])))

and comparing that to the explicit car/cdr solution, mine runs faster.

Now my question is, why? I tested, and it's not the explicit length checks. My named let defines a function that seems equivalent to the "compAux" function in the other solution. But mine is significantly faster. Any ideas why?

(Another question: I'm not clear on the difference between match and match*. I read the reference, but it's not so clear to my little pea brain.)

greghendershott · August 11, 2025, 12:56pm

Sorry, using match* here was probably an unnecessary distraction. (Unlike many people here, I am definitely not a professional educator.)

match* is just a convenience for match. For now just think of it as writing e.g. (match (list as bs) _) and wrapping each pattern clause in a list.

By the way, I'm pretty sure I first saw this style -- "classic" handwritten recursion but using match -- in some code by @soegaard, to give credit where it's due.

ddrake · August 11, 2025, 11:34pm

That seems roughly right: these work the same, as far as I see:

(define (match-no-star xs ys)
  (match (list xs ys)
    [ (list (cons a as) (cons b bs)) '2-element-lists]
    [ (list empty empty)  'empty-empty ]
    [ (list as bs) (printf "something else: ~s, ~s\n" as bs)  ]))

(define (match-no-star-lotsa-punctuation xs ys)
  (match `(,xs ,ys)
    [ `( ,(cons a as) ,(cons b bs)) '2-element-lists]
    [ `(,empty ,empty)  'empty-empty ]
    [ `(,as ,bs) (printf "something else: ~s, ~s\n" as bs)  ]))

(define (match-star xs ys)
  (match* (xs ys)
    [ ((cons a as) (cons b bs)) '1+-element-lists]
    [ (empty empty)  'empty-empty ]
    [ (as bs) (printf "something else: ~s, ~s\n" as bs)  ]))

greghendershott · August 12, 2025, 1:50pm

I dodged this part.

Not immediately sure why. Might be related to details like this let and comment from the source for Racket's map:

  (let loop ([l l])
    (cond
      [(null? l) null]
      [else
       (let ([r (cdr l)]) ; so `l` is not necessarily retained during `f`
         (cons (f (car l)) (loop r)))]))

The match approach also does that -- rebinds as to (cdr as), ditto for bs -- and avoids that retention.

This guess is more plausible if Racket's time says the extra time for your explicit car/cdr variation is mostly "GC".

Having said all that, this guess feels insufficient?

p.s. Tip: When using Racket's time in code, run that using racket from the command-line.

(If instead the code is run in e.g. DrRacket or Emacs Racket Mode, the timing is likely to be contaminated by other stuff running, as well if the code is annoated by errortrace for better stack traces, etc.)

dstorrs · August 13, 2025, 9:10pm

BTW, is there a nice idiomatic way to get for/fold and friends to somehow signal or handle the situation in which one of the iteration sequences is exhausted, but the other isn't?

Not sure if this is what you want, but you could use a macro that checks the lengths before starting the fold:

#lang racket

(require (for-syntax  syntax/parse))

(define xs '(1 2))
(define ys '(1 2 3))
(define zs '(1 2 3))

(define-syntax (for/fold/same-length-lists stx)
  (syntax-parse stx
    [(_ xs:expr ys:expr)
     #'(and (= (length xs) (length ys))
            (for/fold ([all-equal #t])
                      ([x xs]
                       [y ys]
                       #:break (not all-equal))
              (and all-equal (equal? x y))))]))

(for/fold/same-length-lists xs ys)
(for/fold/same-length-lists ys zs)

Topic		Replies	Views
Clever Way to Check Unordered List Equality Questions & Answers	20	398	November 23, 2024
Streams questions Questions & Answers	22	216	September 9, 2024
Map versus for/list (& idiomatic Racket, the "for/..." zoo, etc) General	12	167	July 15, 2025
Predicate non-cyclic-pair? Questions & Answers question , drracket	16	346	April 21, 2022
Foldl: Racket vs Haskell/OCaml General	2	411	July 5, 2023

Differences between for/fold and explicit car/cdr recursion?

Related topics