How does in-list "provide better performance"?

The documentation for in-list states:

An in-list application can provide better performance for list iteration when it appears directly in a for clause.

How is this accomplished? It seems counter-intuitive to increase performance by adding an application.

5 Likes

The for forms recognize in-list (or really they expand in-list which describes what to do) which produces code that calls car and cdr directly. If you don’t use in-list, then the for expansion uses more general operations such as make-sequence to handle the sequence.

6 Likes

I just looked at the source for in-list, and I was struck by this clause:

[[(id) (_ lst-expr)]

I didn’t realize a macro wasn’t required to be in the outermost form, or maybe I’m totally misunderstanding this. The _ is a placeholder for in-list, right? And yet the clause refers to elements outside of the (in-list ...) form. What is going on here? :slight_smile:

Here’s the full version for context:

  (define-sequence-syntax *in-list
    (lambda () #'in-list)
    (lambda (stx)
      (syntax-case stx (list)
        [[(id) (_ (list expr))] #'[(id) (:do-in ([(id) expr]) #t () #t () #t #f ())]]
        [[(id) (_ lst-expr)]
         #'[(id)
            (:do-in
             ;;outer bindings
             ([(lst) lst-expr])
             ;; outer check
             (unless-unsafe (check-list lst))
             ;; loop bindings
             ([lst lst])
             ;; pos check
             (pair? lst)
             ;; inner bindings
             ([(id) (unsafe-car lst)]
              [(rest) (unsafe-cdr lst)]) ; so `lst` is not necessarily retained during body
             ;; pre guard
             #t
             ;; post guard
             #t
             ;; loop args
             (rest))]]
        [_ #f])))
1 Like

in-list isn’t a regular macro, it’s a special form that’s handled by the for forms, which is why it’s defined with define-sequence-syntax. The for expansion recognizes in-list on the right hand side of a for clause, and then calls the macro with the entire clause, not just the right hand side. That’s why the patterns look the way they do here.

6 Likes

Following up on this, if I use in-list outside of a for-form, can the for-form still take advantage of improved performance? Or are bindings “opaque” to this sort of thing? That is, in my boggle post, I have something like

(define (find-words-at board maximum-word-length c)
  (in-generator …))

(define (find-words board maximum-word-length)
  (apply in-sequences
         (map (curry find-words-at board maximum-word-length)
              (hash-keys board))))

(for ([word (find-words board 8)])
  …)

I know the generator stuff isn’t the same, but are potential performance gains impeded by define here?

2 Likes

No, the performance benefit is only when it’s used directly in the for form.

2 Likes

So if in the example above, find-words directly evaluated to (in-generator ...), then we would get the performance boost, right?

No, it’s about being syntactically inside the for form, not about what it evaluates to.

2 Likes

If you wonder about the inner workings of for then the :do-in reveals how for works inside.
Inside a for form :do-in will expand into a loop of the form:

The loop is a typical named let loop. Consider how (for ([x (in-list '(1 2 3))]) (displayln x))
could be written in this style. It becomes more or less:

The form :do-in is used like this:

(:do-in ([(outer-id ...) outer-expr] ...)
        outer-check
        ([loop-id loop-expr] ...)
        pos-guard
        ([(inner-id ...) inner-expr] ...)
        pre-guard
        post-guard
        (loop-arg ...))

This tells us how to implement in-list using :do-in:

Consider now the case

(for ([x (in-list '(1 2 3))]
      [y (in-list '(a b c))])
  (list x y))

This is a parallel loop, where x and y runs over the lists '(1 2 3) and '(a b c) in parallel.
The for form must therefore find the pieces from the two loops and fuse them together.
E.g. combine ([xs initial-xs]) and ([ys initials-ys]) into ([xs initial-xs] [ys initials-ys]) .

This process explains why for needs to see the in-list in the rhs of the clause.
If at compile time, the in-list is present for can generate the various pieces needed
to loop through the elements and put these pieces together with pieces from other iterations.
If on the other hand the in-list is missing, then it can’t do anything better than output
something that at runtime must check whether it needs to loop through a list, a vector,
a string or similar.

Documentation for :do-in

5 Likes