Clever Way to Check Unordered List Equality

Hi, Racket Discourse!

Just a quick one today: Have any of you come across a "clever" way to check for unordered list equality?

I've come up with this (low-effort) snippet, so far:

(define (and-reduce f tree-a tree-b)
  (cond [(and (pair? tree-a) (pair? tree-b))
         (and (f (car tree-a) (car tree-b))
              (and-reduce f (cdr tree-a) (cdr tree-b)))]
        [else (equal? tree-a tree-b)]))

(define (congruent? tree-a tree-b)
  (cond [(and (pair? tree-a) (pair? tree-b))
         (for/or ([permuted (in-permutations tree-b)])
           (and-reduce congruent? tree-a permuted))]
        [else (equal? tree-a tree-b)]))

(congruent? '(1 (a (4 3) (a c)) ((((2)) ((3)))))
            '(1 ((((2)) ((3)))) (a (4 3) (a c))))
; => #true

(congruent? '(1 2 3 4)
            '(4 2 3 1))
; => #true

(congruent? '((1) 2 (3))
            '(2 (1) ((3))))
; => #false

It's not bad, but I feel like it is a very "brute-force" kind of solution, especially with the permutations.

The idea is to eventually abstract this out into a macro-like form that I can use for expressions like this:

(≡app
   (exp (log a b) (log a))
   ≡ [distribute:spread]
   (exp (log a) × 2) ; => (exp (log a) (log a)) 
   (exp (log b) (log a)))

where the parenthesized expressions being matched and transformed are basically "typed bags", in the sense of each bag having a "head", e.g., exp and log.

I thought also of perhaps using the "traces" of the lists, e.g., for something like:

'(1 (1 2) 3 4 ((4)))
=> (1), ((1)), ((2)), (3), (4), (((4)))

and then comparing these "traces" between lists.

Potato, potato?


Edit for posterity: I realize now that my question is not very clear. Although I ostensibly enquire about a "clever way" to test between unordered lists for equality, I think it might come down to more of a "what is a relatively ergonomic structure which allows for the behaviour of multisets" question.

My thinking was that if I could find a "clever" way of thinking about equality (of multiset-like things), then I could generalize that to obtain what I desired.

I'm not sure this is exactly the function you want but, if it is, you can avoid creating the permutations, which'll probably be a lot faster.

#lang racket
(module+ test (require rackunit))

(provide
 (contract-out
  [congruent?
   (-> list? list? boolean?)]))

(define (congruent? a b)
  (equal? (to-ht a) (to-ht b)))

(define (to-ht l)
  (define ht (make-hash))
  (for ([x (in-list l)])
    (hash-set! ht x (+ 1 (hash-ref ht x 0))))
  ht)

(module+ test
  (check-equal? (congruent? '() '()) #t)
  (check-equal? (congruent? '() '(1)) #f)
  (check-equal? (congruent? '(1) '()) #f)
  (check-equal? (congruent? '(1) '(1)) #t)
  (check-equal? (congruent? '(1) '(2)) #f)
  (check-equal? (congruent? '(1 2) '(1 2)) #t)
  (check-equal? (congruent? '(1 2) '(2 1)) #t)
  (check-equal? (congruent? '(1 1 2) '(1 2 2)) #f))
2 Likes

Hi, @robby. Thank you.

I realize from your answer that I gave rather poor examples and did not explicitly say so, but I would like to check the congruence up to an arbitrary depth; so, unordered lists of unordered list or atoms.

Of course, we could generalize somewhat from your example to cover this case.
Not a very performant attempt, but demonstrates the principle:

(define (nest n x)
  (if (zero? n) x `(,(nest (- n 1) x))))

(define ((trace depth) tree)
  (cond [(not (list? tree))
         (nest (+ depth 1) tree)]
        [else
         (append* (map (trace (+ depth 1)) tree))]))

(define (list->hash lst)
  (for/hash ([group (in-list (group-by identity lst))])
    (values (car group) (length group))))

(define (congruent?* tree-a tree-b)
  (equal?
   (list->hash ((trace 0) tree-a))
   (list->hash ((trace 0) tree-b))))

(congruent?* '(1 (a (4 3) (c a)) ((((2)) ((3)))))
             '(1 ((((2)) ((3)))) ((3 4) a (a c))))
; => #true

(congruent?* '(1 2 3 4)
             '(4 2 3 1))
; => #true

(congruent?* '((1) 2 (3))
             '(2 (1) ((3))))
; => #false

The reason I began checking somewhat structurally, is that I would like to use pattern matching for my hypothetical (≡app ...) macro, but I would also like to allow for duplicate identifiers for equal but arbitrarily nested terms in the body and I am not yet confident enough in my macro-abilities to keep track of the plumbing and have the terms survive the expansion to whichever forms/bindings eventually. Hence, lists.


Edit: Okay, so this implementation is wrong. It happens to succeed, but as I alluded in my reply to @Antigen-11, it fails to take into account the assumption I mention in the beginning about the identities of the lists themselves, or the heads; but even this is still an under-constrained statement. I shall need to elaborate.


Edit: This is more in line with what I intended but failed to express:

(define (nest path x)
  (if (null? path) x `(,(car path) ,(nest (cdr path) x))))

(define ((trace [path (list)]) tree)
  (cond [(not (pair? tree))
         (list (nest (reverse path) tree))]
        [else
         (append* (map (trace (cons (gensym) path)) tree))]))

((trace) '(1 2 (1 1 1 (2 3 4) 5) 4 (1 2 3 4 5)))

In essence, I want to keep track of which "spaces" or "nodes" the elements occupy, in order to distinguish them properly, although this won't work as is with the above.

I'd use immutable hash tables for a hash-based approach:

(define (congruent? a b)
  (define (list->hash-frequencies lst)
    (for/fold ([h (hash)])
              ([elem (in-list lst)])
      (hash-update h elem add1 0)))
  (equal? (list->hash-frequencies a) (list->hash-frequencies b)))

Or an O(N^2) approach that converts one list to a vector (to easily move elements around) and checks each element of the other list to see if hasn't yet been found in the vector:

(require (only-in srfi/43 vector-swap!))
(define (congruent? a b)
  (define (vec-index vec what start end)
    (for/first ([(elem i) (in-indexed (in-vector vec start end))]
                #:when (equal? elem what))
      i))
  (define vec-b (list->vector b))
  (if (= (length a) (vector-length vec-b))
      (let loop ([a a]
                 [end (vector-length vec-b)])
        (cond
          [(null? a) #t]
          [(vec-index vec-b (car a) 0 end)
           => (lambda (i)
                (vector-swap! vec-b i (sub1 end))
                (loop (cdr a) (sub1 end)))]
          [else #f]))
      #f))
2 Likes

I would turn the lists into sets and use equal?.

I think this definition works on your examples, but you might
need a more elaborate solution.

(define (congruent? tree-a tree-b)
  (equal? (list->set tree-a) (list->set tree-b)))

I'm sorry to bother you, but there might be a typo here.

The trace function seems to be a flatten-like procedure, but ((trace 0) '(0 (1) () (2))) returns '((0) ((1)) ((2))).

I try redefining this function as follows and the new one works fine.

(define ((trace depth) tree)
    (cond [(not (list? tree))
           (nest (+ depth 1) tree)]
          [else
           (append* (map (trace depth) tree))]))
;; ((trace 0) '(0 (1) () (2)))
;; '(0 1 2)
1 Like

Indeed, @Antigen-11! It was a bit of an awkward function to write and I made the same mistake as with the first attempt, haha.

If I change the definition to use pair? (dubiously?), it works as I intended.

(define ((trace depth) tree)
  (cond [(not (pair? tree))
         (nest (+ depth 1) tree)]
        [else
         (append* (map (trace (+ depth 1)) tree))]))

((trace 0) '(0 (1) () (2)))
; => '((0) ((1)) (()) ((2)))

However, I think I might have contradicted myself somewhat in what I have relayed.

Good catch!

That fails if you have duplicate equal? elements in the lists.

Other people seem to get it, but I'm confused about exactly what definition of 'equal' you are using. If it's as simple as "are the atoms in x the same as the atoms in y" then you could do this:

(equal? (list->set (flatten x))
        (list->set (flatten y)))

If you're looking for something else, what is it exactly?

Then we need to use bags (multi sets) instead.

https://docs.racket-lang.org/rebellion/Multisets.html

Hi, @dstorrs. Maybe somewhat abstruse, yes.

As @soegaard points out below, the intended definition of "equivalence" I am aiming at is that of multisets, which are like sets but where elements also have multiplicity.

Additionally, I want to treat the nested lists as multisets of multisets, hence the elaborate machinations.

Hi, @soegaard.

I have considered using Rebellion's multisets, yes. But I must admit I have not been able/spent enough time to come up with the necessary machinery to deal with rests and pattern matching on Rebellion's implementation.

This is broccoli for thought :thinking:

@bakgatviooldoos

#lang racket
(require rebellion/collection/multiset)

(define (tree->bag t)  
  (if (list? t)
      (sequence->multiset (map tree->bag t))
      t))

(define (congruent? t1 t2)
  (equal? (tree->bag t1) (tree->bag t2)))
1 Like

I thought of this proto-≡app implementation, to better explain what I am trying to achieve:

(require (for-syntax syntax/parse
                     racket/syntax))

(define-match-expander multiset
  (lambda (stx)
    (define (nest path x)
      (cond [(null? path) x]
            [else
             #`(list #,(car path) #,(nest (cdr path) x))]))

    (define ((trace [path (list)]) tree)
      (cond [(not (pair? tree))
             (list (nest (reverse path) (datum->syntax stx tree)))]
            [else
             (apply append (map (trace (cons (generate-temporary) path)) tree))]))
  
    (syntax-parse stx
      [(_ content* ...+)
       #:with (content ...) ((trace) (syntax->datum #'(content* ...)))
       #'(list content ...)])))

(define (nest path x)
  (if (null? path) x `(,(car path) ,(nest (cdr path) x))))

(define ((trace [path (list)]) tree)
  (cond [(not (pair? tree))
         (list (nest (reverse path) tree))]
        [else
         (append* (map (trace (cons (gensym) path)) tree))]))

(define-syntax (≡app stx)
  (syntax-parse stx
    [(_ content ...+
        [body ...])
     #`(match-lambda
         [#,(syntax-local-introduce
             #'(multiset content ...))
          body ...]
         [_ #false])]))

((≡app 1 (a 3 b) 4 (d)
       [(list a b d)])
 ((trace) '(1 (2 3 4) 4 (5))))
; => '(2 4 5)

It's not pretty and it is ordered, but what I would like to do now, is somehow keep track of the (generate-temporary) calls and replace them again with new temporaries, and finally check that the bindings for the temporaries which replaced the same original temporary, are equal.

The idea is that I can then inside of multiset replace #'(list ...) with #'(list-no-order ...) and somehow inject a #:when (equal? ...) into the match in ≡app to enforce the constraints, but I am unsure how to have the bindings "escape" the multiset expander, if that makes sense.


Also, in the above, something like:

((≡app 1 a (d)
       [(list a d)])
 ((trace) '(1 ((2 3 4) 4) (5))))

will fail to match, because of the "extra" pieces in the ((trace) ...) call.

The forms for which look something like:

(multiset ...)
~> ((list g5 1) (list g5 a) (list g5 (list g6 d)))

((trace) ...)
~> ((g3002557 1) (g3002557 (g3002558 (g3002559 2))) (g3002557 (g3002558 (g3002559 3))) (g3002557 (g3002558 (g3002559 4))) (g3002557 (g3002558 4)) (g3002557 (g3002560 5)))

Edit: Almost there?

(define (equal?* . args)
  (cond [(null? args) #true]
        [else
         (define fst (first args))
         (for/fold ([state #true])
                   ([arg   (in-list (rest args))])
           (and state (equal? fst arg)))]))

(define-syntax (≡app stx)
  (define ≡sat (make-parameter (hash)))

  (define (≡sat-update k)
    (define temp (generate-temporary))
    {≡sat (hash-update {≡sat} k (lambda (v) (cons temp v)) (list))}
    temp)
  
  (define (nest path x)
    (cond [(null? path) x]
          [else
           #`(list #,(≡sat-update (car path)) #,(nest (cdr path) x))]))

  (define ((trace [path (list)]) tree)
    (cond [(not (pair? tree))
           (list (nest (reverse path) (datum->syntax stx tree)))]
          [else
           (apply append (map (trace (cons (gensym) path)) tree))]))
  
  (syntax-parse stx
    [(_ content* ...+
        [body ...])
     #:with (content ...)     ((trace) (syntax->datum #'(content* ...)))
     #:with ((?≡sat ...) ...) (hash-values {≡sat})
     #:with ooo               #'(... ...)
     #'(match-lambda
         [(list-no-order content ... _ ooo)
          #:when (and (equal?* ?≡sat ...) ...)
          body ...]
         [_ #false])]))

((≡app 1 (c) ((4 a 3) b)
  [ (list a b c) ])
 ((trace) '(1 ((2 3 4) 4) (5))))
; => '(2 4 5)