Clever Way to Check Unordered List Equality

Hi, Racket Discourse!

Just a quick one today: Have any of you come across a "clever" way to check for unordered list equality?

I've come up with this (low-effort) snippet, so far:

(define (and-reduce f tree-a tree-b)
  (cond [(and (pair? tree-a) (pair? tree-b))
         (and (f (car tree-a) (car tree-b))
              (and-reduce f (cdr tree-a) (cdr tree-b)))]
        [else (equal? tree-a tree-b)]))

(define (congruent? tree-a tree-b)
  (cond [(and (pair? tree-a) (pair? tree-b))
         (for/or ([permuted (in-permutations tree-b)])
           (and-reduce congruent? tree-a permuted))]
        [else (equal? tree-a tree-b)]))

(congruent? '(1 (a (4 3) (a c)) ((((2)) ((3)))))
            '(1 ((((2)) ((3)))) (a (4 3) (a c))))
; => #true

(congruent? '(1 2 3 4)
            '(4 2 3 1))
; => #true

(congruent? '((1) 2 (3))
            '(2 (1) ((3))))
; => #false

It's not bad, but I feel like it is a very "brute-force" kind of solution, especially with the permutations.

The idea is to eventually abstract this out into a macro-like form that I can use for expressions like this:

(≡app
   (exp (log a b) (log a))
   ≡ [distribute:spread]
   (exp (log a) × 2) ; => (exp (log a) (log a)) 
   (exp (log b) (log a)))

where the parenthesized expressions being matched and transformed are basically "typed bags", in the sense of each bag having a "head", e.g., exp and log.

I thought also of perhaps using the "traces" of the lists, e.g., for something like:

'(1 (1 2) 3 4 ((4)))
=> (1), ((1)), ((2)), (3), (4), (((4)))

and then comparing these "traces" between lists.

Potato, potato?


Edit for posterity: I realize now that my question is not very clear. Although I ostensibly enquire about a "clever way" to test between unordered lists for equality, I think it might come down to more of a "what is a relatively ergonomic structure which allows for the behaviour of multisets" question.

My thinking was that if I could find a "clever" way of thinking about equality (of multiset-like things), then I could generalize that to obtain what I desired.

I'm not sure this is exactly the function you want but, if it is, you can avoid creating the permutations, which'll probably be a lot faster.

#lang racket
(module+ test (require rackunit))

(provide
 (contract-out
  [congruent?
   (-> list? list? boolean?)]))

(define (congruent? a b)
  (equal? (to-ht a) (to-ht b)))

(define (to-ht l)
  (define ht (make-hash))
  (for ([x (in-list l)])
    (hash-set! ht x (+ 1 (hash-ref ht x 0))))
  ht)

(module+ test
  (check-equal? (congruent? '() '()) #t)
  (check-equal? (congruent? '() '(1)) #f)
  (check-equal? (congruent? '(1) '()) #f)
  (check-equal? (congruent? '(1) '(1)) #t)
  (check-equal? (congruent? '(1) '(2)) #f)
  (check-equal? (congruent? '(1 2) '(1 2)) #t)
  (check-equal? (congruent? '(1 2) '(2 1)) #t)
  (check-equal? (congruent? '(1 1 2) '(1 2 2)) #f))
2 Likes

Hi, @robby. Thank you.

I realize from your answer that I gave rather poor examples and did not explicitly say so, but I would like to check the congruence up to an arbitrary depth; so, unordered lists of unordered list or atoms.

Of course, we could generalize somewhat from your example to cover this case.
Not a very performant attempt, but demonstrates the principle:

(define (nest n x)
  (if (zero? n) x `(,(nest (- n 1) x))))

(define ((trace depth) tree)
  (cond [(not (list? tree))
         (nest (+ depth 1) tree)]
        [else
         (append* (map (trace (+ depth 1)) tree))]))

(define (list->hash lst)
  (for/hash ([group (in-list (group-by identity lst))])
    (values (car group) (length group))))

(define (congruent?* tree-a tree-b)
  (equal?
   (list->hash ((trace 0) tree-a))
   (list->hash ((trace 0) tree-b))))

(congruent?* '(1 (a (4 3) (c a)) ((((2)) ((3)))))
             '(1 ((((2)) ((3)))) ((3 4) a (a c))))
; => #true

(congruent?* '(1 2 3 4)
             '(4 2 3 1))
; => #true

(congruent?* '((1) 2 (3))
             '(2 (1) ((3))))
; => #false

The reason I began checking somewhat structurally, is that I would like to use pattern matching for my hypothetical (≡app ...) macro, but I would also like to allow for duplicate identifiers for equal but arbitrarily nested terms in the body and I am not yet confident enough in my macro-abilities to keep track of the plumbing and have the terms survive the expansion to whichever forms/bindings eventually. Hence, lists.


Edit: Okay, so this implementation is wrong. It happens to succeed, but as I alluded in my reply to @Antigen-11, it fails to take into account the assumption I mention in the beginning about the identities of the lists themselves, or the heads; but even this is still an under-constrained statement. I shall need to elaborate.


Edit: This is more in line with what I intended but failed to express:

(define (nest path x)
  (if (null? path) x `(,(car path) ,(nest (cdr path) x))))

(define ((trace [path (list)]) tree)
  (cond [(not (pair? tree))
         (list (nest (reverse path) tree))]
        [else
         (append* (map (trace (cons (gensym) path)) tree))]))

((trace) '(1 2 (1 1 1 (2 3 4) 5) 4 (1 2 3 4 5)))

In essence, I want to keep track of which "spaces" or "nodes" the elements occupy, in order to distinguish them properly, although this won't work as is with the above.

I'd use immutable hash tables for a hash-based approach:

(define (congruent? a b)
  (define (list->hash-frequencies lst)
    (for/fold ([h (hash)])
              ([elem (in-list lst)])
      (hash-update h elem add1 0)))
  (equal? (list->hash-frequencies a) (list->hash-frequencies b)))

Or an O(N^2) approach that converts one list to a vector (to easily move elements around) and checks each element of the other list to see if hasn't yet been found in the vector:

(require (only-in srfi/43 vector-swap!))
(define (congruent? a b)
  (define (vec-index vec what start end)
    (for/first ([(elem i) (in-indexed (in-vector vec start end))]
                #:when (equal? elem what))
      i))
  (define vec-b (list->vector b))
  (if (= (length a) (vector-length vec-b))
      (let loop ([a a]
                 [end (vector-length vec-b)])
        (cond
          [(null? a) #t]
          [(vec-index vec-b (car a) 0 end)
           => (lambda (i)
                (vector-swap! vec-b i (sub1 end))
                (loop (cdr a) (sub1 end)))]
          [else #f]))
      #f))
2 Likes

I would turn the lists into sets and use equal?.

I think this definition works on your examples, but you might
need a more elaborate solution.

(define (congruent? tree-a tree-b)
  (equal? (list->set tree-a) (list->set tree-b)))

I'm sorry to bother you, but there might be a typo here.

The trace function seems to be a flatten-like procedure, but ((trace 0) '(0 (1) () (2))) returns '((0) ((1)) ((2))).

I try redefining this function as follows and the new one works fine.

(define ((trace depth) tree)
    (cond [(not (list? tree))
           (nest (+ depth 1) tree)]
          [else
           (append* (map (trace depth) tree))]))
;; ((trace 0) '(0 (1) () (2)))
;; '(0 1 2)
1 Like

Indeed, @Antigen-11! It was a bit of an awkward function to write and I made the same mistake as with the first attempt, haha.

If I change the definition to use pair? (dubiously?), it works as I intended.

(define ((trace depth) tree)
  (cond [(not (pair? tree))
         (nest (+ depth 1) tree)]
        [else
         (append* (map (trace (+ depth 1)) tree))]))

((trace 0) '(0 (1) () (2)))
; => '((0) ((1)) (()) ((2)))

However, I think I might have contradicted myself somewhat in what I have relayed.

Good catch!

That fails if you have duplicate equal? elements in the lists.

Other people seem to get it, but I'm confused about exactly what definition of 'equal' you are using. If it's as simple as "are the atoms in x the same as the atoms in y" then you could do this:

(equal? (list->set (flatten x))
        (list->set (flatten y)))

If you're looking for something else, what is it exactly?

1 Like

Then we need to use bags (multi sets) instead.

https://docs.racket-lang.org/rebellion/Multisets.html

Hi, @dstorrs. Maybe somewhat abstruse, yes.

As @soegaard points out below, the intended definition of "equivalence" I am aiming at is that of multisets, which are like sets but where elements also have multiplicity.

Additionally, I want to treat the nested lists as multisets of multisets, hence the elaborate machinations.

Hi, @soegaard.

I have considered using Rebellion's multisets, yes. But I must admit I have not been able/spent enough time to come up with the necessary machinery to deal with rests and pattern matching on Rebellion's implementation.

This is broccoli for thought :thinking:

@bakgatviooldoos

#lang racket
(require rebellion/collection/multiset)

(define (tree->bag t)  
  (if (list? t)
      (sequence->multiset (map tree->bag t))
      t))

(define (congruent? t1 t2)
  (equal? (tree->bag t1) (tree->bag t2)))
1 Like

I thought of this proto-≡app implementation, to better explain what I am trying to achieve:

(require (for-syntax syntax/parse
                     racket/syntax))

(define-match-expander multiset
  (lambda (stx)
    (define (nest path x)
      (cond [(null? path) x]
            [else
             #`(list #,(car path) #,(nest (cdr path) x))]))

    (define ((trace [path (list)]) tree)
      (cond [(not (pair? tree))
             (list (nest (reverse path) (datum->syntax stx tree)))]
            [else
             (apply append (map (trace (cons (generate-temporary) path)) tree))]))
  
    (syntax-parse stx
      [(_ content* ...+)
       #:with (content ...) ((trace) (syntax->datum #'(content* ...)))
       #'(list content ...)])))

(define (nest path x)
  (if (null? path) x `(,(car path) ,(nest (cdr path) x))))

(define ((trace [path (list)]) tree)
  (cond [(not (pair? tree))
         (list (nest (reverse path) tree))]
        [else
         (append* (map (trace (cons (gensym) path)) tree))]))

(define-syntax (≡app stx)
  (syntax-parse stx
    [(_ content ...+
        [body ...])
     #`(match-lambda
         [#,(syntax-local-introduce
             #'(multiset content ...))
          body ...]
         [_ #false])]))

((≡app 1 (a 3 b) 4 (d)
       [(list a b d)])
 ((trace) '(1 (2 3 4) 4 (5))))
; => '(2 4 5)

It's not pretty and it is ordered, but what I would like to do now, is somehow keep track of the (generate-temporary) calls and replace them again with new temporaries, and finally check that the bindings for the temporaries which replaced the same original temporary, are equal.

The idea is that I can then inside of multiset replace #'(list ...) with #'(list-no-order ...) and somehow inject a #:when (equal? ...) into the match in ≡app to enforce the constraints, but I am unsure how to have the bindings "escape" the multiset expander, if that makes sense.


Also, in the above, something like:

((≡app 1 a (d)
       [(list a d)])
 ((trace) '(1 ((2 3 4) 4) (5))))

will fail to match, because of the "extra" pieces in the ((trace) ...) call.

The forms for which look something like:

(multiset ...)
~> ((list g5 1) (list g5 a) (list g5 (list g6 d)))

((trace) ...)
~> ((g3002557 1) (g3002557 (g3002558 (g3002559 2))) (g3002557 (g3002558 (g3002559 3))) (g3002557 (g3002558 (g3002559 4))) (g3002557 (g3002558 4)) (g3002557 (g3002560 5)))

Edit: Almost there?

(define (equal?* . args)
  (cond [(null? args) #true]
        [else
         (define fst (first args))
         (for/fold ([state #true])
                   ([arg   (in-list (rest args))])
           (and state (equal? fst arg)))]))

(define-syntax (≡app stx)
  (define ≡sat (make-parameter (hash)))

  (define (≡sat-update k)
    (define temp (generate-temporary))
    {≡sat (hash-update {≡sat} k (lambda (v) (cons temp v)) (list))}
    temp)
  
  (define (nest path x)
    (cond [(null? path) x]
          [else
           #`(list #,(≡sat-update (car path)) #,(nest (cdr path) x))]))

  (define ((trace [path (list)]) tree)
    (cond [(not (pair? tree))
           (list (nest (reverse path) (datum->syntax stx tree)))]
          [else
           (apply append (map (trace (cons (gensym) path)) tree))]))
  
  (syntax-parse stx
    [(_ content* ...+
        [body ...])
     #:with (content ...)     ((trace) (syntax->datum #'(content* ...)))
     #:with ((?≡sat ...) ...) (hash-values {≡sat})
     #:with ooo               #'(... ...)
     #'(match-lambda
         [(list-no-order content ... _ ooo)
          #:when (and (equal?* ?≡sat ...) ...)
          body ...]
         [_ #false])]))

((≡app 1 (c) ((4 a 3) b)
  [ (list a b c) ])
 ((trace) '(1 ((2 3 4) 4) (5))))
; => '(2 4 5)

Yoh, taken a while to get to this point, but I think it is making nice progress.

In the end, I decided to implement my own bag struct, because there were certain properties I wanted baked into the implementation, like elements with infinite multiplicities.

Next steps are to wrap this in something like minikanren's run form, to allow for better control of the results.

Also, maybe generators should be replaced by something else (I have not looked into the performance impact), but they were conceptually easy to use.

(define bag-find (∘ find-solutions bag-project))

(define (g)
  (bag-find
   ; target (what we want to find)
   (bagexp
    (pow x 200) (pow y 200) (pow z 200) (pow m 3)
    u
    (pow n +inf) (pow s +inf) (pow t +inf)
    ((bag (pow u 4))
     . pow . 2))

   ; source (what we are given)
   (bagexp
    (pow x 400) (pow y 200) 3 3 3
    ,(list 1 2 3 4)
    (pow 4 +inf) (pow s +inf) (pow t +inf)
    ((bag (pow ,(list 1 2 3 4) 4))
     . pow . 2))))

(with-names (m n s t u x y z)
  (time
   (for ([(soln i) (in-indexed (in-producer (g) (void)))]
         [(_)      (in-range 100)])
     (printf "soln #~a\n" (+ i 1))
     (pretty-print soln)
     (newline))))
;=>
soln #1
'#hash((m . 3) (n . s) (s . 4) (t . t) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #2
'#hash((m . 3) (n . 4) (s . s) (t . t) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #3
'#hash((m . 3) (n . s) (s . t) (t . 4) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #4
'#hash((m . 3) (n . 4) (s . t) (t . s) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #5
'#hash((m . 3) (n . t) (s . s) (t . 4) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #6
'#hash((m . 3) (n . t) (s . 4) (t . s) (u . (1 2 3 4)) (x . y) (y . x) (z . x))

soln #7
'#hash((m . 3) (n . s) (s . 4) (t . t) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #8
'#hash((m . 3) (n . 4) (s . s) (t . t) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #9
'#hash((m . 3) (n . s) (s . t) (t . 4) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #10
'#hash((m . 3) (n . 4) (s . t) (t . s) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #11
'#hash((m . 3) (n . t) (s . s) (t . 4) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #12
'#hash((m . 3) (n . t) (s . 4) (t . s) (u . (1 2 3 4)) (x . x) (y . y) (z . x))

soln #13
'#hash((m . 3) (n . s) (s . 4) (t . t) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

soln #14
'#hash((m . 3) (n . 4) (s . s) (t . t) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

soln #15
'#hash((m . 3) (n . s) (s . t) (t . 4) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

soln #16
'#hash((m . 3) (n . 4) (s . t) (t . s) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

soln #17
'#hash((m . 3) (n . t) (s . s) (t . 4) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

soln #18
'#hash((m . 3) (n . t) (s . 4) (t . s) (u . (1 2 3 4)) (x . x) (y . x) (z . y))

cpu time: 78 real time: 75 gc time: 15

Thanks so much for all of the helpful comments so far :heart:

Look, Ma, almost like Mr. Friedman used to make.

It is still horribly slow, but it's getting there:

(define (person age name party)
  ((bagexp
    (person: ,name (age: ,age)))
   . bagrel≡ . party))

(define (eats meal name party)
  ((bagexp
    (person: ,name (age: ,(free)))
    (eats:   ,name (meal: ,meal)))
   . bagrel≡ . party))

(define (drinks drink name party)
  ((bagexp
    (person: ,name (age: ,(free)))
    (drinks: ,name (drink: ,drink)))
   . bagrel≡ . party))

(define (friends name-x name-y party)
  ((bagexp
    (friends: ,name-x ,name-y))
   . bagrel≡ . party))

(define (each fact party)
  ((bagexp ,fact) . bagrel≡ . party))

(define party
  (bagexp
   (person: "John" (age: 27))
   (person: "Mary" (age: 25))
   (person: "Phil" (age: 27))
   (person: "Nina" (age: 27))
   (person: "Neal" (age: 25))
   (person: "Faye" (age: 28))
   (person: "Joel" (age: 25))
   (person: "Mona" (age: 25))
    
   (eats: (meal: "Beef") "John")
   (eats: (meal: "Beef") "Mary")
   (eats: (meal: "Fish") "Neal")
   (eats: (meal: "Stew") "Phil")
   (eats: (meal: "Stew") "Nina")
   (eats: (meal: "Lamb") "Mona")
   (eats: (meal: "Soup") "Joel")
   (eats: (meal: "Fish") "Faye")
   
   (friends: "Mary" "Joel")
   (friends: "John" "Mary")
   (friends: "Nina" "Joel")
   (friends: "Faye" "John")
   (friends: "Mary" "Faye")
   (friends: "Joel" "Mona")
   
   (drinks: (drink: "Cola") "Mary")
   (drinks: (drink: "Aqua") "Mary")
   (drinks: (drink: "Cola") "Neal")
   (drinks: (drink: "Beer") "Joel")
   (drinks: (drink: "Wine") "Mona")
   (drinks: (drink: "Aqua") "John")
   (drinks: (drink: "Aqua") "Nina")
   (drinks: (drink: "Aqua") "Faye")))

(define solns
  (with-vars (name age meal drink)
    ; conjunction
    (bagrel*
     (eats   meal  name party)
     (drinks drink name party)
     
     ; disjunction
     (bagrel+
      (drinks drink "Mary" party)
      (drinks drink "Joel" party))
     
     (bagrel+
      (friends name "Mary" party)
      (friends name "Joel" party))
     
     ; contradiction of first relation if any of the rest succeed
     (bagrel-
      (person age name party)
      (name . bagrel≡ . "Mary")
      (name . bagrel≡ . "Joel")))))

(time
 (for ([soln (in-producer solns (void))])
   (displayln soln)))
;=>
#hash((#<drink:> . Aqua) (#<age:> . 27) (#<name:> . Nina) (#<meal:> . Stew))
#hash((#<drink:> . Aqua) (#<age:> . 28) (#<name:> . Faye) (#<meal:> . Fish))
#hash((#<drink:> . Aqua) (#<age:> . 27) (#<name:> . John) (#<meal:> . Beef))
cpu time: 1656 real time: 2884 gc time: 187

Edit:
Hey! Speed-up by half by figuring out how to toggle between the "exact" and "eager" modes of the combinatorial enumeration of the target and source bags.

The match is eager if the target is "smaller than" the source, but exact if they have the same arity.

;=>
#hash((#<name:> . Nina) (#<meal:> . Stew) (#<drink:> . Aqua) (#<test:> . #<bag[2]: (Stew . 1) (is-yummy . 1)>) (#<age:> . 27))
#hash((#<name:> . Faye) (#<meal:> . Fish) (#<drink:> . Aqua) (#<test:> . #<bag[2]: (Fish . 1) (is-yummy . 1)>) (#<age:> . 28))
#hash((#<name:> . John) (#<meal:> . Beef) (#<drink:> . Aqua) (#<test:> . #<bag[2]: (Beef . 1) (is-yummy . 1)>) (#<age:> . 27))
cpu time: 890 real time: 1295 gc time: 46
1 Like

Here's how I'd solve the original unordered flattened tree equality problem, not accounting for that infinite multiplicities wrinkle you mentioned:

(define (in-tree t)
  (match t
    ['() (stream)]
    [(cons a t*) (stream-append (in-tree a) (in-tree t*))]
    [_ (stream t)]))

(equal? (sequence->multiset (in-tree x))
        (sequence->multiset (in-tree y)))

Pattern matching on multisets is interesting. I'd be open to adding that feature to Rebellion proper.

1 Like

Hey, @notjack, in the flesh!

Yeah, this thread kind of got away from me; should have started a new one once I realized I was barking up the wrong branch of the tree, haha. Alas, here we are.

Thanks for that little quip, I will steal it for safe-keeping.

It's been enlightening, though. I assumed, looking at the sources for rebellion, that you had been rightly conservative in (not) implementing pattern matching on multisets because they present the problem of multi-patterns...

From my admittedly poor investigation, it basically forces you to use some form of constraint programming to tame those beasties.[0]

Luckily, though, the issue of matching bags (or multisets) is reasonably straightforward, being a combinatorial problem. Interesting bit about that:

If you have two multisets you want to unify, or match, and both of them happen to be "abstract" (what I call things which contain variables), you have two perspectives which need to be reconciled.

What I mean, is that when matching from a "target" (a pattern) to a "source" (a value), the multiplicities of the target need to "fit under" the multiplicities for the source: we can't have a variable from the target "be under" more than one group of terms from the source, because then our variables might be two things at once.

; this makes sense
source := XXX YY
target := aab cc

; this, not so much: `c` ≠ `X` and `Y` simultaneously
source := XXX YY
target := aac bc

And obviously this has to happen from both the source and the target's perspectives if both contain variables.[1]

Regarding the infinite multiplicities: I don't know if I have come up with a "consistent" way of dealing with them, but basically, I consider them to be their own thing, in that only "one" infinity can match to any other infinity (in full; I do allow for partial matches on finite things, but I digress), although there is probably more fun to be had by expanding that definition. My latest rework of the code has neglected them somewhat in favour of getting the basic semantics working again.

Lol, I even (ab)use complex numbers for the "arity" of the sets, so that 100 means a multiset of 100 things, but 100+2i means a multiset of 100 plus two infinities of things.

I am planning, but have not written any code yet, to be able to "abstract" the multiplicities of the multisets to allow for variable multiplicities by allowing the multiplicities to be complex numbers.

My idea is that something like N = 32-4i, where N is the multiplicity of some term, is essentially a "linear equation" (we can "solve" for i) and the complex numbers behave how you would expect under linear transformations (multiplying by and adding integers to them). Not very fleshed out yet, but it is simmering.

I would love to see what your perspective on the matter would be.


Edit:
Another interesting problem, when dealing with constraints programming, is when to realize an abstract object, i.e., substitute variables in a bag.

The way I currently do this, is via a control operator, like, bagrel≡ or bagrel*, called bagrel$, which explicitly substitutes any known variables in bags contained by the solution at that point.

Ideally you would do this automatically, and as soon as the variable became bound to a concrete value, but my code is not yet organized well-enough to find an obvious place for this to occur, and wastes a lot of time checking concrete bags in the process.

I wanted to divorce the bags themselves from having to keep track of whether they contain variables, for example, but perhaps it might be necessary in the interest of better heuristics to devise such a scheme.


Edit:
[0] I forget that my implicit assumption is to allow duplicate pattern variables. It becomes easier (but much less useful in my opinion), if this is disallowed.

[1] I realize this makes it sound as if the relationship is "conjunction", but it is in fact "disjunction": resolving both perspectives does not mean at the same time, just at all. I should proofread more.

Just my 2 cents: This "unordered list equality" problem makes me think of the algorithm course i had with Prof. Emmanuel Kounalis and a short piece of code based on this course i rewrite a few year ago (but i'm perheaps out of topics, i apologize i did not take time to read and understand all the thread...):

;; set difference aA - B
;;  (set-difference '(1 2 4 5 6) '(4 5 6 2 8)) -> '(1)
(define (set-difference aA B)
  (cond ((null? aA)
	 '())
	((member (first aA) B) ;; a C-? B      C- means "include"
	 (set-difference (rest aA) B)) ;; set-difference(A,B)
	(else
	 (cons (first aA) (set-difference (rest aA) B))))) ;; { a ,  set-difference(A,B) }

A few notes: set-difference is not a good name, difference is a french word ,i should have used substraction perheaps...

I suppose unordered lists as equivalent to sets.

Equality of set A and B, should be tested with A - B = B - A, hope i'm not wrong :scream: .

I suppose tree are sets containing sets and then the same algorithm could be implemented by diving in the subsets recursively but i did not write the code...

About time complexity this should be exponential (in the NP class) but i have not think about that too much today... so i could be wrong...

(The goal of the course was first to wrote simple algorithm that works and after more complex one, more fast. This one was probably from the beginning of the course... :grinning: )

Hi, @damien_mattei.

Indeed, you have the basic idea. As you say, the original question is "merely" an extension of this idea to trees.

Interesting that you mention the use of "difference" in this context, versus the usage of "subtract". It is true, if you asked, "show me the difference between two sets," the "direction" of the relation would be somewhat ambiguous.

I read a piece recently which spoke about the relationship between words in English of Anglo-Saxon origin and those from the Normans. It doesn't apply in this case (I don't think?), both words originating from Latin, but I thought it was interesting.

1 Like