Giving a pair with cycles to a procedure that recurs on the car and the cdr of the pair gives problems if the procedure does not check against cycles. I found no predicate for lists without cycles. Predicate list? of racket/base does not discriminate between lists without cycles and those with cycles. Predicates proper-list? and circular-list? of SRFI/1 do not recognize all lists containing cycles (in accordance with docs). So I made my own predicate, but wonder whether or not such a predicate already is available. Is it? Below some code that explains my question. My predicate recurs on the car and cdr of pairs only, but can easily be extended such as to recur on vectors, hashes etc.
#lang racket
; atom-or-non-cyclic-pair?
; Predicate that returns true for non-pairs and for pairs without cycles.
(define (atom-or-non-cyclic-pair? lst)
(define (atom-or-non-cyclic-pair? hset x)
(cond
((pair? x)
(cond
((set-member? hset x) #f)
(else
(define hs (set-add hset x))
(and
(atom-or-non-cyclic-pair? hs (car x))
(atom-or-non-cyclic-pair? hs (cdr x))))))
(else #t)))
(atom-or-non-cyclic-pair? (seteq) lst))
(define xy
(let*
((ph1 (make-placeholder #f))
(ph2 (make-placeholder #f))
(x (cons 1 ph1))
(y (cons 2 ph2)))
(placeholder-set! ph1 y)
(placeholder-set! ph2 x)
(make-reader-graph (list x y))))
(require SRFI/1)
(list? xy) ; -> #t in accorcance with the docs.
(proper-list? xy) ; -> #t in accorcance with the docs.
(circular-list? xy) ; -> #f in accorcance with the docs.
(atom-or-non-cyclic-pair? xy) ; -> #f
(define (limited-flatten x n)
(cond
((zero? n) '())
((null? x) '())
((pair? x)
(append
(limited-flatten (car x) (sub1 n))
(limited-flatten (cdr x) (sub1 n))))
(else (list x))))
(limited-flatten xy 20)
(flatten xy) ; infinite loop
Do look up the tortoise-and-hare algorithm. It doesn’t allocate anything, and Chez uses it for cycle detection and so does Racket. But it escapes me whether this functionality is exposed.
Tortoise and hare work for lists but not for trees I think. See my example. Predicate proper-list?, which uses the tortoise and hare does not detect the cycle in my example.
I think you should (must?) just treat this as cycle detection in a directed graph (with outdegree 2). The straightforward way to do it takes time and space O(V), where V is the number of vertexes in the graph.
#lang racket
(require racket/generator)
(define (not-a-cyclic-pair? tree)
; Consider a pair to be the node of a tree or subtree.
; Consider everything else than a pair to be a leaf.
; A traverser recursively traverses a tree.
; It yields every node.
; It ignores leaves.
(define (make-traverser)
(generator ()
(let loop ((tree tree))
(when (pair? tree) (yield tree) (loop (car tree)) (loop (cdr tree))))))
(define traverse-tree1 (make-traverser))
(define traverse-tree2 (make-traverser))
(define (done?) (eq? (generator-state traverse-tree1) 'done))
(define (not-a-cyclic-pair? x lag)
(or (done?)
(let ((x (traverse-tree1)))
(or (done?)
(let ((x (traverse-tree1))
(lag (traverse-tree2)))
(and (not (eq? x lag)) (not-a-cyclic-pair? x lag)))))))
(not-a-cyclic-pair? (traverse-tree1) (traverse-tree2)))
(define xy ; This is a cyclic pair.
(let*
((ph1 (make-placeholder #f))
(ph2 (make-placeholder #f))
(x (cons 1 ph1))
(y (cons 2 ph2)))
(placeholder-set! ph1 y)
(placeholder-set! ph2 x)
(make-reader-graph (list x y))))
(not-a-cyclic-pair? xy) ; -> #f
; The following pair contains copies of part of itself, but is not cyclic.
; A pair containing #n= and #n#, but such that #n# is not part of #n=,
; is not cyclic.
(not-a-cyclic-pair? (read (open-input-string "(#0=(1 #1=(2 3)) #0# #1#)"))) ; -> #t
(not-a-cyclic-pair? (read (open-input-string "((1 #1=(2 #1#)))"))) ; -> #f
(not-a-cyclic-pair? (cons 1 2)) ; -> #t
Hi sorawee
The point is I wanted to avoid the mutable-seteq.
I already had the solution with mutable-seteq.
Can also be done with an immutable seteq as argument in the named let.
Jos
Oh dang, this is embarrassing… my algorithm was wrong!
And the one you had in the first post is inefficient.
Let’s talk about inefficiency first. Consider:
#lang racket
(define (atom-or-non-cyclic-pair? lst)
(define (atom-or-non-cyclic-pair? hset x)
(cond
((pair? x)
(cond
((set-member? hset x) #f)
(else
(define hs (set-add hset x))
(and
(atom-or-non-cyclic-pair? hs (car x))
(atom-or-non-cyclic-pair? hs (cdr x))))))
(else #t)))
(atom-or-non-cyclic-pair? (seteq) lst))
(define g
(let loop ([left '()] [right '()] [n 25]) ; or 50 or any higher number
(case n
[(0) (cons left right)]
[else (loop (cons left right) (cons left right) (sub1 n))])))
(time (atom-or-non-cyclic-pair? g))
;; cpu time: 18113 real time: 17710 gc time: 1509
;; #t
This takes ~20s on my computer, and it’s exponentially worse as n grows bigger.
DFS can be used to detect cycles, but I used DFS to do something else (that is not detecting cycles).
Here’s a corrected algorithm:
#lang racket
(define (has-cycle? g)
(define visited (make-hasheq))
(let visit! ([g g])
(match g
[(cons a b)
(case (hash-ref visited g 'unvisited)
[(unvisited)
(hash-set! visited g 'visiting)
(begin0 (or (visit! a) (visit! b))
(hash-set! visited g 'done))]
[(visiting) #t]
[(done) #f])]
[_ #f])))
(define g
(let loop ([left '()] [right '()] [n 10000])
(case n
[(0) (cons left right)]
[else (loop (cons left right) (cons left right) (sub1 n))])))
(time (has-cycle? g))
(shared ([x (cons 1 y)]
[y (cons x 2)])
(has-cycle? x)) ;=> #t
I’m not sure why you want to avoid mutable-seteq (and probably make-hasheq as well?). I think this is one of a few places where mutation is a right tool for the job, so if it’s just for the sake of purity, I think that’s not worth it. That being said, if you really want to use hasheq instead of make-hasheq, you can. Search for “store-passing style” on the internet.
I was trying to avoid sets and hashes altogether such as to reduce time and memory and was advised to try tortoise and hare. I don't think the real cycle detection can be done without a set or a hash or anything alike.
About efficiency:
Efficiency is determined mainly by the algorithm and in the second place by its implementation.
The code you show has exactly the same algorithm as that in my first post. Are you sure you compare to my code in my first post? Your implementation has a mutable hash. My implementation of the very same algorithm uses immutable sets. Immutable sets may be somewhat slower than mutable ones, but not very much, I hope.
The code with generators clearly is much slower, but is based on another (and for my purpose incorrect) algorithm. Not only the generators make it slower. It also traverses the list casi parallelly twice.
The code you show has exactly the same algorithm as that in my first post. Are you sure you compare to my code in my first post?
In my previous email, I showed two programs. The first program (atom-or-non-cyclic-pair?) is completely yours, and I demonstrated that it took ~20 seconds to run when n = 25.
The second program (has-cycle?) is mine, and I demonstrated that it took 10 milliseconds to run when n = 10000. The algorithm is different: each vertex is visited only once, whereas for your algorithm, vertices could be visited exponentially many times in n.
(define (atom-or-non-cyclic-pair? lst)
(define hset (mutable-seteq))
(define (atom-or-non-cyclic-pair? x)
(cond
((pair? x)
(cond
((set-member? hset x) #f)
(else
(set-add! hset x)
(and
(atom-or-non-cyclic-pair? (car x))
(atom-or-non-cyclic-pair? (cdr x))))))
(else #t)))
(atom-or-non-cyclic-pair? lst))
(define g
(let loop ([left '()] [right '()] [n 10000]) ; or 50 or any higher number
(case n
[(0) (cons left right)]
[else (loop (cons left right) (cons left right) (sub1 n))])))
(time (atom-or-non-cyclic-pair? g)) ; cpu time: 0 real time: 3 gc time: 0
´´´´
I am confused that an immutable set takes so much more time.
Do you know why?
Jos
(1) atom-or-non-cyclic-pair? from your last email (which uses mutable-seteq) is incorrect. g has no cycle, so atom-or-non-cyclic-pair? should return #t, but it returns #f. This is basically the same mistake that I made in my first email.
(2) It changes a lot because hset is always growing over time when you use a mutable set. When you use an immutable set, the set would “shrink” when you backtrack. (“shrink” in a sense that additions in recursive calls will be ignored, since immutable sets never modify existing sets.)
For example, consider: (cons (cons 1 2) (cons 3 4))
Let A = (cons (cons 1 2) (cons 3 4))
Let B = (cons 1 2)
Let C = (cons 3 4)
In the version that uses immutable set:
(atom-or-non-cyclic-pair? (seteq) (cons (cons 1 2) (cons 3 4)))
> (atom-or-non-cyclic-pair? (seteq A) (cons 1 2))
> > (atom-or-non-cyclic-pair? (seteq A B) 1)
backtrack to (atom-or-non-cyclic-pair? (seteq A) (cons 1 2))
> > (atom-or-non-cyclic-pair? (seteq A B) 2)
backtrack to (atom-or-non-cyclic-pair? (seteq A) (cons 1 2))
backtrack to (atom-or-non-cyclic-pair? (seteq) (cons (cons 1 2) (cons 3 4)))
> (atom-or-non-cyclic-pair? (seteq A) (cons 3 4))
> > (atom-or-non-cyclic-pair? (seteq A C) 3)
backtrack to (atom-or-non-cyclic-pair? (seteq A) (cons 3 4))
> > (atom-or-non-cyclic-pair? (seteq A C) 4)
backtrack to (atom-or-non-cyclic-pair? (seteq A) (cons 3 4))
backtrack to (atom-or-non-cyclic-pair? (seteq) (cons (cons 1 2) (cons 3 4)))
backtrack to top-level call
In the version that uses mutable set:
(atom-or-non-cyclic-pair? (cons (cons 1 2) (cons 3 4))) with hset = (mutable-seteq)
> (atom-or-non-cyclic-pair? (cons 1 2)) with hset = (mutable-seteq A)
> > (atom-or-non-cyclic-pair? 1) with hset = (mutable-seteq A B)
backtrack to (atom-or-non-cyclic-pair? (cons 1 2)) with hset = (mutable-seteq A B)
> > (atom-or-non-cyclic-pair? 2) with hset = (mutable-seteq A B)
backtrack to (atom-or-non-cyclic-pair? (cons 1 2)) with hset = (mutable-seteq A B)
backtrack to (atom-or-non-cyclic-pair? (cons (cons 1 2) (cons 3 4))) with hset = (mutable-seteq A B)
> (atom-or-non-cyclic-pair? (cons 3 4)) with hset = (mutable-seteq A B)
> > (atom-or-non-cyclic-pair? 3) with hset = (mutable-seteq A B C)
backtrack to (atom-or-non-cyclic-pair? (cons 3 4)) with hset = (mutable-seteq A B C)
> > (atom-or-non-cyclic-pair? 4) with hset = (mutable-seteq A B C)
backtrack to (atom-or-non-cyclic-pair? (cons 3 4)) with hset = (mutable-seteq A B C)
backtrack to (atom-or-non-cyclic-pair? (cons (cons 1 2) (cons 3 4))) with hset = (mutable-seteq A B C)
backtrack to top-level call with hset = (mutable-seteq A B C)
Most introduction to algorithms textbooks should cover graph search algorithms, so I highly suggest you find one to read if you wish to understand these algorithms.
This mutable-set version is wrong. It returns false for the timed (atom-or-non-cyclic-pair? g) call at the end, but g is non-cyclic.
The mutable-hash version that sorawee posted (the corrected version) uses three states. A node either
has not been visited yet
has been visited, and we are still exploring the subgraph reachable from that node
has been visited, and we have verified that the subgraph reachable from that node is non-cyclic.
In presentations on cycle detection in graphs, these states are often mapped to the colors white, grey, and black.
The three-state algorithm is faster than the two-state algorithm. When you hit a verified non-cyclic subtree, you can skip re-traversing it. The two-state algorithm loses the distinction between "not yet visited" and "visited and verified non-cyclic".
In Racket, it is natural to implement the three-state algorithm using a mutable hash and to implement the two-state algorithm with an immutable set. But you could implement the two-state algorithm with a mutable set/hash by uncoloring nodes when you're done with them, and you could implement the three-state algorithm with an immutable hash (or two immutable sets) by writing the code in store-passing style, as sorawee pointed out. But the fundamental difference is between three states and two states.