The elegantest way to manipulate part of a list?

When I look at the “remove” functions of lisp, I noticed there are optional arguments :start :end to designate the part of the list you want to manipulate. http://clhs.lisp.se/Body/f_rm_rm.htm

I cannot find an elegant counterpart of :start :end in racket.

I can split a list to multiple sub-lists and change the sub-list, then combine them together, but it doesn't give me a feel of “that's the way”.

Here is a more specific question:

remove duplicates between index [1,6) from (list 0 1 1 3 3 5 5) elegantly (as short as possible?)

In CL:

(delete-duplicates (list 0 1 1 3 3 5 5) :start 1 :end 6) 
; => (0 1 3 5 5)

The dumb method of split & combine

#lang racket
(let ((l (list 0 1 1 3 3 5 5)))
  (append (take l 1)
          (remove-duplicates (drop (take l 6) 1))
          (drop l 6)))

The remove-duplicates can also be sort which also manipulate a list, but the list I want to sort or remove-duplicates is part of a larger list.

What's the optimal way to reuse existing list manipulating functions over part of a larger list?

1 Like

This looks like one of those rare occasions where "batteries included" has some reservations.
That said, if I needed such functionality, I'd probably just go with something like:

(define (remove-duplicates+ lst #:start (start 0) #:end (end #f))
  (for/fold ((rr '())
             (es (set))
             #:result (reverse rr))
            ((e (in-list lst))
             (i (in-naturals)))
    (if (and (>= i start)
             (or (not end)
                 (< i end)))
        (if (set-member? es e)
            (values rr es)
            (values (cons e rr)
                    (set-add es e)))
        (values (cons e rr)
                es))))

Which should do the trick:

> (define lst '(0 1 1 3 3 5 5))
> (remove-duplicates+ lst #:start 1 #:end 6)
'(0 1 3 5 5)
> 

The question is whether such addition (#:start and #:end kwargs) are worth including in stock remove-duplicates. Any thoughts?

2 Likes

It takes me sometime to fully digest your code, it does feel like the racket-y way.

I don't think it is worth, because this will open a time-consuming black hole of adding #:start & #:end to all kinds of functions.


However, I do want to come up a generic macro to apply a general list function to part of a list, like

(apply-to-range func lst start end)
or
(apply-to func lst (in-range start end [step])) ;; the last part are indices of wantted parted
;; the `func` take a (sub)list as argument.

so that one may simply (apply-to-range remove-duplicates '(0 1 1 3 3 5 5) 1 6)

But I think it cannot be efficiently implemented this way without mutable list (split and combine the list is a must)?

No macro needed. Racket is higher-order. (I know CL is but somehow the culture isn’t.)

How about:

#lang racket

(define (only start end f xs . args)
  (define as (take xs start))
  (define bs (drop (take xs end) start))
  (define cs (drop xs end))
  (append as
            (apply f bs args)
            cs))


(only 1 6 remove-duplicates (list 0 1 1 3 3 5 5))
4 Likes

The following uses Lenses, Generic Collections and Lenses for Generic Collections

So you need to install lens collections collections-lens

#lang racket

(require (for-syntax syntax/to-string)
         syntax/parse/define)

(define-syntax-parser report
  [(_ expr:expr) #`(displayln (~a (~a "(" #,(syntax->string #'expr) "): " #:width 75) expr))])


;; you need to install packages: lens collections collections-lens
(require lens/common
         data/collection
         data/collection/lens)

(define seq/remdups (compose1 remove-duplicates sequence->list))

;; alternatively if we want to avoid explicitly converting the sequence to a list
;; we can use foldl from collections package to iterate over sequences (base:fold iterates over lists)
(define (seq/remdups2 seq)
  (define accum
    (foldl (λ (accum v)
                     (match-define (cons set result) accum)
                     (if (set-member? set v)
                         accum
                         (cons (set-add set v)
                               (cons v result))))
           (cons (set) '())
           seq))
  (reverse (cdr accum)))


(define lst '(0 1 1 3 3 5 5))
(report (sequence->list (lens-transform (subsequence-lens 1 6) lst seq/remdups)))
(report (sequence->list (lens-transform (subsequence-lens 1 6) lst seq/remdups2)))

;; if you always want lists:
(define (apply-to-range func lst start end)
  (sequence->list (lens-transform (subsequence-lens start end) lst (compose1 func sequence->list))))

(report (apply-to-range remove-duplicates '(0 1 1 3 3 5 5) 1 6))

(define/contract (apply-to func lst subseq-lens)
  (-> (-> list? list?) list? (lens/c stream? sequence?) list?) ;; adding some contracts
  (sequence->list (lens-transform subseq-lens lst (compose1 func sequence->list))))


;; personally I would continue calling it subsequence-lens
;; to avoid confusion with base:in-range, however depending on context it might make sense to rename it
;; for example when used in a context where it is unlikely to clash with other in-range implementations

;; here I mostly rename it so that it uses the same example syntax as you provided
(define in-range subsequence-lens)
(report (apply-to remove-duplicates '(0 1 1 3 3 5 5) (in-range 1 6)))
(report (apply-to remove-duplicates '(0 1 1 3 3 5 5) (subsequence-lens 1 6)))


;; this is just an example it doesn't support negative steps like base:in-range
(define/contract (step-lens step)
  (-> exact-positive-integer? any/c)
  (if (= 1 step)
      identity-lens
      (make-lens (λ (target)
                   (for/list ([(t index) (in-indexed target)]
                              #:when (= 0 (remainder index step)))
                     t))
                 (λ (target view)
                   (define vec (list->vector view)) ;; to avoid list-ref
                   (define length (vector-length vec))
                   (for/fold ([l '()] #:result (reverse l))
                             ([(t index) (in-indexed target)])
                     (define-values (view-index remainder) (quotient/remainder index step))
                     (if (= 0 remainder)
                         (if (< view-index length)
                             (cons (vector-ref vec view-index)
                                   l)
                             l) ;; if view has less elements drop the step-value from the list
                         (cons t l)))))))

;; supporting dropped values makes it awkward (needed to allow remove-duplicates with step-size)
;; if we don't support dropping values it is easier:
;; (λ (target view)
;;   (define vec (list->vector view)) ;; to avoid list-ref
;;   (define length (vector-length vec))
;;   (for/list ([(t index) (in-indexed target)])
;;     (define-values (view-index remainder) (quotient/remainder index step))
;;     (if (= 0 remainder)
;;         (vector-ref vec view-index)
;;         t)))
;; in general dropping values during lens-transform seems awkward, maybe the glass package tries to solve that?

(report (lens-view (step-lens 3) (list 1 2 3 4 5 6 7 8)))
(report (lens-set  (step-lens 3) (list 1 2 3 4 5 6 7 8) '(a b c)))
(report (lens-set  (step-lens 3) (list 1 2 3 4 5 6 7 8) '(a b)))

(define (in-range/step start end [step 1])
  (lens-thrush (subsequence-lens start end)
               (step-lens step)))

(report (lens-view (in-range/step 1 7 2) '(0 1 1 3 3 3 5 5)))
(report (apply-to remove-duplicates '(0 1 1 3 3 3 5 5) (in-range/step 1 7 2)))

Output:

(sequence->list (lens-transform (subsequence-lens 1 6) lst seq/remdups)):  (0 1 3 5 5)
(sequence->list (lens-transform (subsequence-lens 1 6) lst seq/remdups2)): (0 1 3 5 5)
(apply-to-range remove-duplicates '(0 1 1 3 3 5 5) 1 6):                   (0 1 3 5 5)
(apply-to remove-duplicates '(0 1 1 3 3 5 5) (in-range 1 6)):              (0 1 3 5 5)
(apply-to remove-duplicates '(0 1 1 3 3 5 5) (subsequence-lens 1 6)):      (0 1 3 5 5)
(lens-view (step-lens 3) (list 1 2 3 4 5 6 7 8)):                          (1 4 7)
(lens-set  (step-lens 3) (list 1 2 3 4 5 6 7 8) '(a b c)):                 (a 2 3 b 5 6 c 8)
(lens-set  (step-lens 3) (list 1 2 3 4 5 6 7 8) '(a b)):                   (a 2 3 b 5 6 8)
(lens-view (in-range/step 1 7 2) '(0 1 1 3 3 3 5 5)):                      (1 3 3)
(apply-to remove-duplicates '(0 1 1 3 3 3 5 5) (in-range/step 1 7 2)):     (0 1 1 3 3 5 5)

I used this topic as a prompt to explore lenses a little bit more. (no expert on using them)
I think there are circumstances where using lenses is neat, however I am unsure how well they are optimized by racket / whether using them introduces a considerable performance hit. I haven't tested.

For code that needs to be fast / should have fewer dependencies, I would use one of the other solutions.

That said I think it would be nice if these lenses / generic collections were integrated into the language and there was no overhead (that I speculate is currently there).


The (experimental) Glass: Composable Optics package has 3 Traversals that doesn't allow the number of elements to change during the traversal, that seems reasonable to me, considering the complications above.

It probably would be easier to map elements that are supposed to disappear to a nothing value and then filter that value out in a later step, ideally doing that would be supported by optimizations to not make it more expensive. I haven't looked into all of this very deeply, I might be off on some things.

2 Likes