Break out of the cycle. What's the Racket way alternative?

Hi,

Iterating through a string is a piece of cake:

(for ([c "Hello"])
    (display c)
    (newline))

Even in a more sophisticated way:

 (define (list-chars str)
    (for ([c str]
          [i (in-naturals)])
      (printf "~a: ~a\n" i c)))

Almost what I need. What I need actually:

  • Input: a nonempty string of characters without spaces
  • Output: a prefix of the input string defined as follows:
    - First character goes into the prefix regardless of case
    - If the next character is lowercase, it goes to the prefix
    - If the next character is uppercase, it goes to the prefix, and the prefix is done and returned.

It should work like this:
alfa -> alfa
Alfa -> Alfa
DiCAp -> DiC
BRaVo -> BR
b -> b
B -> B
...
No big deal, if you can break out of the cycle. I can't even see how things like drop while may help.

1 Like
#lang racket

#; {String -> String}
(define (prefix str)
  (let/ec return
    (define char* (string->list str))
    (for/fold ([result (~a (first char*))]) ([c (rest char*)])
      (cond
        [(char-upper-case? c) (return (~a result c))]
        [(char-lower-case? c) (~a result c)]
        [else result]))))

(define examples
  '[(alfa  alfa)
    (Alfa  Alfa)
    (DiCAp DiC)
    (BRaVo BR)
    (b     b)
    (B     B)])

(require rackunit)



(for-each (λ (x) (check-equal? (prefix (~a (first x))) (~a (second x)))) examples)

The following is better: (1) it’s functional and doesn’t rely on a control effect and (2) it probably deals with large strings better because string-append (in ~a) can get expensive when strings get long.

#; {String -> String}
(define (prefix str)
  (define char* (string->list str))
  (let while ([char* (rest char*)] [result (list (first char*))])
    (cond
      [(empty? char*) (list->string (reverse result))]
      [else 
       (define c (first char*))
       (cond
         [(char-upper-case? c) (list->string (reverse (cons c result)))]
         [(char-lower-case? c) (while (rest char*) (cons c result))]
         [else #;unspecified: (while (rest char*) result)])])))

(define examples
  '[(alfa  alfa)
    (Alfa  Alfa)
    (DiCAp DiC)
    (BRaVo BR)
    (b     b)
    (B     B)])

(require rackunit)

(for-each (λ (x) (check-equal? (prefix (~a (first x))) (~a (second x)))) examples)
1 Like

Here’s another solution, using for‘s #:final, which is less general than let/ec, but happens to fit the task. I also use cons to accumulate results in constant time.

#; {String -> String}
(define (prefix str)
  (define char* (string->list str))
  (for/fold ([result (list (first char*))]
             #:result (list->string (reverse result)))
            ([c (in-list (rest char*))])
    #:final (char-upper-case? c)
    (cons c result)))

2 Likes

I think I'm going to call this the HtDP approach? You presented it as a state machine, so I wrote it as a state machine with two states, the initial state and the continue state.

;; the initial state
(define (prefix str)
  ;; string is nonempty, so we can definitely advance to character 1
  (prefix-continue str 1))

;; given a string and the position of the first unexamined character,
;; return the prefix that matches the specification.
(define (prefix-continue str posn)
  (cond [(<= (string-length str) posn)
         ;; string is over, return it:
         str]
        [else
         (match (string-ref str posn)
           [(? char-upper-case?) (substring str 0 (add1 posn))] ; stop
           [(? char-lower-case?) (prefix-continue str (add1 posn))] ; continue
           [other (error 'abbrev-continue "unexpected character: ~v" other)])]))

(define examples
  '[(alfa  alfa)
    (Alfa  Alfa)
    (DiCAp DiC)
    (BRaVo BR)
    (b     b)
    (B     B)])

(require rackunit)

(for-each (λ (x) (check-equal? (prefix (~a (first x))) (~a (second x)))) examples)

1 Like

Are regular expressions allowed?

#lang racket/base

(define (make-prefix s)
  (car (regexp-match #px"^.[[:lower:]]*[[:upper:]]?" s)))

(module+ test
  (require rackunit)
  (check-equal? (make-prefix "alfa") "alfa")
  (check-equal? (make-prefix "Alfa") "Alfa")
  (check-equal? (make-prefix "DiCap") "DiC")
  (check-equal? (make-prefix "BRaVo") "BR")
  (check-equal? (make-prefix "b") "b")
  (check-equal? (make-prefix "B") "B"))
1 Like

Everything's allowed! Unfortunately, it won't work (?) with Unicode, but it isn't your fault.

This example shows that there's nothing like the right tool for the task.

Amazing display, well beyond my expectations! Thank you very much!

A lot of material to study. A sad point: even after reading this or that, the Racket Guide including, I discover important bits and pieces I never seen before. Much less tinkered with.

1 Like

What's #;unspecified: ? It has something to do with the Typed Racket? What do you call such decorations?

Google isn't particularly helpful in this case.

It’s an S-expression comment. Here, only unspecified: is commented, but the rest of the code ( (while (rest char*) result) 
 ) is not.

What does unspecified: mean?

You are reading too much into this. It’s not a programming construct. It’s just a “textual comment”, saying that you have not specified what should be done in that case. You wrote:

Input: a nonempty string of characters without spaces
Output: a prefix of the input string defined as follows:

  • First character goes into the prefix regardless of case
  • If the next character is lowercase, it goes to the prefix
  • If the next character is uppercase, it goes to the prefix, and the prefix is done and returned.

But this specification is incomplete. What if the next character is a number -- what should happen? Matthias didn’t know what should be done, so he left a comment saying that he’s implementing that case in a way that makes sense to him, but might not be what you ultimately want if/when you refine your specification. He was trying to be concise by using the S-expression comment there, but if that causes so much confusion, perhaps this might help?

(define (prefix str)
  (define char* (string->list str))
  (let while ([char* (rest char*)] [result (list (first char*))])
    (cond
      [(empty? char*) (list->string (reverse result))]
      [else 
       (define c (first char*))
       (cond
         [(char-upper-case? c) (list->string (reverse (cons c result)))]
         [(char-lower-case? c) (while (rest char*) (cons c result))]
         [else 
          ;; This case is unspecified in the problem statement. 
          ;; Here, we just ignore the character
          (while (rest char*) result)])])))

EDITED: the else case ignores the char, not treating it like lower case.

As an aside to this, I am partial to using the sexpr comments to help me remember non-keyword arguments that might not be obvious at first--a habit which I picked up from reading other people's code:

(and (close-ports) (subprocess-kill sp #;force? #true))
(bytes->string/utf-8 (read-bytes bytes-count out) #;err-char=ïżœ #\uFFFD)
(play-sound path:ding-off #;asyn? #true)

It's such a nice feature, and it has the added benefit of the visual symmtery with #:.

2 Likes

Huh, TIL character classes in regular expressions don't cover the full range of Unicode codepoints. Using properites works, though; changing the regular expression in the above to #px"^.\\p{Ll}*\\p{Lu}?" makes (make-prefix "ÄbĂ©Ä’f") return "ÄbĂ©Ä’".

(At lest, until you get into multi-codepoint extended grapheme clusters like when dealing with combinining characters; Racket's regular expressions don't have an atom to match one of those like perl's \X. Hmm. Maybe I should work on a PCRE2 binding library...)

3 Likes

Fantastic, just the same! Almost unbelievable.

Maybe I should work on a PCRE2 binding library...

Why not improve the Racket implementation?

There are so many things that PCRE regular expression dialect supports that Racket ones don't that it's easier to just use it than trying to reimplement everything I want in a codebase I'm not familiar with.

note that , the input does not seems to be string in above code, an exact code is:

#lang racket
#; {String -> String}
(define (prefix str)
  (define char* (string->list str))
  (let while ([char* (rest char*)] [result (list (first char*))])
    (cond
      [(empty? char*) (list->string (reverse result))]
      [else 
       (define c (first char*))
       (cond
         [(char-upper-case? c) (list->string (reverse (cons c result)))]
         [(char-lower-case? c) (while (rest char*) (cons c result))]
         [else #;unspecified: (while (rest char*) result)])])))

(define examples
  '[("alfa"  "alfa")
    ("Alfa"  "Alfa")
    ("DiCAp" "DiC")
    ("BRaVo" "BR")
    ("b"     "b")
    ("B"     "B")])

;;(require rackunit)

(for-each (λ (x)
            (display (prefix (first x)))
            (display "   ")
            (display (second x))
            (newline)) examples)


result:

Welcome to DrRacket, version 8.11 [cs].
Language: racket, with debugging; memory limit: 8192 MB.
alfa   alfa
Alfa   Alfa
DiC   DiC
BR   BR
b   b
B   B

Will you please point out the difference? Just for convenience.

just that in the code given as solution the input are quoted symbol ,example alfa not string "alfa", try this and you will see it cause an error:

Welcome to DrRacket, version 8.11 [cs].
Language: racket, with debugging; memory limit: 8192 MB.
> (prefix (first (first examples)))
. . string->list: contract violation
  expected: string?
  given: 'alfa

but why it doen not cause an error ,i do not really know , but i suppose rackunit and check-equal? encapsulate the error, but i'm not sure!

but the algorithm is perfectly valid, this is a minor bug.