Break out of the cycle. What's the Racket-way alternative?

Just for fun and for comparison (and, I admit, a bit of publicity :wink:), I'm testing a slightly more concise solution in Scheme+ that I'm sharing with everyone:

#lang reader "../src/SRFI-105.rkt"

(module prefix racket

(require "../Scheme+.rkt")

(define (prefix str)
  (define char* (string->list str))
  (define (vhile char* result) ;; while is already defined in Scheme+
    (condx
      [(empty? char*) (list->string (reverse result))]
      [exec (define c (first char*))]
      [(char-upper-case? c) (list->string (reverse (cons c result)))]
      [(char-lower-case? c) (vhile (rest char*) (cons c result))]
      [else (vhile (rest char*) result)]))

   (vhile (rest char*) (list (first char*))))

(define examples
  '[("alfa"  "alfa")
    ("Alfa"  "Alfa")
    ("DiCAp" "DiC")
    ("BRaVo" "BR")
    ("b"     "b")
    ("B"     "B")])


(for-each (λ (x)
            (display (prefix (first x)))
            (display "   ")
            (display (second x))
            (newline)) examples)

) ; end module

And the result:

Welcome to DrRacket, version 8.11 [cs].
Language: reader "../src/SRFI-105.rkt", with debugging; memory limit: 8192 MB.
SRFI-105 Curly Infix parser with optimization by Damien MATTEI
(based on code from David A. Wheeler and Alan Manuel K. Gloria.)

Options :

Infix optimizer is ON.
Infix optimizer on sliced containers is ON.

Parsed curly infix code result = 

(module prefix racket
  (require "../Scheme+.rkt")
  (define (prefix str)
    (define char* (string->list str))
    (define (vhile char* result)
      (condx
       ((empty? char*) (list->string (reverse result)))
       (exec (define c (first char*)))
       ((char-upper-case? c) (list->string (reverse (cons c result))))
       ((char-lower-case? c) (vhile (rest char*) (cons c result)))
       (else (vhile (rest char*) result))))
    (vhile (rest char*) (list (first char*))))
  (define examples
    '(("alfa" "alfa")
      ("Alfa" "Alfa")
      ("DiCAp" "DiC")
      ("BRaVo" "BR")
      ("b" "b")
      ("B" "B")))
  (for-each
   (λ (x)
     (display (prefix (first x)))
     (display "   ")
     (display (second x))
     (newline))
   examples))

alfa   alfa
Alfa   Alfa
DiC   DiC
BR   BR
b   b
B   B

It uses condx, a cond variant that allows statements to be executed between its clauses (the exec clause above), so there is no longer any need for two nested cond forms; a single condx is enough.
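
For comparison, here is a minimal sketch of the same procedure in plain Racket, without Scheme+. It is only an illustration of the nesting that condx avoids, not code from the Scheme+ distribution; the helper name loop is chosen for this example. Because c can only be defined once the list is known to be non-empty, the definition has to sit inside the else branch, which forces a second cond:

```racket
#lang racket

;; Plain-Racket sketch (no Scheme+): the definition of c must live inside
;; the else branch, so the remaining tests need a second, nested cond.
(define (prefix str)
  (define char* (string->list str))
  (define (loop char* result)
    (cond
      [(empty? char*) (list->string (reverse result))]
      [else
       (define c (first char*))
       (cond
         ;; stop at the first upper-case character, keeping it
         [(char-upper-case? c) (list->string (reverse (cons c result)))]
         ;; keep lower-case characters and continue
         [(char-lower-case? c) (loop (rest char*) (cons c result))]
         ;; skip anything else
         [else (loop (rest char*) result)])]))
  ;; like the original, this assumes a non-empty string
  (loop (rest char*) (list (first char*))))

(prefix "DiCAp")
```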

Another solution in Scheme+ is to use the def form, which, like def in Python, allows returning at any point in the procedure (from the current call, or even from all the recursive calls). Here is that solution:

#lang reader "../src/SRFI-105.rkt"

(module prefix racket

(require "../Scheme+.rkt")

(define (prefix str)
  (define char* (string->list str))
  
  (def (vhile char* result) ;; while is already defined in Scheme+
       (when (empty? char*)
         (return (list->string (reverse result))))

       (define c (first char*))
       (cond [(char-upper-case? c) (list->string (reverse (cons c result)))]
             [(char-lower-case? c) (vhile (rest char*) (cons c result))]
             [else (vhile (rest char*) result)]))

   (vhile (rest char*) (list (first char*))))

(define examples
  '[("alfa"  "alfa")
    ("Alfa"  "Alfa")
    ("DiCAp" "DiC")
    ("BRaVo" "BR")
    ("b"     "b")
    ("B"     "B")])

;;(require rackunit)

(for-each (λ (x)
            (display (prefix (first x)))
            (display "   ")
            (display (second x))
            (newline)) examples)

) ; end module


And the result in the execution window, showing the parsed code and then the final output:

Welcome to DrRacket, version 8.11 [cs].
Language: reader "../src/SRFI-105.rkt", with debugging; memory limit: 8192 MB.
SRFI-105 Curly Infix parser with optimization by Damien MATTEI
(based on code from David A. Wheeler and Alan Manuel K. Gloria.)

Options :

Infix optimizer is ON.
Infix optimizer on sliced containers is ON.

Parsed curly infix code result = 

(module prefix racket
  (require "../Scheme+.rkt")
  (define (prefix str)
    (define char* (string->list str))
    (def
     (vhile char* result)
     (when (empty? char*) (return (list->string (reverse result))))
     (define c (first char*))
     (cond
      ((char-upper-case? c) (list->string (reverse (cons c result))))
      ((char-lower-case? c) (vhile (rest char*) (cons c result)))
      (else (vhile (rest char*) result))))
    (vhile (rest char*) (list (first char*))))
  (define examples
    '(("alfa" "alfa")
      ("Alfa" "Alfa")
      ("DiCAp" "DiC")
      ("BRaVo" "BR")
      ("b" "b")
      ("B" "B")))
  (for-each
   (λ (x)
     (display (prefix (first x)))
     (display "   ")
     (display (second x))
     (newline))
   examples))

alfa   alfa
Alfa   Alfa
DiC   DiC
BR   BR
b   b
B   B
> 
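
For readers without Scheme+, an early return can be approximated in plain Racket with an escape continuation. The sketch below uses let/ec from racket/base and binds the escape procedure to the name return; this is an illustration only, and it makes no claim about how Scheme+ implements def internally. The helper name loop is chosen for this example.

```racket
#lang racket

;; Sketch: emulate an early return with an escape continuation.
;; Calling (return v) immediately makes v the value of the let/ec form.
(define (prefix str)
  (define char* (string->list str))
  (define (loop char* result)
    (let/ec return
      (when (empty? char*)
        (return (list->string (reverse result))))
      ;; only reached when char* is non-empty
      (define c (first char*))
      (cond [(char-upper-case? c) (list->string (reverse (cons c result)))]
            [(char-lower-case? c) (loop (rest char*) (cons c result))]
            [else (loop (rest char*) result)])))
  ;; like the original, this assumes a non-empty string
  (loop (rest char*) (list (first char*))))

(prefix "BRaVo")
```

One trade-off of this sketch: the recursive calls sit inside let/ec, so they are no longer tail calls. For short strings like the examples here that does not matter.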

Is there a regular-expression package for Racket that uses S-expressions
for regular expressions instead of this escaped-character by
escaped-character gibberish?

-- hendrik

scramble has one: https://docs.racket-lang.org/scramble/index.html#%28mod-path._scramble%2Fregexp%29

But why it does not cause an error, I do not really know. I suppose rackunit and check-equal? encapsulate the error, but I'm not sure!
In any case the algorithm is perfectly valid; this is a minor bug.

In the for-each that runs the tests, there’s ~a that converts symbols to strings. There’s no bug.

https://docs.racket-lang.org/parser-tools/Lexers.html#(mod-path._parser-tools%2Flex-sre)

Ah... OK. I had not noticed it. Thanks.

Damien

There's an old port of Alex Shinn's irregex library to Racket but it's a few years behind the current release.

I like the irregex library a lot, especially the fact that it makes Olin Shivers' SREs available... but when I've tried to use it in practice, it turns out to be far, far slower than the built-in regexps. I think that the right solution here is to build a structured front-end for the existing regexp package. Or to make irregex much faster, that would be nifty too. I also have a vague recollection that there was something like this for Rhombus, might have been more of a proof-of-concept? Maybe @usao would know more?

A regexp sublanguage is shown in the Rhombus paper, just to demonstrate how powerful the macro system can be. The same sublanguage is also used as test cases, in rhombus/tests/rx-space.rhm. I think Cooper is working on a more complete version of that.

A structured notation for regexps isn’t anything new, afaik. Emacs Lisp has the rx notation, which in turn is influenced by Scheme Regular Expressions (SRFI 115).

About the rx notation: is there a way to use it the way Emacs does:
(rx " ...." )

instead of #rx" .... "

because I do not know how to modify the SRFI-105.rkt parser I use, which does not support the #rx" ... " notation.

It seems only Racket uses this notation.

I mean, is there a way to use a single string as a regexp:

> (regexp-match #px"^[[:blank:]]*[;]*[[:ascii:]]*$" " (;b")
'(" (;b")
> (regexp-match "^[[:blank:]]*[;]*[[:ascii:]]*$" " (;b")
#f

with the same result, of course?

There are regexp and pregexp constructors. The advantage of the reader syntax is some compile-time checks.

In your example, wrap the string with (pregexp ...).
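
A minimal sketch of that suggestion, reusing the pattern and input from the question above (the name pattern is introduced only for this example). Passing the bare string makes regexp-match compile it with regexp syntax, where the POSIX classes [[:blank:]] and [[:ascii:]] are not recognised, so the match fails; wrapping it in pregexp gives the same behaviour as the #px literal.

```racket
#lang racket/base

(define pattern "^[[:blank:]]*[;]*[[:ascii:]]*$")

;; Bare string: compiled as a regexp, so [[:blank:]] is not a POSIX class.
(regexp-match pattern " (;b")            ; => #f

;; Wrapped in pregexp: equivalent to writing #px"..." in the source.
(regexp-match (pregexp pattern) " (;b")  ; => '(" (;b")
```

This form needs no special reader support, which is why it also works under a reader that rejects #px.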

Great, I hadn't found it in the docs. Perhaps I could even modify the parser now...

> (regexp-match #px"^[[:blank:]]*[;]*[[:ascii:]]*$" " (;b")
Error: SRFI-105 REPL :Unsupported # extension unsupported character causing this message is character:p
. . ../Scheme-PLUS-for-Racket/main/Scheme-PLUS-for-Racket/src/SRFI-105.rkt:136:17: SRFI-105 REPL :Unsupported # extension unsupported character causing this message is character:p
> "^[[:blank:]]*[;]*[[:ascii:]]*$"
"^[[:blank:]]*[;]*[[:ascii:]]*$"
> pregexepg
pregexepg: undefined;
 cannot reference an identifier before its definition
> regexp
#<procedure:regexp>
> pregexp
#<procedure:pregexp>
> (pregexp "^[[:blank:]]*[;]*[[:ascii:]]*$")
#px"^[[:blank:]]*[;]*[[:ascii:]]*$"

I upgraded the SRFI 105 parser for Racket to support Racket's regular-expression notation:

;; Racket's regular expressions special syntax
;; #rx"..." is read as (regexp "...")
((char=? c #\r)
 (if (not (equal? (read-char port) #\x))
     (error "process-sharp : awaiting regexp : character x not found")
     (let ((str (my-read port)))
       (if (not (string? str))
           (error "process-sharp : awaiting regexp : string not found" str)
           (list 'regexp str)))))

;; #px"..." is read as (pregexp "...")
((char=? c #\p)
 (if (not (equal? (read-char port) #\x))
     (error "process-sharp : awaiting pregexp : character x not found")
     (let ((str (my-read port)))
       (if (not (string? str))
           (error "process-sharp : awaiting pregexp : string not found" str)
           (list 'pregexp str)))))
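
The same dispatch can be tried in isolation against a string port. The sketch below is a hypothetical stand-alone version: the function name read-regexp-literal is invented for this example, c is assumed to be the character that followed #, and Racket's built-in read stands in for the parser's my-read.

```racket
#lang racket/base

;; Hypothetical stand-alone version of the #rx / #px dispatch.
;; c is the character read right after '#'; port holds the rest.
(define (read-regexp-literal c port)
  (define-values (ctor what)
    (case c
      [(#\r) (values 'regexp "regexp")]
      [(#\p) (values 'pregexp "pregexp")]
      [else (error "read-regexp-literal : unsupported character" c)]))
  ;; the next character must be x
  (unless (equal? (read-char port) #\x)
    (error (format "awaiting ~a : character x not found" what)))
  ;; built-in read stands in for my-read (an assumption of this sketch)
  (define str (read port))
  (unless (string? str)
    (error (format "awaiting ~a : string not found" what) str))
  ;; produce a constructor call, e.g. (pregexp "...")
  (list ctor str))

(read-regexp-literal #\p (open-input-string "x\"^a+$\""))
; => '(pregexp "^a+$")
```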

Committed in version 7.9: GitHub - damien-mattei/Scheme-PLUS-for-Racket: Scheme+ for Racket by Damien Mattei

Since regexp and pregexp values can be embedded in compiled code, a macro can expand e.g. (rx ".") to '#rx".". One of my packages has rx and px macros that do so. In particular, with #lang at-exp racket, you can write @px{\s} without the extra escaping of #px"\\s".
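
A minimal sketch of how such a macro could look. This is an assumption about the general technique, not the implementation from the package mentioned above: the px macro defined here compiles the string literal at expansion time and embeds the resulting pregexp value as a quoted literal.

```racket
#lang racket/base
(require (for-syntax racket/base))

;; Hypothetical px macro: (px "...") expands to a quoted pregexp literal,
;; so a malformed pattern is reported when the module is expanded.
(define-syntax (px stx)
  (syntax-case stx ()
    [(_ s)
     (string? (syntax-e #'s))          ; accept only string literals
     (with-syntax ([re (pregexp (syntax-e #'s))])
       #''re)]))

(regexp-match (px "^\\s*(\\w+)") "  hello world")
; => '("  hello" "hello")
```

An rx macro would be the same with regexp in place of pregexp. Under #lang at-exp racket, @px{^\s*(\w+)} reads as (px "^\\s*(\\w+)"), so the backslashes need no doubling.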
