Matching regexp in case statement?

Hello,

As per usual, I'm learning Racket by re-implementing pieces of my blog software. I have 20+ years of blog posts (and tweets, over 11k in total) that have been migrated to Markdown at various points, which means the metadata is not consistent across the board.

Currently I'm trying to handle parsing post dates from the YAML-formatted "frontmatter" on the posts, and hitting a wall trying to select the date format to parse with based on matching with regexp:

(define parse-post-date
  (λ (date-string default)
    (case #t
      [(list? (regexp-match #px"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z" date-string))
       (parse-date "yyyy-MM-dd'T'hh:mm:ss'Z'" date-string)]
      [(list? (regexp-match #px"\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}" date-string))
       (parse-date "yyyy-MM-dd hh:mm" date-string)]
      [else default])))

I can't seem to figure out how to write a condition that evaluates to true when the regexp matches. The match seems to return a list with the matched string, '("2002-12-23"), but I can't figure out how to turn that into something that matches #t.

The two tests:

(module+ test
  (check-eq? (parse-post-date "2002-12-23T00:00:00Z" "") (date 2002 12 23))
  (check-eq? (parse-post-date "2002-12-23 00:00" "") (date 2002 12 23))
  )

[update added test results]

One of the test failures:

FAILURE
name:       check-eq?
location:   posts.rkt:49:2
actual:     ""
expected:   #<date 2002-12-23>

There's a definite non-zero chance I'm going about this all wrong, so any tips would be appreciated.

Thanks!

--Steve

Without testing anything, I think you meant to use cond here, not case.

(define parse-post-date
  (λ (date-string default)
    (cond
      [(list? (regexp-match #px"\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z" date-string))
       (parse-date "yyyy-MM-dd'T'hh:mm:ss'Z'" date-string)]
      [(list? (regexp-match #px"\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}" date-string))
       (parse-date "yyyy-MM-dd hh:mm" date-string)]
      [else default])))

Note that the syntax of case is

(case expr
      [(datum ...) expr]
      ...)

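So your [(list? ...) ...] clause is never evaluated as a test. case treats list? and (regexp-match ...) as quoted data and compares #t against them, so nothing matches and you always fall through to else.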
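To make the difference concrete, here is a minimal sketch (the function names via-case and via-cond are made up for illustration). The same clause is read as a list of literals by case and as a test expression by cond:

```racket
#lang racket/base

;; In case, the clause head is a list of quoted datums, not an expression.
;; Here the datums are the symbols string? and s, so #t matches neither.
(define (via-case s)
  (case #t
    [(string? s) 'string]
    [else 'fell-through]))

;; In cond, the clause head is evaluated as a test.
(define (via-cond s)
  (cond
    [(string? s) 'string]
    [else 'fell-through]))

(via-case "hi") ; => 'fell-through
(via-cond "hi") ; => 'string
```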

Ohhh, interesting. I'm definitely still confused between cond/case/case-lambda... Do I still even need the list? if I'm just trying to see if the regexp matched, or will the matched result evaluate to a non-false value?

To test if the regex matches, I would put (regexp-match …) in a
conditional test like (if <> … …) or (cond [<> …] …), or use
match with a regexp-style pattern (match string… [(regexp … pat…) …] …).
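On the list? question: regexp-match returns either a list of matches or #f, and anything other than #f counts as true in a test, so the list? wrapper should not be needed. A small sketch (the pattern and the helper name date-part are only examples):

```racket
#lang racket/base

(define date-prefix #px"^\\d{4}-\\d{2}-\\d{2}")

(regexp-match date-prefix "2002-12-23 00:00") ; => '("2002-12-23")
(regexp-match date-prefix "not a date")       ; => #f

;; The match result can be used directly as the test. With =>, cond
;; passes the non-#f result to the procedure on the right.
(define (date-part s)
  (cond
    [(regexp-match date-prefix s) => car]
    [else #f]))

(date-part "2002-12-23 00:00") ; => "2002-12-23"

;; regexp-match? returns a plain boolean when the matched text is not needed.
(regexp-match? date-prefix "2002-12-23 00:00") ; => #t
```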

The (case expr [(datum …) body …] …) form is "dispatch to body
when the value of expr is one of the corresponding datum …", where the
datum … are implicitly quoted (so they are effectively literals, like
symbols, strings, numbers, etc.).

The cond form is a multi-armed if, with the addition that each
consequent can have definitions in its body.

The case-lambda form is for multi-arity lambdas.
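A quick sketch of the last two, with made-up names (describe, greet) purely for illustration:

```racket
#lang racket/base

;; cond: each clause body may start with internal definitions.
(define (describe n)
  (cond
    [(negative? n)
     (define magnitude (- n))
     (format "minus ~a" magnitude)]
    [else (format "~a" n)]))

;; case-lambda: one procedure, with a clause per argument count.
(define greet
  (case-lambda
    [() (greet "world")]
    [(name) (string-append "hello, " name)]))

(describe -3)   ; => "minus 3"
(greet)         ; => "hello, world"
(greet "Steve") ; => "hello, Steve"
```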

Thank you @benknoble and @soegaard! I'm definitely closer now.

[Meta-question: if I'm having another issue with this function, can I continue the thread here, or should I start a new one for the new issue?]

[update fixed that bit, this is working now!]

Cheers

More stuff you didn't ask for...

I definitely prefer using 'match' here. Here's a first pass:

(define parse-post-date
  (λ (date-string default)
    (match date-string
      [(regexp #px"^\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}Z$")
       (parse-date "yyyy-MM-dd'T'hh:mm:ss'Z'" date-string)]
      [(regexp #px"^\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}$")
       (parse-date "yyyy-MM-dd hh:mm" date-string)]
      ;; no pattern matched: fall back to the default, as in the cond version
      [_ default])))

(you'll notice I also threw carets and dollar-signs in there because I'm paranoid with my regexps.)

But then, it seems sad to duplicate the information in the regexp and the string passed to parse-date. Why not just grab it from the regexp?

Viz:


(define parse-post-date
  (λ (date-string default)
    (match date-string
      ;; six capture groups: year, month, day, hour, minute, second
      [(regexp #px"^(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2})Z$"
               (list _ y mo d h mi s))
       (my-date-maker y mo d h mi s)]
      ;; five capture groups: no seconds in this format
      [(regexp #px"^(\\d{4})-(\\d{2})-(\\d{2}) (\\d{2}):(\\d{2})$"
               (list _ y mo d h mi))
       (my-date-maker y mo d h mi #f)]
      [_ default])))

All untested code, alas...
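For anyone who wants to run the idea, here is a self-contained sketch. my-date-maker is a hypothetical helper, not part of any library; this version just turns the captured strings into a list of numbers (treating a missing field as 0), where a real one would presumably call your date library's constructor instead.

```racket
#lang racket/base
(require racket/match)

;; Hypothetical helper: the captures arrive as strings, or #f for a
;; field the format lacks. Returns (list year month day hour minute second).
(define (my-date-maker y mo d h mi s)
  (map (λ (field) (if field (string->number field) 0))
       (list y mo d h mi s)))

(define (parse-post-date date-string default)
  (match date-string
    [(regexp #px"^(\\d{4})-(\\d{2})-(\\d{2})T(\\d{2}):(\\d{2}):(\\d{2})Z$"
             (list _ y mo d h mi s))
     (my-date-maker y mo d h mi s)]
    [(regexp #px"^(\\d{4})-(\\d{2})-(\\d{2}) (\\d{2}):(\\d{2})$"
             (list _ y mo d h mi))
     (my-date-maker y mo d h mi #f)]
    [_ default]))

(parse-post-date "2002-12-23T01:02:03Z" #f) ; => '(2002 12 23 1 2 3)
(parse-post-date "2002-12-23 04:05" #f)     ; => '(2002 12 23 4 5 0)
(parse-post-date "yesterday" #f)            ; => #f
```

Note that the month and minute captures need distinct names: a pattern variable that appears twice in one match pattern only matches when both occurrences are equal.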

Hi @jbclements - Good points all, and I really appreciate the feedback on good practices. Mine are overly informed by declarative programming and a lack of Racket experience!

I actually thought of grabbing the clusters from the regex, but your code here is even cleaner than what I'd have come up with. So thanks, I'm learning things!