Making my function more elegant

I'm making a function that calculates whether or not a number is valid. Specifically whether or not an Australian Tax File Number (TFN) is valid (see Clearwater Software for an explanation of what's happening).

Now the TFNs I am receiving are mostly in text format. So I'm getting a string to begin with. In order to perform the maths, I have to cast the string to its individual digits and then map that with a list. I then perform the calculations upon the list of numbers and return a result.

That I as follows:

(define (validate-tfn candidate)
  (define tfn-regex (pregexp "(\\d{3})\\s?(\\d{3})\\s?(\\d{3})"))
  (define magic-weights '(1 4 3 7 5 8 6 9 10))
  (if (regexp-match tfn-regex candidate)
      (zero?
       (remainder 
        (foldr + 0 (map * magic-weights
                        (map string->number
                             (map string
                                  (string->list
                                   (foldr string-append ""
                                          (cdr (regexp-match tfn-regex candidate)))))))) 11))
      #f))

(module+ test
  (test-case "returns #t for a valid AU TFN"
    (check-true (validate-tfn "123456782")))
  (test-case "returns #f for an invalid AU TFN"
    (check-false (validate-tfn "123456789"))))

Which feels a bit unwieldy. So I'm looking for suggestions to make the code more elegant.

For instance:

  • Can I get the value of string->list to be little strings and not chars?
  • Use something other than the cdr of the regexp-match to get the string with no spaces?

I'm sure there are other things I haven't thought of so I'm keen to explore.

Thanks in advance.

1 Like

Without having looked at the details yet, you could use compose to simplify the nested map calls.

1 Like

Also, you can apply + to multiple arguments:

> (+ 1 2 3 4)
10

so you can get rid of the foldr + 0 ....

Edit: Since you have a list of numbers, you'd need (apply + ...).

2 Likes

I would write this:

(define tfn-regex (pregexp "(\\d{3})\\s?(\\d{3})\\s?(\\d{3})"))
(define magic-weights '(1 4 3 7 5 8 6 9 10))
(define 0-code (char->integer #\0))

(define (validate-tfn candidate)
  (match candidate
    [(pregexp tfn-regex (cons _ (app string-append* s)))
     (zero? (remainder
             (for/sum ([c (in-string s)]
                       [w (in-list magic-weights)])
               (* (- (char->integer c) 0-code) w))
             11))]
    [_ #f]))

(module+ test
  (test-case "returns #t for a valid AU TFN"
    (check-true (validate-tfn "123456782")))
  (test-case "returns #f for an invalid AU TFN"
    (check-false (validate-tfn "123456789"))))

Comments:

  • Avoid using string-append in a loop (foldr in this case). Each string-append requires a new allocation of the entire string. It’s better to gather all the strings that you want to append, and then string-append* once, which will allocate only once.
  • You should store the result of (regexp-match tfn-regex candidate) so that you don’t need to compute it several times.
4 Likes

Here's a variant of sorawee's solution, but using cond:

#lang racket

(define tfn-regex (pregexp "(\\d{3})\\s?(\\d{3})\\s?(\\d{3})"))
(define magic-weights '(1 4 3 7 5 8 6 9 10))
(define 0-code (char->integer #\0))
(define (char->digit char)
  (- (char->integer char) 0-code))

(define (valid-tfn? candidate)
  (define match-result (regexp-match tfn-regex candidate))
  (cond
    [match-result
     (define digit-string (string-append* (cdr match-result)))
     (zero? (remainder
             (for/sum ([c (in-string digit-string)]
                       [w (in-list magic-weights)])
               (* (char->digit c) w))
             11))]
    [else #f]))

(module+ test
  (require rackunit)

  (test-case "returns #t for a valid AU TFN"
    (check-true (valid-tfn? "123456782")))
  (test-case "returns #f for an invalid AU TFN"
    (check-false (valid-tfn? "123456789"))))
1 Like

You could also collapse down your conditionals and clarify the main-line logic by pushing infrastructure out of the body:

#lang racket

(define tfn-regex (pregexp "(\\d{3})\\s?(\\d{3})\\s?(\\d{3})"))
(define magic-weights '(1 4 3 7 5 8 6 9 10))
(define 0-code (char->integer #\0))

(define (char->digit char)
  (- (char->integer char) 0-code))

(define (valid-tfn? candidate)
  (match (regexp-match tfn-regex candidate)
    [#f #f]
    [(list _  a b c)
     (zero? (remainder
             (for/sum ([c (in-list (map char->digit (string->list (string-append a b c))))]
		               [w (in-list magic-weights)])
	       (* c w))
             11))]))

(module+ test
  (require rackunit)

  (test-case "returns #t for a valid AU TFN"
    (check-true (valid-tfn? "123456782")))

  (test-case "returns #f for an invalid AU TFN"
    (check-false (valid-tfn? "123456789"))))
1 Like

These are interesting. These are generally designed so common typing errors like transpositions and off by one errors result in an invalid number

Here is mine for UK NHS numbers

1 Like

Quick thought: You could simplify things a bit by using match instead of let+cond. Also, there was a missing require -- you were using racket/base but you didn't include racket/list, so the last function wasn't available to valid-nhs-number?.

#lang racket/base
(require racket/list racket/match)
; nhs number validator                                                                         
;; 10 digits                                                                                   
;; last is check digit                                                                         
(provide calculate-nhs-number-check-digit
         valid-nhs-number?)

(define nums '(10 9 8 7 6 5 4 3 2))
(define (calculate-nhs-number-check-digit list-of-digits)
  (match (- 11 (modulo (foldl (λ (a b r) (+ r (* a b))) 0 list-of-digits nums)
                       11))
    [11  0]
    [10 #f]
    [c   c]))

(define (valid-nhs-number? list-of-digits)
  (and (last list-of-digits)
       (= (calculate-nhs-number-check-digit (take list-of-digits 9))
	      (last list-of-digits))))


(valid-nhs-number? '(6 2 5 7 6 0 0 4 1 3))  ; #t                                               
(valid-nhs-number? '(4 8 5 7 7 7 3 4 5 6))  ; #f                                               

Also, what is the and in valid-nhs-number? there for? It's checking that the last element in the digits list is not #f, and then it's checking that digit against the result of the calculation, but the calculation is based on only the first 9 digits of the list so the comparison might not make sense.

2 Likes

@robertpostill I focused so much on the checksum processing, that it occurred only later to me that there seem to be two problems with the regular expression:

  • It's not anchored with ^ and $, so it matches "abc123456782xyz". More subtly, it'll match "1234567890" (ten digits, matching the first nine) or "1234567 890" (ten digits, matching the last nine).
  • It matches any whitespace, so it matches "123\t456\n782"

So my suggestion would be

(define tfn-regex (pregexp "^(\\d{3}) ?(\\d{3}) ?(\\d{3})$"))

In case you deliberately used \s to also match non-breaking space, thin space and so on, it get's more tricky. It'll depend on where your data is coming from and what you want to allow.

1 Like

My attempt:

#lang racket/base

(define (validate-tfn candidate)
  (define tfn-regex
    (pregexp "^(\\d)(\\d)(\\d) ?(\\d)(\\d)(\\d) ?(\\d)(\\d)(\\d)$"))
  (cond
    [(regexp-match tfn-regex candidate)
     => (λ (matches)
          (define sum
            (for/sum ([digit (cdr matches)]
                      [weight '(1 4 3 7 5 8 6 9 10)])
              (* weight (string->number digit))))
          (zero? (remainder sum 11)))]
    [else #f]))

(module+ test
  (require rackunit)
  (test-case "returns #t for a valid AU TFN"
             (check-true (validate-tfn "123456782")))
  (test-case "returns #f for an invalid AU TFN"
             (check-false (validate-tfn "123456789"))))
5 Likes

Wow, I just wanted to thank everyone for their efforts here. This is exactly the feedback I wanted. Like seeing the match version vs the cond version for example is really helpful. The for/sum form was also very interesting and I'll be looking at that.

I appreciated the performance tips too.

4 Likes