Lexer: what means "matches no string"?

nothing
A lexer abbreviation that matches no string.

I am not sure if this means never or always.

If "no string" is an empty string, this would match always. In that case I would like to know how to stop endless recursion? Is it possible to create a stateful lexer?

If it matches never, I would like to know why this is useful?

nothing interacts with union (or :or from the SRE library): the branches with nothing are discarded. So one possible use case (though I don’t know if it’s an intended use case) is to prune some branches that you don’t want away, especially within a regexp abstraction defined via define-lex-trans.

Here’s a concrete example.

#lang racket

(require parser-tools/lex
         (prefix-in : parser-tools/lex-sre)
         (for-syntax syntax/parse))

(define-lex-trans option
  (syntax-parser
    [(_ re) #'(:or (:: "some " re) "none")]))

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

((lexer
  [(option "abc") 'matched]
  ["" 'unmatched])
 (open-input-string "some abc")) ;=> matched

((lexer
  [(option "abc") 'matched]
  ["" 'unmatched])
 (open-input-string "some def")) ;=> unmatched

((lexer
  [(option "abc") 'matched]
  ["" 'unmatched])
 (open-input-string "none")) ;=> matched

;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;

((lexer
  [(option nothing) 'matched]
  ["" 'unmatched])
 (open-input-string "none")) ;=> matched

((lexer
  [(option nothing) 'matched]
  ["" 'unmatched])
 (open-input-string "some abc")) ;=> unmatched

The above code defines a regexp abstraction called option. So (option "abc") matches both none and some abc, but not some def.

But now, if I want to match only none, but not any some, I can give nothing as the regexp to option.

1 Like

On the theoretical side: The empty regexp serves as the zero in the algebra of regular languages. If you consider the set of strings (i.e., the regular language) the regexp recognizes, the empty regexp is just the empty set. Concatenation corresponds to Cartesian product with string concatenation, so concatenation with empty always results in empty; union corresponds to set union, so empty is the identity element. In this sense, it is a zero, which is very important (although itself very trivial).

2 Likes