Regex Quantifiers Algebra

Regular expressions or parts of a regular expression can be repeated. I specify the number of repetitions with a type of operator called quantifiers. A quantifier has two attributes:

  1. A lower limit for the number of repetitions is represented by a natural number (i.e. a non-negative integer)
  2. An upper limit for the number of repetitions is represented either by a natural number or the empty string. The latter means unlimited.

I write the quantifiers inside a pair of braces, with a comma between the lower and upper limit, for example:

  • Zero or one repetition: {0,1}
  • One, two or three repetitions: {1,3}
  • Fourteen or fifteen repetitions: {14,15}
  • Two or more repetitions: {2,}

The quantifier {1,1} is an identity operator:

  • a{1,1} equals a

Quantifiers are unary, left associative, and has high precedence. Concatenation as well as Alternation have lower precedence. That gives the following rules:

  • Concatination: ab{1,2} equals a(b{1,2})
  • Concatination: a{1,2}b equals (a{1,2})b
  • Alternation: a|b{1,2} equals a|(b{1,2})
  • Alternation: a{1,2}|b equals (a{1,2})|b

I’ve got a lot of syntactic sugar in my regular expression jar. A single natural number in a quantifier represents both the lower and upper limit:

  • {3} equals {3,3}

I may write Kleene closure — zero or more repetitions — as an asterisk without braces:

  • a* equals a{0,}

I may write Positive closure — one or more repetitions — as a plus sign without braces:

  • a+ equals a{1,}

I may write the Optional operation — zero or one repetition — as a question mark without braces:

  • a? equals a{0,1}

More algebra:

  • (a*)* equals a*
  • (a+)+ equals a+
  • (a?)? equals a?
  • (a*)+ equals (a+)* equals a*
  • (a*)? equals (a?)* equals a*
  • (a+)? equals (a?)+ equals a*
  • Kleene closure is Positive closure or nothing: a* equals (a+|)
  • Optional a is a or nothing: a? equals (a|)? equals (a|)
  • Kleene closure of a or nothing is Kleene closure of a: (a|)* equals a*
  • Positive closure of a is a concatenated with Kleene closure of a: a+ equals aa*

Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC


%d bloggers like this: