Regular expressions or parts of a regular expression can be repeated. I specify the number of repetitions with a type of operator called quantifiers. A quantifier has two attributes:
- A lower limit for the number of repetitions is represented by a natural number (i.e. a non-negative integer)
- An upper limit for the number of repetitions is represented either by a natural number or the empty string. The latter means unlimited.
I write the quantifiers inside a pair of braces, with a comma between the lower and upper limit, for example:
- Zero or one repetition: {0,1}
- One, two or three repetitions: {1,3}
- Fourteen or fifteen repetitions: {14,15}
- Two or more repetitions: {2,}
The quantifier {1,1} is an identity operator:
- a{1,1} equals a
Quantifiers are unary, left associative, and has high precedence. Concatenation as well as Alternation have lower precedence. That gives the following rules:
- Concatination: ab{1,2} equals a(b{1,2})
- Concatination: a{1,2}b equals (a{1,2})b
- Alternation: a|b{1,2} equals a|(b{1,2})
- Alternation: a{1,2}|b equals (a{1,2})|b
I’ve got a lot of syntactic sugar in my regular expression jar. A single natural number in a quantifier represents both the lower and upper limit:
- {3} equals {3,3}
I may write Kleene closure — zero or more repetitions — as an asterisk without braces:
- a* equals a{0,}
I may write Positive closure — one or more repetitions — as a plus sign without braces:
- a+ equals a{1,}
I may write the Optional operation — zero or one repetition — as a question mark without braces:
- a? equals a{0,1}
More algebra:
- (a*)* equals a*
- (a+)+ equals a+
- (a?)? equals a?
- (a*)+ equals (a+)* equals a*
- (a*)? equals (a?)* equals a*
- (a+)? equals (a?)+ equals a*
- Kleene closure is Positive closure or nothing: a* equals (a+|)
- Optional a is a or nothing: a? equals (a|)? equals (a|)
- Kleene closure of a or nothing is Kleene closure of a: (a|)* equals a*
- Positive closure of a is a concatenated with Kleene closure of a: a+ equals aa*

0 Responses to “Regex Quantifiers Algebra”