Regex at Windows’ command line

Regular expressions are patterns that match strings in text files or text streams. The set of all strings that a particular expression can match is a regular language. Every regular language is recognized by at least one finite automaton, which is a very simple state machine.
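To make the link between a regular language and a state machine concrete, here is a minimal sketch in Python. The pattern 7*3 (any number of 7s followed by one 3), the state names, and the accepts helper are all made up for illustration; they are not taken from any tool mentioned in this post.

```python
# Hypothetical finite automaton for the language of the pattern 7*3:
# zero or more 7s followed by exactly one 3.
TRANSITIONS = {
    ("start", "7"): "start",   # keep looping while we read 7s
    ("start", "3"): "accept",  # a 3 moves us to the accepting state
}
ACCEPTING = {"accept"}


def accepts(text: str) -> bool:
    """Return True if the automaton ends in an accepting state."""
    state = "start"
    for ch in text:
        state = TRANSITIONS.get((state, ch))
        if state is None:  # no transition defined: reject
            return False
    return state in ACCEPTING


if __name__ == "__main__":
    for sample in ["3", "7773", "737", ""]:
        print(repr(sample), accepts(sample))
```

The automaton has no memory besides its current state, which is what makes it so simple: two states and two transitions are enough for this language.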

To match regular languages we need only three operations – Les trois Mousquetaires de Regex:

  1. Kleene star, i.e. * – repeat the preceding pattern zero or more times
  2. Concatenation – match two consecutive patterns, e.g. 73 matches 7 followed by 3
  3. Alternation – match exactly one out of a list of patterns, i.e. logical OR

Precedence follows the order of the list above, from highest to lowest. Sometimes that’s not enough, and thus we also need D’Artagnan, the fourth musketeer. I’m talking about parentheses for grouping – just like in most programming languages.
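The three operations, their precedence, and grouping can be tried out in a few lines. This is only a sketch using Python's re module, whose syntax is not identical to egrep's, although these four operators are written the same way for the patterns shown. The matches helper is a hypothetical name introduced for the example.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """True if the whole of text belongs to the language of pattern."""
    return re.fullmatch(pattern, text) is not None


CASES = [
    ("7*", "777"),      # Kleene star: zero or more 7s
    ("7*", ""),         # ... including zero of them
    ("73", "73"),       # concatenation: 7 followed by 3
    ("7|3", "3"),       # alternation: either 7 or 3
    ("73*", "7333"),    # star binds tighter than concatenation: 7(3*)
    ("73*", "7373"),    # ... so a repeated 73 is not in the language
    ("(73)*", "7373"),  # grouping forces the star onto the whole 73
    ("73|5", "75"),     # concatenation binds tighter than alternation: (73)|5
    ("7(3|5)", "75"),   # grouping forces the alternation to happen first
]

if __name__ == "__main__":
    for pattern, text in CASES:
        print(f"{pattern!r:10} {text!r:8} {matches(pattern, text)}")
```

The pairs 73* versus (73)* and 73|5 versus 7(3|5) show why the parentheses are needed: without them there is no way to override the default precedence.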

The Unix command-line tool egrep passes the regular expression litmus test. It supports concatenation, Kleene star, alternation, and grouping.

What about Windows then?

There’s a built-in command-line tool in Windows called findstr. It supports Kleene star and concatenation. That’s two of the four operations we need to match regular languages. Unlike egrep, it lacks alternation and forced precedence with parentheses – and thus you can’t use findstr for full regular expressions.

So what do you do if you want to write regular expressions at the command line in Windows? There are at least two alternatives:

  • UnxUtils are native Win32 ports of common GNU utilities. Native means that the executables only depend on the Microsoft C-runtime (msvcrt.dll) and not an emulation layer. One of the executables is egrep.
  • Cygwin is a Linux API emulation layer and a vast number of tools – among them egrep.

D’Artagnan and his three musketeer friends Athos, Porthos, and Aramis lived by the motto tous pour un, un pour tous. If you don’t have grouping and the three mandatory operations, then you can’t write regular expressions.


2 Responses to “Regex at Windows’ command line”


  1. 1 JR 2011-02-3 at 20.24

    Hey – just read your PragPub article.

    http://pragprog.com/magazines/2011-02/de-morgan-to-the-rescue

    Was great. Just wanted to say hi and good job.

    Cheers – JR

  2. 2 Staffan Nöteberg 2011-02-3 at 22.14

    Thanks JR!

    BTW: the translation of your The Agile Samurai into Swedish is now done. It will be released in Sweden during this spring.

