Recursive Regexes

Oniguruma supports recursive regular expressions. They can e.g. be used for matching a generative grammar like the following:

  1. expression -> expression + term
  2. expression -> expression - term
  3. expression -> term
  4. term -> ( expression )
  5. term -> digit

I wrote the following regex that corresponds to the grammar above:

  • /^(?<expression>(?<term>\d|\(\g<expression>\))([-+]\g<expression>)?)$/

Note that \g<expression> is invoked recursively – like rule number 4 in the grammar. I chose to only allow single digit numbers to make the expression crisper as example. This could of course easily be changed to general integers, floats or imaginary numbers.

Here goes some testing:

irb(main):001:0> r = /^(?<expression>(?<term>\d|\(\g<expression>\))([-+]\g<expression>)?)$/
=> /^(?<expression>(?<term>\d|\(\g<expression>\))([-+]\g<expression>)?)$/
irb(main):002:0> "2+4"[r]
=> "2+4"
irb(main):003:0> "2+45"[r]
=> nil
irb(main):004:0> "2-3(4+3)"[r]
=> nil
irb(main):005:0> "2-3(+3)"[r]
=> nil
irb(main):006:0> "2-31(+3)"[r]
=> nil
irb(main):007:0> "2-3+(+3)"[r]
=> nil
irb(main):008:0> "2-3+(4+3)"[r]
=> "2-3+(4+3)"

Expressions that return nil are obviously not correct.

Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC

Advertisement

0 Responses to “Recursive Regexes”



  1. Leave a Comment

Leave a Reply

Fill in your details below or click an icon to log in:

Gravatar
WordPress.com Logo

Please log in to WordPress.com to post a comment to your blog.

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s





Follow

Get every new post delivered to your Inbox.