You might be tempted to read the following regular expression as *third or fifth row*:

`'fifth row'.match /third|fifth row/ #=> #<MatchData "fifth row">`

`'third row'.match /third|fifth row/ #=> #<MatchData "third">`

But unfortunately, as you can see, it’s more like either *third* (only) or else *fifth row*. This is due to something called *order of operations* or *operator precedence*. The invisible operator for concatenation has higher precedence than the alternation operator `|`, so the characters of `fifth row` are joined into a single alternative before the `|` is applied.
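One way to convince yourself of this grouping is to test each half of the alternation on its own. The following is a minimal sketch; the constant name `UNGROUPED` is only chosen for illustration.

```ruby
UNGROUPED = /third|fifth row/

# The left alternative is just "third", so it matches without " row".
p 'third'.match?(UNGROUPED)      # true

# The right alternative is the whole of "fifth row", so "fifth" alone fails.
p 'fifth'.match?(UNGROUPED)      # false
p 'fifth row'.match?(UNGROUPED)  # true
```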

To get around this, we now add parentheses to our set of three operators. In a regular expression, the subexpression enclosed in parentheses gets the highest priority:

`'fifth row'.match /(third|fifth) row/ #=> #<MatchData "fifth row">`

`'third row'.match /(third|fifth) row/ #=> #<MatchData "third row">`

Note that the parentheses are meta-characters, not literals. They won’t match anything in the subject string. And of course it’s possible to nest parentheses:

`'third row'.match /(third|(four|fif)th) row/ #=> #<MatchData "third row">`

`'fourth row'.match /(third|(four|fif)th) row/ #=> #<MatchData "fourth row">`

`'fifth row'.match /(third|(four|fif)th) row/ #=> #<MatchData "fifth row">`
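To check that the nested pattern accepts exactly the same rows as a flat alternation would, the two can be compared side by side. This is only a sketch: the constant names and the extra subjects `'sixth row'` and `'fifth'` are made up for the comparison.

```ruby
NESTED = /(third|(four|fif)th) row/
FLAT   = /(third|fourth|fifth) row/

SUBJECTS = ['third row', 'fourth row', 'fifth row', 'sixth row', 'fifth'].freeze

SUBJECTS.each do |subject|
  nested = NESTED.match(subject)
  flat   = FLAT.match(subject)
  # MatchData#[] with index 0 is the whole matched text; nil means no match.
  nested_text = nested && nested[0]
  flat_text   = flat && flat[0]
  puts "#{subject.inspect}: nested=#{nested_text.inspect} flat=#{flat_text.inspect}"
end
```

Both patterns should report the same matched text for every subject, and `nil` for the two subjects that are not in the list of rows.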

There are three things we need to remember, to know in what order and with what operands the regular expression engine will execute the operators:

**Operator precedence** is an ordered list that tells you whether one operator should be executed before another operator in a regular expression. Several operators can have the same priority. In mathematics, the terms inside parentheses have the highest priority. Multiplication and division have a lower priority. Addition and subtraction have the lowest. This is why `6+6/(2+1) = 8`.

**Operator position** indicates where the operands are located in relation to the operator. The position can be *prefix*, *infix*, or *postfix*. A prefix operator stands to the left of its operand, like the unary minus sign in `-3`. An infix operator has an operand on each side, as in the addition `1+2`. A postfix operator stands to the right of its operand, like the exclamation point that represents the factorial operator in `5!`.

**Operator associativity** tells us how to group two operators on the same precedence level. An infix operator can be right-associative, left-associative or non-associative. In mathematics, the infix operations addition and subtraction have the same precedence. Since both are left-associative, the following equation holds: `1-2+3 = (1-2)+3 = 2`. Prefix or postfix operators are either associative or non-associative. If they are associative, we start with the operator that is closest to the operand. An operator that is non-associative can’t be chained with other operators of the same precedence.
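The arithmetic examples can be checked directly in Ruby, whose arithmetic operators follow the same conventions. The sketch below is only an illustration: the `EXAMPLES` hash is a made-up name, and Ruby's `**` operator is not discussed above but is included as an example of a right-associative infix operator. Ruby has no postfix factorial operator, so `5!` is left out.

```ruby
EXAMPLES = {
  '6 + 6 / (2 + 1)' => 6 + 6 / (2 + 1), # parentheses, then division, then addition
  '1 - 2 + 3'       => 1 - 2 + 3,       # left-associative: (1 - 2) + 3
  '(1 - 2) + 3'     => (1 - 2) + 3,
  '1 - (2 + 3)'     => 1 - (2 + 3),     # the grouping that is NOT chosen
  '2**3**2'         => 2**3**2,         # right-associative: 2**(3**2)
  '(2**3)**2'       => (2**3)**2,
  '-3'              => -3               # prefix operator, operand to its right
}.freeze

EXAMPLES.each { |expression, value| puts "#{expression} = #{value}" }
```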

Here is the table for the operators we have studied so far. Later on, there’s a complete table of all regex operators.

Operator | Symbol | Precedence | Position | Associativity |
---|---|---|---|---|
Kleene star | `*` | 1 | Postfix | Associative |
Concatenation | N/A | 2 | Infix | Left-associative |
Alternation | `\|` | 3 | Infix | Left-associative |

If you think this is hard to remember, then try to memorize the mnemonic *SCA*. It stands for Star-Concat-Alter, i.e. the order of precedence in regular expressions.
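The Star-Concat-Alter order can be observed with a few small patterns. This is a sketch with made-up patterns and constant names, using `String#[]` with a regular expression to return the matched text.

```ruby
# Star before Concat: in /ab*/ the star applies to "b" only, i.e. a(b*).
STAR_ON_B     = /ab*/
STAR_ON_GROUP = /(ab)*/

p 'abab'[STAR_ON_B]      # "ab"
p 'abab'[STAR_ON_GROUP]  # "abab"

# Concat before Alter: /x|yz*/ reads as x|(y(z*)).
STAR_CONCAT_ALTER = /x|yz*/

p 'x'[STAR_CONCAT_ALTER]     # "x"
p 'yzzz'[STAR_CONCAT_ALTER]  # "yzzz"
p 'xzzz'[STAR_CONCAT_ALTER]  # "x"
```

In the last line the star never reaches `x`: it belongs to `z`, which in turn is concatenated with `y` before the alternation is considered.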
