Using Rule number 2 and Rule number 4, we can create regular expressions that consist of any sequence of symbols from our alphabet. Rule number 2 said that if the symbol *a* is in the alphabet, then `a` is a regular expression. Rule number 4 said that if `p` and `q` are two regular expressions, then the concatenation `pq` is a regular expression as well. The concatenation symbol itself is invisible. Just write the two regular expressions right after each other:

`'moda'[/m/] #=> "m" - we found the substring m in the string "moda"`

`'moda'[/o/] #=> "o"`

`'moda'[/mo/] #=> "mo" - /mo/ is /m/ concatenated with /o/`

`'moda'[/da/] #=> "da"`

`'moda'[/moda/] #=> "moda" - /moda/ is /mo/ concatenated with /da/`

`'moda'[/mado/] #=> nil – no match, since the order was changed`

There are some handy terms we usually use for parts of strings:

**Prefix:** A prefix is the substring we have left if we remove zero or more symbols from the end of a string. The strings *m*, *mo*, *mod*, and *moda* are all prefixes of the string *moda*. Even the empty string ε is a prefix of *moda*.

**Suffix:** A suffix is the substring that is left if we remove zero or more symbols from the beginning of a string. The strings *moda*, *oda*, *da*, *a*, and ε are all suffixes of the string *moda*.

**Substring:** A substring is what we have left if we remove a prefix and a suffix from a string. Note that the prefix and/or the suffix can be ε. Substrings must still be consecutive in the original string. The strings *od* and *moda*, but not *mda*, are substrings of *moda*.
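The three definitions can be checked with a small sketch. The helper names `prefixes`, `suffixes`, and `substrings` are made up for this illustration and are not part of Ruby's standard library:

```ruby
# Hypothetical helpers for illustration only.

# All prefixes: keep the first n symbols, for n = 0..length.
def prefixes(s)
  (0..s.length).map { |n| s[0, n] }
end

# All suffixes: drop the first n symbols, for n = 0..length.
def suffixes(s)
  (0..s.length).map { |n| s[n..-1] }
end

# All substrings: every prefix of every suffix, duplicates removed.
def substrings(s)
  suffixes(s).flat_map { |suffix| prefixes(suffix) }.uniq
end

prefixes('moda')                    #=> ["", "m", "mo", "mod", "moda"]
suffixes('moda')                    #=> ["moda", "oda", "da", "a", ""]
substrings('moda').include?('od')   #=> true
substrings('moda').include?('mda')  #=> false
```

The empty string `""` plays the role of ε here, which is why it shows up in both the prefix and the suffix list.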

For any regular expression `p`, it’s true that `εp = pε = p`, thus we say that the empty string ε is the *identity* under concatenation. There is no *annihilator* under concatenation, i.e., there’s no regular expression `0` so that for any regular expression `p` it holds that `0p = p0 = 0`. Concatenation is not commutative, since `pq` is in general not equal to `qp`, but it’s associative, since for any regular expressions `p`, `q`, and `r` it’s true that `p(qr) = (pq)r`.
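The identity, associativity, and non-commutativity claims can be tried out in Ruby. This is only a sketch: `concat` is a hypothetical helper that joins the sources of two patterns, which is only safe for plain symbol sequences like the ones used so far, and `Regexp.new('')` stands in for ε:

```ruby
# Hypothetical helper: concatenate two regular expressions by
# writing their sources right after each other.
# Assumes simple patterns without flags or operators.
def concat(lhs, rhs)
  Regexp.new(lhs.source + rhs.source)
end

empty = Regexp.new('')  # matches the empty string, our stand-in for ε

# Identity: εp = pε = p
concat(empty, /m/).source  #=> "m"
concat(/m/, empty).source  #=> "m"

# Associativity: p(qr) = (pq)r
concat(/m/, concat(/o/, /d/)).source  #=> "mod"
concat(concat(/m/, /o/), /d/).source  #=> "mod"

# Not commutative: pq and qp match different strings
'moda'[concat(/m/, /o/)]  #=> "mo"
'moda'[concat(/o/, /m/)]  #=> nil
```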

If we think of concatenation as a product, then regular expressions also support exponentiation. We write the exponent enclosed in braces to the right of the regular expression:

`'aaa'[/aaa/] #=> "aaa"`

`'aaa'[/a{3}/] #=> "aaa" – yes, the string includes 3 concatenated a`

`'aaa'[/a{4}/] #=> nil – no, the string doesn't include 4 a`

This is obviously just syntactic sugar. Every regular expression that we can write using the exponentiation operator can also be unfolded into plain concatenation. There are more shortcuts for finite repeated concatenations:

`'aa'[/a?/] #=> "a" – the optional operator written as question mark`

`'b'[/a?/] #=> "" – zero repeats of a matches the empty string`

`'aa'[/a{,2}/] #=> "aa" - at most two a`

`'aa'[/a{1,2}/] #=> "aa" - at least one a and at most two a`

`'a'[/a{1,2}/] #=> "a"`
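To make the "syntactic sugar" point concrete, here is a minimal sketch of unfolding an exponent by hand. The helper `unfold` is hypothetical and assumes the pattern is a plain sequence of symbols, so that repeating its source is the same as repeating the expression:

```ruby
# Hypothetical helper: rewrite pattern{n} as n concatenated copies.
# Assumes the pattern is a plain symbol sequence such as /a/ or /mo/.
def unfold(pattern, n)
  Regexp.new(pattern.source * n)
end

unfold(/a/, 3).source    #=> "aaa"
'aaa'[unfold(/a/, 3)]    #=> "aaa", same result as /a{3}/
'aaa'[unfold(/a/, 4)]    #=> nil,   same result as /a{4}/

# The bounded forms unfold too, using the optional operator:
'aa'[/aa?/]              #=> "aa", same result as /a{1,2}/
'a'[/aa?/]               #=> "a"
'aa'[/a?a?/]             #=> "aa", same result as /a{,2}/
```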

We will soon see that the concatenation of two regular expressions is not the same as the concatenation of two strings. Remember that a regular expression corresponds to a set of strings. For example, if `p = {a, b}` and `q = {c, d}`, then `pq = {ac, ad, bc, bd}`.
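The set view of concatenation can be sketched with plain arrays of strings standing in for the sets. The helper name `concat_sets` is an assumption made for this example:

```ruby
# Hypothetical helper: concatenate two sets of strings by joining
# every string from the first set with every string from the second.
def concat_sets(p_set, q_set)
  p_set.product(q_set).map(&:join).uniq
end

concat_sets(%w[a b], %w[c d])  #=> ["ac", "ad", "bc", "bd"]
concat_sets(%w[a b], [''])     #=> ["a", "b"], the set {ε} acts as the identity
```

Every string of `p` is paired with every string of `q`, so two sets with two strings each give up to four strings in `pq`.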