<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Staffan Nöteberg&#039;s blog</title>
	<atom:link href="http://blog.staffannoteberg.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://blog.staffannoteberg.com</link>
	<description>Brainmoda &#124; Same brain - more astute usage</description>
	<lastBuildDate>Fri, 04 May 2012 06:02:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='blog.staffannoteberg.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Staffan Nöteberg&#039;s blog</title>
		<link>http://blog.staffannoteberg.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://blog.staffannoteberg.com/osd.xml" title="Staffan Nöteberg&#039;s blog" />
	<atom:link rel='hub' href='http://blog.staffannoteberg.com/?pushpress=hub'/>
		<item>
		<title>Interview on Time Management and Future Book Projects</title>
		<link>http://blog.staffannoteberg.com/2012/05/04/interview-on-time-management-and-future-book-projects/</link>
		<comments>http://blog.staffannoteberg.com/2012/05/04/interview-on-time-management-and-future-book-projects/#comments</comments>
		<pubDate>Fri, 04 May 2012 06:02:27 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[Agile]]></category>
		<category><![CDATA[Brain]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Iterative]]></category>
		<category><![CDATA[Kitchen Timer]]></category>
		<category><![CDATA[Memory]]></category>
		<category><![CDATA[Mind]]></category>
		<category><![CDATA[Pomodoro]]></category>
		<category><![CDATA[Pomodoro Technique]]></category>
		<category><![CDATA[Pomodoro Technique Illustrated]]></category>
		<category><![CDATA[Procrastination]]></category>
		<category><![CDATA[Psychology]]></category>
		<category><![CDATA[Sustainable Pace]]></category>
		<category><![CDATA[Tecnica del Pomodoro]]></category>
		<category><![CDATA[Time box]]></category>
		<category><![CDATA[Time Management]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1324</guid>
		<description><![CDATA[Baris: Effectively managing your to-do list is a big part of the Pomodoro Technique. I really like the simplicity of having a super simple list with items grouped as “now”, “today”, “later”. Is the “now list” your invention? Please tell me the thought process behind it. Staffan: I think it’s my invention, even though many [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1324&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><strong>Baris: </strong><em>Effectively managing your to-do list is a big part of the Pomodoro Technique. I really like the simplicity of having a super simple list with items grouped as “now”, “today”, “later”. Is the “now list” your invention? Please tell me the thought process behind it.</em></p>
<p><strong>Staffan: </strong><em>I think it’s my invention, even though many other people most certainly have similar concepts. Even if you decide to focus on just one thing, your thoughts easily start to wander now and then. Writing the title of your current activity on a slip of paper and putting it next to the keyboard reminds you within a fraction of a second what it was.<br />
</em></p>
<p>I was interviewed by Baris Sarer. The full text is here:</p>
<ul>
<li>Part one: <a href="http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodor-technique-part-i/" title="http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodor-technique-part-i/" target="_blank">http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodor-technique-part-i/</a></li>
<li>Part two: <a href="http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodoro-technique-part-ii/" title="http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodoro-technique-part-ii/" target="_blank">http://www.pomodorotime.org/pomodoro-technique-2/staffan-noteborg-interview-on-pomodoro-technique-part-ii/</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/05/04/interview-on-time-management-and-future-book-projects/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>
	</item>
		<item>
		<title>New interview about Time Management by Turing China</title>
		<link>http://blog.staffannoteberg.com/2012/03/13/new-inteview-about-time-management-by-turing-china/</link>
		<comments>http://blog.staffannoteberg.com/2012/03/13/new-inteview-about-time-management-by-turing-china/#comments</comments>
		<pubDate>Tue, 13 Mar 2012 07:42:48 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1320</guid>
		<description><![CDATA[&#8216;I&#8217;ve used &#8220;give it a try for five minutes &#8212; I&#8217;ll start the kitchen timer now&#8221; with children in several situations where I expect that their major impediment is to get started&#8217; Turing China published a long interview with me in Chinese http://bit.ly/yTt4zP and in English http://bit.ly/A4q6MW.<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1320&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><em>&#8216;I&#8217;ve used &#8220;give it a try for five minutes &#8212; I&#8217;ll start the kitchen timer now&#8221; with children in several situations where I expect that their major impediment is to get started&#8217;</em></p>
<p>Turing China published a long interview with me in Chinese <a href="http://bit.ly/yTt4zP" target="tc">http://bit.ly/yTt4zP</a> and in English <a target="tc" href="http://bit.ly/A4q6MW">http://bit.ly/A4q6MW</a>.</p>
<p><a href="http://www.pomodoro-book.com/en" target="pb"><img class="aligncenter size-full wp-image-778" title="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC" src="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif?w=500" alt="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC"   /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/03/13/new-inteview-about-time-management-by-turing-china/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>HTML5 Form Validation With Regex</title>
		<link>http://blog.staffannoteberg.com/2012/03/01/html5-form-validation-with-regex/</link>
		<comments>http://blog.staffannoteberg.com/2012/03/01/html5-form-validation-with-regex/#comments</comments>
		<pubDate>Thu, 01 Mar 2012 21:22:10 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[Howto]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[html5]]></category>
		<category><![CDATA[pattern]]></category>
		<category><![CDATA[rails]]></category>
		<category><![CDATA[validation]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1274</guid>
		<description><![CDATA[Client-side validation has always been a potential headache for front-end programmers. Embedded blocks with a mixture of imperative JavaScript and declarative regex can be a mess. HTML5 aims to add abstraction layers that would make this a bit easier. As I&#8217;ll explain below, there&#8217;s still a long way to go before it&#8217;s rock [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1274&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/03/html5-pattern.jpg?w=500&#038;h=356" alt="" title="HTML5 pattern" width="500" height="356" class="aligncenter size-full wp-image-1313" /></p>
<p>Client-side validation has always been a potential headache for front-end programmers. Embedded blocks with a mixture of imperative JavaScript and declarative regex can be a mess. HTML5 aims to add abstraction layers that would make this a bit easier. As I&#8217;ll explain below, there&#8217;s still a long way to go before it&#8217;s rock solid.</p>
<p>Two ideas enter the scene now:</p>
<ol>
<li>The <code>&lt;input&gt;</code> tag has new <code>type</code> attribute values like <a href="http://www.w3.org/TR/html5/states-of-the-type-attribute.html" target="w3"><code>url</code>, <code>email</code>, <code>date</code>, <code>tel</code> (telephone number), and <code>color</code></a>.</li>
<li>The <code>&lt;input&gt;</code> tag has the new <a href="http://www.w3.org/TR/html5/common-input-element-attributes.html#attr-input-pattern" target="w3">attribute <code>pattern</code></a> where you can describe allowed input with a regex.</li>
</ol>
<p>Note that it&#8217;s only validation. It would have been nice to have filtering (e.g. remove spaces in a credit card number) or even replacing (<i>euro</i> is sent to server, whether the user enters <i>euro</i> or <i>€</i>).</p>
<p>In both case (1) and case (2), red-green feedback lets the user know whether the entered text is valid. The tool-tip of the input widget can also carry a descriptive message of what the system expects from the user. You just set a value for the <code>title</code> attribute. More on that below.</p>
<p><b>1. New values for the <code>type</code> attribute of the <code>&lt;input&gt;</code> tag</b></p>
<p>Using the <code>type</code> attribute is simple. Here&#8217;s an example with the new value <code>email</code>:</p>
<p><code> &lt;input type="email" required /&gt;</code></p>
<p>This made me curious. I guess that <code>email</code> is implemented with a regex under the hood. What does it look like? I don&#8217;t know, but it&#8217;s not correct. As a matter of fact the <a href="http://www.w3.org/TR/html5/states-of-the-type-attribute.html" target="w3">spec for the <code>email</code> attribute value</a> is incorrect. It looks like this:</p>
<blockquote><p><i>A valid e-mail address is a string that matches the ABNF production 1*( atext / &#8220;.&#8221; ) &#8220;@&#8221; ldh-str *( &#8220;.&#8221; ldh-str ) where atext is defined in RFC 5322 section 3.2.3, and ldh-str is defined in RFC 1034 section 3.5.</i></p></blockquote>
<p>So currently, HTML5 browsers accept the address <code>-@-</code> and reject <code>"staffan nöteberg"@rekursiv.se</code> (I tried). It should be the other way around. (Yes, spaces and diaereses make sense to the left of the @ sign, as that part is a local mailbox routing that might involve a not-so-SMTP-ish system. For the record I tried&#8230;</p>
<p> <code>echo 'hello!' |<br />&nbsp;&nbsp;/usr/lib/sendmail '"staffan nöteberg"@rekursiv.se'</code></p>
<p>&#8230;and it works!).</p>
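<p>To see why <code>-@-</code> slips through, the ABNF quoted above can be translated by hand into a JavaScript regex. This is only a sketch under my own assumptions: the names <code>ATEXT_OR_DOT</code>, <code>LDH_STR</code>, and <code>matchesDraftEmail</code> are hypothetical, and I make no claim that any browser implements the rule this way.</p>

```javascript
// Hypothetical hand translation of the quoted ABNF; not taken from any browser.
// atext (RFC 5322, section 3.2.3): letters, digits and !#$%&'*+-/=?^_`{|}~
// ldh-str (RFC 1034, section 3.5): letters, digits and hyphens
const ATEXT_OR_DOT = "[A-Za-z0-9!#$%&'*+\\-/=?^_`{|}~.]";
const LDH_STR = "[A-Za-z0-9-]+";

// 1*( atext / "." ) "@" ldh-str *( "." ldh-str )
const draftEmail = new RegExp(
  "^" + ATEXT_OR_DOT + "+@" + LDH_STR + "(?:\\." + LDH_STR + ")*$"
);

function matchesDraftEmail(value) {
  return draftEmail.test(value);
}

console.log(matchesDraftEmail("-@-"));                            // true
console.log(matchesDraftEmail('"staffan nöteberg"@rekursiv.se')); // false
```

<p>A hyphen is both a valid <code>atext</code> character and a valid <code>ldh-str</code>, which is why the first address passes, while the quote character and the space are outside <code>atext</code>, which is why the second one fails.</p>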
<p>However, even though it&#8217;s already implemented in many browsers, W3C makes it clear that it&#8217;s only a working draft. For the moment there&#8217;s a note in the <a href="http://www.w3.org/TR/html5/states-of-the-type-attribute.html" target="w3">document</a> that they are aware of this error:</p>
<blockquote><p><i>NOTE: This requirement is a willful violation of RFC 5322, which defines a syntax for e-mail addresses that is simultaneously too strict (before the &#8220;@&#8221; character), too vague (after the &#8220;@&#8221; character), and too lax (allowing comments, white space characters, and quoted strings in manners unfamiliar to most users) to be of practical use here.</i></p></blockquote>
<p>My recommendation is NOT to use the <code>email</code> value of the <code>type</code> attribute until it has a better implementation.</p>
<p><b>2. New attribute <code>pattern</code> of the <code>&lt;input&gt;</code> tag</b></p>
<p>The <code>input</code> tag has <a href="http://www.w3.org/TR/2011/WD-html5-diff-20110525/#new-attributes" target="w3">several new attributes</a> to specify constraints: <code>autocomplete</code>, <code>min</code>, <code>max</code>, <code>multiple</code>, <code>pattern</code>, and <code>step</code>. I&#8217;m particularly interested in the <code>pattern</code> attribute. It&#8217;s more generic than the new values of the <code>type</code> attribute mentioned above.</p>
<p>The <code>pattern</code> value is a regex. In what regex dialect? Yes, you guessed it: JavaScript according to <a href="http://www.ecma-international.org/publications/standards/Ecma-262.htm" target="ecma">ECMA-262 Edition 5</a>. This is a major drawback, since the regex support in JavaScript is modest (e.g. there isn&#8217;t even a character class that matches any letter, whereas many other regex engines support the Unicode <code>\p{L}</code>). The whole user input must be matched by the regex, not only a fraction. You can look at it as if your regex is prefixed with <code>^(?:</code> and suffixed with <code>)$</code>.</p>
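<p>The whole-input rule can be mimicked in plain JavaScript. The helper name <code>matchesPattern</code> is my own invention for this sketch; it only illustrates the wrapping described above and is not a browser API.</p>

```javascript
// Hypothetical helper: wraps a pattern value the way the paragraph above describes.
function matchesPattern(pattern, value) {
  // The whole value must match, so the pattern is wrapped in ^(?: ... )$
  const anchored = new RegExp("^(?:" + pattern + ")$");
  return anchored.test(value);
}

console.log(matchesPattern("\\d{4}", "2012"));      // true: the whole value is four digits
console.log(matchesPattern("\\d{4}", "year 2012")); // false: only a fraction matches
console.log(matchesPattern("cat|dog", "hotdog"));   // false: the group anchors both alternatives
console.log(/^cat|dog$/.test("hotdog"));            // true: without the group, $ binds to dog only
```

<p>The last two lines show why the non-capturing group matters: without it, the anchors would attach to single alternatives instead of the whole pattern.</p>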
<p>Here are three pragmatic (but not globally perfect) examples I created:</p>
<ul>
<li>Strong password: <code>&lt;input title="at least eight symbols containing at least one number, one lower, and one upper letter" type="text" pattern="(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).{8,}" required /&gt;</code></li>
<li>Email address: <code>&lt;input type="text" title="email" required pattern="[^@]+@[^@]+\.[a-zA-Z]{2,6}" /&gt;</code></li>
<li>Phone number: <code>&lt;input type="text" required pattern="(\+?\d[- .]*){7,13}" title="international, national or local phone number"/&gt;</code></li>
</ul>
<p>I leave interpreting these regexes as an exercise for the reader. You can try them too! They are online on this test page: </p>
<ul>
<li><a href="http://staffannoteberg.com/html5-pattern.html" target="html5-pattern">http://staffannoteberg.com/html5-pattern.html</a></li>
</ul>
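<p>The three patterns can also be exercised outside the browser. The sketch below assumes the whole-input matching described earlier; the <code>patterns</code> object, the <code>isValid</code> helper, and all sample inputs are hypothetical and chosen by me for illustration.</p>

```javascript
// The three pattern values from the list above, written as JavaScript strings.
const patterns = {
  password: "(?=.*\\d)(?=.*[a-z])(?=.*[A-Z]).{8,}",
  email: "[^@]+@[^@]+\\.[a-zA-Z]{2,6}",
  phone: "(\\+?\\d[- .]*){7,13}"
};

// Hypothetical helper: the whole value has to match, as with the pattern attribute.
function isValid(kind, value) {
  return new RegExp("^(?:" + patterns[kind] + ")$").test(value);
}

console.log(isValid("password", "Passw0rdX"));     // true: digit, lower, upper, nine symbols
console.log(isValid("password", "password1"));     // false: no upper case letter
console.log(isValid("email", "staffan@rekursiv.se")); // true
console.log(isValid("email", "-@-"));              // false: no dot and top-level part
console.log(isValid("phone", "+46 8 123 456 78")); // true: eleven digits with separators
console.log(isValid("phone", "12345"));            // false: fewer than seven digits
```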
<p>If you combine <code>type="email"</code> and <code>pattern</code>, both constraints must be fulfilled.</p>
<p><b>Summary</b></p>
<p>HTML5 form validation is a good idea. The <code>pattern</code> attribute is very generic, despite its rather limited regex dialect. Be careful with the new values of the <code>type</code> attribute, as they currently only have prototype status.</p>
<p>Finally, what about browser support? I&#8217;m in deep water here, but as I understand it, this kind of validation is supported in IE 10+, Firefox 8+, Chrome 16+, Opera 11.6+, and Opera Mobile 10+. Support in Safari and Android is partial or missing.</p>
<p><a href="http://www.pomodoro-book.com/en" target="pb"><img class="aligncenter size-full wp-image-778" title="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC" src="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif?w=500" alt="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC"   /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/03/01/html5-form-validation-with-regex/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/03/html5-pattern.jpg" medium="image">
			<media:title type="html">HTML5 pattern</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>The Scent of Regex Requisite</title>
		<link>http://blog.staffannoteberg.com/2012/02/01/the-scent-of-regex-requisite/</link>
		<comments>http://blog.staffannoteberg.com/2012/02/01/the-scent-of-regex-requisite/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 21:00:28 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Technology]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1253</guid>
		<description><![CDATA[Did you know that the famous quote &#8220;Some people, when confronted with a problem, think: I know, I&#8217;ll use regular expressions. Now they have two problems.&#8221; dates all the way back to 1997? However, most programmers agree that regex has its time and place. But, how can we know when to use regex? It&#8217;s really [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1253&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/02/scent-requisite.jpg?w=500&#038;h=348" alt="" title="scent-requisite" width="500" height="348" class="alignnone size-full wp-image-1254" /></p>
<p><em>Did you know that the famous quote <a target="jz" href="http://regex.info/blog/2006-09-15/247">&#8220;Some people, when confronted with a problem, think: I know, I&#8217;ll use regular expressions. Now they have two problems.&#8221;</a> dates all the way back to 1997? However, most programmers agree that regex has its time and place. But how can we know when to use regex? It&#8217;s really simple. We must use our nose and feel the scent of regex requisite. Below is a list of five scents that put the R-word in our working memory.</em></p>
<p><b>Text to type </b></p>
<p>A text sequence is also a kind of data type. <img src="http://brainmoda.files.wordpress.com/2012/02/scent-text-to-type1.jpg?w=300&#038;h=210" alt="" title="scent-text-to-type" width="300" height="210" class="alignright size-thumbnail wp-image-1256" /> You may have read it from a file or perhaps a user entered it into your system. But you don&#8217;t confine yourself to text. You want to transform it into a bunch of structured data records. You read record after record from the text hank. Each record consists of a series of comma delimited fields and each record is terminated with a semicolon. Regex loves to parse text hanks.</p>
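<p>As a minimal sketch of the text-to-type scent, the snippet below turns such a text hank into an array of records. The sample data and the <code>parseRecords</code> name are hypothetical, and the sketch assumes that fields never contain commas or semicolons themselves.</p>

```javascript
// Hypothetical input: comma delimited fields, each record terminated with a semicolon.
const hank = "Ada,Lovelace,1815;Alan,Turing,1912;";

function parseRecords(text) {
  const records = [];
  // One record: everything up to the next semicolon.
  const recordPattern = /([^;]*);/g;
  let match;
  while ((match = recordPattern.exec(text)) !== null) {
    // Split the record body into its comma delimited fields.
    records.push(match[1].split(","));
  }
  return records;
}

console.log(parseRecords(hank));
// [ [ 'Ada', 'Lovelace', '1815' ], [ 'Alan', 'Turing', '1912' ] ]
```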
<p><b>Non recursive</b></p>
<p>A recursively-defined data type may be instantiated with values that contain values of the very same type. <img src="http://brainmoda.files.wordpress.com/2012/02/scent-non-recursive-data1.jpg?w=300&#038;h=210" alt="" title="scent-non-recursive-data" width="300" height="210" class="alignright size-medium wp-image-1262" /> Think of fir branches. Each branch consists of one stem, zero or more sub branches, and many needles. The sub branches are fir branches as well. A branch may have a sub branch, which may have a sub branch, which may have a sub branch etc. In theory there is no limit to how many levels we can have. As soon as you want to translate text into recursive data, regex is usually not the best tool. Parsing an entire HTML document with nested div tags is an example of working with recursive data.</p>
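<p>A small sketch shows where a plain regex gives up on nesting. To keep it short I use nested parentheses as a stand-in for nested div tags; the sample strings and function names are hypothetical.</p>

```javascript
// Hypothetical stand-in for nested data: parentheses instead of div tags.
const nested = "(outer (inner) tail)";

// A lazy group stops at the first closing parenthesis, which belongs to the inner level.
function lazyGroup(text) {
  const match = /\((.*?)\)/.exec(text);
  return match === null ? null : match[1];
}

// A greedy group runs to the last closing parenthesis, which glues siblings together.
function greedyGroup(text) {
  const match = /\((.*)\)/.exec(text);
  return match === null ? null : match[1];
}

// Counting depth with a plain loop copes with any number of levels.
function maxDepth(text) {
  let depth = 0;
  let deepest = 0;
  for (const ch of text) {
    if (ch === "(") {
      depth += 1;
      deepest = Math.max(deepest, depth);
    } else if (ch === ")") {
      depth -= 1;
    }
  }
  return deepest;
}

console.log(lazyGroup(nested));      // outer (inner
console.log(greedyGroup("(a) (b)")); // a) (b
console.log(maxDepth(nested));       // 2
```

<p>Neither the lazy nor the greedy version pairs an opening parenthesis with its own closing one, while the loop, which keeps a counter, does.</p>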
<p><b>Not lucid</b></p>
<p>If the input is small, regex often doesn&#8217;t add anything. <img src="http://brainmoda.files.wordpress.com/2012/02/scent-not-lucid.jpg?w=300&#038;h=211" alt="" title="scent-not-lucid" width="300" height="211" class="alignright size-medium wp-image-1263" /> But, when you do search-and-replace in 2000 files and what you want to replace has a variable appearance &#8212; then a neat little regex is the generalized solution. You can capture different versions and replace them with something that actually depends on the input data. It is quality &#8212; no mistakes &#8212; and quantity &#8212; no misses. In a small input, you can modify by hand. You can easily see what should be changed and to what.</p>
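<p>Here is a sketch of a replacement that depends on the input data. The task is hypothetical: rewrite dates written as D/M/YYYY or D.M.YYYY into YYYY-MM-DD. The names <code>datePattern</code>, <code>pad</code>, and <code>normalizeDates</code> are my own, and the pattern does not check that the day and month are in range.</p>

```javascript
// Hypothetical task: rewrite D/M/YYYY or D.M.YYYY as YYYY-MM-DD.
const datePattern = /\b(\d{1,2})[\/.](\d{1,2})[\/.](\d{4})\b/g;

// Left-pad a one digit day or month with a zero.
function pad(digits) {
  return digits.length === 1 ? "0" + digits : digits;
}

function normalizeDates(text) {
  // The captured groups decide what each match is replaced with.
  return text.replace(datePattern, function (whole, day, month, year) {
    return year + "-" + pad(month) + "-" + pad(day);
  });
}

console.log(normalizeDates("Released 1/2/2012, updated 13.03.2012."));
// Released 2012-02-01, updated 2012-03-13.
```

<p>Run over 2000 files, the same function would catch every variant the pattern describes and rewrite each one according to its own captured values.</p>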
<p><b>Emerging</b></p>
<p>Suddenly it happens: you have input from a user, from the network, from another system or from a file. <img src="http://brainmoda.files.wordpress.com/2012/02/scent-emerging.jpg?w=300&#038;h=211" alt="" title="scent-emerging" width="300" height="211" class="alignright size-medium wp-image-1264" /> You cannot predict what will come, other than that it&#8217;s text. It may be a lot and it may be a tiny little piece. Yes, it can even be an empty string. This very uncertainty makes a generalized description of the input data characteristics useful. You describe a pattern, not a specific entity. Regex is a superhero when it comes to describing generalized patterns.</p>
<p><b>Complex logic</b></p>
<p>I&#8217;ve described before how <a href="http://blog.staffannoteberg.com/tag/iso-8601">20+ lines of Java code could be transformed into one small regex</a>. This is not a general law. Regex is a limited programming language suited to solve a very specific class of problems. However, in this case the imperative Java code had a lot of nested as well as consecutive conditional statements; if-else-if-if-else &#8212; i.e. complex logic. Regex is a declarative language. You describe what you want, and not how to get there. Thus, you don&#8217;t have to state all these scrubby paths.</p>
<p>Some of these five scents partly overlap, but each of them is well worth remembering. Facing a programming problem, if you can&#8217;t feel any of them, then you can be pretty sure that there are better tools than regex.</p>
<p><a href="http://www.pomodoro-book.com/en" target="pb"><img class="aligncenter size-full wp-image-778" title="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC" src="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif?w=500" alt="Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC"   /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/02/01/the-scent-of-regex-requisite/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/02/scent-requisite.jpg" medium="image">
			<media:title type="html">scent-requisite</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/02/scent-text-to-type1.jpg?w=300" medium="image">
			<media:title type="html">scent-text-to-type</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/02/scent-non-recursive-data1.jpg?w=300" medium="image">
			<media:title type="html">scent-non-recursive-data</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/02/scent-not-lucid.jpg?w=300" medium="image">
			<media:title type="html">scent-not-lucid</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/02/scent-emerging.jpg?w=300" medium="image">
			<media:title type="html">scent-emerging</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>From Regular Expression to Finite Automaton</title>
		<link>http://blog.staffannoteberg.com/2012/01/26/from-regular-expression-to-finite-automaton/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/26/from-regular-expression-to-finite-automaton/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 13:25:09 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[finite automata]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[alternation]]></category>
		<category><![CDATA[concatenation]]></category>
		<category><![CDATA[kleene star]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1231</guid>
		<description><![CDATA[For each regular expression &#8212; and I mean the three operators and the six recursive rules style &#8212; there is a finite automaton that accepts exactly the same strings. Since this is not a university book in mathematics, I&#8217;ll show you an inductive reasoning about this and not a formal proof. The hypothesis is thus [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1231&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/01/regex-to-fa.jpg?w=500&#038;h=353" alt="" title="Regex to FA" width="500" height="353" class="alignnone size-full wp-image-1232" /></p>
<p>For each regular expression (and I mean the <a href="http://blog.staffannoteberg.com/2012/01/18/four-more-rules-for-regular-expressions/">three operators and the six recursive rules style</a>) there is a finite automaton that accepts exactly the same strings. Since this is not a university textbook in mathematics, I&#8217;ll walk you through an inductive argument rather than a formal proof.</p>
<p>The hypothesis is thus that for an arbitrary regular expression <code>p</code>, we can create a finite automaton that has exactly one start state, no paths into the start state, no paths out from the acceptance state and that accepts exactly the same strings that are matched by <code>p</code>.</p>
<ul>
<li><b>The empty string ε</b> is a regular expression corresponding to a finite automaton with a start state and a path that accepts the empty string ε and leads from the start state to an acceptance state. We&#8217;ll call this an <i>ε-path</i>.</li>
<li><b>The empty set Ø</b> corresponds to a regular expression that can&#8217;t match any string at all, not even the empty string ε. It is the same as a two-state automaton with no path at all. One state is the start state and the other one is the acceptance state, but they are not linked.</li>
<li><b>A regular expression that only matches the symbol <i>b</i></b> corresponds to a finite automaton with two states: start and acceptance. There&#8217;s a path from start to acceptance, and it only accepts the symbol <i>b</i>.</li>
</ul>
<p>All three finite automata above have two states. One is start and the other one is acceptance. The difference is that the first one has an ε-path from start to acceptance, the second one has no path, and the third one has a <i>b</i>-path. Now we&#8217;ll continue. Imagine that we have two regular expressions <code>p</code> and <code>q</code> corresponding to finite automata <code>s</code> and <code>t</code> respectively.</p>
<ul>
<li><b>Concatenation</b> of two regular expressions <code>p</code> and <code>q</code> means that we first match a string with <code>p</code>, directly followed by a string that&#8217;s matched by <code>q</code>. To create this finite automaton we first add ε-paths from every acceptance state in <code>s</code> to the start state of <code>t</code>. Then we deprive all acceptance states in <code>s</code> of their acceptance status and we also withdraw the start status of the start state in <code>t</code>.</li>
<li><b>Alternation</b> of two regular expressions <code>p</code> and <code>q</code>, i.e. <code>p|q</code>, corresponds to a finite automaton with a new start state that has ε-paths to the start states of <code>s</code> and <code>t</code>. The new finite automaton also has a new acceptance state that is reached with ε-paths from all acceptance states of <code>s</code> and <code>t</code>. The start and acceptance states of <code>s</code> and <code>t</code> are thus no longer start and acceptance states in our new automaton.</li>
<li><b>Kleene star</b> is the concatenation closure. Assume that <code>p = q*</code>. Then <code>s</code> is the finite automaton we get if we take <code>t</code> and add two states and four paths as follows: One new state is the start state and the other one is an acceptance state. All acceptance states of <code>t</code> lose that status in <code>s</code>, but instead get an ε-path to the new acceptance state. We add two ε-paths from the new start state, one to the old start state and one to the new acceptance state. In addition to that, we insert one ε-path from each of the old acceptance states to the old start state.</li>
</ul>
<p>Look at the pictures above. Then take a deep breath and see whether you can translate an arbitrary regular expression into a finite automaton. Finally, assess the last picture, where the regular expression <code>(w|bb)*</code> is depicted as a graph using the method described above. Does it feel reasonable?</p>
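<p>The construction rules above can be sketched in Ruby. This is only an illustration under stated assumptions: the names <code>Nfa</code>, <code>Builder</code>, <code>epsilon_closure</code> and <code>accepts?</code> are invented for this sketch, every automaton is assumed to have exactly one acceptance state, and a <code>nil</code> label stands for an ε-path.</p>

```ruby
require 'set'

# Hypothetical representation: states are integers, edges are
# [from, label, to] triples, and a nil label marks an ε-path.
Nfa = Struct.new(:start, :accept, :edges)

class Builder
  def initialize
    @count = 0
  end

  # The empty string ε: start --ε--> acceptance.
  def epsilon
    s, a = fresh, fresh
    Nfa.new(s, a, [[s, nil, a]])
  end

  # The empty set Ø: two states that are not linked.
  def empty_set
    Nfa.new(fresh, fresh, [])
  end

  # A single symbol: start --ch--> acceptance.
  def symbol(ch)
    s, a = fresh, fresh
    Nfa.new(s, a, [[s, ch, a]])
  end

  # Concatenation: ε-path from the acceptance state of s to the start state of t.
  def concat(s, t)
    Nfa.new(s.start, t.accept, s.edges + t.edges + [[s.accept, nil, t.start]])
  end

  # Alternation: new start and acceptance states wrapped around s and t.
  def alt(s, t)
    start, accept = fresh, fresh
    Nfa.new(start, accept, s.edges + t.edges + [
      [start, nil, s.start], [start, nil, t.start],
      [s.accept, nil, accept], [t.accept, nil, accept]
    ])
  end

  # Kleene star: two new states and four new ε-paths around t.
  def star(t)
    start, accept = fresh, fresh
    Nfa.new(start, accept, t.edges + [
      [start, nil, t.start], [start, nil, accept],
      [t.accept, nil, accept], [t.accept, nil, t.start]
    ])
  end

  private

  # Hands out a state number that has not been used before.
  def fresh
    @count += 1
  end
end

# All states reachable from the given states using ε-paths only.
def epsilon_closure(nfa, states)
  seen = Set.new(states)
  stack = states.to_a
  until stack.empty?
    q = stack.pop
    nfa.edges.each do |from, label, to|
      stack << to if from == q && label.nil? && seen.add?(to)
    end
  end
  seen
end

# Follows every possible path at once and checks whether one ends in acceptance.
def accepts?(nfa, input)
  current = epsilon_closure(nfa, [nfa.start])
  input.each_char do |ch|
    moved = nfa.edges.select { |from, label, _| current.include?(from) && label == ch }.map(&:last)
    current = epsilon_closure(nfa, moved)
  end
  current.include?(nfa.accept)
end

# (w|bb)* built bottom-up, as in the last picture.
b = Builder.new
nfa = b.star(b.alt(b.symbol('w'), b.concat(b.symbol('b'), b.symbol('b'))))
puts accepts?(nfa, 'wbbw') # true
puts accepts?(nfa, 'wb')   # false
```

<p>Each sub-automaton is used only once in this sketch. Using the same <code>Nfa</code> value twice in one expression would share its states, so build a fresh one for every occurrence.</p>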
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/26/from-regular-expression-to-finite-automaton/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/regex-to-fa.jpg" medium="image">
			<media:title type="html">Regex to FA</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>Simple Regular Expression Examples</title>
		<link>http://blog.staffannoteberg.com/2012/01/25/simple-regular-expression-examples/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/25/simple-regular-expression-examples/#comments</comments>
		<pubDate>Wed, 25 Jan 2012 20:34:50 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[alternation]]></category>
		<category><![CDATA[concatenation]]></category>
		<category><![CDATA[kleene star]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1222</guid>
		<description><![CDATA[Now, we have three operators and a small framework. After all this theory, you might wonder if it&#8217;s possible for us to solve any problems. Yes, of course we can. Here are some examples: All binary strings with no more than one zero: '01101'.match /1*(0&#124;)1*/ #=&#62; #&#60;MatchData "011"&#62; '0111'.match /1*(0&#124;)1*/ #=&#62; #&#60;MatchData "0111"&#62; '1101'.match /1*(0&#124;)1*/ [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1222&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/01/match_one.jpg?w=500&#038;h=347" alt="" title="Match one" width="500" height="347" class="alignnone size-full wp-image-1226" /></p>
<p>Now, we have three operators and a small framework. After all this theory, you might wonder if it&#8217;s possible for us to solve any problems. Yes, of course we can. Here are some examples:</p>
<p>All binary strings with no more than one zero:</p>
<p><code>'01101'.match /1*(0|)1*/ #=&gt; #&lt;MatchData "011"&gt;</code><br />
<code>'0111'.match /1*(0|)1*/ #=&gt; #&lt;MatchData "0111"&gt;</code><br />
<code>'1101'.match /1*(0|)1*/ #=&gt; #&lt;MatchData "1101"&gt;</code><br />
<code>'11010'.match /1*(0|)1*/ #=&gt; #&lt;MatchData "1101"&gt;</code></p>
<p>All binary strings with at least one pair of consecutive zeroes:</p>
<p><code>'101001'.match /(1|0)*00(1|0)*/ #=&gt; #&lt;MatchData "101001"&gt;</code><br />
<code>'10101'.match /(1|0)*00(1|0)*/ #=&gt; nil</code><br />
<code>'1010100'.match /(1|0)*00(1|0)*/ #=&gt; #&lt;MatchData "1010100"&gt;</code></p>
<p>All binary strings that have no pair of consecutive zeroes:</p>
<p><code>'1010100'.match /1*(011*)*(0|)/ #=&gt; #&lt;MatchData "101010"&gt;</code><br />
<code>'101001'.match /1*(011*)*(0|)/ #=&gt; #&lt;MatchData "1010"&gt;</code><br />
<code>'0010101'.match /1*(011*)*(0|)/ #=&gt; #&lt;MatchData "0"&gt;</code><br />
<code>'0110101'.match /1*(011*)*(0|)/ #=&gt; #&lt;MatchData "0110101"&gt;</code></p>
<p>All binary strings ending in 01:</p>
<p><code>'110101'.match /(0|1)*01/ #=&gt; #&lt;MatchData "110101"&gt;</code><br />
<code>'11010'.match /(0|1)*01/ #=&gt; #&lt;MatchData "1101"&gt;</code><br />
<code>'1'.match /(0|1)*01/ #=&gt; nil</code><br />
<code>'01'.match /(0|1)*01/ #=&gt; #&lt;MatchData "01"&gt;</code></p>
<p>All binary strings not ending in 01:</p>
<p><code>'010'.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData "010"&gt;</code><br />
<code>'011'.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData "011"&gt;</code><br />
<code>''.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData ""&gt;</code><br />
<code>'1'.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData "1"&gt;</code><br />
<code>'01'.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData "0"&gt;</code><br />
<code>'101'.match /(0|1)*(0|11)|1|0|/ #=&gt; #&lt;MatchData "10"&gt;</code></p>
<p>All binary strings that have every pair of consecutive zeroes before every pair of consecutive ones:</p>
<p><code>'0110101'.match /0*(100*)*1*(011*)*(0|)/ #=&gt; #&lt;MatchData "0110101"&gt;</code><br />
<code>'00101100'.match /0*(100*)*1*(011*)*(0|)/ #=&gt; #&lt;MatchData "0010110"&gt;</code><br />
<code>'11001011'.match /0*(100*)*1*(011*)*(0|)/ #=&gt; #&lt;MatchData "110"&gt;</code><br />
<code>'1100'.match /0*(100*)*1*(011*)*(0|)/ #=&gt; #&lt;MatchData "110"&gt;</code><br />
<code>'0011'.match /0*(100*)*1*(011*)*(0|)/ #=&gt; #&lt;MatchData "0011"&gt;</code></p>
<p>See if you can find even better regular expressions that solve these problems. Remember that every regular expression has an infinite number of synonyms, i.e. other regular expressions that match exactly the same strings.</p>
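<p>One way to check a candidate expression is sketched below, under a few assumptions. The unanchored matches above return the leftmost match, not a verdict about the whole string, so the sketch wraps each pattern in <code>\A(?:...)\z</code> (anchors and non-capturing groups are Ruby features that haven&#8217;t been introduced in this text) and compares it with a plain Ruby predicate on every binary string of up to eight symbols. The helper names <code>whole</code>, <code>binary_strings</code> and <code>counterexamples</code> are made up for this illustration.</p>

```ruby
# Hypothetical helper: the pattern must match the whole string.
def whole(pattern)
  /\A(?:#{pattern})\z/
end

# Every binary string of length 0..max_len.
def binary_strings(max_len)
  (0..max_len).flat_map { |n| %w[0 1].repeated_permutation(n).map(&:join) }
end

# Each entry: a pattern from the examples plus a plain Ruby predicate
# that describes the same language without regular expressions.
CASES = {
  'no more than one zero'    => ['1*(0|)1*',          ->(s) { s.count('0') <= 1 }],
  'at least one 00'          => ['(1|0)*00(1|0)*',    ->(s) { s.include?('00') }],
  'no 00'                    => ['1*(011*)*(0|)',     ->(s) { !s.include?('00') }],
  'ending in 01'             => ['(0|1)*01',          ->(s) { s.end_with?('01') }],
  'not ending in 01'         => ['(0|1)*(0|11)|1|0|', ->(s) { !s.end_with?('01') }],
  'every 00 before every 11' => ['0*(100*)*1*(011*)*(0|)',
                                 lambda do |s|
                                   last00 = s.rindex('00')
                                   first11 = s.index('11')
                                   last00.nil? || first11.nil? || last00 < first11
                                 end]
}

# Strings on which the anchored pattern and the predicate disagree.
def counterexamples(pattern, predicate, max_len = 8)
  re = whole(pattern)
  binary_strings(max_len).reject { |s| re.match?(s) == predicate.call(s) }
end

CASES.each do |name, (pattern, predicate)|
  puts "#{name}: #{counterexamples(pattern, predicate).inspect}"
end
```

<p>An empty list means that no short counterexample was found. That is evidence, not a proof, but it catches most slips when you try your own synonyms.</p>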
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/25/simple-regular-expression-examples/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/match_one.jpg" medium="image">
			<media:title type="html">Match one</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>Regular Expression Precedence</title>
		<link>http://blog.staffannoteberg.com/2012/01/24/regular-expression-precedence/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/24/regular-expression-precedence/#comments</comments>
		<pubDate>Tue, 24 Jan 2012 15:00:24 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[alternation]]></category>
		<category><![CDATA[concatenation]]></category>
		<category><![CDATA[kleene star]]></category>
		<category><![CDATA[operator associativity]]></category>
		<category><![CDATA[operator position]]></category>
		<category><![CDATA[operator precedence]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1213</guid>
		<description><![CDATA[You might be tempted to read the following regular expression as third or fifth row: 'fifth row'.match /third&#124;fifth row/ #=&#62; #&#60;MatchData "fifth row"&#62; 'third row'.match /third&#124;fifth row/ #=&#62; #&#60;MatchData "third"&#62; But unfortunately, as you can see, it&#8217;s more like either third (only) or else fifth row. This is due to something called order of operations [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1213&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><a href="http://brainmoda.files.wordpress.com/2012/01/precedence.jpg"><img class="alignnone size-full wp-image-1214" title="Precedence" src="http://brainmoda.files.wordpress.com/2012/01/precedence.jpg?w=500&#038;h=354" alt="" width="500" height="354" /></a></p>
<p>You might be tempted to read the following regular expression as <em>third or fifth row</em>:</p>
<p><code>'fifth row'.match /third|fifth row/ #=&gt; #&lt;MatchData "fifth row"&gt;</code><br />
<code>'third row'.match /third|fifth row/ #=&gt; #&lt;MatchData "third"&gt;</code></p>
<p>But unfortunately, as you can see, it&#8217;s more like either <em>third</em> (only) or else <em>fifth row</em>. This is due to something called <em>order of operations</em> or <em>operator precedence</em>. The invisible operator for concatenation has higher precedence than the alternation operator <code>|</code>.</p>
<p>To oil these wheels, we now add parentheses to our three operators. In a regular expression, a subexpression enclosed in parentheses gets the highest priority:</p>
<p><code>'fifth row'.match /(third|fifth) row/ #=&gt; #&lt;MatchData "fifth row"&gt;</code><br />
<code>'third row'.match /(third|fifth) row/ #=&gt; #&lt;MatchData "third row"&gt;</code></p>
<p>Note that the parentheses are meta-characters, not literals. They won&#8217;t match anything in the subject string. And of course it&#8217;s possible to nest parentheses:</p>
<p><code>'third row'.match /(third|(four|fif)th) row/ #=&gt; #&lt;MatchData "third row"&gt;</code><br />
<code>'fourth row'.match /(third|(four|fif)th) row/ #=&gt; #&lt;MatchData "fourth row"&gt;</code><br />
<code>'fifth row'.match /(third|(four|fif)th) row/ #=&gt; #&lt;MatchData "fifth row"&gt;</code></p>
<p>There are three things we need to remember, to know in what order and with what operands the regular expression engine will execute the operators:</p>
<ul>
<li><strong>Operator precedence</strong> is an ordered list that tells you if one operator should be executed before another operator in a regular expression. Several operators can have the same priority. In mathematics, the terms inside the parentheses have the highest priority. Multiplication and division have a lower priority. Addition and subtraction have the lowest. This is why <code>6+6/(2+1) = 8</code>.</li>
<li><strong>Operator position</strong> indicates where the operands are located in relation to the operator. The position can be <em>prefix</em>, <em>infix</em>, or <em>postfix</em>. If the operator is prefix, then the operand resides to the right of the operator, as with the unary minus sign in <code>-3</code>. An infix operator has an operand on each side, as in the addition <code>1+2</code>. A postfix operator stands to the right of its operand, as the exclamation point that represents the factorial operator in <code>5!</code>.</li>
<li><strong>Operator associativity</strong> tells us how to group two operators on the same precedence level. An infix operator can be right-associative, left-associative or non-associative. In mathematics, the infix operations addition and subtraction have the same precedence. Since both are left-associative the following equation holds: <code>1-2+3 = (1-2)+3 = 2</code>. Prefix or postfix operators are either associative or non-associative. If they are associative, we start with the operator that is closest to the operand. An operator that is non-associative can&#8217;t compete with operators of the same precedence.</li>
</ul>
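<p>The three notions can be observed in Ruby&#8217;s own arithmetic. This is a small sketch, not part of the regex material: the <code>factorial</code> helper is made up because Ruby has no postfix <code>!</code> operator, and the <code>**</code> line is included only to show an operator that is right-associative in Ruby.</p>

```ruby
# Precedence: parentheses first, then division, then addition.
precedence = 6 + 6 / (2 + 1)   # 6 + (6 / 3) => 8

# Associativity: - and + share a level and group from the left.
left_grouped  = 1 - 2 + 3      # (1 - 2) + 3 => 2
right_grouped = 1 - (2 + 3)    # what right grouping would give => -4

# Ruby's ** is right-associative: 2 ** (3 ** 2) => 512
power = 2 ** 3 ** 2

# Position: unary minus is prefix and + is infix. A hypothetical
# helper stands in for the postfix 5! of mathematics.
def factorial(n)
  (1..n).reduce(1, :*)
end

puts precedence, left_grouped, right_grouped, power, -3, factorial(5)
```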
<p>Here is the table for the operators we have studied so far. Later on, there&#8217;s a complete table of all regex operators.</p>
<table width="100%">
<tbody>
<tr>
<th><span style="text-decoration:underline;">Operator</span></th>
<th><span style="text-decoration:underline;">Symbol</span></th>
<th><span style="text-decoration:underline;">Precedence</span></th>
<th><span style="text-decoration:underline;">Position</span></th>
<th><span style="text-decoration:underline;">Associativity</span></th>
</tr>
<tr>
<td>Kleene star</td>
<td style="text-align:center;"><code>*</code></td>
<td style="text-align:center;">1</td>
<td>Postfix</td>
<td>Associative</td>
</tr>
<tr>
<td>Concatenation</td>
<td style="text-align:center;">N/A</td>
<td style="text-align:center;">2</td>
<td>Infix</td>
<td>Left-associative</td>
</tr>
<tr>
<td>Alternation</td>
<td style="text-align:center;"><code>|</code></td>
<td style="text-align:center;">3</td>
<td>Infix</td>
<td>Left-associative</td>
</tr>
</tbody>
</table>
<p>If you think this is hard to remember, then try to memorize the mnemonic <em>SCA</em>. It stands for Star-Concat-Alter, i.e. the order of precedence in regular expressions.</p>
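<p>The Star-Concat-Alter order can be checked directly in Ruby. A minimal sketch; the variable names are only for illustration, and each expression is paired with a version where the grouping is spelled out or changed with parentheses:</p>

```ruby
# Star before concatenation: /ab*/ is a(b*), not (ab)*.
star_first   = 'abab'.match(/ab*/)[0]
star_grouped = 'abab'.match(/(ab)*/)[0]

# Concatenation before alternation: /third|fifth row/ is (third)|(fifth row).
concat_first = 'third row'.match(/third|fifth row/)[0]
explicit     = 'third row'.match(/(third)|(fifth row)/)[0]
alt_grouped  = 'third row'.match(/(third|fifth) row/)[0]

p star_first, star_grouped   # "ab" and "abab"
p concat_first, explicit     # "third" both times
p alt_grouped                # "third row"
```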
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/24/regular-expression-precedence/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/precedence.jpg" medium="image">
			<media:title type="html">Precedence</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>The Kleene Star operator</title>
		<link>http://blog.staffannoteberg.com/2012/01/23/the-kleene-star-operator/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/23/the-kleene-star-operator/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 11:50:21 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Scala]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[kleene]]></category>
		<category><![CDATA[kleene star]]></category>
		<category><![CDATA[stephen kleene]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1201</guid>
		<description><![CDATA[All finite languages can be described by regular expressions. You can simply list the strings as an alternation string1&#124;string2&#124;string3 etc. But there are also some languages with an infinite number of strings that can be described by regular expressions. To achieve that, we use an operator we call Kleene star after the American mathematician Stephen [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1201&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/01/kleene.jpg?w=500&#038;h=343" alt="" title="Kleene star" width="500" height="343" class="aligncenter size-full wp-image-1204" /></p>
<p>All finite languages can be described by regular expressions. You can simply list the strings as an alternation <code>string1|string2|string3</code> etc. But there are also some languages with an infinite number of strings that can be described by regular expressions. To achieve that, we use an operator we call <i>Kleene star</i> after the American mathematician Stephen Cole Kleene. If <code>p</code> is a regular expression, then <code>p*</code> is the smallest superset of the set of strings matched by <code>p</code> that contains ε (the empty string) and is closed under the concatenation operation. It is the set of finite length strings that can be created by concatenating strings that match the expression <code>p</code>. If <code>p</code> can match any string other than ε, then <code>p*</code> will match an infinite number of possible strings.</p>
<p>The real name of the typographic symbol (glyph) used to denote the Kleene Star is <i>asterisk</i>. It&#8217;s a Greek (not geek) word and it logically enough means <i>little star</i>. Normally, it has five or six arms, but in its original purpose &#8212; to describe birth dates in a family tree &#8212; it had seven arms. This is a very popular symbol. In Unicode, there are a score of different variants and many communities have chosen their own meaning of the asterisk. In typography it means a footnote. In musical notation, it may mean that the sustain pedal of the piano should be lifted. On our cell phones, we use the asterisk to navigate menus in touch-tone systems. So there is no world-wide interpretation of *. However, in this book it&#8217;ll always mean the operator <i>Kleene star</i>.</p>
<p>Do you want to see a simple example? The concatenation closure of one single symbol, such as <code>a</code>, is <code>a* = { ε, a, aa, aaa, ... }</code>. Want to see a more academic example? The concatenation closure of the set consisting solely of the empty string ε is, well, the set consisting solely of the empty string: <code>ε* = ε</code>. Want to see a more complicated example? <code>(1|0)* = { ε, 0, 1, 00, 01, 10, 11, 000, ... }</code>. The strings being concatenated may thus be different matches of the expression. Can you write a regular expression that matches all binary strings that contain at least one zero? Or all binary strings with an even number of ones?</p>
<p>The operator <i>Kleene star</i> (pronounced /ˈkleɪniː stɑ:r/, klay-nee star) is unary, i.e. it takes only one operand. The operand is the regular expression to its left, which makes it a postfix operator. It has the highest priority of all operators and it is associative. The latter means that if two operators of the same priority are competing, then the operator closest to the operand wins. Since <code>p** = p*</code>, we say that the Kleene star is idempotent. And I want to emphasize again that <code>p* = (p|)*</code>, i.e. the empty string ε is always present in a closure. We&#8217;ll see later on that forgetting this very fact is a very common, and nonetheless disastrous, mistake.</p>
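<p>A closure is infinite as soon as the operand matches anything besides ε, but its short members can be listed. The sketch below assumes a made-up helper, <code>kleene_closure</code>, that takes a finite list of words instead of a regular expression and cuts the closure off at a maximum length. It shows that ε is always present and that taking the closure twice changes nothing.</p>

```ruby
require 'set'

# Hypothetical helper: every string of length <= max_len in the
# Kleene closure of the given words.
def kleene_closure(words, max_len)
  closure  = Set['']   # ε is always present in a closure
  frontier = ['']
  until frontier.empty?
    # Append one more word to each new string; keep unseen, short-enough results.
    frontier = frontier.flat_map { |prefix| words.map { |w| prefix + w } }
                       .select { |s| s.length <= max_len && closure.add?(s) }
  end
  closure.sort_by { |s| [s.length, s] }
end

p kleene_closure(['a'], 3)     # ["", "a", "aa", "aaa"]
p kleene_closure(%w[1 0], 2)   # ["", "0", "1", "00", "01", "10", "11"]
p kleene_closure([''], 3)      # [""]

# Idempotence, up to the length limit: the closure of a closure is the same set.
once  = kleene_closure(%w[ab c], 4)
twice = kleene_closure(once, 4)
p once == twice                # true
```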
<p>Here are some possible answers to the two questions above:</p>
<p><code>'110'.match /(0|1)*0(0|1)*/ #=&gt; #&lt;MatchData "110"&gt; - all strings with at least one zero</code><br />
<code>'1111'.match /(0|1)*0(0|1)*/ #=&gt; nil</code><br />
<code>'1001'.match /((10*1)|0*)*/ #=&gt; #&lt;MatchData "1001"&gt; - all strings with an even number of ones</code><br />
<code>'11001'.match /((10*1)|0*)*/ #=&gt; #&lt;MatchData "1100"&gt;</code><br />
<code>''.match /((10*1)|0*)*/ #=&gt; #&lt;MatchData ""&gt; - even the empty string has an even number of ones</code><br />
<code>'1001'.match /((10*1)|0)*/ #=&gt; #&lt;MatchData "1001"&gt; - another way to express an even number of ones</code><br />
<code>'11001'.match /((10*1)|0)*/ #=&gt; #&lt;MatchData "1100"&gt;</code><br />
<code>''.match /((10*1)|0)*/ #=&gt; #&lt;MatchData ""&gt;</code><br />
<code>'1'.match /((10*1)|0)*/ #=&gt; #&lt;MatchData ""&gt;</code><br />
<code>'010'.match /((10*1)|0)*/ #=&gt; #&lt;MatchData "0"&gt;</code><br />
<code>'01'.match /((10*1)|0)*/ #=&gt; #&lt;MatchData "0"&gt;</code></p>
<p>The positive closure operator <code>+</code> and the <i>at least n</i> operator <code>{n,}</code> are abstractions for expressing infinite concatenation. We&#8217;ll soon explore them in more detail.</p>
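<p>As a preview, here is a sketch of how those two abstractions can be spelled out with nothing but concatenation and the Kleene star. It relies on assumptions: the anchors <code>\A</code> and <code>\z</code> are Ruby features not covered yet, the helper <code>ab_strings</code> is made up, and agreement is only checked on strings of up to six symbols.</p>

```ruby
# Every string over {a, b} of length 0..max_len.
def ab_strings(max_len)
  (0..max_len).flat_map { |n| %w[a b].repeated_permutation(n).map(&:join) }
end

# Each pair: the shorthand on the left, the Kleene star spelling on the right.
PAIRS = {
  /\Aa+\z/    => /\Aaa*\z/,        # one a, then zero or more
  /\Aa{3,}\z/ => /\Aaaaa*\z/,      # three a's, then zero or more
  /\A(ab)+\z/ => /\A(ab)(ab)*\z/   # works for groups as well
}

PAIRS.each do |shorthand, spelled_out|
  same = ab_strings(6).all? { |s| shorthand.match?(s) == spelled_out.match?(s) }
  puts "#{shorthand.source} vs #{spelled_out.source}: #{same}"
end
```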
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/23/the-kleene-star-operator/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/kleene.jpg" medium="image">
			<media:title type="html">Kleene star</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>Regular Expression Alternation</title>
		<link>http://blog.staffannoteberg.com/2012/01/20/regular-expression-alternation/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/20/regular-expression-alternation/#comments</comments>
		<pubDate>Fri, 20 Jan 2012 12:42:43 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[alternation]]></category>
		<category><![CDATA[concatenation]]></category>

		<guid isPermaLink="false">http://blog.staffannoteberg.com/?p=1189</guid>
		<description><![CDATA[From rule number 2 and rule number 3 we can define paradigms &#8212; a number of possible patterns. This means that we add two or more languages by applying the set operator union to them. The union of the sets {a, b} and {c, d} is {a, b, c, d}. Hence, it&#8217;s all the elements [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1189&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/01/alternation.jpg?w=500&#038;h=352" alt="" title="Alternation" width="500" height="352" class="aligncenter size-full wp-image-1190" /></p>
<p>From <a href="http://blog.staffannoteberg.com/2012/01/17/alphabet-string-language/">rule number 2</a> and <a href="http://blog.staffannoteberg.com/2012/01/18/four-more-rules-for-regular-expressions">rule number 3</a> we can define paradigms, i.e. a number of possible patterns. This means that we add two or more languages by applying <i>the set operator union</i> to them. The union of the sets {a, b} and {c, d} is {a, b, c, d}. Hence, it&#8217;s all the elements that are in one or more of the sets. In boolean logic, we call this the <i>inclusive or</i>. In regular expressions, it is called alternation and is written with a vertical bar <code>|</code>. Here are some examples:</p>
<p><code>'a'.match /a|b/ #=&gt; #&lt;MatchData "a"&gt; - a is either a or b</code><br />
<code>'ab'.match /a|b/ #=&gt; #&lt;MatchData "a"&gt; - leftmost chosen</code><br />
<code>'ba'.match /a|b/ #=&gt; #&lt;MatchData "b"&gt; - leftmost chosen</code><br />
<code>'c'.match /a|b/ #=&gt; nil - here we found neither a nor b</code></p>
<p>Note that when several alternatives can match at the same position, most regex engines select the leftmost alternative. There are exceptions to this rule: a regex engine based on a DFA or a POSIX NFA selects the longest alternative. Most regex engines, however, are traditional (backtracking) NFA engines and select the leftmost.</p>
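<p>The difference only shows when one alternative is a prefix of another. A small sketch in Ruby, whose engine tries the alternatives from left to right (the two patterns are made up for this illustration):</p>

```ruby
# Both alternatives can match at position 0 of 'ab'.
# A left-to-right engine settles for whichever alternative is listed first.
shorter_first = 'ab'[/a|ab/]
longer_first  = 'ab'[/ab|a/]

puts shorter_first.inspect   #=> "a"
puts longer_first.inspect    #=> "ab"
```

<p>An engine with leftmost-longest semantics would answer <code>"ab"</code> in both cases, so the order of the alternatives matters only for the leftmost-alternative kind.</p>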
<p>Can you write a regular expression that matches all binary strings of length one? The binary alphabet is { 0, 1 }. Since there aren&#8217;t a huge number of binary strings of length one, you can pretty quickly list them: { 0, 1 }. The regular expression with alternation then becomes <code>0|1</code>:</p>
<p><code>'0'.match /0|1/ #=&gt; #&lt;MatchData "0"&gt;</code><br />
<code>'1'.match /0|1/ #=&gt; #&lt;MatchData "1"&gt;</code><br />
<code>'2'.match /0|1/ #=&gt; nil</code><br />
<code>'10'.match /0|1/ #=&gt; #&lt;MatchData "1"&gt;</code></p>
<p>There are four binary strings of length two &#8212; {00, 01, 10, 11}. We can capture them with <code>00|10|01|11</code>:</p>
<p><code>'10'.match /00|10|01|11/ #=&gt; #&lt;MatchData "10"&gt;</code><br />
<code>'01'.match /00|10|01|11/ #=&gt; #&lt;MatchData "01"&gt;</code><br />
<code>'12'.match /00|10|01|11/ #=&gt; nil</code><br />
<code>'11'.match /00|10|01|11/ #=&gt; #&lt;MatchData "11"&gt;</code><br />
<code>'1210'.match /00|10|01|11/ #=&gt; #&lt;MatchData "10"&gt;</code></p>
<p>Maybe you didn&#8217;t notice, but we used concatenation in the regular expression above. (Can you see the invisible concatenation symbol between the two binary digits in the regular expression? If not, maybe you should make an appointment with an optometrist. Or maybe not: not even an optometrist can help you see invisible symbols.) Each of the binary strings of length two is made up of two concatenated binary strings of length one. Since concatenation has higher precedence than alternation, we didn&#8217;t need any parentheses.</p>
<p>Alternation is commutative: for two regular expressions <code>p</code> and <code>q</code> it holds that <code>p|q = q|p</code>. It is also associative: <code>p|(q|r) = (p|q)|r</code>. An interesting and very useful fact is that concatenation distributes over alternation. This means that for all regular expressions <code>p</code>, <code>q</code> and <code>r</code> it&#8217;s true that <code>p(q|r) = pq|pr</code> and <code>(p|q)r = pr|qr</code>. The consequence of that is that <code>(0|1)(0|1) = (0|1)0|(0|1)1 = 00|10|01|11</code>. So another way to match any binary string of length two is:</p>
<p><code>'10'.match /(0|1)(0|1)/ #=&gt; #&lt;MatchData "10"&gt;</code><br />
<code>'01'.match /(0|1)(0|1)/ #=&gt; #&lt;MatchData "01"&gt;</code><br />
<code>'12'.match /(0|1)(0|1)/ #=&gt; nil</code><br />
<code>'11'.match /(0|1)(0|1)/ #=&gt; #&lt;MatchData "11"&gt;</code><br />
<code>'1210'.match /(0|1)(0|1)/ #=&gt; #&lt;MatchData "10"&gt;</code></p>
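<p>That the listed form and the distributed form describe the same language can be checked by brute force over a small set of candidates. This is a sketch under a few assumptions of its own: the anchors <code>\A</code> and <code>\z</code> are added so that whole strings are compared, and the extra symbol 2 is included only so that some candidates fall outside the binary alphabet.</p>

```ruby
# All strings over {0, 1, 2} of length 0 to 3.
candidates = (0..3).flat_map { |n| %w[0 1 2].repeated_permutation(n).map(&:join) }

listed      = /\A(?:00|10|01|11)\z/
distributed = /\A(?:(0|1)(0|1))\z/

accepted_by_listed      = candidates.select { |s| listed.match?(s) }
accepted_by_distributed = candidates.select { |s| distributed.match?(s) }

p accepted_by_listed                              #=> ["00", "01", "10", "11"]
p accepted_by_listed == accepted_by_distributed   #=> true
```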
<p>The parentheses were needed, of course, because concatenation has higher precedence than alternation. We can also have the empty string ε as one of our alternatives:</p>
<p><code>'moda'.match /moda|/ #=&gt; #&lt;MatchData "moda"&gt; - either moda or nothing is moda</code><br />
<code>'moda'.match /mado|/ #=&gt; #&lt;MatchData ""&gt; - either mado or nothing is nothing</code></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/20/regular-expression-alternation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/alternation.jpg" medium="image">
			<media:title type="html">Alternation</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
		<item>
		<title>Regular Expression Concatenation</title>
		<link>http://blog.staffannoteberg.com/2012/01/19/regular-expression-concatenation/</link>
		<comments>http://blog.staffannoteberg.com/2012/01/19/regular-expression-concatenation/#comments</comments>
		<pubDate>Thu, 19 Jan 2012 13:25:32 +0000</pubDate>
		<dc:creator>Staffan Nöteberg</dc:creator>
				<category><![CDATA[C#]]></category>
		<category><![CDATA[Howto]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Javascript]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Regex]]></category>
		<category><![CDATA[Regexp]]></category>
		<category><![CDATA[Regular Expressions]]></category>
		<category><![CDATA[Ruby]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[concatenation]]></category>

		<guid isPermaLink="false">http://brainmoda.wordpress.com/?p=1181</guid>
		<description><![CDATA[Using Rule number 2 and Rule number 4, we can create regular expressions that consist of any sequence of symbols from our alphabet. Rule number 2 said that if the symbol a is in the alphabet, then a is a regular expression. Rule number 4 said that if p and q are two regular expressions, [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=blog.staffannoteberg.com&#038;blog=2614182&#038;post=1181&#038;subd=brainmoda&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><img src="http://brainmoda.files.wordpress.com/2012/01/concat.jpg?w=500&#038;h=347" alt="" title="Concatenation" width="500" height="347" class="aligncenter size-full wp-image-1182" /></p>
<p>Using <a href="http://blog.staffannoteberg.com/2012/01/17/alphabet-string-language/">Rule number 2</a> and <a href="http://blog.staffannoteberg.com/2012/01/18/four-more-rules-for-regular-expressions">Rule number 4</a>, we can create regular expressions that consist of any sequence of symbols from our alphabet. Rule number 2 said that if the symbol <i>a</i> is in the alphabet, then <code>a</code> is a regular expression. Rule number 4 said that if <code>p</code> and <code>q</code> are two regular expressions, then the concatenation <code>pq</code> is a regular expression as well. The concatenation symbol itself is invisible. Just write the two regular expressions right after each other:</p>
<p><code>'moda'[/m/] #=&gt; "m" – we found the substring m in the string "moda"</code><br />
<code>'moda'[/o/] #=&gt; "o"</code><br />
<code>'moda'[/mo/] #=&gt; "mo" - /mo/ is /m/ concatenated with /o/</code><br />
<code>'moda'[/da/] #=&gt; "da"</code><br />
<code>'moda'[/moda/] #=&gt; "moda" - /moda/ is /mo/ concatenated with /da/</code><br />
<code>'moda'[/mado/] #=&gt; nil – no match, since the order was changed</code></p>
<p>There are some handy terms we usually use for parts of strings:</p>
<ul>
<li><b>Prefix:</b> A prefix is the substring we have left if we remove zero or more symbols from the end of a string. The strings <i>m</i>, <i>mo</i>, <i>mod</i>, and <i>moda</i> are all prefixes of the string <i>moda</i>. Even the empty string ε is a prefix of <i>moda</i>.</li>
<li><b>Suffix:</b> A suffix is the substring that is left if we remove zero or more symbols from the beginning of a string. The strings <i>moda</i>, <i>oda</i>, <i>da</i>, <i>a</i>, and ε are all suffixes of the string <i>moda</i>.</li>
<li><b>Substring:</b> A substring is what we have left if we remove a prefix and a suffix from a string. Note that the prefix and/or the suffix can be ε. Substrings must still be consecutive in the original string. The strings <i>od</i> and <i>moda</i>, but not <i>mda</i>, are substrings of <i>moda</i>.</li>
</ul>
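<p>The three definitions above translate almost word for word into code. The helper names <code>prefixes</code>, <code>suffixes</code> and <code>substrings</code> are made up for this sketch and do not come from any library; the last one relies on the observation that every substring is a suffix of some prefix.</p>

```ruby
# Hypothetical helpers, named after the terms they illustrate.

# Keep the first n symbols, for every n from 0 up to the full length.
def prefixes(s)
  (0..s.length).map { |n| s[0, n] }
end

# Drop the first n symbols, for every n from 0 up to the full length.
def suffixes(s)
  (0..s.length).map { |n| s[n..-1] }
end

# Remove a suffix (take a prefix), then remove a prefix (take a suffix).
def substrings(s)
  prefixes(s).flat_map { |pre| suffixes(pre) }.uniq
end

p prefixes('moda')                     #=> ["", "m", "mo", "mod", "moda"]
p suffixes('moda')                     #=> ["moda", "oda", "da", "a", ""]
p substrings('moda').include?('od')    #=> true
p substrings('moda').include?('mda')   #=> false
```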
<p>For any regular expression <code>p</code>, it&#8217;s true that <code>εp = pε = p</code>, thus we say that the empty string ε is the <i>identity</i> under concatenation. There is no <i>annihilator</i> under concatenation, i.e., there&#8217;s no regular expression <code>0</code> so that for any regular expression <code>p</code> it holds that <code>0p = p0 = 0</code>. Concatenation is not commutative, since <code>pq</code> is in general not equal to <code>qp</code>, but it&#8217;s associative since for any regular expressions <code>p</code>, <code>q</code> and <code>r</code> it&#8217;s true that <code>p(qr) = (pq)r</code>.</p>
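<p>These algebraic properties can be illustrated by building patterns from smaller pieces. In the sketch below the fragments <code>mo</code>, <code>d</code> and <code>a</code> stand in for <code>p</code>, <code>q</code> and <code>r</code>; they are arbitrary choices, and the empty group <code>(?:)</code> is used here to play the role of ε.</p>

```ruby
# Arbitrary example fragments standing in for p, q and r.
p_src, q_src, r_src = 'mo', 'd', 'a'

left       = Regexp.new("(?:#{p_src})(?:(?:#{q_src})(?:#{r_src}))")  # p(qr)
right      = Regexp.new("(?:(?:#{p_src})(?:#{q_src}))(?:#{r_src})")  # (pq)r
with_empty = Regexp.new("(?:)(?:#{p_src})(?:)")                      # empty, p, empty
swapped    = Regexp.new("(?:#{q_src})(?:#{p_src})")                  # qp

puts 'moda'[left].inspect         #=> "moda"
puts 'moda'[right].inspect        #=> "moda"
puts 'moda'[with_empty].inspect   #=> "mo"
puts 'moda'[swapped].inspect      #=> nil
```

<p>Grouping makes no difference to the result, the empty pieces change nothing, and swapping the order of two fragments gives a different pattern.</p>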
<p>If we think of concatenation as a product, then regular expressions also support exponentiation. We write the exponent enclosed in braces to the right of the regular expression:</p>
<p><code>'aaa'[/aaa/] #=&gt; "aaa"</code><br />
<code>'aaa'[/a{3}/] #=&gt; "aaa" – yes, the string includes 3 concatenated a</code><br />
<code>'aaa'[/a{4}/] #=&gt; nil – no, the string doesn't include 4 a</code></p>
<p>This is obviously just syntactic sugar. All regular expressions that we can write using the exponentiation operator can also be unfolded. There are more shortcuts for finite repeated concatenations:</p>
<p><code>'aa'[/a?/] #=&gt; "a" – the optional operator written as question mark</code><br />
<code>'b'[/a?/] #=&gt; "" – zero repeats of a matches the empty string</code><br />
<code>'aa'[/a{,2}/] #=&gt; "aa" – at most two a</code><br />
<code>'aa'[/a{1,2}/] #=&gt; "aa" – at least one a and at most two a</code><br />
<code>'a'[/a{1,2}/] #=&gt; "a"</code></p>
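<p>To make the &#8220;syntactic sugar&#8221; point concrete, here is a sketch of how a counted repetition could be unfolded into plain concatenation and alternation with the empty string. The helper <code>unfold</code> is hypothetical and written only for this illustration; it assumes that <code>min</code> is not greater than <code>max</code>.</p>

```ruby
# Hypothetical helper: rewrite atom{min,max} using only grouping,
# concatenation and alternation with the empty string.
def unfold(atom, min, max)
  required = "(?:#{atom})" * min
  optional = ''
  # Each round wraps what we have so far in "one more atom, or nothing".
  (max - min).times { optional = "(?:(?:#{atom})#{optional}|)" }
  required + optional
end

puts unfold('a', 3, 3)   # unfolded a{3}
puts unfold('a', 0, 1)   # unfolded a?
puts unfold('a', 1, 2)   # unfolded a{1,2}

unfolded = Regexp.new(unfold('a', 1, 2))
%w[a aa aaa b].each do |s|
  puts "#{s}: #{s[/a{1,2}/].inspect} #{s[unfolded].inspect}"
end
```

<p>Both columns of the loop output should agree for every sample string, since the unfolded pattern tries the non-empty alternative first, just as the greedy shortcut does.</p>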
<p>We will soon see that the concatenation of two regular expressions is not the same as the concatenation of two strings. Remember that a regular expression corresponds to a set of strings. For example, if <code>p = {a, b}</code> and <code>q = {c, d}</code>, then <code>pq = {ac, ad, bc, bd}</code>.</p>
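<p>Seen as an operation on sets, concatenation pairs every string of the first set with every string of the second. A minimal sketch using plain arrays as the sets (the variable names and the anchored pattern are chosen just for this example):</p>

```ruby
p_set = %w[a b]
q_set = %w[c d]

# Every string of p followed by every string of q.
pq = p_set.product(q_set).map(&:join)
p pq                                   #=> ["ac", "ad", "bc", "bd"]

# The corresponding regular expression accepts exactly these strings.
pattern = /\A(?:a|b)(?:c|d)\z/
p pq.all? { |s| pattern.match?(s) }    #=> true
p pattern.match?('ca')                 #=> false
```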
]]></content:encoded>
			<wfw:commentRss>http://blog.staffannoteberg.com/2012/01/19/regular-expression-concatenation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/892160c571f99f4c4f75dbd01ca95406?s=96&#38;d=http%3A%2F%2Fs0.wp.com%2Fi%2Fmu.gif" medium="image">
			<media:title type="html">snoteberg</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2012/01/concat.jpg" medium="image">
			<media:title type="html">Concatenation</media:title>
		</media:content>

		<media:content url="http://brainmoda.files.wordpress.com/2008/02/idg_en.gif" medium="image">
			<media:title type="html">Pomodoro Technique Illustrated -- New book from The Pragmatic Programmers, LLC</media:title>
		</media:content>
	</item>
	</channel>
</rss>
