Saturday, March 6, 2010

Regular Expressions

Regular expressions provide a flexible way to match patterns in text. They are commonly used in syntax highlighting systems, for example, and can also validate input data such as the format of email addresses.

Personally, I use regular expressions as soon as I need to find many different occurrences of a string that share a common pattern. For example, I once needed to insert links into various technical PDF manuals that were split by chapter and section and followed a standard naming convention, i.e. “ManualTitle-CH01” for chapter 1. Each document referred to the others as “Chapter XX” or “Section XX” and contained internal references like “Figure XX”. In cases like that you can have hundreds or thousands of different strings to search for, which could mean hundreds of lines of code, while regular expressions can accomplish the task in a few.

To use regular expressions in a C# project in Visual Studio, you need the following using directive:
using System.Text.RegularExpressions;
Here is a small code example for finding multiple pattern matches.
private void findMatches(string text)
{
 // Verbatim string (@"...") so that \s and \d reach the regex engine unescaped.
 Regex pattern = new Regex(@"Chapter\s(\d+)|Section\s(\d+)|Figure\s(\d+)", RegexOptions.IgnoreCase);

 // MatchCollection has no public constructor; it is returned by Matches().
 MatchCollection matches = pattern.Matches(text);
 foreach(Match m in matches)
 {
  /*Work with the match here, generate and insert link etc.*/
 }
}
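Inside the loop, each Match exposes the matched text, its position, and its captured groups. As a minimal sketch of the "generate link" step, the example below uses named groups to work out which kind of reference was found and builds a target name in the “ManualTitle-CH01” style. The class LinkTargets, the helper TargetFor, and the SEC and FIG prefixes are hypothetical names chosen for illustration; only the CH form comes from the manuals described above.

```csharp
using System;
using System.Text.RegularExpressions;

static class LinkTargets
{
    // One named group for the kind of reference, one for its number.
    public static readonly Regex Reference = new Regex(
        @"\b(?<kind>Chapter|Section|Figure)\s+(?<number>\d+)",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: maps a match such as "Chapter 3" to "ManualTitle-CH03".
    // The SEC and FIG prefixes are assumptions made for this sketch.
    public static string TargetFor(Match m, string manualTitle)
    {
        int number = int.Parse(m.Groups["number"].Value);
        string prefix;
        switch (m.Groups["kind"].Value.ToLowerInvariant())
        {
            case "chapter": prefix = "CH"; break;
            case "section": prefix = "SEC"; break;
            default: prefix = "FIG"; break;
        }
        // D2 pads the number to at least two digits.
        return string.Format("{0}-{1}{2:D2}", manualTitle, prefix, number);
    }

    public static void Main()
    {
        string text = "See Chapter 3 and figure 12 for details.";
        foreach (Match m in Reference.Matches(text))
        {
            // m.Index and m.Length tell you where the link should be inserted.
            Console.WriteLine("{0} -> {1}", m.Value, TargetFor(m, "ManualTitle"));
        }
    }
}
```

Named groups avoid having to check which of the three numbered groups actually captured something, which is what the single alternation pattern would otherwise require.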
Here are a couple of links that might prove useful:
Regular Expression Syntax
Regular Expression Language Elements
Regular Expression Tester
I hope this first post may help some of you.
