Say I have a string whose format I need to verify, e.g. RR1234566-001
(2 letters, 7 digits, a dash, then 1 or more digits). I use something like:
Regex regex = new Regex(patternString);
if (regex.IsMatch(stringToMatch))
{
return true;
}
else
{
return false;
}
This tells me whether stringToMatch follows the pattern defined by patternString. What I need, though (and what I currently end up extracting later), are 1234566 and 001, i.e. portions of stringToMatch.
Please note that this is NOT a question about how to construct regular expressions. What I am asking is: "Is there a way to match and extract values simultaneously without having to use a split function later?"
-
You can use regex groups to accomplish that. For example, this regex:
(\d\d\d)-(\d\d\d\d\d\d\d)
Let's match a telephone number with this regex:
var regex = new Regex(@"(\d\d\d)-(\d\d\d\d\d\d\d)");
var match = regex.Match("123-4567890");
if (match.Success) ....
If it matches, you will find the first three digits in:
match.Groups[1].Value
And the seven digits after the dash in:
match.Groups[2].Value
P.S. In C#, you can use a @"" style string to avoid escaping backslashes. For example, @"\hi\" equals "\\hi\\". Useful for regular expressions and paths.
P.S.2. The first group is stored in Groups[1], not Groups[0] as you might expect. That's because Groups[0] contains the entire matched string.
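To make the indexing concrete, here is a minimal sketch that prints Groups[0] through Groups[2] for the telephone pattern above. The class name GroupIndexDemo, the method Describe, and the sample input are made up for illustration; only Regex.Match, Match.Success, and Match.Groups come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupIndexDemo
{
    // Hypothetical helper: returns the whole match followed by each
    // captured group, in index order, or an empty array if nothing matches.
    public static string[] Describe(string input)
    {
        Match match = Regex.Match(input, @"(\d\d\d)-(\d\d\d\d\d\d\d)");
        if (!match.Success)
        {
            return new string[0];
        }
        return new[]
        {
            match.Groups[0].Value, // entire matched text
            match.Groups[1].Value, // first parenthesized group
            match.Groups[2].Value  // second parenthesized group
        };
    }

    public static void Main()
    {
        // Prints "123-4567890", then "123", then "4567890".
        foreach (string part in Describe("call 123-4567890 now"))
        {
            Console.WriteLine(part);
        }
    }
}
```

Because the pattern is not anchored, the surrounding words are skipped and Groups[0] holds only the text the pattern actually matched.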
Neil Williams : +1 Very thorough! I'd add one thing though: the reason that you start on match.Groups[1] and not [0] is that [0] contains the entire matched string.
-
Use grouping and Match instead.
I.e.:
// Capture the digits on each side of the dash.
Regex re = new Regex("(\\d+)-(\\d+)");
Match m = re.Match(stringToMatch);
if (m.Success)
{
    String part1 = m.Groups[1].Value;
    String part2 = m.Groups[2].Value;
    return true;
}
else
{
    return false;
}
You can also name the matches, like this:
Regex re = new Regex("(?<Part1>\\d+)-(?<Part2>\\d+)");
and access them like this:
String part1 = m.Groups["Part1"].Value;
String part2 = m.Groups["Part2"].Value;
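Applied to the format in the question, named groups let one call both validate the string and hand back the two pieces. The following is only a sketch: the class CodeParser, the method TryParse, and the group names Serial and Suffix are invented for illustration, and the pattern is one possible reading of "2 letters, 7 digits, dash, 1 or more digits".

```csharp
using System;
using System.Text.RegularExpressions;

public static class CodeParser
{
    // Assumed pattern: 2 letters, 7 digits, a dash, then 1 or more digits.
    // Only the two numeric parts are captured, under hypothetical names.
    private static readonly Regex Pattern =
        new Regex(@"^[A-Za-z]{2}(?<Serial>\d{7})-(?<Suffix>\d+)$");

    // Validates the input and extracts both parts in a single Match call.
    public static bool TryParse(string input, out string serial, out string suffix)
    {
        Match m = Pattern.Match(input);
        serial = m.Success ? m.Groups["Serial"].Value : string.Empty;
        suffix = m.Success ? m.Groups["Suffix"].Value : string.Empty;
        return m.Success;
    }

    public static void Main()
    {
        if (TryParse("RR1234566-001", out string serial, out string suffix))
        {
            Console.WriteLine(serial); // 1234566
            Console.WriteLine(suffix); // 001
        }
    }
}
```

Keeping the Regex in a static field means the pattern is built once rather than on every call, which may matter if many strings are checked.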
gnomixa : very useful tip!
Rob Fonseca-Ensor : +1 for named groups
-
You can use parentheses to capture groups of characters:
string test = "RR1234566-001";
// capture 2 letters, then 7 digits, then a hyphen, then 1 or more digits
string rx = @"^([A-Za-z]{2})(\d{7})(\-)(\d+)$";
Match m = Regex.Match(test, rx, RegexOptions.IgnoreCase);
if (m.Success)
{
    Console.WriteLine(m.Groups[1].Value); // RR
    Console.WriteLine(m.Groups[2].Value); // 1234566
    Console.WriteLine(m.Groups[3].Value); // -
    Console.WriteLine(m.Groups[4].Value); // 001
    return true;
}
else
{
    return false;
}
Andomar : +1 for the right regex... btw if you use IgnoreCase, you can use [a-z] instead of [A-Za-z].