By Mark


2015-08-03 17:00:06 8 Comments

I'm trying to extract a string between two quotes. I thought I had my regex working, but it gives me two strings in my GroupCollection, and I can't get it to ignore the first one, which includes ID= and the opening quote.

The string that I want to parse is

Test ID="12345" hello

I want to return 12345 in a group, so that I can manipulate it in code later. I've tried the following regex: http://regexr.com/3bgtl, with this code:

string nodeValue = "Test ID=\"12345\" hello";
GroupCollection ids = Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups;

The problem is that the GroupCollection contains two entries:

ID="12345

12345

I just want it to return the second one.

3 comments

@Wiktor Stribiżew 2015-08-03 17:27:39

Your code is totally OK and is the most efficient of all the solutions suggested here. Capturing groups are the quickest and least resource-consuming way to match substrings inside larger texts.

All you need to do with your regex is just access the captured group 1 that is defined by the round brackets. Like this:

var nodeValue = "Test ID=\"12345\" hello";
GroupCollection ids = Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups;
Console.WriteLine(ids[1].Value);
// or just on one line
// Console.WriteLine(Regex.Match(nodeValue, "ID=\"([^\"]*)").Groups[1].Value);

See IDEONE demo

Please have a look at Grouping Constructs in Regular Expressions:

Grouping constructs delineate the subexpressions of a regular expression and capture the substrings of an input string. You can use grouping constructs to do the following:

  • Match a subexpression that is repeated in the input string.
  • Apply a quantifier to a subexpression that has multiple regular expression language elements. For more information about quantifiers, see Quantifiers in Regular Expressions.
  • Include a subexpression in the string that is returned by the Regex.Replace and Match.Result methods.
  • Retrieve individual subexpressions from the Match.Groups property and process them separately from the matched text as a whole.

Note that if you do not need overlapping matches, a capturing group is the best solution here.
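To make the two entries in the question concrete, here is a minimal sketch of what the GroupCollection holds for that pattern. The class name GroupDemo and the helper TryExtractId are made up for this illustration; only Regex.Match, Match.Success and Match.Groups come from the framework.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Hypothetical helper: returns true and sets id when the pattern matches.
    public static bool TryExtractId(string input, out string id)
    {
        Match match = Regex.Match(input, "ID=\"([^\"]*)");
        id = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        string nodeValue = "Test ID=\"12345\" hello";
        Match match = Regex.Match(nodeValue, "ID=\"([^\"]*)");

        // Groups[0] is always the entire match; numbered captures start at 1.
        Console.WriteLine(match.Groups.Count);    // 2
        Console.WriteLine(match.Groups[0].Value); // ID="12345
        Console.WriteLine(match.Groups[1].Value); // 12345

        // Check Success before trusting a group value.
        Console.WriteLine(TryExtractId("no id here", out _)); // False
    }
}
```

So the first entry is not something to suppress: it is the whole match, and index 1 is the value you are after.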

@lintmouse 2015-08-03 17:13:47

There are a few ways to accomplish this. I like named capture groups for readability.

Regex with named capture group:

"(?<capture>.*?)"

And your code would be:

match.Groups["capture"].Value
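Applied to the string from the question, a named group might look like the sketch below. The group name id, the class name NamedGroupDemo and the ID=" prefix in the pattern are choices made for this example, not part of the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // "id" is a group name picked for this sketch; any valid name works.
    private const string Pattern = "ID=\"(?<id>[^\"]*)\"";

    // Returns the named capture, or an empty string when nothing matches.
    public static string ExtractId(string input)
    {
        Match match = Regex.Match(input, Pattern);
        return match.Success ? match.Groups["id"].Value : string.Empty;
    }

    public static void Main()
    {
        Console.WriteLine(ExtractId("Test ID=\"12345\" hello")); // 12345
    }
}
```

Looking the group up by name keeps the calling code readable if more groups are added to the pattern later.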

@Flat Eric 2015-08-03 17:07:24

Use a positive lookbehind:

GroupCollection ids = Regex.Match(nodeValue, "(?<=ID=\")[^\"]*").Groups;

Your original pattern also contains a capturing group (the parentheses), which is why you get two results: group 0 is the whole match and group 1 is the captured part.
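As a rough check of the lookbehind version, the sketch below prints what the match contains. The class name LookbehindDemo and the helper ExtractId are hypothetical names for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: the lookbehind requires ID=" before the match
    // but does not include it in the matched text.
    public static string ExtractId(string input)
    {
        Match match = Regex.Match(input, "(?<=ID=\")[^\"]*");
        return match.Success ? match.Value : string.Empty;
    }

    public static void Main()
    {
        string nodeValue = "Test ID=\"12345\" hello";
        Match match = Regex.Match(nodeValue, "(?<=ID=\")[^\"]*");

        // No capturing parentheses, so the collection holds only group 0.
        Console.WriteLine(match.Groups.Count); // 1
        Console.WriteLine(match.Value);        // 12345
    }
}
```

With this pattern you can read match.Value directly and skip the GroupCollection altogether.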

@Mark 2015-08-03 17:13:00

This worked perfectly, thanks for the quick response. I was using groups because I believe in some cases there was more than one instance of the ID, and it's actually being passed into a replace later.
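For the case mentioned here, with several IDs in one string and a later replace, a capturing group can serve both steps. This is only a sketch: the two-ID input, the class name MultipleIdsDemo, and the "X" prefix used as the replacement rule are all invented for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class MultipleIdsDemo
{
    private static readonly Regex IdPattern = new Regex("ID=\"([^\"]*)\"");

    // Collects group 1 from every match in the input.
    public static string[] ExtractIds(string input)
    {
        return IdPattern.Matches(input)
                        .Cast<Match>()
                        .Select(m => m.Groups[1].Value)
                        .ToArray();
    }

    // Rewrites each ID while keeping the surrounding ID="..." text.
    // The "X" prefix is a made-up transformation for this example.
    public static string PrefixIds(string input)
    {
        return IdPattern.Replace(input, m => "ID=\"X" + m.Groups[1].Value + "\"");
    }

    public static void Main()
    {
        string input = "Test ID=\"12345\" hello ID=\"678\"";
        Console.WriteLine(string.Join(",", ExtractIds(input))); // 12345,678
        Console.WriteLine(PrefixIds(input)); // Test ID="X12345" hello ID="X678"
    }
}
```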
