By Dominic Cronin


2013-05-19 09:44:15 8 Comments

I have a .NET regex which I am testing using Windows PowerShell. The output is as follows:

> [System.Text.RegularExpressions.Regex]::Match("aaa aaa bbb", "aaa.*?bbb")


Groups   : {aaa aaa bbb}
Success  : True
Captures : {aaa aaa bbb}
Index    : 0
Length   : 11
Value    : aaa aaa bbb

My expectation was that using the lazy ? modifier on .* would cause the match to be aaa bbb, as the second group of a's is sufficient to satisfy the expression. Is my understanding of non-greedy quantifiers flawed, or am I testing incorrectly?

Note: this is plainly not the same problem as Regular Expression nongreedy is greedy

4 comments

@Tim Pietzcker 2013-05-19 14:19:08

This is a common misunderstanding. Lazy quantifiers do not guarantee the shortest possible match. They only make sure that the current quantifier, from the current position, does not match more characters than needed for an overall match.

If you truly want to ensure the shortest possible match, you need to make that explicit. In this case, this means that instead of .*?, you want a subexpression that matches one character at a time, but only while neither aaa nor bbb starts at the current position. The resulting regex will therefore be

aaa(?:(?!aaa|bbb).)*bbb
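
A quick way to check the difference is to run both patterns against the string from the question. This is a minimal sketch that uses Python's re module as a stand-in for the .NET Regex class (an assumption made for illustration; these particular patterns use only syntax both engines share), and the variable names are made up for the example.

```python
import re

text = "aaa aaa bbb"

# Lazy quantifier: the match still starts at the first aaa.
lazy = re.search(r"aaa.*?bbb", text)

# Explicit version: each consumed character must not begin aaa or bbb,
# so an attempt that starts at the first aaa fails and the engine moves on.
tempered = re.search(r"aaa(?:(?!aaa|bbb).)*bbb", text)

print(repr(lazy.group()), lazy.start())          # 'aaa aaa bbb' 0
print(repr(tempered.group()), tempered.start())  # 'aaa bbb' 4
```

The second pattern does not search for a shorter match as such; it simply rules out any match that contains another aaa between the opening aaa and the closing bbb.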

@Dominic Cronin 2013-05-19 18:35:40

I just did what I should have done in the first place, and consulted the relevant chapter of Friedl. That led me to aaa((?!aaa).)*bbb, which is more or less what you said, except that your answer has the added details of making the subexpression non-capturing and also tests for bbb in the negative lookahead. Nice answer.

@Casimir et Hippolyte 2013-05-19 13:49:32

This is not a greedy/lazy problem. The problem comes from the fact that your string is analysed from left to right. Once the first aaa is matched, the regex engine adds characters one by one until the complete pattern matches.

Note that with greedy behaviour you obtain the same result in your example: the first aaa is matched, the regex engine takes all the remaining characters, and then backtracks character by character until it has a complete match.
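
The claim that both quantifiers give the same result here can be checked directly. A small sketch, again assuming Python's re module in place of the .NET engine (both are backtracking engines that report the leftmost match); the variable names are illustrative only.

```python
import re

text = "aaa aaa bbb"

greedy_match = re.search(r"aaa.*bbb", text)
lazy_match = re.search(r"aaa.*?bbb", text)

# Both attempts begin at index 0, because the engine tries the leftmost
# starting position first and that attempt succeeds either way.
print(greedy_match.span(), lazy_match.span())  # (0, 11) (0, 11)
```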

@HamZa 2013-05-19 10:00:32

Well, it's really simple. We have the following string:

aaa aaa bbb

Let's say we have the regex aaa.*?bbb. The regex engine starts by matching aaa (brackets mark the text matched so far):

[aaa] aaa bbb

The regex engine now has .*?bbb left. bbb does not match here, so .*? takes the space:

[aaa ]aaa bbb

But there are still some characters before bbb, so the regex engine continues on its way and .*? takes in the second set of a's and the space after it:

[aaa aaa ]bbb

Finally, the regex engine matches bbb:

[aaa aaa bbb]
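
The walk-through above can be imitated by hand. The sketch below is a simplified emulation in Python, not how any real engine is implemented: it fixes the start at the first aaa and lets the lazy part grow one character at a time until bbb follows. All names are hypothetical.

```python
text = "aaa aaa bbb"

# Position just after the first aaa, where .*? begins.
after_prefix = text.find("aaa") + len("aaa")

lazy_part = None
for extra in range(len(text) - after_prefix + 1):
    # Try bbb first; only if it fails does .*? take one more character.
    if text.startswith("bbb", after_prefix + extra):
        lazy_part = text[after_prefix:after_prefix + extra]
        break

print(repr(lazy_part))  # ' aaa '
```

The lazy part ends up as short as it can be for this starting position, but the starting position itself is never reconsidered.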


So, if we only want to match the second aaa, we could use the following regex:

(?<!^)aaa.*?bbb, which matches an aaa that is not at the beginning of the string.

We may also use aaa(?= bbb).*?bbb, which matches an aaa that is followed by a space and bbb.
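
Both alternatives can be tried on the sample string. A minimal sketch, assuming Python's re module behaves like the .NET engine for these two patterns (it accepts the zero-width ^ inside a lookbehind); the variable names are made up for the example.

```python
import re

text = "aaa aaa bbb"

# aaa that is not at the very start of the string.
not_at_start = re.search(r"(?<!^)aaa.*?bbb", text)

# aaa that is immediately followed by a space and bbb.
followed_by_bbb = re.search(r"aaa(?= bbb).*?bbb", text)

print(repr(not_at_start.group()), not_at_start.start())        # 'aaa bbb' 4
print(repr(followed_by_bbb.group()), followed_by_bbb.start())  # 'aaa bbb' 4
```

Both are tailored to this exact input: the first would still start at the earliest non-initial aaa in a longer string, and the second only works when exactly one space separates aaa from bbb.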


Just came to my senses: why don't you use aaa bbb directly?

@j.holetzeck 2013-05-19 09:57:41

Compare the result for the string aaa aaa bbb bbb:

regex: aaa.*?bbb 
result: aaa aaa bbb

regex: aaa.*bbb
result: aaa aaa bbb bbb

The regex engine finds the first occurrence of aaa and then, with the lazy quantifier (.*?), skips characters only until the first occurrence of bbb. With the greedy quantifier (.*), it goes on to find a larger result and therefore matches up to the last occurrence of bbb.
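
The comparison can be reproduced with a few lines. This sketch uses Python's re module as a stand-in for the .NET engine (an assumption for illustration; the two agree on these patterns), with illustrative variable names.

```python
import re

text = "aaa aaa bbb bbb"

lazy_result = re.search(r"aaa.*?bbb", text).group()
greedy_result = re.search(r"aaa.*bbb", text).group()

print(repr(lazy_result))    # 'aaa aaa bbb'
print(repr(greedy_result))  # 'aaa aaa bbb bbb'
```

Both matches start at the first aaa; the quantifier only changes where the match ends.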

@duozmo 2014-03-22 06:42:34

This is the clearest explanation of what's happening. +1
