By Divers


2012-01-19 18:16:23 8 Comments

I have the following code:

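import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void createTokens(){
    String test = "test is a word word word word big small";
    // Two lazy capture groups, each wrapped in optional whitespace
    Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+?\\s*)").matcher(test);
    while (mtch.find()){
        // Print every capture group of the current match
        for (int i = 1; i <= mtch.groupCount(); i++){
            System.out.println(mtch.group(i));
        }
    }
}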

It produces the following output:

word
w

But I expected it to be:

word
word

Can somebody please explain why?

2 comments

@Garrett Hall 2012-01-19 18:23:35

\\s* matches any number of whitespace characters, including zero, so a lone w is enough to satisfy (\\s*.+?\\s*). To make sure the group matches a word delimited by whitespace, try (\\s+.+?\\s+)

@Alan Moore 2012-01-19 18:46:02

Trouble is, the regex is already consuming the space characters before and after the word, so now you're trying to consume them twice.

@Daniel Gray 2017-07-05 10:21:29

All you would need to do is remove the literal spaces around word from the regex, like ...\\s+)word(\\s+...
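
The two comments above can be checked with a small sketch. The class name WhitespaceGroupsDemo and the helper firstMatchGroups are made up for illustration; the input string and both patterns come from this thread, with the second pattern written out in full on the assumption that the elided parts stay unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WhitespaceGroupsDemo {
    static final String INPUT = "test is a word word word word big small";

    // Hypothetical helper: returns the capture groups of the first match,
    // or null when the pattern does not match anywhere in the input.
    static String[] firstMatchGroups(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Literal spaces kept: they consume the whitespace that \s+ also needs.
        String withSpaces = "test is a (\\s+.+?\\s+) word (\\s+.+?\\s+)";
        // Literal spaces removed, so \s+ is the only thing consuming whitespace.
        String withoutSpaces = "test is a(\\s+.+?\\s+)word(\\s+.+?\\s+)";

        System.out.println(firstMatchGroups(withSpaces, INPUT) == null); // true (no match)

        String[] groups = firstMatchGroups(withoutSpaces, INPUT);
        // Brackets make the captured whitespace visible.
        System.out.println("[" + groups[0] + "][" + groups[1] + "]");    // [ word ][ word ]
    }
}
```

Note that with the literal spaces removed, each group still includes its surrounding whitespace, so the captured text would need a trim() call if only the bare word is wanted.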

@theglauber 2012-01-19 18:22:03

Your patterns are non-greedy, so they match as little text as possible while still producing a match.

Remove the ? from the second group and you'll get:
word
word word big small

Matcher mtch = Pattern.compile("test is a (\\s*.+?\\s*) word (\\s*.+\\s*)").matcher(test);
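
To see the lazy and greedy variants side by side, here is a minimal sketch. The class name LazyVsGreedyDemo and the helper secondGroup are hypothetical; the input and both patterns are the ones quoted in this thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyVsGreedyDemo {
    static final String INPUT = "test is a word word word word big small";

    // Hypothetical helper: returns capture group 2 of the first match, or null.
    static String secondGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        // Lazy second group: nothing follows it, so one character is enough.
        String lazy = "test is a (\\s*.+?\\s*) word (\\s*.+?\\s*)";
        // Greedy second group: .+ runs to the end of the input.
        String greedy = "test is a (\\s*.+?\\s*) word (\\s*.+\\s*)";

        System.out.println(secondGroup(lazy, INPUT));   // w
        System.out.println(secondGroup(greedy, INPUT)); // word word big small
    }
}
```

The first group is lazy in both patterns, yet it still captures the whole of word, because the literal " word " that follows it forces the match to extend that far.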

@Alan Moore 2012-01-19 18:41:13

And now the second group is capturing too much instead of too little. Non-greediness is not the problem, and greediness is not the solution.

@theglauber 2012-01-19 18:49:09

You're correct, but IMHO the non-greediness of the second capturing group explains why it captures only "w". The first capturing group has to capture "word" because of the "word" literal following it. I don't know exactly what he's looking for, and he edited the question after I submitted my answer, so I can't supply a correct regex.
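
For reference, one way to get the output the question expects is sketched below. It assumes each group should capture a single whitespace-delimited token, which the question does not state explicitly. The \\S+ pattern and the names NonWhitespaceDemo and groupsOf are illustrative and are not taken from any answer here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NonWhitespaceDemo {
    static final String INPUT = "test is a word word word word big small";

    // Hypothetical helper: returns the capture groups of the first match, or null.
    static String[] groupsOf(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[m.groupCount()];
        for (int i = 1; i <= m.groupCount(); i++) {
            groups[i - 1] = m.group(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Assumption: a "word" is a run of non-whitespace characters (\S+),
        // so each group stops at the next space without any lazy quantifier.
        String regex = "test is a (\\S+) word (\\S+)";
        for (String group : groupsOf(regex, INPUT)) {
            System.out.println(group); // prints "word" twice, one per line
        }
    }
}
```

Because \\S+ cannot cross whitespace, the literal spaces in the pattern are the only thing consuming the separators, which avoids both the one-character capture and the double consumption discussed above.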
