By Hugoware


2009-01-22 16:49:14 8 Comments

Obviously, you can use the | (pipe?) to represent OR, but is there a way to represent AND as well?

Specifically, I'd like to match paragraphs of text that contain ALL of a certain set of phrases, in no particular order.

@DevWL 2015-11-13 11:30:56

In addition to the accepted answer

I will provide some practical examples that should make things clearer for some of you. For example, let's say we have these three lines of text:

[12/Oct/2015:00:37:29 +0200] // only this + will get selected
[12/Oct/2015:00:37:x9 +0200]
[12/Oct/2015:00:37:29 +020x]

See demo here DEMO

What we want to do here is select the + sign, but only if it comes after two digits and a space and before four digits. Those are the only constraints. We would use this regular expression to achieve it:

'~(?<=\d{2} )\+(?=\d{4})~g'

Note that if you split the expression into separate parts, it will give you different results.
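
As a rough illustration, the same lookbehind/lookahead pair can be tried in Python against the three sample lines. The names PLUS, lines and hits are only for this sketch, and the PHP-style ~ delimiters and g flag are dropped because Python's re module does not use them.

```python
import re

# "+" preceded by two digits and a space, followed by four digits.
PLUS = re.compile(r"(?<=\d{2} )\+(?=\d{4})")

lines = [
    "[12/Oct/2015:00:37:29 +0200]",
    "[12/Oct/2015:00:37:x9 +0200]",
    "[12/Oct/2015:00:37:29 +020x]",
]

# True where the "+" satisfies both constraints at once.
hits = [PLUS.search(line) is not None for line in lines]
print(hits)  # -> [True, False, False]
```

Only the "+" itself is part of the match; the digits on either side are checked but not consumed.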

Or perhaps you want to select some text between tags... but not the tags! Then you could use:

'~(?<=<p>).*?(?=<\/p>)~g'

for this text:

<p>Hello !</p> <p>I wont select tags! Only text with in</p> 

See demo here DEMO
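
A minimal Python sketch of the tag example above, assuming the sample text exactly as posted (INNER, text and found are names made up for the sketch):

```python
import re

# Lazily match whatever sits between <p> and </p>, excluding the tags.
INNER = re.compile(r"(?<=<p>).*?(?=</p>)")

text = "<p>Hello !</p> <p>I wont select tags! Only text with in</p> "

found = INNER.findall(text)
print(found)  # -> ['Hello !', 'I wont select tags! Only text with in']
```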

@James Brown 2018-12-06 11:45:55

Which answer was the accepted answer? Please add a link to it for the future me.

@Emanuele Del Grande 2014-06-30 11:25:56

The AND operator is implicit in the RegExp syntax.
The OR operator has instead to be specified with a pipe.
The following RegExp:

var re = /ab/;

means the letter a AND the letter b.
It also works with groups:

var re = /(co)(de)/;

it means the group co AND the group de.
Replacing the (implicit) AND with an OR would require the following lines:

var re = /a|b/;
var re = /(co)|(de)/;
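
For comparison, here is a small Python sketch of the same two patterns. It also shows why concatenation is an ordered AND: "ab" requires a immediately followed by b, so it is not an any-order match.

```python
import re

both = re.compile(r"ab")     # a, then b right after it (implicit AND)
either = re.compile(r"a|b")  # a or b (explicit OR)

in_order = both.search("cab") is not None       # "ab" appears
reversed_order = both.search("ba") is not None  # same letters, wrong order
separated = both.search("a-b") is not None      # both letters, not adjacent
any_letter = either.search("a-b") is not None   # either letter is enough

print(in_order, reversed_order, separated, any_letter)
# -> True False False True
```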

@JESii 2014-09-12 18:34:52

Unfortunately, this is not what the OP asked for. This finds anything in that order, whereas they wanted them in any order. Check out the answer by stackoverflow.com/users/20938/alan-moore below which is the correct one.

@Emanuele Del Grande 2018-09-06 10:07:47

@JESii thanks for your point, you are right and I misunderstood the question from Hugoware; I focused particularly on his first sentence. The right answer is a proper use of the lookahead operator, as AlanMoore wrote. Anyhow, I think someone may find my clarification useful, as it has already been upvoted, so I wouldn't throw everything away. Regards.

@garbagecollector 2014-11-20 21:36:21

You could pipe your output to another regex. Using grep, you could do this:

grep A | grep B
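
The same pipeline idea can be sketched in Python by chaining filters, one regex per stage. The grep helper below is a hypothetical stand-in written for this sketch, not a real library function.

```python
import re

def grep(pattern, lines):
    """Yield only the lines that match pattern, like one grep stage."""
    rx = re.compile(pattern)
    return (line for line in lines if rx.search(line))

lines = ["A then B", "only A", "B before A", "neither"]

# Equivalent in spirit to: grep A | grep B
result = list(grep("B", grep("A", lines)))
print(result)  # -> ['A then B', 'B before A']
```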

@mug896 2013-12-27 13:49:46

Why not use awk?
With awk, combining regexes with AND and OR is simple:

awk '/WORD1/ && /WORD2/ && /WORD3/' myfile

@fanjabi 2011-04-20 12:11:40

Look at this example:

We have 2 regexps A and B and we want to match both of them, so in pseudo-code it looks like this:

pattern = "/A AND B/"

It can be written without using the AND operator like this:

pattern = "/NOT (NOT A OR NOT B)/"

in PCRE:

"/^(^A|^B)/"

regexp_match(pattern,data)
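
One way to read the NOT (NOT A OR NOT B) identity in a flavor that has lookaheads is to write each NOT as a negative lookahead. The sketch below is an assumption about how that could look in Python, with the literal letters A and B standing in for the two sub-patterns; it is not the pattern posted above.

```python
import re

# Outer (?!...) is the outer NOT; each inner (?!.*X) is "NOT X".
BOTH = re.compile(r"^(?!(?!.*A)|(?!.*B))")

def has_both(text):
    """True when text contains both A and B, in either order."""
    return BOTH.search(text) is not None

print(has_both("xAyB"), has_both("B..A"), has_both("only A"))
# -> True True False
```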

@Alan Moore 2011-11-14 14:32:10

That's true in terms of formal logic, but it's absolutely no help here. In regexes, NOT can be even more difficult to express than AND.

@Codemole 2013-05-28 14:52:06

Clever point. any AND could be composed using NOT and OR

@Stefan 2013-06-28 14:57:43

nice, exactly what I was looking for! thx!

@petermeissner 2013-10-16 12:59:19

Does this work in practice or does it not work?

@Melebius 2013-11-18 10:25:15

@marvin_dpr It worked for me in CMake, while the other suggestion, (?=expr), did not. It seems to be implementation-dependent.

@Lambda Fairy 2013-12-30 01:57:54

Doesn't ^ mean "beginning of string" in regex syntax?

@tripleee 2015-02-17 12:41:58

In regex in general, ^ is negation only at the beginning of a character class. Unless CMake is doing something really funky (to the point where calling their pattern matching language "regex" could be regarded as misleading or incorrect) I'm guessing the fact that it worked for you was an isolated accident.

@Nishchal Gautam 2016-06-13 08:20:03

Thanks a lot, I feel like a god :)

@Hammad Khan 2011-09-09 18:13:04

Use AND outside the regular expression. In PHP, the lookahead operator did not seem to work for me, so instead I used this:

// True when the password has at least 3 characters and no whitespace.
return preg_match("/^.{3,}$/", $pass1) && !preg_match("/\s/", $pass1);

The above check passes if the password is 3 characters or longer and contains no whitespace.
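
For comparison, here is a Python sketch of the same rule written both ways: two checks combined outside the regex, and a single regex using a negative lookahead. The function names are made up for this sketch, and the single-regex form is shown only as an alternative, not as what the PHP code above does.

```python
import re

def valid_two_checks(pw):
    """Length >= 3 AND no whitespace, combined outside the regex."""
    return bool(re.search(r"^.{3,}$", pw)) and not re.search(r"\s", pw)

def valid_one_regex(pw):
    """Same rule in one pattern: (?!.*\\s) rejects any whitespace."""
    return re.search(r"^(?!.*\s).{3,}$", pw) is not None

for pw in ["abc", "ab", "ab c"]:
    print(pw, valid_two_checks(pw), valid_one_regex(pw))
```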

@Jason Cohen 2009-01-22 16:58:53

Use a non-consuming regular expression.

The typical (i.e. Perl/Java) notation is:

(?=expr)

This means "match expr but after that continue matching at the original match-point."

You can do as many of these as you want, and this will be an "and." Example:

(?=match this expression)(?=match this too)(?=oh, and this)

You can even add capture groups inside the non-consuming expressions if you need to save some of the data therein.
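
A small Python sketch of the "non-consuming" behavior, including a capture group inside the lookahead (the sample string is made up):

```python
import re

# The lookahead captures the whole run of digits without consuming it,
# so the \d after it still starts at the same position.
m = re.search(r"(?=(\d+))\d", "abc123")

print(m.group(0))  # -> 1    (what was actually consumed)
print(m.group(1))  # -> 123  (what the lookahead saw and captured)
print(m.span())    # -> (3, 4)
```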

@Hugoware 2009-01-22 17:39:52

Do you just place them all in a row, no separators between them? i.e. (?=apple)(?=orange)(?=pear)

@Robert P 2009-01-22 18:18:24

Separators (or any characters) would indicate that those must exist before the next non-consuming group.

@Robert P 2009-01-22 18:24:57

Giving this a try...I don't think this is exactly what he means.

@Robert P 2009-01-22 18:27:19

perl -e "q{some stuff and things} =~ /(?=some)(?=stuff)(?=things)/ ? print 'yes' : print 'no'" prints 'no'.

@Jason Cohen 2009-01-22 20:57:34

Thanks for the good comments; I've updated the answer to include examples.

@strager 2009-01-22 21:11:49

It should be mentioned that this particular example is called a positive lookahead assertion. It has other uses than "and". Note that the text isn't consumed.

@Nietzche-jou 2009-01-22 21:30:13

Using (?=) like this results in a regex that can never succeed. But it is the conjunction analog to |. The OP is just wrong in what he thinks will solve his problem.

@kriss 2010-06-14 22:32:50

perl -e "q{some stuff and things} =~ /(?=.*some)(?=.*stuff)(?=.*things)/ ? print 'yes' : print 'no'"
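
The difference between the two Perl one-liners in these comments can be reproduced in Python. Without .*, all three lookaheads must match at the same position, which is impossible for three different words; with .*, each lookahead may scan ahead independently.

```python
import re

text = "some stuff and things"

# Every lookahead must match starting at the very same position.
bare = re.search(r"(?=some)(?=stuff)(?=things)", text)

# Each lookahead may skip ahead before finding its word.
spaced = re.search(r"(?=.*some)(?=.*stuff)(?=.*things)", text)

print(bare is None)   # -> True
print(spaced.span())  # -> (0, 0), a zero-width match at the start
```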

@Pithikos 2011-11-25 13:59:19

Can you please add some easy example in perl code in your answer?

@Jageen 2017-08-23 00:26:26

It's worth mentioning that lookbehind is not supported in JavaScript (lookahead is).

@Alan Moore 2009-01-22 20:04:11

You need to use lookahead as some of the other responders have said, but the lookahead has to account for other characters between its target word and the current match position. For example:

(?=.*word1)(?=.*word2)(?=.*word3)

The .* in the first lookahead lets it match however many characters it needs to before it gets to "word1". Then the match position is reset and the second lookahead seeks out "word2". Reset again, and the final part matches "word3"; since it's the last word you're checking for, it isn't necessary that it be in a lookahead, but it doesn't hurt.

In order to match a whole paragraph, you need to anchor the regex at both ends and add a final .* to consume the remaining characters. Using Perl-style notation, that would be:

/^(?=.*word1)(?=.*word2)(?=.*word3).*$/m

The 'm' modifier is for multiline mode; it lets the ^ and $ match at paragraph boundaries ("line boundaries" in regex-speak). It's essential in this case that you not use the 's' modifier, which lets the dot metacharacter match newlines as well as all other characters.

Finally, you want to make sure you're matching whole words and not just fragments of longer words, so you need to add word boundaries:

/^(?=.*\bword1\b)(?=.*\bword2\b)(?=.*\bword3\b).*$/m
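
As a sketch of how the final pattern might be used in Python, the helper below builds it from a word list and applies it in multiline mode, treating each line as a paragraph. The helper name, the sample words and the sample text are all assumptions for illustration.

```python
import re

def all_words_pattern(words):
    """Build a per-line pattern requiring every word, in any order."""
    lookaheads = "".join(rf"(?=.*\b{re.escape(w)}\b)" for w in words)
    return re.compile(rf"^{lookaheads}.*$", re.MULTILINE)

text = "\n".join([
    "apple pie with orange and pear",
    "orange juice only",
    "pear, apple; orange!",
    "pineapple orange pear",  # "apple" only as part of a longer word
])

pattern = all_words_pattern(["apple", "orange", "pear"])
matches = pattern.findall(text)
print(matches)
# -> ['apple pie with orange and pear', 'pear, apple; orange!']
```

Passing each word through re.escape keeps words containing regex metacharacters from changing the meaning of the pattern.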

@Lincoln 2012-09-19 14:12:36

Exactly right - there is a tutorial about this as well! ocpsoft.org/tutorials/regular-expressions/and-in-regex

@zest 2013-05-23 12:36:43

Thanks a lot, the .* makes the difference.

@zx81 2014-05-17 09:42:33

+1 for a clear and succinct answer showcasing one of the best uses for lookaheads (unlike uses such as a hack to count the percentage match of a password). :)

@cbuchart 2014-10-10 15:06:33

Works with QRegExp!

@Liam 2015-10-15 15:55:15

Is this supported in MySQL's regexp() ?

@Alan Moore 2015-10-15 19:39:36

@Liam: MySQL uses the POSIX ERE flavor, so no. It effectively sacrifices features in favor of performance, which seems reasonable to me. There's more information here.

@DelightedD0D 2017-07-19 21:38:06

Replace .* with [\s\S]* in JavaScript if your text contains newlines, since . in JavaScript's regex engine does not match newlines and cannot be made to with modifiers.
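
The effect of the [\s\S]* substitution can be sketched in Python, where the dot also skips newlines by default. The sample text is made up; re.DOTALL is Python's own flag and is shown only for comparison.

```python
import re

text = "word1 is on the first line\nword2 is on the second"

# "." stops at the newline, so word2 is never reached from the start.
same_line = re.search(r"^(?=.*word1)(?=.*word2)", text)

# [\s\S] matches any character, newlines included.
any_line = re.search(r"^(?=[\s\S]*word1)(?=[\s\S]*word2)", text)

# In Python the DOTALL flag gives "." the same reach.
dotall = re.search(r"^(?=.*word1)(?=.*word2)", text, re.DOTALL)

print(same_line is None, any_line is not None, dotall is not None)
# -> True True True
```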

@Juha Syrjälä 2009-01-22 18:07:04

You can do that with a regular expression, but you'll probably want to do something else. For example, use several regexps and combine them in an if clause.

You can enumerate all possible permutations with a standard regexp, like this (matches a, b and c in any order):

(abc)|(bca)|(acb)|(bac)|(cab)|(cba)

However, this makes for a very long and probably inefficient regexp if you have more than a couple of terms.
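
If you do go the permutation route, the alternation can be generated rather than typed by hand. The sketch below uses Python's itertools.permutations; the helper name is made up, and the pattern grows factorially with the number of terms.

```python
import re
from itertools import permutations

def permutation_pattern(terms):
    """Alternation of every ordering of terms, e.g. abc|acb|bac|..."""
    return "|".join(
        "".join(re.escape(t) for t in order)
        for order in permutations(terms)
    )

pattern = permutation_pattern(["a", "b", "c"])
print(pattern)  # -> abc|acb|bac|bca|cab|cba

print(re.fullmatch(pattern, "bca") is not None)  # -> True
print(re.fullmatch(pattern, "aab") is not None)  # -> False
```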

If you are using some extended regexp version, like Perl's or Java's, they have better ways to do this. Other answers have suggested using positive lookahead operation.

@Casimir et Hippolyte 2013-06-13 18:05:53

I don't think your approach is any less efficient than 3 lookaheads with their catastrophic backtracking. Sure, it is longer to write, but note that you can easily generate the pattern automatically. Note that you can improve it to fail quicker with a(bc|cb)|b(ac|ca)|c(ab|ba). And most importantly, you can use it with all regex flavors.

@jpalecek 2009-01-22 16:59:30

If you use Perl regular expressions, you can use positive lookahead:

For example

(?=[1-9][0-9]{2})[0-9]*[05]\b

would match numbers that are at least 100 and divisible by 5
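
A quick Python check of that pattern against a few made-up numbers: the lookahead enforces "three or more digits, no leading zero" and the rest enforces "ends in 0 or 5", so both conditions must hold at once.

```python
import re

# Lookahead: at least three digits, first one non-zero (so >= 100).
# Body: any digits ending in 0 or 5 (so divisible by 5).
FIVES = re.compile(r"(?=[1-9][0-9]{2})[0-9]*[05]\b")

found = FIVES.findall("5 95 100 105 1234 2000")
print(found)  # -> ['100', '105', '2000']
```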

@user54579 2009-01-22 16:57:52

Is it not possible in your case to do the AND on several matching results? in pseudocode

regexp_match(pattern1, data) && regexp_match(pattern2, data) && ...
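
In Python, that pseudocode could be sketched with all() over a list of patterns. The function name and the sample patterns are assumptions for illustration.

```python
import re

def matches_all(patterns, data):
    """True when every pattern matches somewhere in data."""
    return all(re.search(p, data) for p in patterns)

patterns = [r"\bapple\b", r"\bpear\b"]

print(matches_all(patterns, "pear and apple"))  # -> True
print(matches_all(patterns, "pear only"))       # -> False
```

Because all() short-circuits, the remaining patterns are skipped as soon as one fails to match.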

@Alan Wolfe 2015-09-04 20:33:12

I'm in a situation where I have some code that is a data table of rules, with a single regex pattern match string to test the rule's validity. Moving to multiple tests isn't something I can do in my case, and commonly in other folks' cases as well!

@pilif 2009-01-22 16:56:05

The order is always implied in the structure of the regular expression. To accomplish what you want, you'll have to match the input string multiple times against different expressions.

What you want to do is not possible with a single regexp.

@Robert P 2009-01-22 18:29:35

It's not technically impossible, but not worthwhile to implement. I dunno why someone downvoted though...

@Alan Moore 2009-01-22 21:07:51

Probably because it's not only possible, it's simple, assuming your regex flavor supports lookaheads. And that's a good bet; most of today's major programming languages do support them.
