2010-09-29 10:18:14 8 Comments
I'm trying to find the links on a page.
My regex is:
/<a\s[^>]*href=(\"\'??)([^\"\' >]*?)[^>]*>(.*)<\/a>/
but it seems to fail on
<a title="this" href="that">what?</a>
How would I change my regex to deal with href not placed first in the a tag?
10 comments
@Meloman 2019-01-22 12:54:27
The following is working for me and returns both href and value of the anchor tag.
The multidimensional array called $urls now contains associative sub-arrays that are easy to use.
@Milan Malani 2016-08-26 11:17:59
For anyone who still hasn't found a solution: this is very easy and fast using SimpleXML.
It's working for me.
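The code for this answer did not survive, so the following is only a minimal sketch of what a SimpleXML approach might look like. It assumes the markup is well-formed XML (SimpleXML will reject typical real-world "tag soup"), and the sample string and the $links variable are made up for illustration.

```php
<?php
// Assumption: well-formed, XML-compatible markup; sample input is hypothetical.
$html = '<div><a title="this" href="that">what?</a><a href="other">more</a></div>';

$xml = simplexml_load_string($html);
if ($xml === false) {
    throw new RuntimeException('Markup is not well-formed XML');
}

$links = [];
foreach ($xml->xpath('//a') as $a) {
    $links[] = [
        'href' => (string) $a['href'], // attribute access, regardless of attribute order
        'text' => (string) $a,         // text content of the anchor
    ];
}

print_r($links);
```

For HTML that is not valid XML, the DOM extension is usually the safer choice.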
@Ravi Prakash 2016-07-06 05:23:10
preg_match_all("/(<a[^>]*>)(.*?)(<\/a)/", $contents, $impmatches, PREG_SET_ORDER);
It is tested and it fetches all a tags from any HTML code.
@Gordon 2010-09-29 10:35:53
Reliable regexes for HTML are difficult. Here is how to do it with DOM:
The above would find and output the "outerHTML" of all A elements in the $html string.
To get all the text values of the node, you do
To check if the href attribute exists you can do
To get the href attribute you'd do
To change the href attribute you'd do
To remove the href attribute you'd do
You can also query for the href attribute directly with XPath
Also see:
On a sidenote: I am sure this is a duplicate and you can find the answer somewhere in here
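The snippets from this answer were lost, so here is a rough sketch of the operations it lists, using PHP's DOM extension. It is not the original code: the sample $html and the variable names ($outer, $texts, $hrefs, and so on) are assumptions chosen for illustration.

```php
<?php
// Hypothetical sample input; the second anchor deliberately has no href.
$html = '<p><a title="this" href="that">what?</a> <a name="top">no href</a></p>';

$dom = new DOMDocument();
// Keep libxml quiet about imperfect markup.
$dom->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

$outer = [];
$texts = [];
$hrefs = [];
foreach ($dom->getElementsByTagName('a') as $node) {
    $outer[] = $dom->saveHTML($node);           // "outerHTML" of the element
    $texts[] = $node->nodeValue;                // text value of the node
    if ($node->hasAttribute('href')) {          // check that href exists
        $hrefs[] = $node->getAttribute('href'); // read href
    }
}

// Query the href attributes directly with XPath.
$xpath = new DOMXPath($dom);
$xpathHrefs = [];
foreach ($xpath->query('//a/@href') as $attr) {
    $xpathHrefs[] = $attr->nodeValue;
}

// Change, then remove, the href of the first anchor.
$first = $dom->getElementsByTagName('a')->item(0);
$first->setAttribute('href', 'something else');
$changed = $first->getAttribute('href');
$first->removeAttribute('href');

print_r($hrefs);
```

Because the parser reads attributes by name, the position of href inside the tag does not matter.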
@Asciiom 2013-10-10 14:11:56
A reliable regex for parsing HTML is inherently impossible, since HTML is not a regular language.
@Toto 2010-09-29 11:43:02
I agree with Gordon, you MUST use an HTML parser to parse HTML. But if you really want a regex you can try this one:
This matches <a at the beginning of the string, followed by any number of any char (non greedy) .*? then href= followed by the link surrounded by either " or '
Output:
@Michal - wereda-net 2014-11-28 17:51:56
Just for info: if we search in a text containing many a elements, then the expression (.*?) is wrong.
@Aif 2010-09-29 10:21:13
why don't you just match
then
which works. I've just removed the first capture braces.
@Ignacio Bustos 2013-10-22 15:33:25
I recommend using
preg_match_all("/<a.*?href\s*=\s*['\"](.*?)['\"]/", $str, $res, PREG_SET_ORDER);
in order to catch all href values correctly, using foreach($res as $key => $val){echo $val[1];}
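As a small runnable sketch of that suggestion: the sample $str below is made up, and $hrefs is just an illustrative name. With PREG_SET_ORDER each element of $res is one complete match, so index 1 holds the captured href value.

```php
<?php
// Hypothetical input with href in different positions and both quote styles.
$str = '<a title="this" href="that">what?</a> <a href=\'other\' class="x">more</a>';

preg_match_all("/<a.*?href\s*=\s*['\"](.*?)['\"]/", $str, $res, PREG_SET_ORDER);

$hrefs = [];
foreach ($res as $val) {
    $hrefs[] = $val[1]; // first capture group: the href value
}

print_r($hrefs);
```

This still treats HTML as plain text, so it can misfire on comments, scripts, or attribute values that contain quotes.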
@Ruel 2010-09-29 10:25:36
Using your regex, I modified it a bit to suit your need.
<a.*?href=("|')(.*?)("|').*?>(.*)<\/a>
I personally suggest you use an HTML parser
EDIT: Tested
@bergin 2010-09-29 10:28:04
Using myregextester.com: sorry, it doesn't find the links.
@Ruel 2010-09-29 10:33:33
@bergin, Hi I modified my answer, and it works now.
@bergin 2010-09-29 10:38:50
it says: NO MATCHES. CHECK FOR DELIMITER COLLISION.
@Ruel 2010-09-29 10:41:17
Can you please tell me the text to match? I use:
<a title="this" href="that">what?</a>
@Adam 2010-09-29 10:25:32
I'm not sure what you're trying to do here, but if you're trying to validate the link then look at PHP's filter_var()
If you really need to use a regular expression then check out this tool, it may help: http://regex.larsolavtorvik.com/
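If validation is the goal, a minimal sketch with filter_var() could look like the following. The candidate strings and the $valid variable are made up for illustration. Note that FILTER_VALIDATE_URL expects a URL with a scheme, so a relative href such as "that" is rejected.

```php
<?php
// Hypothetical candidate values, e.g. hrefs already extracted from a page.
$candidates = ['http://example.com/page?x=1', 'that', 'not a url'];

$valid = [];
foreach ($candidates as $candidate) {
    // filter_var() returns the filtered value on success and false on failure.
    if (filter_var($candidate, FILTER_VALIDATE_URL) !== false) {
        $valid[] = $candidate;
    }
}

print_r($valid);
```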
@CharlesLeaf 2010-09-29 10:23:22
Quick test:
<a\s+[^>]*href=(\"\'??)([^\1]+)(?:\1)>(.*)<\/a>
seems to do the trick, with the 1st match being " or ', the second the 'href' value 'that', and the third the 'what?'. The reason I left the first match of "/' in there is that you can use it to backreference it later for the closing "/' so it's the same.
See live example on: http://www.rubular.com/r/jsKyK2b6do
@CharlesLeaf 2010-09-29 10:30:43
@bergin please specify, what doesn't work? I get the exact value from the href in your test HTML. What are you expecting that this doesn't do? I see you use a different site for testing, there I also get the 'href' value successfully from your example. myregextester.com/?r=d966dd6b
@Alex Pliutau 2010-09-29 10:22:23
The pattern you want to look for would be the link anchor pattern, like (something):
@funerr 2016-09-09 11:36:40
What if the anchor has more attributes?