I need a simple preg_match that finds "c.aspx" (without quotes) in the content and, if it finds it, returns the whole URL. For example:
$content = '<div>[4]<a href="/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&n=783622212">New message</a><br/>';
Now it should match "c.aspx" in $content and give this output:
"/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&n=783622212"
The $content may contain other links besides the "c.aspx" ones. I don't want those; I only want the URLs that contain "c.aspx".
Please let me know how I can do it.
From Stack Overflow
-
You use DOM to parse HTML, not regex. You can use regex to parse the attribute value though.
Edit: updated example so it checks for c.aspx.
$content = '<div>[4]<a href="/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&n=783622212">New message</a> <a href="#bar">foo</a> <br/>';

// Collect libxml parse warnings internally instead of printing them for this HTML fragment.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($content);
$anchors = $dom->getElementsByTagName('a');

// length is already an integer, so compare it directly rather than passing it to count().
if ($anchors->length > 0) {
    foreach ($anchors as $anchor) {
        if ($anchor->hasAttribute('href')) {
            $link = $anchor->getAttribute('href');
            // strpos() returns 0 for a match at the start of the string, so compare against false.
            if (strpos($link, 'c.aspx') !== false) {
                echo $link, "\n";
            }
        }
    }
}

Ken Keenan : There is also a PHP function, `parse_url()`, that you can use once you've extracted the URL from the href attribute.
SHAKTI : Wow, thanks, it really works. Thank you very much. :)
karim79 : @meder - voted up, and seriously, I love you. There is *no* regex solution to this problem.
-
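Following up on the `parse_url()` comment above, here is a minimal sketch of what that could look like once the href has been extracted. The variable names `$link`, `$parts` and `$query` are only illustrative, and the use of `parse_str()` to decode the query string is an addition that the comment does not mention.

```php
<?php
// Assumed input: the href value extracted from the anchor in the question.
$link = '/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&n=783622212';

// parse_url() splits a URL (relative URLs included) into its components.
$parts = parse_url($link);

// parse_str() decodes the query string into an associative array.
$query = [];
parse_str(isset($parts['query']) ? $parts['query'] : '', $query);

echo $parts['path'], "\n"; // the path component, without the query string
echo $query['n'], "\n";    // the value of the "n" parameter
```

This keeps the URL handling out of any regex: the DOM parser finds the attribute, and `parse_url()` takes the value apart.
-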
If you want to find any quoted string with c.aspx in it:
/"[^"]*c\.aspx[^"]*"|'[^']*c\.aspx[^']*'/
But really, for parsing most HTML you'd be better off with some sort of DOM parser so that you can be sure what you're matching is really an href.
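For completeness, a rough sketch of how that pattern might be applied with `preg_match_all()`. The names `$pattern` and `$urls` are illustrative. Each match includes its surrounding quotes, so they are trimmed off afterwards. As the answer warns, the pattern matches any quoted string, so it can also pick up text that is not an href.

```php
<?php
$content = '<div>[4]<a href="/m/c.aspx?mt=01_9310ba801f1255e02e411d8a7ed53ef95235165ee4fb0226f9644d439c11039f%7c8acc31aea5ad3998&n=783622212">New message</a> <a href="#bar">foo</a> <br/>';

// Same pattern as above, written as a single-quoted PHP string
// (the single quotes inside the pattern are escaped for PHP).
$pattern = '/"[^"]*c\.aspx[^"]*"|\'[^\']*c\.aspx[^\']*\'/';

// $matches[0] holds every full match, quotes included.
preg_match_all($pattern, $content, $matches);

// Strip the leading and trailing quote character from each match.
$urls = array_map(function ($m) {
    return trim($m, "\"'");
}, $matches[0]);

print_r($urls);
```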