uri regex
by Cjackson
Jun 14 2005 7:56PM
Hi,
I flunked the IQ test so I need some help. I want to match all domains
in the body that are not in .com, .net, .org, .us, .edu, .gov and .mil. But there's
more. I need to match some characters at the end of the URI that can
often be found there such as > .?)*!"';
The rule should match http://www.go.za and http://www.go.za), but should
not match http://www.go.com
Here's my regex that does not work...
m{https?://[^\s/:"')!?> *]+(?<!\.com)(?<!\.net)(?<!\.org)(?<!\.gov)(?<!\.us)(?<!\.edu)(?<!\.mil)(?:"|'|:|\?|!|> |\*|\)|$)}
It works for all of the characters except for an ending "." such as
http://www.go.com.
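To make the failure concrete, here is a minimal sketch that reproduces it outside the mail scanner. It is a Python translation of the pattern above, not the rule itself, and it rests on a few assumptions: the space after ">" in the archived pattern is treated as an archive artifact and dropped, Python's re module stands in for Perl/PCRE, and the names URI_RULE, rule_hits and SAMPLES are made up for illustration.

```python
import re

# Python translation of the Perl pattern quoted above.
# Assumption: the space after ">" in the archived pattern is an archive
# artifact, so it is dropped here.
URI_RULE = re.compile(
    r"""https?://[^\s/:"')!?>*]+"""
    r"""(?<!\.com)(?<!\.net)(?<!\.org)(?<!\.gov)(?<!\.us)(?<!\.edu)(?<!\.mil)"""
    r"""(?:"|'|:|\?|!|>|\*|\)|$)"""
)


def rule_hits(text: str) -> bool:
    """Return True if the rule fires anywhere in the text."""
    return URI_RULE.search(text) is not None


# (sample, whether the rule is wanted to fire)
SAMPLES = [
    ("http://www.go.za", True),
    ("http://www.go.za)", True),
    ("http://www.go.com", False),
    ("http://www.go.com.", False),  # the problem case: trailing "."
]

for sample, wanted in SAMPLES:
    got = rule_hits(sample)
    flag = "ok" if got == wanted else "WRONG"
    print(f"{sample!r:24} wanted={wanted!s:5} got={got!s:5} {flag}")
```

In this sketch only the last sample comes out wrong. The character class does not exclude ".", so the greedy run swallows the trailing dot, and by the time the lookbehinds are checked the preceding text ends in "com." rather than ".com", so none of them object.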
I have grappled with this for some time and read the pcrepattern.txt
accompanying Exim source, but damn if I can get it to work. Anybody want
to spit out the answer?
Thanks,
Craig Jackson