Validating A URL
brendandonhue
06-18-2003, 04:57 PM
I am looking for a way to validate a URL. I don't need to check that it actually exists, as in the example below. I just need to know that it's feasible that it's a real URL.
For example, it must begin with either http:// or ftp:// (yes, that's OK, because I don't want https or mailto or other links), then it must contain a string, then a period, followed by at least 2 characters.
I'm guessing I will have to use regular expressions, but I don't know exactly how to go about this.
Thanks in advance.
http://us2.php.net/manual/en/function.ereg-replace.php has the following code, but it does not work for www. It must have http:// or ftp:// at the beginning.
$text = ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\">\\0</a>", $text);
Jona
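For anyone reading this on a newer PHP: ereg_replace() was deprecated in PHP 5.3 and removed in PHP 7. A minimal sketch of the same auto-linking idea with preg_replace() follows. The function name linkify() is made up for illustration, and the pattern is the one from the manual comment above, only wrapped in # delimiters.

```php
<?php
// Hypothetical helper name; the pattern is the one quoted from the manual page.
function linkify(string $text): string
{
    // '#' is the delimiter here, so the slashes in "://" need no escaping.
    // preg_replace() returns null on error; fall back to the input in that case.
    return preg_replace(
        '#[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]#',
        '<a href="$0">$0</a>',
        $text
    ) ?? $text;
}

echo linkify("See http://www.example.com/page for details."), "\n";
```

As in the original, nothing is matched unless the text contains a scheme followed by ://, so a bare www address is left alone.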
brendandonhue
06-18-2003, 05:35 PM
Thanks Jona, that looks perfect!
But how do I use that to validate the URL now?
Let's say the URL is stored in $url. How would I check if it meets those requirements?
if ($url == $text) ??
if (preg_match("#[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]#", $url)) {
    # Is a valid URL (the pattern needs delimiters for preg_match; # is used here)
} else {
    # Is not a valid URL
}
Jona
brendandonhue
06-18-2003, 05:50 PM
Awesome :)
I would recommend something more like this, as the one Jona posted has many obvious flaws:
<?PHP
$url = "http://www.infinitypages.com";
if (preg_match("/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i", $url)) {
echo "Valid URL";
}
else {
echo "Invalid URL";
}
?>
Let me explain it now. It will match like this:
Must start with either http (with an optional s):// or ftp://, followed by one or more groups of alphanumeric characters or underscores that each end in a period (this allows for subdomains, etc.), followed by an ending that is at least 2 characters long.
Note: if you do not want it to allow https://, just remove the (s?). If you want to add mailto:, just add it with a new | after the ftp://
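To see what the description above accepts and rejects, here is a small sketch that wraps the same pattern in a function and runs a few sample strings through it. The function name is_plausible_url() and the sample list are assumptions for illustration, not part of the original post.

```php
<?php
// Hypothetical wrapper; the pattern is the one posted above, unchanged.
function is_plausible_url(string $url): bool
{
    $pattern = '/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i';
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match($pattern, $url) === 1;
}

$samples = [
    'http://www.infinitypages.com',  // scheme, dotted labels, ending
    'https://example.org',
    'ftp://files.example.net',
    'www.example.com',               // no scheme
    'mailto:someone@example.com',    // scheme not in the list
    'http://localhost',              // no period
];
foreach ($samples as $s) {
    echo str_pad($s, 32), is_plausible_url($s) ? 'valid' : 'invalid', "\n";
}
```

One limitation worth knowing: \w does not include the hyphen, so a host such as my-site.com is rejected by this pattern.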
brendandonhue
06-18-2003, 05:54 PM
OK, I mostly understand it now.
If you don't mind:
| is the operator for "or", is that correct?
Can you explain
((\w+\.){1,})\w{2,}
I don't understand what that does. Maybe I should just read the manual for preg_match().
Hey, it's not my RegExp, so don't go saying that I suck at them! :) I was just giving Brendandonhue a suggestion... I didn't even know if it worked or not. lol I anticipated you'd come in here and tell me that I did something wrong. :D
I'm not the world's greatest PHP programmer, as can plainly be seen here, so don't expect my suggestions to be solutions all the time. :)
Jona
brendandonhue
06-18-2003, 05:58 PM
Jona, I think he was just trying to recommend a better method. I didn't see any of the flaws in yours, but hey, I'm not the best PHPer either :D
Sure:
((\w+\.){1,}) means to match one or more alphanumeric (or _) characters followed by a period, one or more times, i.e. test. or test.test.test. will both validate. This is so that http://infinitypages.com or http://your.very.own.subdomain.infinitypages.com will both validate (as they are valid addresses).
\w{2,}$ means $url must end with two or more alphanumeric or underscore characters. So, this will NOT allow for http://www.infinitypages.com/home.php. Wasn't sure if you wanted it to.
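If paths such as /home.php should be accepted as well, one possible extension is to allow an optional slash followed by non-whitespace characters after the host. This is only a sketch: the function name and the (\/\S*)? part are assumptions added here, not something proposed in the thread.

```php
<?php
// Hypothetical variant: same scheme and host rules as above, plus an
// optional path. The (\/\S*)? group is an assumption of this sketch.
function is_plausible_url_with_path(string $url): bool
{
    $pattern = '/^(https?|ftp):\/\/(\w+\.)+\w{2,}(\/\S*)?$/i';
    return preg_match($pattern, $url) === 1;
}

var_dump(is_plausible_url_with_path('http://www.infinitypages.com/home.php'));
var_dump(is_plausible_url_with_path('http://www.infinitypages.com'));
var_dump(is_plausible_url_with_path('http://www.infinitypages.com home.php'));
```

The path part is deliberately loose (anything without whitespace), which fits the "is it feasible" goal but does not check that the path itself is well formed.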
Originally posted by Jona
... you'd come in here and tell me that I did something wrong.
Sorry. Didn't mean it that way. As brendandonhue said, I just wanted to provide a better alternative, as the other regexp wouldn't really do all that much to validate a URL.
Yeah, I know. I wasn't upset or anything--besides, it's your job lol. But like I said, I knew you'd come over here and say something about it. :D No offense taken whatsoever. ;)
Jona
This regexp is a touch tidier than my last, and allows for an optional trailing slash:
if (preg_match("/^(http(s?):\\/\\/|ftp:\\/\\/{1})((\w+\.)+)\w{2,}(\/?)$/i", $url)) {
brendandonhue
06-18-2003, 06:34 PM
Ok thanks, got it :)
jeffmott
06-18-2003, 06:37 PM
I built a regular expression for URIs about the same time I made one for e-mails. This one comes from RFC2396 Uniform Resource Identifiers (URI): Generic Syntax. Because it is generic syntax, it does not test for any specific scheme, which is good and bad. Good: allows for the many valid schemes currently in existence, and others to be created later. Bad: a non-valid scheme will not be recognized.
But anyway, it is attached. Since this one is far more complicated and lengthy than the e-mail, I've also attached the outlines for each part. It should help you to break it down, or build another one up if you choose to.
I have not found any bugs, but if anyone else does please tell me.
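The attachment is not reproduced in this thread, so as a rough stand-in, here is the short generic expression that RFC 2396 itself gives in Appendix B for splitting a URI reference into its components. It is not the expression described above, and it only separates the parts without validating them. The function name split_uri() is made up for this sketch.

```php
<?php
// Generic URI splitter from RFC 2396, Appendix B. This is NOT the attached
// expression from the post above. It breaks a URI reference into parts;
// it does not check that the parts are valid.
function split_uri(string $uri): array
{
    // '!' is the delimiter because the pattern itself contains '/', '#' and '?'.
    $pattern = '!^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^?#]*))?(#(.*))?!';
    preg_match($pattern, $uri, $m);
    // Trailing groups that did not take part in the match are absent from $m.
    return [
        'scheme'    => $m[2] ?? '',
        'authority' => $m[4] ?? '',
        'path'      => $m[5] ?? '',
        'query'     => $m[7] ?? '',
        'fragment'  => $m[9] ?? '',
    ];
}

print_r(split_uri('http://www.ics.uci.edu/pub/ietf/uri/#Related'));
```

Because the syntax is generic, a scheme check (for example, accepting only http and ftp) would have to be done separately on the 'scheme' entry.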
brendandonhue
06-18-2003, 06:40 PM
Thanks, but that is really long. I'm not sure if I need that, or if I should just stick with the simpler one.
Originally posted by jeffmott
...regular expression...
Good 'ol jeffmott.... :)
Originally posted by pyro
Good 'ol jeffmott.... :)
Jeffmott: The RegExpert! :D
Jona
jeffmott
06-19-2003, 06:40 AM
:D
brendandonhue
06-19-2003, 09:08 AM
Thanks Jona, Pyro, and Jeffmott!
Now I have another question of course though.
When I put the URL validation into an IF statement, it always returns false. It is in there with two other conditions.
Here's the line:
if (preg_match("/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.)+)\w{2,}(\/?)$/i", $url) && $text != "" && $description !="")
I know the rest of the script works fine, because if I just remove the preg_match part of the IF it works.
TIA
Originally posted by brendandonhue
...because if I just remove the preg_match part of the IF it works.
TIA
Switch that around--take out everything except for the preg_match. (I'm not saying that will make it work, but you could try it.)
Jona
brendandonhue
06-19-2003, 10:20 AM
Sorry, it doesn't work that way either :(
I would nest the if statements:
if ($text != "" && $description != "") {
    if (preg_match("/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.)+)\w{2,}(\/?)$/i", $url)) {
        // all's well
    }
} else {
    // either $text or $description is ""
}
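When a preg_match() condition is always false, it can help to look at the raw return value: 0 means the pattern ran and did not match, while false means the pattern itself could not be compiled. The sketch below illustrates the difference. The $broken pattern is a deliberately damaged copy with the backslashes left out, used only to show the false case; whether that is what happened in the script above is a guess.

```php
<?php
$url = 'http://www.infinitypages.com';

// Deliberately damaged: the "/" inside "(/?)" is not escaped, so it is read
// as the closing delimiter and the rest is taken as (invalid) modifiers.
$broken = '/^(http(s?):\/\/|ftp:\/\/{1})((w+.)+)w{2,}(/?)$/i';

// Same pattern with the backslashes in place.
$fixed = '/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.)+)\w{2,}(\/?)$/i';

// "@" only hides the warning here so the return value can be inspected.
$r1 = @preg_match($broken, $url);
$r2 = preg_match($fixed, $url);

var_dump($r1); // false here means the pattern failed, not that the URL is bad
var_dump($r2);
```

Checking with === false (rather than a plain if) separates "bad pattern" from "no match". Dumping $url itself is also worth a try, since stray whitespace or a path after the host would make this pattern return 0.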
jeffmott
06-19-2003, 12:46 PM
Thanks-but that is really long
Yes it is. I usually stick it away in a variable or function so I never have to look at it again. :)
Its specificity comes in more handy when you want to locate URIs in a body of plain text.
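A rough sketch of both ideas (tucking the pattern away in a function and using it to pull URIs out of plain text) might look like this. The function name find_urls() is made up, and the pattern is a short one based on the expression from the start of the thread, limited here to http, https and ftp. It stands in for the longer RFC-based expression, which is not reproduced here.

```php
<?php
// Hypothetical helper: keeps the pattern in one place so callers never
// have to look at it. The pattern is a stand-in, not the RFC-based one.
function find_urls(string $text): array
{
    $pattern = '#(?:https?|ftp)://[^<>[:space:]]+[[:alnum:]/]#i';
    preg_match_all($pattern, $text, $matches);
    // $matches[0] holds every full match, in order of appearance.
    return $matches[0];
}

print_r(find_urls('Docs at http://us2.php.net/manual/en/, mirror at ftp://example.org/pub.'));
```

The final character class requires a match to end in a letter, digit or slash, which keeps trailing punctuation such as a comma or full stop out of the result.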
mldarshana
05-15-2009, 01:34 AM
I would recommend something more like this, as the one Jona posted has many obvious flaws:
<?PHP
$url = "http://www.infinitypages.com";
if (preg_match("/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i", $url)) {
echo "Valid URL";
}
else {
echo "Invalid URL";
}
?>
Let me explain it now.. It will match like this:
Must start with either http(with an optional s):// or ftp:// followed by any number of alphanumeric characters and the understrike (this allows for subdomains, etc) followed by an ending that is at least 2 characters long.
Note: if you do not want it to allow https:// just remove the (s?) or if you want to add mailto: just add it in with a new | after the ftp://
Many Thanks .... this is exactly what i've looked for :)