www.webdeveloper.com

Thread: Validating A URL

  1. #1
    Join Date
    May 2003
    Posts
    550

    Validating A URL

    I am looking for a way to validate a URL. I don't need to check that it actually exists, as in the example below this. I just need to know that it's feasible that it's a real URL.
    For example, it must begin with either http:// or ftp:// (yes, that's OK, because I don't want https or mailto or other links), then it must contain a string, then it would have to have a period, followed by at least 2 characters.

    I'm guessing I will have to use regular expressions, but I don't know exactly how to go about this.
    -Thanks in advance.

  2. #2
    Join Date
    Jan 2003
    Location
    Texas
    Posts
    10,413
    http://us2.php.net/manual/en/function.ereg-replace.php has the following code, but it does not work for addresses that start with a bare www. The URL must have http:// or ftp:// at the beginning.

    PHP Code:
    $text = ereg_replace("[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]", "<a href=\"\\0\">\\0</a>", $text);
    Jona
    Visit Slightly Remarkable to see my portfolio, resumé, and consulting rates.

  3. #3
    Join Date
    May 2003
    Posts
    550
    Thanks Jona, that looks perfect!
    But how do I use that to validate the URL now?

    Let's say the URL is stored in $url, how would I check if it meets those requirements?

    if ($url == $text) ??

  4. #4
    Join Date
    Jan 2003
    Location
    Texas
    Posts
    10,413
    PHP Code:
    if (preg_match("#[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]#", $url)) {
        # Is a valid URL
    } else {
        # Is not a valid URL
    }

  5. #5
    Join Date
    May 2003
    Posts
    550
    Awesome

  6. #6
    Join Date
    Dec 2002
    Location
    High on life
    Posts
    10,104
    I would recommend something more like this, as the one Jona posted has many obvious flaws:

    PHP Code:
    <?PHP
    $url = "http://www.infinitypages.com";

    if (preg_match("/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i", $url)) {
        echo "Valid URL";
    }
    else {
        echo "Invalid URL";
    }

    ?>
    Let me explain it now.. It will match like this:

    Must start with either http(with an optional s):// or ftp://, followed by one or more groups of alphanumeric or underscore characters that each end in a period (this allows for subdomains, etc.), followed by an ending that is at least 2 characters long.

    Note: if you do not want it to allow https://, just remove the (s?), or if you want to add mailto:, just add it in with a new | after the ftp://
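
A minimal sketch of the two variants described in the post above, wrapped in functions so they can be compared side by side. The function names are hypothetical and not from the thread, and the patterns are written in single quotes so the backslashes reach the regex engine unchanged.

```php
<?php
// Hypothetical helper names; the first pattern is the one posted above.
function is_valid_url($url)
{
    // http:// or https:// or ftp://, then one or more "label." groups,
    // then an ending of at least two word characters.
    return preg_match('/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i', $url) === 1;
}

function is_valid_http_or_ftp_url($url)
{
    // Same pattern with the (s?) removed, as the note suggests, so https:// is rejected.
    return preg_match('/^(http:\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i', $url) === 1;
}

var_dump(is_valid_url("http://www.infinitypages.com"));              // bool(true)
var_dump(is_valid_url("https://www.infinitypages.com"));             // bool(true)
var_dump(is_valid_http_or_ftp_url("https://www.infinitypages.com")); // bool(false)
var_dump(is_valid_http_or_ftp_url("ftp://ftp.infinitypages.com"));   // bool(true)
var_dump(is_valid_url("www.infinitypages.com"));                     // bool(false): no scheme
```

The second function is closer to what the original question asked for, since it accepts only http:// and ftp://.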

    Personal website http://www.ryanbrill.com/
    Business website: http://www.infinitywebdesign.com/
    TypeSpace http://www.typespace.org/

    I reject your reality and substitute it with my own!

  7. #7
    Join Date
    May 2003
    Posts
    550
    OK, I mostly understand it now.

    If you don't mind:
    | is the boolean operator for or, is that correct?

    Can you explain
    ((\w+\.){1,})\w{2,}
    I don't understand what that does... maybe I should just read the manual for preg_match()

  8. #8
    Join Date
    Jan 2003
    Location
    Texas
    Posts
    10,413
    Hey, it's not my RegExp, so don't go saying that I suck at them! I was just giving Brendandonhue a suggestion... I didn't even know if it worked or not. lol I anticipated you'd come in here and tell me that I did something wrong.

    I'm not the world's greatest PHP programmer, as can plainly be seen here, so don't expect my suggestions to be solutions all the time.


  9. #9
    Join Date
    May 2003
    Posts
    550
    Jona, I think he was just trying to recommend a better method. I didn't see any of the flaws in yours, but hey, I'm not the best PHPer either.

  10. #10
    Join Date
    Dec 2002
    Location
    High on life
    Posts
    10,104
    Sure:

    ((\w+\.){1,}) means to match one or more alphanumeric (or _) characters followed by a period, and to match that whole group one or more times, i.e. test. or test.test.test. will both validate. This is so that http://infinitypages.com and http://your.very.own.subdomain.infinitypages.com will both validate (as they are valid addresses).

    \w{2,}$ means $url must end with two or more alphanumeric or underscore characters. So, this will NOT allow for http://www.infinitypages.com/home.php. Wasn't sure if you wanted it to.
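
The breakdown above can be checked by looking at what the third capture group actually holds after a match. This is only a sketch: host_prefix() is a hypothetical helper, and the last sample (a hyphenated host name) is an extra case not discussed in the thread, included because \w does not match a hyphen.

```php
<?php
// Hypothetical helper: returns the text matched by ((\w+\.){1,}), or null when
// the whole pattern does not match.
function host_prefix($url)
{
    $pattern = '/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.){1,})\w{2,}$/i';
    // Group 1 is the scheme, group 2 the optional "s", group 3 every "label." group.
    return preg_match($pattern, $url, $m) === 1 ? $m[3] : null;
}

var_dump(host_prefix("http://infinitypages.com"));
// "infinitypages."
var_dump(host_prefix("http://your.very.own.subdomain.infinitypages.com"));
// "your.very.own.subdomain.infinitypages."
var_dump(host_prefix("http://www.infinitypages.com/home.php"));
// NULL: the "/" is not a word character or a period, so \w{2,}$ cannot reach the end
var_dump(host_prefix("http://my-site.com"));
// NULL: the hyphen is not matched by \w
```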

  11. #11
    Join Date
    Dec 2002
    Location
    High on life
    Posts
    10,104
    Originally posted by Jona
    ... you'd come in here and tell me that I did something wrong.
    Sorry. Didn't mean it that way. As brendandonhue said, I just wanted to provide a better alternative, as the other regexp wouldn't really do all that much to validate a URL.
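
To illustrate the point about the earlier regexp, here is a rough sketch that runs the POSIX-class pattern from the start of the thread through preg_match. It assumes "#" as the delimiter, and loose_match() is a hypothetical name used only for this example. Because the pattern has no ^ or $ anchors and accepts any alphabetic scheme, it finds a URL-like piece anywhere in the string instead of validating the whole string.

```php
<?php
// Hypothetical helper around the pattern from the earlier post.
function loose_match($text)
{
    return preg_match('#[[:alpha:]]+://[^<>[:space:]]+[[:alnum:]/]#', $text) === 1;
}

var_dump(loose_match("http://www.infinitypages.com"));       // bool(true)
var_dump(loose_match("gopher://ab"));                        // bool(true): any alphabetic scheme passes
var_dump(loose_match("junk before mailto://x!y and after")); // bool(true): no ^ or $ anchors
var_dump(loose_match("www.infinitypages.com"));              // bool(false): no scheme at all
```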

  12. #12
    Join Date
    Jan 2003
    Location
    Texas
    Posts
    10,413
    Yeah, I know. I wasn't upset or anything--besides, it's your job lol. But like I said, I knew you'd come over here and say something about it. No offense taken whatsoever.

  13. #13
    Join Date
    Dec 2002
    Location
    High on life
    Posts
    10,104
    This regexp is a touch tidier than my last, and allows for an optional trailing slash:

    PHP Code:
    if (preg_match("/^(http(s?):\\/\\/|ftp:\\/\\/{1})((\w+\.)+)\w{2,}(\/?)$/i", $url)) {
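
A quick sketch of what the optional trailing slash changes, assuming the pattern is wrapped in a function (accepts_trailing_slash() is a hypothetical name). The pattern is written in single quotes here, so \/ is used where the double-quoted original has \\/. Both forms give the regex engine the same expression.

```php
<?php
// Hypothetical helper around the tidier pattern from the post above.
function accepts_trailing_slash($url)
{
    return preg_match('/^(http(s?):\/\/|ftp:\/\/{1})((\w+\.)+)\w{2,}(\/?)$/i', $url) === 1;
}

var_dump(accepts_trailing_slash("http://www.ryanbrill.com"));            // bool(true)
var_dump(accepts_trailing_slash("http://www.ryanbrill.com/"));           // bool(true)
var_dump(accepts_trailing_slash("http://www.ryanbrill.com//"));          // bool(false): only one slash is allowed
var_dump(accepts_trailing_slash("http://www.ryanbrill.com/index.html")); // bool(false): paths are still rejected
```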

  14. #14
    Join Date
    May 2003
    Posts
    550
    Ok thanks, got it

  15. #15
    Join Date
    Nov 2002
    Location
    NY, USA
    Posts
    731
    I built a regular expression for URIs about the same time I made one for e-mails. This one comes from RFC2396 Uniform Resource Identifiers (URI): Generic Syntax. Because it is generic syntax, it does not test for any specific scheme, which is good and bad. Good: allows for the many valid schemes currently in existence, and others to be created later. Bad: a non-valid scheme will not be recognized.

    But anyway, it is attached. Since this one is far more complicated and lengthy than the e-mail, I've also attached the outlines for each part. It should help you to break it down, or build another one up if you choose to.

    I have not found any bugs, but if anyone else does please tell me.
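
The attached expression is not reproduced in this thread, so the sketch below is not it. For comparison, it uses the short parsing expression given in Appendix B of RFC 2396, which splits a URI reference into its generic parts and does not validate it. The "~" delimiter and the split_uri() helper name are assumptions made for this example.

```php
<?php
// Parsing expression from RFC 2396, Appendix B, with "~" as the PCRE delimiter.
// split_uri() is a hypothetical helper, not the regex attached to the post above.
function split_uri($uri)
{
    $pattern = '~^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?~';
    preg_match($pattern, $uri, $m);

    // Trailing groups that did not take part in the match are absent from $m.
    return array(
        'scheme'    => isset($m[2]) ? $m[2] : '',
        'authority' => isset($m[4]) ? $m[4] : '',
        'path'      => isset($m[5]) ? $m[5] : '',
        'query'     => isset($m[7]) ? $m[7] : '',
        'fragment'  => isset($m[9]) ? $m[9] : '',
    );
}

// Example URI taken from the RFC's own appendix.
print_r(split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related"));
// scheme "http", authority "www.ics.uci.edu", path "/pub/ietf/uri/", fragment "Related"

// Every part is optional, so almost any string "parses"; here it is all path.
print_r(split_uri("not a url at all"));
```

This shows the trade-off described above: the generic syntax accepts any scheme, so checking for a specific scheme such as http or ftp has to happen separately, for example by comparing the returned 'scheme' value.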
    for(split(//,'))*))91:+9.*4:1A1+9,1))2*:..)))2*:31.-1)4131)1))2*:3)"'))
    {for(ord){$i+=$_&7;grep(vec($s,$i++,1)=1,1..($_>>3)-4);}}print"$s\n";
