
Extract Domain From Url

Folks,

One look at Google or Stack Overflow shows that programmers have come up with a variety of ways to extract a domain from a URL. I'm spoilt for choice and need a foolproof one. How about we check out the code you use yourselves? As for me, like I said, I'm spoilt for choice and confused.

PHP

13 Comments

@ginerjm, Nov 19, 2021: Do a phpinfo() and LOOK at what you can deduce from that output.
@NogDog, Nov 19, 2021: If you are talking about an arbitrary URL from some input or other source: [parse_url()](https://www.php.net/parse_url)

If you mean the hostname where the script is currently running: [$_SERVER['SERVER_NAME']](https://www.php.net/manual/en/reserved.variables.server.php)
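
A minimal sketch of the first case, assuming only the host part is wanted. It relies on the `PHP_URL_HOST` component flag of `parse_url()`; the `host_from_url()` helper name is hypothetical and not part of PHP.

```php
<?php
// Hypothetical helper: return the lower-cased host of a URL, or null if it has none.
function host_from_url(string $url): ?string
{
    // With PHP_URL_HOST, parse_url() returns just the host as a string,
    // null when the URL has no host, or false when it is seriously malformed.
    $host = parse_url($url, PHP_URL_HOST);
    return is_string($host) ? strtolower($host) : null;
}

echo host_from_url('http://www.nogdog.com/index.php?search=dogs'), PHP_EOL; // www.nogdog.com
```

Note that a string without a scheme, such as `nogdog.com/index.php`, is treated as a path, so the helper returns null for it.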
@developer_web (author), Nov 19, 2021: @NogDog#1639628

I meant extracting the second level name 'nogdog.com', in our example, from the likes of:

http://www.nogdog.com/index.php?search=dogs

http://nogdog.com/index.php?category_dogs/1.html

http://dogs.nogdog.com

http://canines.dogs.nogdog.com
@developer_web (author), Nov 19, 2021: Whether or not the URL is valid, this code seems to extract the full host name, with every domain level:
[code=php]
<?php

$url = 'http://canine.dogs.nogdog.com/index.php?search=alsation';
$domain = parse_url($url);
echo $domain['host'];

?>
[/code]

Output:

**canine.dogs.nogdog.com**

But I need it to extract only:

**nogdog.com**
@ginerjm, Nov 19, 2021: A purely arbitrary output. How could any function accomplish that for you? Tomorrow you may want 'developer.com'.
@developer_web (author), Nov 19, 2021: Whether or not the URLs are valid, these snippets all seem to extract the full host name, with every domain level:

1
[code=php]
<?php

$url = 'http://canine.dogs.ginerjm.com/index.php?search=alsation';
$domain = parse_url($url);
echo $domain['host']; // canine.dogs.ginerjm.com

?>
[/code]

2
[code=php]
<?php

$url = 'http://canine.dogs.ginerjm.com/index.php';
$domain = parse_url($url);
echo $domain['host']; // canine.dogs.ginerjm.com

?>
[/code]

3
[code=php]
<?php

$url = 'http://www.ginerjm.com/index.php?search=alsation';
$domain = parse_url($url);
echo $domain['host']; // www.ginerjm.com

?>
[/code]

I need it to extract only the 2nd-level domain:

**ginerjm.com**
@NogDog, Nov 19, 2021: And what's going to happen when it's www.example.co.uk -- I'm guessing you don't want "co.uk"? Why does it matter if any domain has the www or other sub-domain (or sub-sub-domain)? In many cases the sub-domain is important, and going to foo.example.com will be totally different than going to bar.example.com (or going to foo.bar.example.com).

If you really, really need to do this for some business reason that I cannot conceive of, then your friends will be explode(), array_slice(), and implode(). If you're too lazy to RTFM and figure it out, I'll do it for you for US$50 paid in advance.
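
A rough sketch of how those three functions could fit together, assuming the goal is simply "keep the last two labels of the host". The `last_labels()` helper name is hypothetical, and as noted above this naive approach gives the wrong answer for hosts under suffixes such as co.uk.

```php
<?php
// Hypothetical helper: keep only the last $labels dot-separated labels of a host.
function last_labels(string $host, int $labels = 2): string
{
    $parts = explode('.', $host);            // 'www.nogdog.com' -> ['www', 'nogdog', 'com']
    $tail  = array_slice($parts, -$labels);  // the last $labels entries (or fewer if the host is short)
    return implode('.', $tail);              // glue them back together with dots
}

$host = parse_url('http://canines.dogs.nogdog.com', PHP_URL_HOST);
echo last_labels($host), PHP_EOL;                // nogdog.com
echo last_labels('www.example.co.uk'), PHP_EOL;  // co.uk (the problem case)
```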
@ginerjm, Nov 19, 2021: RTFM. My favorite acronym.
@developer_web (author), Nov 23, 2021: @NogDog#1639640

Hi,

Just logged on after a few days.

I wrote this script and was delighted:
[code=php]
<?php

$url = 'http://www.nogdog.com/?search=cars';
$parse_url = parse_url($url);
$domain = $parse_url['host'];
$level_domains = explode('.', $domain);
print_r($level_domains);
$level_domains_count = count($level_domains);
$array_pos = $level_domains_count - 2;
echo $level_domains[$array_pos];

?>
[/code]


It worked as it echoed "nogdog".

Then I read your message above. I had forgotten about the ".co.uk".

Now my code is echoing "co".
[code=php]
<?php

$url = 'http://www.nogdog.co.uk/?search=cars';
$parse_url = parse_url($url);
$domain = $parse_url['host'];
$level_domains = explode('.', $domain);
print_r($level_domains);
$level_domains_count = count($level_domains);
$array_pos = $level_domains_count - 2;
echo $level_domains[$array_pos];

?>
[/code]


Anyway, I have a hunch how to fix this. But tell me, is "nogdog" not the 2nd-level domain in nogdog.com? Usually, when we say we register a domain name (.com, .net, .org, etc.), we are referring to the 2nd level. The first level is the TLD.

Now, when you register nogdog.co.uk, which is the TLD and which is the 2nd-level domain? I mean, what domain are you registering here? Because if you consider yourself to be registering "nogdog" as a domain, then here it is the 3rd level. And we never register a 3rd level.

We never register:

www.nogdog.com

We register:

nogdog

or:

nogdog.com

The "www" in "www.nogdog.com" is counted as a subdomain.

So which part is counted as a subdomain in this one, then?

www.nogdog.co.uk

Again the "www"? If so, then in this case the subdomains start from level 4, while with .com, .net and .org the "www" subdomain, or subdomains altogether, start from level 3.

Confusing.

Don't tell me that "nogdog.co.uk" is counted as the domain or 2nd level, while "co.uk" is counted as one unit, the TLD, and "uk" alone is not counted as the TLD here.

It seems ".co.uk" is an issue here. Do you know of any other problematic domains like it?
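
One possible way to handle this, sketched under assumptions: suffixes such as co.uk, under which names are registered, are catalogued in the Public Suffix List (publicsuffix.org). The snippet below does not load that list. It uses a tiny hard-coded sample, `SAMPLE_SUFFIXES`, purely for illustration, and the `registrable_domain()` helper name is hypothetical.

```php
<?php
// Illustrative only: a tiny, incomplete sample of multi-label suffixes.
// A real script would load the full Public Suffix List instead.
const SAMPLE_SUFFIXES = ['co.uk', 'org.uk', 'com.au', 'co.jp'];

// Hypothetical helper: return the registrable domain (suffix plus one label),
// or null when the host is nothing more than a suffix.
function registrable_domain(string $host): ?string
{
    $parts = explode('.', strtolower(rtrim($host, '.')));

    // Treat the last two labels as the suffix if they are in the sample list,
    // otherwise fall back to a single-label TLD such as "com".
    $last_two   = implode('.', array_slice($parts, -2));
    $suffix_len = in_array($last_two, SAMPLE_SUFFIXES, true) ? 2 : 1;

    if (count($parts) <= $suffix_len) {
        return null; // bare suffix, nothing registered under it
    }
    return implode('.', array_slice($parts, -($suffix_len + 1)));
}

echo registrable_domain('www.nogdog.com'), PHP_EOL;           // nogdog.com
echo registrable_domain('www.nogdog.co.uk'), PHP_EOL;         // nogdog.co.uk
echo registrable_domain('canines.dogs.nogdog.com'), PHP_EOL;  // nogdog.com
```

With this approach the "level" at which subdomains start no longer matters: everything to the left of the registrable domain is a subdomain.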
@developer_web (author), Nov 23, 2021: @ginerjm#1639642

Then you should keep reciting "RTFM" constantly, speeding up each time, bobbing your head back and forth like they do at the Wailing Wall in Palestine. Just make sure you don't bang your head and bash your brains out, or your brain would be of no use to us in this forum. We could do with a little PHP getting spat out by your brain now and then. ;)
@developer_web (author), Nov 23, 2021: @NogDog

I forgot to answer your question.

You asked why I want to extract domain names from urls.

It's because on my search engine, I don't want you submitting anybody else's URL, such as a competitor's, with inappropriate keywords as sabotage.

I only want you submitting your own URLs from your own domain.

That's why my form will ask you for an email address that belongs to your domain name.

So when you submit your URL, my link-submission form will extract the domain name from it and then expect you to submit an email address at that domain. No Gmail, Hotmail, etc.

Then my form will email you a link-submission confirmation link. Only if you click it will your link be indexed. If you did not submit it (but your competition did, as sabotage, with false details to get you banned), then you won't click the link, and my index won't list your URL with the false details (anchor texts, keywords and link descriptions) they submitted.
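
A minimal sketch of the confirmation-link step, assuming a random one-time token is stored with the pending submission and compared when the link is clicked. All three helper names and the `searchengine.example` URL are hypothetical; storage and the actual emailing are left out.

```php
<?php
// Hypothetical helper: create a one-time confirmation token for a submission.
function make_confirmation_token(): string
{
    return bin2hex(random_bytes(16)); // 32 hex characters from a secure random source
}

// Hypothetical helper: build the link that gets emailed to the submitter.
function confirmation_link(string $base_url, string $token): string
{
    return $base_url . '?token=' . urlencode($token);
}

// Hypothetical helper: compare the stored token with the one from the clicked link.
function token_matches(string $stored, string $received): bool
{
    return hash_equals($stored, $received); // timing-safe string comparison
}

$token = make_confirmation_token();
echo confirmation_link('https://searchengine.example/confirm.php', $token), PHP_EOL;
```

The submission would be indexed only after `token_matches()` returns true for the token in the clicked link.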

I'm not getting into building a spider yet. That can come later.

First, people just submit their links manually.

Guessing you're from the North of England.

Anyway, I'm reading this, which you might find interesting, about ".uk":

https://www.techradar.com/uk/news/why-your-business-needs-a-uk-domain

From that link, I got an answer to one of the questions I asked you previously. It seems ".co.uk" was treated as a TLD all this time and not just the ".uk" part. But now ".uk" is available as a TLD all on its own, as it should have been 20 or 30 years back. I wonder if ".co.uk" will still be counted as a TLD or not. I guess it would.

I do not know what keywords to google to find more confusing TLDs like it. I need to feed the list to my script.
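@NogDog, Nov 23, 2021: I'd use a regular expression to make sure whatever is at the top of the domain matches the user's email domain:
[code=php]
// Take the part after the last "@"; array_pop() needs a variable, not a function result.
$email_parts = explode('@', $user_email);
$user_domain = array_pop($email_parts);
// Match the host itself or any sub-domain of it; "\." is a literal dot.
$regex = '/(^|\.)' . preg_quote($user_domain, '/') . '$/i';
$submitted_domain = parse_url($submitted_url, PHP_URL_HOST);
if (!is_string($submitted_domain) || !preg_match($regex, $submitted_domain)) {
    // not allowed
}
[/code]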

No, I'm not going to explain it all...I actually have other things to spend hours on, including my job. See:

- https://www.php.net/preg_match
- https://www.php.net/manual/en/reference.pcre.pattern.syntax.php
- https://www.php.net/preg_quote
- https://www.php.net/explode
- https://www.php.net/array_pop
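
For anyone who wants to try the idea, here is a hedged sketch of the same check wrapped in a function with a few sample calls. The function name `url_matches_email_domain()` is hypothetical, and it assumes the dot before the domain is meant as a literal dot (escaped as `\.`), so that look-alike hosts are rejected.

```php
<?php
// Hypothetical helper: true when the URL's host is the email's domain or a sub-domain of it.
function url_matches_email_domain(string $submitted_url, string $user_email): bool
{
    $at   = strrpos($user_email, '@');                  // position of the last "@"
    $host = parse_url($submitted_url, PHP_URL_HOST);    // null or false when there is no usable host
    if ($at === false || !is_string($host)) {
        return false;
    }

    $user_domain = (string) substr($user_email, $at + 1);
    if ($user_domain === '') {
        return false; // nothing after the "@"
    }

    // The host must equal the domain, or end with "." followed by the domain.
    $regex = '/(^|\.)' . preg_quote($user_domain, '/') . '$/i';
    return preg_match($regex, $host) === 1;
}

var_dump(url_matches_email_domain('http://www.nogdog.com/?search=cars', 'me@nogdog.com')); // bool(true)
var_dump(url_matches_email_domain('http://evilnogdog.com/', 'me@nogdog.com'));             // bool(false)
var_dump(url_matches_email_domain('http://www.nogdog.com/', 'me@gmail.com'));              // bool(false)
```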

@developer_web (author), Dec 14, 2021: @NogDog#1639782

Thanks!

I missed your post for 3 weeks!