Dead Url Search!


reverentx
08-06-2005, 01:46 AM
Is it possible to make a small script in PHP that checks whether a page
no longer exists?
It should also check for a 404 or any other redirect,
and if it cannot find the page or the page is redirected, it returns with $error=redirect?

Sheldon
08-06-2005, 02:13 AM
Save a text file as error.txt with the following code, then upload it to your root dir and rename it to ".htaccess" (don't forget the leading dot).

ErrorDocument 404 /root_folder/not_found.html


You can point it at any type of page or address; just change the part after the status code (/location/file).

These are other examples of directives you can add to the same file:
ErrorDocument 500 /cgi-bin/crash-recover
ErrorDocument 500 "Sorry, our script crashed. Oh dear"
ErrorDocument 500 http://xxx/
ErrorDocument 404 /Lame_excuses/not_found.html
ErrorDocument 401 /Subscription/how_to_subscribe.html


Hope this helps

Sheldon

reverentx
08-06-2005, 02:26 AM
I want to check other domains, not my own!
For example, whether they do not exist, like: http://www.domain.do.not.exist.com
or whether they return a 404 or a redirect.
(People are adding their pages to our database, and I would like to check automatically whether each domain exists instead of checking them manually.
I have 200 pages for approval every day.)
I would like to make a small script that checks whether the URLs exist or redirect, so I can disable them automatically. :D

ShrineDesigns
08-06-2005, 02:43 AM
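I use this for my site:

function lookup($uri)
{
    $u = parse_url($uri);

    // a host is required; parse_url() returns false on malformed URLs
    if(!is_array($u) || !isset($u['host']))
    {
        return false;
    }
    if(!isset($u['path']))
    {
        $u['path'] = '/';
    }
    if(isset($u['query']))
    {
        $u['path'] .= '?' . $u['query'];
    }
    if(!isset($u['port']))
    {
        // getservbyname() returns false for an unknown service, so fall back to 80
        $port = getservbyname('http', 'tcp');
        $u['port'] = $port ? $port : 80;
    }
    $addr = gethostbyname($u['host']);
    $sp = @socket_create(AF_INET, SOCK_STREAM, SOL_TCP);

    if(!$sp)
    {
        return false;
    }
    // socket_set_timeout() only works on streams; sockets need socket_set_option()
    $timeout = array('sec' => 1, 'usec' => 0);
    socket_set_option($sp, SOL_SOCKET, SO_RCVTIMEO, $timeout);
    socket_set_option($sp, SOL_SOCKET, SO_SNDTIMEO, $timeout);
    $result = @socket_connect($sp, $addr, $u['port']);

    if(!$result)
    {
        @socket_close($sp);
        return false;
    }
    $in = "HEAD {$u['path']} HTTP/1.1\r\nHost: {$u['host']}\r\nConnection: Close\r\n\r\n";
    @socket_write($sp, $in, strlen($in));
    $out = @socket_read($sp, 1024);
    @socket_close($sp);

    if(!$out)
    {
        return false;
    }
    // the status line looks like "HTTP/1.1 404 Not Found"; the limit of 3 keeps "Not Found" whole
    $out = explode("\r\n", $out, 2);
    $out = explode(' ', $out[0], 3);

    if(count($out) < 2)
    {
        return false;
    }
    return array($out[1], isset($out[2]) ? $out[2] : '');
}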
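
The first line of the response that the function reads has the form "HTTP/1.1 404 Not Found". Below is a minimal sketch of parsing that line on its own, so the parsing step can be tried without a network connection. The name parse_status_line is a hypothetical helper made up for this illustration, not part of the function above.

```php
<?php
// Hypothetical helper (not from the thread): split an HTTP status line
// such as "HTTP/1.1 404 Not Found" into array(code, message).
function parse_status_line($line)
{
    // A limit of 3 keeps multi-word reason phrases like "Not Found" intact
    $parts = explode(' ', trim($line), 3);

    if (count($parts) < 2 || strpos($parts[0], 'HTTP/') !== 0 || !ctype_digit($parts[1])) {
        return false; // not a recognizable status line
    }
    return array($parts[1], isset($parts[2]) ? $parts[2] : '');
}

print_r(parse_status_line("HTTP/1.1 404 Not Found"));
```

Returning false for anything that does not look like a status line lets the caller treat a garbled reply the same way as a failed connection.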

reverentx
08-06-2005, 02:58 AM
I understand some of the script, but not all of it.
'domain' should be replaced with $url, correct?
'port': I am not sure what that is doing.
getservbyname('www', 'tcp'); is there any part of that one that needs to be changed?
I am not strong with arrays. How do I read that variable when it comes back?

ShrineDesigns
08-06-2005, 05:55 AM
You don't need to modify the function, it works. The returned value is an array: index 0 is the status code and index 1 is the status message.
Example:
<?php
// ...

$u = 'http://www.google.com';
print_r(lookup($u)); // should print '200', 'OK'

$u = 'http://www.abc.xyz/path/to/nowhere.htm';
print_r(lookup($u)); // should return '404', 'Not Found'
?>

reverentx
08-06-2005, 08:18 AM
if (lookup($s_bar_url)) == "'200','ok'") {
is this correct?

reverentx
08-06-2005, 09:38 AM
Great, it's working. Thanks!

ShrineDesigns
08-06-2005, 06:38 PM
It would actually be more like this:
<?php
// ...
$st = lookup('http://www.google.com/');

if(!empty($st) && $st[0] < 400)
{
    echo 'url exists';
}
else
{
    echo 'url does not exist or could not connect';
}
?>
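
Note that a check of the form $st[0] < 400 counts redirects (3xx codes) as existing, while the original question wanted redirects flagged as well. A minimal sketch of telling the cases apart follows; classify_status and its labels are hypothetical names made up for this illustration, and it assumes the array(code, message) or false return value described in this thread.

```php
<?php
// Hypothetical helper: turn the result of lookup() into a label.
// $st is false on connection failure, or array(code, message).
function classify_status($st)
{
    if (empty($st) || !isset($st[0])) {
        return 'unreachable';
    }
    $code = (int) $st[0];

    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    if ($code >= 300 && $code < 400) {
        return 'redirect';
    }
    return 'error'; // 1xx, 4xx, 5xx and anything unexpected
}

echo classify_status(array('301', 'Moved Permanently')); // prints "redirect"
```

With labels like these, setting something like $error = 'redirect' becomes a simple comparison on the returned string.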

reverentx
08-07-2005, 02:27 AM
From the function: return array($out[1], $out[2]);
Your last code: if(!empty($st) && $st[0] < 400)
I am a little confused by these two variable names. Which one should I use?
My final code no. 1:
lookup($s_bar_url);
if(!empty($s_bar_url) && $s_bar_url[0] < 400) {

Second check code:
$tsturl = lookup($newurl);
if(empty($tsturl) && $tsturl[0] > 399) {
I haven't touched anything in the function,
and it seems that my code 1 doesn't work!
I have not checked code 2 yet,
but are they correct?
ShrineDesigns
08-31-2005, 05:09 PM
Oops, I posted it with an error in it. I updated it and it works correctly now.
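
For the approval use case from earlier in the thread, two details matter: the return value of lookup() has to be stored in a variable before it is tested, and the "disable" test needs OR rather than AND, since a failed lookup has no status code to compare. Below is a rough sketch under those assumptions. filter_urls, the stub checker, and the example URLs are all made up for illustration; the checker is passed in as a parameter only so the logic can be tried without network access.

```php
<?php
// Hypothetical helper: split submitted URLs into approved and disabled.
// $check is any callable that behaves like lookup(): it returns false on
// failure or array(code, message).
function filter_urls(array $urls, $check)
{
    $result = array('approved' => array(), 'disabled' => array());

    foreach ($urls as $url) {
        $st = call_user_func($check, $url); // keep the return value

        // Disable when the lookup failed OR the code is not 2xx
        if (empty($st) || (int) $st[0] < 200 || (int) $st[0] > 299) {
            $result['disabled'][] = $url;
        } else {
            $result['approved'][] = $url;
        }
    }
    return $result;
}

// Stub standing in for lookup(), with made-up example URLs
$fake = function ($url) {
    $known = array(
        'http://example.com/'    => array('200', 'OK'),
        'http://example.com/old' => array('301', 'Moved Permanently'),
    );
    return isset($known[$url]) ? $known[$url] : false;
};

print_r(filter_urls(
    array('http://example.com/', 'http://example.com/old', 'http://example.invalid/'),
    $fake
));
```

In real use the string 'lookup' could be passed as $check in place of the stub, assuming the function from this thread is defined.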