perl Robot


amazing_andr3
05-16-2005, 04:47 PM
Can someone point me to a resource that can get me started in building a robot to access and submit information to a particular site?

And the second question is why can't PHP build a robot, how come I have to learn a new language to do this?

tnx

CyCo
05-16-2005, 04:50 PM
http://search.cpan.org/dist/libwww-perl/lib/LWP/RobotUA.pm

amazing_andr3
05-16-2005, 05:04 PM
Thanks that's useful but I think I also need a tutorial that would explain things a bit more.
Basically what I need is a robot that pretends it's a browser, sends a (hard coded) POST request, stores a cookie and uses it in further POSTs.

sammy2222
05-24-2005, 05:07 PM
And the second question is why can't PHP build a robot, how come I have to learn a new language to do this?
What do you mean PHP can't do this? It can, and much more easily than Perl, I think. Anyway, look at this: it pretends to be a browser called "msIE6 testing".

function gethtml($url) {
    // Split the URL into host, path and query parts.
    $info = parse_url($url);
    $fp = @fsockopen($info['host'], 80, $errno, $errstr, 10);
    if (!$fp) {
        print "Error: that url you entered seems not to exist.\n";
        return '';
    }
    $path = empty($info['path']) ? '/' : $info['path'];
    if (!empty($info['query'])) {
        // parse_url() drops the "?", so put it back.
        $path .= '?' . $info['query'];
    }
    // HTTP/1.0 so that the server should not reply with a chunked body.
    $out  = "GET " . $path . " HTTP/1.0\r\n";
    $out .= "Host: " . $info['host'] . "\r\n";
    $out .= "Connection: close\r\n";
    $out .= "User-Agent: msIE6 testing\r\n\r\n";
    fwrite($fp, $out);
    $html = '';
    while (!feof($fp)) {
        $html .= fread($fp, 8192);
    }
    fclose($fp);
    // The headers end at the first blank line; keep only the body.
    $pieces = explode("\r\n\r\n", $html, 2);
    return isset($pieces[1]) ? $pieces[1] : '';
}
$html = gethtml('http://www.testsite.com');
print $html;

That code is untested. If you search Google, there is some code I have seen that follows the robot rules and so on.
Anyway, good luck with what you're doing.
sam

amazing_andr3
05-28-2005, 01:47 PM
Yes but how will PHP save the cookie and send it again in a subsequent request?

amazing_andr3
06-21-2005, 04:49 PM
I finally got it. A cookie is just another text line in the HTTP headers (Set-Cookie in the response, Cookie in the request), so once you figure out the protocol, it's trivial to receive cookies, store them, and send them back.
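A minimal sketch of that idea, not the exact code used here: it pulls the cookies out of the raw response headers and writes them into the next POST request. The function names extract_cookies and build_post, the sample headers, and the /login.php path are all made up for illustration, and cookie attributes such as path or expires are simply ignored.

```php
<?php
// Hypothetical helpers; names and sample data are illustrative only.

// Collect name=value pairs from every Set-Cookie line in the raw response headers.
function extract_cookies($rawHeaders) {
    $cookies = array();
    foreach (explode("\r\n", $rawHeaders) as $line) {
        // Header names are case-insensitive, hence stripos().
        if (stripos($line, 'Set-Cookie:') === 0) {
            $value = trim(substr($line, strlen('Set-Cookie:')));
            // Keep only the part before the first ';' (attributes are dropped).
            $pair  = explode(';', $value, 2);
            $parts = explode('=', trim($pair[0]), 2);
            if (count($parts) === 2) {
                $cookies[$parts[0]] = $parts[1];
            }
        }
    }
    return $cookies;
}

// Build a raw POST request that sends the stored cookies back to the server.
function build_post($host, $path, $fields, $cookies) {
    $body = http_build_query($fields);
    $out  = "POST $path HTTP/1.0\r\n";
    $out .= "Host: $host\r\n";
    $out .= "Content-Type: application/x-www-form-urlencoded\r\n";
    $out .= "Content-Length: " . strlen($body) . "\r\n";
    if ($cookies) {
        $pairs = array();
        foreach ($cookies as $name => $value) {
            $pairs[] = "$name=$value";
        }
        $out .= "Cookie: " . implode('; ', $pairs) . "\r\n";
    }
    $out .= "Connection: close\r\n\r\n";
    return $out . $body;
}

// Example with made-up headers: remember the cookie, then reuse it in a POST.
$headers = "HTTP/1.0 200 OK\r\nSet-Cookie: sid=abc123; path=/\r\nContent-Type: text/html";
$cookies = extract_cookies($headers);
print build_post('www.testsite.com', '/login.php', array('user' => 'andr3'), $cookies);
```

The string returned by build_post can be written to an fsockopen connection with fwrite, the same way the GET request is sent in gethtml above.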

I also found out how to 'power up' my so-called robot using Cron jobs.

Thanks sammy.

I guess I figured Perl was the only option because I saw some function names with 'cookie' in them, but nah... PHP works just fine for me.