getstore error.
buggy
08-25-2004, 11:36 AM
Firstly, I am new to Perl. I am trying to get web pages using getstore(). I have taken a page, extracted the links from it and am trying to get the individual pages. I have done this on one site and all is perfect. However, with the second site I am not getting the page back even though it exists; it seems to return a default page instead. I have broken the problem down, and here is the code:
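use strict;
use LWP::Simple;
my @url;
$url[0] = 'http://www.jobs.ie/detail.php?nx_t=s&_c=21&_r=1&csp=1&id=80474';
# getstore() returns the HTTP status code, which is always defined,
# so check it with is_success() rather than defined()
my $ret1 = getstore($url[0], 'test.html');
unless (is_success($ret1)) {
# die "could not locate page test\n";
}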
The following page is stored:
#http://www.jobs.ie/?NXS=c69bf023d9987bcf7a4e38d08d0c9234
Any ideas would be great.
silent11
08-25-2004, 11:43 AM
Your code is working fine for me.
Are you attempting the getstore() from the same computer? Perhaps one is behind a firewall.
Now, the link you gave is different from $url[0]. Which one are you having trouble getstore()ing?
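If it helps to see which URL actually ends up being stored, here is a rough sketch using LWP::UserAgent instead of LWP::Simple. The helper names final_url and fetch_and_report are made up for illustration, and this assumes the default page shows up because the server redirects the request; I have not confirmed that against the site.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Return the URL the response finally came from (after any redirects).
# final_url is a hypothetical helper name, not part of LWP.
sub final_url {
    my ($res) = @_;
    return $res->request->uri->as_string;
}

# Fetch $url, save the body to $file, and report where we ended up.
sub fetch_and_report {
    my ($url, $file) = @_;
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->get($url, ':content_file' => $file);
    die "request failed: ", $res->status_line, "\n" unless $res->is_success;
    printf "asked for: %s\n", $url;
    printf "got:       %s\n", final_url($res);
    printf "redirects: %d\n", scalar $res->redirects;
    return $res;
}

# Only touches the network when a URL and a file name are given.
fetch_and_report(@ARGV) if @ARGV == 2;
```

Run it with the URL and an output file name as arguments. If "asked for" and "got" differ, the server sent you somewhere else before the page was saved.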
buggy
08-26-2004, 03:52 AM
I want to get the page 'http://www.jobs.ie/detail.php?nx_t=s&_c=21&_r=1&csp=1&id=80474'. If I just run the code as it stands, the page
http://www.jobs.ie/NXS=c69bf023d99...a4e38d08d0c9234
is stored.
However, and this is the weird part, if I first go to the site www.jobs.ie, search and pull up the job on the page 'http://www.jobs.ie/detail.php?nx_t=s&_c=21&_r=1&csp=1&id=80474', then close my browser and run the script above, everything works perfectly. By that I mean the correct page 'http://www.jobs.ie/detail.php?nx_t=s&_c=21&_r=1&csp=1&id=80474' is returned.
I can view any job, take its URL, paste it into my code and getstore() works fine, but if I don't view the page in my browser first I get the default page
http://www.jobs.ie/NXS=c69bf023d99...a4e38d08d0c9234
all the time.
buggy
08-26-2004, 05:55 AM
Found it: NXS=c69bf023d99...a4e38d08d0c9234 is the session ID. I need to have a valid session ID in my URL string.
http://www.jobs.ie/detail.php?NXS=240e17a6a8046c8735520c003f01c760&_t=s&_c=21&_r=1&csp=1&ID=80185
works fine. I can add the session each time and all should be fine, I think.
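Adding the session to each URL could look roughly like the sketch below. It uses the URI module, which LWP depends on. The helper name with_session is made up, and putting NXS first simply copies the working URL above; whether the site cares about parameter order is not something I have checked.

```perl
use strict;
use warnings;
use URI;

# Return a copy of $url with the NXS session parameter set to $session_id.
# The other parameters keep their names, case and order.
# with_session is a hypothetical helper name.
sub with_session {
    my ($url, $session_id) = @_;
    my $uri   = URI->new($url);
    my @pairs = $uri->query_form;    # flat list: key, value, key, value, ...
    my @kept;
    while (my ($k, $v) = splice(@pairs, 0, 2)) {
        push @kept, $k, $v unless $k eq 'NXS';    # drop any stale session
    }
    $uri->query_form([ NXS => $session_id, @kept ]);
    return $uri->as_string;
}
```

Each extracted link would go through with_session() before being handed to getstore(). The session ID itself still has to come from somewhere, such as a URL copied from the browser.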
buggy
08-26-2004, 06:43 AM
I have now put the session ID into the string and it is still not returning the correct page on most occasions. Any ideas?
silent11
08-26-2004, 07:53 AM
I understand that the module WWW-Mechanize works well with sessions. Give it a try.
http://search.cpan.org/~petdance/WWW-Mechanize-1.02/lib/WWW/Mechanize.pm
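A minimal sketch of what that might look like, untested against jobs.ie. The sub name fetch_with_session is made up, save_content() needs a reasonably recent WWW::Mechanize, and the whole thing assumes the site starts a session when the front page is visited and tracks it with a cookie. If the session only travels in the URL (the NXS parameter), you would probably need to follow links from a fetched page with follow_link() instead of building the detail URL yourself.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Visit the front page first so the site can start a session, then fetch
# the detail page with the same agent and save it to $file.
# fetch_with_session is a hypothetical helper name.
sub fetch_with_session {
    my ($front_url, $detail_url, $file) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );    # die on HTTP errors
    $mech->get($front_url);     # any session cookie lands in the cookie jar
    $mech->get($detail_url);
    $mech->save_content($file);
    return $mech->uri;          # URL actually fetched, after redirects
}

# Only touches the network when all three arguments are given.
fetch_with_session(@ARGV) if @ARGV == 3;
```

Run it with the front page URL, the detail page URL and an output file name. The returned URI tells you whether you landed on the detail page or were bounced back to the default one.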
buggy
08-26-2004, 08:15 AM
I have got to the bottom of the problem. Yes, I needed to include the session ID, but I had also made a silly error regarding case in the URL. It is now working fine, thanks for your help.