Check if something is in an array


Sephiroth32
03-04-2003, 02:14 PM
I am trying to create a script to check a site's source code for a certain URL to see if they are linking to you.

Unfortunately, I do not know the code to check if an array contains a certain URL.

Can someone help please? Thanks in advance

Charles
03-04-2003, 02:30 PM
print "It's in there!\n" if (grep {m|^http://www\.fee\.org/?$|} qw(http://www.fee.org/ http://www.fie.org/ http://www.foe.org/ http://www.fum.org/));

Sephiroth32
03-05-2003, 05:23 PM
to be honest that code kind of confuses me.

Where in the if statement does it say what url to check in?

Sephiroth32
03-13-2003, 02:19 PM
bump :\

Nedals
03-13-2003, 02:43 PM
my @urllist = qw(http://www.fee.org/ http://www.fie.org/ http://www.foe.org/ http://www.fum.org/);
# when using qw, quotes are NOT needed and spaces are used to delimit

# equivalent, written as an ordinary quoted list (use one declaration or the other, not both):
# my @urllist = ('http://www.fee.org/','http://www.fie.org/','http://www.foe.org/','http://www.fum.org/');
my $urltofind = 'http://www.fee.org/';

# written another way...
if (grep {m|^\Q$urltofind\E?$|} @urllist) { print "It's in there!\n" }
# grep returns the elements of @urllist that match the pattern
# \Q...\E escapes the dots in $urltofind; the ? makes its last character (the trailing slash) optional
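
To make the role of each piece explicit, here is a minimal sketch of the same list check written as a small subroutine. The name url_in_list is made up for illustration and is not from this thread. It compares with eq after removing one optional trailing slash, instead of building a regex, so nothing needs escaping.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns the number of elements
# in @list equal to $wanted, treating "http://host" and "http://host/" alike.
sub url_in_list {
    my ($wanted, @list) = @_;
    (my $target = $wanted) =~ s{/\z}{};      # drop one trailing slash, if any
    return scalar grep {
        (my $candidate = $_) =~ s{/\z}{};    # work on a copy of each element
        $candidate eq $target;
    } @list;
}

my @urllist = qw(http://www.fee.org/ http://www.fie.org/ http://www.foe.org/ http://www.fum.org/);

print "It's in there!\n" if url_in_list('http://www.fee.org', @urllist);
print "Not in there\n"   unless url_in_list('http://www.fantasysquare.com/', @urllist);
```

The first argument is the URL to look for and everything after it is the list to look in, which answers the question of where the code says what to check.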

Hope that helps!

ps: what does bump :\ mean?

Sephiroth32
03-13-2003, 04:05 PM
thanks a lot. So if I wanted to scan www.test.com for www.url.com I would put:

my @urllist = ('http://www.test.com/');
my $urltofind = 'http://www.url.com/';
if (grep {m|^$urltofind?$|} @urllist) { print "It's in there!\n" } $urltofind


EDIT:

I tried at first:

my @urllist = ('http://www.unitedff.com/');
my $urltofind = 'http://www.fantasysquare.com/';
if (grep {m|^$urltofind?$|} @urllist) { print "It's in there!\n" }
else { print "check failed\n"; } $urltofind

and it printed "check failed" even though fantasysquare.com is in the source code of unitedff.com. Then I tried:

my $urlist = "http://www.unitedff.com/';
my $urltofind = "http://www.fantasysquare.com/';
if (grep {m|^$urltofind?$|} $urllist) { print "It's in there!\n"; }
else { print "lol\n"; }

and I got an internal server error

Nedals
03-13-2003, 05:45 PM
No! No! No! You must be new at this! :)

What this does is to search a list of URL's (@urllist) to see if a specific URL ($urltofind) is in that list.

So if I wanted to scan www.test.com for www.url.com
What does this mean??
'www.url.com' IS NOT in a list that ONLY contains 'www.test.com' , obviously!!

Maybe a silly question, but do you understand lists and variables?? :confused:

Sephiroth32
03-13-2003, 06:07 PM
lol yeah to be honest I didn't really read the code too well :\

*slaps forehead*

Is there any way to open up a website's source code? I suppose I could open up 'view-source:url' but that only works in IE

Nedals
03-13-2003, 06:15 PM
Sure!
right-click and view source OR on the menu bar VIEW:SOURCE

BUT...
If you're looking for Perl code, you're SOL. That's on the server and cannot be viewed.

Sephiroth32
03-13-2003, 06:25 PM
damn..

lol I know how to view the source manually :P but there is no way to check the source? oh well. I have seen link checking scripts though they must do it somehow..

jeffmott
03-14-2003, 08:21 AM
use LWP::Simple;
my $source = get('http://www.w3.org/');
if ($source =~ m|http://(?:www\.)?yoursite\.com/?|) {
    print 'Yup';
}
else {
    print 'Nope';
}
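
One thing worth knowing about the code above: get from LWP::Simple returns undef when the request fails, and a match against undef finds nothing, so a failed fetch looks the same as a missing link. Below is a minimal sketch that keeps the two cases apart. The name check_source is invented here, and the page source is passed in as a plain string so the logic can be tried without a network connection.

```perl
use strict;
use warnings;

# Hypothetical helper: classify a page source as returned by LWP::Simple's get().
sub check_source {
    my ($source) = @_;
    return 'Fetch failed' unless defined $source;   # get() returns undef on failure
    return $source =~ m|http://(?:www\.)?yoursite\.com/?| ? 'Yup' : 'Nope';
}

# Sample strings stand in for fetched pages.
print check_source('<a href="http://www.yoursite.com/">home</a>'), "\n";
print check_source('<p>no links here</p>'), "\n";
print check_source(undef), "\n";
```

In the real script the argument would be the return value of get('http://www.w3.org/').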

Sephiroth32
03-14-2003, 02:16 PM
Thanks a lot, you all have been a huge help

also this made me realize how little I really know about CGI. I know the basics and everything but are there some online tutorials I can go to to learn more?

Nedals
03-14-2003, 02:35 PM
this made me realize how little I really know about CGI
CGI is the interface between the browser and the server (kind of). The programming language is Perl! :)

Sephiroth32
03-14-2003, 02:58 PM
:o arg

edit: ugh not again. I tried putting jeffmott's code in the script but replace the first url with http://www.fantasysquare.com and http://www.unitedff.com and it just prints nothing :\

Nedals
03-14-2003, 03:56 PM
OK, Jeff, I have to ask!!! :)

use LWP::Simple;
my $source = get('http://www.w3.org/'); # get(....) ???
if ($source =~ m|http://(?:www\.)?yoursite\.com/?|) { print 'Yup'; }
else { print 'Nope'; }
What does this do?

I think what Sephiroth32 is asking for is some technique to download someone else's Perl code off their server, which, as far as I know, is impossible!

Sephiroth32
03-14-2003, 04:03 PM
no no! lol

I want to download the source code off a webpage and scan it for a url

I realize you can use view-source but I need to create a script to scan multiple source codes at once for my link to make sure sites are linking back to me

Nedals
03-14-2003, 06:40 PM
My mistake! :o
So, within your Perl script you want to get the HTML code from another site.
Here's an educated guess re. Jeff's code

This returns the HTML code

my $source = get('http://www.w3.org/');

Now you need to search it for your URL

if ($source =~ m|http://www\.domain\.com|) { print "It's there"; }
else { print "It's not there"; }

Try it this way. Do you know Perl? The \. is meant to be there

For this to work, the complete URL must exist in the HTML code. Jeff's code allows for other variations.
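
The difference between the two patterns can be seen on a few sample strings. The sketch below builds the flexible pattern from a host name with quotemeta. The helper name links_back and the %pages hash with its contents are made up for illustration, with hard-coded HTML standing in for what get would return.

```perl
use strict;
use warnings;

# Hypothetical helper: does $html mention $host, with or without "www."?
sub links_back {
    my ($html, $host) = @_;
    my $quoted = quotemeta $host;            # escape the dots in the host name
    return $html =~ m{http://(?:www\.)?$quoted} ? 1 : 0;
}

# Sample HTML in place of pages fetched with get()
my %pages = (
    'page-a' => '<a href="http://www.domain.com/">link</a>',
    'page-b' => '<a href="http://domain.com">link</a>',
    'page-c' => '<a href="http://www.elsewhere.org/">link</a>',
);

for my $name (sort keys %pages) {
    my $strict   = $pages{$name} =~ m|http://www\.domain\.com| ? 'yes' : 'no';
    my $flexible = links_back($pages{$name}, 'domain.com')      ? 'yes' : 'no';
    print "$name: strict=$strict flexible=$flexible\n";
}
```

Here page-b is found only by the flexible pattern, because its link leaves out the "www." part. The same loop could run over several fetched pages to check many sites in one go.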

Sephiroth32
03-14-2003, 07:42 PM
thanks alot for your help I will try it now

but can you please stop treating me like a complete newb lol. I know \ is supposed to be there :) I am just not a perl expert :P

thanks again, trying it now



EDIT:

arg! It isn't printing anything at all. I pasted your code and replaced the urls. This is what I have:


#!/usr/bin/perl
require "cgi-lib.pl";
&ReadParse;
print &PrintHeader;
my $source = get('http://www.unitedff.com/');
if ($source =~ m|http://www\.fantasysquare\.com|) { print "It's there"; }
else { print "It's not there"; }


and yes I have cgi-lib.pl in the directory. don't ask about the &ReadParse stuff, I just copied a cgi file I had already made. too lazy to make a new one

Nedals
03-14-2003, 07:58 PM
please stop treating me like a complete newb lol
It won't happen again! :D

jeffmott
03-14-2003, 08:08 PM
I pasted your code and replaced the urls. This is what I have...
Is this your entire script? Don't forget to import the LWP::Simple module.
use LWP::Simple;

Sephiroth32
03-14-2003, 08:22 PM
ugh nm you may treat me like a newb :)


EDIT: errr I added the code at the beginning of all that code you gave me and it still doesn't print anything. This is the html it makes:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=windows-1252" http-equiv=Content-Type></HEAD>
<BODY></BODY></HTML>

jeffmott
03-14-2003, 08:44 PM
We're going to enable some warnings to ensure that LWP::Simple is being loaded properly, among other things. I can't guarantee that cgi-lib would pass through these warnings so I'm not going to use it for this example.
#!/usr/bin/perl -wT

use strict;
use warnings FATAL => 'all';
use CGI::Carp qw{fatalsToBrowser};

use LWP::Simple;
my $source = get('http://www.w3.org/');

print "Content-Type: text/html\n\n";
if ($source =~ m|http://(?:www\.)?yoursite\.com/?|) {
    print 'Yup';
}
else {
    print 'Nope';
}

Sephiroth32
03-14-2003, 10:05 PM
internal server error :\

I also tried taking out the -wT or whatever you had in the path to perl and it didn't help

jeffmott
03-15-2003, 10:42 AM
internal server error :\
Did you copy everything? The line with fatalsToBrowser should display the actual error message instead of just an Internal Server Error message.

Sephiroth32
03-15-2003, 10:47 AM
yesh I know what fatalstobrowser does. I copied everything you put also

jeffmott
03-15-2003, 10:52 AM
Are you certain you have the correct path to Perl? The correct permissions? Uploaded as text? These are the only things that will actually keep the CGI::Carp module from reporting the errors.

Sephiroth32
03-15-2003, 11:32 AM
I uploaded in ASCII, chmoded 755, and use the correct path to perl. I will double check everything now

....yup its fine

jeffmott
03-15-2003, 03:18 PM
...alright then, first we'll make sure a simple hello world script works fine.
#!/usr/bin/perl -wT

print "Content-Type: text/html\n\n";
print 'Hello World';
If that works fine then try this next
#!/usr/bin/perl -wT

use LWP::Simple;

print "Content-Type: text/html\n\n";
print get('http://www.w3.org/');

Sephiroth32
03-15-2003, 04:44 PM
lol I already know my server can handle perl, I run a lot of scripts on it so I didn't even try the hello world one.

I put in the second piece of code and there are no errors, it just prints nothing

EDIT: I tried the hello world for the heck of it and it worked

Nedals
03-15-2003, 05:28 PM
#!/usr/bin/perl -wT
use LWP::Simple;

print "Content-Type: text/html\n\n";
print get('http://www.w3.org/');
Jeff, will this work within a print statement? Just asking!


Sephiroth32:
Maybe this. At least you will know that the problem lies in the 'get' statement.
#!/usr/bin/perl -wT
use LWP::Simple;

$source = get('http://www.w3.org/');
print "Content-Type: text/html\n\n";
print "Source follows\n\n";
print "$source\n\n";
print "Source is above\n";
exit;

Sephiroth32
03-15-2003, 05:44 PM
Nedals I get an internal server error from that code :\

Nedals
03-15-2003, 06:21 PM
error 500??

If so, add this, as Jeff recommended, and you should be able to focus in on the problem.

use CGI::Carp qw{fatalsToBrowser};

Another, maybe overly simplistic, test is to set $source to a line of text. If that doesn't work then I'm not sure what to tell you. You are basically back to 'hello world'.

However, if it does, then there's a problem with that 'get' statement. At that point, scream for Jeff :)

jeffmott
03-15-2003, 08:29 PM
I put in the second piece of code and there are no errors it just prints nothing
Add the following line to that same test script.
use warnings FATAL => 'all';
If it still runs without an error then that means that LWP::Simple is available on your server and is being correctly loaded. That it doesn't return anything is rather odd. Is this being run off a permanent Web server or your local machine? If it's a Web server, check if they have any special restrictions on initiating additional external requests. If it's your local machine make sure you have access to the internet when the script is run.

Jeff, will this work within a print statement? Just asking!
Yes. The get() subroutine simply returns a value and you may do whatever you want with that value, such as assign it to a variable or print it.

I get an internal server error from that code
As Nedals suggested, try the fatalsToBrowser again. If that still fails for some reason then check your error log on your server. Ask the administrator if you don't know where to find this.
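
When get returns nothing, LWP::Simple does not say why. One way to see the reason is LWP::UserAgent, from the same libwww-perl distribution, whose response object carries a status line. The following is only a sketch, assuming LWP is installed on the server. The names describe_response and fetch_report are hypothetical, the '403 Forbidden' string is sample input rather than output seen in this thread, and the module is loaded with require inside the subroutine so the formatting half can be run on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: turn the outcome of a request into one line of text.
sub describe_response {
    my ($ok, $status_line, $length) = @_;
    return $ok ? "OK ($status_line), $length characters received"
               : "Request failed: $status_line";
}

# Hypothetical helper: fetch $url and report what happened.
# Assumes libwww-perl is installed; loaded lazily so the rest runs without it.
sub fetch_report {
    my ($url) = @_;
    require LWP::UserAgent;
    my $ua   = LWP::UserAgent->new(timeout => 15);
    my $res  = $ua->get($url);
    my $body = $res->decoded_content;
    return describe_response($res->is_success, $res->status_line,
                             defined $body ? length($body) : 0);
}

print "Content-Type: text/plain\n\n";
print describe_response(0, '403 Forbidden'), "\n";   # sample input only
# In the CGI script itself: print fetch_report('http://www.w3.org/'), "\n";
```

If the host blocks outgoing requests, the status line printed by fetch_report should say so instead of the page silently coming back empty.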