How do I search through one web site?


Annette
05-17-2004, 03:43 PM
I have a file containing web site names. I wish to create a program that will search each web site that I have listed in my file for a keyword. Does anyone have any idea how to do this?

PeOfEo
05-17-2004, 10:40 PM
You do not create a 'program'. You will be using a server-side language and searching through an array, dataset, or database. There are a few different search algorithms (binary search is the first that comes to mind), and you can also sort the data by the number of times a keyword appears in a block of text using regex, etc. But you are going to need a server-side language to do all of this. You can search the content of each website as I described above; you just have to get the source of the site through the server, which is not super hard. You can also just use a Google site search thingy.
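
As a rough illustration of sorting by how often a keyword appears in a block of text, here is a minimal Python sketch (Python is used only for illustration; the thread itself does not settle on a language). The function names and the sample `pages` data are hypothetical, and the whole-word, case-insensitive matching is an assumption about what "keyword" should mean.

```python
import re

def count_keyword(text, keyword):
    """Count whole-word, case-insensitive occurrences of keyword in text."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))

def rank_pages(pages, keyword):
    """Return (name, count) pairs for pages containing keyword, most hits first."""
    counts = [(name, count_keyword(text, keyword)) for name, text in pages.items()]
    hits = [(name, n) for name, n in counts if n > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical sample data standing in for downloaded page text.
pages = {
    "site-a": "Cats and dogs. Cats sleep a lot.",
    "site-b": "No pets here.",
    "site-c": "One cat... wait, cats!",
}

ranking = rank_pages(pages, "cats")
print(ranking)
```

Pages with no match are dropped from the ranking, so the result lists only the sites worth looking at.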

buntine
05-18-2004, 02:50 AM
Originally posted by PeOfEo
You do not create a 'program'.

That is a program. Anything which tells a computer to perform certain operations is a program.

Regards,
Andrew Buntine.

Annette
05-18-2004, 09:25 AM
Originally posted by PeOfEo
You can search the content of each website as I described above; you just have to get the source of the site through the server, which is not super hard. You can also just use a Google site search thingy.

How do I read the source of a website and the source of all the web pages associated with that website? I have no idea.

buntine
05-18-2004, 11:26 AM
Each language has differing methods. Have you got any programming experience to work from?

Your best bet would be to use XML or XPath. But it can also be done in most server-side languages.
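
To give a feel for the XPath idea, here is a small Python sketch that pulls the links out of a page so that the associated pages can be visited next. It assumes the page is well-formed XHTML, which many real pages are not, and it relies on the limited XPath subset in Python's standard `xml.etree.ElementTree`. The `PAGE` sample and the `extract_links` name are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML standing in for a downloaded page.
PAGE = """<html><body>
<p>Welcome. See <a href="/about.html">about</a> and
<a href="/contact.html">contact</a>.</p>
<a name="top">no href here</a>
</body></html>"""

def extract_links(xhtml):
    """Return the href values of all <a> elements that have an href attribute."""
    root = ET.fromstring(xhtml)
    # ".//a[@href]" is within the XPath subset that ElementTree supports.
    return [a.get("href") for a in root.findall(".//a[@href]")]

links = extract_links(PAGE)
print(links)
```

For pages that are not valid XML, a forgiving HTML parser would be needed instead; this sketch would raise a parse error on them.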

Regards,
Andrew Buntine.

PeOfEo
05-18-2004, 08:00 PM
Well, with my server-side language of choice, ASP.NET, I would either just read the file like a text file (assuming it is a local file), or I would use System.Net.Sockets to get the remote source.
http://articles.************.com/p/articles/mi_m0MLV/is_12_2/ai_95540470 is actually on parsing remote documents. You could also play with some xml with the scripting


With ASP I might do something like this... I have no idea what this article is saying, but I dug it up:
http://www.aspfree.com/c/a/ASP/Getting-Remote-Pages-with-ASP/
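
For comparison, here is a minimal Python sketch of the same idea: read a list of sites, fetch each page's source, and report which ones contain a keyword. This is not the approach from either linked article; it uses Python's standard `urllib.request` rather than ASP or sockets. The function names are hypothetical, the `fetch` parameter is there only so the search can be tried without a network, and the match is a plain case-insensitive substring test on the raw source (tags included).

```python
import urllib.request

def fetch_source(url, timeout=10):
    """Download a page and return its body as text (undecodable bytes replaced)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

def load_urls(lines):
    """Turn the lines of a site list into URLs, skipping blank lines."""
    return [line.strip() for line in lines if line.strip()]

def sites_containing(urls, keyword, fetch=fetch_source):
    """Return the URLs whose source contains keyword (case-insensitive)."""
    matches = []
    for url in urls:
        try:
            source = fetch(url)
        except (OSError, ValueError):
            continue  # unreachable site or malformed URL: skip it
        if keyword.lower() in source.lower():
            matches.append(url)
    return matches

# Example use, assuming a file named sites.txt with one URL per line:
# with open("sites.txt") as f:
#     print(sites_containing(load_urls(f), "keyword"))
```

This only checks the one page at each URL; covering every page of a site would mean following its links as well.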

Originally posted by buntine
That is a program. Anything which tells a computer to perform certain operations is a program.

Regards,
Andrew Buntine.

For our purposes we should make a distinction. 'Program' should be reserved for executable applications. The term 'application' can already mean either a website or an executable, and now 'program' too? Please, let's make a distinction... you are going to confuse me if we don't :(