How do search engines crawl every website?
If I wanted to write a script to index various websites, what technology should I use? How do search engines like Google crawl every website and store the results in a database?
I want to create a script that indexes specified sites (call them A, B and C). How can a script open a URL and examine its contents?
Many programming languages include bindings for cURL, which a script can use to fetch a URL. The script can then use something like libXML to parse the markup that comes back, both to find more links to follow and to analyze the page contents. For instance, in PHP:
PHP DOM (libXML)
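To make the fetch, parse, follow-links loop concrete, here is a rough sketch in Python. It is only an illustration: it uses the standard library's `urllib` and `html.parser` in place of cURL and libXML, and the names `LinkParser`, `extract` and `crawl` are made up for this example. It also assumes pages are UTF-8 and ignores robots.txt and rate limiting, which a real crawler should not.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects href values from <a> tags and all non-empty text nodes."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Note: this also picks up the contents of <script> and <style>.
        if data.strip():
            self.text.append(data.strip())


def extract(html, base_url):
    """Return (absolute_links, text) for one page of HTML."""
    parser = LinkParser()
    parser.feed(html)
    # Resolve relative links against the page URL and drop #fragments.
    links = [urldefrag(urljoin(base_url, href)).url for href in parser.links]
    return links, " ".join(parser.text)


def crawl(start_urls, max_pages=20):
    """Breadth-first crawl that stays on the hosts of the start URLs."""
    allowed = {urlparse(u).netloc for u in start_urls}
    queue, seen, index = deque(start_urls), set(start_urls), {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except (OSError, ValueError):
            continue  # skip pages that fail to load
        links, text = extract(html, url)
        index[url] = text  # a real indexer would store this in a database
        for link in links:
            if urlparse(link).netloc in allowed and link not in seen:
                seen.add(link)
                queue.append(link)
    return index
```

Calling `crawl(["http://A/", "http://B/", "http://C/"])` with the real addresses of your sites would return a dictionary mapping each visited URL to its text. Restricting the queue to the starting hosts is what keeps the script on sites A, B and C instead of wandering across the whole web.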