How do search engines crawl every website?
If I wanted to create a script to index various websites, what technology should I use? How do search engines like Google crawl every website and store the results in a database?
I want to create a script that indexes specific sites (call them A, B and C). How can a script open a URL and examine its contents?
Many programming languages include bindings for cURL. After fetching a page, they can use something like libXML to parse the returned text, both to find more links to follow and to analyze the page contents. For instance, in PHP:
PHP DOM (libXML)
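To make that concrete, here is a minimal sketch of the two steps in PHP: fetching a page with cURL, then parsing it with PHP DOM (libXML) to collect the links a crawler would visit next. The function names and the inline sample page are illustrative, not part of any library.

```php
<?php
// Fetch the raw HTML of a URL with cURL. The URL passed in would be
// one of your sites (A, B or C).
function fetch_page(string $url): string {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true); // return the body instead of echoing it
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); // follow HTTP redirects
    $html = curl_exec($ch);
    curl_close($ch);
    return $html === false ? '' : $html;
}

// Parse HTML with PHP DOM (libXML) and return every href found in an <a> tag.
function extract_links(string $html): array {
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // @ silences warnings about sloppy real-world HTML
    $links = [];
    foreach ($doc->getElementsByTagName('a') as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Demonstration on a small inline page (no network needed):
$html = '<html><body><a href="/about">About</a> <a href="https://example.com/">Ext</a></body></html>';
print_r(extract_links($html));
```

A real crawler repeats the cycle: fetch a page, store its contents, push the extracted links onto a queue of URLs still to visit, and skip any URL it has already seen.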