Thread: Do you know any existing indexing script?

  1. #1
    Join Date
    Jun 2009

    Question Do you know any existing indexing script?

    I'd like to make an online index of existing web pages.
    The website to index is not mine, and it doesn't have a search tool, nor will it have one anytime soon.

    I can download all the pages to my local computer and turn them into WordPress pages (I'm good at that, but not at SQL), but I think my missing link is how to correlate the content with the real online page. An existing tool or system for indexing pages would probably fill the gap, because I only need the content to create the index. After that, the content is useless.

    So the search results should link to the original website, not to the one I'll put online, which will only be a search form.

    Any ideas?
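
One way to picture the "correlate the content with the real online page" step is a small inverted index keyed by the original URL rather than by the local copy. The following is only a minimal sketch under stated assumptions: the `example.com` URLs, the sample HTML, and the helper names (`TextExtractor`, `html_to_words`, `build_index`, `search`) are all hypothetical, and a real setup would read the downloaded files from disk and store the index somewhere persistent.

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, skipping anything inside <script> or <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_words(html):
    """Return the set of lowercase words found in the page text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return set(re.findall(r"[a-z0-9]+", " ".join(parser.parts).lower()))


def build_index(pages):
    """Map each word to the set of ORIGINAL URLs whose page contains it.

    `pages` maps original URL -> downloaded HTML (hypothetical input shape).
    """
    index = defaultdict(set)
    for url, html in pages.items():
        for word in html_to_words(html):
            index[word].add(url)
    return index


def search(index, query):
    """Return the sorted original URLs that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)


# Hypothetical sample data standing in for the downloaded pages.
pages = {
    "http://example.com/a.html": "<html><body><h1>Apple pie</h1><p>Baking tips</p></body></html>",
    "http://example.com/b.html": "<html><body><p>Apple cider</p><script>var pie = 1;</script></body></html>",
}
index = build_index(pages)
print(search(index, "apple"))
print(search(index, "apple pie"))
```

Once the index is built, the downloaded HTML is no longer needed: the search form only has to look words up and print the stored URLs as links to the original site.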

  2. #2
    Join Date
    May 2004
    chennai, tamil nadu, India
    Here is an indexing script written in PHP

    Chris, Senior Developer,
    Php laravel developers,

  3. #3
    Join Date
    Jun 2009
    Thanks! It's exactly what I was looking for.

  4. #4
    Join Date
    Jun 2009
    OK, I've tried the script.
    The problem is that I can't get it to identify itself with a browser's user agent: it keeps connecting as a "robot" and obeying the robots.txt file, so it fails to index the pages marked as disallowed. At least, that's what the error message says: "File checking forbidden by required/disallowed string rule".

    I tried changing some if conditions so that it would NOT find the robots file, or would ignore it, but that didn't work. I also tried a mod I found online to "ignore robots", but the result was the same, except that there was no error; it just ended. Sphider-plus (1.6) did the same.

    If anyone knows how to hack it, I'd appreciate the tip.
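
Before patching the script, it may help to confirm that robots.txt is really what blocks a given page for a given user agent. The sketch below uses Python's standard `urllib.robotparser` for that check. The robots.txt content, the agent name `ExampleIndexer`, and the `example.com` URLs are assumptions made up for illustration; they are not taken from the site or the script discussed in this thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one named crawler is shut out entirely,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleIndexer
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Compare how the same pages are treated for a crawler-style agent
# and for a browser-style agent string.
agents = ["ExampleIndexer", "Mozilla/5.0 (X11; Linux x86_64)"]
urls = ["http://example.com/page.html", "http://example.com/private/notes.html"]

for agent in agents:
    for url in urls:
        print(agent, url, allowed(ROBOTS_TXT, agent, url))
```

If a page comes back as disallowed for every agent, changing the user agent string will not help, and the rule reflects what the site owner has asked crawlers to stay out of.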
