www.webdeveloper.com

Thread: Robots.txt Question

  1. #1
    Join Date
    Oct 2013
    Posts
    2

    Robots.txt Question

    Okay, I'm new to the forums (thank you), and I have what seems like a simple question about the robots.txt file for websites.

    Based on my sample below, is it necessary to list the last user-agent group, the one that gives all robots access to the rest of the site, in order for the whole site to be indexed? In other words, are the contents of the robots.txt file hierarchical? I would appreciate an answer from anyone who knows. I have found nothing on the web that addresses my question.

    HTML Code:
    User-agent: googlebot        # all services
    Disallow: /private/          # disallow this directory
     
    User-agent: googlebot-news   # only the news service
    Disallow: /                  # on everything
     
    User-agent: *                # all robots
    Disallow: /something/        # on this directory

  2. #2
    Join Date
    Mar 2012
    Posts
    1,402
    No. Robots.txt is an exclusion protocol, i.e. the default is for all bots to index everything unless an exclusion applies.
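
    As a rough illustration of that default, here is a minimal sketch that feeds the sample from post #1 to Python's standard-library `urllib.robotparser`. "SomeBot" is a made-up crawler name, and the paths are hypothetical. Note that this shows how Python's parser reads the file; real crawlers may differ in details (for example, Python matches user-agent names by substring in file order, so it is not a faithful model of how Google selects the googlebot-news group).

```python
from urllib.robotparser import RobotFileParser

# The sample rules from post #1, unchanged.
SAMPLE = """\
User-agent: googlebot        # all services
Disallow: /private/          # disallow this directory

User-agent: googlebot-news   # only the news service
Disallow: /                  # on everything

User-agent: *                # all robots
Disallow: /something/        # on this directory
"""


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(SAMPLE)

# "SomeBot" (hypothetical) has no group of its own, so the "*" group applies.
print(parser.can_fetch("SomeBot", "/index.html"))          # True: no exclusion matches
print(parser.can_fetch("SomeBot", "/something/a.html"))    # False: excluded for "*"

# googlebot has its own group, so only that group's exclusions apply to it.
print(parser.can_fetch("googlebot", "/private/a.html"))    # False: excluded for googlebot
print(parser.can_fetch("googlebot", "/something/a.html"))  # True: not excluded for googlebot
```

    Anything that no rule excludes comes back as allowed, which is the "default is to index everything" behaviour described above.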

  3. #3
    Join Date
    Oct 2013
    Posts
    2
    Jedaisoul, thanks for your reply. However, given that the last User-agent listed covers all bots, does it have to be listed last in the example provided?

    Thanks,

    Gregg

  4. #4
    Join Date
    Mar 2012
    Posts
    1,402
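    No, the "*" group does not have to be listed last. The order only matters if "Allow" is used, because some parsers apply the first rule that matches. In that case the "Allow" should precede the "Disallow".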
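
    To make the ordering point concrete, here is a small sketch using Python's standard-library `urllib.robotparser`, which happens to apply the first matching rule within a group. The rule sets, the "SomeBot" name, and the paths are invented for illustration. Other crawlers may resolve conflicts differently (Google, for one, documents a most-specific-match rule, under which the order of the lines would not change the result), so treat this as one parser's behaviour, not a universal one.

```python
from urllib.robotparser import RobotFileParser


def allowed(rules, agent, path):
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


# Same two rules, different order (hypothetical paths).
ALLOW_FIRST = """\
User-agent: *
Allow: /private/public.html
Disallow: /private/
"""

DISALLOW_FIRST = """\
User-agent: *
Disallow: /private/
Allow: /private/public.html
"""

# Same two groups, different order.
STAR_LAST = """\
User-agent: googlebot
Disallow: /private/

User-agent: *
Disallow: /something/
"""

STAR_FIRST = """\
User-agent: *
Disallow: /something/

User-agent: googlebot
Disallow: /private/
"""

# Rule order inside a group matters for this first-match parser.
allow_first = allowed(ALLOW_FIRST, "SomeBot", "/private/public.html")
disallow_first = allowed(DISALLOW_FIRST, "SomeBot", "/private/public.html")
print(allow_first, disallow_first)  # True False

# Group order does not: googlebot gets its own group either way.
star_last = allowed(STAR_LAST, "googlebot", "/something/a.html")
star_first = allowed(STAR_FIRST, "googlebot", "/something/a.html")
print(star_last, star_first)  # True True
```

    With no "Allow" lines, as in the sample from post #1, neither kind of reordering changes the outcome in this parser.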
