Okay, I'm new to the forums (thank you) and I have a simple, or seemingly simple, question regarding the robots.txt file for websites.
Based on my sample below, does the user-agent that is to have access to all of the site need to be listed last in order for the whole site to be indexed? In other words, are the contents of the robots.txt file hierarchical? I would appreciate an answer from anyone who knows; I have found nothing on the web that addresses this.
User-agent: googlebot # all services
Disallow: /private/ # disallow this directory

User-agent: googlebot-news # only the news service
Disallow: / # on everything

User-agent: * # all robots
Disallow: /something/ # on this directory
No. Robots.txt is an exclusion protocol, i.e. the default is for all bots to index everything unless an exclusion applies.
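To illustrate with the sample from the first post: each bot is governed by the group whose User-agent line matches it, and a bot with no group of its own falls back to the * group, wherever that group appears in the file. Below is a rough check using Python's standard urllib.robotparser (the bot name "somebot" and the URLs are made up for the test, and this library's matching is simpler than what Googlebot itself does, so treat it as a sketch of the general behaviour rather than a definitive reference):

from urllib.robotparser import RobotFileParser

# The sample rules from the first post, catch-all group listed last.
SAMPLE = """\
User-agent: googlebot
Disallow: /private/

User-agent: googlebot-news
Disallow: /

User-agent: *
Disallow: /something/
"""

rp = RobotFileParser()
rp.parse(SAMPLE.splitlines())

# googlebot is governed by its own group: only /private/ is off limits.
print(rp.can_fetch("googlebot", "https://example.com/private/page.html"))  # False
print(rp.can_fetch("googlebot", "https://example.com/public/page.html"))   # True

# A bot with no group of its own falls back to the * group.
print(rp.can_fetch("somebot", "https://example.com/something/page.html"))  # False
print(rp.can_fetch("somebot", "https://example.com/other/page.html"))      # True

# Listing the * group first instead of last gives the same answers,
# because each bot still ends up with the same group.
rp2 = RobotFileParser()
rp2.parse(("User-agent: *\nDisallow: /something/\n\n"
           "User-agent: googlebot\nDisallow: /private/\n").splitlines())
print(rp2.can_fetch("googlebot", "https://example.com/private/page.html"))  # False
print(rp2.can_fetch("somebot", "https://example.com/other/page.html"))      # True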
Jedaisoul, thanks for your reply. However, given that the last User-agent listed encompasses all bots, does it have to be listed last, as in the example provided?
The order only matters if "Allow" is used, in which case the "Allow" should precede the "Disallow".
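Crawlers differ in how they resolve overlapping rules (some take the most specific match, others simply take the first rule that matches), so listing the "Allow" before the "Disallow" is the safe convention. A rough illustration, again with Python's urllib.robotparser, which takes the first matching rule; the file names here are made up:

from urllib.robotparser import RobotFileParser

# Hypothetical case: one page inside an otherwise blocked directory is opened up.
# With a first-match parser the Allow must come before the Disallow,
# otherwise the broader Disallow would be hit first and block the page too.
rules = """\
User-agent: *
Allow: /private/annual-report.html
Disallow: /private/
"""

rp = RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("somebot", "https://example.com/private/annual-report.html"))  # True
print(rp.can_fetch("somebot", "https://example.com/private/secret.html"))         # False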