www.webdeveloper.com

Thread: Help with a Pattern in Robots.txt

  1. #1
    Join Date
    May 2005
    Posts
    59

    Help with a Pattern in Robots.txt

    Hi,
    I am using a CMS that generates all sorts of URLs for the same page. I want to block every variation of the URL that carries parameters but still let search engines crawl the URL without parameters. Do you think this will work?

    Disallow: /member/profile_*.html?*

    Since the pattern contains "?", I believe /member/profile_*.html will still be crawled.

    I want /member/profile_*.html to be crawled but none of the other variations with the parameters.

    Please advise, thank you!

  2. #2
    Join Date
    Oct 2013
    Posts
    517
    From my reading of the info at robotstxt.org, it seems that while you can use * as a wildcard for robots in the User-agent line, you cannot do the same with files, nor can you match a pattern or parameters. You can only disallow entire folders and specifically named files.

    Specifically:
    you cannot have lines like "User-agent: *bot*", "Disallow: /tmp/*" or "Disallow: *.gif".
    Read more at: http://www.robotstxt.org/robotstxt.html
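
As a rough illustration of the prefix-only matching described above, here is a minimal Python sketch. The function name `is_disallowed` and the sample paths are hypothetical; it assumes a crawler that follows only the original rules and treats each Disallow value as a literal path prefix, so a "*" in the value is just an ordinary character.

```python
def is_disallowed(path: str, disallow_values: list[str]) -> bool:
    """Return True if `path` starts with any non-empty Disallow value.

    Assumption: original-style matching, i.e. a plain literal prefix
    comparison with no wildcard or pattern support.
    """
    return any(value and path.startswith(value) for value in disallow_values)


# A folder prefix blocks everything underneath it.
print(is_disallowed("/tmp/cache.html", ["/tmp/"]))   # True

# Under literal prefix matching, "/tmp/*" only matches paths that
# really contain the characters "/tmp/*", so this is not blocked.
print(is_disallowed("/tmp/cache.html", ["/tmp/*"]))  # False
```

Crawlers that implement wildcard extensions will treat the second rule differently, so this sketch only reflects the stricter reading.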

  3. #3
    Join Date
    May 2005
    Posts
    59
    Quote, originally posted by Kevin2:
    From my reading of the info at robotstxt.org, it seems that while you can use * as a wildcard for robots in the User-agent line, you cannot do the same with files, nor can you match a pattern or parameters. You can only disallow entire folders and specifically named files.

    Specifically:
    you cannot have lines like "User-agent: *bot*", "Disallow: /tmp/*" or "Disallow: *.gif".
    Read more at: http://www.robotstxt.org/robotstxt.html

    Thanks for your reply, but the page at the above URL doesn't explain much about file paths. Read this post from Google, especially the section "URL matching based on path values"; it explains the use of "*" and "$" to match file paths.

    https://developers.google.com/webmas...ocs/robots_txt
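
The "*" and "$" matching mentioned above can be sketched in Python to check the rule from the first post. This is only an approximation under stated assumptions: "*" matches any run of characters, a trailing "$" anchors the end of the URL, every rule is matched from the start of the path, and the query string counts as part of the matched text. The helper names `rule_to_regex` and `rule_matches` are hypothetical, not part of any crawler's API.

```python
import re


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a wildcard-style Disallow value into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each "*" match any characters.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    # Without "$", the rule is a prefix: anything may follow it.
    if not anchored:
        body += ".*"
    return re.compile(body)


def rule_matches(rule: str, path_and_query: str) -> bool:
    """Return True if the rule covers the given path (query included)."""
    return rule_to_regex(rule).fullmatch(path_and_query) is not None


rule = "/member/profile_*.html?*"
print(rule_matches(rule, "/member/profile_123.html"))        # False
print(rule_matches(rule, "/member/profile_123.html?tab=1"))  # True
```

Under these assumptions the rule blocks only the URLs that contain "?", which is the behavior asked for in the first post. The trailing "*" is redundant, because a rule without "$" already acts as a prefix. Crawlers that do not implement these extensions may ignore the wildcard or read it literally.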
