WordPress robots.txt file optimized for SEO and Google


May 10, 07

A robots.txt file can have a huge impact on your WordPress blog's traffic and search engine rank. This is an SEO-optimized WordPress robots.txt file. Keep in mind that if you mess up the robots.txt file by blocking too much, you could lose all of your rank.


Download the complete file: WordPress robots.txt file

I was inspired to revisit this topic after reading Creating the ultimate WordPress robots.txt file. I have since revisited it once again and created the Updated WordPress robots.txt file.

Google Says

Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it’s current for your site so that you don’t accidentally block the Googlebot crawler.
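
One way to act on that advice is to run a quick check before uploading a new robots.txt. The sketch below uses Python's standard urllib.robotparser module to list any must-crawl URLs that a set of rules would block. The helper name blocked_urls and the sample paths are assumptions made for illustration. As far as I know, this parser does plain prefix matching and does not understand the * and $ wildcards used later in this post, so treat it as a check for directory-style rules only.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_lines, must_crawl, user_agent="Googlebot"):
    """Return the URLs from must_crawl that the given rules block for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # parse rules from a list of lines, no network access
    return [url for url in must_crawl if not parser.can_fetch(user_agent, url)]


# Hypothetical directory-style rules, taken from the kind used in this post.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /wp-admin/",
    "Disallow: /wp-includes/",
    "Disallow: /trackback/",
]

if __name__ == "__main__":
    # Any path printed here would be hidden from the crawler.
    print(blocked_urls(ROBOTS_LINES, ["/", "/about/", "/wp-admin/options.php"]))
```

If a page you care about shows up in the returned list, the robots.txt is blocking too much.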

header.php meta seo trick

Place this in your WordPress theme's header.php file. If the request is for a single post, a page, or the home page, robots will index it and follow its links. Otherwise, search engines will not index the page but will still follow its links.

<?php if(is_single() || is_page() || is_home()) { ?>
    <meta name="googlebot" content="index,noarchive,follow,noodp" />
    <meta name="robots" content="all,index,follow" />
    <meta name="msnbot" content="all,index,follow" />
<?php } else { ?>
    <meta name="googlebot" content="noindex,noarchive,follow,noodp" />
    <meta name="robots" content="noindex,follow" />
    <meta name="msnbot" content="noindex,follow" />
<?php } ?>

 

seo robots.txt

See the Updated WordPress robots.txt file

User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
Disallow: /dh_
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-content/b
Disallow: /wp-content/p
Disallow: /wp-content/themes/askapache/4
Disallow: /wp-content/themes/askapache/c
Disallow: /wp-content/themes/askapache/d
Disallow: /wp-content/themes/askapache/f
Disallow: /wp-content/themes/askapache/h
Disallow: /wp-content/themes/askapache/in
Disallow: /wp-content/themes/askapache/p
Disallow: /wp-content/themes/askapache/s
Disallow: /trackback/
Disallow: /*?*
Disallow: */trackback/


User-agent: Googlebot
# disallow all files ending with these extensions
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.cgi$
Disallow: /*.wmv$
Disallow: /*.png$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.xhtml$
Disallow: /*.php*
Disallow: */trackback*
Disallow: /*?*
Disallow: /z/
Disallow: /wp-*
Allow: /wp-content/uploads/
 

# allow google image bot to search all images
User-agent: Googlebot-Image
Allow: /*
 
# allow adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow: /*?*
Allow: /z/
Allow: /about/
Allow: /contact/
Allow: /wp-content/
Allow: /tag/
Allow: /manual/*
Allow: /docs/*
Allow: /*.php$
Allow: /*.js$
Allow: /*.inc$
Allow: /*.css$
Allow: /*.gz$
Allow: /*.cgi$
Allow: /*.wmv$
Allow: /*.xhtml$
Allow: /*.php*
Allow: /*.gif$
Allow: /*.jpg$
Allow: /*.png$
 
# disallow archiving site
User-agent: ia_archiver
Disallow: /
 
# disable duggmirror
User-agent: duggmirror
Disallow: /

The Breakdown

disallow files in these directories

User-agent: *
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
Disallow: /dh_
Disallow: /about/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /contact
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/

disallow all files ending with these extensions

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$

disallow all files with ? in url

Disallow: /*?*
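
To make the pattern rules above concrete, here is a minimal Python sketch of how a crawler that supports Google-style wildcards might evaluate them: * matches any run of characters, a trailing $ pins the end of the URL, and anything else is a prefix match. The names rule_to_regex, is_blocked and GOOGLEBOT_RULES are hypothetical. The tie-breaking (longest matching pattern wins, Allow wins ties) follows what Google's current documentation describes and is an assumption here; other crawlers may behave differently.

```python
import re


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_blocked(url_path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if the longest matching rule for url_path is a Disallow."""
    best_len, best_allow = -1, True  # no matching rule means the URL is allowed
    for directive, pattern in rules:
        if not pattern:  # an empty Disallow blocks nothing
            continue
        if rule_to_regex(pattern).match(url_path):
            allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return not best_allow


# A subset of the Googlebot group from this post.
GOOGLEBOT_RULES = [
    ("Disallow", "/*.php$"),
    ("Disallow", "/*?*"),
    ("Disallow", "/wp-*"),
    ("Allow", "/wp-content/uploads/"),
]

if __name__ == "__main__":
    for path in ["/index.php", "/index.php?p=5", "/wp-admin/",
                 "/wp-content/uploads/2007/05/notes.pdf", "/2007/05/some-post/"]:
        print(path, "blocked" if is_blocked(path, GOOGLEBOT_RULES) else "allowed")
```

Under these assumptions, /index.php is caught by the extension rule, /index.php?p=5 is caught only by the ? rule, and files under /wp-content/uploads/ stay crawlable because the Allow pattern is longer than /wp-*.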

disable duggmirror

User-agent: duggmirror
Disallow: /

disallow WayBack archiving site

User-agent: ia_archiver
Disallow: /

allow google image bot to search all images

User-agent: Googlebot-Image
Disallow:
Allow: /*

allow adsense bot on entire site

User-agent: Mediapartners-Google*
Disallow:
Allow: /*
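
One detail worth keeping in mind when reading the groups above: according to Google's documentation, a crawler follows only the single most specific User-agent group that matches it and ignores the rest, so Googlebot does not inherit the rules under User-agent: *. That is presumably why Disallow: /*?* is repeated in several groups of the full file. The Python sketch below illustrates that selection. The function name pick_group is hypothetical, and the substring match is a simplification of how real crawlers compare their name to the User-agent token.

```python
def pick_group(user_agent: str, tokens: list[str]) -> str:
    """Return the User-agent token whose group the crawler would follow.

    The longest token contained in the crawler name wins; '*' is the fallback.
    """
    name = user_agent.lower()
    best, best_len = "*", 0
    for token in tokens:
        bare = token.rstrip("*").lower()  # "Mediapartners-Google*" -> "mediapartners-google"
        if bare and bare in name and len(bare) > best_len:
            best, best_len = token, len(bare)
    return best


# The User-agent lines used in this post's robots.txt.
TOKENS = ["*", "Googlebot", "Googlebot-Image", "Mediapartners-Google*",
          "ia_archiver", "duggmirror"]

if __name__ == "__main__":
    for crawler in ["Googlebot", "Googlebot-Image", "Mediapartners-Google", "msnbot"]:
        print(crawler, "->", pick_group(crawler, TOKENS))
```

A crawler with no group of its own, such as msnbot here, falls back to the User-agent: * group.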

 

Google User-agents

Googlebot: crawls pages from our web index and our news index
Googlebot-Mobile: crawls pages for our mobile index
Googlebot-Image: crawls pages for our image index
Mediapartners-Google: crawls pages to determine AdSense content. We only use this bot to crawl your site if you show AdSense ads on your site.
Adsbot-Google: crawls pages to measure AdWords landing page quality. We only use this bot if you use Google AdWords to advertise your site. Find out more about this bot and how to block it from portions of your site.

 

Google Sponsored Robots.txt Articles

  1. Controlling how search engines access and index your website
  2. The Robots Exclusion Protocol
  3. robots.txt analysis tool
  4. Googlebot
  5. Inside Google Sitemaps: Using a robots.txt file
  6. All About Googlebot

 



Reader Comments

  1. Tyrone Campbell says: August 04, 13h

    That's a really good system and plugin. Since I run a WordPress MU site, this will really help out people using the site. One other thing I've been looking for is a sitemap generator that works for WordPress MU, know of any? (email me answer please :D )

    thanks again!

  2. John Doro says: July 11, 19h

    I’d like to try it, but how safe is this? Can we undo this and regain our PR if this customization doesn’t work?

  3. AskApache says: May 28, 21h

    @Rene

    It's a very confusing subject.. But you can find out what a robots.txt does in general just by reading through a couple of posts on this blog.

    Basically this post is my best robots.txt at that point in time, and the robots.txt I have now is even better.. for me.

    All robots.txt does is tell search engines what to NEVER include in search engine results, so I have moved more towards keeping the robots.txt more open and using meta tags instead.

  4. Rene Dwight says: May 22, 15h

    This is an amazing post but I have a couple of questions:

    How can adding the above robots.txt file boost rank and add traffic?

    Also please forgive me if I am being dumb.. I checked the robots txt file for your site and it is not like the examples above? I don’t quite understand?

    Kindest regards Rene

  5. max says: March 24, 12h

    Can you also include the noarchive command for the robots and msnbot lines, as well as the googlebot line? If not, why?

  6. eBlogger says: October 04, 7h

    I was looking for a detailed Search Engine Optimization article on the robots.txt
    This is really detailed and useful.

    Thanks

    :-)

  7. JoeyPrimiani says: August 19, 22h

    Google and the Marching Robots: Friday, I called my bffs at Google to get the answer straight on the robots.txt file. If you have never heard of a robots.txt file, it is a simple text file that [...]

  8. derek says: June 19, 23h

    Also could I do something like this?

    User-agent: *
    Disallow: */wp-content/
    Disallow: */wp-admin/
    Disallow: */wp-includes/
    Disallow: */wp-
    Disallow: */feed/
    Disallow: /trackback/
    Disallow: /cgi-bin/

  9. derek says: June 19, 22h

    If I put my robots.txt in my root but have my blog installed in a subfolder, how can I exclude correctly?
    I mean my cgi-bin folder would work, but my blog is installed in a folder called blog, which means /blog/wp-admin for instance.
    How to disallow then?

  10. Lovedeep Wadhwa says: June 06, 0h

    Amazing post. Thanks.

  11. Pablo Rosales says: May 27, 0h

    Great!! This is exactly what I needed!

    I’m still not sure what exactly to remove from your seo robots.txt but I will give it a try :)

  12. tech says: April 01, 18h

    great archive for wordpress users !


This work by AskApache.com is licensed under the most accommodating license type available, just credit source according to license.

