A robots.txt file can have a huge impact on your WordPress blog's traffic and search engine rank. This is an SEO-optimized WordPress robots.txt file. Keep in mind that if you mess up the robots.txt file by blocking too much, you could lose all of your rank.
Note: This article is outdated. Over the years I’ve learned to use robots.txt only as an authoritative blacklist. My robots.txt is now much simpler and I rely on meta tags instead, as detailed in my SEO article. Sorry, rozkan!
Download the complete file: WordPress robots.txt file
I was inspired to revisit this topic after reading Creating the ultimate WordPress robots.txt file. I then revisited it once again and created the Updated WordPress robots.txt file.
Make use of the robots.txt file on your web server. This file tells crawlers which directories can or cannot be crawled. Make sure it’s current for your site so that you don’t accidentally block the Googlebot crawler.
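Before uploading a new robots.txt, it can help to check a few URLs against it locally. Here is a minimal sketch using Python's standard urllib.robotparser module. The rule set and the sample paths are hypothetical, chosen only to mirror a few of the directory blocks used in this article, and the helper name is_allowed is my own. It only covers plain prefix rules like these.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (plain directory prefixes).
RULES = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def is_allowed(path, agent="Googlebot", rules=RULES):
    """Return True if `agent` may crawl `path` under the given rules."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Sample paths are made up; swap in URLs from your own site.
    for path in ("/", "/2008/01/some-post/", "/wp-admin/options.php"):
        print(path, "allowed" if is_allowed(path) else "blocked")
```

If a page you want indexed shows up as blocked here, fix the rule before the file goes live.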
Place this in your WordPress theme's header.php file. If the page is a single post, a page, or the home page, robots will index it and follow the links on it. Otherwise search engines will not index the page but will still follow its links.
<?php if(is_single() || is_page() || is_home()) { ?>
<meta name="googlebot" content="index,noarchive,follow,noodp" />
<meta name="robots" content="all,index,follow" />
<meta name="msnbot" content="all,index,follow" />
<?php } else { ?>
<meta name="googlebot" content="noindex,noarchive,follow,noodp" />
<meta name="robots" content="noindex,follow" />
<meta name="msnbot" content="noindex,follow" />
<?php }?>
See the Updated WordPress robots.txt file
User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
Disallow: /dh_
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-content/b
Disallow: /wp-content/p
Disallow: /wp-content/themes/askapache/4
Disallow: /wp-content/themes/askapache/c
Disallow: /wp-content/themes/askapache/d
Disallow: /wp-content/themes/askapache/f
Disallow: /wp-content/themes/askapache/h
Disallow: /wp-content/themes/askapache/in
Disallow: /wp-content/themes/askapache/p
Disallow: /wp-content/themes/askapache/s
Disallow: /trackback/
Disallow: /*?*
Disallow: */trackback/

User-agent: Googlebot
# disallow all files ending with these extensions
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.cgi$
Disallow: /*.wmv$
Disallow: /*.png$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /*.php*
Disallow: */trackback*
Disallow: /*?*
Disallow: /z/
Disallow: /wp-*
Allow: /wp-content/uploads/

# allow google image bot to search all images
User-agent: Googlebot-Image
Allow: /*

# allow adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow: /*?*
Allow: /z/
Allow: /about/
Allow: /contact/
Allow: /wp-content/
Allow: /tag/
Allow: /manual/*
Allow: /docs/*
Allow: /*.php$
Allow: /*.js$
Allow: /*.inc$
Allow: /*.css$
Allow: /*.gz$
Allow: /*.cgi$
Allow: /*.wmv$
Allow: /*.cgi$
Allow: /*.xhtml$
Allow: /*.php*
Allow: /*.gif$
Allow: /*.jpg$
Allow: /*.png$

# disallow archiving site
User-agent: ia_archiver
Disallow: /

# disable duggmirror
User-agent: duggmirror
Disallow: /
User-agent: *
Disallow: /cgi-bin/
Disallow: /z/j/
Disallow: /z/c/
Disallow: /stats/
Disallow: /dh_
Disallow: /about/
Disallow: /contact/
Disallow: /tag/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /contact
Disallow: /wp-
Disallow: /feed/
Disallow: /trackback/

User-agent: Googlebot
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.wmv$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /*?*

User-agent: duggmirror
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: Googlebot-Image
Disallow:
Allow: /*

User-agent: Mediapartners-Google*
Disallow:
Allow: /*
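The Googlebot rules above rely on * and $, which are not part of the original robots.txt convention (that one only defines prefix matching). Google documents them as extensions for its own crawlers, so other robots may ignore them. To get a feel for what such a pattern would catch, here is a rough sketch that translates a pattern into a regular expression. It assumes * matches any run of characters and a trailing $ anchors the end of the URL. The function names are my own and this is not any crawler's actual implementation.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters, a trailing '$'
    anchors the end of the URL, and anything else is a literal prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def matches(pattern, path):
    """Return True if `path` would be caught by `pattern`."""
    # re.match anchors at the start, which gives prefix semantics.
    return pattern_to_regex(pattern).match(path) is not None


if __name__ == "__main__":
    # Sample URLs are made up for illustration.
    for pattern, path in [
        ("/*.php$", "/index.php"),
        ("/*.php$", "/index.php?p=1"),
        ("/*?*", "/index.php?p=1"),
        ("/wp-", "/wp-admin/"),
    ]:
        print(pattern, path, matches(pattern, path))
```

Note how /*.php$ stops matching as soon as a query string is appended, which is why the listing pairs it with a separate /*?* rule.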
Thanks, This is exactly what I needed.
thanks!
I had some problems with comment-page-1 title duplicates, and I may have solved them with these lines:
Disallow: /comment-page-*/
Disallow: /blog/comment-page/
Thanks very much. I am very confused about my duplicate content…
Oh, this is great. There is no other article as good as this one. I will learn about it slowly. Thanks.
thanks a lot for this useful article & configure-files provided
Why wouldn't you want Google to index the rest of the site as well as the home page?
(from the header.php meta SEO trick)
Your article is very useful and up to date. Thanks.
We will try this robots.txt on our blog. Hope it adds more SEO value.
Great! Thanks for sharing!
I know that this article is outdated, but just one question.
As far as I know, wildcards are not supported in robots.txt (matching should be based only on substrings: http://www.robotstxt.org/faq/robotstxt.html); only “User-agent: *” is accepted.
So are you sure your robots.txt (even the new one) works as you expect? Are wildcards supported, then?
I have been doing some reading on SEO. I just read this blog. I am using a robots plugin. Is the robots plugin a complement or a replacement to your script? (I am a newbie. Please take it easy.)
Thanks, that's what I was looking for: an SEO-friendly robots.txt.
Hello Apache, I am an avid reader of your blog. I made a robots.txt file based on the information you gave. Is it OK, or are there any mistakes? Thanks.
User-agent: *
# disallow all files in these directories
Disallow: /cgi-bin/
Disallow: /stats/
Disallow: /dh_
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /tag/
Disallow: /hakkinda/
Disallow: /iletisimvereklam/
Disallow: /wp-content/upgrade
Disallow: /wp-content/plugins
Disallow: /wp-content/languages
Disallow: /wp-content/themes/default
Disallow: /wp-content/themes/guzel-pro
Disallow: /wp-content/themes/wp-max
Disallow: /wp-content/themes/classic
Disallow: /trackback/
Disallow: */trackback/
Disallow: /.smileys

User-agent: Googlebot
# disallow all files ending with these extensions
Disallow: /*.php$
Disallow: /*.js$
Disallow: /*.inc$
Disallow: /*.css$
Disallow: /*.gz$
Disallow: /*.cgi$
Disallow: /*.wmv$
Disallow: /*.png$
Disallow: /*.gif$
Disallow: /*.jpg$
Disallow: /*.cgi$
Disallow: /*.xhtml$
Disallow: /*.php*
Disallow: */trackback*
Disallow: /z/
Disallow: /wp-*
Allow: /wp-content/uploads/

# allow google image bot to search all images
User-agent: Googlebot-Image
Allow: /*

# allow adsense bot on entire site
User-agent: Mediapartners-Google*
Disallow: /*?*
Allow: /z/
Disallow: /hakkinda/
Disallow: /iletisimvereklam/
Allow: /wp-content/
Allow: /tag/
Allow: /manual/*
Allow: /docs/*
Allow: /*.php$
Allow: /*.js$
Allow: /*.inc$
Allow: /*.css$
Allow: /*.gz$
Allow: /*.cgi$
Allow: /*.wmv$
Allow: /*.cgi$
Allow: /*.xhtml$
Allow: /*.php*
Allow: /*.gif$
Allow: /*.jpg$
Allow: /*.png$

# disallow archiving site
User-agent: ia_archiver
Disallow: /

# disable duggmirror
User-agent: duggmirror
Disallow: /
Thanks, genius! I used these commands for robots.txt and a week later I realized my new pages weren’t indexed by Googlebot! By the way, if this is the “ultimate super best robots.txt”, why don’t YOU use it, huh? I see your robots.txt file is pretty simple! So you are lying!
Perfect! Cheers.
This is what I was looking for.
Thanks
That’s exactly the explanation I was looking for! Thanx
This is what i was searching for, thanks a lot, it helped me a lot.
-Naina
That's a really good system and plugin. Since I run a WordPress MU site, this will really help people using the site. One other thing I've been looking for is a sitemap generator that works for WordPress MU. Know of any? (Email me the answer, please :D )
thanks again!
I’d like to try it, but how safe is this? Can we undo this and regain our PR if this customization doesn't work?
This is an amazing post, but I have a couple of questions.
How can adding the above robots.txt file boost rank and add traffic?
Also, please forgive me if I am being dumb. I checked the robots.txt file for your site and it is not like the examples above. I don’t quite understand.
Kindest regards Rene
Can you also include the noarchive command for the robots and msnbot lines, as well as the googlebot line? If not, why?
I was looking for a detailed Search Engine Optimization article on the robots.txt
This is really detailed and useful.
Thanks
:-)
Also could I do something like this?
User-agent: *
Disallow: */wp-content/
Disallow: */wp-admin/
Disallow: */wp-includes/
Disallow: */wp-
Disallow: */feed/
Disallow: /trackback/
Disallow: /cgi-bin/
If I put my robots.txt in my root but have my blog installed in a subfolder, how can I exclude things correctly?
I mean, my cgi-bin folder would work, but my blog is installed in a folder called blog, which means /blog/wp-admin, for instance.
How do I disallow that, then?
Amazing post. Thanks.
Great!! This is exactly what I needed!
I'm still not sure what exactly to remove from your SEO robots.txt, but I will give it a try :)
Great archive for WordPress users!
Tags: adsense, Blocking, Examples, feed, Google, Logs, mediapartners, phpBB, Robot, robots, robots.txt, SEO, WordPress
Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 3.0 License, just credit with a link.
thanks a lot for this useful article