Creating A Robots.txt File For WordPress
WordPress is a ready-made blogging platform which has become very popular, with hundreds of plugins and themes, and more being created pretty much every day. You can take a look at the official plugins over at the main WordPress site; they also have a good range of themes to get you started and off the basic theme that everyone uses.
The reason I use WordPress is that it has a really good admin section, and their latest release makes uploading plugins, widgets and themes a lot easier. A WordPress blog also does well with the search engines, however I have had to tweak it slightly to make it more SEO friendly.
For example, I have created a robots.txt file for WordPress to help me keep Google away from the parts I don’t want it to crawl. You don’t want Google to crawl your tags or calendar links, and you certainly don’t want Google to follow links that could cause duplicate content, like the trackback and comments sections.
So I will show you a couple of duplicate content issues with WordPress but, more importantly, I’ll tell you how to fix them.
I will use my previous post’s URL for this example:
http://directorysubmissions.eu/news/2009/09/08/why-is-everyone-an-seo-expert/
That’s how the URL should be displayed, and the only way it should be displayed, but WordPress produces duplicate content issues like
http://directorysubmissions.eu/news/2009/09/08/why-is-everyone-an-seo-expert/trackback/
and
http://directorysubmissions.eu/news/2009/09/08/why-is-everyone-an-seo-expert/comment-page-1/
If you click each of the above links you will see that it takes you to a page which has the same content as the others.
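To make the pattern a bit more concrete, here is a rough Python sketch that maps the duplicate URLs back to the one real post URL. The helper name canonical_post_url is made up for this illustration and is not part of WordPress; it only assumes the duplicates end in trackback/ or comment-page-N/ as in the examples above.

```python
import re

# Hypothetical helper, not a WordPress function: matches a final
# "trackback/" or "comment-page-N/" path segment (the "/" lookbehind
# stops it from eating post slugs that merely end in "trackback").
_DUPLICATE_SUFFIX = re.compile(r"(?<=/)(?:trackback|comment-page-\d+)/?$")


def canonical_post_url(url: str) -> str:
    """Return the post URL with any trackback or comment-page suffix removed."""
    return _DUPLICATE_SUFFIX.sub("", url)


post = "http://directorysubmissions.eu/news/2009/09/08/why-is-everyone-an-seo-expert/"
for variant in (post, post + "trackback/", post + "comment-page-1/"):
    # Every variant should come back as the plain post URL.
    print(canonical_post_url(variant))
```

All three URLs collapse to the same address, which is why the search engines can see them as copies of one page.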
You can easily fix this by writing a robots.txt file that tells Google to ignore the trackback and comments sections. Just copy and paste the example below into your robots.txt file and your problem is solved.
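# All other crawlers: nothing is blocked
User-Agent: *
Allow: /

# Googlebot: skip trackback, first comment page and tag URLs
User-Agent: Googlebot
Disallow: */trackback/
Disallow: */comment-page-1/
Disallow: */tag/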
You can also block other sections of your WordPress blog to help keep it Google friendly and clean. I have blocked my tag section as well, since Google can find my posts by going through the categories. You only need one way for Google to find your pages, and that can be achieved by having inbound links or a good internal link structure.
You can check your current robots.txt file in Google Webmaster Tools and see whether Google has access to the parts you have blocked or tried to block. If you simply copy and paste my example above into your robots.txt file everything will be fine.
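If you want a quick local check before you upload the file, the Python sketch below tests URLs against the three Disallow patterns from my example. It is a hand-rolled matcher, not Google's own code, and it assumes that * matches any run of characters and that a trailing $ anchors the end of the path. The names pattern_to_regex and is_blocked are made up for this illustration.

```python
import re
from urllib.parse import urlsplit

# Disallow patterns copied from the Googlebot group in the example above.
DISALLOW = ["*/trackback/", "*/comment-page-1/", "*/tag/"]


def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex matched from the path start.

    Assumption: '*' matches any run of characters, a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url, patterns=DISALLOW):
    """Return True if the URL's path matches any of the Disallow patterns."""
    path = urlsplit(url).path or "/"
    return any(pattern_to_regex(p).match(path) for p in patterns)


post = "http://directorysubmissions.eu/news/2009/09/08/why-is-everyone-an-seo-expert/"
for url in (post, post + "trackback/", post + "comment-page-1/", post + "comment-page-2/"):
    print(url, "blocked" if is_blocked(url) else "allowed")
```

Under these assumptions the post itself stays allowed while the trackback and comment-page-1 URLs are blocked. A comment-page-2 URL is not matched by the rule as written, so a blog with longer comment threads may want a broader pattern.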