Who Else Wants to Lose Money by Getting their Robots.txt File Wrong

Hey, no joke here. Real story. Story about me, though I’d much rather not be the main character this time.

As I’m sure you’ve noticed, there are some AdSense ads on this blog. Not that many but still. Now the funny part. For something like a year the ads haven’t been displaying on any of my archive pages or category pages. All because of one small robots.txt mistake.

For those of you who don’t know, robots.txt is a file that sits in the root directory of your blog or website and waits to be viewed by search engine robots/crawlers. Search engines look at this file to determine which areas of a website are available to them and which aren’t.

The common practice is to use this file to prevent them from accessing duplicate content or some admin and private areas of your blog.

(By the way. There’s a big post on creating a WordPress-friendly robots.txt file coming soon.)

More than a year ago I was playing with the file. I didn’t take it seriously though, and that is where the AdSense problem comes into play.

I don’t know how much money was lost because of it, but it surely is a significant amount. Nevertheless, here’s the lesson for you:

How to make your robots.txt file AdSense-friendly

If you take a look at my current robots.txt file (you can find it here: robots.txt), you will see that I’m disallowing access to my category listings and date-based archives. The lines:

Disallow: /topics/
Disallow: /category/
Disallow: /2009/
Disallow: /2010/
Disallow: /2011/
Disallow: /2012/

There’s nothing wrong with this. In fact, I still consider it a good practice. The problem lies somewhere else.

You see, “search engine robots” is not merely a synonym for “Googlebot”. Google’s search crawler is not the only robot visiting your site. AdSense has a robot of its own too, and it goes by the name Mediapartners-Google. AdSense uses it to look at websites, examine the content and come up with ads that match that content.

When you’re disallowing access to your categories and archives for all robots, the AdSense robot gets shut out along with everyone else. It can no longer see the content on those pages … in other words – no ads for you.
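If you want to see the problem for yourself, here’s a small sketch using Python’s standard urllib.robotparser module. Two assumptions on my part: the Disallow lines sit under a generic "User-agent: *" record (the excerpt above doesn’t show that line), and the example.com URLs are made up for illustration. Python’s parser is only one reading of the robots.txt rules, so treat the result as a hint about what the AdSense robot does rather than as proof.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the Disallow lines belong to a catch-all "User-agent: *" record,
# which means they apply to every crawler that has no record of its own.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/
Disallow: /2012/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs, for illustration only.
category_page = "https://example.com/category/seo/"
single_post = "https://example.com/some-post/"

# Ask the parser what the AdSense crawler is allowed to fetch.
adsense_sees_category = parser.can_fetch("Mediapartners-Google", category_page)
adsense_sees_post = parser.can_fetch("Mediapartners-Google", single_post)

print(adsense_sees_category)  # False: the category page is off limits
print(adsense_sees_post)      # True: single posts are still readable
```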

There’s one very easy solution to this. You just have to include an additional record in your robots.txt file:

User-agent: Mediapartners-Google
Disallow:
Allow: /

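It speaks to AdSense robots directly, letting them know that they can access whatever they want. A robot that finds a record addressed to it by name follows that record instead of the generic one, so your other Disallow lines no longer apply to AdSense. (An empty Disallow already means “nothing is off limits”; the Allow line just says the same thing again.) So in plain English – you get your ads back on.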
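Here’s the same kind of check with the extra record in place, again as a rough sketch with Python’s urllib.robotparser. The "User-agent: *" line and the example.com URL are my assumptions, not something copied from a real file.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a generic record for everyone, plus the AdSense record from above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category/

User-agent: Mediapartners-Google
Disallow:
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URL, for illustration only.
category_page = "https://example.com/category/seo/"

# The AdSense crawler matches its own record and ignores the generic one.
adsense_allowed = parser.can_fetch("Mediapartners-Google", category_page)
# Googlebot has no record of its own here, so the generic rules still apply.
search_allowed = parser.can_fetch("Googlebot", category_page)

print(adsense_allowed)  # True
print(search_allowed)   # False
```

So the category pages stay hidden from the search crawler, while the AdSense robot gets to read them.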

How to check your robots.txt file

The robots.txt file is accessible from the root of your domain. Just go to:

yourdomain.com/robots.txt

If there’s no such file then make sure to check this blog shortly for a detailed guide on how to create it.

Remember, if you’re using AdSense and you don’t see the record presented above in your robots.txt file then you need to place it there. Either do it yourself, ask your webmaster, or use a plugin called Robots Meta (the preferred way).
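If you’d rather not eyeball the file, a few lines of Python can do the checking. This is only a sketch: the function names blocked_for_adsense and fetch_robots_txt are mine, the sample rules and paths are made up, and Python’s parser may not match the real AdSense robot in every detail.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

ADSENSE_AGENT = "Mediapartners-Google"


def blocked_for_adsense(robots_txt, paths):
    """Return the paths the AdSense crawler may not fetch under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(ADSENSE_AGENT, path)]


def fetch_robots_txt(domain):
    """Download robots.txt from the root of a domain (needs network access)."""
    with urlopen(f"https://{domain}/robots.txt") as response:
        return response.read().decode("utf-8", errors="replace")


# Offline sample so the sketch runs without a network connection.
# Assumption: a file with only a catch-all record, like the problem case above.
sample_rules = "User-agent: *\nDisallow: /category/\nDisallow: /2012/\n"
sample_paths = ["/", "/category/seo/", "/2012/05/"]

blocked = blocked_for_adsense(sample_rules, sample_paths)
print(blocked)  # ['/category/seo/', '/2012/05/']
```

To run it against a live site, pass the text from fetch_robots_txt("yourdomain.com") to blocked_for_adsense together with a few of your own category and archive paths. An empty list means the AdSense robot can read all of them.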

So that’s the story. Please don’t be me. Make sure that everything’s fine with your robots.txt file. Finally, feel free to share your own stories. Did you have any similar problems?