How To: What Does a Robots.txt File Do?
The robots.txt file is arguably one of the most important, and most basic, pieces of technical SEO to get right. Let’s start with the basics: what is a robots.txt file? According to Google’s Webmaster Tools guidelines,
“this file tells crawlers which directories can or cannot be crawled. Make sure it’s current for your site so that you don’t accidentally block the Googlebot crawler.”
In simple terms, it’s a text file that grants or denies search engine crawlers access to parts of your site.
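To make that concrete, here is a minimal sketch of what a robots.txt file might look like. It always sits at the root of your domain (e.g. example.com/robots.txt); the /private/ directory and the sitemap URL below are hypothetical placeholders:

```
# Minimal example: allow all crawlers, but keep them out of /private/ (placeholder path)
User-agent: *
Disallow: /private/

# Optionally point crawlers at your XML sitemap (placeholder URL)
Sitemap: https://www.example.com/sitemap.xml
```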
So what are the key things to keep in mind when creating your site’s robots.txt file?
- Using it incorrectly can and likely will hurt your rankings.
- It’s the control room of your site when the search engine spiders arrive, so use it wisely.
- How the spiders interact with this file is an essential part of how search engines function.
- It is mentioned in numerous Google guidelines, so maintaining one is simply good practice.
Now that we know what a robots.txt file does, let’s look at how to use it as effectively as possible. There are three main ways you can configure a robots.txt file (sample directive patterns for each follow this list):
- Full allow – this gives crawlers permission to crawl your entire site. Most sites want this option, as it provides maximum crawlability.
- Full disallow – the exact opposite of the above: it doesn’t allow crawlers to look at any of your site’s content. This blocks all of your content from discovery via search engines and is definitely NOT recommended.
- Conditional allow – your instructions in the robots.txt file dictate to crawlers what they can and cannot crawl. This gives you a great deal of control over which parts of your site make it into the SERPs.
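To illustrate the three options, here are the directive patterns typically used for each. Each block is a standalone alternative (only one would appear in your actual file), and the /admin/ and /checkout/ paths are placeholders you would swap for your own directories:

```
# 1. Full allow – an empty Disallow value blocks nothing
User-agent: *
Disallow:

# 2. Full disallow – blocks every crawler from the entire site (NOT recommended)
User-agent: *
Disallow: /

# 3. Conditional allow – block only specific directories (placeholder paths)
User-agent: *
Disallow: /admin/
Disallow: /checkout/
```

Note the difference between an empty Disallow: line, which permits everything, and Disallow: /, which blocks the whole site; a single stray slash is all it takes to turn a full allow into a full disallow.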
That’s it for the basics of robots.txt; remember to use it wisely. Keep reading up on the technicalities involved: the more you know, the more effective your SEO strategy can be.