Robots.txt
Robots.txt is a file on your website that tells search engine crawlers which pages they may and may not visit. By controlling what crawlers can read, it shapes what can show up in search results.
Robots.txt is a plain text file that lives at the root of your website (yoursite.com/robots.txt). It gives instructions to search engine crawlers about which parts of your site they're allowed to visit and which they should skip. It's one of the first things Google checks when its crawler arrives at your site.
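The file itself is just a few lines of plain text. A minimal version that lets every crawler visit everything looks like this:

```
User-agent: *
Disallow:
```

User-agent: * means the rules apply to every crawler, and a Disallow line with nothing after the colon means nothing is off limits.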
Despite the name, it has nothing to do with actual robots. "Robots" is just the technical term for the automated programs that search engines use to scan websites. Your robots.txt file is how you talk to them.
Why It Matters for Your Business
Most small business websites don't need complicated robots.txt rules. But having the file set up correctly prevents problems. A misconfigured robots.txt can accidentally block Google from seeing your entire website, which means you'd disappear from search results completely.
On the flip side, there are pages you probably don't want Google indexing: admin login pages, shopping cart pages, internal search results, and staging environments. Robots.txt helps keep those out of search results where they don't belong.
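For example, a small online store might keep those kinds of pages out of the crawl with rules like these (the paths here are illustrative, so swap in whatever paths your own site actually uses):

```
User-agent: *
Disallow: /wp-admin/
Disallow: /cart/
Disallow: /search/
```

Each Disallow line blocks one path and everything underneath it, so /cart/ also covers /cart/checkout and so on.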
The Basics
Every website should have one. Even if it just says "allow everything," having a robots.txt file is a best practice. It tells search engines you've thought about crawling and gives them a starting point. Most CMS platforms create a default one for you.
It's a request, not a command. Robots.txt is more like a polite suggestion. Google and Bing respect it. But not every crawler does. If you need to truly block access to something sensitive, use password protection instead. Don't rely on robots.txt alone for security.
Point to your sitemap. Including a link to your sitemap in your robots.txt file is a common practice. It looks like: Sitemap: https://yoursite.com/sitemap.xml. This helps search engines find your sitemap automatically without needing to guess where it is.
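Putting that together with an allow-everything rule, a complete small-site robots.txt might look like this (with yoursite.com standing in for your own domain):

```
User-agent: *
Disallow:

Sitemap: https://yoursite.com/sitemap.xml
```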
Don't block your CSS or JavaScript. Years ago, some sites blocked these files in robots.txt. That's a bad idea now. Google needs to see your CSS and JavaScript to understand how your pages look and function. Blocking them can hurt your rankings.
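This is the kind of outdated rule to look for and remove. If your file contains lines like these (the folder names are illustrative), Google can't fetch your styles and scripts:

```
# Don't do this - Google needs these files to render your pages
User-agent: *
Disallow: /css/
Disallow: /js/
```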
Check it before you launch. During development, many sites use robots.txt to block all search engines so unfinished pages don't get indexed. If you forget to update it when you launch, Google will never find your site. This is one of the most common SEO mistakes with new websites.
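The block-everything file that development sites often use is only two lines. If you see this on a live site, that's the launch mistake described above:

```
User-agent: *
Disallow: /
```

Changing Disallow: / to Disallow: (nothing after the colon) reopens the whole site to crawlers.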
Frequently Asked Questions
What happens if I don't have a robots.txt file?
Nothing catastrophic. If there's no robots.txt file, search engines will simply crawl everything they can find on your site. For most small business websites, that's fine. But having one gives you control and prevents search engines from wasting time on pages that don't matter, like admin pages or duplicate content.
Can robots.txt block my website from Google?
Yes, and it's more common than you'd think. A single line like Disallow: / tells Google not to crawl any page on your site. This is sometimes left in place accidentally after a site redesign or migration. If your site suddenly disappears from Google, checking robots.txt should be one of the first things you do. Visit yoursite.com/robots.txt in your browser to see what it says.
How do I edit my robots.txt file?
On WordPress, SEO plugins like Yoast let you edit it from the dashboard. On Wix, there's a robots.txt editor in the SEO settings; some hosted platforms, like Squarespace, generate the file for you automatically and limit what you can change. If your site is custom-built, it's a plain text file in your site's root directory that any developer can edit. Google Search Console also has a robots.txt report that shows which robots.txt file Google last fetched and flags any problems it found, so check it after you make changes.
