How to Fix robots.txt Issues That Block Search Engines
What the robots.txt check tests
SiteCurl checks two things: whether your site has a robots.txt file at all, and whether the file contains rules that block search engines from crawling important pages. The most common problem is a Disallow: / rule that blocks everything.
robots.txt is a plain text file at the root of your domain (e.g., https://yoursite.com/robots.txt). It gives search engine crawlers instructions about which pages they are and are not allowed to visit.
Why it matters
A missing robots.txt is a minor issue. Search engines will crawl your site without it. But a robots.txt that accidentally blocks crawling is a serious problem. If you have Disallow: / under User-agent: *, you are telling every search engine not to crawl any page on your site.
This often happens during development. A staging site gets a “block all” robots.txt to prevent indexing, and the file gets copied to production during a migration.
How to fix it
Creating a basic robots.txt
Create a file named robots.txt in your site’s root directory with these contents:
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
This allows all crawlers to access all pages and points them to your sitemap.
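You can sanity-check a policy like this before deploying it. The sketch below uses Python's standard-library urllib.robotparser to parse the example file above; the yoursite.com URLs are placeholders for your own domain.

```python
# Minimal sketch: parse the example robots.txt and confirm it
# allows crawling. Uses only the Python standard library.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Any crawler may fetch any page under this policy.
print(parser.can_fetch("Googlebot", "https://yoursite.com/any-page"))  # True

# The parser also surfaces the Sitemap directive.
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```

The same parser can load a live file with parser.set_url("https://yoursite.com/robots.txt") followed by parser.read().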
Fixing a “block all” rule
If your robots.txt contains Disallow: /, change it to Allow: / or remove the Disallow line entirely. If you need to block specific paths (like admin pages or internal search results), use targeted rules:
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /search?
Disallow: /cart/
Sitemap: https://yoursite.com/sitemap.xml
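To confirm that targeted rules block only what you intend, you can test individual paths. This sketch uses Python's standard-library urllib.robotparser; note that crawling is allowed by default, so the broad Allow: / line is omitted here (Python's stdlib parser applies rules in file order, so a leading Allow: / would otherwise win over the Disallow rules).

```python
# Minimal sketch: verify targeted Disallow rules block only the
# intended paths for the default (*) user agent.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "https://yoursite.com/"))             # True
print(parser.can_fetch("Googlebot", "https://yoursite.com/admin/login"))  # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/cart/"))        # False
```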
Blocking specific bots
If you want to block a specific crawler (like an aggressive scraper) without affecting search engines, add a rule for that bot’s user agent:
User-agent: BadBot
Disallow: /
User-agent: *
Allow: /
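You can verify that a per-bot rule applies only to the named crawler. This sketch parses the example above with Python's standard-library urllib.robotparser and checks both user agents; BadBot is the placeholder name from the example, not a real crawler.

```python
# Minimal sketch: confirm BadBot is blocked while other crawlers
# still fall through to the default (*) rule.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: BadBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("BadBot", "https://yoursite.com/pricing"))     # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/pricing"))  # True
```

Note that robots.txt is advisory: well-behaved crawlers obey it, but an aggressive scraper may ignore it entirely, so server-side blocking may also be needed.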
Including a sitemap reference
Always include a Sitemap: directive pointing to your XML sitemap. This helps search engines discover all your pages, even ones that are not linked from your navigation.
How to verify the fix
Visit https://yoursite.com/robots.txt in your browser. Confirm it loads, does not contain a blanket Disallow: / under User-agent: *, and does not block any paths you want indexed. Run a SiteCurl scan to verify automatically.
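The manual check above can also be scripted. The helper below, audit_robots, is a hypothetical function (not part of SiteCurl) that reports which important paths a given robots.txt blocks for the default user agent, using Python's standard-library parser.

```python
# Sketch of an automated verification step: list the important
# paths that a robots.txt blocks for the default (*) user agent.
from urllib.robotparser import RobotFileParser

def audit_robots(robots_text, important_paths, site="https://yoursite.com"):
    """Return the subset of important_paths blocked by robots_text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in important_paths
            if not parser.can_fetch("*", site + p)]

good = "User-agent: *\nAllow: /\n"
bad = "User-agent: *\nDisallow: /\n"

print(audit_robots(good, ["/", "/products/"]))  # [] -> nothing blocked
print(audit_robots(bad, ["/", "/products/"]))   # ['/', '/products/']
```

For a live site, fetch the file first (e.g., with parser.set_url(...) and parser.read()) instead of passing text in.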
Related checks
robots.txt works alongside your XML sitemap and AI crawler access settings. Review all three for complete crawl control.
Start a free trial to check your robots.txt and 84 other items in one scan.