How to Fix robots.txt Issues That Block Search Engines
What the robots.txt check tests
SiteCurl checks two things: whether your site has a robots.txt file at all, and whether the file contains rules that block search engines from crawling important pages. The most common problem is a Disallow: / rule that blocks everything.
robots.txt is a plain text file at the root of your domain (e.g., https://yoursite.com/robots.txt). It gives search engine crawlers instructions about which pages they are and are not allowed to visit.
Why it matters
A missing robots.txt is a minor issue. Search engines will crawl your site without it. But a robots.txt that accidentally blocks crawling is a serious problem. If you have Disallow: / under User-agent: *, you are telling every search engine not to crawl any page on your site.
This often happens during development. A staging site gets a “block all” robots.txt to prevent indexing, and the file gets copied to production during a migration.
How to fix it
Creating a basic robots.txt
Create a file named robots.txt in your site’s root directory with these contents:
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
This allows all crawlers to access all pages and points them to your sitemap.
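You can sanity-check a policy like this before deploying it. The sketch below uses Python's standard-library urllib.robotparser to parse the example file above; the yoursite.com URLs are placeholders for your own domain.

```python
# Minimal sketch: parse the example robots.txt and confirm it
# allows crawling. Uses only the Python standard library.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Allow: /
Sitemap: https://yoursite.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# Any crawler may fetch any page under this policy.
print(parser.can_fetch("Googlebot", "https://yoursite.com/any-page"))  # True

# The parser also surfaces the Sitemap directive.
print(parser.site_maps())  # ['https://yoursite.com/sitemap.xml']
```

The same parser can load a live file with parser.set_url("https://yoursite.com/robots.txt") followed by parser.read().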
Fixing a “block all” rule
If your robots.txt contains Disallow: /, change it to Allow: / or remove the Disallow line entirely. If you need to block specific paths (like admin pages or internal search results), use targeted rules:
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /search?
Disallow: /cart/
Sitemap: https://yoursite.com/sitemap.xml
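To confirm that targeted rules block only what you intend, you can test individual paths. This sketch uses Python's standard-library urllib.robotparser; note that crawling is allowed by default, so the broad Allow: / line is omitted here (Python's stdlib parser applies rules in file order, so a leading Allow: / would otherwise win over the Disallow rules).

```python
# Minimal sketch: verify targeted Disallow rules block only the
# intended paths for the default (*) user agent.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /cart/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("Googlebot", "https://yoursite.com/"))             # True
print(parser.can_fetch("Googlebot", "https://yoursite.com/admin/login"))  # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/cart/"))        # False
```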
Blocking specific bots
If you want to block a specific crawler (like an aggressive scraper) without affecting search engines, add a rule for that bot’s user agent:
User-agent: BadBot
Disallow: /
User-agent: *
Allow: /
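You can verify that a per-bot rule applies only to the named crawler. This sketch parses the example above with Python's standard-library urllib.robotparser and checks both user agents; BadBot is the placeholder name from the example, not a real crawler.

```python
# Minimal sketch: confirm BadBot is blocked while other crawlers
# still fall through to the default (*) rule.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: BadBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("BadBot", "https://yoursite.com/pricing"))     # False
print(parser.can_fetch("Googlebot", "https://yoursite.com/pricing"))  # True
```

Note that robots.txt is advisory: well-behaved crawlers obey it, but an aggressive scraper may ignore it entirely, so server-side blocking may also be needed.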
Including a sitemap reference
Always include a Sitemap: directive pointing to your XML sitemap. This helps search engines discover all your pages, even ones that are not linked from your navigation.
How to verify the fix
Visit https://yoursite.com/robots.txt in your browser. Confirm it loads, does not contain a blanket Disallow: / under User-agent: *, and does not block any paths you want indexed. Run a SiteCurl scan to verify automatically.
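The manual check above can also be scripted. The helper below, audit_robots, is a hypothetical function (not part of SiteCurl) that reports which important paths a given robots.txt blocks for the default user agent, using Python's standard-library parser.

```python
# Sketch of an automated verification step: list the important
# paths that a robots.txt blocks for the default (*) user agent.
from urllib.robotparser import RobotFileParser

def audit_robots(robots_text, important_paths, site="https://yoursite.com"):
    """Return the subset of important_paths blocked by robots_text."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [p for p in important_paths
            if not parser.can_fetch("*", site + p)]

good = "User-agent: *\nAllow: /\n"
bad = "User-agent: *\nDisallow: /\n"

print(audit_robots(good, ["/", "/products/"]))  # [] -> nothing blocked
print(audit_robots(bad, ["/", "/products/"]))   # ['/', '/products/']
```

For a live site, fetch the file first (e.g., with parser.set_url(...) and parser.read()) instead of passing text in.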
Related checks
robots.txt works alongside your XML sitemap and AI crawler access settings. Review all three for complete crawl control.
Start a free trial to check your robots.txt and 84 other items in one scan.