Make sure your sitemap helps search engines find the right pages

A clean XML sitemap gives crawlers a reliable list of URLs to discover and revisit. SiteCurl checks whether the sitemap exists and whether robots.txt points to it.

What this check does

SiteCurl looks for /sitemap.xml and records whether it loads. It also checks whether the sitemap is listed in robots.txt, which makes it easier for crawlers to find.

The check confirms the sitemap returns a valid HTTP response. If the sitemap URL returns a 404, a 500 error, or redirects to an unexpected path, SiteCurl flags it. A sitemap that exists but is not listed in robots.txt is also flagged, since crawlers may not find it on their own.

SiteCurl does not parse each URL inside the sitemap, but it confirms the file is there and loads. This is the foundation: if the sitemap itself is broken, crawlers cannot use it to discover any of the URLs it lists.
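
To make the response check concrete, here is a minimal Python sketch of how a sitemap fetch could be classified. It is not SiteCurl's implementation: the function names, the finding labels, and the rule that "a redirect to a different path is suspicious" are assumptions made for illustration.

```python
import urllib.error
import urllib.request
from typing import Tuple
from urllib.parse import urlsplit


def fetch_status(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch a URL and return (status code, final URL after redirects)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.geturl()
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx; report the code against the requested URL.
        return err.code, url


def classify_sitemap_response(status: int, requested_url: str, final_url: str) -> str:
    """Turn a fetch result into a short finding (labels are hypothetical)."""
    if status == 404:
        return "not found"
    if status >= 500:
        return "server error"
    if status != 200:
        return f"unexpected status {status}"
    # Assumption: landing on a different path than requested is worth flagging.
    if urlsplit(final_url).path != urlsplit(requested_url).path:
        return "redirected to a different path"
    return "ok"
```

In use, the two pieces would be combined as `classify_sitemap_response(*fetch_status(url), ...)` style logic: fetch once, then classify the status and final URL against the URL you requested.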

How this shows up in the real world

An XML sitemap is a list of URLs you want Google to crawl. It is not a ranking signal on its own, but it is a discovery mechanism. For new sites, large sites, and sites with poorly linked sections, the sitemap is often the first way Google learns about a page.

Sitemaps also carry metadata: the lastmod date tells crawlers when the page last changed, and the priority value hints at weight (though most crawlers ignore it). The lastmod date is more useful in practice. It helps crawlers decide which pages to re-crawl after a content update.
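
As an illustration of the lastmod metadata, the following Python sketch builds a small sitemap with one lastmod value per URL. The function name and the example URLs are assumptions; most sites would generate this from their CMS or build tool rather than by hand.

```python
import xml.etree.ElementTree as ET
from datetime import date
from typing import Mapping

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: Mapping[str, date]) -> bytes:
    """Build a sitemap where each URL carries its own last-modified date."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, last_modified in pages.items():
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses the W3C date format; YYYY-MM-DD is the simplest form.
        ET.SubElement(url, "lastmod").text = last_modified.isoformat()
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)


# Hypothetical pages and dates, for illustration only.
sitemap_bytes = build_sitemap({
    "https://example.com/": date(2024, 5, 1),
    "https://example.com/pricing": date(2024, 3, 12),
})
```

The priority field is left out on purpose, since the text above notes that most crawlers ignore it.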

The link between the sitemap and robots.txt matters. Adding a Sitemap: line to robots.txt is the standard way to tell crawlers where to find your sitemap. Without it, crawlers rely on Search Console or common URL guesses like /sitemap.xml. Adding the line removes that guesswork.
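
A rough Python sketch of how a tool might read the Sitemap: lines out of a robots.txt body is shown here. The function name is hypothetical, and the parsing is deliberately simple (it treats everything after a # as a comment).

```python
from typing import List


def sitemap_urls_from_robots(robots_txt: str) -> List[str]:
    """Return every URL declared on a 'Sitemap:' line, in file order."""
    urls = []
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")
        # The field name is matched case-insensitively.
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls


example_robots = """\
User-agent: *
Disallow: /admin

Sitemap: https://example.com/sitemap.xml
"""
found = sitemap_urls_from_robots(example_robots)
```

An empty result is the case the check flags: the sitemap may exist, but robots.txt does not point to it.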

Sites with more than 50,000 URLs need a sitemap index file that points to child sitemaps. Each child can hold up to 50,000 URLs and must stay under 50 MB uncompressed. SiteCurl checks the root sitemap URL, so if your site uses an index, make sure the index file itself loads and is listed in robots.txt.
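
The split into child sitemaps can be sketched in Python as follows. The child file naming (sitemap-1.xml, sitemap-2.xml, and so on) and the function names are assumptions for illustration; only the 50,000-URL limit comes from the text above, and the size limit is not checked here.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def split_into_child_sitemaps(
    urls: List[str], base_url: str, limit: int = MAX_URLS_PER_SITEMAP
) -> Dict[str, List[str]]:
    """Map each child sitemap URL to the slice of page URLs it should list."""
    children = {}
    for start in range(0, len(urls), limit):
        child_number = start // limit + 1
        # Hypothetical naming scheme for the child files.
        children[f"{base_url}/sitemap-{child_number}.xml"] = urls[start:start + limit]
    return children


def build_sitemap_index(child_urls: Iterable[str]) -> bytes:
    """Build the index file that points to each child sitemap."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for child in child_urls:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = child
    return ET.tostring(index, encoding="utf-8", xml_declaration=True)
```

The index built here is the file that should be served at the root sitemap URL and listed in robots.txt.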

Why it matters

Sitemaps do not replace good internal linking, but they improve discovery and confirm which URLs you want crawled. They help most on larger sites, new sites, and sections that lack internal links.

For new sites with few outside links, the sitemap may be the only way Google finds your pages in the first days after launch. Without one, you wait for crawlers to find your pages through links. That can take weeks or months.

Sitemaps also work as a diagnostic tool. By comparing the URLs in your sitemap with the pages Google has indexed (shown in Search Console), you can spot gaps: pages you want indexed that are not, and pages that are indexed but should not be. This is one of the most useful ongoing SEO tasks.
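
The comparison itself is a pair of set differences. The Python sketch below assumes you already have both lists in hand, for example the URLs parsed from your sitemap and a list of indexed URLs exported from Search Console; the function and key names are hypothetical.

```python
from typing import Dict, Iterable, List


def compare_sitemap_to_index(
    sitemap_urls: Iterable[str], indexed_urls: Iterable[str]
) -> Dict[str, List[str]]:
    """Report URLs that appear on one side but not the other."""
    sitemap_set, indexed_set = set(sitemap_urls), set(indexed_urls)
    return {
        # Pages you asked to have crawled that are not indexed yet.
        "in_sitemap_not_indexed": sorted(sitemap_set - indexed_set),
        # Indexed pages you never listed; check whether they should exist.
        "indexed_not_in_sitemap": sorted(indexed_set - sitemap_set),
    }


gaps = compare_sitemap_to_index(
    ["https://example.com/", "https://example.com/pricing"],
    ["https://example.com/", "https://example.com/old-promo"],
)
```

Exact string matching is used, so differences such as a trailing slash or http versus https will show up as gaps. That is often useful, since those variants are usually canonicalization problems.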

Who this impacts most

New sites gain the most from a working sitemap. With no link profile yet, the sitemap is the main way pages are found. A new SaaS product that ships with 20 pages and no sitemap may wait weeks before all pages show up in search.

Large content sites with thousands of articles need sitemaps to handle the scale. Old articles deep in the archive may never be reached by a crawler that starts at the home page. The sitemap makes sure each article is at least submitted for crawling.

Online stores with frequent stock changes rely on lastmod dates in the sitemap to flag which product pages changed. Without a sitemap, price updates and new products may not be re-crawled for days or weeks.

How to fix it

Step 1: Create or expose a live sitemap at a stable URL. Most CMS tools and web frameworks have sitemap plugins or gems. If you use a static site builder, add a build step that makes the sitemap. The standard path is /sitemap.xml at the root of your domain.

Step 2: List only canonical, indexable pages. Each URL in the sitemap should return a 200, should not have a noindex tag, and should match the canonical URL on the page. Remove redirects, 404s, and non-canonical forms from the sitemap.

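Step 3: Add the sitemap URL to robots.txt. Add a line like Sitemap: https://yourdomain.com/sitemap.xml to your robots.txt file. The line can go anywhere in the file, though the bottom is a common convention, and it must use the full absolute URL. This is the standard way to point all crawlers at your sitemap.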

Step 4: Submit the sitemap in Search Console. After big changes (new sections, moves, large content adds), resubmit the sitemap in Google Search Console and Bing Webmaster Tools. This nudges crawlers to read the file sooner than they would on their own.
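
The filtering rule in Step 2 can be sketched in Python like this. The Page record and its fields are hypothetical stand-ins for whatever your crawler or CMS reports, and treating a page with no canonical tag as self-canonical is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Page:
    url: str
    status: int               # HTTP status returned for the URL
    noindex: bool              # True if the page carries a noindex directive
    canonical: Optional[str]   # canonical URL declared on the page, if any


def sitemap_eligible(pages: Iterable[Page]) -> List[str]:
    """Keep only URLs that return 200, are indexable, and are canonical."""
    return [
        page.url
        for page in pages
        if page.status == 200
        and not page.noindex
        # Assumption: a missing canonical tag means the page is self-canonical.
        and page.canonical in (None, page.url)
    ]


pages = [
    Page("https://example.com/", 200, False, "https://example.com/"),
    Page("https://example.com/old", 301, False, None),
    Page("https://example.com/cart", 200, True, None),
    Page("https://example.com/shoes?sort=price", 200, False, "https://example.com/shoes"),
]
eligible = sitemap_eligible(pages)
```

Only the home page survives in this example: the redirect, the noindex page, and the parameterized URL that points to a clean canonical are all dropped.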

Common mistakes when fixing this

Listing non-canonical pages. If the sitemap has URLs that redirect to new URLs, or URLs with query params that should point to a clean version, the sitemap sends mixed signals about which URLs matter most.

Leaving old or dead URLs in the sitemap. Crawlers waste crawl budget on dead paths. Clean up the sitemap after each migration, URL change, or content removal.

Thinking the sitemap alone fixes findability. A sitemap tells crawlers about your pages, but internal links still carry weight and context that sitemaps do not. You need both: a sitemap for discovery and links for authority.

Never updating the lastmod dates. If each URL in the sitemap has the same lastmod value (or none at all), crawlers cannot prioritize which pages to re-crawl. Update lastmod only when the page content actually changes.
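
The lastmod mistake is easy to detect mechanically. The Python sketch below counts missing lastmod values in a sitemap and reports whether every value that is present is identical; the function name and report keys are assumptions, and it handles a plain urlset file rather than a sitemap index.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Union

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def lastmod_report(sitemap_xml: Union[str, bytes]) -> Dict[str, Union[int, bool]]:
    """Summarize how lastmod is used across the <url> entries of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    urls = root.findall("sm:url", NS)
    values = [
        (url.findtext("sm:lastmod", default="", namespaces=NS) or "").strip()
        for url in urls
    ]
    present = [value for value in values if value]
    return {
        "total": len(urls),
        "missing": len(values) - len(present),
        # Identical dates on every entry suggest lastmod is set at build time,
        # not when each page actually changed.
        "all_identical": len(present) > 1 and len(set(present)) == 1,
    }
```

A report with a high missing count, or with all_identical set to True on a site whose pages change at different times, is a sign that lastmod is not giving crawlers useful information.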

How to verify the fix

Run a new SiteCurl scan and confirm sitemap warnings are gone. Open the sitemap in the browser to check that it loads, then review a few listed URLs to confirm they are real, canonical pages.

In Search Console, go to the Sitemaps section and check the status. It should show 'Success' with the number of discovered URLs. If the count is lower than you expect, compare the sitemap URLs to the Page indexing report (formerly called Coverage) to find which pages are missing and why.

Example findings from a scan

XML sitemap not found at /sitemap.xml

robots.txt does not list a sitemap URL

Sitemap contains pages that are no longer linked internally

Frequently asked questions

Do small sites need a sitemap?

Yes. Small sites can often be crawled without one, but a sitemap is still a low-effort signal that helps with discovery and maintenance.

Should every indexable page be in the sitemap?

Generally yes, especially the pages you want crawled and kept indexed.

Can a sitemap fix poor internal linking?

No. It helps discovery, but internal links are still important for crawl paths and page importance.
