XML Sitemaps: The Complete Guide for SEO

Create and optimize XML sitemaps that help search engines discover your content faster and understand your site structure more effectively.

Liam O'Brien
May 30, 2026 · 10 min read

Key Takeaways

  • XML sitemaps help search engines discover content they might miss through crawling
  • Include only canonical, indexable pages in your sitemap
  • Keep each sitemap file under 50 MB (uncompressed) and 50,000 URLs
  • Use sitemap index files for sites with more than 50,000 URLs
  • Update sitemaps dynamically as content changes
  • Submit sitemaps through Google Search Console and reference them in robots.txt

XML sitemaps are a direct communication channel to search engines. Crawlers can discover content through internal and external links, but a sitemap lets you state explicitly which URLs you want crawled and when each one last changed.

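Not every site strictly needs an XML sitemap: Google notes that small sites whose pages are all well linked can usually be crawled without one. Large sites, new sites with few inbound links, and sites with deep or poorly linked archives benefit most. Search engines find the sitemap through a robots.txt reference or a direct submission, not by requesting it automatically. A well-maintained sitemap speeds up content discovery and supplies metadata such as last-modified dates. Google has said it ignores the priority and changefreq fields, so do not rely on them to rank pages against each other.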

What to Include in Your Sitemap

Your sitemap should include every page you want indexed and exclude pages you do not want indexed. This sounds simple but requires careful consideration.

Include:

  • Blog posts and articles
  • Category and tag pages (if they add value)
  • Product pages
  • Landing pages
  • Pillar pages and cornerstone content

Exclude:

  • Admin and login pages
  • Internal search results
  • Tag pages with thin content
  • Pagination pages (if using view-all or proper canonicals)
  • Duplicate content pages
  • Staging or development pages

What to Exclude

Excluding the right pages prevents crawl budget waste. Every URL in your sitemap is a request for Google to crawl that page. If your sitemap lists thousands of thin or duplicate pages, crawl budget goes to low-value content and the sitemap becomes a weaker signal of what matters.

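A sitemap should not list noindex pages long term. The one exception is temporary: after adding a noindex meta tag to a page you want dropped from the index, you can leave its URL in the sitemap until Google has recrawled it and processed the directive. Once the page has dropped out, remove the URL from the sitemap entirely.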
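
The include and exclude rules above can be expressed as a filter over your page inventory. The following is a minimal sketch, assuming a hypothetical `PageRecord` shape; the field names are illustrative and do not come from any particular CMS.

```typescript
// Hypothetical page record; adapt the fields to your own data source.
interface PageRecord {
  url: string;
  canonicalUrl: string; // URL the page declares as its canonical
  noindex: boolean;     // true if the page carries a noindex directive
  statusCode: number;   // last known HTTP status for the URL
}

// Keep only pages that are live, indexable, and canonical to themselves.
function selectSitemapUrls(pages: PageRecord[]): string[] {
  const urls = pages
    .filter((p) => p.statusCode === 200)
    .filter((p) => !p.noindex)
    .filter((p) => p.canonicalUrl === p.url)
    .map((p) => p.url);
  // De-duplicate while preserving the original order.
  return Array.from(new Set(urls));
}

const examplePages: PageRecord[] = [
  { url: "https://example.com/", canonicalUrl: "https://example.com/", noindex: false, statusCode: 200 },
  { url: "https://example.com/search?q=a", canonicalUrl: "https://example.com/search?q=a", noindex: true, statusCode: 200 },
  { url: "https://example.com/post?ref=x", canonicalUrl: "https://example.com/post", noindex: false, statusCode: 200 },
  { url: "https://example.com/old", canonicalUrl: "https://example.com/old", noindex: false, statusCode: 404 },
];

// Only the homepage survives: the others are noindex, non-canonical, or broken.
const selectedUrls = selectSitemapUrls(examplePages);
```

Filtering at generation time keeps the sitemap consistent with the signals on the pages themselves, so you do not have to clean it up by hand later.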

Size Limits

XML sitemaps have specific size limits. A single sitemap file can contain a maximum of 50,000 URLs and must be smaller than 50 MB uncompressed. Exceed either limit and search engines may truncate or ignore the file.

If your site exceeds these limits, use a sitemap index file. A sitemap index file lists multiple sitemap files. You can have up to 50,000 sitemap files in a single index, giving you a theoretical maximum of 2.5 billion URLs.
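
As a rough illustration of the URL-count limit, the sketch below splits a list of URLs into groups of at most 50,000, one group per sitemap file. The function name and the placeholder URLs are assumptions for the example. It only enforces the URL count, so the 50 MB size limit would still need a separate check on the serialized output.

```typescript
const MAX_URLS_PER_SITEMAP = 50_000;

// Split a flat URL list into chunks that each fit in one sitemap file.
function chunkUrls(urls: string[], size: number = MAX_URLS_PER_SITEMAP): string[][] {
  if (size < 1) throw new Error("size must be at least 1");
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += size) {
    chunks.push(urls.slice(i, i + size));
  }
  return chunks;
}

// 120,001 placeholder URLs need three files: 50,000 + 50,000 + 20,001.
const manyUrls = Array.from({ length: 120_001 }, (_, i) => `https://example.com/p/${i}`);
const sitemapFiles = chunkUrls(manyUrls);
```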

Sitemap Index Files

A sitemap index file is an XML file that lists multiple sitemap files. It uses the same namespace as a regular sitemap but a different set of tags: a <sitemapindex> root containing <sitemap> entries, instead of <urlset> and <url>. Each entry gives the location (<loc>) and, optionally, the last modified date (<lastmod>) of a child sitemap.

Organize your sitemap index logically. Common approaches include:

  • One sitemap per content type (posts, pages, products)
  • One sitemap per content section (blog, guides, resources)
  • Alphabetical sitemaps for very large sites
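
A sitemap index organized this way can be generated from a list of child sitemaps. This is a minimal sketch that builds the XML as a string; the `ChildSitemap` type, the function names, and the example.com URLs are assumptions for illustration.

```typescript
interface ChildSitemap {
  loc: string;      // absolute URL of the child sitemap
  lastmod?: string; // W3C datetime, for example "2026-06-17"
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Serialize the children into a <sitemapindex> document.
function buildSitemapIndex(children: ChildSitemap[]): string {
  const entries = children.map((c) => {
    const lastmod = c.lastmod ? `\n    <lastmod>${escapeXml(c.lastmod)}</lastmod>` : "";
    return `  <sitemap>\n    <loc>${escapeXml(c.loc)}</loc>${lastmod}\n  </sitemap>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</sitemapindex>",
  ].join("\n");
}

// One child sitemap per content type; the second URL shows ampersand escaping.
const indexXml = buildSitemapIndex([
  { loc: "https://example.com/sitemap-posts.xml", lastmod: "2026-06-17" },
  { loc: "https://example.com/sitemap-products.xml?page=1&size=2" },
]);
```

Escaping matters in practice because URLs with query strings contain ampersands, which make the file invalid XML if written raw.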

Image and Video Sitemaps

Image and video sitemaps help search engines discover media content. While regular sitemaps can include images and videos, dedicated media sitemaps provide additional metadata that improves visibility in image and video search.

For image sitemaps, include:

  • Image location URL
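  • Image caption and title (these sitemap tags were deprecated by Google in 2022; put the information in on-page alt text and captions instead)
  • Image license information (also deprecated as a sitemap tag; use structured data or IPTC photo metadata if applicable)

For video sitemaps, include:
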
  • Video title and description
  • Content URL and player URL
  • Duration and upload date
  • Thumbnail URL
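
The video fields listed above map onto tags in Google's video sitemap extension. The sketch below builds a single `<url>` entry as a string. The `VideoInfo` type and function names are hypothetical, and the enclosing `<urlset>` is assumed to declare the `video` namespace as shown in the comment.

```typescript
// Assumes the enclosing <urlset> declares:
//   xmlns:video="http://www.google.com/schemas/sitemap-video/1.1"
interface VideoInfo {
  pageUrl: string;          // page that hosts the video
  title: string;
  description: string;
  thumbnailUrl: string;
  contentUrl?: string;      // direct media file URL
  playerUrl?: string;       // embeddable player URL
  durationSeconds?: number;
  publicationDate?: string; // W3C datetime
}

function escapeVideoXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

function buildVideoUrlEntry(v: VideoInfo): string {
  // At least one of the two video locations has to be present.
  if (!v.contentUrl && !v.playerUrl) {
    throw new Error("contentUrl or playerUrl is required");
  }
  const e = escapeVideoXml;
  const lines = [
    "<url>",
    `  <loc>${e(v.pageUrl)}</loc>`,
    "  <video:video>",
    `    <video:thumbnail_loc>${e(v.thumbnailUrl)}</video:thumbnail_loc>`,
    `    <video:title>${e(v.title)}</video:title>`,
    `    <video:description>${e(v.description)}</video:description>`,
  ];
  if (v.contentUrl) lines.push(`    <video:content_loc>${e(v.contentUrl)}</video:content_loc>`);
  if (v.playerUrl) lines.push(`    <video:player_loc>${e(v.playerUrl)}</video:player_loc>`);
  if (v.durationSeconds !== undefined) {
    lines.push(`    <video:duration>${Math.round(v.durationSeconds)}</video:duration>`);
  }
  if (v.publicationDate) {
    lines.push(`    <video:publication_date>${e(v.publicationDate)}</video:publication_date>`);
  }
  lines.push("  </video:video>", "</url>");
  return lines.join("\n");
}

const videoEntry = buildVideoUrlEntry({
  pageUrl: "https://example.com/videos/intro",
  title: "Intro & setup",
  description: "A short walkthrough.",
  thumbnailUrl: "https://example.com/thumbs/intro.jpg",
  contentUrl: "https://example.com/media/intro.mp4",
  durationSeconds: 95,
  publicationDate: "2026-05-30",
});
```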

Dynamic Sitemap Generation

Static sitemaps require manual updates every time you publish or remove content. Dynamic sitemaps generate automatically from your content management system or database.

For Next.js sites using the App Router, use the built-in sitemap.ts file convention (app/sitemap.ts) to generate sitemaps programmatically. The file exports a function that queries your content sources and returns the list of URLs, which Next.js serializes to XML.
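
A sitemap.ts file might look roughly like the sketch below. To keep it runnable without the framework installed, it declares a local `SitemapEntry` type as a stand-in for Next.js's `MetadataRoute.Sitemap` entries. `fetchPosts`, `BASE_URL`, and the post slugs are hypothetical placeholders for your own content source.

```typescript
// Local stand-in for the entry type. In a real project you would instead use:
//   import type { MetadataRoute } from "next";
interface SitemapEntry {
  url: string;
  lastModified?: string | Date;
}

// Hypothetical content source; replace with a CMS or database query.
interface Post {
  slug: string;
  updatedAt: string;
}

async function fetchPosts(): Promise<Post[]> {
  return [
    { slug: "xml-sitemaps", updatedAt: "2026-05-30" },
    { slug: "another-post", updatedAt: "2026-04-12" },
  ];
}

const BASE_URL = "https://example.com"; // assumption: your production origin

// In app/sitemap.ts this function would be the file's default export.
async function sitemap(): Promise<SitemapEntry[]> {
  const posts = await fetchPosts();
  const postEntries: SitemapEntry[] = posts.map((post) => ({
    url: `${BASE_URL}/blog/${post.slug}`,
    // Use the real modification date, not the time of generation.
    lastModified: new Date(post.updatedAt),
  }));
  return [{ url: BASE_URL }, ...postEntries];
}
```

Taking lastModified from the content record, and not from the current time, keeps the date meaningful to crawlers that use it to decide what to recrawl.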

For other platforms, use plugins or custom scripts that run on a schedule. WordPress has generated a basic sitemap natively since version 5.5, and numerous plugins extend it. Custom sites can generate sitemaps through cron jobs that query the database.

Submitting Sitemaps to Google

Submit your sitemap through Google Search Console. Add your sitemap URL in the Sitemaps section and monitor the submission status.

Also reference your sitemap in robots.txt with a Sitemap: line that gives the full absolute URL. This provides a secondary discovery method that works for any crawler that reads robots.txt.
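
As a small illustration of the robots.txt reference, the sketch below assembles a robots.txt body with one Sitemap line per sitemap URL. The function name, the `/admin/` path, and the example.com URL are assumptions for the example.

```typescript
// Build a simple robots.txt body that applies to all user agents.
function buildRobotsTxt(sitemapUrls: string[], disallow: string[] = []): string {
  for (const u of sitemapUrls) {
    // The Sitemap line takes a full URL, not a relative path.
    if (!/^https?:\/\//.test(u)) {
      throw new Error(`Sitemap URL must be absolute: ${u}`);
    }
  }
  const lines = ["User-agent: *"];
  if (disallow.length === 0) {
    lines.push("Disallow:"); // an empty Disallow allows everything
  } else {
    for (const path of disallow) lines.push(`Disallow: ${path}`);
  }
  lines.push("");
  for (const u of sitemapUrls) lines.push(`Sitemap: ${u}`);
  return lines.join("\n") + "\n";
}

const robotsTxt = buildRobotsTxt(["https://example.com/sitemap.xml"], ["/admin/"]);
```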

Bing offers similar submission through Bing Webmaster Tools. Submit to both search engines for maximum coverage.

Monitoring Sitemap Performance

Google Search Console shows how many URLs from your sitemap have been indexed, how many have errors, and how many were excluded. Review this data monthly to identify sitemap issues.

Common sitemap errors include:

  • URLs returning 4xx or 5xx status codes
  • URLs blocked by robots.txt
  • Noindex pages included in sitemap
  • URLs pointing to redirected pages
  • Invalid XML formatting
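
Most of the errors in this list can be caught before submission by checking each sitemap URL against crawl data you already have. The sketch below assumes a hypothetical `CrawlResult` record, produced by your own crawler or log analysis, and reports which URLs should be fixed or removed. It does not fetch anything itself and does not validate the XML.

```typescript
// Hypothetical result of crawling one sitemap URL.
interface CrawlResult {
  url: string;
  statusCode: number;
  blockedByRobots: boolean;
  noindex: boolean;
}

type SitemapIssue = "http-error" | "redirect" | "blocked-by-robots" | "noindex";

// Return every problem found for a single URL.
function findIssues(r: CrawlResult): SitemapIssue[] {
  const issues: SitemapIssue[] = [];
  if (r.statusCode >= 400 && r.statusCode < 600) issues.push("http-error");
  if (r.statusCode >= 300 && r.statusCode < 400) issues.push("redirect");
  if (r.blockedByRobots) issues.push("blocked-by-robots");
  if (r.noindex) issues.push("noindex");
  return issues;
}

// Map each problematic URL to its issues; clean URLs are left out.
function auditSitemap(results: CrawlResult[]): Map<string, SitemapIssue[]> {
  const report = new Map<string, SitemapIssue[]>();
  for (const r of results) {
    const issues = findIssues(r);
    if (issues.length > 0) report.set(r.url, issues);
  }
  return report;
}

const auditReport = auditSitemap([
  { url: "https://example.com/", statusCode: 200, blockedByRobots: false, noindex: false },
  { url: "https://example.com/gone", statusCode: 404, blockedByRobots: false, noindex: false },
  { url: "https://example.com/moved", statusCode: 301, blockedByRobots: false, noindex: false },
  { url: "https://example.com/private", statusCode: 200, blockedByRobots: true, noindex: true },
]);
```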

For integration with your site setup, see our Next.js SEO best practices.

For broader technical SEO, see our technical SEO audit checklist.

Standard XML Sitemap Index Example

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://technical-seo.pages.dev/sitemap-posts.xml</loc>
    <lastmod>2026-06-17T13:00:00Z</lastmod>
  </sitemap>
</sitemapindex>

Common Mistakes

  • Blocking JavaScript & CSS in robots.txt: Googlebot needs these resources to render the page the way users see it; blocking them can hide content and layout from indexing.
  • Not Preloading Critical Hero Images: Forgetting to preload the LCP image delays rendering, resulting in a poorer Lighthouse performance score.
  • Ignoring Client-Side Render Latency: Relying entirely on client-side JS without server-rendered HTML can delay or prevent indexing, particularly on search engines with more limited JavaScript rendering.

When This Does Not Apply

  • Static Marketing Pages: Simple, light static sites with minimal dynamic elements rarely need complex server-rendering, database connections, or API performance strategies.
  • Non-Indexed Portals: Staging sites, dashboard pages behind authentication, or internal company wikis do not benefit from structured data or search engine indexability optimization.

Frequently Asked Questions

How often should I update my sitemap?

Update your sitemap whenever you publish new content or remove existing pages. Dynamic sitemaps update automatically. Static sitemaps should be regenerated at least weekly.

Should I include pagination pages in my sitemap?

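Include pagination pages only if they have unique, indexable content. Otherwise leave them out of the sitemap. If you offer a view-all page, canonicalize to that; avoid pointing the canonical of every paginated page at page one, which Google advises against because the pages are not duplicates of each other.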

Do sitemaps guarantee my pages will be indexed?

No. Sitemaps are hints, not guarantees. Google uses sitemaps to discover URLs but makes independent decisions about which pages to index.

Can I have multiple sitemaps?

Yes. Use a sitemap index file to manage multiple sitemaps. This approach scales to millions of URLs.

Should I include noindex pages in my sitemap?

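No. A noindex page in your sitemap sends mixed signals: the sitemap asks for the URL to be crawled and indexed while the page itself asks to be left out. Google honors the noindex and Search Console flags the URL as submitted but marked noindex, so best practice is to exclude such pages, apart from a brief period while you wait for a newly added noindex to be processed.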

Liam O'Brien

Full-Stack Developer & Web Architecture Engineer

Liam O'Brien is a Full-Stack Developer with 8+ years of experience building high-performance web applications. He specializes in Next.js, React, and Node.js, with a deep focus on web architecture, performance optimization, and technical SEO. Liam has architected front-end systems for e-commerce platforms handling 10 million+ monthly visitors and has contributed to major open-source projects including Next.js core and React documentation. He is passionate about server-side rendering, edge computing, and building scalable web applications that deliver exceptional user experiences. Liam writes about modern JavaScript frameworks, performance patterns, web vitals optimization, and building for search engine crawlers. He believes that great engineering and great SEO go hand in hand.
