XML Sitemaps: The Complete Guide for SEO
Create and optimize XML sitemaps that help search engines discover your content faster and understand your site structure more effectively.

Key Takeaways
- XML sitemaps help search engines discover content they might miss through crawling
- Include only canonical, indexable pages in your sitemap
- Keep each sitemap file under 50 MB (uncompressed) and 50,000 URLs
- Use sitemap index files for sites with more than 50,000 URLs
- Update sitemaps dynamically as content changes
- Submit sitemaps through Google Search Console and reference them in robots.txt
XML sitemaps are a direct communication channel to search engines. Crawlers can discover content through internal and external links, but a sitemap lets you tell search engines explicitly which pages exist on your site and when each one last changed.
Most sites benefit from an XML sitemap, especially large sites, new sites with few inbound links, and sites whose pages are not well linked internally. A well-maintained sitemap speeds up content discovery and gives search engines metadata about your content, such as when each page was last modified.
What to Include in Your Sitemap
Your sitemap should include every page you want indexed and exclude pages you do not want indexed. This sounds simple but requires careful consideration.
Include:
- →Blog posts and articles
- →Category and tag pages (if they add value)
- →Product pages
- →Landing pages
- →Pillar pages and cornerstone content
Exclude:
- →Admin and login pages
- →Internal search results
- →Tag pages with thin content
- →Pagination pages (if using view-all or proper canonicals)
- →Duplicate content pages
- →Staging or development pages
What to Exclude
Excluding the right pages prevents crawl budget waste. Every URL in your sitemap is a request for Google to crawl that page. If your sitemap includes thousands of thin or duplicate pages, Google may spend crawl budget on low-value content.
If you need pages dropped from the index, you can leave them in the sitemap temporarily with a noindex meta tag so that Google recrawls them and processes the tag. Once Google has dropped them, remove those URLs from the sitemap entirely. A sitemap should not list noindex pages long term.
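The exclusion rules above can be applied as a filter before the sitemap is written. The sketch below is a minimal illustration, not a reference implementation: the `Page` record and its fields (`canonical_url`, `noindex`, `status`) are hypothetical names, and a real CMS will expose this information differently.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical page record; the field names are assumptions for illustration.
    url: str
    canonical_url: str
    noindex: bool = False
    status: int = 200


def sitemap_candidates(pages):
    """Return URLs that are live, indexable, and self-canonical."""
    return [
        p.url
        for p in pages
        if p.status == 200 and not p.noindex and p.url == p.canonical_url
    ]


pages = [
    Page("https://example.com/blog/post", "https://example.com/blog/post"),
    Page("https://example.com/login", "https://example.com/login", noindex=True),
    Page("https://example.com/blog/post?ref=x", "https://example.com/blog/post"),
]
print(sitemap_candidates(pages))  # ['https://example.com/blog/post']
```

Only the first page survives: the login page is noindex, and the parameterized URL is a duplicate whose canonical points elsewhere.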
Size Limits
XML sitemaps have specific size limits. A single sitemap file can contain a maximum of 50,000 URLs and must be smaller than 50 MB uncompressed. Exceed either limit and search engines may truncate or ignore the file.
If your site exceeds these limits, use a sitemap index file. A sitemap index file lists multiple sitemap files. You can have up to 50,000 sitemap files in a single index, giving you a theoretical maximum of 2.5 billion URLs.
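As a rough sketch of how the URL limit translates into files, the helper below splits a URL list into chunks of at most 50,000. The function name is illustrative, and it checks only the URL count; a production generator would also need to keep each serialized file under 50 MB.

```python
MAX_URLS_PER_SITEMAP = 50_000  # per-file URL limit described above


def split_into_sitemaps(urls, limit=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into chunks that each fit in one sitemap file."""
    return [urls[i:i + limit] for i in range(0, len(urls), limit)]


urls = [f"https://example.com/p/{n}" for n in range(120_000)]
chunks = split_into_sitemaps(urls)
print([len(c) for c in chunks])  # [50000, 50000, 20000]
```

Each chunk becomes one child sitemap file, and the sitemap index lists all of them.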
Sitemap Index Files
A sitemap index file is an XML file that lists multiple sitemap files. It uses the same XML namespace as a regular sitemap but different elements: a <sitemapindex> root containing <sitemap> entries, instead of a <urlset> root containing <url> entries. Each entry in a sitemap index specifies the location and last modified date of a child sitemap.
Organize your sitemap index logically. Common approaches include:
- →One sitemap per content type (posts, pages, products)
- →One sitemap per content section (blog, guides, resources)
- →Alphabetical sitemaps for very large sites
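One way to produce an index for whichever grouping you choose is to generate it from a list of child sitemaps. This is a minimal sketch using Python's standard library; the child file names and dates are placeholders, and the function name is an assumption for illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_index(children):
    """Build a <sitemapindex> document from (loc, lastmod) pairs."""
    # Setting xmlns as a plain attribute keeps the output free of prefixes.
    root = ET.Element("sitemapindex", xmlns=NS)
    for loc, lastmod in children:
        entry = ET.SubElement(root, "sitemap")
        ET.SubElement(entry, "loc").text = loc
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# One child sitemap per content type (placeholder URLs and dates).
index_xml = build_sitemap_index([
    ("https://example.com/sitemap-posts.xml", "2026-06-17"),
    ("https://example.com/sitemap-products.xml", "2026-06-10"),
])
print(index_xml)
```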
Image and Video Sitemaps
Image and video sitemaps help search engines discover media content. While regular sitemaps can include images and videos, dedicated media sitemaps provide additional metadata that improves visibility in image and video search.
For image sitemaps, include:
- →Image location URL
- →Image caption and title (optional; Google has deprecated these tags and no longer uses them)
- →Image license information (if applicable; also deprecated by Google)
For video sitemaps, include:
- →Video title and description
- →Content URL and player URL
- →Duration and upload date
- →Thumbnail URL
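To show how media metadata attaches to a page entry, here is a hedged sketch that emits an image sitemap using Google's image extension namespace. It includes only the image location for brevity; the function name and example URLs are assumptions, and a video sitemap would follow the same pattern with its own namespace and fields.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Register prefixes so the output uses xmlns and xmlns:image.
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


pages = {
    "https://example.com/gallery": [
        "https://example.com/img/a.jpg",
        "https://example.com/img/b.jpg",
    ],
}
image_sitemap_xml = build_image_sitemap(pages)
print(image_sitemap_xml)
```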
Dynamic Sitemap Generation
Static sitemaps require manual updates every time you publish or remove content. Dynamic sitemaps generate automatically from your content management system or database.
For Next.js sites, use the built-in sitemap.ts file convention to generate sitemaps programmatically. This approach queries your content sources and generates XML output on demand.
For other platforms, use plugins or custom scripts that run on a schedule. WordPress has numerous sitemap plugins. Custom sites can generate sitemaps through cron jobs that query the database.
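For the custom-script case, the core of a scheduled job might look like the sketch below. It assumes the content query has already returned (URL, last-modified date) pairs; the `records` list stands in for that query, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def render_sitemap(records):
    """Render a <urlset> document from (url, last_modified) records."""
    urlset = ET.Element("urlset", xmlns=NS)
    for url, last_modified in records:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        # lastmod uses the W3C date format, e.g. 2026-06-17.
        ET.SubElement(entry, "lastmod").text = last_modified.isoformat()
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Stand-in for rows returned by a CMS or database query.
records = [
    ("https://example.com/", date(2026, 6, 1)),
    ("https://example.com/blog/xml-sitemaps", date(2026, 6, 17)),
]
sitemap_xml = render_sitemap(records)
print(sitemap_xml)
```

A cron job would write `sitemap_xml` to the web root, or a request handler could return it directly with an XML content type.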
Submitting Sitemaps to Google
Submit your sitemap through Google Search Console. Add your sitemap URL in the Sitemaps section and monitor the submission status.
Also reference your sitemap in robots.txt by adding a Sitemap: line with the absolute URL of the sitemap. This provides a secondary discovery method.
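A quick way to confirm the robots.txt reference is in place is to read the file and list the sitemap URLs it declares. The helper below is an illustrative sketch that works on the text of a robots.txt file; the sample content and function name are assumptions.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs declared on Sitemap: lines of a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


robots_txt = """\
User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
"""
print(sitemap_urls_from_robots(robots_txt))  # ['https://example.com/sitemap.xml']
```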
Bing offers similar submission through Bing Webmaster Tools. Submit to both search engines for maximum coverage.
Monitoring Sitemap Performance
Google Search Console shows how many URLs from your sitemap have been indexed, how many have errors, and how many were excluded. Review this data monthly to identify sitemap issues.
Common sitemap errors include:
- →URLs returning 4xx or 5xx status codes
- →URLs blocked by robots.txt
- →Noindex pages included in sitemap
- →URLs pointing to redirected pages
- →Invalid XML formatting
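Several of these errors can be caught before Search Console reports them. The sketch below checks a sitemap for invalid XML, redirects, and error status codes. It assumes the HTTP status of each URL has been collected separately into a dictionary, so the example runs without network access; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def audit_sitemap(xml_text, statuses):
    """Return (url, problem) pairs for a sitemap.

    statuses maps each URL to an HTTP status code collected separately.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [("<sitemap>", f"invalid XML: {exc}")]
    problems = []
    for loc in root.findall("sm:url/sm:loc", NS):
        url = (loc.text or "").strip()
        status = statuses.get(url)
        if status is None:
            problems.append((url, "not checked"))
        elif 300 <= status < 400:
            problems.append((url, "redirects"))
        elif status >= 400:
            problems.append((url, f"returns {status}"))
    return problems


xml_text = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/old</loc></url>"
    "<url><loc>https://example.com/gone</loc></url>"
    "</urlset>"
)
statuses = {
    "https://example.com/": 200,
    "https://example.com/old": 301,
    "https://example.com/gone": 404,
}
print(audit_sitemap(xml_text, statuses))
```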
For broader technical SEO, see our technical SEO audit checklist.
Standard XML Sitemap Index Example
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://technical-seo.pages.dev/sitemap-posts.xml</loc>
    <lastmod>2026-06-17T13:00:00Z</lastmod>
  </sitemap>
</sitemapindex>
Common Mistakes
- →Blocking JavaScript & CSS in robots.txt: Googlebot needs these resources to render pages the way users see them. Blocking them can cause pages to be evaluated with missing layout and content.
- →Not Preloading Critical Hero Images: Forgetting to preload the LCP image delays rendering, resulting in a poor Lighthouse speed score.
- →Ignoring Client-Side Render Latency: Relying entirely on client-side JavaScript without an HTML fallback can delay or prevent indexing on search engines with more limited rendering, such as Bing.
When This Does Not Apply
- →Static Marketing Pages: Simple, light static sites with minimal dynamic elements rarely need complex server-rendering, database connections, or API performance strategies.
- →Non-Indexed Portals: Staging sites, dashboard pages behind authentication, or internal company wikis do not benefit from structured data or search engine indexability optimization.
Frequently Asked Questions
How often should I update my sitemap?
Update your sitemap whenever you publish new content or remove existing pages. Dynamic sitemaps update automatically. Static sitemaps should be regenerated at least weekly.
Should I include pagination pages in my sitemap?
Include pagination pages only if they have unique, indexable content. Otherwise, use canonical URLs pointing to the first page or view-all page.
Do sitemaps guarantee my pages will be indexed?
No. Sitemaps are hints, not guarantees. Google uses sitemaps to discover URLs but makes independent decisions about which pages to index.
Can I have multiple sitemaps?
Yes. Use a sitemap index file to manage multiple sitemaps. This approach scales to millions of URLs.
Should I include noindex pages in my sitemap?
No. Including noindex pages in your sitemap confuses search engines. Google has stated it ignores noindex pages in sitemaps, but best practice is to exclude them entirely.

Full-Stack Developer & Web Architecture Engineer
Liam O'Brien is a Full-Stack Developer with 8+ years of experience building high-performance web applications. He specializes in Next.js, React, and Node.js, with a deep focus on web architecture, performance optimization, and technical SEO. Liam has architected front-end systems for e-commerce platforms handling 10 million+ monthly visitors and has contributed to major open-source projects including Next.js core and React documentation. He is passionate about server-side rendering, edge computing, and building scalable web applications that deliver exceptional user experiences. Liam writes about modern JavaScript frameworks, performance patterns, web vitals optimization, and building for search engine crawlers. He believes that great engineering and great SEO go hand in hand.