Most WordPress sites have a sitemap. Far fewer have a good one. The difference matters more than many site owners realize - a poorly constructed sitemap doesn't just fail to help search engines, it actively wastes crawl budget by pointing Googlebot at pages that should never be indexed in the first place.
An XML sitemap is a structured file that tells search engines which URLs on your site you consider worth crawling and indexing. It's not a guarantee of indexing - Google makes that decision independently - but it is a direct communication channel between your site and the crawler. Done well, it accelerates discovery of new content, helps surface pages that lack strong internal links, and gives you a clear picture of what you're asking search engines to index.
What a Sitemap Actually Communicates
The XML sitemap protocol, originally proposed by Google and now supported by all major search engines, allows you to list URLs alongside optional metadata: the last modification date (<lastmod>), how frequently the content changes (<changefreq>), and relative priority (<priority>). In practice, Google has stated it largely ignores <changefreq> and <priority> since they're self-reported and therefore unreliable. The two signals that carry genuine weight are the URL itself and the <lastmod> date - but only when that date is accurate and reflects a meaningful content change, not a cosmetic template update.
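To make that concrete, here is a minimal sketch in Python of a sitemap that carries only the two signals that matter, the URL and an optional `<lastmod>`. The `build_urlset` helper and the example.com URLs are illustrative assumptions, not part of any plugin's API.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_urlset(entries):
    """Build a <urlset> document from (url, lastmod) pairs; lastmod may be None."""
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    for url, lastmod in entries:
        node = ET.SubElement(urlset, "url")
        ET.SubElement(node, "loc").text = url
        # Only emit <lastmod> when a real modification date is known.
        if lastmod:
            ET.SubElement(node, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")

# Hypothetical pages: one with a known modification date, one without.
xml_text = build_urlset([
    ("https://example.com/guide/", "2025-06-01"),
    ("https://example.com/about/", None),
])
print(xml_text)
```

The sketch deliberately leaves out `<changefreq>` and `<priority>`, and it omits `<lastmod>` rather than guessing a date, since an inaccurate date is worse than none.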
The core message of a sitemap is: "These are the URLs I want you to know about." That framing is important, because it means every URL you include is an implicit endorsement. Including low-quality or duplicate pages dilutes that signal.
What Belongs in Your Sitemap
The inclusion logic is straightforward in principle: if a page is indexable, valuable, and canonical, it belongs in the sitemap. In practice, that means:
Published posts and pages - your core editorial content, service pages, landing pages, and evergreen articles. These are the primary reason the sitemap exists.
Key taxonomy pages - category and tag archive pages that aggregate meaningful content and could realistically rank for broad terms. A well-curated category page is a legitimate indexable asset. A tag page with two posts attached to it is not.
Custom post type content - product pages, portfolio items, case studies, and similar post types that represent distinct indexable content.
Image sitemaps (optional but worthwhile) - Google supports a sitemap extension for images, which helps surface images in Google Image Search. If visual content is a meaningful traffic source for your site, an image sitemap is worth implementing. The Image Compressor and related tools can help ensure those images are optimized before you surface them.
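As a rough illustration of the image extension, the sketch below builds one `<url>` entry with nested `<image:image>` elements using Google's image sitemap namespace. The `url_with_images` helper and the example URLs are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"

# Serialize the sitemap namespace as the default and the image one as "image:".
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)

def url_with_images(page_url, image_urls):
    """Return a <url> element listing the images that appear on page_url."""
    url = ET.Element(f"{{{SITEMAP_NS}}}url")
    ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
    for img in image_urls:
        image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
        ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = img
    return url

urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
urlset.append(url_with_images(
    "https://example.com/portfolio/kitchen-remodel/",   # hypothetical page
    ["https://example.com/img/a.jpg", "https://example.com/img/b.jpg"],
))
image_sitemap = ET.tostring(urlset, encoding="unicode")
print(image_sitemap)
```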
What Does Not Belong in Your Sitemap
This is where most WordPress sitemaps go wrong. The default behavior of several plugins is to include every URL the CMS generates - which includes a large number of URLs that actively harm your sitemap's signal quality.
Noindex pages - if a page carries a noindex directive, including it in the sitemap sends a contradictory signal. You're telling the crawler "here's a URL I want you to know about" while simultaneously saying "don't index this." Remove noindex URLs from the sitemap entirely.
Paginated pages - /page/2/, /page/3/, and so on rarely deserve independent indexing. They exist for user navigation, not as standalone ranking targets. Include only the first page of a paginated series, if at all.
Canonicalized duplicates - if a URL has a canonical tag pointing to a different URL, the duplicate should not appear in the sitemap. The sitemap should list the canonical version only.
404 and redirected URLs - including URLs that return non-200 status codes wastes crawl budget and confuses crawlers. Any URL in your sitemap should return a clean 200 response.
Admin, login, and utility pages - /wp-admin/, /wp-login.php, search result pages (/?s=query), and similar utility URLs have no place in an XML sitemap.
Thin or near-duplicate author archives - author archive pages on single-author blogs, or author pages with minimal content, typically add noise rather than value.
Auditing your sitemap against these criteria is one of the fastest ways to improve crawl efficiency. The article on crawl efficiency optimization covers the broader picture of how crawl budget decisions affect your SEO.
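The exclusion rules above can be expressed as a small audit function. This is a sketch under stated assumptions: the `Page` record and its fields are hypothetical stand-ins for whatever your crawler or CMS export provides, and the "thin archive" judgment is left out because it needs editorial input rather than a mechanical check.

```python
import re
from dataclasses import dataclass
from typing import Optional
from urllib.parse import parse_qs, urlparse

@dataclass
class Page:
    """Hypothetical crawl record for one URL."""
    url: str
    status: int = 200
    noindex: bool = False
    canonical: Optional[str] = None

PAGINATED = re.compile(r"/page/\d+/?$")
UTILITY_PREFIXES = ("/wp-admin", "/wp-login.php")

def exclusion_reason(page):
    """Return why a page should be left out of the sitemap, or None to keep it."""
    parsed = urlparse(page.url)
    if page.status != 200:
        return "non-200 status"
    if page.noindex:
        return "noindex"
    if page.canonical and page.canonical != page.url:
        return "canonicalized elsewhere"
    if PAGINATED.search(parsed.path):
        return "paginated"
    if parsed.path.startswith(UTILITY_PREFIXES):
        return "utility page"
    if "s" in parse_qs(parsed.query, keep_blank_values=True):
        return "search results"
    return None

pages = [
    Page("https://example.com/post/"),
    Page("https://example.com/blog/page/2/"),
    Page("https://example.com/old/", status=301),
    Page("https://example.com/?s=query"),
]
for page in pages:
    print(page.url, "->", exclusion_reason(page) or "keep")
```

Returning a reason string instead of a boolean makes the audit output easier to act on, since each excluded URL says which rule it tripped.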
Single Sitemap vs. Sitemap Index
The XML sitemap protocol supports two formats. A single sitemap file lists URLs directly and has a hard limit of 50,000 URLs and 50MB uncompressed. A sitemap index file is a parent document that points to multiple child sitemaps - each covering a subset of the site (posts, pages, products, images, etc.).
For most WordPress sites under a few thousand pages, a single sitemap or a small set of type-specific sitemaps works fine. For larger sites - e-commerce stores with thousands of products, news publishers with years of archived content, or multi-language sites using hreflang - a sitemap index is the appropriate structure. It lets you segment by content type, making it easier to monitor indexing rates per section in Google Search Console and to diagnose issues when a specific content type isn't being crawled as expected.
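For a sense of how the 50,000-URL limit drives the index structure, the following sketch splits a large URL list into child sitemaps and builds the parent index. The product URLs and the `sitemap-products-N.xml` naming scheme are hypothetical, and the sketch checks only the URL count, not the 50MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000  # protocol limit on URLs per sitemap file

def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]

def build_index(child_sitemap_urls):
    """Build a <sitemapindex> document that points at each child sitemap."""
    index = ET.Element("sitemapindex", {"xmlns": SITEMAP_NS})
    for loc in child_sitemap_urls:
        ET.SubElement(ET.SubElement(index, "sitemap"), "loc").text = loc
    return ET.tostring(index, encoding="unicode")

# Hypothetical store with 120,000 product URLs.
product_urls = [f"https://example.com/product/{i}/" for i in range(120_000)]
chunks = chunk_urls(product_urls)

# Hypothetical naming scheme for the child files.
children = [
    f"https://example.com/sitemap-products-{n}.xml"
    for n in range(1, len(chunks) + 1)
]
index_xml = build_index(children)
print(len(chunks), [len(c) for c in chunks])  # 3 [50000, 50000, 20000]
```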
A typical sitemap index for a mid-size WordPress site might look like this:
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://example.com/sitemap-posts.xml</loc>
    <lastmod>2025-06-01</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-pages.xml</loc>
    <lastmod>2025-05-28</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://example.com/sitemap-categories.xml</loc>
    <lastmod>2025-05-20</lastmod>
  </sitemap>
</sitemapindex>
Generating a Sitemap in WordPress
WordPress core has included a basic sitemap generator since version 5.5. It lives at /wp-sitemap.xml and produces a sitemap index covering posts, pages, custom post types, authors, and taxonomies. It requires zero configuration, which is both its strength and its weakness. The core sitemap gives you no settings screen for exclusions (changing its output means writing custom code against its filters), no image sitemap support, and no integration with your noindex settings - meaning it will happily include pages you've marked noindex via other plugins.
For production sites, a dedicated SEO plugin gives you the control you actually need.
Signocore SEO
The Signocore SEO plugin generates sitemaps with direct awareness of your indexability settings. Pages marked noindex are automatically excluded from the sitemap - there's no possibility of the contradiction described above. The plugin produces a sitemap index with type-specific child sitemaps, supports image sitemaps, and respects canonical configurations so duplicates don't leak into the output. Because Signocore SEO is built around technical correctness rather than feature volume, the sitemap output is clean by default rather than requiring manual cleanup after installation.
Yoast SEO
Yoast has offered XML sitemap generation for over a decade and remains a widely used option. Its sitemap respects Yoast's own noindex settings and provides post-type and taxonomy toggles. One known friction point is that Yoast's sitemap can include author archives and date-based archives unless you explicitly disable them - and those settings live in a different part of the plugin interface from the sitemap settings, which creates configuration overhead. Yoast also stores significant sitemap-related data in the WordPress database, which can become a performance consideration on high-traffic or large sites.
Manual Generation
For developers who want full control - or who are building headless WordPress setups - generating a sitemap manually via a custom endpoint or a static build process is a legitimate approach. A custom sitemap function registered via add_action('init', ...) with a rewrite rule can produce exactly the output you want with no plugin overhead. The tradeoff is maintenance: you own the logic, so you're responsible for keeping exclusion rules current as the site evolves.
Whichever method you use, validate the output before submitting. The Sitemap Validator checks your sitemap's structure, confirms URLs return 200 responses, and flags common issues like missing <loc> elements or malformed XML.
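If you want a quick local check before reaching for a hosted validator, a structural pass is easy to sketch in Python. The `check_sitemap` function below is an illustrative assumption, not the Sitemap Validator's logic; it tests only well-formedness and `<loc>` presence, and it does not request the URLs to confirm 200 responses.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def check_sitemap(xml_text):
    """Return a list of structural problems found in a <urlset> document."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"malformed XML: {exc}"]

    problems = []
    for i, url in enumerate(root.findall("sm:url", NS), start=1):
        loc = url.find("sm:loc", NS)
        text = (loc.text or "").strip() if loc is not None else ""
        if not text:
            problems.append(f"entry {i}: missing <loc>")
        elif not text.startswith(("http://", "https://")):
            problems.append(f"entry {i}: <loc> is not an absolute URL")
    return problems

sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><lastmod>2025-06-01</lastmod></url>"
    "</urlset>"
)
print(check_sitemap(sample))  # ['entry 2: missing <loc>']
```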
Submitting Your Sitemap
Generating a sitemap is only half the work. Submitting it directly to search engine webmaster tools ensures faster discovery and gives you feedback on indexing status.
Google Search Console
In Google Search Console, navigate to Indexing > Sitemaps and enter your sitemap URL. Google will fetch it, report how many URLs were submitted versus how many were indexed, and flag any errors. The gap between submitted and indexed URLs is one of the most actionable signals in Search Console - a large gap often indicates quality issues, crawl budget constraints, or canonicalization problems worth investigating. Check this report regularly, not just at initial submission.
Bing Webmaster Tools
Bing Webmaster Tools has a dedicated Sitemaps section under Configure My Site. Bing previously accepted sitemap submissions through a ping URL (https://www.bing.com/ping?sitemap=YOUR_SITEMAP_URL), but it has since deprecated that anonymous endpoint, so rely on submission through Webmaster Tools and the robots.txt directive rather than automating pings on publish. Since Bing supplies results to DuckDuckGo and other downstream search products, it's worth the five minutes to submit there as well.
Beyond direct submission, reference your sitemap in your robots.txt file with a Sitemap: directive. This ensures any crawler that reads robots.txt - not just Google and Bing - discovers your sitemap automatically. The robots.txt Generator makes it straightforward to add this directive correctly.
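To confirm that the directive is readable the way a crawler would read it, Python's standard `urllib.robotparser` can parse a robots.txt and report the sitemaps it declares (Python 3.8 or later). The file contents and the sitemap URL below are example values.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt contents; the sitemap URL is a placeholder.
robots_txt = """\
User-agent: *
Disallow: /wp-admin/

Sitemap: https://example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# site_maps() returns the declared sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()
print(sitemaps)  # ['https://example.com/sitemap_index.xml']
```

The `Sitemap:` line is independent of any `User-agent` group, so it can sit anywhere in the file, and its value should be an absolute URL.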
Keeping Your Sitemap Clean as the Site Grows
A sitemap that's accurate at launch can degrade over time. Content gets redirected, pages get marked noindex, tags accumulate thin archives, and URL structures change. Without periodic maintenance, the sitemap becomes a record of the site's history rather than its current indexable state.
Build these checks into your regular SEO workflow:
Audit after major content changes - when you redirect, delete, or noindex a significant batch of pages, verify those URLs are no longer in the sitemap before the next crawl cycle.
Monitor Search Console's sitemap report monthly - watch the submitted-to-indexed ratio over time. A declining ratio on a growing sitemap often signals that new content isn't meeting Google's quality threshold, or that excluded-but-indexed pages are creating duplicate signals.
Review taxonomy pages annually - tag and category archives tend to proliferate. An annual pass to noindex or consolidate thin taxonomy pages keeps the sitemap tight.
Validate structure after plugin or theme updates - sitemap generation logic can be affected by updates to SEO plugins or custom code. A post-update validation pass catches regressions early.
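Several of these checks come down to comparing what the sitemap lists now with what it listed before. One way to do that, sketched here with hypothetical snapshot contents, is to extract the `<loc>` values from two saved copies and diff the sets.

```python
import xml.etree.ElementTree as ET

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(xml_text):
    """Return the set of <loc> values found in a sitemap document."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip() for loc in root.iterfind(".//sm:loc", NS) if loc.text}

def diff_sitemaps(before_xml, after_xml):
    """Return (added, removed) URL lists between two sitemap snapshots."""
    before, after = sitemap_urls(before_xml), sitemap_urls(after_xml)
    return sorted(after - before), sorted(before - after)

def _urlset(urls):
    # Helper that builds a small snapshot for the example.
    body = "".join(f"<url><loc>{u}</loc></url>" for u in urls)
    return f'<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">{body}</urlset>'

# Hypothetical snapshots taken before and after a plugin update.
before_xml = _urlset(["https://example.com/a/", "https://example.com/b/"])
after_xml = _urlset(["https://example.com/a/", "https://example.com/tag/misc/"])

added, removed = diff_sitemaps(before_xml, after_xml)
print("added:", added)      # added: ['https://example.com/tag/misc/']
print("removed:", removed)  # removed: ['https://example.com/b/']
```

An unexpected entry in the added list, such as a thin tag archive reappearing after an update, is the kind of regression this comparison is meant to surface.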
The broader point is that a sitemap is not a set-and-forget configuration. It's an active signal to search engines about what your site is, and it deserves the same attention as your content strategy. A lean, accurate sitemap - one that includes only canonical, indexable, valuable URLs - does more for your crawl efficiency than a bloated one ever could. Pair it with a sound structured data implementation and a clean robots.txt, and you've given search engines the clearest possible picture of your site.