Today, you’ll learn about sitemaps.
We’ll cover the basics first. Then move on to discussing different types and the best practices you can follow when creating a sitemap.
And you’ll see some examples.
What Is a Sitemap?
A sitemap is a file that tells search engines like Google what pages you have on your website. It helps them find and index your site.
Sitemaps come in two formats: Extensible Markup Language (XML) and Hypertext Markup Language (HTML). (More on these later.)
While sitemaps are typically created for crawling purposes, companies also build sitemaps when they’re planning their website architecture.
For that purpose, they typically create visual sitemaps: diagrams that show how all the content fits together when planning the site structure.
Note: The rest of this article focuses on sitemaps that are relevant for SEO—the ones that help search engines (and website visitors) find your pages. Not visual sitemaps, which are used for web design purposes.
Why Are Sitemaps Important?
When search engine bots crawl your site, they follow links to discover pages.
But sometimes, they can miss a few nooks and crannies. Especially if your site is large or has complex navigation.
That’s where sitemaps come to the rescue.
By creating a sitemap, you’re giving search engines a handy directory of all your pages.
Think of it as a cheat sheet that tells them, “Hey, these are all the pages I have. Don’t miss them!”
Your pages need to be found before they can rank in search results. And sitemaps help with that.
If you’re already using a sitemap, you can run your website through an auditing tool like Site Audit.
The tool scans your sitemap and identifies any errors it might have, like formatting errors. And offers recommendations on how to fix them.
Configure the tool to run your first audit.
After the audit is complete, go to the “Issues” tab and search for “sitemap.”
You’ll see whether there are any errors detected.
If so, click on “Why and how to fix it” to understand what the issue is and how to address it.
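For a rough idea of what a basic sitemap check involves, here’s a minimal Python sketch. It is not how Site Audit (or any other tool) works internally. The function name `check_sitemap` and the three checks are assumptions chosen for illustration: is the file well-formed XML, is the root element `urlset`, and does every entry have an absolute URL in `<loc>`.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemap protocol, in ElementTree's {uri}tag form.
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def check_sitemap(xml_text):
    """Return a list of problems found in a sitemap (hypothetical helper)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        # A formatting error: the file isn't valid XML at all.
        return [f"not well-formed XML: {err}"]

    problems = []
    if root.tag != f"{NS}urlset":
        problems.append(f"unexpected root element: {root.tag}")

    for url in root.iter(f"{NS}url"):
        loc = url.findtext(f"{NS}loc", default="").strip()
        parsed = urlparse(loc)
        # Each <loc> should be a full URL such as https://yourwebsite.com/page
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"missing or non-absolute <loc>: {loc!r}")
    return problems


sample = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://yourwebsite.com/</loc></url>"
    "<url><loc>/about</loc></url>"
    "</urlset>"
)
for problem in check_sitemap(sample):
    print(problem)
```

A dedicated auditing tool checks far more than this, so treat the sketch as a quick sanity check only.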
Different Kinds of Sitemaps
There are two main kinds of sitemaps: XML and HTML.
Let’s go over each:
An XML sitemap is a file that lists all the pages on your website. Which makes it easier for search engines to crawl and index your content.
XML sitemaps are written for search engine bots—not users.
Along with the list of pages, an XML sitemap can also include other technical details. Like when the page was last modified, how frequently the page content is likely to change, and the page’s priority relative to other pages on the site (indicated on a scale ranging from 0.0 to 1.0).
Here’s what an XML sitemap with this information might look like:
And here’s an example of where an XML sitemap might live: yourwebsite.com/sitemap.xml.
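To make the tags described above concrete, here’s a minimal Python sketch that builds a small XML sitemap with the standard library. The helper name `build_sitemap`, the yourwebsite.com URLs, and the dates are placeholders invented for illustration. The tag names (`urlset`, `url`, `loc`, `lastmod`, `changefreq`, `priority`) and the namespace come from the sitemap protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string for a list of page dicts (hypothetical helper)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["loc"]  # the only required tag
        # The remaining tags are optional, so only emit the ones provided.
        for tag in ("lastmod", "changefreq", "priority"):
            if tag in page:
                ET.SubElement(url, tag).text = str(page[tag])
    if hasattr(ET, "indent"):  # pretty-printing is available from Python 3.9
        ET.indent(urlset)
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder pages; swap in your own URLs and dates.
pages = [
    {"loc": "https://yourwebsite.com/", "lastmod": "2024-01-15",
     "changefreq": "weekly", "priority": 1.0},
    {"loc": "https://yourwebsite.com/blog/", "lastmod": "2024-01-10",
     "changefreq": "daily", "priority": 0.8},
]
print(build_sitemap(pages))
```

Each `<url>` entry in the printed output holds one page, with the optional details nested beside its `<loc>`.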
An HTML sitemap is a page on your website listing all important website pages.
It serves as a table of contents. And helps both search engine bots and human visitors easily navigate through your site.
Unlike XML sitemaps, HTML sitemaps are designed primarily for users.
They provide a handy overview of your website’s structure and allow visitors to find specific pages quickly.
Here’s an example of an HTML sitemap:
An HTML sitemap’s URL looks like a regular webpage URL.
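Because an HTML sitemap is just a page of links, it can be generated from a list of pages. The sketch below is one possible approach, not a standard: the function name `build_html_sitemap`, the section titles, and the paths are all made up for illustration.

```python
from html import escape


def build_html_sitemap(sections):
    """Render {section title: [(link text, URL), ...]} as a nested HTML list."""
    parts = ["<h1>Sitemap</h1>", "<ul>"]
    for title, links in sections.items():
        parts.append(f"  <li>{escape(title)}")
        parts.append("    <ul>")
        for text, url in links:
            # Escape both the URL and the link text so special characters are safe.
            parts.append(f'      <li><a href="{escape(url)}">{escape(text)}</a></li>')
        parts.append("    </ul>")
        parts.append("  </li>")
    parts.append("</ul>")
    return "\n".join(parts)


# Placeholder structure; a real page would list your own important pages.
sections = {
    "Products": [("Shoes & Boots", "/products/shoes/"), ("Bags", "/products/bags/")],
    "Company": [("About", "/about/"), ("Contact", "/contact/")],
}
print(build_html_sitemap(sections))
```

Grouping links under section titles mirrors the table-of-contents role described above.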
What Are the Differences Between XML Sitemaps and HTML Sitemaps?
Let’s look at some key distinctions between XML and HTML sitemaps.
XML sitemaps are:
- Intended for search engines
- Written in XML code
- Able to include URLs in any order
- Not designed for human readability or navigation
HTML sitemaps are:
- Intended for users
- Created in HTML and displayed as webpages
- Helpful for providing a structured list of links to pages within the site
- Designed for human readability and navigation. But can also be used by search engines for crawling.
XML Sitemap Examples
XML sitemaps look a bit cryptic.
Here are some companies that publish one:
- Samsung
- Best Buy
- Shopify
- OpenAI
HTML Sitemap Examples
HTML sitemaps look more human-friendly.
Here are some companies that publish one:
- Microsoft
- Airbnb
- Walmart
- Apple
Sitemap Best Practices
Before creating a sitemap, consider auditing your website to find and fix any technical issues you might have.
You can use Semrush’s Site Audit tool for this.
Pages with these issues shouldn’t be a part of your sitemap. At least not until those issues are fixed.
Because they might confuse search engine bots and waste their crawl budget (a crawler will only get to so many of your pages before it moves on).
Open the tool, enter your website URL, and click “Start Audit.”
The configuration window will pop up.
Next, select the number of pages you want to check for issues. And click “Start Site Audit.”
After the audit is complete, go to the “Issues” tab. You’ll see which technical errors your site has.
You can also search for a specific error. Just type the name of the issue in the search box and the tool will highlight whether you have that problem.
For example, if we search “redirect chains,” we see 107 pages are affected.
It’s a good idea to use this tool to find and fix errors with your website before creating XML and HTML sitemaps.
Let’s quickly go over some other best practices you can follow:
Include Page Priority
If you’re creating an XML sitemap, you can assign a <priority> tag to your pages.
This tag tells search engines how important you consider a page relative to the other pages on your site, from a crawling standpoint.
Priority values range from 0.0 to 1.0 (for example, 0.1, 0.2, and so on). The higher the value, the more important the page.
If every page has the same priority, the tag gives search engines nothing to prioritize by. So, make sure you’re tagging pages appropriately.
Note: The priority tag isn’t a guarantee that search engines will crawl or index pages in the order you specify. It’s a suggestion at most. Google’s sitemap documentation states that Google ignores <priority> values, so the tag only matters for search engines that choose to read it.
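Here’s a small Python sketch of the two checks described in this section: every priority should fall between 0.0 and 1.0, and the values shouldn’t all be identical. The function name `check_priorities` and the sample paths are hypothetical.

```python
def check_priorities(priorities):
    """Return a list of problems found in a {url: priority} mapping."""
    problems = []
    for url, value in priorities.items():
        if not 0.0 <= value <= 1.0:
            problems.append(f"{url}: priority {value} is outside 0.0 to 1.0")
    # Identical values on every page give crawlers nothing to rank by.
    if len(priorities) > 1 and len(set(priorities.values())) == 1:
        problems.append("all pages share the same priority")
    return problems


# Placeholder values for illustration.
print(check_priorities({"/": 1.0, "/blog/": 0.8, "/archive/2015/": 0.2}))
print(check_priorities({"/": 0.5, "/blog/": 0.5}))
```

An empty list means neither problem was found.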
Indicate Change Frequency
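In an XML sitemap, you can use the <changefreq> tag to tell search engines how often you expect a specific URL’s content to change.
This can help them schedule their crawling more efficiently. Like <priority>, it’s a hint rather than a command, and Google’s sitemap documentation says Google ignores it.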
There are seven <changefreq> values you can use (written in lowercase in the XML):
- never: Suggests the content at this URL isn’t expected to change again. This might be used for archived pages that will remain static indefinitely.
- yearly: Indicates that the content at the URL changes about once per year. This could apply to pages hosting annual reports or yearly event information.
- monthly: Works best for pages that are updated on a monthly basis
- weekly: Indicates pages that could get updated each week, such as ecommerce product pages
- daily: Identifies pages that are updated daily, such as horoscope pages
- hourly: Signifies pages that need hourly updates, such as pages that share weather updates
- always: Works best for pages that feature real-time information, such as stock prices
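Since only these seven values are valid, a quick check before writing the tag can catch typos. This is a minimal sketch; the helper name `normalize_changefreq` is hypothetical.

```python
# The seven values allowed by the sitemap protocol, all lowercase.
VALID_CHANGEFREQ = ("always", "hourly", "daily", "weekly", "monthly", "yearly", "never")


def normalize_changefreq(value):
    """Return the value in lowercase, or raise ValueError if it isn't allowed."""
    cleaned = value.strip().lower()
    if cleaned not in VALID_CHANGEFREQ:
        allowed = ", ".join(VALID_CHANGEFREQ)
        raise ValueError(f"{value!r} is not a valid changefreq (use one of: {allowed})")
    return cleaned


print(normalize_changefreq("Weekly"))
```

Passing a value like "fortnightly" raises a `ValueError` instead of silently producing an invalid sitemap.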
Avoid Noindex Pages
Sitemaps signal to search engines which pages you want crawled and indexed.
So, don’t include pages with noindex tags in your sitemap. Doing so sends search engines conflicting signals: the sitemap asks for the page to be indexed, while the tag asks for the opposite.
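One way to keep noindex pages out of a sitemap is to filter them before the sitemap is built. The sketch below assumes you already have each page’s HTML source, and it only looks for a `<meta name="robots">` tag containing `noindex`. It does not check HTTP headers, which can also carry a noindex directive. The class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Sets self.noindex to True if a robots meta tag contains 'noindex'."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            directives = [d.strip().lower() for d in attr.get("content", "").split(",")]
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html_text):
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return parser.noindex


def indexable_urls(pages):
    """pages maps URL -> HTML source; keep only URLs without a noindex tag."""
    return [url for url, html_text in pages.items() if not has_noindex(html_text)]


# Placeholder pages for illustration.
pages = {
    "https://yourwebsite.com/": "<html><head><title>Home</title></head></html>",
    "https://yourwebsite.com/thank-you/":
        '<html><head><meta name="robots" content="noindex, follow"></head></html>',
}
print(indexable_urls(pages))
```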
Avoid Duplicate Content
When search engines encounter identical (or near-identical) pages in your sitemap, they’re prone to wasting their crawl budget on the duplicates.
It’s like they’re going in circles instead of exploring new and valuable content on your website.
By excluding duplicates, you ensure that search engine crawlers focus on the original, unique pages that deserve their attention.
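A simple way to spot exact duplicates is to compare a hash of each page’s text. The sketch below assumes you have the page text available, and it only catches pages that are identical after lowercasing and collapsing whitespace. Near-identical pages need a fuzzier comparison that isn’t shown here. The function name `find_duplicates` is hypothetical.

```python
import hashlib
from collections import defaultdict


def find_duplicates(pages):
    """pages maps URL -> page text. Return groups of URLs with identical text."""
    groups = defaultdict(list)
    for url, text in pages.items():
        # Ignore differences in letter case and whitespace only.
        normalized = " ".join(text.lower().split())
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        groups[digest].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]


# Placeholder pages for illustration.
pages = {
    "https://yourwebsite.com/shoes": "Red running shoes, size 9",
    "https://yourwebsite.com/shoes?ref=home": "Red  running shoes, size 9",
    "https://yourwebsite.com/bags": "Leather bags",
}
print(find_duplicates(pages))
```

From each group, you’d keep the original URL in the sitemap and leave the others out.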
Use Multiple Sitemaps
XML sitemaps have limits: a single file can’t list more than 50,000 URLs or be larger than 50 MB uncompressed.
If your sitemap goes over these limits, you’ll have to split it into multiple sitemaps, which you can list together in a sitemap index file.
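The sitemap protocol lets you list several sitemap files in a sitemap index file. Here’s a minimal Python sketch of splitting a long URL list into chunks of 50,000 and building such an index. The function names and the `sitemap-1.xml` naming pattern are assumptions for illustration, and the sketch only enforces the URL count, not the 50 MB size limit.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_SITEMAP = 50_000


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Split a list of URLs into lists of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def build_sitemap_index(sitemap_urls):
    """Return sitemap index XML that points at each individual sitemap file."""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for sitemap_url in sitemap_urls:
        entry = ET.SubElement(index, "sitemap")
        ET.SubElement(entry, "loc").text = sitemap_url
    body = ET.tostring(index, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder site with 120,000 pages.
urls = [f"https://yourwebsite.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
# Hypothetical file naming: one sitemap file per chunk.
sitemap_files = [
    f"https://yourwebsite.com/sitemap-{i}.xml" for i in range(1, len(chunks) + 1)
]
print(len(chunks))
print(build_sitemap_index(sitemap_files))
```

Each chunk would then be written out as its own sitemap file, and the index is what you’d point search engines to.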
Ensure Your Sitemap Is Error-Free
Sitemaps are an important part of your website. They help search engines and users find your pages.
So, don’t overlook them.
Use Semrush’s Site Audit tool to detect issues with your sitemap.
Get started by signing up for an account today.