A sitemap is an XML document that a website can provide to inform search engines like Google about all the pages available on the website and how often these pages should be re-scanned for updates.
In this article, we will learn how easy it is to generate a sitemap.xml for your static site built with Eleventy.
Eleventy is a simple static site generator written in Node.js. For the sake of this article, we will assume that you are somewhat familiar with Eleventy. If not, make sure to check out Eleventy's official website and try out the quick start tutorial.
If you are curious to find out more about sitemaps, make sure to check out the official sitemap spec.
Get items in a collection
Eleventy allows us to organise our content using collections. If we add tags to our content (articles, blog posts, pages, etc.), Eleventy will create special data entries that we can use in our templates to render pages that contain all the items under a specific tag.
For instance, if we want to build a cooking website, we could tag different recipes with tags like pasta, cookies, dessert, soup, etc. Yeah... I like pasta a lot, if you don't believe me, check out my Instagram! 🍝
To do this, we simply need to add tags in the frontmatter data of our templates like in the following example:
---
title: Fresh pasta dish pesto Genovese and Dublin bay prawns
tags: [pasta, pesto, prawns]
---
# A fresh pasta dish with pesto Genovese and Dublin bay prawns
Talk about the recipe for this yummy pasta here...
At this point, this new recipe will be available under the collections pasta, pesto and prawns.
But how can we access all the items in any of these collections? It turns out it's quite easy: any template can access the collection using collections.<tag_name>.
For instance, we can create a new template to list all our beautiful pasta dishes (pasta-dishes.md), and then we can use the special data entry collections.pasta in it:
---
permalink: /pasta-dishes/
title: All my pasta recipes
---
# All my pasta recipes
{% for recipe in collections.pasta %}
- [{{ recipe.data.title }}]({{ recipe.url }})
{% endfor %}
This code will render a page containing the list of all our pasta dishes. Note how we are looping over collections.pasta to retrieve all the items in the collection. Also, note that Eleventy allows us to use templating constructs like for loops and variables even with plain markdown files!
Similarly, we could use other collections like collections.pesto and collections.prawns and build dedicated pages around them. Of course, nothing is stopping us from using multiple collections on the same page if we want to.
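To make the relationship between tags and collections more concrete, here is a minimal sketch in plain JavaScript. It is only a simplified mental model and not Eleventy's actual implementation: the pages array and the groupByTag function are hypothetical names made up for this example.

```javascript
'use strict'

// Hypothetical sample data, shaped roughly like Eleventy collection items
const pages = [
  { url: '/pesto-prawns/', data: { title: 'Fresh pasta with pesto and prawns', tags: ['pasta', 'pesto', 'prawns'] } },
  { url: '/carbonara/', data: { title: 'Carbonara', tags: ['pasta'] } },
  { url: '/about/', data: { title: 'About' } } // no tags: belongs to no tag collection
]

// Group items by tag: every tag becomes a key holding the items that carry it
function groupByTag (items) {
  const collections = {}
  for (const item of items) {
    for (const tag of item.data.tags || []) {
      if (!collections[tag]) {
        collections[tag] = []
      }
      collections[tag].push(item)
    }
  }
  return collections
}

const collections = groupByTag(pages)
console.log(collections.pasta.map((item) => item.data.title))
```

An item with several tags ends up in several collections, which is why the pesto and prawns recipe is reachable from collections.pasta, collections.pesto and collections.prawns alike.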
The special "all" collection
Eleventy automatically generates a special collection called all which will contain every single page in your static website. As you can imagine, this is particularly convenient if you want to generate pages that can be used to list all other pages.
The TL;DR here is that you can loop through all the pages with something like this:
{% for item in collections.all %}
{# do something with the item #}
{% endfor %}
This is exactly what we need to generate our sitemap.xml!
A date formatter filter for Nunjucks
Nunjucks is possibly my favourite JavaScript templating language and somewhat a default with Eleventy (even though Eleventy allows you to use many different templating languages and even to mix and match them as needed!).
I like to use Nunjucks here, but we will shortly bump into a little issue: Nunjucks does not expose a standard way to format a Date object, so we need to implement this feature ourselves before we can create our sitemap. But don't worry: it's going to be just a few extra lines of code if we rely on an external library such as date-fns.
So, let's start by installing date-fns in our project:
npm i --save-dev date-fns
Now we need to add this bit of configuration to our .eleventy.js configuration file (if you don't have one, you can simply create one with the following content):
'use strict'
// use the named export, which is available in date-fns v2 and in later major versions
const { format } = require('date-fns')
module.exports = function (config) {
  // add `date` filter
  config.addFilter('date', function (date, dateFormat) {
    return format(date, dateFormat)
  })
  // ... the rest of your config here ...
}
Now in our templates we can convert a JavaScript Date instance into a string representation using a date format of our choice, for instance:
{{ aDateInstance | date("yyyy-MM-dd") }}
This will print something like 2020-09-16.
If you want to find out how to write more complicated format strings, check out the official documentation for the format() function in date-fns.
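If you would rather not add a dependency only for the sitemap, the single format we need here (yyyy-MM-dd) can also be produced by hand. The following is a minimal sketch of that alternative, not what the rest of this article uses: formatIsoDate is a hypothetical helper name, and unlike the date-fns format() function it supports only this one format.

```javascript
'use strict'

// Hypothetical dependency-free helper: formats a Date as yyyy-MM-dd.
// It reads the local-time fields of the date, as date-fns format() does.
function formatIsoDate (date) {
  const year = String(date.getFullYear()).padStart(4, '0')
  const month = String(date.getMonth() + 1).padStart(2, '0') // getMonth() is zero-based
  const day = String(date.getDate()).padStart(2, '0')
  return `${year}-${month}-${day}`
}

// Months are zero-based in the Date constructor, so 8 means September
console.log(formatIsoDate(new Date(2020, 8, 16))) // prints 2020-09-16
```

A function like this could be registered with config.addFilter in the same way as the date-fns based filter, but then the filter would ignore any format string passed from the template.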
The sitemap.xml template
We are finally ready to write the template for our website's sitemap.xml.
A sitemap is an XML document that looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>http://www.example.com/page1</loc>
    <lastmod>2005-01-01</lastmod>
    <changefreq>daily</changefreq>
    <priority>0.9</priority>
  </url>
  <!-- ... more url nodes ... -->
</urlset>
As we can see, we have a very specific structure. There is a urlset node that contains a bunch of url nodes. Every url node represents a page on the website, and for every page the following parameters are specified:
- loc: an absolute URL pointing to the given page
- lastmod: a date (in yyyy-MM-dd format) indicating when the page was last updated
- changefreq: a string indicating how often this page is generally updated (it can have the following values: always, hourly, daily, weekly, monthly, yearly, never)
- priority: a float value indicating the priority of this page in comparison with other pages on your site. Valid values range from 0.0 to 1.0.
The only mandatory parameter is loc; all the others are optional.
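To see how these four parameters fit together, here is a small sketch that assembles the same XML structure from plain JavaScript objects. It is only an illustration of the format: buildSitemap, escapeXml and the sample entries are hypothetical names and data, and the Eleventy template we write in this article does not need any of this code.

```javascript
'use strict'

// Entity-escape the characters that cannot appear as-is in XML text
function escapeXml (value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;')
}

// Build a sitemap document from plain objects. Only `loc` is required.
function buildSitemap (entries) {
  const urlNodes = entries.map((entry) => {
    const children = [`    <loc>${escapeXml(entry.loc)}</loc>`]
    if (entry.lastmod) {
      children.push(`    <lastmod>${entry.lastmod}</lastmod>`)
    }
    if (entry.changefreq) {
      children.push(`    <changefreq>${entry.changefreq}</changefreq>`)
    }
    // 0 is a valid priority, so compare against undefined instead of truthiness
    if (entry.priority !== undefined) {
      children.push(`    <priority>${entry.priority.toFixed(1)}</priority>`)
    }
    return `  <url>\n${children.join('\n')}\n  </url>`
  })

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urlNodes,
    '</urlset>'
  ].join('\n')
}

// Hypothetical sample entries: the second one only has the mandatory `loc`
const sitemap = buildSitemap([
  { loc: 'https://www.example.com/', lastmod: '2020-09-16', changefreq: 'monthly', priority: 0.8 },
  { loc: 'https://www.example.com/search?q=pasta&page=2' }
])
console.log(sitemap)
```

Note the escaping step: a URL containing characters such as & has to be entity-escaped before it goes into a loc node, otherwise the resulting XML is not well-formed.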
Now that we understand what a sitemap.xml looks like, we are ready to create a Nunjucks template (sitemap.njk) that can generate this file for our website:
---
permalink: /sitemap.xml
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
{% for item in collections.all %}
  <url>
    <loc>{{ site.url }}{{ item.url }}</loc>
    <lastmod>{{ item.date | date("yyyy-MM-dd") }}</lastmod>
    <changefreq>{{ item.data.sitemapChangefreq | default("monthly") }}</changefreq>
    <priority>{{ item.data.sitemapPriority | default(0.8) }}</priority>
  </url>
{% endfor %}
</urlset>
That's it! Our template is just looping through collections.all and, for every page, it is creating a dedicated url node. Note that every page can specify a custom sitemapChangefreq or sitemapPriority in its frontmatter, which will be used here to populate the changefreq and priority nodes respectively. If no value is found, arbitrary default values are used.
One small but important detail is that, since the loc node needs to contain an absolute URL, we are using a site.url variable here to retrieve the base URL of the website.
This value is not provided by Eleventy, and therefore we need to add it to our site's data. One way we could do that is by creating a file called _data/site.js with the following content:
'use strict'
module.exports = function () {
  return {
    url: 'https://www.example.com'
  }
}
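One thing to keep in mind is that the template concatenates site.url and item.url as plain strings, so the base URL should not end with a slash (item.url already starts with one). The sketch below shows the difference, together with a more forgiving way to combine the two values using the standard URL class. The absoluteUrl helper is a hypothetical name used only for this example.

```javascript
'use strict'

// Plain concatenation, which is what {{ site.url }}{{ item.url }} does
const withTrailingSlash = 'https://www.example.com/' + '/pasta-dishes/'
console.log(withTrailingSlash) // prints https://www.example.com//pasta-dishes/

// Hypothetical helper: resolve a root-relative path against the base URL.
// Because the path starts with "/", any path component of the base is replaced.
function absoluteUrl (baseUrl, path) {
  return new URL(path, baseUrl).href
}

console.log(absoluteUrl('https://www.example.com', '/pasta-dishes/')) // prints https://www.example.com/pasta-dishes/
console.log(absoluteUrl('https://www.example.com/', '/pasta-dishes/')) // prints https://www.example.com/pasta-dishes/
```

With the value used in _data/site.js above (no trailing slash), plain concatenation works fine, so a helper like this is only worth considering if the base URL comes from somewhere you don't control.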
Try to build your website and you should now see a shiny new sitemap.xml being generated. Success! 💪
Remove undesired pages from the sitemap
If you have checked out the content of sitemap.xml, you might have realised that the sitemap itself is listed in the sitemap... 🤔
... yes, this is probably something we don't want to do!
Similarly, we might have pages we don't want to list in the sitemap, for example, our 404.html, if we have one.
So, how do we remove these pages?
It is actually quite easy. Eleventy provides a special template attribute called eleventyExcludeFromCollections that can be specified in the frontmatter to remove that page from every collection (including the all collection).
So we can simply edit the frontmatter of our sitemap.njk file and add the following attribute:
---
eleventyExcludeFromCollections: true
permalink: /sitemap.xml
---
...
That's it! If we rebuild the website, we should see that the sitemap is not referenced anymore in the sitemap code!
Make sure you add this attribute in any other page you don't want to list in the sitemap.
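Conceptually, the effect of this attribute on the all collection can be pictured as a simple filter. The sketch below is only a mental model written in plain JavaScript, not Eleventy's actual implementation: the pages array and the visibleInCollections function are hypothetical.

```javascript
'use strict'

// Hypothetical pages, shaped roughly like Eleventy items
const pages = [
  { url: '/', data: {} },
  { url: '/pasta-dishes/', data: {} },
  { url: '/sitemap.xml', data: { eleventyExcludeFromCollections: true } },
  { url: '/404.html', data: { eleventyExcludeFromCollections: true } }
]

// Keep only the pages that did not opt out of collections
function visibleInCollections (items) {
  return items.filter((item) => !item.data.eleventyExcludeFromCollections)
}

const all = visibleInCollections(pages)
console.log(all.map((item) => item.url)) // prints [ '/', '/pasta-dishes/' ]
```

Since our sitemap template loops over collections.all, any page that opts out this way simply never reaches the loop.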
Bonus: Reference sitemap.xml in robots.txt
To make sure a search engine actually finds our sitemap, there is one last thing we should do: create a robots.txt file that references our sitemap.xml.
A robots.txt file is a file that most web crawlers will try to fetch from the root of your website (for instance www.example.com/robots.txt) to find some pieces of "advice" on how to "crawl" through the website.
A robots.txt file generally contains suggestions about pages that should not be indexed and can reference a sitemap for the pages that need to be indexed. In our specific case it could look like this:
User-agent: *
Allow: /
Sitemap: https://www.example.com/sitemap.xml
If you are curious to learn more about robots.txt, you can check out the official Robots Exclusion Protocol spec (still a draft).
How can we generate a robots.txt for our website? Well, there are several ways, but we can definitely re-use the Nunjucks template technique that we just used to generate the sitemap.xml file.
Let's create a robots.njk template in the root of our project:
---
eleventyExcludeFromCollections: true
permalink: /robots.txt
---
User-agent: *
Allow: /
Sitemap: {{ site.url }}/sitemap.xml
And yes, that's it!
Try to build the website again and you should see a robots.txt and a sitemap.xml being nicely generated in the build folder! 🚀
It's a wrap 🌯
In this article, we learned how to generate a sitemap for a static website built with Eleventy and in the process, we explored some interesting topics about Eleventy such as collections, templates, custom template filters and custom data files. If you want to continue on this path, you can also check out how to add a generator meta tag to your Eleventy website.
If you found this article interesting consider following me here, on Twitter and check out my personal website/blog for more articles.
Also, if you like Node.js consider checking out my book Node.js Design Patterns.
Thank you! 👋
Top comments (2)
Very interesting article. So easy to implement. But I have a problem: my sitemap and robots are in folders called "sitemap" and "robots" and the content is in a file called index.html. My eleventy config file has this :
I liked working with html files for ease of use. I didn't know the power of nunjucks.
How can I solve this?
solved: the fences in the front matter, there should be 3
Thanks for reporting this. Apparently there's a rendering problem here on dev. In my code snippet there are actually 3 dashes but then it renders to 6... I'll try to see if I can fix this :/
EDIT: editing and saving the article seems to have fixed this issue :O