First things first: What is a sitemap?
Sitemaps are XML files containing structured data about the pages of a website. Each page has an entry similar to this one:
<url>
    <loc>https://startupnamecheck.com</loc>
    <lastmod>2020-03-06T20:31:03+00:00</lastmod>
    <priority>0.9</priority>
    <changefreq>monthly</changefreq>
</url>
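To show how such entries fit into a complete file, here is a minimal sketch that wraps them in the `urlset` root element of the sitemap protocol, using PHP's built-in XMLWriter. The function name `buildSitemap` and the shape of the `$pages` array are assumptions made for this illustration; this is not the package's code.

```php
<?php
// Hypothetical helper for illustration only, not part of any package.
// Renders a sitemap document for a list of pages.
function buildSitemap(array $pages): string
{
    $xml = new XMLWriter();
    $xml->openMemory();
    $xml->setIndent(true);
    $xml->startDocument('1.0', 'UTF-8');

    // Root element with the namespace defined by the sitemap protocol.
    $xml->startElement('urlset');
    $xml->writeAttribute('xmlns', 'http://www.sitemaps.org/schemas/sitemap/0.9');

    foreach ($pages as $page) {
        $xml->startElement('url');
        // XMLWriter escapes special characters such as & in URLs.
        $xml->writeElement('loc', $page['loc']);
        $xml->writeElement('lastmod', $page['lastmod']);
        $xml->writeElement('changefreq', $page['changefreq']);
        $xml->writeElement('priority', $page['priority']);
        $xml->endElement(); // </url>
    }

    $xml->endElement(); // </urlset>
    $xml->endDocument();

    return $xml->outputMemory();
}

echo buildSitemap([
    [
        'loc'        => 'https://startupnamecheck.com',
        'lastmod'    => '2020-03-06T20:31:03+00:00',
        'changefreq' => 'monthly',
        'priority'   => '0.9',
    ],
]);
```

All values are passed as strings so that they are written exactly as given.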
What are sitemaps good for?
Sitemaps help search engines discover all relevant pages and content on a website. While there are also sitemaps for images, the focus here is on web pages only.
How can I generate a sitemap?
A sitemap can be created in various ways. If you are using a framework such as Laravel, you can generate it on the fly or whenever you publish or update your content.
After some experiments and checking several solutions on GitHub, I hadn't found what I was looking for:
- A simple, permanent crawler of the actual website.
- It considers noindex robots tags as well as canonicals and, of course, the article:modified_time tag.
- It ignores JavaScript, as Google mostly does. This allows it to run much faster than launching a headless browser only to access a pure HTML5/CSS3 page.
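To make these requirements more concrete, here is a rough sketch of how the three signals could be read from a page's static HTML without executing any JavaScript. It uses PHP's DOM extension rather than regular expressions, and the function name `extractPageSignals` and the returned array keys are assumptions for this example, not the package's actual implementation.

```php
<?php
// Hypothetical helper for illustration; the real package may work differently.
// Reads noindex, canonical and article:modified_time from raw HTML.
function extractPageSignals(string $html): array
{
    $signals = ['noindex' => false, 'canonical' => null, 'lastmod' => null];
    if (trim($html) === '') {
        return $signals; // nothing to parse
    }

    $doc = new DOMDocument();
    // Collect parser warnings (e.g. for HTML5 tags) instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // <meta name="robots" content="noindex, follow">
    $robots = $xpath->evaluate('string(//meta[@name="robots"]/@content)');
    // <link rel="canonical" href="https://example.com/page">
    $canonical = $xpath->evaluate('string(//link[@rel="canonical"]/@href)');
    // <meta property="article:modified_time" content="2020-03-06T20:31:03+00:00">
    $modified = $xpath->evaluate('string(//meta[@property="article:modified_time"]/@content)');

    $signals['noindex'] = stripos($robots, 'noindex') !== false;
    $signals['canonical'] = $canonical !== '' ? $canonical : null;
    $signals['lastmod'] = $modified !== '' ? $modified : null;

    return $signals;
}
```

A crawler could then skip pages where `noindex` is true or where `canonical` points to a different URL, and use `lastmod` for the `<lastmod>` entry.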
My solution for sitemaps
As mentioned, I didn't find what I had in mind. So, being a developer at heart, I opted to build my own solution. It relies heavily on PHP Spider, a crawler package for PHP. Beyond that, the package uses some regular expressions to identify the most interesting parts of the website. Other values, such as priority, are guessed from the depth of the page within the website (nesting level). More details can be found in the GitHub repo for Laravel-Sitemaps.
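As an illustration of guessing a priority from the nesting level, the sketch below counts the path segments of a URL and lowers the priority with each level. The function name `guessPriority`, the starting value of 1.0, the step of 0.2 and the floor of 0.1 are all made up for this example; the package's real numbers may differ.

```php
<?php
// Hypothetical mapping for illustration: the step size and bounds are
// assumptions, not the values used by the package.
function guessPriority(string $url): string
{
    // Extract the path; URLs without a path are treated as the root.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // Count non-empty path segments, e.g. "/blog/my-post" has depth 2.
    $segments = array_filter(explode('/', $path), fn ($s) => $s !== '');
    $depth = count($segments);

    // Start at 1.0 for the root and subtract 0.2 per level, never below 0.1.
    $priority = max(0.1, 1.0 - 0.2 * $depth);

    // Format with one decimal place, as used in sitemap files.
    return number_format($priority, 1);
}

echo guessPriority('https://example.com/blog/my-post'), PHP_EOL;
```

With this mapping, deeper pages receive a lower priority than pages close to the homepage.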
How can I get this?
The package is distributed via Composer and can be installed using:
composer require bringyourownideas/laravel-sitemap
This will automatically configure the required Laravel ServiceProvider. If you opted out of package discovery, you can install it manually using:
php artisan vendor:publish --provider="BringYourOwnIdeas\LaravelSitemap\SitemapServiceProvider"
How to use the package
The package registers an artisan command called generate:sitemap. This triggers a crawl of your site and writes out the sitemap. For convenience, you can add this to your deployment steps.
Regular updates of the sitemap
If you'd like to update the sitemap.xml regularly, you can add a new line to the schedule function in app/Console/Kernel.php:
/**
 * Define the application's command schedule.
 *
 * @param  \Illuminate\Console\Scheduling\Schedule  $schedule
 * @return void
 */
protected function schedule(Schedule $schedule)
{
    // Regenerate the sitemap once a day (at midnight by default)...
    $schedule->command('generate:sitemap')->daily();

    // ...or at a defined time (use this line instead of the one above):
    // $schedule->command('generate:sitemap')->dailyAt('02:50');
}
Summary & Feedback
If you run into problems, please raise an issue on GitHub. To stay updated, please subscribe to my newsletter (below). More information can also be found in the BYOI article about the Laravel Sitemap Generator.