Sitemaps XML Guide: Best Tricks, Tips, and Tools

Every second of every day, search engine crawlers collect information from a huge number of websites around the world. These programs run autonomously and are designed to scan website pages efficiently. The information they collect is passed to the index, where pages can be ranked for specific search queries. “But where does the sitemap come in?” you ask. A sitemap helps make the scanning of web pages and site content as accurate as possible.

Today, creating and regularly updating a sitemap XML is standard practice for most websites and is among the essential stages of technical optimization.

What Is an XML Sitemap?

An XML sitemap is an XML file that contains a list of all available site URLs, each of which has specific metadata. The most important are:

– Address and type of web page

– Date and time of the last web page update

– Refresh frequency

– The priority of the web page in the overall structure of the site

– Availability of language versions

Note that using the sitemap protocol does not guarantee that web pages will be indexed by search engines. It is just an additional hint that lets crawlers perform a more thorough site scan. However, in most cases, a sitemap helps optimize the crawl process and prevent possible errors.

The Purpose of Sitemap XML

Using sitemap XML enables search engine crawlers to be more efficient in the following ways:

  • Prioritize web pages according to specified parameters.
  • Determine how often the site’s pages are refreshed.
  • Determine which of the site’s pages are new and require mandatory indexing.
  • Determine if there were any changes to pages that have been indexed before.
  • Scan media content on a website (images and videos).
  • Scan the news feed of the site.
  • Index all necessary pages of a website.

Now, we can conclude that the main purpose of sitemap XML is to help search engine robots scan the various content on the site pages as correctly as possible.

Sitemap XML Formats

According to the Search Console reference information, Google supports the following sitemap formats:

  • XML
  • RSS, mRSS and Atom 1.0
  • text file
  • Google Sites

In this article, we will only be taking a look at the XML format, so we will not dwell on the other formats. For the XML format, there are three mandatory requirements that must be followed when creating an XML sitemap:

  1. A single file cannot contain more than 50,000 URLs.
  2. The file size should not exceed 50 MB.
  3. Only UTF-8 encoding is allowed.

As you can see, the main restrictions relate to size, which is why a sitemap can be split into multiple files. In this case, you create a so-called sitemap index file and list all of the individual sitemap files in it [1]. Sitemap files can also be compressed with GZIP to save bandwidth, but note that the 50 MB limit applies to the uncompressed file.
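The size limits above can be sketched in a few lines of Python. This is a minimal illustration, not a complete generator: the helper names `chunk_urls` and `within_limits` and the example.com URLs are hypothetical, and the size check is applied to the uncompressed bytes, following the sitemaps.org protocol wording.

```python
import gzip

MAX_URLS_PER_FILE = 50_000                  # per-file URL limit quoted above
MAX_UNCOMPRESSED_BYTES = 50 * 1024 * 1024   # 50 MB, measured before compression


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Split a list of URLs into consecutive chunks of at most `size` items."""
    return [urls[i:i + size] for i in range(0, len(urls), size)]


def within_limits(xml_bytes, url_count):
    """Check one sitemap file against both limits (size is the uncompressed size)."""
    return url_count <= MAX_URLS_PER_FILE and len(xml_bytes) <= MAX_UNCOMPRESSED_BYTES


# Hypothetical site with 120,000 pages: it needs three sitemap files.
urls = [f"https://www.example.com/page-{n}" for n in range(120_000)]
chunks = chunk_urls(urls)
print([len(chunk) for chunk in chunks])  # [50000, 50000, 20000]

# Compression is optional; it reduces transfer size, not the size that is checked.
payload = b"<urlset/>"
compressed = gzip.compress(payload)
```

Each chunk would then be written to its own file and listed in the sitemap index.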

Sitemap XML Classification

According to the Search Console reference information, Google supports the extended syntax in the XML file for the following data types:

  • images
  • video
  • news

An XML sitemap for images is a file with information regarding only the images of the website. With this, you can specify a list of images available on the website and most importantly specify information for each image. This may be the name of the image, its caption, location, etc.

An XML sitemap for video is a file with information regarding the video content of the site. In it, you can list the video content available on the website and describe it. The description can contain video duration, number of views, publication date, title, etc.

A sitemap XML for news is used to optimize your site for Google News.

The simplest and most commonly used sitemap is an XML sitemap for website pages, which actually describes the site structure. Let’s start with this.

How to Create a Sitemap For Google

Special XML tags are used to create an XML sitemap for site pages. They can either be required or optional.


XML sitemap example for a website page:
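As a minimal sketch, the snippet below embeds a sitemap with a single URL entry and parses it with Python's standard library to confirm that it is well-formed. In the sitemaps.org protocol, `urlset`, `url`, and `loc` are required, while `lastmod`, `changefreq`, and `priority` are optional. The domain and all values are placeholders.

```python
import xml.etree.ElementTree as ET

# Illustrative sitemap with one URL entry. The domain and all values are
# placeholders, not taken from a real site.
SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://www.example.com/</loc>          <!-- required -->
    <lastmod>2024-01-15</lastmod>                <!-- optional -->
    <changefreq>weekly</changefreq>              <!-- optional -->
    <priority>0.8</priority>                     <!-- optional -->
  </url>
</urlset>"""

NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Parsing confirms the document is well-formed and shows how to read it back.
root = ET.fromstring(SITEMAP_XML.encode("utf-8"))
for url in root.findall("sm:url", NS):
    loc = url.findtext("sm:loc", namespaces=NS)
    lastmod = url.findtext("sm:lastmod", namespaces=NS)
    print(loc, lastmod)
```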

In order to create a sitemap index file, you should also use special XML tags:
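A sitemap index uses the `sitemapindex`, `sitemap`, and `loc` tags, with `lastmod` as an optional addition. The sketch below builds one with Python's standard library; the function name `build_sitemap_index` and the file URLs are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Serialize the sitemap namespace as the default one (no prefix).
ET.register_namespace("", SITEMAP_NS)


def build_sitemap_index(entries):
    """Build a sitemap index from (sitemap_url, lastmod_or_None) pairs."""
    root = ET.Element(f"{{{SITEMAP_NS}}}sitemapindex")
    for loc, lastmod in entries:
        sitemap = ET.SubElement(root, f"{{{SITEMAP_NS}}}sitemap")
        ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(sitemap, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    return ET.tostring(root, encoding="unicode")


# Hypothetical child sitemap files.
index_xml = build_sitemap_index([
    ("https://www.example.com/sitemap-pages.xml", "2024-01-15"),
    ("https://www.example.com/sitemap-images.xml", None),
])
print(index_xml)
```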


The format of the child sitemap XML files is the same as the standard sitemap XML for site pages. Masking (entity escaping) replaces special characters with their entity codes. In XML files, escaping must be used for all data values, including URLs. The characters to be escaped are the ampersand (&amp;), single quote (&apos;), double quote (&quot;), greater-than sign (&gt;), and less-than sign (&lt;).
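In practice, escaping is rarely done by hand. As a minimal sketch, Python's `xml.sax.saxutils.escape` handles `&`, `<`, and `>` by default, and the two quote characters can be added through its `entities` argument. The helper name `escape_sitemap_value` and the URL are illustrative only.

```python
from xml.sax.saxutils import escape

# escape() covers &, < and > on its own; quotes are added explicitly.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_sitemap_value(value):
    """Entity-escape a value before placing it inside a sitemap tag."""
    return escape(value, QUOTE_ENTITIES)


raw_url = "https://www.example.com/catalog?item=12&desc=vacation_hawaii"
print(escape_sitemap_value(raw_url))
# https://www.example.com/catalog?item=12&amp;desc=vacation_hawaii
```

Libraries that build XML trees, such as `xml.etree.ElementTree`, apply this escaping automatically when serializing text.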

Sitemap XML For Images

By using the sitemap XML for images, you will increase the likelihood that images from the site will appear in Google search results. This is stated in the Search Console Help.

This is the greatest benefit of a sitemap XML for images, and it matters most to owners of sites for which ranking in image search is important. Read the related article – SEO Images. Additionally, using sitemap XML for images will help the search engine find content that is loaded using JavaScript.

The main tags for describing images are:

<image:image> – information about one image. Each URL (<loc> tag) can include up to 1000 <image:image> tags.

<image:loc> – URL of the image.

<image:caption> – caption to the image.

<image:geo_location> – location (country, city, etc.).

<image:license> – URL of the image license.

XML sitemap example for images:
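The sketch below shows what such a file might look like, embedded in a short Python script that parses it and groups image URLs by page. The image namespace URI is the one Google documents for image sitemaps; the pages, image paths, and caption are placeholders.

```python
import xml.etree.ElementTree as ET

# Illustrative image sitemap: one page that contains two images.
IMAGE_SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.example.com/gallery.html</loc>
    <image:image>
      <image:loc>https://www.example.com/images/photo-1.jpg</image:loc>
      <image:caption>Placeholder caption for the first photo</image:caption>
    </image:image>
    <image:image>
      <image:loc>https://www.example.com/images/photo-2.jpg</image:loc>
    </image:image>
  </url>
</urlset>"""

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "image": "http://www.google.com/schemas/sitemap-image/1.1",
}

root = ET.fromstring(IMAGE_SITEMAP_XML.encode("utf-8"))

# Map each page URL to the list of image URLs declared for it.
images_by_page = {
    url.findtext("sm:loc", namespaces=NS): [
        image.findtext("image:loc", namespaces=NS)
        for image in url.findall("image:image", NS)
    ]
    for url in root.findall("sm:url", NS)
}
print(images_by_page)
```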

Sitemap XML for Videos

Here are the most important recommendations from the list provided by the Search Console Help for creating an XML sitemap for video:

  • A full description of the video can increase the likelihood of it appearing in Google search results at higher positions, thanks to additional metadata.
  • Google will use the description of the video, which it considers the most informative. This can be a description in a sitemap, or it can be text on a web page.
  • Google Crawler will not crawl videos unless it finds them at the URL specified in the sitemap XML.
  • The Google Crawler must have access to the original video file or player.
  • Google Crawler can verify the accuracy of the specified information about your videos. If the information provided in the sitemap XML is not true, then the video may not be indexed.

Let’s take a look at the main tags (required and optional) for a video description. 

Required Tags:

<loc> – the web page where the video is located

<video:title> – video title (up to 100 characters)

<video:player_loc> – location of the player for the video (at least one of <video:player_loc> and <video:content_loc> must be given)

<video:content_loc> – location of the video file itself

<video:thumbnail_loc> – video preview (not less than 120×90 pixels)

<video:video> – a container for describing the video

<video:description> – description of the video (up to 2000 characters)

Optional Tags:

<video:duration> – the duration of the video

<video:category> – video category

<video:publication_date> – date of publication

<video:view_count> – the number of video views

The Google search engine supports the following video file formats:

  • *.mpg, *.mpeg;
  • *.mp4, *.m4v;
  • *.wmv;
  • *.asf;
  • *.avi;
  • *.ra, *.ram, *.rm;
  • *.mov;
  • *.flv.

An XML sitemap example for videos:
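As a hedged sketch, the script below embeds a video sitemap with one entry and checks that the required tags are present. The video namespace URI is the one Google documents for video sitemaps; the helper `missing_required_tags`, the URLs, and the text values are hypothetical. The duration is given in seconds.

```python
import xml.etree.ElementTree as ET

# Illustrative video sitemap with a single video entry.
VIDEO_SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:video="http://www.google.com/schemas/sitemap-video/1.1">
  <url>
    <loc>https://www.example.com/videos/grilling.html</loc>
    <video:video>
      <video:thumbnail_loc>https://www.example.com/thumbs/grilling.jpg</video:thumbnail_loc>
      <video:title>Grilling steaks for summer</video:title>
      <video:description>A short walkthrough of grilling steaks.</video:description>
      <video:content_loc>https://www.example.com/media/grilling.mp4</video:content_loc>
      <video:duration>600</video:duration>
      <video:publication_date>2024-01-15</video:publication_date>
    </video:video>
  </url>
</urlset>"""

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "video": "http://www.google.com/schemas/sitemap-video/1.1",
}
ALWAYS_REQUIRED = ("thumbnail_loc", "title", "description")


def missing_required_tags(video):
    """Return the required child tags that a <video:video> element lacks."""
    missing = [tag for tag in ALWAYS_REQUIRED if video.find(f"video:{tag}", NS) is None]
    has_content = video.find("video:content_loc", NS) is not None
    has_player = video.find("video:player_loc", NS) is not None
    if not (has_content or has_player):
        missing.append("content_loc or player_loc")
    return missing


root = ET.fromstring(VIDEO_SITEMAP_XML.encode("utf-8"))
for video in root.findall("sm:url/video:video", NS):
    print(video.findtext("video:title", namespaces=NS), missing_required_tags(video))
```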

It should be noted that if you post a video on your site using an iframe from the YouTube service, a separate XML sitemap is not required.

XML Sitemap for Google News

Again we refer to the official Google Search Console Help, which states that a separate sitemap XML file for news speeds up the discovery and indexing of news articles on the site. Therefore, a separate XML sitemap for news will be the best solution for news sites that want to publish their content on Google News.

Here’s a list of key guidelines from Google for news sitemap XML:

  • The file must contain the URLs of articles published in the last two days.
  • Articles published more than two days ago must be deleted from the file. At the same time, they will remain in the Google News index for a standard 30-day period.
  • The sitemap should be updated as articles are published on the site.
  • When publishing new articles, you just need to add their URLs to the existing file.
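The two-day rule above is easy to automate when the sitemap is generated dynamically. Below is a minimal sketch that assumes articles are available as dictionaries with a `published` timestamp; the function name `recent_articles` and the sample data are hypothetical.

```python
from datetime import datetime, timedelta, timezone

NEWS_WINDOW = timedelta(days=2)  # only articles from the last two days are listed


def recent_articles(articles, now):
    """Keep only the articles published within the news sitemap window."""
    cutoff = now - NEWS_WINDOW
    return [article for article in articles if article["published"] >= cutoff]


# Fixed "current time" so the example is reproducible.
now = datetime(2024, 3, 10, 12, 0, tzinfo=timezone.utc)
articles = [
    {"url": "https://www.example.com/news/a", "published": datetime(2024, 3, 10, 8, 0, tzinfo=timezone.utc)},
    {"url": "https://www.example.com/news/b", "published": datetime(2024, 3, 9, 9, 0, tzinfo=timezone.utc)},
    {"url": "https://www.example.com/news/c", "published": datetime(2024, 3, 5, 9, 0, tzinfo=timezone.utc)},
]

fresh = recent_articles(articles, now)
print([article["url"] for article in fresh])
# ['https://www.example.com/news/a', 'https://www.example.com/news/b']
```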

Tag list for an XML news sitemap:

  • <news:publication> is the general tag in which the publication is indicated. It has two required child tags.
  • <news:name> – the name of the publication
  • <news:language> – language in ISO 639 format
  • <news:publication_date> – publication date in W3C format with full date
  • <news:title> – the title of the article, similar to its name on the site
  • <news:genres> – article properties. Possible values:
    PressRelease – official press release
    Satire – an article that exposes the subject of discussion in a comic form.
    Blog – an article that is published on a blog or in a blog format.

Example sitemap XML for Google News:
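A minimal sketch of a news sitemap with one article follows, embedded in a Python script that parses it and reads the main fields. Note that the news tags are wrapped in a `<news:news>` container. The news namespace URI is the one Google documents for news sitemaps; the publication name, headline, and URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# Illustrative news sitemap with a single article.
NEWS_SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://www.example.com/news/article-1.html</loc>
    <news:news>
      <news:publication>
        <news:name>The Example Times</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2024-03-10</news:publication_date>
      <news:title>Example headline</news:title>
    </news:news>
  </url>
</urlset>"""

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "news": "http://www.google.com/schemas/sitemap-news/0.9",
}

root = ET.fromstring(NEWS_SITEMAP_XML.encode("utf-8"))
for item in root.findall("sm:url/news:news", NS):
    name = item.findtext("news:publication/news:name", namespaces=NS)
    language = item.findtext("news:publication/news:language", namespaces=NS)
    title = item.findtext("news:title", namespaces=NS)
    print(name, language, title)
```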

XML Sitemap for Multilingual sites

Many sites that target a wide audience offer multiple languages. For the Google crawler to understand that the site is multilingual and scan its pages in each language correctly, you need to use the attribute rel="alternate" hreflang="x".

In the XML file, create a separate URL element for each address, which in turn should include:

  • The loc tag that points to the URL
  • An xhtml:link element with rel="alternate" hreflang="x" for each alternate version of the page, always including the current version.

For example, suppose an English site also offers a German-language version. In this case, we must report the availability of the alternative pages in German. Such an example might look like this:
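The sketch below embeds a sitemap for one page in English and German and uses Python to verify that every URL entry lists all language versions, including itself. The helper `alternates` and the example.com paths are hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative multilingual sitemap: the same page in English and German.
HREFLANG_SITEMAP_XML = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.example.com/english/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/english/page.html"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://www.example.com/deutsch/page.html"/>
  </url>
  <url>
    <loc>https://www.example.com/deutsch/page.html</loc>
    <xhtml:link rel="alternate" hreflang="en" href="https://www.example.com/english/page.html"/>
    <xhtml:link rel="alternate" hreflang="de" href="https://www.example.com/deutsch/page.html"/>
  </url>
</urlset>"""

NS = {
    "sm": "http://www.sitemaps.org/schemas/sitemap/0.9",
    "xhtml": "http://www.w3.org/1999/xhtml",
}


def alternates(url_element):
    """Return {hreflang: href} for the alternate links of one <url> entry."""
    return {
        link.get("hreflang"): link.get("href")
        for link in url_element.findall("xhtml:link", NS)
    }


root = ET.fromstring(HREFLANG_SITEMAP_XML.encode("utf-8"))
for url in root.findall("sm:url", NS):
    loc = url.findtext("sm:loc", namespaces=NS)
    # Each entry should reference itself among its alternates.
    print(loc, loc in alternates(url).values())
```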

Having reviewed the various types of XML sitemap and familiarized yourself with the syntax for creating files, you can move on to tools that can generate a sitemap.

Tools to Create and Work With XML Sitemap

Here are several options for creating an XML sitemap:

  • manually, guided by the syntax rules discussed above
  • desktop programs for generating sitemap XML (crawler programs)
  • online XML sitemap generators
  • CMS tools or individual plugins.

Clearly, manually creating files that contain hundreds or thousands of lines is very laborious, if not practically impossible. Therefore, we will focus only on desktop programs, online generators, CMS tools, and their plug-ins.

Crawler-Programs for XML Sitemap Generation

Crawler programs are usually used for large sites with 1,000 pages or more. Most of them are paid software. They are used by experienced specialists who can correctly configure the necessary parameters and take full advantage of the product.

As part of this article, we will have a look at two programs that have proven themselves in creating XML sitemaps:

1. Screaming Frog

A software product from the British company Screaming Frog Ltd. that runs on Windows, macOS, and Linux. There is a free version that allows you to generate a sitemap XML of up to 500 URLs.

The program has a simple and intuitive interface and a fairly large set of settings for generating XML sitemap:

  • Page priority
  • Export to MS Excel
  • Automatic configuration of lastmod and changefreq
  • Formation of a map for images

2. PowerMapper Desktop

PowerMapper Desktop specializes in creating HTML and XML sitemaps. It works with Windows and Mac operating systems.

A distinctive feature of this program is the presentation of the site map in a convenient, visual form. It can be a tree structure, tabular, 3D, cloud, etc. 

Online XML Sitemap Generators

Today, online services for the automatic generation of sitemap XML are widely used because they are simple, convenient, and fast.

We will highlight 5 of the most popular online services for the automatic generation of sitemap XML. Each of them has its advantages and disadvantages, and the final choice of what to use is up to you. We will briefly review each service, emphasizing its capabilities and features.

1. XML-sitemaps

The service allows you to quickly generate sitemap XML by specifying only the site URL and the time of the last change of information on the site. The free version allows you to work with sites up to 500 pages.

If the site has more than five hundred pages, you can purchase PRO-sitemaps and get advanced features. At the moment, PRO-sitemaps have a “lifetime” licence.

2. My Sitemap Generator

This is also a convenient service for creating an XML sitemap. It works in a similar way to XML-sitemaps. If the site contains up to five hundred pages, you can generate a map for it absolutely free. Otherwise, you would need a STATIC PRO or DYNAMIC package.

The developers of this service claim that their tool allows creating additional tags in the process of generating sitemap XML. This can help search engines get more complete information on the site’s pages and update mode [4].

3. XML Sitemap Generator

This service allows you to create sitemap XML up to 2000 pages in free mode.

On the website of the service there is an opportunity to create a personal account for the user and save their own settings in it. It should be noted that registration is optional and you can generate sitemap XML immediately instead.

The “Advanced Settings” are a definite plus.

Before generating sitemap XML, you can exclude individual URLs and image formats and determine what to show in the title, H1, H2 tags, or specify priorities.

The service also provides a plug-in for CMS WordPress and a desktop version of its product for Windows. Read more about tags in our How to Build a Nice Meta Description, H1 Tag: How To Create a Great Header, How to Create Search Engine Friendly Title Tags materials.

4. Web-Site-Map

This is a simple and convenient service for generating sitemap XML up to 3,500 pages in size.

The developers of the service claim that in addition to creating a map, the XML sitemap generator supports national languages. There is a free report on broken links, a sitemap validator, and manual and automatic page prioritization. A more detailed list of functionality is presented on the developer’s website.

5. XML sitemap generator on the countwordsfree service

This is a fairly convenient tool that allows you to analyze sites up to 10,000 lines in size. The service has a clear and ergonomic interface. It is easy to work with, even for beginners.

There is a small list of settings that can be adjusted for specific tasks. 

All the tools that were reviewed do an excellent job of their main function: automatic generation of XML sitemap. Each of them has its own additional features. Which tool to choose for your work will depend on the specific task, the size of the site, and your personal preferences.

CMS Tools and XML Sitemap Plugins

If the site was created using WordPress, Joomla, Drupal or another CMS, you can use the tools that are already implemented inside the system along with special plug-ins for automatically generating sitemap XML.

WordPress Plugins for Creating XML Sitemap:

WordPress SEO by Yoast. This is the most famous SEO plugin for WordPress today. Its feature set also includes automatic sitemap XML generation. You can create a map for web pages or for media content. A large number of settings are available. Read more about the plugin in our article Yoast Plugin.

Google XML Sitemaps. One of the main advantages of this plugin is that it is free for everyone! It is a convenient and easy to use tool for creating a site map. It also has an extended set of settings.

All in One SEO Pack is a sufficiently powerful SEO-tool, which has functionality for creating a site map. Various settings, rules, and automatic notifications from Google about the addition of new content are available. More information about the plugin and its settings can be found in articles All in One SEO Pack and Setting Up SEO Plugins.

Premium SEO Pack automates almost all work with the XML sitemap. When information on a page is updated or new pages appear, the sitemap is updated automatically. This is quick and easy.

Rank Math is a WordPress SEO plugin developed by MyThemeShop. In addition to a large set of various functions, it has the ability to make a sitemap XML. You can also customize various types of entries and taxonomies. Read more about the plugin and its settings in our articles Rank Math SEO and Rank Math Plugin Installation Walkthrough.

With so many WordPress plugins, each with its own advantages, it is almost impossible to say which is the best. It all depends on your tasks and the sites you work with. A plugin is usually chosen for a wide range of SEO tasks, so sitemap generation is a useful addition to its main functions rather than its main advantage.

CMS Drupal: 

Simple XML sitemap module allows you to automatically generate a sitemap. The created map can be automatically sent from CMS to various search engines.

This module allows you to create a sitemap XML file for multilingual sites, images, Google News, and regular web pages.

It also allows flexible adjustment of generation parameters.

CMS Joomla:

The basic functionality of CMS Joomla does not include its own tools for creating an XML sitemap. However, there are several extensions that add the ability to create and customize XML sitemaps. We will look at a single extension: JL Sitemap. It has all the necessary functionality, is supported by the developer, and is regularly updated.

This is a software product of Russian developers from JoomLine Team.

It works with Joomla v. 3.9 and higher. It has the following necessary functions for working with sitemap XML:

  • Automatic generation of sitemap XML
  • Ability to manually configure parameters before starting generation
  • No restrictions on the number of processed pages
  • Scheduled XML sitemap generation
  • Support for multilingual XML sitemap

It is evident that both the Drupal and Joomla extensions have all the necessary capabilities for creating, configuring, and working with various types of sitemap XML files.

Adding XML Sitemap to the site

Follow these steps to add a sitemap XML to the site:

  1. Place the file in the root directory of the site: http://www.domain.com/sitemap.xml.
  2. If there are several files, create a sitemap index file that lists links to all of the XML files.
  3. Add the Sitemap directive to the robots.txt file:
    Sitemap: http://www.domain.com/sitemap.xml
  4. Add the XML file to the Google Search Console.
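To check that the robots.txt directive from step 3 is picked up, you can parse the file with Python's standard `urllib.robotparser` module (the `site_maps()` method is available in Python 3.8 and later). The robots.txt content below is a made-up example that reuses the placeholder domain from the steps above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content with a Sitemap directive.
ROBOTS_TXT = """User-agent: *
Disallow: /admin/

Sitemap: http://www.domain.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
print(parser.site_maps())  # ['http://www.domain.com/sitemap.xml']
```

Note that the directive line must not end with a period, since it would be read as part of the URL.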

Analyzing XML Sitemap in the Search Console

Once you have added the sitemap XML to the site and submitted it in the Google Search Console, you can analyze the sitemap.

Using the functionality of the Google Search Console, you can do the following:

  • Compare the number of pages submitted in the sitemap with the number indexed
  • See the errors that the system will detect during the analysis

Using this analysis, you can control the process of crawling and make changes if necessary.

Tips & Tricks

First of all, let’s determine when it’s necessary to make a site map:

  1. The site was created recently and does not yet have enough backlinks.
  2. The site is large and has a rather complicated or non-standard structure. This is typical of online stores, travel sites, news portals, major sports and entertainment resources, etc.
  3. The site contains a large amount of multimedia content (images, videos, etc.).
  4. The site is focused on news.

Now let’s see what you shouldn’t add to XML sitemap:

  • Duplicate pages
  • Secondary addresses (non-canonical)
  • Pagination pages
  • URLs that are based on session IDs and parameters
  • Results (dynamically generated web pages) of search and filtering
  • Archived pages
  • Redirects (3xx)
  • Non-existent pages (4xx)
  • Server errors (5xx)
  • Pages that you blocked in the robots.txt file
  • Pages with noindex
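Several of these exclusion rules can be applied automatically when the sitemap is generated from crawl data. The following is only a sketch under stated assumptions: the `Page` record, its fields, and the function `belongs_in_sitemap` are hypothetical and would need to be mapped to whatever your crawler or CMS actually provides.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical crawl record for one URL."""
    url: str
    status: int                      # HTTP status code returned for the URL
    canonical: str                   # canonical URL declared by the page
    noindex: bool = False            # page carries a noindex directive
    blocked_by_robots: bool = False  # URL is disallowed in robots.txt


def belongs_in_sitemap(page):
    """Keep only indexable, canonical pages that return 200 OK."""
    return (
        page.status == 200                  # drops 3xx, 4xx and 5xx responses
        and page.canonical == page.url      # drops duplicates and non-canonical URLs
        and not page.noindex
        and not page.blocked_by_robots
    )


pages = [
    Page("https://www.example.com/", 200, "https://www.example.com/"),
    Page("https://www.example.com/old", 301, "https://www.example.com/"),
    Page("https://www.example.com/?sessionid=1", 200, "https://www.example.com/"),
    Page("https://www.example.com/private", 200, "https://www.example.com/private", noindex=True),
    Page("https://www.example.com/missing", 404, "https://www.example.com/missing"),
]

sitemap_urls = [page.url for page in pages if belongs_in_sitemap(page)]
print(sitemap_urls)  # ['https://www.example.com/']
```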

General guidelines for sitemap XML:

  • Only use the XML sitemap for images, videos, and news in cases where it is justified by the specifics of the site.
  • Generate dynamic site maps.
  • Connect sitemap XML to the Google Search Console.
  • Try to correct errors that will be detected in reports.
  • Use simple and clear file names.
  • If you use multiple XML sitemaps, use a clear and uncomplicated structure.

We tried to collect the most significant recommendations when working with an XML sitemap. When and how to use the sitemaps in each particular case is ultimately up to you.

Summing up, we can say that a correct XML sitemap is another step on the ladder called Technical SEO, which leads to the top of the Google search page. Using an XML sitemap, you can speed up the indexing of new or updated pages, focus on the most important (promoted) pages, and provide search engine crawlers with comprehensive information about the site’s content.

Of course, the use of the XML sitemap is not a prerequisite and does not guarantee a lightning effect for indexing and promoting a resource to the top of search engines. However, we still strongly recommend creating an XML sitemap for the reasons listed throughout this article.

Author

Marcus Onik

SeoQuake content manager
