Rel=Canonical Tag: Best SEO Practices for Canonical URLs in 2020
The rel="canonical" tag was introduced in 2009 as the result of a collaboration between Google, Microsoft, and Yahoo. Its goal is to help webmasters and search engines solve the problem of duplicate content.
Before that, websites, especially online stores, often ran into trouble when several URLs contained almost identical content (for example, a product that is available in several colors or sizes).
In such cases, search engine algorithms could not tell which URL belonged in the search results. This caused SEO problems, because the crawler selects only one URL out of all the duplicates.
When used correctly, this tag is extremely useful. The sections below cover the canonical tag and how to apply it.
What Is a Canonical Tag
The rel="canonical" attribute is a hint for search engines: it tells the crawler which URL is the source of content that appears on several pages. You need this hint if you do not want the algorithm to pick the source document on its own.
If there is duplicate content and no canonical tag, the search engine does not know which page is canonical and which one to show in the search results.
By including a canonical tag, you signal to the search engine that this is the URL you want to see in the search results. More often than not, the algorithm follows this hint.
What canonical looks like in source code:
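As a minimal sketch, the tag is a single `<link>` element placed in the page's `<head>`. The Python helper below builds one; the function name `canonical_tag` and the `website.io` address are illustrative assumptions, not something taken from a specific site.

```python
from html import escape


def canonical_tag(url: str) -> str:
    """Return the <link> element that declares `url` as the canonical address."""
    # escape() keeps quotes and ampersands in the URL from breaking the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}" />'


print(canonical_tag("https://website.io/page"))
# <link rel="canonical" href="https://website.io/page" />
```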
Although the canonical tag has existed for a long time, even in 2020 site owners sometimes face the problem of duplicated content because they do not know how to correctly use this tag.
The Canonical Tag in SEO
Duplicate content poses a challenge for Google’s search engine:
- It has to choose which page to index (Google will index only one).
- It has to decide whether to consolidate the link mass on one URL or split it across several versions.
- It has to decide which version to rank for search queries.
In SEO, the canonical tag is a hint for the search bot, not a directive. Without a canonical, Google makes the decisions listed above on its own. If you want a specific page to appear in the search results, do not leave that choice to Google.
Why doesn’t Google understand which page is canonical? This is a rather difficult task for the algorithm. Open the source code to see your site the way the crawler sees it: several nearly identical blocks of code, from which the algorithm has to pick the one most relevant to users’ queries.
You may think that your site has no duplicate content because you did not create similar pages or publish the same article twice. Keep in mind that search engines crawl URLs, not pages, so one page reachable at two addresses counts as two documents.
Take IKEA, for example. Here we have a page with king-size beds:
Here’s the URL: https://www.ikea.com/us/en/cat/full-queen-and-king-beds-16284/
If we use a filter to sort only the white beds, we get a different URL:
This means that it’s a unique URL for the search engine, while the content on the page stays the same.
Sorting and filtering URLs are a frequent cause of duplicate content.
There are two ways to solve this problem:
- Add rel="canonical" to the source code of every sorting page, pointing to the page that you want to see in the search results (in our case: https://www.ikea.com/us/en/cat/full-queen-and-king-beds-16284/).
- Close the sorting options in robots.txt with the Disallow directive so that the search engine does not crawl the sorting pages. A crawler cannot read the canonical tag on a page it is not allowed to fetch, so use this instead of the first option, not together with it.
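The first option can be sketched in Python under one assumption: the filter lives entirely in the query string, so the canonical target is the same URL with the query removed. The `?color=white` parameter below is a made-up placeholder, not IKEA's real filter syntax, and `canonical_for_listing` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_for_listing(url: str) -> str:
    """Drop the query string and fragment so every filtered view maps to the base listing."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical filtered URL; the query parameter is invented for illustration.
filtered = "https://www.ikea.com/us/en/cat/full-queen-and-king-beds-16284/?color=white"
print(canonical_for_listing(filtered))
# https://www.ikea.com/us/en/cat/full-queen-and-king-beds-16284/
```

The returned address is what every sorting page would put in the href of its canonical tag.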
This is just one example. Most of the time, finding duplicate content on a site is quite difficult.
Here is a short list of the most common causes:
- Sort and filter pages (website.io?q=search-term).
- Parameterized addresses for session identifiers (website.io?sessionid=2).
- Separate pages for printing (website.io/page and website.io/print/page).
- Site pages for different devices (website.io and m.website.io) or an AMP version (website.io/page and amp.website.io/page).
- Content with and without www (http://website.io and http://www.website.io). This kind of duplication is solved by configuring redirects (redirects are covered later in this article).
- http and https (http://website.io and https://website.io). In this case, you will also need to configure redirects.
- The same content with and without a slash (website.io/blog/ and website.io/blog).
- Index versions of the page (website.io/, website.io/index.htm, website.io/index.html, website.io/index.php, website.io/default.htm, etc.).
In these situations, the proper use of the canonical tag will get rid of unnecessary problems.
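To find such variants in a list of crawled URLs, one option is to reduce every address to a normalized key and group the URLs that share it. The sketch below is only an illustration: preferring https, no www, and no trailing slash is an arbitrary choice, query strings are ignored, and the names `normalize` and `group_duplicates` are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# File names that usually serve the same content as the directory itself.
INDEX_FILES = {"index.htm", "index.html", "index.php", "default.htm"}


def normalize(url: str) -> str:
    """Collapse scheme, www, trailing-slash, and index-file variants into one key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parts.path.split("/") if s]
    if segments and segments[-1].lower() in INDEX_FILES:
        segments.pop()
    return f"https://{host}/" + "/".join(segments)


def group_duplicates(urls):
    """Return {normalized key: [original URLs]} for keys with more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}


crawled = [
    "http://website.io/blog/",
    "https://www.website.io/blog",
    "https://website.io/index.html",
    "https://website.io/about",
]
print(group_duplicates(crawled))
```

Each group it reports is a candidate for one canonical URL plus redirects or canonical tags on the remaining variants.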
It is also worth remembering that duplicate content forces Google to spend additional crawling resources on finding the duplicates and working out which URL is canonical.
Having spent time on duplicates, the crawler may not get to other important and new content, which will seriously harm your SEO. These crawl budget problems mainly affect large sites with hundreds or thousands of pages.
You can learn more about crawl budget in the official Google manual.
Useful Advice for Canonical Users
The canonical tag will likely be ignored or misinterpreted if canonicalization chains are inconsistent or mixed with 301 redirects. For example, it is a mistake to canonicalize page 1 to page 2 and then canonicalize page 2 back to page 1.
It is an equally big mistake to canonicalize page 1 to page 2 and then send page 2 back to page 1 with a 301 redirect. Avoid complex canonicalization chains and keep them as simple as possible, and you are unlikely to ever run into a mixed-signals problem.
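Both mistakes described above are loops, and a loop is easy to detect once the canonical and redirect targets of each page are collected into dictionaries. The following is a minimal sketch; the function name `resolve_canonical` and the two mappings are assumptions made for illustration, and in practice the data would come from a crawl.

```python
def resolve_canonical(start: str, canonicals: dict, redirects: dict) -> list:
    """Follow 301 and canonical hops from `start`; raise ValueError on a loop."""
    path = [start]
    current = start
    while True:
        # A redirect wins over a canonical, because the page body is never served.
        nxt = redirects.get(current) or canonicals.get(current)
        if nxt is None or nxt == current:  # no hop, or a self-referencing canonical
            return path
        if nxt in path:
            raise ValueError("loop: " + " -> ".join(path + [nxt]))
        path.append(nxt)
        current = nxt


# A clean setup: /a points to /b, and /b points to itself.
print(resolve_canonical("/a", {"/a": "/b", "/b": "/b"}, {}))

# Page 1 canonicalized to page 2, page 2 redirected back to page 1.
try:
    resolve_canonical("/page-1", {"/page-1": "/page-2"}, {"/page-2": "/page-1"})
except ValueError as error:
    print(error)
```

A returned path longer than two entries is a chain that is worth shortening even when it does not loop.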
- Cross-domain duplicates and canonical use.
When the same or similar articles (or any other content) are published on two sites, it makes sense to canonicalize one page to the other, which helps promote the canonical one.
As a result, Google will not rank the non-canonical page, so consider whether this option suits you.
- Canonical on the original page.
Consider a case with 3 pages of duplicate content, where the first page is the canonical one. In addition to placing the tag on the duplicates, it makes sense to place a canonical tag on the original page that points to the page itself.
- Multiple rel=canonical on one page.
If more than one rel=canonical tag is found on the page, Google will most likely ignore them, perceiving this as an error. This can happen through human error or because the tag is inserted at several points in the system (a CMS, plugins, themes, etc.).
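A quick way to catch this is to count the canonical links in the rendered HTML. Here is a small sketch that uses Python's standard `html.parser`; the class name `CanonicalCollector` and the sample markup are illustrative assumptions.

```python
from html.parser import HTMLParser


class CanonicalCollector(HTMLParser):
    """Collect the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values:
            self.hrefs.append(attributes.get("href"))


def canonical_hrefs(html: str) -> list:
    collector = CanonicalCollector()
    collector.feed(html)
    return collector.hrefs


# Sample page where a theme and a plugin each inserted their own tag.
page = """<html><head>
<link rel="canonical" href="https://website.io/page">
<link rel="canonical" href="https://website.io/page?ref=theme" />
</head><body></body></html>"""

found = canonical_hrefs(page)
if len(found) > 1:
    print("multiple canonical tags:", found)
```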
- Dynamic canonicals are worth double-checking.
An error in the code that generates canonicals may produce a different tag for each address variant of the same page. It’s worth checking your URLs, especially if you use a CMS platform.
- The HTTP header.
Here’s what it might look like:
Link: <https://website.io/instruction.pdf>; rel="canonical"
Sending the canonical in the HTTP response header is useful for PDF documents, which have no <head> to put the tag in. Google supports this method.
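When auditing, the same header can be read back from a response. The sketch below extracts the canonical target from a `Link` header value with a simplified regular expression; it assumes `rel` is the first parameter after the URL and is not a complete parser for the header syntax. The function name is hypothetical.

```python
import re

# Matches <URL>; rel="canonical" (quotes around the value are optional).
LINK_PATTERN = re.compile(r'<([^>]+)>\s*;\s*rel\s*=\s*"?canonical"?', re.IGNORECASE)


def canonical_from_link_header(value: str):
    """Return the canonical URL found in a Link header value, or None."""
    match = LINK_PATTERN.search(value)
    return match.group(1) if match else None


header = '<https://website.io/instruction.pdf>; rel="canonical"'
print(canonical_from_link_header(header))
# https://website.io/instruction.pdf
```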
- Self-referencing canonical.
On each page you promote, it is worth adding a rel=canonical that points to the page itself. According to John Mueller, self-referencing tags are not critical for search engines, but they make the choice of a canonical URL much easier.
- Using canonical with hreflang.
hreflang is used when your site exists in several languages; it indicates the language and region that each version targets.
Here are Google’s official recommendations on this subject:
Make sure that when using hreflang, the canonical tag of each language version refers to itself.
In this example, the English version of the site refers to itself:
Below is hreflang for other language versions of the site.
Here is the French version, which also refers to itself, and below hreflang to other geolocations:
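The pattern can be sketched as a small Python generator: each language version gets a canonical that points to itself, followed by hreflang links to every version, itself included. The `website.io/en/` and `website.io/fr/` addresses and the `head_tags` name are assumptions for illustration.

```python
from html import escape


def head_tags(current_lang: str, versions: dict) -> list:
    """Build the canonical and hreflang tags for one language version of a page."""
    # The canonical always points to the version being rendered, never to another language.
    tags = [f'<link rel="canonical" href="{escape(versions[current_lang])}" />']
    for lang, url in versions.items():
        tags.append(
            f'<link rel="alternate" hreflang="{escape(lang)}" href="{escape(url)}" />'
        )
    return tags


versions = {"en": "https://website.io/en/", "fr": "https://website.io/fr/"}
print("\n".join(head_tags("fr", versions)))
```

Calling it with "en" instead of "fr" changes only the canonical line; the hreflang block stays identical on every version.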
- Do not use canonical too aggressively.
Google tries to follow your hints. If you canonicalize one page to another that is not similar, Google may follow the hint, which in itself can be harmful. And if Google detects canonicalization between dissimilar pages, it will stop trusting the hints on your site, which can lead to unpredictable consequences.
- Using canonical and noindex.
Using canonical and noindex together contradicts the instructions from Google. A year ago, John Mueller, while answering user questions, stated that canonical takes priority over noindex for search engines, but I would not recommend relying on this.
Where and How Is rel="canonical" Written?
The Standard Method on the Site Page
The canonical tag is written in the <head> section of the source code of the page for which you are specifying a canonical link.
If the tag is placed outside the <head>, search engines will ignore it.
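Placement can be verified on the raw HTML. The sketch below, again built on the standard `html.parser`, sorts canonical links into those found inside `<head>` and those found elsewhere; the class name `HeadCanonicalChecker` and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class HeadCanonicalChecker(HTMLParser):
    """Separate canonical links inside <head> from those placed outside it."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_head_hrefs = []
        self.outside_head_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link":
            attributes = dict(attrs)
            if "canonical" in (attributes.get("rel") or "").lower().split():
                target = self.in_head_hrefs if self.in_head else self.outside_head_hrefs
                target.append(attributes.get("href"))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


# Sample page with the tag misplaced in <body>.
checker = HeadCanonicalChecker()
checker.feed(
    "<html><head><title>Page</title></head>"
    '<body><link rel="canonical" href="https://website.io/page"></body></html>'
)
print("misplaced:", checker.outside_head_hrefs)
```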
If you use a CMS, most platforms let you add a canonical without editing the site’s source code.
Canonical at Shopify
This platform allows a single product to exist on multiple URLs, which is a problem for SEO.
Shopify has two different URL structures for displaying products. The standard one is /products/product_name.
The second way to access the product page is through collections:
Unfortunately, rel="canonical" does not work perfectly on Shopify because of its dynamically generated URLs. The solution is to change the line in the theme code that builds the collection URLs.
Open the “collection-template.liquid” file and find the following line:
This line means that for any products found inside the collection, the name of this collection in the URL will be used. We need to make sure that no matter what collection the product is in, it has the same URL.
To do this, replace the line shown above with this one:
This change does not remove the collection URLs: the collection pages stay live and indexable, and the duplicates go away.
Canonical in WordPress
This platform makes it easy to add rel="canonical" to your pages. The exact method depends on the plugin you use; here we look at Yoast and RankMath.
In RankMath, open the “Advanced” tab on any page; the field for adding the tag appears in the window that opens. Read more about RankMath plugin settings.
In Yoast, go to the SEO settings on any page; in the window that appears, you can specify a canonical link. Read more about Yoast plugin settings.
Canonical and 301 Redirects
It is worth understanding the fundamental difference between using a 301 redirect and the canonical tag. Let’s consider this using a small two-page example. If you use a redirect from page 1 to page 2, users will never see page 1.
When using canonical from page 1 to page 2, users can see page 1, but search engines see the second one as canonical. Users will not go to the non-canonical page from the search but can find it on the site.
When is it worth it to use a 301 redirect?
- Several URLs lead to the same content and only one of them needs to stay available: choose that URL and set up redirects that send traffic to it.
- You are moving the site to a new domain and want it to go smoothly: use a 301 redirect to send traffic from the old addresses to the new URLs. The same applies if you are merging two sites.
- The site migrates from http to https (or switches between the www and non-www versions).
Common errors when using a 301 redirect:
- Using a 302 instead of a 301. A 301 redirect tells crawlers that the page has moved permanently, so this redirect passes on the link weight.
A 302 redirect says that the page has moved temporarily, so the search engine has to decide for itself whether to keep the old page or replace it with the one at the new location.
- Redirecting all addresses to the home page during a migration. Every URL should have its own redirect to the matching page; this is especially true for larger sites.
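The second error is easy to check before a migration goes live if the redirects are kept as a map from old URL to new URL. A minimal sketch follows, with hypothetical `old.io` and `new.io` domains and a hypothetical helper name.

```python
from urllib.parse import urlsplit


def homepage_redirects(redirect_map: dict) -> list:
    """Return old URLs, other than the old home page, that redirect to a home page."""
    flagged = []
    for old_url, new_url in redirect_map.items():
        old_is_home = urlsplit(old_url).path in ("", "/")
        new_is_home = urlsplit(new_url).path in ("", "/")
        if new_is_home and not old_is_home:
            flagged.append(old_url)
    return flagged


redirect_map = {
    "http://old.io/": "https://new.io/",
    "http://old.io/blog/post": "https://new.io/",
    "http://old.io/shop": "https://new.io/shop",
}
print(homepage_redirects(redirect_map))
# ['http://old.io/blog/post']
```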
If you are interested in this method, you can read more about redirects.
A Canonical Audit on the Website
There are several important points that you should pay attention to when checking your canonical tags:
- The tag is present on the desired pages.
- The tag points to the desired address.
- The canonical URL is indexed.
- The page the canonical refers to is not blocked in robots.txt (a common mistake).
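Three of these checks can be scripted once you have the canonical found on a page and the contents of robots.txt; whether a URL is indexed has to be verified in the search engine's own tools. The sketch below uses Python's standard `urllib.robotparser`. The function name `audit_canonical`, its parameters, and the sample data are assumptions.

```python
from urllib.robotparser import RobotFileParser


def audit_canonical(expected_url, found_canonical, robots_txt, user_agent="Googlebot"):
    """Return a list of problems for one page; an empty list means the checks passed."""
    problems = []
    if found_canonical is None:
        problems.append("canonical tag is missing")
    elif found_canonical != expected_url:
        problems.append(f"canonical points to {found_canonical}, expected {expected_url}")

    # Parse the robots.txt text directly, without fetching anything.
    robots = RobotFileParser()
    robots.parse(robots_txt.splitlines())
    if found_canonical and not robots.can_fetch(user_agent, found_canonical):
        problems.append("canonical URL is blocked in robots.txt")
    return problems


robots_txt = "User-agent: *\nDisallow: /blog/\n"
print(audit_canonical(
    "https://website.io/blog/post",
    "https://website.io/blog/post",
    robots_txt,
))
# ['canonical URL is blocked in robots.txt']
```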
Below are a few ways you can conduct an audit:
- Manually review the code.
This method is suitable if only a small number of pages need to be checked. All browsers let you view the source code; find the line with “rel=canonical” in it and check the address it points to.
To avoid getting lost in the source code, use the CTRL + F shortcut to find the necessary lines.
- Specialized tools.
Many services provide the ability to audit canonical tags (Semrush, etc.). I recommend using this option when there are more than 20 pages on your site that need to be checked since it would be time-consuming to do it manually.
Rel="canonical" Is a Must-Have
If it’s been a long time since you have done an audit of the canonical tag, or you have never done it at all, I highly recommend doing so, since even a small mistake can cause massive damage.
Overall, it is a good tool when used properly. Despite the apparent simplicity of this tag, there are many subtleties to consider so that you do not harm your own site. If you do everything correctly and regularly check your site, you will not have to worry about problems involving duplicate content.