URL Canonicalisation and Normalisation

Alan | 28th February 2009

I’ve been meaning to write about the new rel=canonical tag, which was proposed by Google, Yahoo and Microsoft on February 12. I managed to squeeze some thoughts on it into my presentation and workshop at SES London, and I’ll be speaking more about it at SES New York next month, but before blogging about it I really wanted to write more about URL Canonicalisation and Normalisation in general.

Canonicalisation or Canonicalization? Normalisation or Normalization?

I’m British, so I say Canonicalisation and Normalisation. Your mileage may vary.

What is URL Canonicalisation?

We’re talking about search engines here, so let’s try a definition that applies generally, but leans towards search:

URL Canonicalisation
involves taking a set of different URLs that all serve or lead to the same or similar content, and applying rules to select one URL from that set under which that content should be indexed or presented.

I’ve hyperlinked the terms I think are important to more detail below, but before we go into them let’s try defining URL Normalisation.

URL Normalisation
involves taking a single URL and applying a normalisation algorithm to produce a standard form for that URL.

Others define normalisation and canonicalisation as all part of the same thing, but I like to think of them as separate processes. To my way of thinking:

  • you can normalise a single URL but you can only canonicalise a set of URLs
  • an un-normalised URL will serve the same content as a normalised URL, because it’s the same URL
  • all indexed URLs are normalised; not all are canonicalised
  • normalisation occurs before canonicalisation

Now let’s go back and look at those hyperlinked terms in more detail.

Set of different URLs

This is the key to canonicalisation and why it’s needed: the same content is being presented at a number of different URLs. By different URLs, I mean those URLs are really different to each other – they could potentially show different content but (in this case) they don’t. Here is an example set of URLs:

All serve or lead to the same or similar content

If each of the above URLs served the same, or essentially the same, content, it’s likely that they would be canonicalised to fewer URLs – possibly only one. If they each served completely different content, then it’s much less likely that this canonicalisation would take place. By “or lead to”, I mean that the URL may redirect (e.g. with an HTTP 301 or HTTP 302 redirect) to another URL.
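To make “or lead to” concrete, here is a minimal Python sketch that follows a chain of redirects to the URL that finally serves the content. The redirect table, the example.com URLs and the `final_url` helper are all hypothetical; a real crawler would read the status codes and Location headers from live HTTP responses.

```python
# Hypothetical redirect table standing in for real HTTP responses:
# source URL -> (status code, target URL)
REDIRECTS = {
    "http://example.com/old": (302, "http://example.com/home"),
    "http://example.com/home": (301, "http://example.com/"),
}


def final_url(url, redirects, max_hops=5):
    """Follow redirects until a URL that does not redirect is reached."""
    seen = {url}
    hops = 0
    while url in redirects:
        if hops == max_hops:
            raise ValueError("too many redirects")
        _status, url = redirects[url]
        if url in seen:
            raise ValueError("redirect loop")
        seen.add(url)
        hops += 1
    return url
```

In this sketch, both `http://example.com/old` and `http://example.com/home` lead to `http://example.com/`, so all three would be candidates for the same canonical set.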

Canonicalisation Rules

The rules for canonicalisation vary from engine to engine and time to time. Here are a few examples of when canonicalisation will take place …

  • If www and non-www versions of the URL exist, then canonicalise
  • If the same base URL is seen with different numbers of query parameters, then canonicalise
  • If the filename component of the URL matches a known set of index pages (e.g. index.*, default.*, etc.) then canonicalise
  • If the home page (“/”) redirects to another page, then canonicalise
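As a rough illustration, the first three rules above can be sketched as a grouping step that collects URLs into candidate sets. This is only a sketch under stated assumptions: the `candidate_key` and `group_candidates` names, the index-page pattern and the example.com URLs are hypothetical, and no engine is claimed to work this way. The fourth rule (home page redirects) is left out because it needs the HTTP response.

```python
import re
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical list of index-page names; real engines keep their own.
INDEX_PAGE = re.compile(r"^(index|default)\.[a-z0-9]+$", re.IGNORECASE)


def candidate_key(url):
    """Return a key shared by URLs that may belong to one canonical set."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # www and non-www versions share a key
    directory, _, filename = parts.path.rpartition("/")
    if INDEX_PAGE.match(filename):
        filename = ""  # /index.html is treated like /
    # The query string is ignored, so differing parameters share a key.
    return host, f"{directory}/{filename}"


def group_candidates(urls):
    """Group URLs into candidate sets keyed by candidate_key."""
    groups = defaultdict(list)
    for url in urls:
        groups[candidate_key(url)].append(url)
    return dict(groups)
```

Sharing a key only makes URLs candidates for canonicalisation. Whether they are actually merged still depends on them serving the same or similar content.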

… and here are some examples of how canonicalisation will take place:

  • Choose the URL with the highest Pagerank (or similar link-based or other off-page criteria)
  • Obey rel=canonical webmaster hint
  • Choose the simplest URL (e.g. the shortest URL, or the one with fewest query parameters)
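Here is a small Python sketch of how selection criteria like these might be combined. The order of precedence (webmaster hint first, then link score, then simplicity) is my assumption for illustration, not a description of any engine, and `choose_canonical`, `link_scores` and `hinted` are hypothetical names.

```python
from urllib.parse import parse_qsl, urlsplit


def choose_canonical(urls, link_scores=None, hinted=None):
    """Pick one URL from a candidate set.

    link_scores: optional dict of URL -> link-based score (hypothetical).
    hinted: optional URL suggested by the webmaster, e.g. via rel=canonical.
    """
    if hinted in urls:
        return hinted  # obey the hint if it points into the set
    link_scores = link_scores or {}

    def rank(url):
        query = urlsplit(url).query
        n_params = len(parse_qsl(query, keep_blank_values=True))
        # Higher score first, then fewest parameters, then shortest URL.
        return (-link_scores.get(url, 0.0), n_params, len(url), url)

    return min(urls, key=rank)
```

With no scores and no hint, the sketch falls back to the simplest URL, so a set containing `http://example.com/?ref=1` and `http://example.com/` would resolve to the latter.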

Indexed or presented

Sometimes only one URL from a set will be indexed, which means that it will always be the candidate URL to be presented in a set of search results. At other times multiple URLs may be indexed, even though they are known to be part of the same canonical set. One of these URLs will be selected to appear in a given set of search results. The URL that is selected may vary (for example, by query or by searcher location) – but only one will ever appear on a given search results page.

Single URL

Normalisation operates on a single URL rather than on a set of URLs. That single URL may need to be supplemented with other data in order for normalisation to take place. For example, un-normalised URLs may be relative or absolute. A normalised URL will always be a fully-qualified absolute URL so, along with a relative URL, the URL of the containing page (or its base tag) will need to be known in order for normalisation to take place.
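Resolving a relative URL against its containing page can be sketched with `urljoin` from Python’s standard library. The `to_absolute` helper and its `base_href` parameter are hypothetical names; `base_href` stands in for a base URL declared in the page, which is my assumption about the extra data involved.

```python
from urllib.parse import urljoin


def to_absolute(link, page_url, base_href=None):
    """Resolve a link found on page_url to a fully-qualified absolute URL."""
    # A base URL declared in the page, if any, is itself resolved
    # against the page URL and then replaces it as the base.
    base = urljoin(page_url, base_href) if base_href else page_url
    return urljoin(base, link)
```

For example, `to_absolute("/contact.html", "http://www.silverdisc.co.uk/")` gives `http://www.silverdisc.co.uk/contact.html`.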

Normalisation algorithm to produce a standard form

Like canonicalisation rules, the normalisation algorithm may vary from engine to engine and time to time. However, it’s much less likely to vary. Here are examples of the kinds of things that are done during normalisation:

  1. convert a relative URL to an absolute URL
  2. convert the scheme and the host name components of the URL to lower case
  3. remove the port component if it matches the default port
  4. escape characters that should be represented as octets (or a +)
  5. unescape octets that are better represented as plain characters
  6. convert all escape sequences to upper case

Here are some examples of each operation:

  1. In http://www.silverdisc.co.uk/, a link to “/contact.html” would be normalised to http://www.silverdisc.co.uk/contact.html
  2. HTTP://WWW.SILVERDISC.CO.UK/contact.html would be normalised to http://www.silverdisc.co.uk/contact.html
  3. http://www.silverdisc.co.uk:80/contact.html would be normalised to http://www.silverdisc.co.uk/contact.html, because 80 is the default port for HTTP connections.
  4. http://www.silverdisc.co.uk/contact.html?name=Alan Perkins would be normalised to http://www.silverdisc.co.uk/contact.html?name=Alan+Perkins or http://www.silverdisc.co.uk/contact.html?name=Alan%20Perkins, because a space is not a valid character in a URL.
  5. http://www.silverdisc.co.uk/cont%61ct.html would be normalised to http://www.silverdisc.co.uk/contact.html, because %61 is better represented as the character “a” in a URL.
  6. A %2a in a URL would be converted to %2A for consistency
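Operations 2 to 6 can be sketched in a few lines of Python using the standard library. This is a simplified illustration, not any engine’s algorithm: it assumes the URL is already absolute, escapes only spaces (choosing %20 over +), unescapes only letters, digits and “-._~”, and drops any user-info part of the URL. The `normalise` name and its helpers are hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
# Characters assumed to be better represented as plain characters.
UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)
ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def _fix_escapes(text):
    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char  # operation 5: unescape
        return "%" + match.group(1).upper()  # operation 6: upper case

    text = ESCAPE.sub(repl, text)
    return text.replace(" ", "%20")  # operation 4 (spaces only)


def normalise(url):
    """Apply operations 2 to 6 to an absolute URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()  # operation 2
    host = (parts.hostname or "").lower()  # operation 2
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host  # operation 3: drop the default port
    else:
        netloc = f"{host}:{port}"
    return urlunsplit((
        scheme,
        netloc,
        _fix_escapes(parts.path),
        _fix_escapes(parts.query),
        parts.fragment,
    ))
```

Run against the examples above, `normalise("HTTP://WWW.SILVERDISC.CO.UK:80/cont%61ct.html")` returns `http://www.silverdisc.co.uk/contact.html`.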


That completes this introduction to URL canonicalisation and normalisation. In the next post, I’ll look at rel=canonical.
