4th March 2009
So, Google, Yahoo, Microsoft and, more recently, Ask have announced the new "canonical" link type or, more colloquially, the rel=canonical tag. Much has already been written about this tag and its purpose: to help prevent duplicate content issues. Probably the best summary is this Matt Cutts video
This tag is a welcome addition to the armoury in the fight against duplicate content issues. In addition to Matt's comments, I would make the following points:
Scrapers are forever copying content and publishing it on their own sites/splogs. Sometimes they are exceptionally lazy or stupid, even to the extent that they copy Adsense code onto their own sites. If they copy your rel=canonical tag onto their site, that would give a strong "hint" to the search engine that you were the original owner of the content:
Matt made reference to the Microsoft platform in his video, but I would emphasise the point. Microsoft's implementation of RFC 2396 is flawed. The path component of a URL is supposed to be case sensitive, but Microsoft makes it case insensitive. If there are n alphabetic characters in the path, then a Microsoft implementation gives 2n possible variations of that path, where there should be only one. For example, if n=1 and the path is "/a/". Microsoft would allow "/a/" and "/A/"; if n=2 and the path is "/ab/". Microsoft would allow "/ab/", "/aB", "/Ab" and "/AB/"; and so on. 2n variations gives vast potential for duplicate content and it is a big issue with sites built on the Microsoft platform. The rel=canonical tag makes it very easy to specify the correct, case-sensitive path on a Microsoft platform:
Static Web Content
Thousands of links can be created to a single, static URL, each with a different referrer query parameter attached. For sites built on static content, trying to manage such links has been difficult in the past. Now, it's relatively easy. Each page of static content simply needs to contain a rel=canonical tag:
For the reasons stated above, I would recommend the use of a rel=canonical tag in all static content. In fact, I would recommend its use in all content, static or dynamic - with appropriate care of course. It's a powerful tag and using it wrongly could have dire consequences. In the next post I'll look at some of the limitations of the rel=canonical tag and consider some alternatives.