How rel=canonical is Breaking Sites

 
Adam Audette

Adam Audette is the president of AudetteMedia, Inc. In addition to helping companies like Zappos implement cutting-edge SEO, Adam speaks regularly at search conferences. You can follow him on Twitter as @audette, and find out more about his company's services right here.


It’s been several months since the link canonical tag was announced, and it’s being used fairly liberally out in the wild. It’s patently clear to us that this tag is quite powerful and effective, and the consequences of its misuse very serious. It’s being misused a lot (not a surprise). We’re seeing a ton of sites with poor rel=canonical implementation. The end result: it’s causing havoc.

Yes, rel=canonical is breaking websites. I’ll share a few anecdotes in this post that show just how bad it can be.

Why Link Canonical is Dangerous

Part of the problem with rel=canonical is that it’s extremely easy to implement. Just drop a link element into the head of a page and you’re good. That very ease belies the power of the tag. Google told us at SMX East last October that 2 out of 3 times the link canonical target influences the organic decision. That’s right: 2/3 of the time your rel=canonical target is affecting the crawl and indexation of the page, and in turn, the ranking of the page.

You can see how easy it is to mess this up. Now, when clients tell me they’ve “already got plans for the link canonical tag” I get all weird and anxious on the phone… it scares the heck out of me. At least with hard redirects you can visually see the change occur. With the link canonical in place, you don’t see anything, unless you look at the source. It’s like nothing happened at all.
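Because the tag is only visible in the source, one way to keep an eye on it is to pull it out of the markup programmatically. The following is a minimal sketch in Python using only the standard library; the class name `CanonicalFinder`, the helper `find_canonicals`, and the sample page are illustrative assumptions, not part of any particular toolset.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        # rel may hold several space-separated tokens, in any letter case
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_tokens and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html):
    """Return a list of canonical targets declared in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical page source, for illustration only
sample = """<html><head>
<title>Blue widget</title>
<link rel="canonical" href="http://www.mydomain.com/widgets/blue">
</head><body>...</body></html>"""

print(find_canonicals(sample))
```

A page that returns more than one target, or a target you did not expect, is worth a closer look before the search engines act on it.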

You’re Doing it Wrong

At SES Chicago this month on the Duplicate Content panel, I shared the story of how a client of ours with about 100,000 SKUs on one of their sites had somehow put a link canonical target of the home page on every single page. Every product page, every category, every section, literally every page pointed back with rel=canonical to the root domain. We didn’t realize this had happened until 2 or 3 months after the fact, because we weren’t working on that particular site at the time. You can imagine what the traffic profile looked like.

Susan Esparza calls it like it is

Bruce Clay told me recently that “we’re recommending that the link canonical be implemented only with professional help.” Amen to that.

Other failures we’ve seen with implementations of link canonical tags:

  • The link canonical points to itself. This is fine when there are no other options during implementation, but not advised for sitewide usage because it can introduce unexpected behaviour. Be careful with this one.
  • A link canonical chain is created, with canonical targets pointing to multiple URLs, and back and forth, becoming a web of confusion. For example: the link canonical on http://www.mydomain.com points to http://www.mydomain.com/index.html, and that page points back to http://www.mydomain.com. Choose a canonical version when duplication exists. Stick with it - consistency is key.
  • Link canonicals on deep pages (such as product pages) point to the category or parent URL.
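Failures like these can be caught from crawl data before they do damage. Below is a rough Python sketch, assuming you already have a mapping of each page URL to the canonical target it declares; the names `resolve_canonical`, `target_counts`, and the sample URLs are hypothetical. It follows each target to see whether it settles in one hop, forms a chain, or loops, and it counts how many pages point at the same target, which is how a sitewide "everything points to the home page" mistake would show up.

```python
from collections import Counter


def resolve_canonical(url, canonical_of):
    """Follow canonical targets starting at url.

    Returns (final_url, hops, status), where status is "ok" (zero or
    one hop), "chain" (more than one hop) or "loop" (targets cycle).
    A URL missing from canonical_of is treated as its own canonical.
    """
    seen = [url]
    current = url
    while True:
        target = canonical_of.get(current, current)
        if target == current:
            hops = len(seen) - 1
            return current, hops, "ok" if hops <= 1 else "chain"
        if target in seen:
            return target, len(seen), "loop"
        seen.append(target)
        current = target


def target_counts(canonical_of):
    """Count pages naming each URL as canonical, ignoring self-references."""
    return Counter(t for u, t in canonical_of.items() if t != u)


# Hypothetical crawl results: page URL -> declared canonical target
pages = {
    "http://www.mydomain.com/": "http://www.mydomain.com/index.html",
    "http://www.mydomain.com/index.html": "http://www.mydomain.com/",
    "http://www.mydomain.com/a?sort=price": "http://www.mydomain.com/a",
    "http://www.mydomain.com/a": "http://www.mydomain.com/a",
}

for page in pages:
    print(page, resolve_canonical(page, pages))
print(target_counts(pages).most_common(3))
```

A check like this will not tell you whether a product page pointing at its category is intentional, so a single target with an unusually high count still needs a human to review it.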

Now that Google is supporting cross-domain usage of rel=canonical, which is fabulous news for advanced SEOs, I imagine it’s going to be even worse out there. Please people, be careful using this tag, and get professional help. Incorrect use of rel=canonical has serious implications.



1

Good post Adam. Why do you think they would canonical tag to the homepage? Error in automation or just plain simple misunderstanding of the tag?

2

A very wise warning Adam - I had a similar problem on a client’s site which I’ve been meaning to blog about for a while but haven’t gotten round to it! :)

We implemented canonicals on product pages across the site to pick up some duplicate content issues (pagination, sort by parameters, etc). But we failed to realise that we’d put the wrong URL format within the canonical tags so that they all pointed to non-existent URLs! o_O

Within a couple of weeks all the product URLs dropped out of Google and interestingly Webmaster Tools reported hundreds of 404 errors, treating the incorrect canonicals as broken links. We fixed the tags quickstyle and rankings came back, together with the additional boost we were originally expecting!

I honestly wasn’t expecting an error like this to have such drastic consequences - after all it’s supposed to be a “hint”, right? Well, it showed how powerful the canonical tag is and, as you point out, how it can get you in trouble.

Adam Audette says...
3

@digeratti I believe the error crept in because the link canonical tags were being dynamically generated. We only found out about the bug after traffic had fallen off to product pages.

@jaamit great learnings, thanks for sharing. It’s easy to make mistakes such as those, way too easy. Part of the problem is the link canonical tag doesn’t really “do” anything to the page - it still renders exactly the same way, same URL, etc… so you have to monitor for consequences to see what’s going on. It’s like a back door to trouble!

4

I’ve heard two very similar horror stories just in the last month. In both cases, it involved a CMS snafu where someone unintentionally canonicalized 1000s of URLs to the home-page. The effects on their index were quick and disastrous.

Fortunately, in at least one case, the recovery was pretty quick, post-disaster, but it’s yet another example of how any tool can be dangerous in the wrong hands. Major architectural changes shouldn’t be made because someone read one SEO blog post.

Adam Audette says...
5

@Dr Pete - thanks for commenting, totally agree. It sounds like this is a very common issue. Will be interesting to read other horror stories as they begin to surface.

6

I think this article is pretty useless. It’s like saying HTML is breaking websites because someone coded the site wrong. It’s not the canonical tag breaking websites, it’s dopey developers and SEO’s.

7

the problem is most CMS systems need them but they are not configured for them.. which means.. bad code.

Adam Audette says...
8

@Jared From Subway thanks for trolling by, rel=canonical offers unprecedented control of crawling and indexation for such trivial implementation. For SEOs in the know… that’s a big deal. As far as I know, the search engines haven’t released a supported HTML tag that can break websites.

9

Hi,

Your article intrigued me. But it also scared me.

Why? Cuz I’m a lame-O and I don’t know what “rel=canonical” is… and why would a link named this be dangerously exciting?

I was going to ask Mintz-T for an explanation, but I know he’s pretty fed up with answering my dumb*ss questions. So you’re it!

cheers!

Adam Audette says...
10

@S Douglas that’s easy, here:

Learn about the Link Canonical Element in 5 Minutes:
http://www.mattcutts.com/blog/canonical-link-tag/

11

Adam, I notice in your source code that you’re using the All In One SEO Pack plug-in (as do I, and I’m sure many of your readers), but not for rel=canonical. I tried ticking that check box in the control panel and found that it set every page’s canonical URL to be its actual URL. That makes sense, I guess, since while a given blog post may be found on its own page, the home page, and the archives for that month, the category to which it’s assigned, and whatever tags are on it, that doesn’t necessarily make any of those pages duplicates or even near-duplicates of each other.

However, if every page is the canonical version of itself, that means that no page serves as the canonical version of another page. So doesn’t that make the presence of the tag completely superfluous, like putting a meta robots tag of “index,follow” on a page?

I may be missing something, but I can’t find a reason to use that particular function of the plug-in. Is there a setting I haven’t noticed that will actually set one URL as the canonical version of a different URL?

Adam Audette says...
12

@Bob - yes, good point. I haven’t used the canonical option in All in One SEO. It sounds though like if permalink/post pages are pointing to themselves, there shouldn’t be a problem. As you say, it’s unnecessary and superfluous.

As a general statement, when rel=canonical is configured for larger sites it’s easier to have it ‘baked into’ each page by default. So we get lots of canonical targets pointing to themselves. This doesn’t worry me so much, by itself, but may introduce unexpected behaviour down the road.

If at all possible, I recommend only using rel=canonical when a specific requirement warrants it. Not always possible, but preferred and keeps more control in general.

Anyone else have thoughts to share on the canonical feature of All in One SEO Pack?

13

S Douglas,

Those of us who know you, know that you are no dumb*ss. I think it goes like this…
1) Google has a good idea to solve a problem
2) New solution has unintended outcomes
3) Be careful

;-)

You’re wanting more huh?

14

@Hi Mark

I’m sure that Mintzy will tell you I know jacksquat about SEO and programming, coding, designing, etc. I just look at stuff to see if it works, will it make things easier to make a profit, does it hurt, and is it legal. Then I hire somebody else to do the expertise once all of those criteria are first met.

Thanks for the “no dumb@ss” props, tho! lol

15

The caution on not having the link canonical point to itself creates a challenge for news sites in particular. A lot of publishers now use the tag to try to offset duplicate content issues caused by appending tracking codes to URLs (e.g. site.com/article-name?xid=rss-top-stories).

Tech teams generally say they cannot apply the tag to only the coded URLs; from their perspective the articles are in fact only rendering on one URL. (They also say they can’t use 301 redirects or it interferes with their tracking capabilities). So they place the rel=canonical tag on the “clean” canonical URL so that no matter how many different tracking codes may be appended to it (separately), the correct canonical URL is present in the tag. But that means nearly every piece of canonical content on a news site ends up having a rel=canonical pointing to its own URL.
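The approach described here, deriving one clean URL no matter which tracking code is appended, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `clean_canonical` is made up, `xid` is the only tracking parameter taken from the example above, and the `http://` scheme is added so the URL parses.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: "xid" is the only tracking parameter; extend as needed
TRACKING_PARAMS = {"xid"}


def clean_canonical(url, tracking=TRACKING_PARAMS):
    """Return url with tracking parameters and any fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in tracking
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


print(clean_canonical("http://site.com/article-name?xid=rss-top-stories"))
# http://site.com/article-name
```

Parameters that change the content of the page, such as a page number, are kept, so only the listed tracking codes collapse into the clean URL.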

16

I didn’t mean for that example URL to come through as an actual link. site.com/article-name?xid=rss-top-stories

17

The 100,000 SKU bad implementation made me laugh out loud.

At Fuel, we concluded pretty quickly after this came out that, since a lot of our sites are dynamic and have a site-wide header file, it wouldn’t easily solve any problems for us.

Thanks for the article!
:-)

Adam Audette says...
18

@Adam Sherk - thanks for the comment, it’s valuable hearing your experiences on this topic w/ the big publishers you guys work with.

Sorry about the URL - fixed it in your first comment.

Adam Audette says...
19

@Brian Carter - funny, yes, but so sad! Someone mentioned to me the other day that Stephan Spencer had found a case where Google was using rel=canonical wrong somewhere. Anyone know?

I shared at SMX Advanced last June how Google’s dupes of DMOZ on directory.google.com and http://www.google.com were causing issues in some apparel-related SERPs. A week or so after that presentation, the problem I showed (for “clothing”) was fixed. I don’t think it was fixed w/ rel=canonical, though, and at the time they still didn’t support cross-domain usage. (Not only did Google display dupes of its own directory, but also the exact same content on DMOZ.)

20

Hello Adam,

Re:

The link canonical points to itself. This is fine when there are no other options during implementation, but not advised for sitewide usage because it can introduce unexpected behaviour. Be careful with this one.

Google have said that should a page use a canonical tag to point to itself, this is fine and doesn’t do any harm - obviously your point contradicts this, but does it do so based on an actual occurrence (anecdotal or not) where this proved true? Or are your concerns more for the potential of the tag to cause problems in the future?

Thanks,
Jack

Adam Audette says...
21

@Jack - correct, a link canonical pointing to itself is technically fine, and officially supported. But it can introduce problems. We’ve seen situations where duplicate pages all have their link canonicals pointing to themselves, rather than to a single canonical version. But yes, the concern is really about the potential for this breaking things in unforeseen ways.

22

Adam — I had to respond to #8 up there

Not exactly a “search engine supported” tag, but one that breaks websites all the time?… …or should I say…breaks their Search Agent Compatibility?

Anyway, I couldn’t agree more that some things (like the canonical tag) should only be handled by a trained professional (of course, I also believe that WYSIWYG is the worst thing to ever happen to the internet) — but the (in)visibility (or lack thereof) of implementation (or an implementation fail) points to the odd circumstance that internet-delivered content is still evaluated by all but maybe a handful of SEOs (and people using screen readers) based on purely visual criteria.

Long story short - it’s too bad it’s forever amateur hour on the internet, and that every epic fail doesn’t automatically add p {text-decoration: blink;} to an offending website’s stylesheet…

23

The ease with which this tag can be implemented by amateurs has been among many concerns about rel=canonical from the beginning. I hope people take your advice to heart and realize that not using it at all is a far better option than using it improperly.

I always advise those who bring up the subject of the rel=canonical tag that if they don’t understand how to use 301 redirects properly, they shouldn’t even consider implementing the rel=canonical tag. It isn’t something to be taken lightly. It must be approached with a very specific strategy and a full understanding of why it’s being done, what it does and what using it will accomplish.

It’s not the “quick fix” solution for correcting the problems site owners have avoided fixing previously - either because it was too time consuming or troublesome to do it right in the first place. It’s putting a band-aid on a gunshot wound and I’m not surprised so many mistakes are being made implementing it.

By the way, I agree with Bob - the redundancy and superfluousness of including rel=canonical on all pages only to incorporate the actual page URL is apparent. I questioned the purpose of it when it was incorporated into the All-in-One-SEO and Platinum SEO plugins, and again when the standalone Canonical URLs plugin was introduced.

If any of them allowed you to specify the canonical URL for the page, rather than simply adding the tag with the actual URL of the page, I might be able to see a use for it. Without that functionality, there’s no constructive purpose that I can see to using the rel=canonical tags generated automatically by any of the plugins.

There’s my two cents…okay, more like 25 cents. ;)

24

I’m embarrassed to say I did the exact same thing. A coding error pointed the canonical to the home page for every single one of my 20,000 plus pages. Traffic from Google dropped by 80% from 10-12k visitors to 2-3k. :( :( :(

Now the canonical is pointing to itself. Traffic has not returned so far — this has been 2 months to date.

25

The same can be said about any tag/code that does not affect how the page is rendered. We have a client in the classified space that inadvertently migrated 1.5 million plus pages from staging to the live environment with a Meta Robots NOINDEX tag intact (they added it to the staging environment as an extra precaution). You ever wonder how long it takes to drop a half million pages from the index?

In the end it comes down to having proper procedures in place at the ground level. The developers and engineers need to be educated, supported and provided with the tools to ensure these simple, but catastrophic errors do not happen.

We put together a SEO course specifically for engineers by engineers that has helped tremendously.

Adam Audette says...
26

Good points @petryshen and a scary tale you tell about the noindex oversight. Education always helps.

27

@petryshen — how long ago was that and do you recall how long it took to recover?

(My guess is that within 2 weeks the pages started disappearing from the index and that they were gone within 7 more days)

28

Correct Adam. The error was spotted 3 weeks after implementation during some random tests. Thankfully, we were able to reverse the slide quickly after cleanup (we also made some minor content changes and resubmitted the XML sitemaps). Within a week we had gained back over 70% of the pages. Within two weeks, the pages and traffic returned to pre-error levels.

29

What is the best way to handle canonical links on a blog site that shows recent posts on the home page and duplicates that content on pages dedicated to individual blog posts? For http://blogstalk.com, I am using only canonical links on the individual post pages, but I don’t have any on the home page. I refrain from putting them there, because I have only seen one canonical link per page on the examples that I have encountered. Is it possible to put multiple canonical links to specify that the individual post pages are the ones that should be canonicalized?

30

I am also confused by this tag. The All in One SEO plugin in WordPress automatically creates a canonical URL for every post URL. I wonder if this creates some problem?

31

Hello,
I’ve got a big problem with the canonical link.

On a product page, I’ve got a “sort by” tool with a URL parameter, so I’ve put in a canonical link pointing to the URL without any ‘sortby’ parameter.

In GWT, I’ve got 17,000 duplicate title errors, listing the 2 pages: with and without the ‘sortby’ parameter!
Example :
http://www.meilleurmobile.com/.....tDrillDown
and
http://www.meilleurmobile.com/.....Simplicime

containing the tag

Have I done anything wrong , or is it a major GWT bug ?

Thanks for help !

- Arnaud
