
Understanding the SEO importance of Canonical Urls and Duplicate Content

Continuing my blog series on basic SEO as it pertains to the DNN Platform, this post is a primer on canonical URLs: what they are, why they matter, and how they affect your site.

Canonical? What does that even mean?

If you look it up in a general dictionary, the word ‘canonical’ has its origins in canon law: the meaning is ‘required by canon law’. What is that? Generally, it can be broken down to conformance with well-established rules or procedures.

Still confused? Here’s a simple one: canonical means the established authority on something.

When applied to URLs, a canonical URL is the established, correct, standard, original, required, single, only, most important URL.

Let’s go back further to the early days of the internet, when web servers were a simple way to serve up content and to interlink content between different network types. If the same document was served from the same website under two different names, it meant someone had made a mistake and uploaded the same thing twice with two different filenames. A visitor would be confused about which was the correct one. They would want the canonical document so they didn’t read the wrong one.

Taking a step sideways, we come across the term CNAME when setting up domain names. CNAME is an abbreviation of Canonical Name. If you create a CNAME record for ‘old.example.com’ that points to ‘new.example.com’, what you are saying is that ‘new.example.com’ is the Canonical Name for ‘old.example.com’.

Combine those two concepts and you get the canonical URL. It’s the accepted, standard URL for a particular piece of content on the internet. Unlike CNAME, however, it didn’t start out as an official ‘internet’ term like URL or DNS; it’s something that has appeared in the vernacular.

Why is the Canonical URL so important?

It was really Google that first made the desire for canonical URLs known to the webmasters of the world. That’s because it is very difficult for a search engine to determine which URL is correct if it comes across the same (or extremely similar) piece of content under two different URLs. Going back to our simple duplicated-file example: most humans would be able to work out which one is more current, perhaps because one filename contains ‘FINAL’ and the other ‘DRAFT’. But search engines can’t make judgement calls like that and still be correct all the time.

In a world of dynamic pages where websites can generate tens of thousands of different URLs, the task becomes that much harder.

When the same content is available from two (or more) different URLs, it is called duplicate content. Search engines like Google have decided they don’t like duplicate content, and pages with duplicate content don’t rank well. I don’t know the exact reasons; I can only speculate. If you asked me to speculate, I would tell you that duplicate content:

1. Splits the link value of any incoming links. If 5 people link to http://example.com/mypage and 5 people link to http://example.com/my-page, and it’s the same content, then all other things being equal each URL gets only half of the link value the page has earned.

2. Can be an indicator of generated and/or low-quality content. Lots of websites generate endless links, like blog archive pages, calendars and photo albums: things where you can have many pages and each individual page is not necessarily something that would be relevant to a search. These can be pages that are essentially duplicates of each other. A search engine doesn’t want the same thing clogging up a page of search results. Each result should be unique.

3. Could, if allowed to proliferate, be a vector for people trying to game the search results. Who wouldn’t want to fill up the front page of search results with 10 different versions of their content?

Remember, that’s my speculation – from experience – but speculation nonetheless.

The fact is that duplicate content on your site can hurt your rankings, and you need to pay attention to it.

How to ensure Duplicate Content is not affecting your site

There are two primary methods in the fight against duplicate content:

  • 301 Permanent Redirects: the server answers a request for the URL with a status code of 301 and the new location.
  • Canonical Link element: a small HTML tag included in the head of a web page, which lists the desired canonical URL for the page.

The choice of which to use depends on the circumstance. Sometimes you want to show largely the same content for two different URLs (for example, two different views of the same content, such as a sorted list), and sometimes you want to force visitors to only see the canonical URL.
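To make the two methods concrete, here is a minimal Python sketch of what each one produces. The function names are made up for illustration and are not part of DNN or any library; the only facts relied on are that a redirect carries a 301 status with a Location header, and that the canonical link is a link element with rel="canonical".

```python
from html import escape
from typing import Dict, Tuple


def permanent_redirect(canonical_url: str) -> Tuple[int, Dict[str, str]]:
    """Method 1: status code and headers for a 301 Permanent Redirect."""
    return 301, {"Location": canonical_url}


def canonical_link_element(canonical_url: str) -> str:
    """Method 2: the tag to place inside the page's <head>."""
    href = escape(canonical_url, quote=True)  # keep the attribute value well-formed
    return f'<link rel="canonical" href="{href}" />'


# A sorted view keeps its own URL but names the unsorted list as canonical.
tag = canonical_link_element("http://example.com/products")

# An outdated URL is not served at all; the visitor is sent to the canonical one.
status, headers = permanent_redirect("http://example.com/about-us")
```

With the redirect the visitor never sees the duplicate URL. With the link element the duplicate URL still works, and only search engines are told which URL is the preferred one.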

Canonical Urls in DNN

As DNN is a CMS with dynamically generated content, there is a higher risk of generating duplicate content where you serve up the same page with a variety of Urls.  Take these cases:

  • example.com/
  • www.example.com/default.aspx
  • example.com/home.aspx
  • www.example.com/home/tabid/56/default.aspx

These can all point to the same piece of content – the home page of a DNN site.   Compound these duplicates with variable URLs generated from a third-party content module, and you could have hundreds, or thousands of duplicate URLs in a DNN site.
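As a rough illustration of why these count as duplicates, the Python sketch below maps the four example URLs onto one canonical home page URL. The set of home page paths and the choice of http://example.com/ as the canonical form are assumptions made for the example; this is not how DNN itself resolves URLs.

```python
from typing import Optional
from urllib.parse import urlsplit

# Assumed for illustration: paths that all render the home page of this site.
HOME_PATHS = {"/", "/default.aspx", "/home.aspx", "/home/tabid/56/default.aspx"}
CANONICAL_HOME = "http://example.com/"


def canonical_for(url: str) -> Optional[str]:
    """Return the canonical URL if `url` is a known home page variant, else None."""
    if "://" not in url:
        url = "http://" + url  # the examples in the text omit the scheme
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # treat www.example.com and example.com as one site
    path = parts.path.lower() or "/"
    if host == "example.com" and path in HOME_PATHS:
        return CANONICAL_HOME
    return None


variants = [
    "example.com/",
    "www.example.com/default.aspx",
    "example.com/home.aspx",
    "www.example.com/home/tabid/56/default.aspx",
]
# Four different URL strings, one piece of content.
distinct_targets = {canonical_for(v) for v in variants}
```

A search engine only sees the four strings; it has no such table and has to guess that they are the same page.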

Of course, this is nothing new and DNN has had many features in it for a while to combat the duplicate content problem.  The most obvious issue is having your site listed with example.com and www.example.com.   This is easily caused by having the two different versions of the same domain pointed at your site – something that most people do.

In DNN, you can set the site alias (domain) to either be a primary domain, or a canonical domain.  Primary means that a 301 redirect will be issued.  Canonical means that the site will be shown with a valid requested alias, but a Canonical Link element will be generated for the page.  In most cases I recommend using ‘Primary’, but some people have a particular strategy for having multiple aliases to show the same content – for these the ‘Canonical’ option works better.
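The difference between the two alias settings can be sketched in Python as follows. This is only a model of the behaviour described above, not DNN code: the Response type, the function name and the mode strings are hypothetical, and what DNN emits when the main alias itself is requested is not covered here.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: int
    location: Optional[str] = None        # set when a redirect is issued
    canonical_link: Optional[str] = None  # set when the page is served with a link element


def respond_for_alias(requested_host: str, path: str, main_alias: str, mode: str) -> Response:
    """Model how a request on a secondary alias is answered in each mode."""
    if requested_host.lower() == main_alias.lower():
        return Response(200)  # already on the main alias (assumed: nothing extra is needed)
    target = f"http://{main_alias}{path}"
    if mode == "primary":
        # 'Primary': force visitors onto the main alias with a 301.
        return Response(301, location=target)
    if mode == "canonical":
        # 'Canonical': serve the page on the requested alias, but declare the main one.
        return Response(200, canonical_link=f'<link rel="canonical" href="{target}" />')
    raise ValueError(f"unknown alias mode: {mode}")


redirected = respond_for_alias("www.example.com", "/about-us", "example.com", "primary")
served = respond_for_alias("www.example.com", "/about-us", "example.com", "canonical")
```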

But there are other issues that DNN needs to handle as a CMS. There have been various formats of DNN URLs over the years, and the different versions of the Home page URL are all accessible from within a DNN site. A DNN site may have been through several DNN versions over the years, each with its own URL format. Essentially, the older a DNN site is, the more chance that it has had, or still has, some duplicate content problems.

New URL Canonicalization Changes and Improvements in DNN 7.1

Now that the DNN 7.1 CTP is out, people can see the new changes within the ‘Advanced’ Friendly Url Provider which specifically target and solve problems with duplicate content by using 301 redirects to the canonical URL for a DNN page.

Some of these are:

Home Page URL: the home page for 7.1 sites with Advanced URLs switched on shows as the ‘site root’ (http://example.com/); there is no longer any /home.aspx variation of the home page. All generated URLs reference the site root, and any request to /Home.aspx or /Home will redirect back to the site root.

Automatic Redirect of Old URLs: Old URLs in DNN look like /default.aspx?TabId=xx. Any request for the old style of querystring-based DNN page URLs will be redirected back to the canonical URL for the page, which DNN knows is the best, most friendly URL for the page. You can try it on a new CTP install: request http://example.com/default.aspx?tabId=56 or http://example.com/tabid/56/default.aspx and you’ll be redirected back to the canonical URL for that page: http://example.com/Getting-Started

Automatic Redirect to hyphenated URLs: A new feature of DNN 7.1 Advanced URLs is that words within the URL (derived from the page name) are separated with ‘-’. This gives clean separation and makes the URL more readable (and ultimately more searchable). If you request the old version of the page URL, without the hyphen, you’ll be redirected back to the canonical URL, which does have the hyphen. So http://example.com/aboutus will redirect to http://example.com/about-us automatically.
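The three redirect rules above can be sketched as a single lookup in Python. This is an illustration of the behaviour only, not the Advanced Friendly Url Provider's code: the page table, the tab id 57 for the about-us page and the case-insensitive matching are all assumptions made for the example.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical page table: tab id -> canonical friendly path.
PAGES = {56: "/Getting-Started", 57: "/about-us"}
HOME_VARIANTS = {"/home", "/home.aspx"}
TABID_IN_PATH = re.compile(r"/tabid/(\d+)/", re.IGNORECASE)


def _squash(path: str) -> str:
    """Lower-case a path and drop hyphens so /aboutus and /about-us compare equal."""
    return path.replace("-", "").lower()


UNHYPHENATED = {_squash(p): p for p in PAGES.values()}


def _tab_id(parts) -> Optional[int]:
    """Find a tab id in either the ?tabid=NN or the /tabid/NN/ form."""
    for key, values in parse_qs(parts.query).items():
        if key.lower() == "tabid" and values[0].isdigit():
            return int(values[0])
    match = TABID_IN_PATH.search(parts.path)
    return int(match.group(1)) if match else None


def redirect_target(url: str) -> Optional[str]:
    """Return the URL to 301 to, or None if `url` is already canonical or unknown."""
    parts = urlsplit(url)
    root = f"{parts.scheme}://{parts.netloc}"
    if parts.path.lower() in HOME_VARIANTS:  # rule 1: home page lives at the site root
        return root + "/"
    tab_id = _tab_id(parts)
    if tab_id in PAGES:                      # rule 2: old tab id style URLs
        return root + PAGES[tab_id]
    canonical = UNHYPHENATED.get(_squash(parts.path))
    if canonical is not None and canonical != parts.path:  # rule 3: missing hyphens
        return root + canonical
    return None
```

For example, redirect_target("http://example.com/aboutus") gives "http://example.com/about-us", while the already canonical "http://example.com/about-us" gives None, so a canonical URL is never redirected to itself.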

Of course, these are but a few of the URL improvements which are becoming available in DNN 7.1.   I will cover those more in the coming weeks – you should jump on, download the 7.1 CTP and see what other discoveries you can make.

Questions or comments? – feel free to quiz away in the comments below.

Comments

Robert Stordeur
Bruce,

I've been waiting for this kind of feature for so long to convert legacy sites. I had tested switching to human-friendly URLs and found out that it broke just about everything.

I had to go to our test install of DNN 7.1.0 to find the web.config entry for advanced, but once I added it, WOW! No matter how I executed the link, it worked great! I have some issues with a custom CSS menu that we used to replace SolPartMenu, but it was time to switch to DDR menu for all the sites.

By the way, the comment link at the top of blog says 3 comments, but none are listed here.
Robert Stordeur Saturday, July 27, 2013 1:06 PM (link)
Bruce Chapman
@Robert: glad it has all worked for you. One of the strengths of the new advanced mode URLs is the ability to efficiently migrate older sites with legacy URL structures toward the clean, modern style. This not only helps improve the appearance of your site, it should also improve the traffic you get from search engines as the signal/noise ratio for key words and phrases in your URLs improves.

Comments have been a little patchy since the new site was launched, but the problems are gradually all getting cleared up.
Bruce Chapman Sunday, July 28, 2013 7:21 PM (link)
Robert Stordeur
Exactly what the earlier human-friendly failed to do. Legacy sites would have been a pain to convert, but they needed the most SEO work. This makes it so much easier.

Also, we are addressing the CSS menu problems and found the same issue with a CSS-based slideshow, where it renders only its link and not its side menus (it can't find the images; I'm thinking it's a path problem). But at least this is a start to getting the sites' SEO up to where it needs to be.

Yes, the comments updated as soon as I posted the previous comment. Not sure what I think of the new site. I had problems getting into the new Gemini replacement as well. Lots of change that may upset a few folk.

Thanks again for the great work.
Robert Stordeur Sunday, July 28, 2013 11:26 PM (link)
Rodney Joyce
Hi Bruce,

Does this negate the need for your Canonical URL Module on the forge?
Rodney Joyce Thursday, August 15, 2013 8:23 AM (link)
Rodney Joyce
Did you see my question above, Bruce? I use URL Master so I generally have canonical URLs; however, some SEO tools report that I do not have the actual link in my source. Would you recommend installing your Canonical URL Module?
Rodney Joyce Monday, September 2, 2013 1:34 AM (link)
Bruce Chapman
@Rodney I thought I had answered it at some point. Essentially no, it doesn't negate the need for the canonical link module. That module is instead for consolidating pages with multiple URLs down into a single URL. Essentially a redirect and a canonical link solve the same problem, resolving multiple pages of content down to a single URL, but each does it in a different way, and you would use either for separate reasons.

You don't need a canonical link in a page - it's not a requirement so it's wrong for tools to report that as anything but informational, really.
Bruce Chapman Thursday, September 5, 2013 3:32 AM (link)
Marc Arbesman
I just tried this out in my environment. This solves a good portion of our current SEO problems. Glad to see URL re-writing is core now.

I have one question about 301 redirects for the default.aspx in the site root. I can see a 301 redirect in the headers when I try to go to http://mysite.com/home.aspx. But I still get a 200 OK response header when I go to http://mysite.com/default.aspx. Am I doing something wrong?


Thanks!
Marc Arbesman Friday, September 6, 2013 4:25 PM (link)
fernando toro
Hi Bruce,

After a month of using DNN community edition, I realized my home page has different ways to get to it (home, home.aspx, default.aspx) and content is being duplicated, so it's not good for SEO purposes. I found a solution, but it means buying urlMaster from ifinity and my budget is limited, so I can't afford it. Then I found this post.

My webmaster updated the DNN version to 7.1 hoping this problem could be solved, but the URL is still doing the same thing. Any help would be much appreciated.

Fernando
fernando toro Saturday, September 21, 2013 12:13 PM (link)
cathal connolly
@Fernando - please check your web.config and ensure that you are using urlFormat="advanced" and not urlFormat="humanfriendly" - these changes are all in the advanced URL rewriter.
cathal connolly Saturday, September 21, 2013 4:01 PM (link)
Bruce Chapman
@Fernando - @Cathal is correct. By default, an upgrade to 7.1 doesn't activate the advanced functionality; you need to do this as a second, manual step after the upgrade is complete.
Bruce Chapman Sunday, September 22, 2013 6:59 PM (link)
John Jacobson
Bruce,
Thanks for the article.
I have a new site for a client and the redirect from /home to / works beautifully.
However, my own site, which has been upgraded from V5 through V6 to V7, doesn't do this.
I'm guessing that I need to do something to make it work but I'm not sure what that is. Can't see anything in the Friendly URL settings.
John Jacobson Wednesday, December 18, 2013 4:51 PM (link)
John Jacobson
I found the answer. Needed to change this line in web.config so it had urlformat="advanced"


John Jacobson Wednesday, December 18, 2013 6:01 PM (link)
Bruce Chapman
@John, yes, that is the correct answer. Upgrading from an earlier version of DNN preserves the prior URL settings, and you have to manually change that to get the new behaviour.
Bruce Chapman Wednesday, December 18, 2013 7:33 PM (link)
