
Retrieving the list of 404 URLs from a DNN site

The 7.1 release of the DNN Platform introduced greatly improved 404 handling, which appears to have been well received by the DNN Community in general. I have already posted about the new 404 Page Not Found handling in DNN, but that post did not cover how to find out which URLs requested on your site result in 404 errors.

Any 7.1 site using Advanced URL mode captures 404 errors and logs them to the DNN Event Log. In the Event Viewer they appear under the log type ‘Http Error Code 404 Page Not Found’; in the database, the corresponding LogTypeKey value is ‘Page_not_found_404’.

The following is an example of what you might see in your Admin->Event Viewer page, after filtering the list for the 404 Errors:

TabId:

PortalAlias: dnndev.me/dnn710

OriginalUrl: /dnn710/deadpage

Referer: http://somepage.com/somepage

Url: http://dnndev.me/dnn710/deadpage

UserAgent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36

HostAddress: 192.168.0.106

HostName: 192.168.0.106

Server Name: dnndev

The event entry is largely self-explanatory: you have the Portal Alias that was identified for the 404, the Original Url as requested (the request URI, without the host or domain name), the Referer field (populated if the request came from a link on another page) and the full URL itself.

The User Agent field identifies the browser or bot that requested the URL. This is an important one because it will often reveal whether search bots or regular visitors are finding the 404 pages.

The ‘TabId’ value is empty because no matching page was found, so the TabId cannot be determined. For some 404 errors raised by custom module code, the TabId can be known even though the request still returns a 404. An example would be a blog module returning a 404 for a non-existent blog post: the page exists, but the associated blog content could not be found.

Extracting a list of URLs that returned a 404

A common requirement for analysing the 404 errors on a site is getting a complete list of the URLs that have resulted in a 404.    This information is contained within the DNN Event Log, but it’s not immediately clear how to extract this data.

With that in mind, I wrote this piece of SQL, which can be copied and pasted into the Host->SQL page of your DNN site to give you an easy-to-copy list of data in table format. You can, of course, also run it directly in a SQL query tool such as SQL Server Management Studio, though you will have to replace the ‘{objectQualifier}’ and ‘{databaseOwner}’ tokens with the relevant values for your site.

-- Pull the Url, Referer and UserAgent values out of the LogProperties XML for each 404 entry
SELECT LogPortalId, LogPortalName, LogCreateDate
, CONVERT(xml, LogProperties).query('data(/LogProperties/LogProperty/PropertyValue[../PropertyName="Url"])') AS Url
, CONVERT(xml, LogProperties).query('data(/LogProperties/LogProperty/PropertyValue[../PropertyName="Referer"])') AS Referer
, CONVERT(xml, LogProperties).query('data(/LogProperties/LogProperty/PropertyValue[../PropertyName="UserAgent"])') AS UserAgent
FROM {databaseOwner}{objectQualifier}EventLog
-- Only the 404 entries; every other Event Log type is ignored
WHERE LogTypeKey = 'Page_not_found_404'

The above SQL code shows a technique which is very useful for extracting information buried in the DNN Event Log. Because the properties of each entry are stored as XML in a single column (LogProperties), one table can hold a wide variety of Event Log types. The drawback is that the individual values are difficult to extract through a SQL query tool. The solution is to convert the column to the XML data type, and then use XPath expressions to retrieve the specific values required.
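As a variation on the same technique, the sketch below converts LogProperties to XML once per row and reads each property with the `value()` method, which returns a plain string rather than an XML fragment. It assumes the same LogProperties layout as the query above; the aliases `e` and `x`, the `Props` column name and the `nvarchar(2000)` sizes are my own choices rather than anything DNN prescribes.

```sql
-- Sketch only: same data as the original query, returned as strings.
SELECT e.LogPortalId, e.LogPortalName, e.LogCreateDate
     -- [1] takes the first matching PropertyValue, as value() requires a single item
     , x.Props.value('(/LogProperties/LogProperty[PropertyName="Url"]/PropertyValue)[1]', 'nvarchar(2000)') AS Url
     , x.Props.value('(/LogProperties/LogProperty[PropertyName="Referer"]/PropertyValue)[1]', 'nvarchar(2000)') AS Referer
     , x.Props.value('(/LogProperties/LogProperty[PropertyName="UserAgent"]/PropertyValue)[1]', 'nvarchar(2000)') AS UserAgent
FROM {databaseOwner}{objectQualifier}EventLog AS e
-- Convert the column once and reuse the result for every property
CROSS APPLY (SELECT CONVERT(xml, e.LogProperties) AS Props) AS x
WHERE e.LogTypeKey = 'Page_not_found_404'
```

String columns are usually easier to sort, group and paste into a spreadsheet than XML fragments, which is the main reason to prefer `value()` here.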

Of course, the only filter in this query’s ‘Where’ clause is the log type. You can easily add conditions to pull back the data for a specific portal, or for a specific date/time range. You can also filter on a specific ‘LogProperty’, for example to retrieve all the 404 errors found by the Googlebot User Agent. However, care should be taken when writing these queries, as the Event Log can be quite large, and database indexes do not exist on any of the fields I have mentioned. For large, active sites, it would be prudent to restore a backup of the live database to a local server for this type of analysis. By doing this, you avoid loading the live database with large queries and limit the potential for slow responses to site visitors.
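The filters described above might be sketched as follows. The portal id (0), the July 2013 date range and the ‘Googlebot’ search string are illustrative values only, and the aliases are hypothetical. The `exist()` method and the XQuery `contains()` function are standard SQL Server XML features, but note that `contains()` is case-sensitive.

```sql
-- Sketch only: 404s for one portal, in one month, requested by Googlebot.
SELECT e.LogCreateDate
     , x.Props.value('(/LogProperties/LogProperty[PropertyName="Url"]/PropertyValue)[1]', 'nvarchar(2000)') AS Url
FROM {databaseOwner}{objectQualifier}EventLog AS e
CROSS APPLY (SELECT CONVERT(xml, e.LogProperties) AS Props) AS x
WHERE e.LogTypeKey = 'Page_not_found_404'
  AND e.LogPortalId = 0                      -- illustrative portal id
  AND e.LogCreateDate >= '20130701'          -- illustrative date range
  AND e.LogCreateDate <  '20130801'
  -- keep the row only if the UserAgent property contains the text "Googlebot"
  AND x.Props.exist('/LogProperties/LogProperty[PropertyName="UserAgent"]/PropertyValue[contains(., "Googlebot")]') = 1
```

Because the XML filter has to be evaluated row by row, it is worth narrowing the rows with the ordinary column filters as far as possible.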

The results from the above query on a test site returned the following table, which is much more useful for aggregating large amounts of data than the native XML format of the EventLog table.

| Log Portal Id | Log Portal Name | Log Create Date | Url | Referer | User Agent |
| --- | --- | --- | --- | --- | --- |
| 0 | dnn710 | 7/22/2013 9:22:56 AM | http://dnndev.me/dnn710/deadpage | | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36 |
| 0 | dnn710 | 7/22/2013 9:26:35 AM | http://dnndev.me/dnn710/deadpage | http://somepage.com/somepage | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36 |
| 0 | dnn710 | 7/22/2013 9:26:12 AM | http://dnndev.me/dnn710/deadpage | | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1500.72 Safari/537.36 |
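If the goal is aggregation, the grouping can also be done in the query itself. The following is a minimal sketch, assuming the same LogProperties layout; the `Hits` and `LastSeen` column names and the aliases are hypothetical.

```sql
-- Sketch only: one row per 404 URL, most frequently requested first.
SELECT u.Url
     , COUNT(*) AS Hits                  -- number of logged 404s for this URL
     , MAX(e.LogCreateDate) AS LastSeen  -- most recent 404 for this URL
FROM {databaseOwner}{objectQualifier}EventLog AS e
-- Extract the Url property as a string so it can be used in GROUP BY
CROSS APPLY (
    SELECT CONVERT(xml, e.LogProperties).value(
        '(/LogProperties/LogProperty[PropertyName="Url"]/PropertyValue)[1]', 'nvarchar(2000)') AS Url
) AS u
WHERE e.LogTypeKey = 'Page_not_found_404'
GROUP BY u.Url
ORDER BY Hits DESC
```

A list ordered this way makes it easy to decide which dead URLs are worth redirecting first.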

Conclusion

I hope this is useful both for describing how to extract information relating to 404 errors and for showing the easiest way to drill down into Event Log data for any type of event. Let me know via the comments if you have any questions relating to the technique or the 404 Error logging.

Comments

Winston Haybittle, Wednesday, July 24, 2013 10:57 AM
Thanks, that's a great SEO tip!
