Products

Solutions

Resources

Partners

Community

About

New Community Website

Ordinarily, you'd be at the right spot, but we've recently launched a brand new community website... For the community, by the community.

Yay... Take Me to the Community!

The Community Blog is a personal opinion of community members and by no means the official standpoint of DNN Corp or DNN Platform. This is a place to express personal thoughts about DNNPlatform, the community and its ecosystem. Do you have useful information that you would like to share with the DNN Community in a featured article or blog? If so, please contact .

The use of the Community Blog is covered by our Community Blog Guidelines - please read before commenting or posting.


Analyzing an exploding EventLog

Murphy’s law has it that real issues in DotNetNuke (core. modules, skins) don’t manifest themselves until you’ve rolled out in production. We can test all we want, but the fact of the matter is that this has become a highly complex eco-system with many interacting bits and it is next to impossible to test for all scenarios. And I’m pretty sure the kind folks at the Corp’s helpdesk will agree with me on this as well. So let’s not kid ourselves: sh*t happens. It’s our ability to deal with it that separates the good from the bad (or experienced from the inexperienced), IMHO. It is also the reason we have something called dogfooding. This is the last phase of testing where you use your own product (dogfood) in your own production environment. This is after you’ve tested it on a staging server of course. It’s rare, but sometimes it does give rise to issues. After I rolled out DNN 6.1.1 on my own site and had began dogfooding my own DMX 6 my server ground to a halt. Lucky for me I did this over thanksgiving so it was fairly quiet on the interwebs. The first major stumbling block in analyzing what was happening was the fact that the eventlog had exploded. I mean, my 250 Mb odd SQL database had become 4 Gb and that was just one table: the EventLog table.

First aid

As the site ground to a halt I needed to do 2 things: get the site running again and analyze what was happening. But at this point the system is more or less frozen with SQL clipping at 99% CPU. OK, first I brought down the site to get the server back down to earth. Then I proceeded to remove all but the last 100000 records from the EventLog. I could possibly have deleted more, but this brought the DB to a reasonable size (something like 1 Gb). Backup the DB, zip this and bring down to my dev environment for analysis. Final change to the DB: truncate the EventLog table and shrink the DB again. Now to get the site working I got it back online and disabled the GENERAL_EXCEPTION in the EventLog (go to the settings for the eventlog and edit this row). Now the SQL is no longer working overtime. At this point we have to realize the error is still happening but the fact that it is no longer logged prevents SQL server going wild.

Analysis

For those of you who’ve not been here before, the Eventlog table has one major drawback: the actual details of the errors are in an NTEXT field as an XML blob. This prevents us from doing any SQL analysis on the data. For one I’d like to get an idea of which error is prevalent. This requires comparing the details between errors. We need to get to the contents of the XML. Given that it’s in NTEXT, SQL server will refuse to operate on it as it is stored as blobs and can’t be compared. Converting it to NVARCHAR(MAX) is impossible on a serious volume of data (I tried and the process just blocked). I also tried exporting to XML to open up in XML Spy for instance, but this lacks the analysis capabilities that I need. It was useful to run through the first few records to get a first idea, though. You can export to XML using a utility called “bcp”, or bulk copy. This is a tool that comes with SQL server, so you run it from a cmd window:

"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp" "select top 100 LogProperties from new_bring2mind_net.dbo.PortalCore_EventLog" queryout D:\Websites\local.bring2mind.net\eventlog.xml -U "test" -P "test" -c

Here “new_bring2mind_net” is the name of the database, “test” and “test” are username and password. Now all you need to do is to open the resulting file and stick a <foo> and </foo> around the contents to make it a valid XML file. Open in XML Spy (or similar tool) and you can walk through the log. But you’ll quickly notice that the actual details of the event (in the “message” property) are so large, they make it difficult to “see” what is happening in grid view.

So how to actually use the eventlog for analysis? Well, we need to get the XML data to SQL so we can run queries. And this SQL 2005+ can do. The approach I have now pursued is to (1) create a table for the log properties, (2) run a cursor over the EventLog and (3) read the LogProperties as XML and push that into the new table. So let’s look at some code. First we create the new table:

CREATE TABLE PortalCore_EventLogProperties (
 LogEventID INT NOT NULL,
 PropertyName VARCHAR(50) NOT NULL,
 PropertyValue NVARCHAR(MAX)
)
GO

This sets up the table that will receive our data. Now for our bit of magic:

DECLARE @LogProperties XML
DECLARE @LogEventID INT

DECLARE c CURSOR FAST_FORWARD FOR
SELECT LogProperties, LogEventID FROM PortalCore_EventLog

OPEN c
FETCH NEXT FROM c
INTO @LogProperties, @LogEventID

WHILE @@FETCH_STATUS = 0
BEGIN
 PRINT @LogEventID

 INSERT INTO PortalCore_EventLogProperties(LogEventID, PropertyName, PropertyValue)
 SELECT
  @LogEventID,
  Tbl.Col.value('(PropertyName)[1]', 'VARCHAR(50)'),
  Tbl.Col.value('(PropertyValue)[1]', 'NVARCHAR(MAX)')
 FROM
  @LogProperties.nodes('//LogProperty') Tbl(Col)
 WHERE NOT EXISTS (
  SELECT * FROM PortalCore_EventLogProperties 
  WHERE LogEventID=@LogEventID
   AND PropertyName=Tbl.Col.value('(PropertyName)[1]', 'VARCHAR(50)')
 )

 FETCH NEXT FROM c
 INTO @LogProperties, @LogEventID
END

CLOSE c
DEALLOCATE c
GO

This migrates the logproperties to its own table. Note that this can take a while depending on the number of records (I ran 30K records in about 45 minutes while typing this). Optionally you can limit the amount by adding “TOP 1000” or another limiting instruction to the SELECT statement. But now we have the data where we want it. Ready for analysis. Let’s run a few queries.

SELECT * FROM
(SELECT DISTINCT(elp.PropertyValue),
 (SELECT COUNT(*) FROM PortalCore_EventLogProperties elp1
  WHERE elp1.PropertyName='RawURL' AND elp1.PropertyValue=elp.PropertyValue) AS cnt
 FROM PortalCore_EventLogProperties elp
 WHERE elp.PropertyName='RawURL') tmp
ORDER BY cnt DESC

This will show you a list of requested URLs with the number of occurrences in descending order. It may give you a hint where (i.e. which page) is giving you a headache. Now replace the value “RawURL” with “Message” and you’ll get the same but for the actual errors. The message is in effect the stacktrace. So this is the best way forward in determining where an error is occurring. You can of course also export this to a text file so you can work through it.

"C:\Program Files\Microsoft SQL Server\100\Tools\Binn\bcp" "SELECT cnt, PropertyValue FROM (SELECT DISTINCT(elp.PropertyValue), (SELECT COUNT(*) FROM new_bring2mind_net.dbo.PortalCore_EventLogProperties elp1 WHERE elp1.PropertyName='Message' AND elp1.PropertyValue=elp.PropertyValue) AS cnt FROM new_bring2mind_net.dbo.PortalCore_EventLogProperties elp WHERE elp.PropertyName='Message') tmp ORDER BY cnt DESC" queryout D:\Websites\local.bring2mind.net\StackTraces.txt -U "test" -P "test" –c

Good luck debugging!

Comments

Comment Form

Only registered users may post comments.

NewsArchives


Aderson Oliveira (22)
Alec Whittington (11)
Alessandra Daniels (3)
Alex Shirley (10)
Andrew Hoefling (3)
Andrew Nurse (30)
Andy Tryba (1)
Anthony Glenwright (5)
Antonio Chagoury (28)
Ash Prasad (37)
Ben Schmidt (1)
Benjamin Hermann (25)
Benoit Sarton (9)
Beth Firebaugh (12)
Bill Walker (36)
Bob Kruger (5)
Bogdan Litescu (1)
Brian Dukes (2)
Brice Snow (1)
Bruce Chapman (20)
Bryan Andrews (1)
cathal connolly (55)
Charles Nurse (163)
Chris Hammond (213)
Chris Paterra (55)
Clint Patterson (108)
Cuong Dang (21)
Daniel Bartholomew (2)
Daniel Mettler (181)
Daniel Valadas (48)
Dave Buckner (2)
David Poindexter (12)
David Rodriguez (3)
Dennis Shiao (1)
Doug Howell (11)
Erik van Ballegoij (30)
Ernst Peter Tamminga (80)
Francisco Perez Andres (17)
Geoff Barlow (12)
George Alatrash (12)
Gifford Watkins (3)
Gilles Le Pigocher (3)
Ian Robinson (7)
Israel Martinez (17)
Jan Blomquist (2)
Jan Jonas (3)
Jaspreet Bhatia (1)
Jenni Merrifield (6)
Joe Brinkman (274)
John Mitchell (1)
Jon Henning (14)
Jonathan Sheely (4)
Jordan Coopersmith (1)
Joseph Craig (2)
Kan Ma (1)
Keivan Beigi (3)
Kelly Ford (4)
Ken Grierson (10)
Kevin Schreiner (6)
Leigh Pointer (31)
Lorraine Young (60)
Malik Khan (1)
Matt Rutledge (2)
Matthias Schlomann (16)
Mauricio Márquez (5)
Michael Doxsey (7)
Michael Tobisch (3)
Michael Washington (202)
Miguel Gatmaytan (3)
Mike Horton (19)
Mitchel Sellers (40)
Nathan Rover (3)
Navin V Nagiah (14)
Néstor Sánchez (31)
Nik Kalyani (14)
Oliver Hine (1)
Patricio F. Salinas (1)
Patrick Ryan (1)
Peter Donker (54)
Philip Beadle (135)
Philipp Becker (4)
Richard Dumas (22)
Robert J Collins (5)
Roger Selwyn (8)
Ruben Lopez (1)
Ryan Martinez (1)
Sacha Trauwaen (1)
Salar Golestanian (4)
Sanjay Mehrotra (9)
Scott McCulloch (1)
Scott Schlesier (11)
Scott Wilkinson (3)
Scott Willhite (97)
Sebastian Leupold (80)
Shaun Walker (237)
Shawn Mehaffie (17)
Stefan Cullmann (12)
Stefan Kamphuis (12)
Steve Fabian (31)
Steven Fisher (1)
Tony Henrich (3)
Torsten Weggen (3)
Tycho de Waard (4)
Vicenç Masanas (27)
Vincent Nguyen (3)
Vitaly Kozadayev (6)
Will Morgenweck (40)
Will Strohl (180)
William Severance (5)
What is Liquid Content?
Find Out
What is Liquid Content?
Find Out
What is Liquid Content?
Find Out