An Understanding of the DotNetNuke Search

With the 3.0 release of DotNetNuke, way back in March of 2005, searching was implemented in the project after a hiatus in the 2.* releases.

 

Since then, not much has changed with the search, though it is still a very mysterious system for most users and developers. I hope to clear up some of the mystery with this blog post.

 

As a DNN site administrator you most likely won't ever worry about how the searching works, until users come to you and ask why they aren't seeing results they might expect. This blog post will explain how search works so you can answer their questions.

 

A few things to note

 

1. The Search implementation within DNN is pretty basic: it counts the number of times a term is found within a body of "text". More on this later.

 

2. Modules differ in how, and whether, they implement the necessary interfaces to interact with the core DNN Search provider. Not all modules implement ISearchable, and those that don't won't support the core search.

 

3. Each module chooses what content it provides to the Search provider to be indexed. It might pass everything associated with a particular object, or it might pass only specific values from an object, which can limit the effectiveness of search for that module.

 

The scheduler:

 

The Search indexer runs on a schedule, defined under the Host/Schedule options. In most cases I've seen, the indexer is set to run every 30 minutes. If you're making changes to content within your modules and expecting them to show up in the results immediately, that is not very likely to happen. To get around this, you can shorten the time between executions of the indexer, though I wouldn't recommend running the indexer too frequently if you have a lot of content on your website: the more often it runs, the more often each module hits the database to load its content.

 

One way to force your content to be indexed is to go to the Schedule page under the Host menu, edit the Search Indexer task, disable it, and save the task. Then edit the task again and enable it; this should force the indexer to fire immediately. If this isn't getting your content indexed as you would expect, you can clear the Search tables in the database and have them repopulate completely the next time the indexer runs. I made a blog post on how to do this quite a while ago (http://weblogs.asp.net/christoc/archive/2006/06/26/DotNetNuke-Daily-Tip-_2300_3-6_2F00_26_2F00_06-Clear-Search-Tables.aspx).

 

There is also a "re-index" option on the Host/Search Settings page, though personally I've always found it to be a bit flaky, so I take the above approach to force my content to reindex.

 

A basic overview of how the indexer works:

 

The indexer job fires and makes a request to each of the modules on the website that support the ISearchable interface. These modules return a collection of SearchInfoObjects, assuming they have any content to return. How the modules populate these SearchInfoObjects is completely up to the modules. As a developer, it is important to populate these SearchInfoObjects with unique SearchKey values; otherwise the indexer will log an exception.
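
To make the uniqueness requirement a bit more concrete, here is a minimal sketch in Python. DNN itself is .NET, so none of this is the real API: the SearchItem class, its field names, and the get_search_items and find_duplicate_keys helpers are hypothetical stand-ins. It only illustrates a module building one item per piece of content and a simple check for repeated search keys.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class SearchItem:
    """Hypothetical stand-in for what a module hands to the indexer."""
    search_key: str         # should be unique across the items a module returns
    title: str
    content: str            # the text the module chooses to expose to search
    last_updated: datetime


def get_search_items(articles):
    """Build one item per article, deriving the key from the article id."""
    return [
        SearchItem(
            search_key=f"article-{a['id']}",
            title=a["title"],
            content=a["body"],
            last_updated=a["updated"],
        )
        for a in articles
    ]


def find_duplicate_keys(items):
    """Return the search keys that appear more than once, sorted."""
    counts = Counter(item.search_key for item in items)
    return sorted(key for key, n in counts.items() if n > 1)
```

Deriving the key from something that is already unique for the module's content, such as a primary key, is one easy way to avoid collisions.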

 

The search indexer then works through each of the objects returned from each module. It checks whether an object has already been indexed by comparing the last updated date on the object to the last updated date in the SearchItem table. If they differ, the indexer will update the indexing of this object.

 

If the indexer finds an object in SearchItem that wasn't returned from a module, it presumes that item has been deleted and deletes all indexing for it. This is a key point that most module developers miss: if you don't pass back an item, DNN assumes it no longer exists and removes it from the index. This functionality has changed in Cambrian; stay tuned for more Cambrian blog posts this year.
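
The add, update, and delete decisions described above can be sketched roughly as follows. This is an illustrative Python sketch, not the DNN code: the reconcile function and the two dictionaries are hypothetical, and treating a key that is not yet in the index as "add" is my assumption about the obvious remaining case.

```python
from datetime import datetime


def reconcile(indexed, returned):
    """Decide what to do with each item, following the steps described above.

    indexed:  dict of search key -> last updated date already in the index
    returned: dict of search key -> last updated date reported by the modules
    """
    # Assumption: items the index has never seen simply get indexed.
    to_add = sorted(k for k in returned if k not in indexed)
    # Items whose last updated date differs get re-indexed.
    to_update = sorted(
        k for k in returned if k in indexed and returned[k] != indexed[k]
    )
    # Anything in the index that no module returned is presumed deleted.
    to_delete = sorted(k for k in indexed if k not in returned)
    return to_add, to_update, to_delete
```

The last step is the one that catches people out: a module that only returns recently changed items would see everything else fall into to_delete.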

 

The indexing of content basically consists of parsing out individual words in an item and storing these words in a SearchWord table, then creating a reference for each word in the SearchItemWord table.
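
As a rough picture of that word-level indexing, here is a small Python sketch. The two dictionaries stand in for the SearchWord and SearchItemWord tables; the index_item function, the lowercasing, the tokenizing pattern, and storing an occurrence count on each reference are all my assumptions for illustration, not a description of the actual schema.

```python
import re
from collections import Counter


def index_item(item_id, text, search_word, search_item_word):
    """Split text into words and record per-item occurrence counts.

    search_word:      dict of word -> word id (stands in for SearchWord)
    search_item_word: dict of (item id, word id) -> occurrences
                      (stands in for SearchItemWord)
    """
    # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    for word, occurrences in Counter(words).items():
        # Reuse the existing word id, or hand out the next one.
        word_id = search_word.setdefault(word, len(search_word) + 1)
        search_item_word[(item_id, word_id)] = occurrences
```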

 

Search results process:

 

When you search for an individual term, DNN will look to see if that word exists in the SearchWord table. If so, it will then look to see which "items" in SearchItem have this word. It will count the number of times the word is found in an item and return that item as a search result. A relevance number is passed back with the search results; this number is usually four digits long and is built in the following manner.

 

Each time a word is found in a particular item, the count is incremented. A relevance of 1001 would mean that your search term was found once in an item; 1050 would mean that your search term was found 50 times in an item.

 

If you search for multiple terms, the only difference in the process is how the relevance number is built. If you searched for two terms and both terms were found one time in an item, the relevance number would be 2002. If the two search terms were found a total of 3 times (once for term 1, and twice for term 2), the relevance would be 2003. The key information is this:

 

For the first number (X) in the relevance (XYYY), X is the number of search terms found in a particular item. If you searched for 3 terms and all three terms were found, X would be 3.

 

For the other three numbers (YYY) in the relevance (XYYY), YYY is the total number of times any of the search terms were found in an item.
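
The XYYY scheme is simple enough to write down in a few lines. The following Python sketch reproduces the numbers used in the examples above; the relevance function and its input format are hypothetical, and it makes no attempt to model what the core provider does once the total count no longer fits in three digits.

```python
def relevance(term_counts):
    """Build the XYYY relevance number described above.

    term_counts: dict of search term -> times that term appears in the item
    """
    # Only terms that were actually found in the item contribute.
    found = {term: n for term, n in term_counts.items() if n > 0}
    terms_found = len(found)                 # X
    total_occurrences = sum(found.values())  # YYY
    return terms_found * 1000 + total_occurrences


print(relevance({"dnn": 1}))               # one term, found once
print(relevance({"dnn": 50}))              # one term, found 50 times
print(relevance({"dnn": 1, "search": 2}))  # two terms, three hits in total
```

These three calls print 1001, 1050 and 2003, matching the examples in the text.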

 

You might have searched for 3 terms in a particular search and gotten back a relevance number of 3099. This tells you that all three terms were found in this particular result, but it gives you no further insight into how many times each individual term was found. The first term might have been found 97 times and the other two terms one time each (97+1+1 = 99), or each term might have been found 33 times. With the basic nature of the core DNN search provider, you don't get this information.
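
Going the other way, a relevance number can be split back into its two parts, but no further. In this small Python sketch the decode_relevance helper is hypothetical, and the two breakdowns are the ones from the paragraph above:

```python
def decode_relevance(score):
    """Split an XYYY relevance number into (terms found, total occurrences)."""
    return divmod(score, 1000)


terms_found, total = decode_relevance(3099)

# Two different per-term breakdowns from the text; both add up to 99 hits.
breakdowns = [(97, 1, 1), (33, 33, 33)]
scores = {len(counts) * 1000 + sum(counts) for counts in breakdowns}
```

Here terms_found is 3 and total is 99, and scores collapses to the single value 3099, which is exactly why the per-term counts can't be recovered from the relevance number.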

 

Hopefully this post has provided you a bit more insight into how the core searching functionality works. If you're interested in learning more, open up the solution. Because DotNetNuke is open source, you can learn a lot by opening up the code and stepping through some of the functionality.

 

A note of thanks to Charles Nurse for reviewing this blog post for accuracy and providing me some feedback before I posted it! :)
