Using Luke to Peek into the Lucene Search Database

Starting with DNN 7.1, the new Search uses Lucene as its indexing and querying engine. Lucene is a file-based NoSQL database. To dig into this database you need "Luke", a specialized Java tool. Luke is mostly used to troubleshoot issues with Search, especially when you want to know how Lucene stores your content internally.

Downloading Luke

Download the “lukeall-3.5.0.jar” file from https://code.google.com/p/luke/downloads/list

 

Running Luke

Provided you have the current version of Java installed, you should be able to launch it by just double-clicking the downloaded jar file.
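If double-clicking does nothing, starting the jar explicitly with the java launcher is the usual fallback. Here is a minimal sketch of that, assuming java is on your PATH; the helper names luke_command and launch_luke are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path


def luke_command(jar_path):
    """Build the command line that starts a jar with the `java` launcher."""
    return ["java", "-jar", str(Path(jar_path))]


def launch_luke(jar_path):
    """Start Luke in a separate process; assumes `java` is on the PATH."""
    if shutil.which("java") is None:
        raise RuntimeError("java was not found on the PATH")
    return subprocess.Popen(luke_command(jar_path))


# Only prints the command; call launch_luke(...) to actually start the tool.
print(luke_command("lukeall-3.5.0.jar"))
```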

Pointing to Search Folder

In Luke, point to your site's App_Data\Search folder. Also make sure the “Open in Read-Only mode” checkbox is checked before clicking OK. This allows your DNN site to continue using the database while you use Luke.

Lucene Database

Here is what the typical Lucene files look like:

The write.lock file is created by Lucene to ensure that only one process at a time can write to these files. Deleting this file is not recommended in general; if you must delete it, recycle the app pool first.
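If you just want to check from a script whether the lock file is present, without touching it, something like the following would do. It is a small sketch, not part of DNN or Luke; the function names and the example listing are made up for illustration.

```python
from pathlib import Path


def list_index_files(folder):
    """Return the sorted names of the files directly inside a folder."""
    return sorted(p.name for p in Path(folder).iterdir() if p.is_file())


def has_write_lock(file_names):
    """True when Lucene's write.lock is among the given file names."""
    return "write.lock" in file_names


# Hypothetical listing for illustration; your file names will differ.
example = ["_0.cfs", "segments.gen", "write.lock"]
print(has_write_lock(example))
```

On a real site you would pass the result of list_index_files(...) for your search folder to has_write_lock. The sketch only reads the directory listing and never deletes anything.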

Initial View

A few things jump out in the initial view. Some of the important fields are described below:

Number of documents

Lucene works at the document level. DNN converts each of its entities into one or more documents. Every page in DNN becomes one document in Lucene. Likewise, every module becomes one document. Each item within a module (e.g. an Article) in turn becomes another document. In short, everything becomes a document in Lucene: it's a document store.
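To make the page / module / item breakdown concrete, here is a toy sketch that flattens one page into documents. The dictionary layout and the "type" and "title" keys are assumptions made for this example, not DNN's actual schema.

```python
def to_documents(page):
    """Flatten a page into one document for the page, one per module,
    and one per item inside each module."""
    docs = [{"type": "page", "title": page["title"]}]
    for module in page["modules"]:
        docs.append({"type": "module", "title": module["title"]})
        for item in module["items"]:
            docs.append({"type": "item", "title": item})
    return docs


# A made-up page with two modules, one of which holds two articles.
page = {
    "title": "Home",
    "modules": [
        {"title": "News", "items": ["Article A", "Article B"]},
        {"title": "HTML", "items": []},
    ],
}
docs = to_documents(page)
print(len(docs))  # 1 page + 2 modules + 2 items = 5
```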

"Number of Documents" shows the total number of such documents (excluding the ones that are deleted). The screenshot shows 167, which is a very small number from a brand new Evoq Content installation. On an actual website this number will typically be in the thousands.

Number of Fields

Every document consists of one or more fields. If you think of a document as a row in a database table, then the fields are its columns. Fields support common types such as numeric, boolean, string, etc. The beauty of a typical NoSQL database is that its schema is very flexible: two documents can have very different sets of fields.

The "Number of fields" value shows the total number of unique fields across all documents.
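A quick way to picture the unique field count: treat each document as a dictionary and take the union of all keys. The documents and field names below are invented for illustration.

```python
def unique_fields(documents):
    """Return the sorted set of field names used by any document."""
    names = set()
    for doc in documents:
        names.update(doc)  # iterating a dict yields its keys
    return sorted(names)


# Three made-up documents with different sets of fields.
documents = [
    {"title": "Home", "body": "Welcome"},
    {"title": "News", "author": "host", "published": True},
    {"title": "Article A", "body": "Text", "views": 12},
]
fields = unique_fields(documents)
print(len(fields), fields)
```

No two of these documents share the same set of fields, yet they sit in the same store, and the count covers every field name seen at least once.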

Number of terms

These are keywords extracted by Lucene from the texts provided by DNN.
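As a rough picture of what "extracting keywords" means, the sketch below lowercases a text and splits it into distinct words. This is a deliberately simplified stand-in: a real Lucene analyzer may also drop stop words, stem terms, and so on.

```python
import re


def extract_terms(text):
    """Lowercase the text and return its distinct alphanumeric words, sorted."""
    return sorted(set(re.findall(r"[a-z0-9]+", text.lower())))


terms = extract_terms("Welcome to DNN. DNN uses Lucene!")
print(terms)
```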

Deleted Documents

Deleted documents in Lucene are not removed immediately; they are first marked as deleted. Physical removal is done by a different process. The number of deleted documents is shown in the “Has deletions? /Optimized?” label. The number in parentheses next to Yes is the number of deleted documents. In this example it is 5.

So the total number of documents in this database is 167 + 5 = 172.
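The same arithmetic in a small sketch: each slot in the index either holds a live document or one marked as deleted. Representing this as a list of booleans is an assumption made for the example, not how Lucene stores it.

```python
def index_counts(deleted_flags):
    """Count live, deleted and total documents from per-document flags."""
    total = len(deleted_flags)
    deleted = sum(deleted_flags)  # True counts as 1
    return {"live": total - deleted, "deleted": deleted, "total": total}


# 167 live documents plus 5 that are marked as deleted.
flags = [False] * 167 + [True] * 5
counts = index_counts(flags)
print(counts)
```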

Documents View

The second tab lets you go through the various documents present in the database. The arrows allow you to go back and forth. You can also type a number between 0 and the maximum document id (e.g. 171 in this case) and press Enter to see the content of that document directly.

As noted above, this database has 172 items. Document ids are indexed starting at 0: the id of the first document is 0, the second is 1, and so on. In this example you can go through document ids 0 to 171 (which is 172 - 1).
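The zero-based numbering maps directly onto a range. A tiny sketch, with valid_doc_ids being a name made up for this example:

```python
def valid_doc_ids(total_documents):
    """Ids run from 0 up to, but not including, the total number of documents."""
    return range(total_documents)


ids = valid_doc_ids(172)
print(ids[0], ids[-1])  # first and last valid id
print(172 in ids)       # one past the end is not a valid id
```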

Viewing a Document

Typing a document id or going back and forth will list the contents of the document.

The field names are listed in the bottom pane with the values as stored in Lucene. Please note that Lucene might convert text to its stemmed version. Even if your site had jumped, jumping, and jump as text in one document, it might convert all three to ‘jump’. This process is called ‘stemming’.
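To illustrate the idea only, here is a toy stemmer that strips a few common English suffixes. It is not the algorithm Lucene uses and it gets many words wrong; it just shows how three surface forms can collapse into one stored term.

```python
def toy_stem(word):
    """Strip a few common suffixes. A toy for illustration, not Lucene's stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


stems = {toy_stem(w) for w in ["jumped", "jumping", "jump"]}
print(stems)
```

This is why a value you see in Luke may not match the exact word that appears on your page.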

Running Queries

Luke has limited support for running custom queries. Follow the steps below:

 
  1. Go to the Search tab
  2. Type the keyword you want to search for
  3. Ensure “….KeywordAnalyzer” is the selected analyzer
  4. Select the field you want to search in
  5. Click the Search button
  6. Results should show in the bottom pane
  7. Clicking any row in the results will take you to that specific document in the Documents tab
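Lucene's KeywordAnalyzer treats whatever you type as one single term, so the search is effectively an exact lookup of that term in the chosen field. The sketch below mimics that with a plain dictionary; the index layout and its contents are invented for this example.

```python
def search_field(index, field, keyword):
    """Return ids of documents whose `field` contains exactly `keyword` as a term."""
    return sorted(index.get(field, {}).get(keyword, []))


# Made-up index: field name -> term -> ids of documents containing the term.
index = {
    "title": {"home": [0], "news": [1, 4]},
    "body": {"jump": [1, 2]},
}
print(search_field(index, "title", "news"))
print(search_field(index, "body", "jumping"))
```

The second lookup comes back empty because only the stemmed term "jump" is stored in this toy index. A mismatch of that kind is one possible reason for a search that finds nothing.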

Alternate way to query

At times, the Search tab may not find anything, which can be a very frustrating experience. Don’t lose heart; there is an alternate way:

  1. Go to the Overview tab
  2. In the bottom-left pane, select the field where you think your keyword resides (title in this example)
  3. Click on “Show top terms”. Adjust the “number of terms” drop-down as needed.
This will take you to the Documents tab and list all the documents that contain this keyword.
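Conceptually, listing the top terms of a field and then jumping to the documents behind one of them looks roughly like the sketch below. The documents are invented, and ranking terms by how many documents contain them is an assumption made for this example.

```python
from collections import Counter


def top_terms(docs, field, n):
    """Rank the terms of one field by the number of documents containing them."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.get(field, [])))
    return counts.most_common(n)


def docs_with_term(docs, field, term):
    """Return ids (list positions) of documents whose field contains the term."""
    return [i for i, doc in enumerate(docs) if term in doc.get(field, [])]


# Made-up documents; each field holds the terms already extracted from it.
docs = [
    {"title": ["home"]},
    {"title": ["news", "archive"]},
    {"title": ["news"]},
]
print(top_terms(docs, "title", 1))
print(docs_with_term(docs, "title", "news"))
```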

Comments

Jim2000
Can Ash please drop us a few lines how to enable non Latin character search?
Jim2000 Saturday, July 18, 2015 3:53 AM (link)
Ash Prasad
Host > Host Settings > Advanced Settings tab > Search Settings section > Custom Analyzer Type drop down > Select the one you want.
Ash Prasad Friday, July 31, 2015 4:47 PM (link)
Joseph Craig
When I try to open c:\mydirectory\httpdocs\App_Data\Search

I get: No valid directory at the location, try another location.

There is definitely something that looks like a Lucene set of files there.

Any ideas? (I get search results in the site)
Joseph Craig Monday, June 13, 2016 10:06 PM (link)
Ash Prasad
Joseph - What do you see under Host > Host Settings > Advanced Settings > Search Settings > Search Index Path? Does it list the same path as you mentioned above?
Ash Prasad Monday, June 13, 2016 10:55 PM (link)
Joseph Craig
id10t. Thanks!
Joseph Craig Tuesday, June 14, 2016 11:34 AM (link)
