DNN Platform 7.1 shipped the best Search the platform has had so far; have a look at my previous blog post to learn all about the new Search. This blog series is dedicated to giving developers more insight so that integrating with Search is easy.
This post is organized as an FAQ for ease of reading and to answer common questions from module developers. The key takeaway is that ISearchable has been deprecated and replaced with ModuleSearchBase, which brings lots of awesomeness.
Let’s start by getting to know the Site Crawler a bit better.
Site Crawler
Is Site Crawler new?
No. It’s the CE’s “Search Engine Scheduler”, renamed to “Search: Site Crawler”.
What’s new in Site Crawler?
Before 7.1, it called every module that implemented ISearchable to obtain search information and indexed it so that users could search it. It also did RSS Syndication.
Now it does more: it has a Tab Indexer, a Module Metadata Indexer and a Module Content Indexer. Below is a summary from the Schedule History.
Is Site Crawler Site specific?
No. The schedule job goes through all the Sites present.
What is Tab Indexer?
Tab Indexer is part of Site Crawler. Its job is to collect information about each and every page defined in all the Sites including the Host ones. It stores page name, title, description, keywords, taxonomy tags, etc.
What is Module Metadata Indexer?
Very similar to Tab Indexer, except it’s at a module level.
What is Module Content Indexer?
This is also part of Site Crawler and the most important Indexer. It is responsible for getting content from modules. It calls modules that implement ISearchable or the new ModuleSearchBase.
ModuleSearchBase
What is ModuleSearchBase?
It’s the new Interface (actually an abstract class) that module developers need to implement to better integrate with new Search.
Is ModuleSearchBase better than ISearchable?
Yes, it is more efficient because it has the concept of Deltas. ISearchable required modules to provide all their content every time. ModuleSearchBase only asks for the content that has changed since the last run.
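To make the difference concrete, here is a minimal Python sketch of the two styles. It is a language-neutral model, not DNN code: the ContentItem type and both function names are hypothetical, and a real module would do this in C# against the DNN API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ContentItem:
    """Hypothetical piece of module content (not a DNN type)."""
    key: str
    body: str
    modified_utc: datetime


def all_content(items):
    """ISearchable style: hand back everything on every run."""
    return list(items)


def content_since(items, begin_date_utc):
    """ModuleSearchBase style: hand back only items modified since the last run."""
    return [item for item in items if item.modified_utc >= begin_date_utc]


items = [
    ContentItem("a", "old text", datetime(2013, 6, 1, tzinfo=timezone.utc)),
    ContentItem("b", "new text", datetime(2013, 7, 10, tzinfo=timezone.utc)),
]
last_run = datetime(2013, 7, 1, tzinfo=timezone.utc)

print(len(all_content(items)))                                   # 2
print([item.key for item in content_since(items, last_run)])     # ['b']
```

With only two items the saving is trivial, but on a site with thousands of items the delta version is what keeps each crawl cheap.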
Where should I implement ModuleSearchBase?
It should be implemented by the class named as the BusinessControllerClass in the module’s manifest. Below is an example of the manifest from the Html module. As always, you must list “Searchable” as one of the SupportedFeature entries.
Do you have an example of ModuleSearchBase implementation?
Indeed, have a look at the Html module.
C# does not allow multiple inheritance of base classes, what can I do if my BusinessControllerClass is already derived from another base class?
Ideally you should keep the BusinessControllerClass clean and not inherit from any other base class. In this case you’d need to remove the other base class and inherit from ModuleSearchBase instead. You can continue to implement any number of interfaces though.
How many methods do I need to implement in ModuleSearchBase?
Just one - GetModifiedSearchDocuments.
Backwards Compatibility
Do I have to implement ModuleSearchBase? Can I not stick with the old ISearchable?
Well, you can continue to use ISearchable; we have made sure that the new Search is backwards compatible with ISearchable. You won’t be able to take advantage of Deltas though. You will also be missing other cool features such as localization, granular security trimming, etc.
What happened to SearchItemInfo?
It’s still present to support ISearchable. We map most of the properties of SearchItemInfo into the new SearchDocument.
Which properties of SearchItemInfo are not ported over?
HitCount and ImageFileId.
GetModifiedSearchDocuments
What does this method return?
Essentially a collection of SearchDocuments. It should return SearchDocuments for new, changed and deleted content for your module.
What parameters are passed to this method?
ModuleInfo and BeginDate. The BeginDate is in UTC. You should return new, changed and deleted content between the BeginDate and the current time.
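The contract can be sketched as follows. This is a simplified Python model and not the DNN API: Record and Doc are hypothetical types, the records list stands in for whatever the module would load for the ModuleInfo it is given, and marking deleted content with an is_active flag is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Record:
    """Hypothetical row in the module's own storage."""
    key: str
    title: str
    modified_utc: datetime
    deleted: bool = False


@dataclass
class Doc:
    """Simplified stand-in for a SearchDocument; field names are assumptions."""
    unique_key: str
    title: str
    modified_utc: datetime
    is_active: bool  # False marks content that should leave the index


def get_modified_search_documents(records, begin_date_utc, now_utc=None):
    """Return documents for records touched between begin_date_utc and now."""
    now_utc = now_utc or datetime.now(timezone.utc)
    return [
        Doc(r.key, r.title, r.modified_utc, is_active=not r.deleted)
        for r in records
        if begin_date_utc <= r.modified_utc <= now_utc
    ]


utc = timezone.utc
records = [
    Record("post-1", "Untouched", datetime(2013, 5, 1, tzinfo=utc)),
    Record("post-2", "Edited", datetime(2013, 7, 5, tzinfo=utc)),
    Record("post-3", "Removed", datetime(2013, 7, 6, tzinfo=utc), deleted=True),
]
docs = get_modified_search_documents(
    records,
    begin_date_utc=datetime(2013, 7, 1, tzinfo=utc),
    now_utc=datetime(2013, 7, 10, tzinfo=utc),
)
for d in docs:
    print(d.unique_key, d.is_active)
# post-2 True
# post-3 False
```

Note that every timestamp is timezone-aware UTC; comparing a local time against a UTC BeginDate is an easy way to miss or duplicate content.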
How is this method executed?
This method is called periodically by Site Crawler, which is a scheduled job. This method is called for each and every module instance that implements either ISearchable or ModuleSearchBase.
Is it possible that this method can be called more than once during a single run of the Site Crawler?
Yes. For example, it gets called for every instance of the Html module. However, the ModuleInfo passed is different each time.
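A rough Python model of that calling pattern is shown below. It is not how Site Crawler is actually written: ModuleInstance, crawl and html_get_documents are hypothetical names used only to show one call per module instance.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ModuleInstance:
    """Hypothetical stand-in for the ModuleInfo passed on each call."""
    module_id: int
    tab_id: int


def crawl(instances, get_documents, begin_date_utc):
    """Call get_documents once per module instance and collect the results."""
    collected = []
    for instance in instances:
        collected.extend(get_documents(instance, begin_date_utc))
    return collected


calls = []


def html_get_documents(instance, begin_date_utc):
    """Hypothetical module method; records which instance it was called for."""
    calls.append(instance.module_id)
    return [f"html-{instance.module_id}"]


instances = [ModuleInstance(101, 5), ModuleInstance(102, 7), ModuleInstance(103, 7)]
result = crawl(instances, html_get_documents, datetime(2013, 7, 1, tzinfo=timezone.utc))

print(calls)   # [101, 102, 103]
print(result)  # ['html-101', 'html-102', 'html-103']
```

The takeaway for module code is to return content for the instance it was handed and nothing else.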
How about packages that have more than one module definition in the manifest?
If you have a package that specifies Searchable as a SupportedFeature, this method is called for all module definitions defined in that manifest. In these situations (e.g. the Blogs module), there is usually one main module and one or more helper modules. You should return an empty collection of SearchDocuments for helper modules and real content for the main module, or else you’d be creating duplicate data in the Search index. The old ISearchable worked this way as well. You can differentiate between main and helper modules by using the moduledefinitionid.
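The main / helper check can be sketched like this. It is a Python model with hypothetical names throughout; in particular the hard-coded MAIN_DEFINITION_ID is only for illustration, and a real module would look up its own main module definition rather than assume a fixed number.

```python
from dataclasses import dataclass

MAIN_DEFINITION_ID = 42  # hypothetical ID of the package's main module definition


@dataclass
class ModuleInstance:
    """Hypothetical stand-in for the ModuleInfo passed to the method."""
    module_id: int
    module_def_id: int


def get_documents(instance, load_content):
    """Return real content for the main definition and nothing for helpers."""
    if instance.module_def_id != MAIN_DEFINITION_ID:
        return []  # helper module: contributing here would duplicate index data
    return load_content(instance)


def load_blog_posts(instance):
    """Hypothetical content loader for the main module."""
    return [f"post for module {instance.module_id}"]


main = ModuleInstance(module_id=201, module_def_id=42)
helper = ModuleInstance(module_id=202, module_def_id=43)

print(get_documents(main, load_blog_posts))    # ['post for module 201']
print(get_documents(helper, load_blog_posts))  # []
```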
How do I troubleshoot if this doesn’t get called?
It’s likely that you did not specify the SupportedFeature in the manifest, or that ModuleSearchBase is not implemented in the BusinessControllerClass. It is best to execute this stored procedure to check that your module is listed:
exec GetSearchModules 9999 -- replace 9999 with your portalid
Is there anything specific I need to consider while writing this method?
Yes. Since this method is executed in the context of the Scheduler, there is no HttpContext, so you should not use anything that depends on it, e.g. PortalSettings, CurrentUser, etc.
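The idea is to take everything you need from the arguments you are handed rather than from the ambient request. Here is a small Python model of the pitfall; current_request is a hypothetical stand-in for HttpContext, and ModuleInstance is a hypothetical stand-in for ModuleInfo.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleInstance:
    """Hypothetical stand-in for the ModuleInfo passed to the method."""
    module_id: int
    portal_id: int


# Stands in for HttpContext: populated during a web request, None in the scheduler.
current_request: Optional[dict] = None


def portal_id_from_request():
    """Fragile: only works while a web request is being served."""
    return current_request["portal_id"]


def portal_id_from_module(instance):
    """Robust: uses only what was passed in."""
    return instance.portal_id


try:
    portal_id_from_request()
except TypeError:
    print("no request context in the scheduler")

print(portal_id_from_module(ModuleInstance(module_id=101, portal_id=0)))  # 0
```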
Conclusion
Module developers are strongly encouraged to start using the new ModuleSearchBase (with 7.1+) instead of the old ISearchable.
I am planning to write many more mini-blogs on this topic with lots of details and insights, so stay tuned.