Have you ever written a piece of code that seems to work correctly during testing, gets deployed, and then six months later, when people really start using it, turns out to have a fundamental flaw that makes your whole design completely worthless? That is the situation I find myself in now.
Module Installation
The module installation process is fairly mature by DotNetNuke standards and is based on earlier work by Jonathan de Halleux which was contributed to the core. The installer code was written as part of the DNN 2.0 implementation and was a major new feature provided in this release. This installation process unzips the PA in memory, parses the .dnn manifest file and then moves any assemblies into the bin directory. There are lots of other bits going on, but these are the key processes for the purposes of this discussion. This has worked well for almost two years with very little change.
IUpgradeable
As part of the 3.0 development effort we decided that we would support module notification during this process so that the module could run any custom business logic to initialize the module for the current version. This feature was desperately needed since many modules need to initialize directory structures, remove legacy files or folders, update internal configuration files or run any number of other custom processes that fall outside what the DataProvider scripts can handle.
This new feature was implemented as an interface. Any module which needed this notification service would implement the IUpgradeable interface, which defines the UpgradeModule method. After all of the dlls have been saved to the \bin directory and the module has completed the registration process, the core calls the UpgradeModule method once for every script that was executed as part of the installation process. On a new install, this means every script up to and including the current version. On upgrades, the method is only called for the scripts that were executed to bring the installation up to the current version.
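The calling pattern described above can be sketched roughly as follows. This is an illustrative Python model, not the DotNetNuke source (which is .NET code). The names Upgradeable, GalleryController and run_upgrades, along with the version strings, are hypothetical stand-ins for the IUpgradeable interface, a module's business controller, the installer loop and the script versions.

```python
from abc import ABC, abstractmethod


class Upgradeable(ABC):
    """Hypothetical stand-in for the IUpgradeable interface."""

    @abstractmethod
    def upgrade_module(self, version: str) -> str:
        """Run custom upgrade logic for one script version."""


class GalleryController(Upgradeable):
    """Hypothetical business controller that opts in to notifications."""

    def upgrade_module(self, version: str) -> str:
        return f"initialized for {version}"


def run_upgrades(controller, executed_scripts):
    """Call the upgrade hook once per script version that was executed.

    Controllers that do not implement the interface are skipped.
    """
    if not isinstance(controller, Upgradeable):
        return []
    return [controller.upgrade_module(v) for v in executed_scripts]


# New install: every script up to the current version has run.
new_install = run_upgrades(GalleryController(), ["01.00.00", "01.01.00", "02.00.00"])
# Upgrade from 01.01.00: only the newly executed script has run.
upgrade = run_upgrades(GalleryController(), ["02.00.00"])
```

In this model a new install triggers three calls and the upgrade triggers one, which matches the behavior described for new installs versus upgrades.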
This is the point where the astute reader says "Hey! I thought IIS performs an app recycle when a new dll is placed in the \bin directory. How can this work?" Sure, you say that now. Where were you when I was developing this code? This is a case where sometimes you can be too close to the code to see the big picture and notice potential problems.
Now, in my defense, I will say that it has taken just shy of one year to uncover a problem. I did my usual unit testing and everything worked perfectly. I could install a module and the method was called correctly. This made it through internal testing and beta testing without any problems noted. This has been used on production sites and things seemed to work OK. But wait! I thought we just said that when an assembly is added to the \bin directory IIS recycles the application. Shouldn't this have caused problems? That is correct. At least someone is paying attention.
But there are problems and there are problems. Sometimes problems are so subtle that they can go a long time without detection. If you don't believe me just try writing a multi-threaded application and see how many problems only show up under very limited situations.
So let's examine what exactly is going on and why it took so long to uncover this issue.
Application Recycle
The key to understanding why this problem occurs and why things often seem to work is to understand what exactly happens during an app recycle and to understand why the application is recycled.
In IIS (we'll ignore the differences between IIS 5 and 6 for now) every ASP.Net application is loaded into an application domain. This application domain is the context under which your application will run. Any assemblies that are accessed by the application are loaded into the app domain as they are called. If an assembly is never called, then it will not be loaded, even if it is in the \bin directory. But if IIS loaded these assemblies directly then the dlls in the \bin directory would be locked and you would not be able to install new versions without first shutting down the process that loaded the dll, thereby unlocking it. The .Net team implemented a feature called shadow copying, in which a dll is first copied to a temporary location and the copy is then loaded and executed from the temp directory. ASP.Net makes use of this feature to allow assemblies to be updated without stopping the worker process.
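As a rough illustration of the copy-then-load idea, here is a small Python sketch. It is only a model: it uses text files in place of real dlls, does not reproduce Windows file locking, and the directory names, file name and load_via_shadow_copy helper are all hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical layout: a "bin" folder and a separate shadow-copy folder.
root = Path(tempfile.mkdtemp())
bin_dir = root / "bin"
shadow_dir = root / "shadow"
bin_dir.mkdir()
shadow_dir.mkdir()

original = bin_dir / "Gallery.dll"
original.write_text("version 1.0")


def load_via_shadow_copy(path: Path) -> str:
    """Copy the file to the shadow folder and 'load' the copy, not the original."""
    copy = shadow_dir / path.name
    shutil.copy2(path, copy)
    return copy.read_text()


loaded = load_via_shadow_copy(original)   # what the running app holds in memory
original.write_text("version 2.0")        # installer overwrites the bin copy

on_disk_now = original.read_text()
shutil.rmtree(root)                       # clean up the temporary folders
```

The point to notice is that after the overwrite, the file in bin says 2.0 while the running application is still holding 1.0. The original stays free to be replaced, but whatever was loaded earlier does not change.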
When a new assembly is placed in the \bin directory IIS receives an event which says that an assembly was changed. Since .Net does not allow you to unload an individual assembly, the entire app domain in which the assembly is loaded must be unloaded. This means shutting down your application. Then a new app domain can be created which can load the new dlls if needed. But wouldn't that abort your current request? Boy, you are pretty sharp! But then again, so was the ASP.Net development team. That is why they allow the existing app domain to finish servicing all of the requests that are currently being processed. A new app domain is loaded to handle any new requests from that point forward.
What this means is that when we install the new assemblies into the \bin directory, the current app domain will be marked for destruction and all new requests will be given to a new app domain once it has finished being initialized. The current request which is still processing the module installation request will be allowed to continue through to completion. So where is the problem? Your installation process gets to finish so everything should work.
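The handoff between the old and new app domains can be modeled with a toy sketch like the one below. The AppDomain and Host classes here are hypothetical Python objects written for illustration. They are not .NET or IIS APIs, and the real runtime is considerably more involved.

```python
class AppDomain:
    """Toy model of an application domain (not a real .NET API)."""

    def __init__(self, name):
        self.name = name
        self.in_flight = set()
        self.draining = False  # True once the domain is marked for unload

    def begin(self, request):
        if self.draining:
            raise RuntimeError("domain no longer accepts new requests")
        self.in_flight.add(request)

    def end(self, request):
        self.in_flight.discard(request)

    @property
    def unloaded(self):
        # A draining domain goes away once its last request finishes.
        return self.draining and not self.in_flight


class Host:
    """Toy model of the host that routes requests to the current domain."""

    def __init__(self):
        self.generation = 1
        self.current = AppDomain("domain-1")

    def route(self, request):
        self.current.begin(request)
        return self.current

    def on_bin_changed(self):
        # Mark the old domain for unload and send new requests elsewhere.
        old = self.current
        old.draining = True
        self.generation += 1
        self.current = AppDomain(f"domain-{self.generation}")
        return old


host = Host()
install_domain = host.route("install-module")  # the installer's own request
old = host.on_bin_changed()                    # new dll lands in the bin folder
visitor_domain = host.route("view-page")       # arrives after the change

still_running = "install-module" in old.in_flight
old.end("install-module")                      # installer request completes
```

In this model the installer request keeps running inside the old domain after the change, the later visitor lands in the new domain, and the old domain only disappears once the installer request has finished.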
Let's look at a few different scenarios to see why this only fails part of the time.
The first case is a new installation. In this scenario, your module assembly did not exist previously. When the module installation code gets to the step where it checks whether you implement IUpgradeable, it will not find the assembly in memory and will be forced to load the assembly from disk. You guessed it. It will load the only assembly that is available: the one that was just installed to the \bin directory. If you implemented IUpgradeable, the code will see that your BusinessControllerClass (as indicated in the .dnn manifest) is of type IUpgradeable and will call your UpgradeModule method. Everything works as you would expect.
The second case is an upgraded installation. In this scenario you are running a low volume site or you have recently had an app recycle. The module being upgraded has not been used since the last app recycle and therefore the assembly is not in memory. Just like in the first scenario, the installation proceeds as expected and the upgrade works correctly.
The third case is also an upgraded installation. In this scenario your module implemented IUpgradeable in the earlier version, and is adding some new business logic to the UpgradeModule method for the new version. Because you have a really popular site with thousands of users visiting you every day, your site never reaches the 20 minute idle point which might trigger an app recycle. Also, the module you are upgrading is on one of the most popular pages on the site. As you probably guessed by now (you are the astute reader after all), a shadow copy of your assembly is already loaded into memory. This is the original version from before the installation process placed a new copy in the \bin directory. When the installer checks, it finds that your module does in fact implement the IUpgradeable interface. Like a good application, the installer calls your UpgradeModule method. Unfortunately you forgot to include your new business logic in the old assembly. Pretty shortsighted of you. I try to include code for anything I think I might someday need. (Okay. You're right. That is just plain dumb, but you can't blame a guy for trying.) So when the UpgradeModule method is called, your new business logic for the new module version does not exist and nothing happens. The process functions without any "errors" but the code does not get executed as you expected. Not good.
The last case is a variation of the previous scenario. In this instance your original module did not implement IUpgradeable. But people really love the module and it is always in memory. When the installation occurs the installer tries to determine if your module implements IUpgradeable. Because it looks at the version that is in memory, it will not find the interface implemented and therefore will not call the UpgradeModule method. Again, not what you wanted.
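The four scenarios boil down to one rule: a type that is already in memory wins, and the disk is only consulted on a miss. Here is a minimal Python sketch of that rule. The function name, the "Gallery" module and the version numbers are hypothetical, and the dictionaries merely stand in for the old app domain's loaded assemblies and the contents of the bin directory.

```python
def upgrade_logic_that_runs(loaded, on_disk, module):
    """Return the version whose upgrade hook runs, or None if none does.

    `loaded` models assemblies already in the old app domain's memory;
    `on_disk` models the bin directory after the installer has written to it.
    Each maps module name -> (version, implements_upgradeable).
    """
    # A type already in memory wins; disk is consulted only on a miss.
    version, upgradeable = loaded.get(module) or on_disk[module]
    return version if upgradeable else None


on_disk = {"Gallery": ("2.0", True)}  # what the installer just wrote

scenarios = {
    "new install": {},
    "upgrade, module not loaded": {},
    "upgrade, old version loaded": {"Gallery": ("1.0", True)},
    "upgrade, old version lacks hook": {"Gallery": ("1.0", False)},
}

results = {
    name: upgrade_logic_that_runs(loaded, on_disk, "Gallery")
    for name, loaded in scenarios.items()
}
```

Under this model the first two cases run the 2.0 logic, the third runs the stale 1.0 logic, and the last runs nothing at all.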
The Solution
So all in all, not too bad. The code works half of the time. As long as you are running a site that no one visits then it will almost always work correctly. If you are running a really popular site you have a couple options. You could try handing out bogus URLs in the hopes that all of your users will go somewhere else and allow you to upgrade in peace. Some of you I am sure are pretty stubborn and will not choose to use this perfectly acceptable solution. In that case I guess we just might need to implement a different mechanism for providing this key service.
This is going to take a pretty substantial effort to try and work around the app recycle issue (are you sure you don't want to hand out some other URLs? I have a couple spares handy.). For now, I would recommend that all module developers cease relying on the IUpgradeable interface until we get it re-architected.
Once again it just proves that software developers wouldn't have any bugs if we could just get rid of those pesky customers.