When integrated circuit engineers design components, they have to pay attention not only to the logical design of the chip, but also to how it is physically implemented in silicon. With improved .NET deployment technology, programmers must do likewise. Logical design refers to the class and algorithmic structure of a program; physical design refers to how those classes are placed in assemblies. The fundamental idea is to group your classes into assemblies so as to minimize dependencies among the assemblies, with as little associated performance degradation as possible.
Assemblies are the units of deployment in .NET. If physical design is done properly, applications can be upgraded in finer-grained pieces and evolved gradually as user requirements change and problems are fixed. The idea is to minimize the number of assemblies that have to be replaced when a change is made to an application. That lets you perform emergency fixes with fewer possible repercussions, and it makes it easier to roll back upgrades. It even makes development easier, because you know which versions of the software developers are working against, QA is testing, and customers are using. It might even make it easier to divide work among multiple developers.
Obviously you need some dependencies among your assemblies, because without dependencies you wouldn't have an application. Minimizing the number of dependencies doesn't mean putting each class in a separate assembly. That would not only increase the cost of loading assemblies at runtime and introduce an assembly management problem; it would actually increase the number of dependencies.
The first requirement in physical design is that each assembly have a strong name, which allows the assembly to be versioned and have a unique identity. With unique identities you can then deterministically install and uninstall the application, have a bill of materials for each application, and use application and publisher policy to declare version compatibility.
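As a rough illustration of what that identity looks like, the sketch below uses reflection to print an assembly's full name (simple name, version, culture, and public key token) and reports whether it carries a strong name. The class name IdentityDemo is our own invention for this sketch, and signing itself happens at build time with a key file, which is not shown here.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, not part of the article's downloadable source.
public static class IdentityDemo
{
    // Returns the display name the runtime uses to identify an assembly,
    // followed by a note on whether the assembly has a public key token.
    public static string Describe(Assembly assembly)
    {
        AssemblyName name = assembly.GetName();
        byte[] token = name.GetPublicKeyToken();
        bool isStrongNamed = token != null && token.Length > 0;
        return name.FullName + (isStrongNamed ? " (strong-named)" : " (not strong-named)");
    }

    public static void Main()
    {
        // Describe the assembly this code was compiled into.
        Console.WriteLine(Describe(typeof(IdentityDemo).Assembly));
    }
}
```

An unsigned build reports a null public key token, which is a quick way to spot an assembly that cannot yet take part in version policy.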
To make this clear, let's look at a simple application and focus on the logical and physical design. You can download the source code for these examples. Each step is in a separate directory.
Our example has three classes: a Customer class, a CustomerManager class, and an application class (the main program).
The idea is that the Customer class contains the code to maintain the information associated with a customer.
The CustomerManager class maintains a list of customers and retrieves the customer instance based on a name or ID.
The code in main.cs represents the application that has the business logic and uses the other two classes. While the example is trivial, it can be used to illustrate the essential ideas.
In Step 1, all the classes are bound in one assembly. From both a logical and a physical point of view, this design is a mess. There is little encapsulation. The application can access both the CustomerManager class and the Customer class and can modify the identifiers assigned by the CustomerManager class. The application can also create instances of the Customer class. The entire application has to be rebuilt whenever a line of code is changed. You cannot isolate fixes or new features. Multiple installations with different features or bug fixes require source code branching of the entire source. You have longer compile, build, and test/debug cycles since the entire application is rebuilt every time.
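The downloadable source is not reproduced here, so the following is only our guess at the shape of Step 1: every member name below is hypothetical. It shows the two encapsulation problems just described, with the application overwriting an identifier and constructing a Customer directly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical reconstruction of Step 1: one assembly, everything public.
public class Customer
{
    public int Id;        // assigned by CustomerManager, but writable by anyone
    public string Name;

    public Customer(int id, string name) { Id = id; Name = name; }
}

public class CustomerManager
{
    private readonly List<Customer> customers = new List<Customer>();
    private int nextId = 1;

    // Creates a customer and assigns the next identifier.
    public Customer Add(string name)
    {
        Customer c = new Customer(nextId++, name);
        customers.Add(c);
        return c;
    }

    // Returns the customer with the given ID, or null if there is none.
    public Customer FindById(int id)
    {
        return customers.Find(c => c.Id == id);
    }
}

public static class Step1Program
{
    public static void Main()
    {
        CustomerManager manager = new CustomerManager();
        Customer alice = manager.Add("Alice");

        alice.Id = 42;                              // the application overwrites a manager-assigned ID
        Customer rogue = new Customer(1, "Rogue");  // and creates a customer the manager never sees

        // Looking up the ID that the manager originally assigned now fails.
        Console.WriteLine(manager.FindById(1) == null);
    }
}
```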
Step 2 naively divides the program into three assemblies: one for the Customer class, one for the CustomerManager class, and one for the application class. Although the assemblies are separate, the application class is still dependent on both the CustomerManager class and the Customer class. The application class can still modify the identifiers assigned to the customer and create new instances of the customer. While this is a simple application, imagine if the Customer object accessed a database. You could not prevent the rest of the user interface or business rules from directly accessing the data access layer.
Step 3 applies the standard principles of object-oriented programming to create a better logical design of the system. We now have the ICustomer and ICustomerManager interfaces as well as a CustomerManagerFactory class. From a logical perspective, the application is dependent only on the CustomerManagerFactory class and the ICustomer and ICustomerManager interfaces; we have removed its dependencies on the Customer class and the CustomerManager class. Physically, however, we have put the ICustomer interface in the Customer assembly, and the ICustomerManager interface and the CustomerManagerFactory class in the CustomerManager assembly. The main assembly is still dependent on the two other assemblies. Independent deployment is still difficult.
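A minimal sketch of the Step 3 design might look like the following. The type names come from the article; the members (Add, FindById, Create, and so on) are our assumptions, and the comments mark which assembly each type would be compiled into. Here everything is shown in one file so that it runs on its own.

```csharp
using System;
using System.Collections.Generic;

// --- Customer assembly (Step 3 layout) ---
public interface ICustomer
{
    int Id { get; }
    string Name { get; }
}

public class Customer : ICustomer
{
    public Customer(int id, string name) { Id = id; Name = name; }
    public int Id { get; private set; }      // no longer writable by callers
    public string Name { get; private set; }
}

// --- CustomerManager assembly (Step 3 layout) ---
public interface ICustomerManager
{
    ICustomer Add(string name);
    ICustomer FindById(int id);
}

public class CustomerManager : ICustomerManager
{
    private readonly Dictionary<int, ICustomer> customers = new Dictionary<int, ICustomer>();
    private int nextId = 1;

    public ICustomer Add(string name)
    {
        ICustomer c = new Customer(nextId++, name);
        customers.Add(c.Id, c);
        return c;
    }

    // Returns the customer with the given ID, or null if there is none.
    public ICustomer FindById(int id)
    {
        ICustomer c;
        return customers.TryGetValue(id, out c) ? c : null;
    }
}

public static class CustomerManagerFactory
{
    public static ICustomerManager Create() { return new CustomerManager(); }
}

// --- Main assembly: uses only the factory and the two interfaces ---
public static class Step3Program
{
    public static void Main()
    {
        ICustomerManager manager = CustomerManagerFactory.Create();
        ICustomer alice = manager.Add("Alice");
        Console.WriteLine(manager.FindById(alice.Id).Name);
    }
}
```

The main program names only the factory and the interfaces, yet to compile it still needs a reference to the Customer assembly, because that is where ICustomer lives.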
In Step 4, we have paid attention to physical design. By putting the interfaces into a separate assembly, we have completely isolated the main program from the Customer assembly. Of course, the assumption is that interfaces do not change often. If they do change, it is not unreasonable that a major rebuilding of the application would be necessary.
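One way to check that the physical design matches your intent is to list the assemblies that a compiled assembly actually references. The sketch below does this with reflection; DependencyReport is a hypothetical helper of ours, not part of the article's source. Pointed at the main assembly after Step 4, we would expect the Customer assembly to be absent from the list.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for inspecting an assembly's direct dependencies.
public static class DependencyReport
{
    // Returns the simple names of the assemblies that 'assembly'
    // references directly, sorted for stable output.
    public static string[] DirectReferences(Assembly assembly)
    {
        AssemblyName[] references = assembly.GetReferencedAssemblies();
        string[] names = new string[references.Length];
        for (int i = 0; i < references.Length; i++)
        {
            names[i] = references[i].Name;
        }
        Array.Sort(names, StringComparer.Ordinal);
        return names;
    }

    public static void Main()
    {
        // Report on the assembly this code was compiled into.
        foreach (string name in DirectReferences(typeof(DependencyReport).Assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

The list covers direct references only; to see the whole dependency graph you would apply the same call to each referenced assembly in turn.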
The key challenge is to partition classes into assemblies based on the probability of frequent changes. With fewer dependencies it becomes easier to do pinpoint deployment and create versions for smaller parts of the application. Various installations can deploy different features or bug fixes. In general this may not be a good idea, but it does happen: applications are built with different features based on the license fee, or one site needs a critical emergency fix that might break other parts of the application that other sites require. Better physical design also makes it clearer which versions of the software the developers are working against, and it minimizes the need to "rebuild the world." Your compile, build, and test/debug cycles get faster.
When specifying assembly versions, you should increase the version number of a changed assembly immediately after deployment to quality assurance before any more changes are made. Now development is working on the next version, so if a bug is reported it is clear which version of the assembly has the problem.
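The bookkeeping can be sketched with System.Version. In a real project the number is normally declared with the AssemblyVersion attribute; the helper below, and the choice to bump the build component, are assumptions made purely for illustration.

```csharp
using System;

// Hypothetical helper showing the "bump after handing off to QA" rule.
public static class VersionBump
{
    // Returns the version that development should move to once 'current'
    // has been handed to QA: same major and minor, build + 1, revision 0.
    public static Version NextBuild(Version current)
    {
        return new Version(current.Major, current.Minor, current.Build + 1, 0);
    }

    public static void Main()
    {
        Version inQa = new Version(1, 0, 0, 0);      // hypothetical version now in QA
        Version inDevelopment = NextBuild(inQa);     // what developers build from now on

        Console.WriteLine("QA: " + inQa + ", development: " + inDevelopment);
    }
}
```

Because the two numbers never overlap, a bug report that quotes the assembly version tells you at once whether it concerns the QA drop or work in progress.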
Potential objections to this approach fall into two categories: assembly management and performance.
Putting everything into separate assemblies is not a panacea. Suppose in our CustomerManager example we put each interface into a separate assembly. We would have increased the number of assemblies for probably little gain. Imagine what would happen in a larger application if every class and interface went into a separate assembly. Encapsulation is lost because some methods and classes that could be marked internal have to be marked public. The number of dependencies may grow beyond your ability to manage them. Too many finely grained assemblies means more assemblies have to be redeployed when you make only a small change to your application.
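The encapsulation point can be made concrete with a small sketch. IdGenerator and Register are hypothetical names of ours; the idea is only that a helper compiled into the same assembly as its one caller can stay internal, whereas moving it to an assembly of its own would force it to become public.

```csharp
using System;

// Compiled into the same assembly as CustomerManager, so it can stay
// internal and remain invisible to every other assembly.
internal static class IdGenerator
{
    private static int next = 1;

    internal static int Next() { return next++; }
}

public class CustomerManager
{
    // Registers a customer and returns the identifier assigned to it.
    public int Register(string name)
    {
        // This call compiles only because IdGenerator lives in this assembly.
        // Split into its own assembly, IdGenerator and Next() would have to
        // be public, and any other code could then hand out IDs as well.
        return IdGenerator.Next();
    }
}

public static class EncapsulationDemo
{
    public static void Main()
    {
        CustomerManager manager = new CustomerManager();
        Console.WriteLine(manager.Register("Alice"));
        Console.WriteLine(typeof(IdGenerator).IsNotPublic);  // true for an internal top-level type
    }
}
```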
Loading an assembly imposes a performance penalty. You have to analyze the use patterns of your application. Do all the assemblies load at once during the application startup, or is the load distributed over the lifetime of the application? Does it matter to the users of the application? It obviously depends on whether you have a UI-intensive app or a Web service. How long do the users spend with a UI-intensive app? Some users are willing to tolerate a slightly longer startup time if they are going to spend 15 minutes to an hour with the app. Other times they want the quickest startup possible.
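To find out when your assemblies actually load, you can subscribe to the AppDomain.AssemblyLoad event early in startup and log each load as it happens. The sketch below is one way to do that; LoadTrace and its members are hypothetical names, and what it records depends entirely on the application it runs in.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical tracing helper: records assemblies as the runtime loads them.
public static class LoadTrace
{
    private static readonly object Gate = new object();
    private static readonly List<string> loadedLater = new List<string>();

    // Call once, as early as possible in startup.
    public static void Start()
    {
        AppDomain.CurrentDomain.AssemblyLoad += (sender, args) =>
        {
            // The event can be raised on any thread, so guard the list.
            lock (Gate) { loadedLater.Add(args.LoadedAssembly.GetName().Name); }
        };
    }

    // Names of the assemblies loaded since Start() was called.
    public static string[] LoadedSinceStart()
    {
        lock (Gate) { return loadedLater.ToArray(); }
    }

    // Number of assemblies present in the application domain right now.
    public static int LoadedNow()
    {
        return AppDomain.CurrentDomain.GetAssemblies().Length;
    }

    public static void Main()
    {
        Start();
        Console.WriteLine("Assemblies loaded at startup: " + LoadedNow());

        // ... run the application; code paths that first touch a type in
        // another assembly cause that assembly to be loaded ...

        foreach (string name in LoadedSinceStart())
        {
            Console.WriteLine("Loaded later: " + name);
        }
    }
}
```

Comparing the startup count with the list of later loads gives a first indication of whether the loading cost is concentrated at startup or spread over the session.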
More assemblies do increase the working set of the application. Does that matter? Again it depends on how big your working set is to begin with, and on the locality of reference in your application.
The problem with performance-related statements, of course, is that you can always find a set of circumstances to which they apply or do not apply. In many applications, the biggest performance bottleneck is network latency or the time the user spends thinking about what to do next. In those cases, any performance penalty from having more assemblies would go unnoticed.
How much performance do you trade off for ease of upgrade? It depends. You have to understand the issues associated with your application. But we suspect that for most applications, the advantages of having more assemblies and more finely grained upgrades far outweigh the performance penalties.
Better tools would help with physical design. It would be nice if within one source code control project you could associate a set of source code control files with an assembly version.
Using good physical as well as logical design techniques lets you minimize assembly dependencies and achieve pinpoint deployment of your application at multiple installations.
Physical design matters.
Michael Stiefel is the principal of Reliable Software, Inc.
George Wesolowski is a senior consultant at Knowledge Management Associates.
Copyright © 2009 O'Reilly Media, Inc.