Thursday, May 31, 2007

Abstract Factory: Part III

In this post I'm going to address a couple of issues left out in the discussion of the abstract factory pattern. If you haven't read the earlier discussions in regards to this pattern you can see them here and here.

The abstract factory pattern is a useful and highly used pattern and sometimes we automatically pick this pattern without thinking about the consequences or without asking the question what value does the pattern add? Or does it limit the design in any way?

Like most patterns Abstract Factory also has negative aspects and concerns that have to be thought about before actually trying to use it. In most situations where we need to support multiple competing technologies (i.e. the database example in previous discussions) or we want our client code to be absolutely free of hard coded creation commands (the 'new' keyword) this pattern looks like it's doing the job perfectly but we should always consider the negative aspects too:

Abstract Factory does not support adding new products. (what I mean by support is support without the need for re-coding)

As I explained in the previous discussions one of the main features of this pattern is that adding new product families is a breeze. You just develop your new product family (in the database example I would develop the classes that would communicate with my new database technology), create a product family concrete factory (the class inheriting from AbstractFactory and in charge of the actual product creations) and your set to go. You don't need to change even a single line of code in your previous implementation and with a little help from reflection we can plug in our new family of products, change the config file to start using the new family and presto! It works.

BUT what if you wanted to add a new product? Take a look at the previous example discussed in part 1 of my abstract factory discussion, the diagram below summarizes it:



What if we needed to add a SupplierAccess as a new product how could we go about doing that? Well the pattern doesn't give you any features for adding new products and the truth is in design scenarios where adding a new product is going to happen quite often you shouldn't be using this pattern. It's just too much trouble:

In this example to add a the new SupplierAccess product we would have to change the AbstractFactory class to define a new abstract method called CreateSupplier which in turn would return an ISupplierAccess interface. We would have to create the new interface and create an implementation for the new interface for every single product family that we already have (i.e. the Oracle & SQL product families) and then change the current implementation of all our concrete factories (the SQLFactory & OracleFactory classes) to implement the new create method. This is the biggest possible mess we could have gotten ourselves into. Basically every package (or worst yet every .net assembly) in the original design has to be modified to support the change.

As I mentioned before this pattern is not suitable for situations where you have a lot of new products being added to the system, in these situations you should look into the prototype pattern or other patterns (I will hopefully get around to talk about these in later posts).

Overkill

As is the case with many design patterns you don't want to use a pattern in a situation where it's overkill. Obviously if your requirements specifically state that you are going to have only one product family why would you design it using this pattern? If you have only one product type (or have identified only one so far) why would you use this pattern (what's the use of a factory with just one single create method)? But there are also situations where the patterns use might seem to be overkill but really isn't. For example:

Suppose we are developing a system that is going to support two and only two database technologies: SQL Server & Oracle. We have 7 or 8 product types and we first think of the abstract factory pattern then turn it down on the grounds that it's overkill. It's overkill due to the fact that we are not going to support any other product families in the future. There is going to be only two product families!!! Is it overkill? Well let's see what happens if we want to implement this using specific features of each technology but not use this pattern (or any other pattern that is aimed at solving this problem). In one way or another we would end up with this kind of code:


class WarehouseAccess {
public void Save(some data) {
if (IsInSQLMode) {
// SQL Server specific code
}
else {
// Oralce specific code
}
}
/* the rest of the code */
}

The above code has many problems one of which is logical cohesion (which is a very bad level of cohesion). We have put two completely different pieces of code in one place (the SQL and the Oracle implementation). We are also adding an extra if statement in the path of a save (although an 'if statement' might be a very inexpensive code addition, but it's still something extra). Also from a development/maintenance point of view we have two different code types which might need different programmers to work on them, they will require different assemblies to be referenced and namespaces to be imported, etc.

All the mentioned problems seem trivial but from a "clean code" perspective & a performance perspective implementing the abstract factory pattern here is worth while.

Microsoft Surface

Take a look at http://www.microsoft.com/surface and see the future of computers!
This is just too cool!
I've got to buy one of these!

Wednesday, May 30, 2007

Prototype

Next pattern that I want to discuss is the Prototype pattern from the Gang of Four Design Patterns. For a detailed introduction to this pattern please see here or refer to the Gang of Four Design Patterns book.


The prototype pattern is an interesting little pattern (little in the sense that it's not that complicated) which has been mostly superseded by language mechanisms available in many modern programming languages. But it still remains as a nice topic to discuss and learn since the more proficient you are at this pattern the better you can use those language specific features (which will be discussed later) and understand where they might not work out for your design and you'd have to implement the prototype pattern from scratch. As always my discussions here are going to be based on the C# language and .net features but Java programmers will also find a whole lot of features that are completely similar and will benefit from this discussion.


The prototype pattern is another pattern in the list of creational patterns. Creational patterns by definition provide a mechanism to create new objects in ways more flexible than simply using the 'new' keyword. The two patterns discussed so far Singleton & Abstract Factory are also creational patterns and they serve the same exact goal. For example when using the Abstract Factory pattern you don't use the 'new' keyword to create a product but you call the abstract factory's CreateXYZProduct() method that will create the specific product and return an interface to it. This level of indirection (calling a method that will create the object on your behalf) is how the pattern is providing the added flexibility and the ability to swap families of products without touching the client code. The prototype pattern is also considered a creational pattern because it provides the same concept. It gives you a more flexible way of creating an object.


Let's discuss the pattern by examining an example. Suppose you are building a graphic design tool in which the user gets a canvas and can pick from different drawing shapes (such as circle, rectangle, line …) and use them to design some graphics on the canvas. Now let's focus on the toolbox of shapes that we are providing for the user: Suppose one of our requirements state that alongside the x number of shapes that we are supporting built-in we are also supposed to support the ability to add new shapes in the future (without changing or re-compiling the currently deployed software). In this case the routines that are creating each shape need an added level of flexibility to be able to add any shape, even the ones that are added in the future (by someone else). This is a perfect example of where the prototype pattern can be used. But before we get into the pattern details let's address another issue with the above mentioned requirements: If a shape is created in the future how am I supposed to know how to use it: obviously to the experienced OO designer this is a trivial question since all you need to do is look at the shapes from a level of abstraction that is common between all shapes. In other words I'm going to assume that we will have a Shape class or an IShape interface that all of our shapes (even the ones built in the future) will inherit from (implement) and therefore we can talk to all of them through that common abstract interface.


OK back to our pattern discussion:

The prototype patterns general structure looks something like this:







Before getting into any discussion about the pattern let's apply the above structure to our example:




As can be seen the prototypes are our shapes (or toolbox items), they provide the functionality that we expect from each shape (i.e. to appear in the toolbox, to be drawn on the canvas, resize, etc.), but they also provide an extra piece of functionality: the ability to clone themselves and provide an exact replica of this object. Now this ability might look trivial when we know what the object's type is but when you look at this ability from an abstract level (for example the Prototype level or in this case the Shape level), it seems rather useful. What we have in the ShapeFactory class (Prototype factory) is a list of all possible shapes (prototypes). Whenever a shape is selected by the user, we calculate that shapes index based on where the user clicked on our toolbox and then ask the ShapeFactory class to provide us with that shape (the Create method). The ShapeFactory in turn finds that shape (prototype) in the list of shapes that it contains and asks it to create an exact instance of itself. Now this shape could be one of the built-in shapes or a shape added later by another programmer. The above example perfectly shows that the Clone method although trivial at the specific object type level, is extremely powerful at the abstract level and of great use to the prototype factory class.


We've had an assumption so far that must be addressed: we had assumed that the prototype factory class will somehow have a list of prototype objects. With those prototype objects it can clone them and create new ones but the question is where did it get the prototype objects in the first place. It's similar to the egg & the chicken problem: which one came first?


The context in which you use the prototype pattern greatly affects how the prototype factory will obtain the list of prototypes. A very simple example is situations where (as in the above example) throughout our entire program we only need one prototype factory that will contain a list of prototypes and reply to create requests based on that lists content. In these types of scenarios the best implementation is to make the prototype factory a singleton that when loaded will look into a config file, find a list of all the possible prototype objects (built-in ones and future ones) and the load all of them using reflection. As in this code sample:


class PrototypeFactory
{
private PrototypeFactory()
{
Configure();
}

private List<IPrototype> Prototypes = new List<IPrototype>();

private void Configure()
{
List<string> prototypeConfig = LoadConfig(); //load the prototype list from somewhere
foreach(string config in prototypeConfig)
{
string[] parts = config.Split(','); //let's assume each config has a dll name followed by a class name separated by a comma
//obviously we need extensive error checking and exception handling which has been omitted for the sake of brevity

Assembly asm = Assembly.Load(parts[0]);
IPrototype pt = (IPrototype)asm.CreateInstance(parts[1]);
Prototypes.Add(pt);
}
}

/* other methods including the singleton implementation */
}

You might ask why not use reflection to totally replace the prototype pattern? Actually that's one of the language mechanisms that I mentioned at the beginning of this post. But you have to remember that as always using reflection comes at a performance cost and in situations that speed and performance are at the utmost importance the above technique works better. As you can see we are only using reflection once and that is to load the initial list of prototypes. After that list is loaded to clone a prototype we are asking it for a clone and the prototype is using a standard 'new' keyword to replicate itself.


Another variation of the pattern which isn't supported directly by reflection is when the Clone method is used to do more than just create a new object. In some design problems we need the prototype object to replicate itself by first creating a new object and then initializing it with some data that the original prototype is carrying. Imagine a situation where creating new objects from scratch is going to be too expensive because each object once created would have to go to a DBMS or a remote network location to load some data to initialize itself with. In these scenarios the prototype pattern would be useful because the prototype factory will be holding a list of objects that have been created & initialized once and then when we ask it to create a new instance of one of the specific prototypes it will in return call that prototype's clone method and as mentioned above the clone method will create the new object and also copy the initialization information from its own copy into the new object (before returning it). This way we save on the expensive load process.


It's quite obvious that the above scenario can't be implemented using pure reflection and the prototype pattern needs to be in place. Here is a sample implementation of the above mentioned scenario:


class BigAndExpensivePrototype : IPrototype
{
//private constructor used internally
private BigAndExpensivePrototype(BigAndExpensivePrototype copyFrom)
{
//initialize this object based on the passed object’s data
}

//public constructor used by the prototype factory’s list creation method (should only happen once)
public BigAndExpensivePrototype()
{
//initialize this object by reading it’s information from the DBMS
}

public IPrototype Clone()
{
//create new prototype object by calling the private constructor and passing this object so that it will be initialized the easy way
return new BigAndExpensivePrototype(this);
}
}



As the above example shows the expensive routines used to load the prototype from the DBMS are called once when the prototype object is created and added to the prototype factory's list of prototypes. After that any attempt by the factory to clone and get a new copy of the prototype would result in the private constructor initializing the new object based on the original prototypes values.

Another language mechanism that one needs to consider when using this pattern is the ICloneable interface in .net. This interface is the equivalent of the IPrototype interface in the pattern's original structure. The ICloneable interface has been so closely modeled over this pattern that even the only method available in the interface is called Clone(). So the basic interface has been standardized in the .net platform (and similarly in Java) but the major decisions regarding the usage of the pattern still remain for you as the designer to make. To summarize here are the things you have to decide on:



  1. Is the usage of the pattern only going to be limited to creating new objects or are we going to create a new object and perform some sort of initialization based on the original prototype.
  2. If we are going to copy data then is it going to be a deep copy or a shallow copy (If you don't know what a deep copy or shallow copy is I'll be discussing it soon in the next post; see here)?
  3. If the usage of the pattern is only going to be limited to creating new objects can we afford to use reflection (is very high speed an issue: then don't use reflection)?
  4. How will the factory select the prototype (so far we have assumed an integer index)?

By answering the above questions you have basically determined the variation of the prototype pattern that you need to develop.

So far we have assumed that we are going to pass an integer which will map to an item in an array when calling the create method. But in many real life scenarios this isn't the case. The abstract structure of the pattern assumes that the Create method will receive a 'key' which it can translate to an appropriate prototype. For example the prototype key might be strings (and we will look them up using a Dictionary or a Hashtable) or other types of objects. So in deciding to use the pattern one of the major concerns is that there has to be a method by which the prototype factory can select the prototype objects.

In all of the above examples we have assumed a singleton for the prototype factory but there are also other situations where this pattern is useful and you can't implement the prototype factory as a singleton. For example suppose you need to configure a sub-system with a set of prototypes to use, basically you would create the prototype factory, configure its list of prototypes and then pass the prototype factory to the sub-system. The sub-system will use the factory it receives to create the objects that it needs. What the sub-system uses and how many prototypes it gets are in the calling system's power.

Now we have already seen a code example that uses the IPrototype interface but here is one that will use reflection. In this case we don't really need a base class or interface to inherit from since we don't need a Clone method to call all we need is the prototype factory to create the objects we need using reflection. Again for simplicities sake I'm going to assume that it's a singleton but you can extend this example to situations where the factory isn't a singleton object:


class PrototypeFactoryWithReflection
{
private PrototypeFactoryWithReflection()
{
Configure();
}
public static PrototypeFactoryWithReflection Instance
{
get
{
return _instance;
}
}
private static PrototypeFactoryWithReflection _instance = new PrototypeFactoryWithReflection();

private List Definitions = new List();
private void Configure()
{
Definitions = //load the definition list from a config file or a similar location.
//note that we are expecting these definitions to be in the form of “dll name, class name”
}

public object Create(int idx) //again for simplicity we are assuming an index key but this can also be extended
{ //error checking code omitted for brevity
string def = Definitions[idx];
string[] parts = def.Split(‘,’);
Assembly asm = Assembly.Load(parts[0].Trim());
return asm.CreateInstance(parts[1].Trim());
}
}

As mentioned before the above implementation is using reflection to simplify the whole process and it has the advantage of not having a base interface or abstract class therefore meaning you can apply it to classes that already exist and don't inherit from IPrototype (ICloneable) but as mentioned before reflection is a bit slower then directly calling the 'new' keyword and if performance is a major issue you are better off implementing this via the standard definition of the pattern.

Monday, May 28, 2007

Singleton: Ammendment 2

I've found an intersting question regarding the Singleton pattern that might be worth examining: "What if the class that is supposed to become a Singleton doesn't have parameter less constructor?"
(If you need to read explanations on the pattern read here and here)
Hmmm! No why would you have a singleton where its constructor accepts parameters? It kind of sounds useless since no one is supposed to create instances of the singleton class.
But just for the sake of argument lets assume we have a singleton class with such properties: for example we've decided to create a log helper as a singleton. It's a class that's going to be used all over our program and anyone who needs to log something will send it to this class. The log helper would take care of writing the entry to all the different log output devices. Again for argument's sake let's assume that we have created the constructor of this class so that it accepts some settings in regards to how it's supposed to behave (don't ask why!):


class LogHelper
{
/* Logging method implementations */

private LogHelper(some-configuration-value)
{
//use the some-configuration-value or store it somewhere
}

public static LogHelper Instance
{
get
{
return _instance;
}
}
private static LogHelper _instance = new LogHelper(/*how to provide the param?*/);
}


As you can see the big question is how do we provide the parameters to the constructor. Well the issue here is a little bit of a bad design: WHY would you want to pass something to the constructor of the class your designing when no one can actually call the constructor but yourself? Weren't you better off just extracting the parameters or config values or what ever it was supposed to be passed to the constructor in the constructor itself? That would be a better implementation.

Now the above example is very simple (and stupid) but there are situations where a constructor with parameters could help a Singleton design! Surprised? Well here is a situation:
Suppose we were building a class called PrintManager, this class is going to manage the queue of documents that are waiting to be printed to a specific printer and it is also going to handle the printing operations (don't really care about that part). The only thing important about this design is that the PrintManager is going to be a multi-instance singleton meaning that multiple instances will be available, each instance managing a specific printer. Now for simplicities sake we are going to assume that all the printers are the same (one class can handle them all) but each printer is on a different port so when we create the printer object we need to tell it (via the constructor) on what port to print to.
Even in this case the code responsible for creating the PrintManager object is still part of that class so if the PrintManager object has no parameters in the constructor or has 10 parameters it doesn't really affect anyone and it's an internal thing. See the code below:


class PrintManager
{
private int Port;
/* printing & queue management code */

private PrintManager(int p)
{
Port = p;
}

public static PrintManager GetInstance(int forPort)
{
lock(Instances)
{
if (Instances.ContainsKey(forPort) == false)
Instances.Add(forPort, new PrintManager(forPort));
}

return Instances[forPort];
}
private Dictionary Instances = new Dictionary();
}


As can be seen in the above code the internal code of the PrintManager class has to deal with the constructor so again having parameters or not having them doesn't really matter.

Thursday, May 24, 2007

The much debated Singleton: Part II

As I mentioned in my previous post regarding the singleton pattern I've seen a lot of talk on different blogs/forums regarding the evils of the singleton pattern. I've tried to cover the general concepts of where this pattern should be used and where it should be avoided in the previous post I made regarding this issue (read here) and in this post I'm going to quote a couple of the stuff that I've found and post my reply to them in here. So here goes: My responses in bold.

The items that I found on http://c2.com/cgi/wiki?SingletonsAreEvil

  • Someone had mentioned that when you use the singleton pattern you suddenly have two types of objects:

    "those that can be instantiated in a standard fashion and those that cannot be created at all. I would personally rather use a container which governs the number of a given object that can exist in a system and acquire the objects from the container"

    We already have a lot of different types of objects that can be instantiated (or not instantiated) in different ways this isn't a particular problem with the singleton pattern but quite the opposite, the singleton pattern is using one of these different types of language mechanisms to obtain what its aiming for (the fact that no one should be able to instantiate the object unless they go through the single point of access). As an example we have interfaces, abstract classes, sealed classes, protected & private constructors, static classes and a whole lot of different types of classes that some you can instantiate in the regular way (using new) and some you can't (and some you are not allowed to instantiate period; e.g. the abstract class). Anyway this isn't a really good argument against the use of the singleton pattern.

  • Someone else had mentioned this:

    "Singletons usually are used to provide a single point of access to a global service. I always make the singleton separate from the class itself so the class can be used any way you want. The singleton can then use the class. The singleton also doesn't have to instantiate the object"

    One of the points of the singleton pattern is to prevent accidental or purposeful creation of the singleton object, that's why it has a private constructor so one can create it except for the singleton class itself which knows when and where to create it. In regards to the second part I have seen C++ code that uses the "friend" keyword to have a separate singleton factory than the singleton object itself but I really can't see any use for it. You need a single point of access you also need a class that can be instantiated in a controlled manner why are you making your design complex and the programmer using your design confused by making him deal with (and understand) two classes (albeit one being very small and simple). I guess you might come up with very rare cases where this would be useful but in general I think sticking with a singleton class that also provides the single point of access is the best way to go.

  • A couple of people where mentioning coupling issues with singleton class names and/or polymorphism issues with the singleton:

    "Subsystem's coupling to Singleton's Class Name" or "Singletons aren't polymorphic"

    Let's make something very clear singletons are polymorphic (please read the section regarding polymorphic singletons in this post) and let's also focus on something else the whole reason we are using a singleton as opposed to a global variable (or static member) is because it allows us to deal with the single point of access issue through an object oriented fashion. It allows us to have all the OO design techniques, design patterns etc. at our disposal when we are dealing with a singleton just as we would when dealing with other classes & objects. So all general OO design techniques and features apply to a singleton.

    Regarding the first argument: your different subsystem's are going to be dependent on some single point of access (because you've made that design decision) now it's a lot better that you implement it using the polymorphic singleton pattern just to make sure that they are only dependant on an abstract class (the singleton being that abstract class) not on the actual implementation.

  • An amusing one:

    "Doesn't a C# static class, with its static constructor, provide all the functionality you would ever need from a singleton?"

    OK I think this guy has missed the whole point of the singleton. If you use a static class that means all member variables/methods are static and then you might use the static constructor to initialize some of these member variables, BUT this isn't object oriented programming this is a feature simulating the procedural way of design & implementation. Why have the designers of the C# language decided to include it there? Well one reason being the fact that you will need utility classes (methods) that are pure static classes and you don't need a real object to respond to the calls (another design decision) but a utility (or pure static class) is just a bunch of methods but a singleton is a full blown object instantiated from a class the only difference being that only one instance (or a limited number of instances) exist of it.

    If you think this argument is correct you should try to first understand the singleton pattern and then come back and re-read what the argument is saying!


These items I've found on http://www-128.ibm.com/developerworks/webservices/library/co-single.html

  • "There is one implementation anti-pattern that flourishes in an application with too many singletons: I know where you live anti-pattern. This occurs when, among collaborating classes, one class knows where to get instances of the other."

    This will be a problem when you overuse the singleton pattern, but correct usage of the pattern especially in places where there is a strong domain reason or a legitimate technical requirement backing its existence but otherwise over-use of the pattern will cause the above mentioned problem.

  • "Where's the harm? Coupling among classes is vastly increased when classes know where to get instances of their collaborators. First, any change in how the supplier class is instantiated ripples into the client class. This violates the Liskov Substitution Principle, which states that you should allow any application the freedom to tell the client class to collaborate with any subclass of the supplier. This violation is felt by unit tests, but more importantly, it makes it difficult to enhance the supplier in a backward-compatible way. First, the unit tests cannot pass the client class a mock supplier instance for the purposes of simulating the supplier's behavior, as described above. Next, no one can enhance the supplier without changing the client code, which requires access to the supplier's source. Even when you have access to the supplier's source, do you really want to change all 178 clients of the supplier? Weren't you planning on having a nice, relaxing weekend?

    Collaborating classes should be built to allow the application to decide how to wire them together. This increases the flexibility of the application and makes unit testing simpler, faster, and generally more effective. Remember that the easier it is to test a class, the more likely a developer will test it."

    Again I think this a case of misuse of the pattern. Let's straighten one thing out the fact that a singleton is a class that by default you're not suppose to instantiate doesn't mean that it can't have sub-classes and doesn't mean that replacing it with "stub" classes for the purpose of testing would be impossible. I feel I'm repeating myself too much but a polymorphic singleton especially when it's combined with other patterns such as bridge or strategy it can give you all the flexibility in the world and doesn't affect testing or any other type of pluggable behavior. Completely loosely coupled!

  • "To decide whether a class is truly a singleton, you must ask yourself some questions: (1) Will every application use this class exactly the same way? (exactly is the key word) (2) Will every application ever need only one instance of this class? (ever and one are the key words) (3) Should the clients of this class be unaware of the application they are part of?"

    I don't agree with the rules set above although they might be very helpful in situations where there is going to be only one instance of the singleton but as Gamma mentions in his book a singleton might be polymorphic (therefore question 1 is not really the case if you need a polymorphic singleton), a singleton might be designed because we need a controlled number of objects (therefore question 2 is not true) and I don't understand question 3! J

This item is from http://www.sitepoint.com/forums/showthread.php?t=386447

  • "It's hard coupling. Singletons are essentially static functions - the difference is syntactical sugar. Whenever you reference a static symbol in your code, you make it more dependent on the context. It's really the same as with a global variable - A global variable is bad because it has a global scope. A static function is also global in scope, so it has the same problems associated. The difference between the two, which accounts for the assumption that singletons are slightly better than global variables, is that functions are more abstract than variables, and this gives you some flexibility. This flexibility can then be used to excersise some damage control. Still it can't beat narrowing the scope, which is what objects provides."

    Hard coupling!! First time I heard that (I think he meant tight coupling). Anyhow this description is very good, and especially relevant to misuse of the pattern. If you use a singleton as an excuse to be lazy and not think about your design, if you use a singleton as an excuse to make everything global then yes there is no difference between a singleton and static members BUT if you use the singleton where it's supposed to be used it will be a whole lot more than "syntactical sugar." Plus we are forgetting that part of using a pattern isn't just because it's implemented with for example a static member or it isn't, but it's the whole knowledge that comes packed up with a pattern and is transferred to the next person (read this post). The above comparison is like saying using functions & procedures is "syntactical sugar" to using a goto statement because they are eventually a jump to a piece of code and a jump back! But that's not the case, functions and procedures provide a higher level of abstraction, so do classes, interfaces etc. which provide an even higher level of abstraction. Patterns are pretty much the same thing they too provide a higher level of abstraction albeit being implemented using mundane "static" members or similar constructs of a lower level of abstraction.

Items from http://home.earthlink.net/~huston2/dp/singleton.html

  • "It's relatively easy to protect Singleton against multiple instantiations because there is good language support in this area. The most complicated problem is managing a Singleton's lifetime, especially its destruction ... There are serious threading issues surrounding the pattern."

    Who says we don't want multiple instantiation of a singleton? There are cases that we need multiple instantiation (but controlled number of instances). I totally agree that in C++ and other languages that don't have garbage collection managing the lifetime of a Singleton or a lot of other patterns could be hard but not impossible. Yes threading could also be a serious problem to consider, but with singletons there are usually two scenarios that occur: you either have a singleton instance per thread then you have a multi-instanced singleton which returns an instance based on the calling active thread or you have on instance of the singleton replying to all threads which means the class must be thread-aware.

  • "The Singleton design pattern is one of the most inappropriately used patterns. Singletons are intended to be used when a class must have exactly one instance, no more, no less ... frequently use Singletons in a misguided attempt to replace global variables ... A Singleton is, for intents and purposes, a global variable. The Singleton does not do away with the global; it merely renames it."

    One instance and no more? I don't agree and if you read Gamma's book you'll see that it's not what he intended when he created the pattern. I agree that singleton is usually misused and it is not supposed to fix global variable problems (i.e. common coupling) but the advantages it provides as opposed to using a global variable are tremendous. Read my post on this issue here & here.

    "When is Singleton unnecessary? Short answer: most of the time. Long answer: when it's simpler to pass an object resource as a reference to the objects that need it, rather than letting objects access the resource globally ... The real problem with Singletons is that they give you such a good excuse not to think carefully about the appropriate visibility of an object. Finding the right balance of exposure and protection for an object is critical for maintaining flexibility."

    I agree completely! Very well said.

  • "Our group had a bad habit of using global data, so I did a study group on Singleton. The next thing I know Singletons appeared everywhere and none of the problems related to global data went away. The answer to the global data question is not, "Make it a Singleton." The answer is, "Why in the hell are you using global data?" Changing the name doesn't change the problem. In fact, it may make it worse because it gives you the opportunity to say, "Well I'm not doing that, I'm doing this" – even though this and that are the same thing."

    Very well said.

Abstract Factory: Part II

In my previous post regarding the abstract factory pattern I discussed some basic concepts regarding this pattern and we went all the way up to the combination of an abstract factory and a polymorphic singleton to achieve the best encapsulation and reuse of the abstract factory code. In order to be able to fully maximize this feature we also need to talk about how to package the different classes in the abstract factory pattern so that packaging will not hinder future growth.

My discussion here regarding packaging is going to be completely based on Microsoft .net technology but the same basic concepts can be applied to any other technology that you might be using:

At the fines grained level of packaging of the abstract factory implementation you need to place all the abstract definitions that the client code needs into one assembly (i.e. the AbstractFactory class and the product interfaces) and then each product family alongside its factory class should be packaged into its own assembly. So coming back to the data access layer example of the previous discussion we had created this design (please see the last example in my original abstract factory post for the polymorphic singleton implementation added to the AbstractFactory class):



Since in the above design we have two product families (the Oracle & the SQL Server families) then we would need three assemblies to package all these classes:

  • The CommonDAL assembly will contain AbstractFactory, IWarehouseAccess, IDeliveryAccess & IProductAccess.
  • The SQLDAL assembly will reference the CommonDAL assembly and would contain the SQLFactory, SQLWarehouseAccess, SQLDeliveryAccess & SQLProductAccess.
  • The OracleDAL assembly will reference the CommonDAL assembly and would contain the OracleFactory, OracleWarehouseAccess, OracleDeliveryAccess & OracleProductAccess.

The client code would only need to reference the CommonDAL. Using that reference the client code would call the AbstractFactory (since it's a polymorphic singleton it will be our single point of access), obtaining an appropriate interface that it needs and then call that interface to do whatever is necessary:

IWarehouseAccess w = AbstractFactory.Instance.CreateWarehouse();

w.SaveWarehouse(…)

As you can see the client code only needs to physically reference the CommonDAL, obviously the assembly containing the product family that has been configured for current use must be available (either in the current execution directory or in the GAC), since the AbstractFactory's polymorphic singleton implementation is using reflection to load the configured product assembly, but the only compile-time dependency between the client code and our assemblies will be to the CommonDAL.

This will also give us a very nice feature. Once you design your software based on the above pattern (or combination of patterns) you are able to deliver only what the client needs and no more. If you have a client that will be using Oracle you don't need to deliver the SQL product family assembly at all and there will be no dependencies on it either. Once a client using Oracle for example decides to switch over to SQL you can easily send them the SQL product family assembly (SQLDAL), ask them to copy it into the runtime directory and change the configuration of the app to point to this assembly instead of the previous one. This also brings us to another neat capability: if in the future you decide to store your information in another format (for example let's say as an XML file) all you have to do is implement the XML family of products (XMLWarehouseAccess, XMLProductAccess & XMLDeliveryAccess) & create a factory for them, package them up in one assembly and ship it to the client. As you can see the original binary code available on the client's machine doesn't need to be touched at all. All you are doing is providing a pluggable new implementation that can be copied alongside the current binary implementation on the client's machine and be used immediately.

Some notes on the Abstract Factory pattern:

  1. The abstract factory pattern creates an extra hierarchy of classes (the factory classes) that need to be maintained alongside the actual products. This extra hierarchy in some designs might be disliked; in other cases might be thought as the cause of extra complexity. So the use of the pattern should be justified with the need to change/replace/add product families in the future and the other features that the pattern provides.
  2. This pattern can be easily replaced with a full reflection based implementation but is it worth it from a performance stand point? I have seen similar implementations of the pattern where the factory hierarchy was removed and replaced with one class that will create the needed product by using reflection (similar to what we are doing in the polymorphic singleton's CreateInstance method) but the main difference would be that the code that I have provided in the CreateInstance method is only called once per program execution while a reflection based implementation of the factory class would cause the reflection routines to be called every time an object is created (this could be very expensive). As a rule of thumb if your factory class is going to be called in the order of more than 50 times a second then don't go with a reflection implementation as it can impact performance a lot.
  3. Don't ever use the abstract factory pattern in scenarios where the client code would need to create an object from product family A and then a little bit later from product family B, this pattern hasn't been designed for such a problem. I'll describe a good solution to this issue once I describe the Bridge pattern.
  4. In some cases an abbreviated version of this pattern would work a lot better. This abbreviated version is a pattern all by itself and it's called the "Factory Method" pattern which I will describe in later posts.

Wednesday, May 23, 2007

The much debated Singleton

After creating my post about the Singleton pattern I noticed a lot of hype around this pattern and its use (or misuse) over the internet. I also found some very interesting and sometimes amusing posts/comments on the matter (which I've included at the end of this post). Interestingly there was a lot of negative content about this pattern and the side effects that it has on your design. Intrigued by all of this information I decided to write my version of why & when you should use the Singleton pattern and also try to answer a lot of misconceptions that I read about during the past two weeks. So here goes:

The original thought behind the conception of the singleton pattern is to provide an object oriented approach to alleviate the pain of common coupling (global data) and provide a different strategy when dealing with global data. Also the pattern was intended as a solution when you absolutely needed to limit the number of objects of a certain type (class). Note that the singleton pattern will not totally remove common coupling from your design and to solve the problem completely, in most cases, requires the designer to completely re-think their design. Let's understand common coupling and it's evils before we delve any deeper:

Common coupling is a situation where function/component/class/module A & B are dependent on one another (coupled together) through the existence of some sort of external (and shared) data area between them in which one can change the content and the other will read from it (or both will read and change). Now the two coupled entities might be calling each other or might not be using one another directly at all. It doesn't matter; they are coupled to each other due to their shared use of the common data area. The perfect example being global data shared between multiple functions/components/classes/modules etc. (just for the sake of brevity I'm going to refer to all these different sections of code that coupling might relate to as components from now on).

Why is common coupling or global data bad? Well let's start by answering this question: why do we want loose coupling between our components (as opposed to tight coupling)?

  • When multiple components are tightly coupled we lose the chance for component reuse: If we want to reuse component A we must also take component B for the ride, because component B changes the global data and if the global data doesn't contain a certain state we will not be able to get the functionality we need from component A.
  • Due to tightly coupled components we increase our cost of maintenance and tests. When something changes instead of only needing a replacement of component A's implementation (which relates to the global area), since other components are tightly coupled to it we need to change them as well. Once we change component A and test it we must also test the other components tightly coupled with it.
  • When there is too much coupling we reduce the probability of independent design, coding & testing therefore creating a tougher environment as far as technical & project management is concerned.
  • Too many tightly coupled components mean a bigger thing to understand: more complexity.

As we can see common coupling (which is a level of tight coupling) can become a major reuse hindering factor, especially in cases where future change and evolution of the system is expected. How many times have we all designed something using global data only to find out in later versions that the data that we assumed (wrongly assumed!) to only have one running instance in the system needed to actually have multiple running instances. For example suppose we were to design a word processor similar to WordPad that ships with Windows. One of the immediate design decisions that come to mind when designing this word processor is to keep the current document information in a global variable where all the different routines of our program can access it and so we don't have to pass it around (common coupling). Making this design decision will cause many problems down the road but let's just examine two:

  • Components that we design (i.e. the spell checker) all depend on the existence of a globally available document to do their work and their reuse is hindered by this fact when we need to make use of them in a different project. As an example suppose we need a spell checker to check product information before it was saved to the database. If I had had the foresight to design my spell checker as an independent component that accepts what it's supposed to check as input (a rather abstract form of input but never the less an input not a global data area) and performed the checks on the input (as opposed to a global data area), I wouldn't have had this problem when the time came for its reuse.
  • If in the future the stakeholders decide that we are going to change our SDI (single document interface) word processor into a fully fledged MDI (multiple document interface) application where the user can open and work on multiple documents simultaneously we are going to be in trouble. Again we would need to rethink our single point of access because now we have multiple documents being loaded and it no longer makes sense to have one single document in memory where everyone will access and use.

Having said all this, one can't deny the fact that using global data in some cases is absolutely essential. But the scenarios where use of global data as a design decision is correct and invaluable are rare and will only occur once or twice in a moderately sized program. So the big problem is detecting when you are to allow a global shared piece of data and when not to. Chances are that in most cases you shouldn't allow it and we'll try to figure out when to do this and when not to, but before we get into that let's get back to the singleton pattern.

Based on the discussion so far the singleton pattern is useful when their exists enough reasons to share something as global data in your program, at that point instead of sharing it as global data (or a static member variable of a class) you should expose it as a singleton. This way you have more of a control over how that piece of global data will be accessed and used and in the future you can maintain it a lot easier (since singletons are regular objects instantiated from classes) by using OO techniques and other design patterns. You can also make sure that only one instance of that global object exists and no one (on purpose or through a mistake) can create a second instance which could have dire consequences (depending on your object's role in the system).

OK now we can go back to the definition of the singleton pattern and figure out when to use it:

The singleton design pattern is used in places where you want to ensure that only a single (or a controlled number) of instances of a specific class exist. The client code using the object shouldn't be allowed to create new instances (since it doesn't make sense to create new instances) and all of the different areas of client codes should access that single object through a well defined access point.

As can be seen nowhere in the definition of the singleton pattern (in any documentation about the pattern) is there a reference to use a singleton whenever you feel like using global data! Or use a singleton whenever you're lazy and don't feel like thinking about your design so you just expose some object as a global object and let everyone in all the different layers of your software access it through the single point of access. Or any other similar excuse for singleton's (mis)use. In order to use a Singleton correctly one cannot break other design rules (such as Common Coupling) a singleton does not justify having common coupling in your design!

So where can we use a singleton? Well there are situations where hardware or some other fact from the domain or a non-functional (technical) requirement dictates the use of the singleton pattern. A perfect example is the polymorphic singleton that I've used in the Abstract Factory example so that we have one single point of access to our abstract factory and that single point is configured with one of its subclasses. The technical requirements of that design imply that throughout the execution of the software the singleton will only contain one object that never changes. Other such examples would be when you are managing one piece of hardware (i.e. a modem, printer, etc.) and even in this case it is always wise to think about the consequences of your design decisions if we end up supporting more than one of those hardware pieces (i.e. two modems at the same time). But even for places where we have a controlled number of hardware (multiple modems) we can use the multi-instance singleton (please read my post on the singleton pattern) to fix this problem.

Another perfect use of the singleton pattern is to make sure that from a specific class we only get one object per thread. An example is the Connection Manager (which in some cases we might need more than one per thread but for the time being we'll assume one per thread will do) in this case the singleton pattern is there to make sure that nobody creates an extra copy of the ConnectionManager and cause problems in the way we communicate with the DB but the singleton itself might have multiple instances (actually one for every active thread that is running) so code in each thread will get a different ConnectionManager object hence making sure that one thread wouldn't use another thread's database connection.

There are a lot of perfectly legitimate reasons to use a singleton and I've only touched on a couple of them here but to round things up I believe the following list will help make an informed decision. As with any engineering design decision you as the designer have to consider the tradeoffs and you will ultimately decide if the singleton will help you out in a specific scenario but you should also take care not to misuse it since its negative effects on your design (especially form a reuse point of view) can be disastrous:

  1. Is the candidate object a truly single object from a problem domain point of view? Will it always be that way (considering future growth of the software)? If yes by all means implement it as a singleton.
  2. Will the candidate object becoming a singleton cause the common coupling problem (as described above)? If so be very careful you might be better off not implementing this as a singleton. If it doesn't create a common coupling problem (as is the case with many technical requirements which point us to a singleton and won't create a common coupling problem) then a singleton would be a lot better than a global variable (or a static member variable).
  3. Is the data stored in the singleton object being used from multiple locations in your software and a change from one location will be picked up in another location? If this is the case go back and think really seriously about item #2 in this list, since it looks like your causing a common coupling problem. If it's not a common coupling problem but the question still holds true then ask yourself are the multiple locations that are going to be accessing this singleton located in different layers of the software possible deployed in different processes or across machine boundaries? Will they ever be deployed like that? A nice example of this is current user information. Suppose we are building a multi-layered distributed windows application. The UI layer authenticates the user and puts the current user credentials in a singleton object hoping that one of the back end layers (for example the DAL) will be able to pick it up and use it. In this case your code will work when you're testing it and everything is running in-process but will immediately break once you deploy it to the live environment where the UI resides on client machines and the DAL resides on a server. Why? Because singletons don't automatically get marshaled across process/machine boundaries. Actually that's the case for any global (static) variable and you would have to come up with some solution to get them across. In this kind of a scenario a singleton might work but you need extra effort and code to make sure the data it's holding gets marshaled across to the other process/machine.
  4. Does the problem domain consider a class to only have one instance, for simplicities sake or is that an immutable fact in the domain? For example suppose we're developing a financial accounting system and in this system we decide to create a singleton called OwnerCorporation: The OwnerCorporation is considered a singleton since the financial accounting system will be sold to one corporation and they will use it to keep a record of their financial records. But this decision (having only one OwnerCorporation object) has been made for simplicities sake it's really not true in the real world and the moment this app has to support other different scenarios in the real world this design decision will cause it to break. For example suppose down the line we decide to sell the app to an accounting firm where it will be used to track the financial accounts of multiple corporations hence needing multiple instances of the OwnerCorporation class. Or in the future we decide to sell our financial accounting services online and the same back-end routines need to be used to support multiple owner corporations each accessing the site to manipulate their specific information, again implementing the OwnerCorporation as a singleton will cause headaches in this scenario.

In general you should pay attention to the fact that some object might seem like a singleton from a certain view point but from a different viewpoint it's not a singleton at all (as in item #4 above). This is the case with many other design decisions that we have to make and the cost/time needed to design a perfect scenario that will fit all possible viewpoints is usually so prohibiting that most designers don't really bother. The same case is true for the singleton pattern BUT the issue that should be stressed time and again is that the singleton is not a valid replacement for good (or decent) design or we shouldn't use the singleton as an excuse to create global data shared by everyone and therefore creating common coupling in our system.

I know I promised to review some posts & comments that I found on the internet here but this post is very long as it is so I'll work on it in the next post.

Monday, May 21, 2007

ASP.Net Blog

I'd like to introduce an interesting ASP.Net blog written by a good friend of mine. Enjoy.

Abstract Factory

Abstract Factory is also a highly used pattern throughout the initial design & architectural design steps of a project. Whenever you are faced with supporting multiple competing technologies in a software design and the possibility of switching between these different implementations one of the first patterns that should pop into your head is Abstract Factory. For formal definitions of this pattern please read here or refer to the Gang of Four book.

Let's start by trying to understand where this pattern would be used. A classic example is when you are developing an application that is supposed to run on different DBMS technologies and based on a setting or a configuration that the client will set at install time (for example) one of these implementations will be chosen and used throughout the code. But the problem doesn't end there, you're also supposed to support the future expansion of these implementations without the need to touch old code (or even re-compile it). Say for the time being our requirements state that we must support SQL Server 2000 as one of the DBs but we should also support Oracle 10g. And in the future we should also support other data stores for example an XML based data store or Microsoft Access, DB2, etc. etc.

What we want to do is try to design our software in a way where we are absolutely sure that in the future any new data store that comes along can be plugged into our application and our application can start reading/writing to that data store. By now some of you might have decided to use OLEDB or some other similar technology independent method to access the data store, but let's make things a bit more complex, when we are supposed to use a specific type of technology we want to write native code accessing that specific DBMS, plus some future data store might not be supported by our "universal" method of data access (i.e. OLEDB might not support it).

In situations like these using the abstract factory pattern can be really handy. Let's summarize the criteria that should apply when you decide to use the abstract factory pattern:

  • We have different ways of doing something (different implementations) that are similar to the eye of the client code looking at them at a specific level of abstraction.
  • We need to support all the different ways that we know of right now but we also need to have a mechanism to support future methods (implementations).
  • The use of one of the technologies is usually bound to an execution or a context in the execution. In other words we aren't going to be switching implementations on the fly or have some part of our code running on one implementation while the other is running on another.

Before we delve any deeper into this pattern let's also consider where it's a bad idea to use abstract factory:

  • I have a special dictionary object that is supposed to use a simple array when the items inside it are less than 10 but use a hash table when there is more than 10 items. In this example we have two different implementations (array & hashtable) and were going to switch back and forth between them based on some runtime criteria. This is definitely not Abstract Factory.
  • We are creating a payroll system and we are designing the main pay calculation algorithm. Now depending on the employee type (whether it's a fulltime employee or a contract based consultant or some other criteria) we have to calculate employee's weekly pay in different ways. Again a bad use of abstract factory because we have multiple implementations that are all used simultaneously when calculating the salary for a list of employees.
  • We are building an application that is supposed to save some information to a SQL Server DB and at the same time save it to log file and print it out. If we consider saving to the DB, saving to the log file and printing all different implementations of an abstract Save we might again be tempted to use Abstract Factory but again since we are using all these implementations at the same time Abstract Factory would be a bad design choice.

So now that we know when to and when not to use Abstract Factory let's get into some details:

First let's review some terminology:

  • Abstract Product: an abstract definition of some concept we want to implement in different ways
  • Product: a concrete implementation of that abstract product with specific technology, algorithm, etc.
  • Family of Products: a group of different products all implementing based on the same technology/algorithm.
  • Abstract Factory: an abstract definition for creating all the possible abstract products that we might need.
  • Factory: a concrete implementation of the abstract factory based on a specific family of products.

The Abstract Factory pattern:



From the diagram above it's obvious that the client code once it has been given an instance of AbstractFactory will use it to create any product that it needs. The products created will be returned as an abstract interfaces so the client code will not be dependent on a specific implementations and switching an implementation (or adding a new one) for the client code would only mean some other object (with the same interface) in the reference that the client code holds. So assuming that we have an instance of AbstractFactory initialized and globally available than the client code will look like this:

AbstractFactory af = (some initialization that has occurred somewhere; doesn't matter for the time being)

Client Code:
 af.CreateProduct1().DoSomething();
af.CreateProduct2().DoSomethingElse();


As it is obvious from the above code the client code is not dependant on any of the specific implementations (i.e. the A family or the B family) and as long as we can keep the client code independent of how the instance of the AbstractFactory is created we can provide different AbstractFactory implementations and get different product families to work.

For a better understanding of the terminology take a look at this diagram:

OK let's turn these abstract definitions to some concrete examples:

Let's go back to the DBMS example and assume that we want to be able to manipulate three different concepts in our database: a Producer, a Warehouse and a Delivery. So these will be our different products and we have to create abstract definitions for each product type which will provide us with abstract methods (or services) that we need from each product. For simplicities sake let's assume we want to be able to Save & Retrieve by an id the business objects of each Type using this pattern. So we would be looking at a design like this:




For the sake of brevity I haven't shown the names of the functions implemented in the sub-classes, but in reality the IProductAccess, IWarehouseAccess & IDeliveryAccess are all interfaces and the AbstractFactory class is an abstract class whose main methods are all abstract (the reason for the AbstractFactory being an abstract class and not an interface will be revealed shortly).

In the example above all the native code accessing the Oracle database would go into the Oracle family of classes (OracleProductAccess, OracleWarehouseAccess & OracleDeliveryAccess) and the same goes for the SQL family of classes. As you can see the implementations are completely separated out and each class focuses on its family's way of implementation for the product that it's realizing.

The implementations in the concrete factory classes are very simple too. These classes only have to create objects from their own family that match the product type required so for example the SQLFactory class's implementation would look like this:


class SQLFactory : AbstractFactory
{
public override IProductAccess CreateProduct()
{
return new SQLProductAccess();
}
public override IWarehouseAccess CreateWarehouse()
{
return new SQLWarehouseAccess();
}
public override IDeliveryAccess CreateDelivery()
{
return new SQLDeliveryAccess();
}
}

A very similar class would exist for the oracle factory class creating products from the Oracle family.

One of the big questions that are usually asked when people are learning the abstract factory pattern is how we provide access to the single instance of the abstract factory we want to expose. So far our discussion has revolved around the fact that somehow the client code will have access to an object based on the AbstractFactory class and that object will always be initialized and ready. Well for a very simple example we can assume that we have a static member (global variable) somewhere in our code that during program initialization based on some setting will be filled in with one of our concrete factory classes. So for example:


class Program
{
public static AbstractFactory AF;

private void Configure()
{
int setting = GetDBMSTech();

if (setting == 0)
AF = new SQLFactory();
else
AF = new OracleFactory();
}
}

In the above example we have assumed that a function called GetDBMSTech() exists that will read the needed settings from wherever we have stored it. Also lets assume that Configure is called during program initialization (i.e. the Main method). Now with the above code the client code can use the AF variable in any place that it needs to create the objects that it requires.

Obviously the above code doesn't conform with any design/programming good practices. One of the major problems is that the AF variable is accessible to everyone (even to change it) therefore one piece of the client code can reset the AF or change the factory object (or even set it to null). Another problem is the fact that we have hard coded the family types that we are going to support. If in the future a new family type (let's say an XMLFactory) where to be created the above code would have to be changed.

The best implementation of the Abstract Factory method so that it's elegant & compatible with new families of products introduced in the future is a combination of the Abstract Factory with the Singleton pattern (to be more specific with the polymorphic singleton pattern). Combining these two patterns provides an easy access to the single instance of the abstract factory that we need plus a configurable way of plugging in new families of products. Let's consider this example:


class AbstractFactory
{
public abstract CreateWarehouse();
public abstract CreateDelivery();
public abstract CreateProduct();

protected AbstractFactory() { }

public static AbstractFactory Instance
{
get
{
return _Instance;
}
}
private static AbstractFactory _Instance = CreateInstance();

private static AbstractFactory CreateInstance()
{
Assembly asm = Assembly.Load(ConfigurationManager.AppSettings["AFAssembly"]);
return (AbstractFactory)asm.CreateInstance(ConfigurationManager.AppSettings["AFClassName"]);
}
}

Now the above abstract factory is providing a single point of access to a single object created based on configuration stored in the app.config (or web.config in a web app). This abstract factory will provide the basis for pluggable product families without any need to change previous code.

In the next section regarding the Abstract Factory pattern I will talk about how to package your abstract factory classes to maximize their reuse and some other little issues left to discuss.

Thursday, May 17, 2007

APAC SharePoint Conference

It was a very enjoyable experience both from the technical perspective and the lunch, dinner & party. From what I gathered areas such as enterprise search, BDC and reporting based on BDC content & data and enterprise portal are cool new features of SharePoint 2007 in which Microsoft has invested heavily and I think we'll see a lot more on these features as the product matures down the road.

Tuesday, May 15, 2007

APAC SharePoint Conference: Day 1

Just got back from the conference. Awesome stuff especially on BDC, enterprise search and office integration. SharePoint 2007 is really shaping up to be a concrete infrastructure for enterprise level software development with a huge amount of features out of the box. As it is with any product it also has a long way to go to become mature in different aspects but what is currently offered in BDC (business data catalogue) & enterprise search will immensely simplify portal & enterprise intranet development even when collecting and retrieving data from legacy apps.

Will try to blog some more on the topic tomorrow once the conference is over.

Monday, May 14, 2007

APAC SharePoint Conference

Hello everyone,

I've been really busy preparing for the APAC Microsoft SharePoint Conference 2007 (the company I'm working for has a presentation at the conference), getting all the demos and technical stuff out of the door. Hopefully it will be an interesting & rich experience. To all whom are attending hope to see you there.

Anyhow the conference has kind of set me back on my regular posting to this blog but hopefuly after the conference I'll be back with some conference news and some new posts regarding design patterns. I've already got an Abstract Factory post in the making which will have some interesting & neat features to compliment your patterns knowledge and experience.

Wednesday, May 9, 2007

Singleton: Ammendment 1

An excellent issue was raised by Farzad in his comment to my previous post (Singleton) regarding the fact that when you have a multi-instanced singleton and the singleton is going to decide based on the current thread what object to return it's better to store it in the thread using SetData & GetData.
I absolutely agree with Farzad since this will save us from the hassel of clean up issues. Consider the example of the multi-instanced connection manager:

public class ConnectionManager
{
/* ConnectionManager implementation */

//Singleton logic:
private ConnectionManager() { }

private static Dictionary _instances = new Dictionary();

public static ConnectionManager Instance
{
get
{
lock (_instances)
{
if (_instances.ContainsKey(System.Threading.Thread.CurrentThread) == false)
_instances.Add(System.Threading.Thread.CurrentThread,new ConnectionManager());
}
return _instances[System.Threading.Thread.CurrentThread];
}
}
}


One problem with the above code that you would have to resolve is what happens when a thread is terminated? The singleton will still contain a reference to the ConnectionManager associated with the thread and over time this could cause a memory leak. Now in my previous post this was only supposed to be an example, but in real life you should take this stuff into account.
There are multiple ways to solve this. In some systems you wouldn't have a problem because a very limited number of threads will be created and maintained throughout the lifetime of the system. You could also use weak references that will automatically get cleaned up when the garbage collector is doing a round of clean up.
But a very elegant solution would be using the Thread.SetData & GetData methods. So the result would look like this:

public class ConnectionManager
{
/* ConnectionManager implementation */

//Singleton logic:
private ConnectionManager() { }

public static ConnectionManager Instance
{
get
{
if (Thread.GetData(Thread.GetNamedDataSlot("ConnectionManager")) == null)
Thread.SetData(Thread.GetNamedDataSlot("ConnectionManager"), new ConnectionManager());

return (ConnectionManager)System.Threading.Thread.GetData(Thread.GetNamedDataSlot("ConnectionManager"));
}
}
}

As can be seen in the above code we aren't using an explicit data structure to store our ConnectionManager objects but we are using a place that the thread will provide for storing data specific to this thread. The other interesting bit about the above code is there is no need for locking due to the fact that the SetData & GetData function are operating on the current thread's data and there is no way two threads can access one thread's data at the same time :)

Monday, May 7, 2007

Singleton

As I promised the first pattern that I'm going to write about is Singleton. Now I know that this might be the simplest and the most well known pattern ever existed but I think there are a lot of points that haven't been collected in one location so I'm going to focus on that stuff. Let me re-iterate that I'm not going to get into in depth discussion of what is Singleton but to more advanced topics and discussions, but to start off we need an introduction so here goes:

In software design there are a lot of situations where we need only one instance of an object to exist or a limited control number of objects to exist (of a specific class). This might be due to resource issues or just the simple fact that it doesn't make sense to have more than one object of a specific class. Many consider the singleton pattern an excellent solution to common coupling (global data). This pattern provides a very easy solution to solve the problems of common coupling and at the same time have the ease of use of global data.

OK so let's take a look at a very simple singleton. Let's assume that we have a PrintManager class that we can call its Print method and pass a document to it for printing. Now this class should queue all the documents that it receives and print them in order. For the sake of simplicity let's assume we have only one printer to print to and therefore only one queue. The singleton pattern says that if you want to make a class a singleton write your logic as if you were going to have multiple instances of it (as a regular class) and then perform the following steps on it:

  1. Make the constructor private so no one can create an instance.
  2. Create a private class variable (static) of your class.
  3. Create a public class method (static) to access the private variable and instantiate it on demand.

For example:


public class PrintManager
{
//The following code is regular implementation not part of
//the singleton pattern
private Queue PrintQue; //Sample member variable

public void Print(Document doc)
{
//code that would take care of queing
//and other boring stuff we don't care about
}


//The singleton pattern implementation:
private PrintManager()
{
}

private static PrintManager _instance = null;

public static PrintManager getInstance()
{
if (_instance == null)
_instance = new PrintManager();

return _instance;
}
}

Obviously the client code needing to use the single instance of the above class will access it through code similar to this:

PrintManager.getInstance().Print(myDoc);

This way no one can create a new instance or remove the one instance that exists, but it's also available for everyone's use. It is also created on first use of the PrintManager.


Items to discuss:

1) Almost all implementations of the singleton pattern should safe guard against a race condition in multi-threaded environments. So the correct implementation of the above code would be:


public class PrintManager
{
/* Ommited for simplicity */

//The singleton pattern implementation:
private PrintManager()
{
}

private static PrintManager _instance = null;
private static object LockObject = typeof(PrintManager);

public static PrintManager getInstance()
{
lock(LockObject)
{
if (_instance == null)
_instance = new PrintManager();
}

return _instance;
}
}

The above code will make sure that no two threads can accidentally enter the getInstance method at the same time and create two instances of the PrintManager object, one overwriting the other.


2) Using C# syntax and relying on a couple of .net CLR features we can rewrite the above code like this:


public class PrintManager
{
/* Ommited for simplicity */

//The singleton pattern implementation:
private PrintManager()
{
}

private static PrintManager _instance = new PrintManager();

public static PrintManager Instance
{
get
{
return _instance;
}
}
}

And the client code would look like this:

PrintManager.Instance.Print(myDoc);

This is a lot nicer and more readable than the previous code. Also notice that since we are relying on the CLR's static member initialization routines we don't need to worry about a race condition or similar multi threading issues.


3) Another situation that exists is a singleton that might have multiple controlled objects (instead of only one object). To keep with the above example we might want to implement our PrintManager so that it can manage multiple printers. Each printer would have a name and would work completely separately from other printers (it will have its own separate queue etc.). One solution that might immediately pop into mind is changing the Print method to accept two parameters a printer name and a document. But this means changing the logic of our code, a logic that might be working fine and we only needed to extend it. Obviously a bad solution and if you really want to know why go look up functional cohesion. Yes if we mix the code to separate different printers & queues with the printing logic we have basically downgraded our design from functional cohesion level to a lower less cohesive level (in this design it looks like a downgrade to logical cohesion; very badJ).
OK to solve this we need a multi-instance singleton

The Multi-Instanced Singleton:

This variation of the singleton pattern which can be implemented in many different ways is basically a singleton that has more than one instance of its class, but each instance is created and controlled by the singleton. For our current example assuming that each printer would be differentiated using a string name our multi-instance singleton would look like this:


public class PrintManager
{
/*Same logic as before (no changes needed) */

//The singleton pattern implementation:
private PrintManager()
{
}

//We need a method to keep multiple instances of the PrintManager class:
private static Dictionary _instances = new Dictionary();

public static PrintManager GetInstance(string printerName)
{
lock (_instances)
{
if (_instances.ContainsKey(printerName) == false)
_instances.Add(printerName, new PrintManager());
}

return _instances[printerName];
}
}

In all variations of the multi-instanced singleton we always face two design decisions:

  • How am I going to store the multiple objects (the data structure needed)?
  • How am I going to differentiate between the different instances?

The first question is usually easy to answer we might need a list or a dictionary or similar data structures. In rare cases we might not even need to store the different instances directly and they would get stored in some other sort of runtime available structure (discussed later).

The second question is a lot more interesting. The above example is a perfect example where the client code will decide which instance it needs to access and the singleton will check if it has that instance, if it does it's returned otherwise it's created (or loaded) and then returned.

Another widely used variety of the pattern is when the singleton class itself can decide which instance to return (the client code just uses the Instance property oblivious to which instance is actually returned. In these cases there has got to be some external way of figuring out which instance to return. As an example suppose we are writing an MDI document processing application. When the user clicks on the Tools menu and selects "Spell Check" we want to invoke the spell check routine passing it the current active document. Now suppose we have implemented Document class as a singleton. When accessing this Document singleton the client code doesn't need to tell it which document all it needs to do is say Document.Instance (whatever object is returned will be the active document).

Another perfectly useful example of this second variation is when you are developing objects that should only exist one per thread. The calling client doesn't need to tell the singleton which object it requires, all that is needed is to ask for the Instance and the active instance will be decided based on the calling thread. See below:


public class ConnectionManager
{
/* ConnectionManager implementation */

//Singleton logic:
private ConnectionManager()
{
}

private static Dictionary _instances = new Dictionary();

public static ConnectionManager Instance
{
get
{
lock (_instances)
{
if (_instances.ContainsKey(System.Threading.Thread.CurrentThread) == false)
_instances.Add(System.Threading.Thread.CurrentThread, new ConnectionManager());
}

return _instances[System.Threading.Thread.CurrentThread];
}
}
}

4) Other situations exist that we need a singleton object but this singleton object might be implemented in different ways or need to use mechanisms such as polymorphism to allow different implementation of the singleton. As an example suppose that we were implementing the above ConnectionManager in a single instance singleton but we needed to provide multiple implementations of it. For example an implementation to work with a SQL Server DB and another to work with an Oracle DB. These implementations would differ a lot in the actual logic of the class but would require the same singleton logic and the same access point for clients. In other words we want our clients to be able to say:

ConnectionManager.Instance.OpenConnection();

regardless of whether the SQL Server DB is configured for use or the Oracle version is going to be used. To achieve this we would need a mechanism to decide which version of our ConnectionManager is going to be used at instantiation but that is irrelevant in this example and for the sake of simplicity I'm going to assume that we will use reflection to create the currently configured version of our ConnectionManager and start using it as the only instance available. Please see the following piece of code:

The Polymorphic Singleton:


public abstract class ConnectionManager
{
//Abstract methods that need to be implemented by concrete ConnectionManagers:
public abstract void OpenConnection();
public abstract void CloseConnection();
// .
// .
// .

//Singleton logic:

//Note that in this version the constructor should be protected
protected ConnectionManager()
{
}

private static ConnectionManager _instance = CreateInstance();

public static ConnectionManager Instance
{
get
{
return _instance;
}
}

private static ConnectionManager CreateInstance()
{
//let's assume the assembly name and class name of the currently
//active ConnectionManager is stored in the .config file
Assembly asm = Assembly.Load(ConfigurationManager.AppSettings["Assembly"]);
return (ConnectionManager)asm.CreateInstance(ConfigurationManager.AppSettings["CMFullName"]);
}
}

public class SQLConnectionManager : ConnectionManager
{
public SQLConnectionManager() : base()
{
}

public override void OpenConnection()
{
//SQL implementation
}

public override void CloseConnection()
{
//SQL implementation
}
}

public class OracleConnectionManager : ConnectionManager
{
public OracleConnectionManager() : base()
{
}

public override void OpenConnection()
{
//Oracle implementation
}

public override void CloseConnection()
{
//Oracle implementation
}
}

As is obvious in the above implementation the constructor of our singleton class must be protected (otherwise no one can inherit from it) and the sub-classes need to have a public constructor so that the singleton CreateInstance method can create them on demand.

5) Finally you might even think of a situation where we need a multi-instanced polymorphic singleton as the final type of singleton.


Singletons in Web Apps

A question that I have been asked time and again is what happens in web based applications. Do we still need the singleton pattern there? Can't we just use the Application/Session objects?

YES you can and many people do and feel that it's a lot easier dealing with Application/Session than with a singleton, but I'd like to make these couple of points regarding web based scenarios:

  1. In many cases you might be developing a library or a reusable piece of code that will be run in multiple environments. You might need to use it in a web based scenario and a windows based scenario and other similar situations. For these kinds of reusable parts using the Application/Session will tightly couple them with the web-based environment and will prevent their reuse in non-web based environments. To get over this we can implement our singleton using the methods discussed above or implement a polymorphic singleton that depending on the environment that it's in will choose a web based or a windows based implementation.
  2. Another big question in web based scenarios is what are you using the singleton for? Are you using it to store user specific information, in other words a multi-instanced singleton that has an instance per user storing that users' information or are you designing a singleton (or multi-instanced one) that has no relation to the user. For the former you have to use the Session object otherwise in server farm scenarios you would have to implement a lot of code to replicate your singleton's data across all servers. But for the latter situation a regular singleton should suffice.
  3. Speed & performance might be another reason you would pick a regular singleton over the Session/Application data. Session data (especially in a server farm environment) could be really slow (object serialization, transfer to state server, storage there and the reverse process). Again if you don't need to replicate that data across all servers in the farm why use a Session object.
  4. As a general best practice it's better to wrap all your Session/Application needs in a class similar to a singleton implementation so you don't have to deal with strings in accessing Session/Application data and you have strongly typed variables (preventing many runtime time errors). The wrapper you would create around the session/application object can be a singleton in itself.

Other patterns:

Many patterns can be combined with the singleton object to solve more complex problems and I will try to touch on them in the next posts, but maybe one of the most famous ones is the Abstract Factory pattern. This pattern in most cases gets combined with a polymorphic singleton and we'll get into this on the next design pattern post I'm going to get into.

SharePoint Cross List Queries not upto the task

I've been using SharePoint Cross List Queries in multiple places in a recent site development project and I've noticed that no matter how you use them with indexed columns or without indexed columns or any other settings these monsters consume huge amounts of memory and are extermely slow. So unless you have a small number of lists and not a lot of items in each list you shouldn't be using them.
In a recent project I was using SharePoint queries to fetch latest posts based on a selection & filtering algorithm from SharePoint forums. Now we had a huge number of forums each located in different sub sites and we needed to run a query where we could get the latest post that matched a specific criteria. Once the system grew to a huge amount of data we realized that the Cross List queries were running very slow and at one point we started getting "Server Out of Memory" exceptions. So to cut a long story short we eventually had to create SharePoint events put them on list adding, updating & deleting events and replicate the fields needed in a SQL table and then run our query on the table.
My personal on why SharePoint Cross List Queries suck in performance: I think when Microsoft developed this feature they only had small lists or a small number of lists in mind plus they also had to make sure that this feature was compatible with other SharePoint features (like list item level security) so they couldn't just translate the CAML in cross list query to a SQL statement and run it on the DB even when you have configured all the columns in your query as indexed columns. So what happens is SharePoint loads all the data related to those lists into memory, sorts them, tries to filter them and then you get a Server out of memory exception on large amounts of data.