Monday, May 21, 2007

Abstract Factory

Abstract Factory is another pattern that is used heavily during the initial and architectural design steps of a project. Whenever you have to support multiple competing technologies in a software design, with the possibility of switching between the different implementations, one of the first patterns that should pop into your head is Abstract Factory. For formal definitions of this pattern please read here or refer to the Gang of Four book.

Let's start by trying to understand where this pattern would be used. A classic example is an application that is supposed to run on different DBMS technologies: based on a setting or a configuration that the client sets (at install time, for example), one of these implementations is chosen and used throughout the code. But the problem doesn't end there. You're also supposed to support future expansion of these implementations without touching old code (or even recompiling it). Say that for the time being our requirements state that we must support SQL Server 2000 and Oracle 10g, and that in the future we should also support other data stores, for example an XML based data store, Microsoft Access, DB2, etc.

What we want to do is design our software so that any new data store that comes along in the future can be plugged into our application, and our application can start reading from and writing to it. By now some of you might have decided to use OLEDB or some other technology independent method to access the data store, but let's make things a bit more complex: when we are supposed to use a specific technology we want to write native code accessing that specific DBMS, and some future data store might not be supported by our "universal" method of data access (i.e. OLEDB might not support it).

In situations like these using the abstract factory pattern can be really handy. Let's summarize the criteria that should apply when you decide to use the abstract factory pattern:

  • We have different ways of doing something (different implementations) that are similar to the eye of the client code looking at them at a specific level of abstraction.
  • We need to support all the different ways that we know of right now but we also need to have a mechanism to support future methods (implementations).
  • The use of one of the technologies is usually bound to an execution or a context in the execution. In other words we aren't going to be switching implementations on the fly, or have one part of our code running on one implementation while another part runs on a different one.

Before we delve any deeper into this pattern let's also consider where it's a bad idea to use abstract factory:

  • I have a special dictionary object that is supposed to use a simple array when it holds fewer than 10 items but a hash table when there are more than 10 items. In this example we have two different implementations (array & hashtable) and we're going to switch back and forth between them based on some runtime criteria. This is definitely not Abstract Factory.
  • We are creating a payroll system and we are designing the main pay calculation algorithm. Depending on the employee type (whether it's a full-time employee, a contract based consultant or some other criteria) we have to calculate an employee's weekly pay in different ways. Again a bad use of Abstract Factory, because we have multiple implementations that are all used simultaneously when calculating the salary for a list of employees.
  • We are building an application that is supposed to save some information to a SQL Server DB and at the same time save it to a log file and print it out. If we consider saving to the DB, saving to the log file and printing as different implementations of an abstract Save, we might again be tempted to use Abstract Factory, but since we are using all these implementations at the same time Abstract Factory would be a bad design choice.

So now that we know when to and when not to use Abstract Factory let's get into some details:

First let's review some terminology:

  • Abstract Product: an abstract definition of some concept we want to implement in different ways
  • Product: a concrete implementation of that abstract product with specific technology, algorithm, etc.
  • Family of Products: a group of different products that are all implemented using the same technology/algorithm.
  • Abstract Factory: an abstract definition for creating all the possible abstract products that we might need.
  • Factory: a concrete implementation of the abstract factory based on a specific family of products.

The Abstract Factory pattern:



From the diagram above it's obvious that the client code, once it has been given an instance of AbstractFactory, will use it to create any product that it needs. The products created will be returned as abstract interfaces, so the client code will not depend on a specific implementation. Switching an implementation (or adding a new one) only means that the reference the client code holds points to some other object with the same interface. So assuming that we have an instance of AbstractFactory initialized and globally available, then the client code will look like this:

AbstractFactory af = /* some initialization that has occurred somewhere; doesn't matter for the time being */;

Client Code:
af.CreateProduct1().DoSomething();
af.CreateProduct2().DoSomethingElse();


As is obvious from the above code, the client code is not dependent on any of the specific implementations (i.e. the A family or the B family). As long as we can keep the client code independent of how the instance of the AbstractFactory is created, we can provide different AbstractFactory implementations and get different product families to work.
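To make the client snippet above concrete, here is a minimal sketch of the whole structure. The names IProduct1, IProduct2, ProductA1, ProductA2, ProductB1, ProductB2, FactoryA, FactoryB and Client are assumptions made for illustration, as is the choice to have DoSomething and DoSomethingElse return a string so the result can be printed.

```csharp
using System;

// Abstract products: the only types the client code knows about.
public interface IProduct1 { string DoSomething(); }
public interface IProduct2 { string DoSomethingElse(); }

// Family A (hypothetical names).
public class ProductA1 : IProduct1 { public string DoSomething() { return "A1 did something"; } }
public class ProductA2 : IProduct2 { public string DoSomethingElse() { return "A2 did something else"; } }

// Family B (hypothetical names).
public class ProductB1 : IProduct1 { public string DoSomething() { return "B1 did something"; } }
public class ProductB2 : IProduct2 { public string DoSomethingElse() { return "B2 did something else"; } }

// Abstract factory: one creation method per abstract product.
public abstract class AbstractFactory
{
    public abstract IProduct1 CreateProduct1();
    public abstract IProduct2 CreateProduct2();
}

// Concrete factories: each one only creates products of its own family.
public class FactoryA : AbstractFactory
{
    public override IProduct1 CreateProduct1() { return new ProductA1(); }
    public override IProduct2 CreateProduct2() { return new ProductA2(); }
}

public class FactoryB : AbstractFactory
{
    public override IProduct1 CreateProduct1() { return new ProductB1(); }
    public override IProduct2 CreateProduct2() { return new ProductB2(); }
}

public static class Client
{
    // Client code: no concrete family is mentioned anywhere in this method.
    public static string Run(AbstractFactory af)
    {
        return af.CreateProduct1().DoSomething() + "; " + af.CreateProduct2().DoSomethingElse();
    }
}

public static class Program
{
    public static void Main()
    {
        // The same client code runs against either family.
        Console.WriteLine(Client.Run(new FactoryA()));
        Console.WriteLine(Client.Run(new FactoryB()));
    }
}
```

Only Main knows which factory is in play here; the rest of the article is about moving even that decision out of the code.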

For a better understanding of the terminology take a look at this diagram:

OK let's turn these abstract definitions to some concrete examples:

Let's go back to the DBMS example and assume that we want to be able to manipulate three different concepts in our database: a Product, a Warehouse and a Delivery. These will be our different products, and we have to create an abstract definition for each product type which will provide us with the abstract methods (or services) that we need from that product. For simplicity's sake let's assume we want to be able to Save the business objects of each type and Retrieve them by an id using this pattern. So we would be looking at a design like this:




For the sake of brevity I haven't shown the names of the functions implemented in the sub-classes, but in reality the IProductAccess, IWarehouseAccess & IDeliveryAccess are all interfaces and the AbstractFactory class is an abstract class whose main methods are all abstract (the reason for the AbstractFactory being an abstract class and not an interface will be revealed shortly).

In the example above all the native code accessing the Oracle database would go into the Oracle family of classes (OracleProductAccess, OracleWarehouseAccess & OracleDeliveryAccess) and the same goes for the SQL family of classes. As you can see the implementations are completely separated out and each class focuses on its family's way of implementation for the product that it's realizing.
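As a rough sketch of what one abstract product and its two family implementations might look like, consider the code below. The Product business class, its fields and the exact Save/Retrieve signatures are assumptions, since the article only says that objects are saved and retrieved by id. The in-memory dictionaries stand in for the native SQL Server and Oracle code so that the sketch runs without a database.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical business object; the article does not define its fields.
public class Product
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    public Product(int id, string name)
    {
        Id = id;
        Name = name;
    }
}

// Abstract product: the services the client needs, with no mention of a DBMS.
public interface IProductAccess
{
    void Save(Product product);
    Product Retrieve(int id);
}

// SQL family. Real code would talk to SQL Server natively here;
// a static dictionary plays the role of the shared database.
public class SQLProductAccess : IProductAccess
{
    private static readonly Dictionary<int, Product> Rows = new Dictionary<int, Product>();

    public void Save(Product product) { Rows[product.Id] = product; }

    // Throws KeyNotFoundException for an unknown id.
    public Product Retrieve(int id) { return Rows[id]; }
}

// Oracle family. Same interface, completely separate implementation.
public class OracleProductAccess : IProductAccess
{
    private static readonly Dictionary<int, Product> Rows = new Dictionary<int, Product>();

    public void Save(Product product) { Rows[product.Id] = product; }

    // Throws KeyNotFoundException for an unknown id.
    public Product Retrieve(int id) { return Rows[id]; }
}

public static class Program
{
    public static void Main()
    {
        // The variable is typed as the interface, so swapping families is a one-word change.
        IProductAccess access = new SQLProductAccess();
        access.Save(new Product(1, "Widget"));
        Console.WriteLine(access.Retrieve(1).Name);
    }
}
```

IWarehouseAccess and IDeliveryAccess would presumably follow the same shape with their own business objects.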

The implementations in the concrete factory classes are very simple too. These classes only have to create objects from their own family that match the product type required. For example, the SQLFactory class's implementation would look like this:


class SQLFactory : AbstractFactory
{
    public override IProductAccess CreateProduct()
    {
        return new SQLProductAccess();
    }

    public override IWarehouseAccess CreateWarehouse()
    {
        return new SQLWarehouseAccess();
    }

    public override IDeliveryAccess CreateDelivery()
    {
        return new SQLDeliveryAccess();
    }
}

A very similar class would exist for the Oracle factory, creating products from the Oracle family.
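For completeness, the Oracle counterpart might look like the sketch below. The interfaces and Oracle access classes are left empty here, which is a simplification so that the sketch compiles on its own; in the real design they would carry the Save and Retrieve services.

```csharp
using System;

// Abstract products, left empty for this sketch (a simplification).
public interface IProductAccess { }
public interface IWarehouseAccess { }
public interface IDeliveryAccess { }

// Oracle family of products; the native Oracle code would live in these classes.
public class OracleProductAccess : IProductAccess { }
public class OracleWarehouseAccess : IWarehouseAccess { }
public class OracleDeliveryAccess : IDeliveryAccess { }

public abstract class AbstractFactory
{
    public abstract IProductAccess CreateProduct();
    public abstract IWarehouseAccess CreateWarehouse();
    public abstract IDeliveryAccess CreateDelivery();
}

// Mirrors SQLFactory, but only ever creates Oracle products.
public class OracleFactory : AbstractFactory
{
    public override IProductAccess CreateProduct()
    {
        return new OracleProductAccess();
    }

    public override IWarehouseAccess CreateWarehouse()
    {
        return new OracleWarehouseAccess();
    }

    public override IDeliveryAccess CreateDelivery()
    {
        return new OracleDeliveryAccess();
    }
}

public static class Program
{
    public static void Main()
    {
        AbstractFactory factory = new OracleFactory();
        // Prints the concrete type behind the abstract reference.
        Console.WriteLine(factory.CreateProduct().GetType().Name);
    }
}
```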

One of the big questions that are usually asked when people are learning the abstract factory pattern is how we provide access to the single instance of the abstract factory we want to expose. So far our discussion has revolved around the fact that somehow the client code will have access to an object based on the AbstractFactory class and that object will always be initialized and ready. Well for a very simple example we can assume that we have a static member (global variable) somewhere in our code that during program initialization based on some setting will be filled in with one of our concrete factory classes. So for example:


class Program
{
    public static AbstractFactory AF;

    // Static so that it can be called from Main.
    private static void Configure()
    {
        int setting = GetDBMSTech();

        if (setting == 0)
            AF = new SQLFactory();
        else
            AF = new OracleFactory();
    }
}

In the above example we have assumed that a function called GetDBMSTech() exists that will read the needed setting from wherever we have stored it. Also let's assume that Configure is called during program initialization (e.g. from the Main method). Now with the above code the client code can use the AF variable in any place where it needs to create the objects that it requires.

Obviously the above code doesn't conform to any good design/programming practices. One of the major problems is that the AF variable is accessible to everyone (even for writing), so any piece of the client code can replace the factory object (or even set it to null). Another problem is the fact that we have hard coded the family types that we are going to support. If in the future a new family type (let's say an XMLFactory) were to be created, the above code would have to be changed.

The best implementation of the Abstract Factory pattern, one that is elegant and compatible with new families of products introduced in the future, is a combination of Abstract Factory with the Singleton pattern (to be more specific, with the polymorphic singleton pattern). Combining these two patterns provides easy access to the single instance of the abstract factory that we need, plus a configurable way of plugging in new families of products. Let's consider this example:


using System.Configuration;
using System.Reflection;

abstract class AbstractFactory
{
    public abstract IWarehouseAccess CreateWarehouse();
    public abstract IDeliveryAccess CreateDelivery();
    public abstract IProductAccess CreateProduct();

    protected AbstractFactory() { }

    // Single point of access to the configured factory.
    public static AbstractFactory Instance
    {
        get
        {
            return _Instance;
        }
    }
    private static AbstractFactory _Instance = CreateInstance();

    // Loads the assembly and the factory class named in the configuration file.
    private static AbstractFactory CreateInstance()
    {
        Assembly asm = Assembly.Load(ConfigurationManager.AppSettings["AFAssembly"]);
        return (AbstractFactory)asm.CreateInstance(ConfigurationManager.AppSettings["AFClassName"]);
    }
}

Now the above abstract factory is providing a single point of access to a single object created based on configuration stored in the app.config (or web.config in a web app). This abstract factory will provide the basis for pluggable product families without any need to change previous code.
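The configuration this code reads could look something like the fragment below. The two keys come from the code above; the assembly and namespace names (MyCompany.DataAccess.Sql) are made up for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <appSettings>
    <!-- Hypothetical assembly name, in the form accepted by Assembly.Load -->
    <add key="AFAssembly" value="MyCompany.DataAccess.Sql" />
    <!-- Full type name of the concrete factory, including its (hypothetical) namespace -->
    <add key="AFClassName" value="MyCompany.DataAccess.Sql.SQLFactory" />
  </appSettings>
</configuration>
```

Client code would then call AbstractFactory.Instance.CreateProduct() and so on, without ever naming a family. Switching to the Oracle family, or to a future XMLFactory, should only require deploying the new assembly and changing these two values. Keep in mind that Assembly.CreateInstance returns null when the type name is not found, and that the concrete factory needs a public parameterless constructor for it to succeed.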

In the next section regarding the Abstract Factory pattern I will talk about how to package your abstract factory classes to maximize their reuse and some other little issues left to discuss.
