Home Page

The Abstract Factory Pattern

A “Creational Pattern” from the Gang of Four book

Verdict

The Abstract Factory is an awkward workaround for the lack of first-class functions and classes in less powerful programming languages. It is a poor fit for Python, where we can instead simply pass a class or a factory function when a library needs to create objects on our behalf.

The Python Standard Library’s json module is a good example of a library that needs to instantiate objects on behalf of its caller. Consider a JSON string like this one:

text = '{"total": 9.61, "items": ["Americano", "Omelet"]}'

By default, the json module’s loads() function will create unicode objects for the strings "Americano" and "Omelet", a list to hold them, a Python float for 9.61, and a dict with unicode keys for the top-level JSON object.

But some users will not be content with these defaults. For example, an accountant would probably be unhappy with the choice of a float to represent an exact amount like “9 dollars 61 cents” and would prefer an exact Decimal instead.

The need to control what kind of numeric object the json module creates is a specific instance of a general problem:

First, we’ll look at the Pythonic approach to this problem. Then we’ll start placing a series of restrictions on our Python code to more closely model legacy object oriented languages, until the Abstract Factory pattern emerges as the best solution within those limitations.

The Pythonic approach: callable factories

In Python, a “callable” — any object f that executes code when invoked using the syntax f(a, b, c) — is a first-class object. To be “first class” means that a callable can be passed as a parameter, returned as a return value, and can even be stored in a data structure like a list or dictionary.

First-class callables offer a powerful mechanism for implementing object “factories”, a fancy term for “routines that build and return new objects.”

A beginning Python programmer might expect that each time they need to supply a factory, they will be responsible for writing a function:

import json
from decimal import Decimal

def build_decimal(string):
    return Decimal(string)

print(json.loads(text, parse_float=build_decimal))
{'total': Decimal('9.61'), 'items': ['Americano', 'Omelet']}

This simple factory ran successfully! The number returned is a decimal instead of a float.

(Note my choice of a verb build_decimal() as the name of this function, instead of a noun like decimal_factory() — I find a function name easier to read when it tells me what the function does instead of telling me what kind of function it is.)

While the above function will certainly work, there is an elision we can perform thanks to the fact that Python types are themselves callables. Because Decimal is a callable taking a string argument and returning a decimal object instance, we can dispense with our own factory and pass the Decimal type directly to the JSON loader! Unless we need to edit the string first, like by removing a leading currency symbol, Decimal can completely replace our little factory:

print(json.loads(text, parse_float=Decimal))
{'total': Decimal('9.61'), 'items': ['Americano', 'Omelet']}

There is one implementation detail that deserves mention. If you study the json module you will discover that load() is simply a wrapper around the JSONDecoder class. How does the decoder instance itself support an alternative factory? Its initialization method stores the parse_float argument as an instance attribute, defaulting to Python’s built-in float type if no override was specified:

self.parse_float = parse_float or float

It can then invoke it later as self.parse_float(…).

If you are interested in variations on this pattern — where a class uses its instance attributes to remember how it’s supposed to create a specific kind of object — then try reading about the Factory Method which explores several variations on this maneuver.

But to arrive at the Abstract Factory pattern, we need to head in a different direction. Here we’ll pursue what happens to an object factory itself — whether Decimal() or our hand-written build_decimal() — if we begin restricting the set of Python features we let ourselves use.

Restriction: outlaw passing callables

What if Python didn’t let you pass callables as parameters?

That restriction would remove an entire dimension from Python’s flexibility. Instead of supporting both “nouns” and “verbs” as arguments — both class instances and callable functions — some legacy languages only support passing class instances. Under that restriction, every simple factory would need to pivot from a function to a method:

# In Python: a factory function.

def build_decimal(string):
    return Decimal(string.lstrip('$'))

# In some legacy languages: the code must
# move inside a class method instead.

class DecimalFactory(object):
    @staticmethod
    def build(string):
        return Decimal(string.lstrip('$'))

In traditional Object Oriented programming, the word “factory” is the name of this kind of class — a class that offers a method that builds an object. In naming the equivalent Python function build_decimal(), therefore, I’m not only indulging in my own preference for giving functions verb-names rather than noun-names, but being as precise as possible in naming: the “factory” is not the callable, but the class that holds it.

Instead of continuing our earlier example of JSON parsing, let’s switch to a simpler task that can fit in a couple of lines of code: parsing a comma-separated list of numbers. Here’s how the parser would invoke the builder method on our factory class.

class Loader(object):
    @staticmethod
    def load(string, factory):
        string = string.rstrip(',')  # allow trailing comma
        return [factory.build(item) for item in string.split(',')]

result = Loader.load('464.80, 993.68', DecimalFactory)
print(result)
[Decimal('464.80'), Decimal('993.68')]

Note that, thanks to the fact that Python classes offer static and class methods that can be invoked without an instance, we have not yet been reduced to needing to instantiate the factory class — we are simply passing the Python class in as a first-class object.

Restriction: outlaw passing classes

Next, let’s also pretend that a Python class cannot be passed as a value, but that only object instances can be assigned to names and passed as parameters.

This restriction is going to prevent us from passing the DecimalFactory class as an argument to the load() method. Instead, we’re going to have to uselessly instantiate DecimalFactory and pass the resulting object:

f = DecimalFactory()

result = Loader.load('464.80, 993.68', f)
print(result)
[Decimal('464.80'), Decimal('993.68')]

Note the difference between this pattern and the Factory Method. Here, we are neither asked nor required to subclass Loader itself in order to customize the objects it creates. Instead, object creation is entirely parametrized by the separate factory object we choose to pass in.

Note also the clear warning sign in the factory’s own code that build() should, in Python, not really be the method of an object. Scroll back up and read the method’s code. Where does it accept as an argument, or use in its result, the object self on which it is being invoked? It makes no use of it at all! The method never mentions self in its code. As Jack Diederich propounded in his famous talk Stop Writing Classes, a method that never uses self should not actually be a method in Python. But such are the depths to which we’ve been driven by these artificial restrictions.

Generalizing: the complete Abstract Factory

Two final moves will illustrate the full design pattern.

First, let’s expand our factory to create every kind of object that the loader needs to create — in this case, not just the numbers that are being parsed, but even the container that will hold them. Now that we have switched to instantiating the factory, we can write these as plain methods instead of static methods:

class Factory(object):
    def build_sequence(self):
        return []

    def build_number(self, string):
        return Decimal(string)

And here is an updated loader that uses this factory:

class Loader(object):
    def load(string, factory):
        sequence = factory.build_sequence()
        for substring in string.split(','):
            item = factory.build_number(substring)
            sequence.append(item)
        return sequence

f = Factory()
result = Loader.load('1.23, 4.56', f)
print(result)
[Decimal('1.23'), Decimal('4.56')]

Every choice it needs to make about object instantiation is deferred to the factory instead of taking place in the parser itself.

Second, consider the behavior of languages that force you to declare ahead of time the type of each method parameter. You would overly restrict your future choices if your code insisted that the factory parameter could only ever be an instance of this particular class Factory because then you could never pass in anything that didn’t inherit from it.

Instead, to more happily separate specification from implementation, you would create an abstract class. It’s this final step that merits the word “abstract” in the pattern’s name “Abstract Factory”. Your abstract class would merely promise that the factory argument to load() would be a class adhering to the required interface:

from abc import ABCMeta, abstractmethod

class AbstractFactory(metaclass=ABCMeta):

    @abstractmethod
    def build_sequence(self):
        pass

    @abstractmethod
    def build_number(self, string):
        pass

Once the abstract class is in place and Factory inherits from it, though, the operations that take place at runtime are exactly the same as they were before. The factory’s methods are called with various arguments, which direct them to create various kinds of object, which they construct and return without the caller needing to know the details.

It’s like something you might do in Python, but made overly complicated. So avoid the Abstract Factory and use callables as factories instead.