Software Issues: Using IdentityHashMap and Flyweight Pattern

In my opinion, one of the most important tasks we programmers face in our jobs is understanding code written by previous generations of programmers. The task becomes especially difficult when they used little-known classes and/or design patterns. One such case happened to me when I tried to grasp the ad selection process of the old adserver of company Z, which was designed in 2007. I have to admit that I tried more than once, but I could not really grasp the point until I realized that it was built on the flyweight pattern. The implementation is interesting because it uses the almost unknown IdentityHashMap class.

Flyweight Pattern

When we talk about the Flyweight Pattern, we refer to the definition given by the Gang of Four (GoF):

The Flyweight Pattern uses sharing to support large numbers of fine-grained objects efficiently. A flyweight is a shared object that can be used in multiple contexts simultaneously. The flyweight acts as an independent object in each context - it's indistinguishable from an instance of the object that's not shared. Flyweights cannot make assumptions about the context in which they operate.

Flyweight Factory

In Z's adserver there is a TargetingConditionCache class that functions as a FlyweightFactory (in GoF terms).

public class TargetingConditionCache {
    private Map<TargetingCondition, TargetingCondition> conditions = new HashMap<TargetingCondition, TargetingCondition>();

    public synchronized TargetingCondition get(TargetingCondition condition) {
        TargetingCondition cachedCondition = conditions.get(condition);
        if (cachedCondition == null) {
            cachedCondition = condition;
            conditions.put(cachedCondition, cachedCondition);
        }
        return cachedCondition;
    }
}

Line 2 creates the conditions variable, a map that holds objects implementing the TargetingCondition interface. Line 4 starts the declaration of get(TargetingCondition condition), which is synchronized because the map of conditions can be accessed by several threads simultaneously. Line 5 looks up the condition argument in the map and line 6 checks whether a cached instance was found; if not, the argument itself is added to the map in line 8. The class differs slightly from the classic GoF implementation, in which the Flyweight Factory creates new flyweight instances. In Z's ad server, by contrast, a new flyweight object is created by a static function of the concrete flyweight class.

public static PublisherTargetingCondition 
construct(HashMap publishers, TargetingConditionCache cache) {
     PublisherTargetingCondition condition = new PublisherTargetingCondition(publishers);
     return (PublisherTargetingCondition) cache.get(condition);
}

This function is a kind of factory method that encapsulates the creation of a new PublisherTargetingCondition instance. Line 3 calls a private constructor, and line 4 either puts the new condition instance into the cache or gets back an equal instance that was cached earlier. Because the condition itself serves as the map key, rather than a separate key object, all concrete flyweight classes have to override Object's hashCode and equals functions.
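To make the role of hashCode and equals concrete, here is a minimal, self-contained sketch. It is not Z's code: Condition, PublisherCondition and ConditionCache are hypothetical stand-ins for TargetingCondition, PublisherTargetingCondition and TargetingConditionCache, and the set of publisher names is an assumed representation of the condition's data.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-ins for the ad server's classes; all names are assumptions.
class FlyweightSketch {

    interface Condition {}

    // Concrete flyweight: value-based equals/hashCode let the cache recognise
    // a freshly built duplicate of a condition that is already cached.
    static final class PublisherCondition implements Condition {
        private final Set<String> publishers;

        private PublisherCondition(Set<String> publishers) {
            this.publishers = new HashSet<>(publishers); // defensive copy
        }

        // Factory method: build a candidate, then return the canonical instance.
        static PublisherCondition construct(Set<String> publishers, ConditionCache cache) {
            return (PublisherCondition) cache.get(new PublisherCondition(publishers));
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof PublisherCondition
                    && publishers.equals(((PublisherCondition) other).publishers);
        }

        @Override
        public int hashCode() {
            return publishers.hashCode();
        }
    }

    // Flyweight factory: each instance is stored as both key and value.
    static final class ConditionCache {
        private final Map<Condition, Condition> conditions = new HashMap<>();

        synchronized Condition get(Condition condition) {
            Condition cached = conditions.putIfAbsent(condition, condition);
            return cached != null ? cached : condition;
        }

        synchronized int size() {
            return conditions.size();
        }
    }

    public static void main(String[] args) {
        ConditionCache cache = new ConditionCache();
        PublisherCondition a = PublisherCondition.construct(Set.of("p1", "p2"), cache);
        PublisherCondition b = PublisherCondition.construct(Set.of("p2", "p1"), cache);
        System.out.println(a == b);       // true: both calls return the shared instance
        System.out.println(cache.size()); // 1
    }
}
```

Without the overridden equals and hashCode, the second construct() call would not find the first instance in the HashMap, and the cache would silently hold two copies of the same condition.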

Workflow for building targeting conditions

Where is the function PublisherTargetingCondition.construct() called from? It is called from the TargetingConditionParser class. The parser classes are placed in the configure package, and they are activated for one of the following two reasons:

The whole ad server application is launched
One of the campaign files changes (a campaign contains ads)

In the end we get a list of TargetingCondition items that is assigned to the targetingConditions variable of the Ad object, and the list of ads is assigned to the listAdds variable of the singleton class NetworkDeliveryManager, which is accessible from anywhere in the ad server project.

[Figure: class diagram for the targeting condition cache]

[Figure: sequence diagram for the configuration of targeting conditions]

IdentityHashMap Usage

Using IdentityHashMap lets us get the full benefit of the Flyweight pattern example shown above. Z's adserver uses an additional cache, TargetingCheckCache, which is based on the IdentityHashMap class.

I want to quote a short passage from the JDK documentation that pretty much explains the concept of the class:

IdentityHashMap - This class implements the Map interface with a hash table, using reference-equality in place of object-equality when comparing keys (and values). In other words, in an IdentityHashMap, two keys k1 and k2 are considered equal if and only if (k1==k2). (In normal Map implementations (like HashMap) two keys k1 and k2 are considered equal if and only if (k1==null ? k2==null : k1.equals(k2)).)
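The difference is easy to see in a small experiment. The sketch below is illustrative only (the class name IdentityVsEquals and the key value are made up for the example): it inserts two String objects that are equal but not identical into each kind of map and counts the resulting entries.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

class IdentityVsEquals {

    // Inserts two equal but distinct String objects and returns the entry count.
    static int countEntries(Map<String, Boolean> map) {
        String k1 = new String("publisher");
        String k2 = new String("publisher"); // equal to k1, but a different object
        map.put(k1, Boolean.TRUE);
        map.put(k2, Boolean.FALSE);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(countEntries(new HashMap<>()));         // 1, because k1.equals(k2)
        System.out.println(countEntries(new IdentityHashMap<>())); // 2, because k1 != k2
    }
}
```

This is why an IdentityHashMap only works as a cache when equal keys are guaranteed to be the same object, which is exactly what the flyweight factory provides.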

public class TargetingCheckCache {
    private IdentityHashMap<TargetingCondition, Boolean> checkResults  = new IdentityHashMap<TargetingCondition, Boolean>();

    public boolean matches(AdRequestInfo adRequest, TargetingCondition condition,
                           LogFileWriter errorLogWriter) {
        if (adRequest==null) throw new NullPointerException("adRequest");
        if (condition==null) throw new NullPointerException("condition");
        if (errorLogWriter==null) throw new NullPointerException("errorLogWriter");

        Boolean matches = checkResults.get(condition);
        if (matches == null){
            matches = condition.matches(adRequest, errorLogWriter);
            checkResults.put(condition,matches);
        }
        return matches;
    }
}

Line 2 initializes the checkResults variable, which is of type IdentityHashMap. The matches() function takes three parameters: adRequest, a concrete targeting condition, and an error log writer. It first checks whether the same targeting condition (that is, the same instance) has already been evaluated. If not, it checks whether the condition matches the request, then puts the targeting condition into the cache as the key, with Boolean true or false as the value.

Workflow for checking condition

AdServerServlet receives the 'getad' command together with a request and converts the URL representation of the request into an adRequest object. The fetchAds() function of the NetworkDeliveryManager class is then called, and it:

Creates a new TargetingCheckCache instance
Starts iterating over all available ads
Calls the matches method on every ad
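The steps above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the real fetchAds(): the request is reduced to a publisher name, an ad is modelled as a plain list of conditions that must all match (the source does not say how an ad combines its conditions), and CountingCondition, CheckCache and countMatchingAds are hypothetical names.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

class CheckCacheSketch {

    // Hypothetical condition that counts how often it is really evaluated.
    static final class CountingCondition {
        int evaluations = 0;
        private final String requiredPublisher;

        CountingCondition(String requiredPublisher) {
            this.requiredPublisher = requiredPublisher;
        }

        boolean matches(String requestPublisher) {
            evaluations++;
            return requiredPublisher.equals(requestPublisher);
        }
    }

    // Per-request cache keyed by reference, in the spirit of TargetingCheckCache.
    static final class CheckCache {
        private final Map<CountingCondition, Boolean> results = new IdentityHashMap<>();

        boolean matches(String requestPublisher, CountingCondition condition) {
            Boolean cached = results.get(condition);
            if (cached == null) {
                cached = condition.matches(requestPublisher);
                results.put(condition, cached);
            }
            return cached;
        }
    }

    // Assumption: an ad matches when all of its conditions match.
    static int countMatchingAds(List<List<CountingCondition>> ads, String requestPublisher) {
        CheckCache cache = new CheckCache(); // fresh cache for every request
        int matching = 0;
        for (List<CountingCondition> ad : ads) {
            boolean allMatch = true;
            for (CountingCondition condition : ad) {
                if (!cache.matches(requestPublisher, condition)) {
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                matching++;
            }
        }
        return matching;
    }

    public static void main(String[] args) {
        CountingCondition shared = new CountingCondition("p1");
        List<List<CountingCondition>> ads = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            ads.add(List.of(shared)); // every ad shares the same flyweight
        }
        System.out.println(countMatchingAds(ads, "p1")); // 1000
        System.out.println(shared.evaluations);          // 1
    }
}
```

Because the cache lives only for one request, results never go stale between requests, and a condition shared by many ads is evaluated a single time per request.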

Summary

Here we can sum up the wisdom of this design. There are about 2,000 ads, each with about 10 conditions, so the ad server holds about 20,000 conditions, and the function matches() has to be called for every one of them on each request. Because TargetingConditionCache is used while the ad server loads its configuration, only one instance of each distinct targeting condition exists on the heap, which eliminates a large proportion of the targeting-condition instances. Since identical conditions are now the same object, the per-request TargetingCheckCache evaluates each distinct condition only once and reuses the result for every other ad that shares it. And thanks to IdentityHashMap, the cache lookup relies on reference equality, which is much cheaper than the heavy equals methods of some TargetingCondition subclasses, so iterating over all ads and their targeting conditions takes far less time.


Saturday, August 16, 2014
