Monday, August 18, 2014

Permutation of a string

On one of my job interviews I was asked to write a function that prints out all possible combinations of n characters for the given string. So, if the method is given the string "abc" as input, then it will print out the the following: "abc","acb","bac","bca","cab","cba".
Thanks to two years of studying statistics in university I knew that number of permutations in that kind of problem is calculated according the formula n in factorial, that is for case of 3 characters it is 3! -> 1*2*3 = 6, for case of 4 characters it is 4! -> 6*4 = 24. Unfortunately the knowledge of statistics and probability wasn't enough to help to solve the issue and after wasting all my time I had to admit that I don't have any solution. My interviewer was kind to me to explain me in really brilliant way the algorithm to solution of the problem. So I was so impressed, that immediately wrote the function when I returned from the interview. Now, here is the code I've written according to his algorithm:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
public void permute(String input) {
 permute("", input);
}

private void permute(String prefix, String word) {
 //System.out.println("start permute ("+prefix+","+word+")");
 int n = word.length();
 if (n == 0) {
  System.out.println(prefix);
 }
 for (int i = 0; i < n; i++) {
  permute(prefix + word.charAt(i), word.substring(0, i) + word.substring(i + 1, n));
 }
}


Input:
abc

Output:
abc
acb
bac
bca
cab
cba

Flow of recursive function permute(prefix, word)
Flow for recursive function

Lets try to follow after what's happen here. The flow starts with the separation of input for two pieces a prefix and a word, when in beginning the prefix is an empty string and the word is the whole input.
Afterwards we start to call recursively the function permute when on every new call a new char is passed from word to prefix. The base case of this recursive function is when the word turn to be empty string then the function prints out a new permutation.
After returning from the base case the function return to for loop iteration, increases i and therefore selects a new character to be added to prefix.

The example of permute function's calls:

start permute (,abc)
start permute (a,bc)
start permute (ab,c)
start permute (abc,)  - base point and print
start permute (ac,b)
start permute (acb,) - base point


new iteration of loop when prefix is empty

start permute (b,ac) - and e.t.c.



Sunday, August 17, 2014

Writing generic class

In one of my former posts "Incorrect usage of WeakHashMap" I have shown code for the class WeakCache. The class works all right, but since release Java 1.5 that kind of classes should be written as generic. So let write generic WeakCache.

That is old version of WeakCache

public class WeakCache {

 WeakHashMap<Object, Object> conditions = new WeakHashMap<Object, Object>();

 public synchronized Object get(Object condition) {
  if (condition == null)
   throw new NullPointerException("condition");
  Object cachedCondition = conditions.get(condition);
  if (cachedCondition == null) {
   cachedCondition = condition;
   conditions.put(cachedCondition, new WeakReference<Object>(
     cachedCondition));
  } else {
   @SuppressWarnings("unchecked")
   WeakReference<Object> weakReference = (WeakReference<Object>) cachedCondition;
   cachedCondition = weakReference.get();
  }
  return cachedCondition;
 }

}

And here the new one:
public class WeakCache<K extends TargetCondition> {

 private WeakHashMap<K, WeakReference<K>> conditions = new WeakHashMap<K, WeakReference<K>>();

 public synchronized K get(K condition) {
  if (condition == null)
   throw new NullPointerException("condition");
  K cachedCondition;
  WeakReference<K> weakReference = conditions.get(condition);
  if (weakReference == null) {
   cachedCondition = condition;
   conditions.put(cachedCondition, new WeakReference<K>(cachedCondition));
  } else {
   cachedCondition = weakReference.get();
  }
  return cachedCondition;
 }


}


Saturday, August 16, 2014

Using IdentityHashMap and Flyweight Pattern

In my opinion, one of the most important tasks we computer programmers are facing in our job is understanding code written by previous generations of programmers. Sometimes the task becomes difficult especially when they have used little known classes or/and design patterns. One of those cases happened to me when I tried to grasp the ad selection process of the old adserver of company Z that was planned at 2007. I have to admit that I even tried to do it more than once, but until I realized that it was about the use of a flyweight pattern, I had not really been able to grasp their point. This implementation is interesting because it uses the almost unknown IdentityHashMap class.


Flyweight Pattern


When we talk about the Flyweight Pattern, we refer to the definition of the GOF (Gang-Of-Four):
The Flyweight Pattern uses sharing to support large numbers of fine-grained objects efficiently. A flyweight is a shared object that can be used in multiple contexts simultaneously. The flyweight acts as an independent object in each context - it's indistinguishable from an instance of the object that's not shared. Flyweights cannot make assumptions about the context in which they operate.

Flyweight Factory UML


Flyweight Factory


In Z's adserver there is a TargetingConditionCache class functioning as a FlyweightFactory (according to GOF's terms).

public class TargetingConditionCache {
private Map<TargetingCondition,TargetingCondition> conditions = new HashMap<TargetingCondition, TargetingCondition>();
 
     public synchronized TargetingCondition get(TargetingCondition condition) {
         TargetingCondition cachedCondition = conditions.get(condition);
         if (cachedCondition==null) {
             cachedCondition = condition;
             conditions.put(cachedCondition,cachedCondition);
         }
         return cachedCondition;
     }
 }

The line 2 creates variable conditions that will keep objects implementing the TargetingCondition interface. Line 4 starts the declaration of function get(TargetingCondition condition) which is defined as synchronized. It is synchronized because the map of conditions could be approached by number of threads simultaneously. Line 6 checks whether the condition argument is in the map, if not it is added to the map in line 8. The class is a bit different from the classic GOF's implementation where creation of a new instance of the Flyweight object is done in Flyweight Factory, contrary in Z's ad server the creation of the new Flyweight object is done through a static function of a concrete Flyweight class.



public static PublisherTargetingCondition 
construct(HashMap publishers, TargetingConditionCache cache) {
     PublisherTargetingCondition condition = new PublisherTargetingCondition(publishers);
     return (PublisherTargetingCondition) cache.get(condition);
}

The function is kind of factory method that encapsulates the creation of a new PublisherTargetingCondition instance. The line 3 calls a private constructor and line 4 puts a new condition instance to the cache or just gets a back cached condition instance. Because we don't use a key for putting or retrieving instances of flyweight from a map, all Concrete Flyweight classes have to override Object's functions hashCode and equals.


Workflow for building targeting conditions


Where is the function PublisherTargetingCondition.construct() called from? The function PublisherTargetingCondition.construct() is called from the TargetingConditionParser's class, the classes of parser is placed in the configure package and they are activated because of the following two reasons:
  1. Launching the whole ad server application 
  2. Changes of one of the campaign files (the campaign has Ads). 
Finally we are getting a list of TargetingCondition's items that is assigned to a variable targetingConditions of the object's Ad and the list of ads is assigned to a variable listAdds of the singleton class NetworkDeliveryManager which is available from all edges of ad sever's project.



Class Diagram for Targeting Condition Cache





Sequence Diagram configuration of Targeting Conditions
















IdentityHashMap Usage


The use of IdentityHashMap enable us to get the full benefit from just shown above Flyweight pattern example. First of all Z's adserver uses additional cache TargetingCheckCache, which is based on IdentityHashMap class.

I want to quote a short section from the JDK that pretty much explains the concept of the class:

IdentityHashMap - This class implements the Map interface with a hash table, using reference-equality in place of object-equality when comparing keys (and values). In other words, in an IdentityHashMap, two keys k1 and k2 are considered equal if and only if (k1==k2). (In normal Map implementations (like HashMap) two keys k1 and k2 are considered equal if and only if (k1==null ? k2==null : k1.equals(k2)).)


public class TargetingCheckCache {
    private IdentityHashMap<TargetingCondition, Boolean> checkResults  = new IdentityHashMap<TargetingCondition, Boolean>();

    public boolean matches(AdRequestInfo adRequest, TargetingCondition condition,
                           LogFileWriter errorLogWriter) {
        if (adRequest==null) throw new NullPointerException("adRequest");
        if (condition==null) throw new NullPointerException("condition");
        if (errorLogWriter==null) throw new NullPointerException("errorLogWriter");

        Boolean matches = checkResults.get(condition);
        if (matches == null){
            matches = condition.matches(adRequest, errorLogWriter);
            checkResults.put(condition,matches);
        }
        return matches;
    }
}

The line 2 it initializes the checkResults variable of IdentityHashMap's type. The function matches() gets two parameters adRequest and concrete targeting-condition. It first checks whether the same targeting-condition already have been matched before. Then if it is a new targeting-condition it checks whether the condition matches or doesn't to the request. After the checking's completion the targeting-condition will be put in the cache as a key and the value will be boolean True or False.


Workflow for checking condition




AdServerServlet get comands 'getad' and request, convert url presentation of request to object adRequest. The function fetchAds() of class NetworkDeliveryManager called and:
  1. Creates new TargetingCheckCache instance
  2. Starts iteration over all available ads
  3. Call on every ad method matches


Summary


Here we can sum up the wisdom of this design. There are about 2000 ads, every ad has about 10 conditions, therefore there are about 20,000 conditions in the ad server. For all of them it should be called function matches(RequestInfo, TargetingCondition), but due to the use of TargetingConditionCache in the process of loading of the ad server, we ensured that there will be only one instance for identical targeting-conditions in a heap and thereby we approximately cut a large proportion of targeting-condition instances. As well, thanks to the use of the IdentityHashMap with its reference-equality comparison which is much more effective than the heavy object's equals methods of some sub-classes of the TargetingCondition, we saved a lot of time to iterate over all ads with its targeting-conditions.

Monday, August 11, 2014

Incorrect usage of WeakHashMap

Introduction

At first I wanted to write a post about an elegant design containing a flyweight pattern together with a use of WeakHashMap class for creating a cache, that was done in one of my previous work places, but while writing this I realized that the above mentioned use of a WeakHashMap had a serious failure.

WeakHashMap

So before the running with illustration of the incorrect usage of the class, let explain what WeakHashMap is and where we can apply it. I cut some most appropriate quotes from Oracle Java documentation to resume its essence and the purpose of this Java class:
WeakHashMap - Hash table based implementation of the Map interface, with weak keys. An entry in a WeakHashMap will automatically be removed when its key is no longer in ordinary use. More precisely, the presence of a mapping for a given key will not prevent the key from being discarded by the garbage collector, that is, made finalizable, finalized, and then reclaimed. When a key has been discarded its entry is effectively removed from the map, so this class behaves somewhat differently from other Map implementations.
The behavior of the WeakHashMap class depends in part upon the actions of the garbage collector, so several familiar (though not required) Map invariants do not hold for this class. Because the garbage collector may discard keys at any time, a WeakHashMap may behave as though an unknown thread is silently removing entries.
Each key object in a WeakHashMap is stored indirectly as the referent of a weak reference. Therefore a key will automatically be removed only after the weak references to it, both inside and outside of the map, have been cleared by the garbage collector.
Or in my own words that means, if there is an object A and there is an instance of WeakHashMap B, and A after its creation was put into the B as a key and after a while A becomes to be eligible for garbage collection, then the entry with the key A is as well eligible for garbage collector. For example:


public class WeakMapTest2 {

 public static void main(String a[]) {
  Map<String, Object> weakMap = new WeakHashMap<>();
  List<String> list = new ArrayList<>(); 
  String x = new String("x");  
  String x1 = new String("x1"); 
  String x2 = new String("x2"); 
  weakMap.put(x, new Object());
  weakMap.put(x1, new Object());
  weakMap.put(x2, new Object());
  // x will have strong reference to list 
  list.add(x);
  System.out.println("Map size :" + weakMap.size());
  // force all keys be eligible
  x=x1=x2=null;
  // call garbage collector 
  System.gc();
  try {
   Thread.sleep(100);
  } catch (InterruptedException e) {
   e.printStackTrace();
  }
  
  System.out.println("Map size :" + weakMap.size());
  System.out.println("Map :" + weakMap);
  System.out.println("List :" + list);
 }

}


Output:

Map size before gc :3
Map size after gc :1
Map :{x=java.lang.Object@5437086a}
List :[x]

Resume:

All entries were discarded from the map, except the entry with key x because x had a strong reference to the list.

WeakHashMap as a cache

So as I said in the introduction WeakHashMap can be applied as a cache that keeps entries while they are useful and remove them when they are useless.


public class WeakCache {

 WeakHashMap<Object, Object> conditions = new WeakHashMap<Object, Object>();

 public synchronized Object get(Object condition) {
  if (condition == null)
   throw new NullPointerException("condition");
  Object cachedCondition = conditions.get(condition);
  if (cachedCondition == null) {
   cachedCondition = condition;
   conditions.put(cachedCondition, cachedCondition);

  }
  return cachedCondition;
 }

 public int getSize() {
  return conditions.size();
 }

}


Test:


public class WeakMapTest {

 WeakCache cache = new WeakCache();
 List<Object> list = new ArrayList<>();

 public WeakMapTest() {

 }

 private void initWeakMap() {
  String x = new String("x");  
  String x1 = new String("x1"); 
  String x2 = new String("x2"); 
  cache.get(x1);
  cache.get(x);
  cache.get(x2);  
  list.add(x);
  
 }

 private void clearGC() {
  System.gc();
  try {
   Thread.sleep(100);
  } catch (InterruptedException e) {
   e.printStackTrace();
  }
  
 }

 public static void main(String a[]) {
  WeakMapTest wmt = new WeakMapTest();
  wmt.initWeakMap();
  System.out.println("map before gc: "+wmt.cache.conditions);
  wmt.clearGC();
  System.out.println("map after gc: "+wmt.cache.conditions);
 }

}


After running the class WeakMapTest I was unpleasantly surprised to get the following output:

map before gc: {x=x, x1=x1, x2=x2}
map after gc: {x=x, x1=x1, x2=x2}

That is, the garbage collector didn't do its work, but why? After I've changed the line of WeakCache class from
conditions.put(cachedCondition, cachedCondition);
to
conditions.put(cachedCondition, new Object());
the program returned to work as expected.

So why the bug is occurring? The explanation is that only a key of WeakHashMap has weak references, but a value of WeakHashMap has strong reference, therefore all these objects x, x1, x2 remained not to be  eligible. Thereby the line conditions.put(cachedCondition, cachedCondition)
that assigns the same object, that is the same reference to the value  and to the key, as a matter of fact neutralizes the special trait of WeakHashMap and turns it to function as regular HashMap.

It is funny to say that the code has been running in production environment of that company over 5 years and only my willing to write a post about their great design caused me to figure out the bug.

Fixing the cache

So how anyway, we should to code the WeakCache class in order to benefit from weak references?
For my opinion it seems that the only solution in that case to get both discarding of useless objects and flyweight features ( I'll write about this in the next post) is to replace the problem line
conditions.put(cachedCondition, cachedCondition);
to
conditions.put(cachedCondition, new WeakReference(cachedCondition));



public class WeakCache {

 WeakHashMap<Object, Object> conditions = new WeakHashMap<Object, Object>();
   
 public synchronized Object get(Object condition) {
  if (condition == null)
   throw new NullPointerException("condition");
  Object cachedCondition = conditions.get(condition);
  if (cachedCondition == null) {
   cachedCondition = condition;   
   conditions.put(cachedCondition, new WeakReference<Object>(cachedCondition));
  }
  else{
   @SuppressWarnings("unchecked")
   WeakReference<Object> weakReference = (WeakReference<Object>)cachedCondition;
   cachedCondition = weakReference.get();
  }
  return cachedCondition;
 }

 public int getSize() {
  return conditions.size();
 }

}

Or you can see the generic version of the former code in the my other post Writing generic class.

In the following illustration, we can see that x is still alive because it has strong references with the List, while x1 was discarded because it doesn't have more  strong references.

Weak and Strong references