Beauty and the beast, recursion and iteration
Readability is a goal of the software developer; beauty is the goal of the engineer. Reducing a complex problem to a functionally elegant solution has you wandering through code as though you were gliding through an art gallery, unravelling the mysteries behind each masterful line. Beautiful code leaves you with a feeling of awe and inspiration: it is clean, readable, simple, maintainable, and just so perfect. However, just as an artist can become so engrossed in their canvas that they fail to notice the building burning around them, the engineer can lose sight of the purpose of the system in which their masterpiece lies. Performance is the ugly duckling, the troll under the crystal bridge, the Windows Vista amongst the Snow Leopards.


Sorry, did you say immortal rabbits?

One great tool of beauty is recursion. Unfortunately, while it is a wonderful tool, it can also be the rose-coloured glasses of the software world. As an example we will take the Fibonacci series, a sequence of numbers introduced by Leonardo of Pisa to describe the mating patterns of immortal rabbits. The sequence itself is very simple, and many will recognise it as it shows up everywhere in nature:

1, 1, 2, 3, 5, 8, 13, …

The sequence is defined mathematically as follows:

F(n) = F(n-1) + F(n-2)

The Fibonacci sequence is a great candidate for a recursive definition:

  public long fibonacci(int tnGenerations) {
    return tnGenerations <= 1 ? 1 :
           fibonacci(tnGenerations - 1) + fibonacci(tnGenerations - 2);
  }

The recursive function is a simple definition: a single executable line of code that can be used to retrieve the Fibonacci number for any generation. This code cannot be simplified any further, and it exactly describes the mathematical equation. Works great, easily maintainable, and the world is good. Right? Well, it depends on how you are going to use the function. If you are going to call it once in your code, and only on a very low generation, say anywhere between 0 and 30, then sure, it may be fine. But let's have a look at some of the performance properties of the function, specifically the calling pattern.

Using fibonacci(6) as an example, look at the execution call stack:

[Diagram: fibonacci call stack for 6 generations]

If we focus on the number of times the fibonacci function is called we get a list like the following:

  • fibonacci(6) – 1 time
  • fibonacci(5) – 1 time
  • fibonacci(4) – 2 times
  • fibonacci(3) – 3 times
  • fibonacci(2) – 5 times
  • fibonacci(1) – 8 times

See any pattern here?  The number of times the function is called with each parameter itself follows the Fibonacci sequence. The function definition looks great, but calling fibonacci(50) results in a whopping 40730022147 calls, and took over 99 seconds to execute on my Mac Pro. That's a lot of processing.
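Those per-parameter counts are easy to verify by instrumenting the recursive function with a tally of calls. The CallCounter class and its tally map below are illustrative additions, not part of the original code:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: count how many times fibonacci is entered with each parameter.
public class CallCounter {
    static final Map<Integer, Long> tally = new HashMap<>();

    public static long fibonacci(int tnGenerations) {
        tally.merge(tnGenerations, 1L, Long::sum); // record this call
        return tnGenerations <= 1 ? 1 :
               fibonacci(tnGenerations - 1) + fibonacci(tnGenerations - 2);
    }

    public static void main(String[] args) {
        fibonacci(6);
        // Print the call count for each parameter, matching the list above.
        for (int n = 6; n >= 0; n--) {
            System.out.println("fibonacci(" + n + ") - " + tally.get(n) + " times");
        }
    }
}
```

Running this for fibonacci(6) reproduces the 1, 1, 2, 3, 5, 8 pattern from the list above.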

By the way, in the diagram the only reason fibonacci(0) is not called 13 times is that both 1 and 0 terminate the recursion. Luckily, or we would still be waiting for fibonacci(50) to finish.


Bring out the ugly duckling

Okay, so scrap the recursion; that's not going to work out for us if we need a fibonacci calculator that is even halfway responsive. Next step: bring out the ugly duckling. Let's turn this code into an iterative function:

  public long fibonacci(int tnGenerations) {
    long lnResult = 1, lnFibPrevious = 1, lnTemp;
    for (int i = 0; i < tnGenerations; i++) {
        lnTemp = lnResult;
        lnResult = lnFibPrevious;
        lnFibPrevious = lnTemp + lnFibPrevious;
    }
    return lnResult;
  }

Execution of the above function delivers the same result; the difference is that execution time for fibonacci(50) is now ~2ms, and the function is only called a single time, with the loop iterating tnGenerations times. Not as pretty, but much more functional.
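A quick way to convince yourself the two versions really do deliver the same result is to run them side by side for the first few generations. This is a throwaway check, with both functions copied into a hypothetical FibCheck class for the comparison:

```java
// Sanity check: the iterative rewrite matches the recursive definition.
public class FibCheck {
    public static long fibRecursive(int n) {
        return n <= 1 ? 1 : fibRecursive(n - 1) + fibRecursive(n - 2);
    }

    public static long fibIterative(int tnGenerations) {
        long lnResult = 1, lnFibPrevious = 1, lnTemp;
        for (int i = 0; i < tnGenerations; i++) {
            lnTemp = lnResult;
            lnResult = lnFibPrevious;
            lnFibPrevious = lnTemp + lnFibPrevious;
        }
        return lnResult;
    }

    public static void main(String[] args) {
        // Keep n small so the recursive version finishes quickly.
        for (int n = 0; n <= 20; n++) {
            if (fibRecursive(n) != fibIterative(n))
                throw new AssertionError("mismatch at generation " + n);
        }
        System.out.println("iterative and recursive agree up to generation 20");
    }
}
```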


Take the hit up front

There is another technique which may be useful to fix the recursive function: memoisation, or caching. This means remembering the result of a function call so that in future the pre-calculated result is used rather than calculating it again. This is particularly useful when a function will be called many times, kind of like a recursive function such as the fibonacci calculator, so let's give that a go.

  private static HashMap<Integer, Long> g_oFibMap = new HashMap<>();

  public long fibonacci(int tnGenerations) {
    if (g_oFibMap.containsKey(tnGenerations))
      return g_oFibMap.get(tnGenerations);
    long lnResult = tnGenerations <= 1 ? 1 :
         fibonacci(tnGenerations - 1) + fibonacci(tnGenerations - 2);
    g_oFibMap.put(tnGenerations, lnResult);
    return lnResult;
  }

This function is now a mix of recursion and memoisation. If the result has not been stored, it is calculated and cached for the next call to the function. The function is still called more times than the iterative version loops, but far fewer than the fully recursive function: in the case of fibonacci(50), the memoised function is now only called 99 times. It also takes slightly longer than the iterative version to run, 4ms, which is expected as there are more function calls. The advantage is that a second call to fibonacci(50) does not need to be calculated; instead it is retrieved, so the second call took my machine less than 1ms to execute. An additional advantage is that because fibonacci(50) and fibonacci(49) are already cached, retrieving fibonacci(51) should take almost no time at all. After a call to fibonacci(51) the call count rises to only 102, and the time to execute was again less than 1ms.
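The 99 and 102 figures can be checked with the same counting trick as before. The MemoFib class and its g_nCalls counter are assumptions made for this sketch, wrapped around the memoised function above:

```java
import java.util.HashMap;

// Sketch: count entries into the memoised function to verify the figures above.
public class MemoFib {
    private static final HashMap<Integer, Long> g_oFibMap = new HashMap<>();
    static long g_nCalls = 0;

    public static long fibonacci(int tnGenerations) {
        g_nCalls++; // count every entry, cache hit or miss
        if (g_oFibMap.containsKey(tnGenerations))
            return g_oFibMap.get(tnGenerations);
        long lnResult = tnGenerations <= 1 ? 1 :
                fibonacci(tnGenerations - 1) + fibonacci(tnGenerations - 2);
        g_oFibMap.put(tnGenerations, lnResult);
        return lnResult;
    }

    public static void main(String[] args) {
        fibonacci(50);
        System.out.println("calls after fibonacci(50): " + g_nCalls); // 99
        fibonacci(51);
        System.out.println("calls after fibonacci(51): " + g_nCalls); // 102
    }
}
```

Each of the 51 cache misses for generations 2 and up spawns two child calls, but every child below an already-cached generation returns immediately, which is where the 99 comes from.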


Just because you have a hammer doesn’t mean you should always use the hammer

So three options, which one should be used?

It depends on the problem you are solving. If you are solving fibonacci(100) a single time and you are going to leave your machine running while you take a two week holiday, go ahead and use the recursive function; who needs the polar ice caps anyway. In all seriousness, there is definitely a place for recursive functions, and the fibonacci function was specifically chosen to show where recursion falls over. Don't discount recursion everywhere; just be careful where you use it, and test against an iterative version of the function if you are performance driven. What you can do in a loop you can do with recursion, and vice versa.

Iteration is great, but it can get complicated. The best part of iteration is the performance gain, yet there are times when maintenance of an iterative function can be problematic; just try natural language processing. There is a tradeoff between maintainability and performance when choosing between recursion and iteration.

Memoisation is still messy in comparison to pure recursion, but the performance gains are enormous, and it remains far more readable than the iterative version. There is a loss in performance compared to pure iteration, but the readability more than makes up for it. There is also, of course, a memory tradeoff with memoisation, as the results have to be stored somewhere. Memoisation is good when you can afford to take the hit up front to improve performance on heavy calls. It is a technique heavily used in image manipulation, transformation, and forensics, to name just a few. Memory is cheap, cache what you can.
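Because the remember-or-compute pattern is the same regardless of the function being cached, it can be factored out into a reusable helper. Here is one possible sketch using java.util.function; the Memoiser name is an assumption for this example, and computeIfAbsent collapses the containsKey/get/put dance from the fibonacci example into a single call:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of a general-purpose memoisation wrapper for single-argument functions.
public class Memoiser {
    public static <T, R> Function<T, R> memoise(Function<T, R> fn) {
        Map<T, R> cache = new HashMap<>();
        // On a cache miss, computeIfAbsent runs fn and stores the result;
        // on a hit, it returns the stored value without calling fn.
        return input -> cache.computeIfAbsent(input, fn);
    }

    public static void main(String[] args) {
        Function<Integer, Long> square = memoise(n -> (long) n * n);
        System.out.println(square.apply(12)); // computed, then cached
        System.out.println(square.apply(12)); // served from the cache
    }
}
```

One caveat: this wrapper suits expensive but non-recursive calls; a function that recurses into itself through computeIfAbsent would modify the map mid-lookup, so the hand-rolled version above is still the right shape for fibonacci.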

Everything has its place, make sure you know what you are trading for with each decision you are making.