Managing code complexity

July 25, 2013 by Nikita Salnikov-Tarnovski Filed under: Programming

This article once again opens up the development of Plumbr. Follow me down the road and I will show how our code has evolved over the years in terms of code complexity management.

The story will be illustrated using Structure 101 products. The company has created a portfolio of products with which you can define rules for how different parts of your codebase may interact and depend on each other, enforce these rules in your build process and refactor when needed. I must admit we did not go the whole way and only used the Restructure product to visualize the dependencies, but as the product grows bigger we might consider embedding the whole process into our build cycle.
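To make the idea of dependency rules concrete, here is a minimal sketch in Python. It is not how Structure 101 works internally and does not use its API. The package names, the ALLOWED rule table and the find_violations helper are all hypothetical, made up for illustration.

```python
# Hypothetical rules: for each package, the packages it may depend on.
ALLOWED = {
    "report": {"io", "log"},
    "io": {"log"},
    "log": set(),
}

# Hypothetical dependencies as they might be extracted from a code base.
ACTUAL = {
    "report": {"io", "reference"},
    "io": {"log"},
    "log": {"io"},
}


def find_violations(dependencies, allowed):
    """Return sorted (source, target) pairs that the rules do not permit."""
    violations = []
    for source, targets in dependencies.items():
        permitted = allowed.get(source, set())
        for target in targets:
            # A package may always depend on itself.
            if target != source and target not in permitted:
                violations.append((source, target))
    return sorted(violations)


violations = find_violations(ACTUAL, ALLOWED)
for source, target in violations:
    print(f"{source} -> {target} is not allowed")
```

A build step could run a check like this and fail whenever the list of violations is not empty, which is roughly what enforcing the rules in the build process means.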

Anyhow, the products use two magic numbers to illustrate the quality of your source code. The actual mechanics of how the numbers are calculated are complex, but from 10,000 feet one could define the numbers as:

  • Tangle. How bad are the relationships between the code artifacts, such as the packages and classes.
  • Fat. How big and cluttered are the individual classes/packages.

Both of those are indeed magic numbers. It is not easy to prove that you have a better code base if your individual classes are lean (i.e. low on Fat) and have a reasonable number of close friends (i.e. are not Tangled). But it just feels that way: smaller individual pieces of code are easier to grasp, and clean dependencies make the code a lot more predictable. Together they build the foundation for code that is both less buggy and more maintainable.

The story starts at the beginning of 2011. The source code repository has just been created, and from the following screenshot we can see that Restructure is also happy with our code. The Tangle number is really low, so we could say we have laid a solid foundation for future releases. Or just that we have not yet had time to write more than a few thousand lines of source code.

Code complexity

But only six months into the project we see a different picture. We are still lean, but the dependencies have become a mess. The fat dashed lines indicate bidirectional relationships, which are a good way to make your code behave unpredictably across the board. And the whole code repository has formed one huge Tangle, indicated by the big red stack.
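For readers who want to see what a bidirectional relationship and a Tangle look like in data rather than in a picture, here is a rough sketch. It treats packages as nodes in a dependency graph, lists the pairs that depend on each other, and groups packages caught in a dependency cycle using Tarjan's strongly connected components algorithm. This is only an approximation of the idea: the way Structure 101 actually computes its Tangle number is not shown here, and the edges in GRAPH are made up.

```python
# Hypothetical package dependency graph: package -> packages it depends on.
GRAPH = {
    "fs": {"log"},
    "log": {"fs", "http"},
    "http": {"report"},
    "report": {"fs"},
    "io": set(),
}


def bidirectional_pairs(graph):
    """Return sorted pairs (a, b) where a depends on b and b depends on a."""
    pairs = set()
    for source, targets in graph.items():
        for target in targets:
            if source != target and source in graph.get(target, ()):
                pairs.add(tuple(sorted((source, target))))
    return sorted(pairs)


def tangles(graph):
    """Return groups of two or more packages that form a dependency cycle."""
    index = {}    # discovery order of each visited node
    lowlink = {}  # smallest index reachable from the node
    stack, on_stack = [], set()
    result = []

    def visit(node):
        index[node] = lowlink[node] = len(index)
        stack.append(node)
        on_stack.add(node)
        for target in graph.get(node, ()):
            if target not in index:
                visit(target)
                lowlink[node] = min(lowlink[node], lowlink[target])
            elif target in on_stack:
                lowlink[node] = min(lowlink[node], index[target])
        if lowlink[node] == index[node]:
            # node is the root of a component: pop its members off the stack.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            if len(component) > 1:
                result.append(sorted(component))

    for node in sorted(graph):
        if node not in index:
            visit(node)
    return sorted(result)


print(bidirectional_pairs(GRAPH))
print(tangles(GRAPH))
```

In this made-up graph only fs and log depend on each other directly, yet four packages end up in the same cycle. That is why a few innocent-looking back references can pull most of a code base into one Tangle.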

Source code visualized

Six more months into the development and we see that our source code is still not Fat. And that we are still trying to strangle ourselves with the Tangled codebase. But you can already see that some packages (allocation, io, lifecycle) are now isolated from the large Tangled base. I must admit this happened by accident as we started to use Structure 101 a bit later down the road.

Source code visualized

Another six months have passed and it is time for another snapshot of the source code repository. It seems that things are now really heading down the drain. Besides the horrific Tangle factor we have now become Fat as well. But around this time we started using Structure 101 products and analyzing the picture. Even with just this single screenshot we were able to ask ourselves some truly meaningful questions, such as:

  • Why does a report facility depend upon a reference package?
  • Why do we have such a strong cyclical dependency between the filesystem tools and log packages?

Of course we could have asked the very same questions without the tool at hand, but this is one of the cases where visualizing the problem really helps. I guess it has something to do with how our brains are wired, but let us leave this to the social scientists to figure out.

Code dependencies

Now equipped with this information we started to take corrective action. The following picture was taken around six months ago, and we can see that even with all the code being added we are starting to get rid of the Tangle. For example, a lot of the relationships between siblings (fs, http, …) are now gone for good.

Code dependencies

And the last picture was taken just a week ago. I can proudly say that even with a 25% larger code base we have managed to decrease the Tangle number from 39,000 to 16,000. This number alone does not say much, but the code just feels cleaner and the structure more natural. I am not quite sure there is a quantitative metric I could use to express this, but I can say my internal Feng Shui level has definitely increased.

Source code complexity

We have a long way to go. Or to put it another way, we have to run just to stay where we are: with the ever-increasing code base it is darn hard to enforce rules on the code structure. But as they say, the first step towards the cure is to admit you actually have a problem.

The story is a good example of how even a small team can, over a relatively short period of time, create a true mess in the code base. Now picture what a more typical project with an expected lifespan of a decade and a team of tens of developers might look like.
