Plumbr, blind and silent

March 7, 2012 by Nikita Salnikov-Tarnovski Filed under: Product Updates

Here’s the news for you: Java memory leaks have attitude, and you need to know your way around them to successfully nail and solve them. While this blog post is written in response to people who became confused while using our product for discovering memory leaks, it should offer some food for thought for other leak hunters as well.

I got an email a couple of days ago. The author had tried Plumbr on a self-made leaking test application and, to his disappointment, instead of getting a beautiful leak report he got an OutOfMemoryError. He was not the first to send us such comments, and from the fair amount of feedback we’ve gathered so far we’ve begun to see some patterns in how people use Plumbr.

These cases can be divided into two major groups:

  1. Your application crashed with an OutOfMemoryError without Plumbr reporting a leak.
  2. Your application runs without any sign of Plumbr finding (doing) anything.

Blind Plumbr

In fact, in my previous post about memory leaks in Java, I provided a sample piece of Java code that, when compiled and run, eventually leads to OutOfMemoryError. If you try running it with Plumbr, nothing changes. You will get the OOM nonetheless – and no report from Plumbr. Is Plumbr that bad?

We don’t think so. Let us review the problem that Plumbr solves. Plumbr pinpoints the location of the memory leak in your application and its root cause. Would it make sense to create a tool that shows you Mount Fuji when you stand in front of it? The main goal of Plumbr is to sort out all the clutter, mess, jumble, tangle, spaghetti and lasagna that constitute every modern Java application built upon layers of frameworks, patterns and best practices. And to separate the wheat from the chaff.

In contrast, most synthetic test cases and demonstrations of memory leaks in Java use a handful of classes, with 90%+ of the heap filled by the leak being demonstrated. In such a situation every tool that is even remotely related to memory or application performance will scream in your face. Not much to gain here.
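To make the point concrete, here is a minimal sketch of what such a synthetic test case typically looks like. It is not the code from the earlier post; the class name `SyntheticLeak` and its methods are hypothetical and exist only for illustration. Practically everything the program retains hangs off one static list, so there is no other application data for the leak to hide behind.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: nearly all retained heap belongs to one static list.
final class SyntheticLeak {
    // Entries are added and never removed, so they stay reachable forever.
    private static final List<byte[]> RETAINED = new ArrayList<>();

    // Adds `chunks` blocks of `chunkSize` bytes and returns the total bytes retained so far.
    static long leak(int chunks, int chunkSize) {
        for (int i = 0; i < chunks; i++) {
            RETAINED.add(new byte[chunkSize]);
        }
        return retainedBytes();
    }

    // Sums the sizes of all blocks still held by the static list.
    static long retainedBytes() {
        long total = 0;
        for (byte[] chunk : RETAINED) {
            total += chunk.length;
        }
        return total;
    }

    public static void main(String[] args) {
        // Keeps allocating 1 MB blocks until the JVM throws OutOfMemoryError.
        while (true) {
            leak(1, 1024 * 1024);
        }
    }
}
```

Running `main` with a small heap (for example `-Xmx64m`) ends in an OutOfMemoryError almost immediately, and any heap histogram taken along the way would be dominated by `byte[]`.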

As an application grows, the number of classes in it explodes. And when the size of your business implies that the cost of one minute of your application downtime is comparable to the budget of Nicaragua, then things get much more interesting. This leads us to one of the major challenges when developing Plumbr: how to best strike a balance between the Fuji and a snail.

In a nutshell, the problem is as follows: how to find a memory leak in small applications while at the same time minimizing the possibility of false alerts in large ones? We have solved it by distinguishing two sets of objects: the ones that are more or less growing and the others, more or less stagnant. This approach has one clear drawback: we need those “stagnant” objects. That means that there should be objects in your application that do not leak. A fair amount of them. As a result Plumbr does not spot extremely synthetic leaks, such as the one provided in the previous article.
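The idea of splitting objects into a growing set and a stagnant set can be illustrated with a small sketch. This is not Plumbr’s actual algorithm, which this post does not describe in detail. The class `GrowthClassifier`, the per-class instance counts taken at successive snapshots, and the `MIN_STAGNANT_SHARE` threshold are all assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; names and thresholds are assumptions.
final class GrowthClassifier {
    // Minimum share of non-growing classes required before anything is reported (assumed value).
    static final double MIN_STAGNANT_SHARE = 0.5;

    // A class is "growing" if its instance count never drops and ends higher than it started.
    static boolean isGrowing(long[] counts) {
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] < counts[i - 1]) {
                return false;
            }
        }
        return counts.length > 1 && counts[counts.length - 1] > counts[0];
    }

    // Returns suspected leaking classes in name order, or an empty list
    // when there is too little stagnant background to compare against.
    static List<String> suspects(Map<String, long[]> history) {
        List<String> growing = new ArrayList<>();
        for (Map.Entry<String, long[]> entry : new TreeMap<>(history).entrySet()) {
            if (isGrowing(entry.getValue())) {
                growing.add(entry.getKey());
            }
        }
        int stagnant = history.size() - growing.size();
        boolean enoughBackground =
                !history.isEmpty() && stagnant >= MIN_STAGNANT_SHARE * history.size();
        return enoughBackground ? growing : new ArrayList<>();
    }
}
```

With a history containing one steadily growing class and two stable ones, `suspects` returns the growing class. With a history containing only the growing class, as in a purely synthetic test, it returns nothing, which mirrors the blind spot described above.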

Silent Plumbr

Imagine yourself trying to find a black cat in a dark room by walking back and forth along one wall of the room. Only that one wall. What is the probability that you stumble upon the cat, provided that the room is wide enough? A friend of mine says: “50%. Either you stumble or not!” If that cat sits on the other side of the room – good luck finding it.

In order for any memory leak detection tool to find your problem, you must execute the part of the application that contains the leaking code! Unfortunately this is easier said than done. And that is the hardest problem to solve for any QA engineer: how to execute buggy code. How to find that buggy place in your code. Or how to be sure enough that there is none.

We chose the simple path. And the simplest way to get your buggy piece of code executed is to let your end users execute it. No kidding. If even they cannot find it, then the problem can be considered nonexistent anyway. The trick here is to minimize or eliminate the impact of that problem on the unfortunate user and to maximize the amount of new information you get from the experiment. In the case of a memory leak, where the “bug” does not represent a blow-the-app-at-once kind of problem, this is a viable approach. And the most widely used one. Let the end users use your application and at the same time monitor it for any signs of memory problems. That is exactly what Plumbr does. And that is why, for Plumbr to work, somebody (or something) must execute the leak.
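For readers who want a feel for what “monitoring a running application for signs of memory problems” can mean at its crudest, here is a small sketch built on the standard `java.lang.management` API. It is far simpler than what Plumbr does and only watches how full the heap is. The class `HeapWatcher`, the 90% threshold and the thread name are assumptions chosen for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; not part of Plumbr.
final class HeapWatcher {
    // Fraction of the maximum heap above which a warning is printed (assumed value).
    static final double WARN_RATIO = 0.9;

    // Used heap divided by maximum heap, or -1 when the JVM reports no maximum.
    static double heapUsageRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getMax() > 0 ? (double) heap.getUsed() / heap.getMax() : -1.0;
    }

    // Samples heap usage at a fixed interval on a daemon thread,
    // so the watcher never keeps the JVM alive on its own.
    static ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "heap-watcher");
            thread.setDaemon(true);
            return thread;
        });
        scheduler.scheduleAtFixedRate(() -> {
            double ratio = heapUsageRatio();
            if (ratio >= WARN_RATIO) {
                System.err.printf("Heap is %.0f%% full%n", ratio * 100);
            }
        }, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

Calling `HeapWatcher.start(30)` once at application startup would sample the heap every 30 seconds while real users exercise the code. A high reading alone does not prove a leak, which is exactly why the code that leaks has to be executed repeatedly before any tool can see a trend.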

If you have attached Plumbr to your application and are trying to evaluate whether it works, be sure to really use your application as much as possible. If you know exactly where the leak is, run that part again and again (but also keep in mind the first part of this article, the Blind Plumbr phenomenon!).

If you do not know the whereabouts of the leak, then use the whole application extensively, over and over. Plumbr will detect the leak and report it well before your end users experience any problems.

If you are not even sure whether there is a memory leak at all, and Plumbr doesn’t find anything… Well, chances are good that there really is no leak in your application. Congratulations!

Comments

Good heading!

Anonymous

And the most important advice: “Don’t Panic!” 😉

gorlok