

Adaptive heap sizing

October 29, 2014 by Nikita Salnikov-Tarnovski Filed under: Memory Leaks

While enhancing our test bed to improve the Plumbr GC problem detector, I ended up writing a small test case I thought might be interesting for a wider audience. The goal I was chasing was to test the JVM’s self-adaptiveness with regard to how the heap is segmented between the Eden, Survivor and Tenured spaces.

The test itself generates objects in batches. A batch is generated once per second, and each batch is 500KB in size. The objects are referenced for five seconds; after this the references are removed and the objects from that particular batch become eligible for garbage collection.
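
For reference, here is a minimal sketch of such a test. The class name and constants are mine, and the actual test used a separate producer and consumer thread (see the comments at the end of this post), but the allocation pattern is the same:

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class GcPressure {

        // 500KB allocated per batch, one batch per second
        private static final int BATCH_SIZE = 500 * 1024;
        // each batch stays referenced for five seconds
        private static final int BATCHES_TO_KEEP = 5;

        public static void main(String[] args) throws InterruptedException {
            Deque<byte[]> batches = new ArrayDeque<>();
            while (true) {
                // allocate a new 500KB batch
                batches.addLast(new byte[BATCH_SIZE]);
                // after five batches, drop the reference to the oldest one,
                // making it eligible for the next garbage collection
                if (batches.size() > BATCHES_TO_KEEP) {
                    batches.removeFirst();
                }
                Thread.sleep(1000);
            }
        }
    }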

The test was run with an Oracle HotSpot 7 JVM on Mac OS X, using ParallelGC, and was given 30MB of heap space to work with. Knowing the platform, we can expect the JVM to launch with the following heap configuration:

  • The JVM will start with 10MB in the Young and 20MB in the Tenured space, as without explicit configuration the JVM uses a 1:2 ratio to distribute the heap between the Young and Tenured spaces.
  • On my Mac OS X, the 10MB of Young space is further distributed between the Eden and two Survivor spaces, giving them 8MB and 2x1MB respectively. Again, these are the platform-specific defaults.
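
For reproducibility, the launch command was equivalent to something like the following. Note that GcPressure is my placeholder for the test class, and the exact flag combination is an assumption – only the 30MB heap and ParallelGC are stated above:

    java -Xms30m -Xmx30m -XX:+UseParallelGC GcPressure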

Indeed, when launching the test and peeking under the hood with jstat, we see the following, confirming our back-of-the-napkin estimates:

My Precious:gc-pressure me$ jstat -gc 2533 1s
 S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
1024.0 1024.0  0.0    0.0    8192.0   5154.4   20480.0      0.0     21504.0 2718.9      0    0.000   0      0.000    0.000
1024.0 1024.0  0.0    0.0    8192.0   5502.1   20480.0      0.0     21504.0 2720.1      0    0.000   0      0.000    0.000
1024.0 1024.0  0.0    0.0    8192.0   6197.5   20480.0      0.0     21504.0 2721.0      0    0.000   0      0.000    0.000
1024.0 1024.0  0.0    0.0    8192.0   6545.2   20480.0      0.0     21504.0 2721.2      0    0.000   0      0.000    0.000
1024.0 1024.0  0.0    0.0    8192.0   7066.8   20480.0      0.0     21504.0 2721.6      0    0.000   0      0.000    0.000
1024.0 1024.0  0.0    0.0    8192.0   7588.3   20480.0      0.0     21504.0 2722.1      0    0.000   0      0.000    0.000
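
As a quick legend for the jstat -gc output (all capacities and utilizations are in KB): S0C/S1C and S0U/S1U are the Survivor space capacities and usage, EC/EU the Eden, OC/OU the Old (Tenured) space, PC/PU the PermGen, and YGC/YGCT, FGC/FGCT, GCT the young and full collection counts and cumulative times in seconds.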

From here, we can also make the next set of predictions about what is going to happen:

  • The 8MB of Eden will be filled in around 16 seconds – remember, we are generating 500KB of objects per second.
  • At any given moment we have approximately 2.5MB of live objects – generating 500KB each second and keeping the references to those objects for five seconds gives us just about that number.
  • A minor GC will trigger whenever the Eden is full – meaning we should see a minor GC every 16 seconds or so.
  • After the minor GC, we will end up with premature promotion – the Survivor spaces are just 1MB in size, and the live set of 2.5MB will not fit into either of our 1MB Survivor spaces. So the only way to clean the Eden is to propagate the 1.5MB (2.5MB-1MB) of live objects not fitting into Survivor to the Tenured space. The arithmetic is spelled out right after this list.
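
Spelling out the back-of-the-envelope arithmetic behind these predictions:

    8,192KB Eden capacity / 500KB allocated per second  ≈ 16 seconds to fill the Eden
    500KB per second × 5 seconds of retention           ≈ 2.5MB of live objects at any moment
    2.5MB live set − 1MB Survivor capacity              ≈ 1.5MB prematurely promoted per minor GC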

Checking the logs gives us confidence in these predictions as well:

My Precious:gc-pressure me$ jstat -gc -t 2575 1s
Time   S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
  6.6 1024.0 1024.0  0.0    0.0    8192.0   4117.9   20480.0      0.0     21504.0 2718.4      0    0.000   0      0.000    0.000
  7.6 1024.0 1024.0  0.0    0.0    8192.0   4639.4   20480.0      0.0     21504.0 2718.7      0    0.000   0      0.000    0.000
	... cut for brevity ...
 14.7 1024.0 1024.0  0.0    0.0    8192.0   8192.0   20480.0      0.0     21504.0 2723.6      0    0.000   0      0.000    0.000
 15.6 1024.0 1024.0  0.0   1008.0  8192.0   963.4    20480.0     1858.7   21504.0 2726.5      1    0.003   0      0.000    0.003
 16.7 1024.0 1024.0  0.0   1008.0  8192.0   1475.6   20480.0     1858.7   21504.0 2728.4      1    0.003   0      0.000    0.003
	... cut for brevity ...
 29.7 1024.0 1024.0  0.0   1008.0  8192.0   8163.4   20480.0     1858.7   21504.0 2732.3      1    0.003   0      0.000    0.003
 30.7 1024.0 1024.0 1008.0  0.0    8192.0   343.3    20480.0     3541.3   21504.0 2733.0      2    0.005   0      0.000    0.005
 31.8 1024.0 1024.0 1008.0  0.0    8192.0   952.1    20480.0     3541.3   21504.0 2733.0      2    0.005   0      0.000    0.005
	... cut for brevity ...
 45.8 1024.0 1024.0 1008.0  0.0    8192.0   8013.5   20480.0     3541.3   21504.0 2745.5      2    0.005   0      0.000    0.005
 46.8 1024.0 1024.0  0.0   1024.0  8192.0   413.4    20480.0     5201.9   21504.0 2745.5      3    0.008   0      0.000    0.008
 47.8 1024.0 1024.0  0.0   1024.0  8192.0   961.3    20480.0     5201.9   21504.0 2745.5      3    0.008   0      0.000    0.008

Not quite every 16 seconds, but more like every 15 seconds or so, a garbage collection kicks in, cleans the Eden, promotes ~1MB of live objects to one of the Survivor spaces and overflows the rest to the Old space.

So far, so good. The JVM is behaving exactly the way we expect. The interesting part kicks in after the JVM has monitored the GC behaviour for a while and starts to understand what is happening. In our test case, this happens at around 90 seconds:

My Precious:gc-pressure me$ jstat -gc -t 2575 1s
Time   S0C    S1C    S0U    S1U      EC       EU        OC         OU       PC     PU    YGC     YGCT    FGC    FGCT     GCT   
 94.0 1024.0 1024.0  0.0   1024.0  8192.0   8036.8   20480.0     8497.0   21504.0 2748.8      5    0.012   0      0.000    0.012
 95.0 1024.0 3072.0 1024.0  0.0    4096.0   353.3    20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
 96.0 1024.0 3072.0 1024.0  0.0    4096.0   836.6    20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
 97.0 1024.0 3072.0 1024.0  0.0    4096.0   1350.0   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
 98.0 1024.0 3072.0 1024.0  0.0    4096.0   1883.5   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
 99.0 1024.0 3072.0 1024.0  0.0    4096.0   2366.8   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
100.0 1024.0 3072.0 1024.0  0.0    4096.0   2890.2   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
101.0 1024.0 3072.0 1024.0  0.0    4096.0   3383.7   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
102.0 1024.0 3072.0 1024.0  0.0    4096.0   3909.7   20480.0    10149.6   21504.0 2748.8      6    0.014   0      0.000    0.014
103.0 3072.0 3072.0  0.0   2720.0  4096.0   323.0    20480.0    10269.6   21504.0 2748.9      7    0.016   0      0.000    0.016

What we see here is the amazing adaptability of the JVM. After learning about the application’s behaviour, the JVM has resized the Survivor spaces to be big enough to hold all the live objects. The new configuration of the Young space is now:

  • Eden 4MB
  • Survivor spaces 3MB each

After this, the GC frequency increases – the Eden is now 50% smaller, and instead of ~16 seconds it now fills up in around 8 seconds or so. But the benefit is also visible: the Survivor spaces are now large enough to accommodate the live objects at any given time. Coupled with the fact that no objects live longer than a single minor GC cycle (remember, there are just 2.5MB of live objects at any given time), we stop promoting objects to the Old space.


Continuing to monitor the JVM, we see that the Old space usage is constant after the adaptation. No more objects are promoted to the Old space, but as no major GC is triggered, the ~10MB of garbage that managed to get promoted before the adaptation took place will live in the Old space forever.

You can also turn off this “amazing adaptiveness” if you are sure about what you are doing. Specifying -XX:-UseAdaptiveSizePolicy in your JVM parameters will instruct the JVM to stick to the parameters given at launch time and not try to outsmart you. Use this option with care; modern JVMs are generally really good at predicting a suitable configuration for you.
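
For example, the initial 8MB Eden + 2x1MB Survivor layout could be pinned down for the whole run with something like the following sketch (again, GcPressure is my placeholder class name, and the flag combination is an illustration, not the command used for this test):

    java -Xms30m -Xmx30m -XX:+UseParallelGC -XX:-UseAdaptiveSizePolicy \
         -XX:NewSize=10m -XX:MaxNewSize=10m -XX:SurvivorRatio=8 GcPressure

With -XX:SurvivorRatio=8, the 10MB Young generation is split 8:1:1, reproducing the 8MB Eden and two 1MB Survivor spaces we started with.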

I hope I managed to surface some interesting behavioural aspects of the JVM. If so, you might consider following us on Twitter or via our RSS feed; I promise to keep posting JVM insights regularly.


Comments

Hi, not really a GC question, but I see that your producer and consumer do no synchronization, the deque is not thread-safe, and you have an executor of size 2. I understand that it’s not relevant for this particular problem, but is there something that I don’t know of that makes this thread-safe? Or is it just not?

Vach

The only way I can think of that this would be safe, with thread 1 writing (producer) and thread 2 reading (consumer), is when the writer writes more than the cache line; then, because the reader lags by a significant time, it is guaranteed to load the correct data…

Vach

Hello, Vach. You are correct, this code has no thread-safety guarantees.

Nikita

Thanks Nikita. I am hooked on this blog!

hamid