
How Plumbr will help you
As the word “native” in the error message suggests, the root cause of this error is platform-dependent. On Unix-like systems, for example, it is not unusual to have a limit on how many processes a user can spawn:
$ ulimit -a | grep process
max user processes              (-u) 1400
On a Windows machine, another common cause for this error is simply not having enough free RAM to create a new thread. Unlike Linux, which has an OOM killer, Windows does not terminate an application that goes haywire and eats up all the memory. Since each thread requires some native memory, e.g. for its stack, creating a new thread may fail because some application (not necessarily the current one) has consumed all the memory available to the operating system. This is especially common on 32-bit machines (or when running a 32-bit JVM).
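Because every Java thread reserves a native stack (its size is governed by the -Xss flag), merely creating threads consumes memory outside the heap. The following sketch, not taken from any real application, spawns idle sleeper threads up to a cap; raising the cap (and shrinking -Xss) is the classic way to reproduce the “unable to create new native thread” error on your own machine:

```java
// Sketch: each thread's native stack costs memory even while the thread
// just sleeps. With a high enough cap, thread creation eventually throws
// java.lang.OutOfMemoryError: unable to create new native thread.
public class ThreadLimitDemo {

    // Start up to 'cap' sleeping daemon threads; return how many were
    // created before the OS refused to allocate another native stack.
    static int spawnSleepers(int cap) {
        int created = 0;
        try {
            while (created < cap) {
                Thread t = new Thread(() -> {
                    try { Thread.sleep(Long.MAX_VALUE); }
                    catch (InterruptedException ignored) { }
                });
                t.setDaemon(true); // allow the JVM to exit despite sleepers
                t.start();
                created++;
            }
        } catch (OutOfMemoryError e) {
            // thrown when no native memory is left for another thread stack
            System.out.println("Failed after " + created + " threads: " + e.getMessage());
        }
        return created;
    }

    public static void main(String[] args) {
        // A small cap just demonstrates the mechanics without exhausting
        // the machine; use a cap of several thousand to actually hit the error.
        System.out.println("Created " + spawnSleepers(50) + " threads");
    }
}
```

Run with a reduced stack size, e.g. `java -Xss256k ThreadLimitDemo`, to see how stack size trades off against the number of threads you can create.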
By exposing the number of threads, their names, and their stack traces, Plumbr Agent allows you to quickly pinpoint the root cause of the problem. Was the number of threads close to the ulimit? Or were there actually very few threads, pointing to a lack of memory instead? Were these threads even doing anything useful, or just wasting resources?
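If you cannot attach a monitoring agent, the same raw evidence (thread count, names, and states) is available from the JDK itself. A minimal sketch using the standard Thread.getAllStackTraces() API:

```java
import java.util.Map;

// Sketch: list every live thread in the JVM with its name and state,
// the same raw data a monitoring agent or a jstack dump would show.
public class ThreadCensus {

    // Print one line per thread and return the total count.
    static int census() {
        Map<Thread, StackTraceElement[]> traces = Thread.getAllStackTraces();
        for (Map.Entry<Thread, StackTraceElement[]> e : traces.entrySet()) {
            Thread t = e.getKey();
            // TIMED_WAITING / WAITING threads are the idle ones to look for
            System.out.printf("%-40s %s%n", t.getName(), t.getState());
        }
        return traces.size();
    }

    public static void main(String[] args) {
        System.out.println("Total threads: " + census());
    }
}
```

Outside the application, `jstack <pid>` produces an equivalent dump without any code changes.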
How to solve this
In this particular example, we see that the number of threads exposed by Plumbr is suspiciously close to the ulimit of 1400 set for the user running the java process. A possible fix would be to adjust the ulimits. This is OS-dependent. For instance, on Linux this would require editing the /etc/security/limits.conf file to set the hard and soft limits:
user_name soft nproc 2047
user_name hard nproc 16384
You can also use this command to set the maximum for the current session:
ulimit -u 16384
However, a closer look at the stack traces reveals that about a thousand threads were actually either sleeping or just waiting for something. Thread names like "MessageProcessingSystem-scheduler-*" suggest that these are in fact thread pool workers. And since the threads are idle, the pools are clearly overprovisioned.
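Since the idle threads appear to be pool workers, the application-side fix is to size the pool to the actual workload and let idle workers die off. A sketch of a right-sized scheduler pool; the pool size of 4, the keep-alive time, and the thread name are illustrative assumptions, not values taken from the analyzed application:

```java
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Sketch: a bounded scheduler pool that reclaims idle workers instead
// of keeping a thousand sleeping threads (and their native stacks) alive.
public class SchedulerPoolConfig {

    static ScheduledThreadPoolExecutor newScheduler(int coreThreads) {
        ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(
                coreThreads,
                r -> {
                    // hypothetical name echoing the pools seen in the dump
                    Thread t = new Thread(r, "MessageProcessingSystem-scheduler");
                    t.setDaemon(true);
                    return t;
                });
        // Let even core threads time out when there is no work for them.
        pool.setKeepAliveTime(60, TimeUnit.SECONDS);
        pool.allowCoreThreadTimeOut(true);
        return pool;
    }

    public static void main(String[] args) {
        ScheduledThreadPoolExecutor pool = newScheduler(4);
        System.out.println("Core pool size: " + pool.getCorePoolSize());
        pool.shutdown();
    }
}
```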
Mitigating the OutOfMemoryError may be as simple as capping the thread count, for example on the Apache Tomcat connector:
<Connector ... maxThreads="100" minSpareThreads="25" maxSpareThreads="50" ... />