Stack smashing detected
I bet every Java developer was surprised early in their career when they first encountered native methods in Java code.
I am also sure that the surprise faded over the years as they came to understand how the JVM handles calls to native implementations via JNI.
This post is about a recent experience with native methods. More specifically, it is about how native methods can make the JVM crash silently, without any reasonable traces in log files. To walk you through the experience, I have created a small test case.
It consists of a simple Java class that calculates checksums for files. To achieve Awesome Performance (TM), I decided to implement the checksum calculation in native code. The code is simple and straightforward, and so is running it. You just need to clone the repository and launch it as in the following example:
$ ./gradlew jarWithNatives
$ java -jar build/libs/checksum.jar 123.txt
Exiting native method with checksum: 1804289383
Got checksum from native method: 1804289383
The code seems to work just as expected. The not-so-straightforward part shows up when you find yourself staring at the output for a slightly different (longer) input filename:
$ java -jar build/libs/checksum.jar 123456789012.txt
Exiting native method with checksum: 1804289383
*** stack smashing detected ***: java terminated
So the native method finished its execution just fine, but control was never returned to Java. Instead, the JVM crashed without so much as a crash log. Note that I only tested the examples on Linux and Mac OS X, and they might behave differently on Windows.
The underlying problem is not too complex and is probably immediately visible in the C code:
char dst_filename[MAX_FILE_NAME_LENGTH];
// cut for brevity
sprintf(dst_filename, "%s.digested", src_filename);
From the above it is clear that the buffer can only hold a fixed number of characters. With longer inputs, the remaining characters are written past its end. This is what smashes the stack, and it opens the door to potential exploits or simply leaves the application in an unpredictable state.
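To see why only the longer filename fails, it helps to count bytes. The sketch below asks snprintf how many bytes the formatted name would need (calling it with a NULL buffer and size 0 only reports the length) and compares that with the buffer capacity. The capacity of 20 is an assumed value picked for illustration, and the helper names are hypothetical; the actual MAX_FILE_NAME_LENGTH of the test case is not shown above.

```c
#include <stdio.h>

/* Assumed capacity, for illustration only; the real MAX_FILE_NAME_LENGTH is not shown in the post. */
#define ASSUMED_MAX_FILE_NAME_LENGTH 20

/* Returns the number of bytes "<src>.digested" needs, including the terminating NUL. */
int bytes_needed(const char *src_filename) {
    /* With a NULL buffer and size 0, snprintf writes nothing and only reports the length. */
    return snprintf(NULL, 0, "%s.digested", src_filename) + 1;
}

/* Returns 1 if the formatted name fits into a buffer of the assumed capacity, 0 otherwise. */
int fits_in_buffer(const char *src_filename) {
    return bytes_needed(src_filename) <= ASSUMED_MAX_FILE_NAME_LENGTH;
}
```

For 123.txt the result needs 17 bytes (7 for the name, 9 for ".digested", 1 for the terminator), while 123456789012.txt needs 26. Under the assumed capacity the first fits and the second does not, which matches the behavior seen in the two runs.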
For C developers, the underlying stack protector mechanism is well known, but for Java developers it might need a bit more explanation. Besides using the much safer snprintf, which takes the buffer length and does not write past it, you can also ask the compiler to add stack protectors or memory sanitization to the compiled code. The available safety nets vary significantly from compiler to compiler, and even between versions of the same compiler, but here’s an example:
gcc -fstack-protector CheckSumCalculator.c -o CheckSumCalculator.so
When the code is compiled with the stack protector in place, the runtime library or the OS may detect this situation under some conditions and terminate the program to prevent unexpected behavior.
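For readers who have not met the mechanism before, the idea behind a stack protector can be imitated by hand. The following is only a conceptual simulation, not what gcc actually emits: a guard value sits right behind a buffer, an unchecked copy runs over it, and a check before returning notices the damage. The struct, the function name and the guard constant are all made up for this sketch; a real protector uses a value randomized at process start and aborts the process instead of returning a flag.

```c
#include <stddef.h>
#include <string.h>

/* Arbitrary marker for the demo; real canaries are randomized at startup. */
#define GUARD_VALUE 0xDEADC0DEu

/* A buffer followed by a guard, laid out by hand to mimic a protected stack frame. */
struct guarded_frame {
    char buffer[8];
    unsigned int guard;
};

/* Copies len bytes into the frame without checking against sizeof(buffer),
 * then reports whether the guard survived: 1 = intact, 0 = smashed. */
int write_and_check(const char *data, size_t len) {
    struct guarded_frame frame;
    frame.guard = GUARD_VALUE;

    /* Bounded by the whole struct, not by the buffer, so the demo itself
     * never touches memory it does not own. */
    if (len > sizeof(frame)) {
        len = sizeof(frame);
    }
    memcpy(&frame, data, len);  /* runs past buffer[] into guard when len > 8 */

    return frame.guard == GUARD_VALUE;
}
```

Copying 8 bytes leaves the guard untouched, while copying 12 bytes overwrites it, and that is the moment at which a real stack protector would terminate the program.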
When the code is compiled without the stack protector, as in the following example,
gcc -fno-stack-protector CheckSumCalculator.c -o CheckSumCalculator.so
the results of running such code can become completely unpredictable. In some cases the code might complete seemingly fine, while in others the overflow corrupts neighboring memory and the program misbehaves or crashes. While in this example using snprintf and enabling the compiler's protections will definitely help, the error may easily be much more subtle than that and not be caught automatically.
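As a minimal sketch of the snprintf approach, assuming the same "%s.digested" format as in the snippet shown earlier, the helper below never writes more than dst_size bytes and tells the caller when the name did not fit. The function name build_dst_filename and its return convention are hypothetical and not taken from the test case.

```c
#include <stdio.h>
#include <stddef.h>

/* Formats "<src>.digested" into dst without ever writing past dst_size bytes.
 * Returns 0 on success, -1 if the result did not fit or formatting failed. */
int build_dst_filename(char *dst, size_t dst_size, const char *src_filename) {
    int written = snprintf(dst, dst_size, "%s.digested", src_filename);
    if (written < 0 || (size_t)written >= dst_size) {
        /* Output was truncated: report an error instead of using a cut-off name. */
        return -1;
    }
    return 0;
}
```

Checking the return value matters: snprintf truncates silently, so without the check the code would carry on with a shortened filename rather than fail visibly.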
Going back to the allegedly safe Java world, such a buffer overflow may corrupt internal JVM structures, or even enable whoever supplied the string to execute arbitrary code. So the stack protector compiled into the native library places guard values on the stack, and if these values are found mangled when the native function returns, the whole process, JVM included, is terminated immediately. Why the abort happens without a more detailed error log is a different question and outside the scope of this post.
I hope the post saves someone an all-nighter or two when facing abrupt JVM death without even a crash log. The “stack smashing detected” message in the standard error stream is not even present on all platforms, and it can take lots and lots of time to figure out what happened, especially if you are running a third-party native library with no source code.