Acquiring JDBC Connections. What can possibly go wrong?
Acquiring a database connection sounds like a safe operation. I mean, how bad can it be – calls to methods such as DataSource.getConnection() or DriverManager.getConnection() are straightforward and simple. The last thing you would expect is a performance hit, correct?
Apparently there are many ways to severely hurt yourself with something as simple as JDBC connection retrieval. In this post I will walk you through the most common problems we discovered while preparing to launch the next upgrade to Plumbr’s JDBC monitoring feature.
No Connection Pool
Connecting to any remote system is expensive. JDBC connections are no different in this regard, meaning that each time a JDBC connection is created, the application spends a significant amount of time waiting for the connection to be established.
Pooled connections are left connected to the database and can be shared across the different components needing database access. This approach allows the cost of connection establishment to be shared across all transactions requiring access to the database.
However, it is amazing how frequently connection pools are missing altogether and connections are created each time the database is accessed. The problem turns from bad to worse when coupled with the (in)famous N+1 problem: instead of creating the expensive connection once per transaction, the application initializes potentially hundreds or even thousands of connections during a single user interaction.
The solution to this problem is easy. Do yourself a favor and use a connection pool when connecting to a remote system. For JDBC, you have a variety of options to choose from; our recommendation is to go with HikariCP, for example.
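To make this concrete, here is a minimal sketch of acquiring connections through a HikariCP-backed DataSource. The class name, JDBC URL and credentials are placeholders of my own choosing, not anything prescribed by HikariCP:

import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;

import com.zaxxer.hikari.HikariConfig;
import com.zaxxer.hikari.HikariDataSource;

public class PooledAccessExample {

    public static void main(String[] args) throws Exception {
        // Configure the pool once at application startup.
        HikariConfig config = new HikariConfig();
        config.setJdbcUrl("jdbc:postgresql://localhost:5432/mydb"); // placeholder URL
        config.setUsername("app_user");                             // placeholder credentials
        config.setPassword("secret");
        config.setMaximumPoolSize(10);

        // Borrow connections from the pool instead of opening a new one per request.
        try (HikariDataSource dataSource = new HikariDataSource(config);
             Connection connection = dataSource.getConnection();
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SELECT 1")) {
            resultSet.next();
        }
    }
}

In a real application the DataSource would of course be created once and reused for the lifetime of the application, not opened and closed around a single query.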
Uninitialized Connection Pool
Even when you have actually set up a pool, the connections in it might not yet exist at the moment they are requested. Depending on the implementation used, the connection pool might for example:
- initialize the pool lazily, creating the connections only when requested,
- have different limits to initial and maximum size, growing beyond the initial size lazily,
- decrease the number of connections in the pool during idle times.
All these situations can result in a request for a connection being blocked for far longer than a fully initialized pool would have needed to deliver it.
The solution to the problem is often as easy as initializing the pool to the maximum size it will ever need. In cases where elasticity needs to be built into the pool, make sure the size adapts preemptively, avoiding the situation where end user transactions block while a connection is being established.
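With HikariCP, for example, one way to get a fully initialized, fixed-size pool is to keep the minimum idle count equal to the maximum pool size. A sketch, assuming a HikariConfig built as in the earlier example and with the sizes picked arbitrarily:

// Keeping minimumIdle equal to maximumPoolSize gives a fixed-size pool
// that is filled up front instead of growing lazily under load.
config.setMinimumIdle(20);
config.setMaximumPoolSize(20);
HikariDataSource dataSource = new HikariDataSource(config);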
Testing Connections Before Utilization
Pooling, like all solutions to a problem, creates some problems of its own along the way. One of the most common issues undermining the benefits of connection pooling is the fact that pooled connections can end up being stale. This most often happens due to inactive connections being timed out by network devices between the JVM and the database. As a result, there will be stale connections in the pool.
To avoid this situation, most connection pools provide a way to test the connection before handing it off to the worker thread. The JDBC API provides the standard java.sql.Connection.isValid() method for this purpose, and pools can also run a simple test query against the database – on Oracle databases, for example, “SELECT 1 FROM DUAL” is often executed to check whether or not the connection is still alive.
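Conceptually, the check a pool performs before handing out a connection looks roughly like the sketch below; connection and createFreshConnection() stand in for the pool’s internal state and logic:

// Hand the connection out only if the validity check succeeds within 5 seconds.
if (connection.isValid(5)) {
    return connection;              // still alive, give it to the requesting thread
} else {
    connection.close();             // stale, discard it
    return createFreshConnection(); // hypothetical helper that opens a replacement
}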
These checks should be as cheap as possible, but if they pile up they can become a performance bottleneck. For example, if you have configured the pool to test each connection every time it is handed out, you have effectively doubled the number of requests from the JVM to the database.
An additional problem arises when the application uses an expensive operation to test the connection. We have seen queries such as “SELECT ID FROM INVOICE” being used as the test operation. With a large number of entries in that table, this query alone can impact execution times significantly.
As a solution, you should first make sure you understand the network configuration between the JVM and the database in question. If a firewall between the two drops inactive connections after 30 minutes of idling, it probably does not make sense to test the connections more frequently than once every 29 minutes or so.
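With HikariCP, for instance, you could recycle connections well before such a hypothetical 30-minute cutoff. The values below are illustrative rather than recommendations (TimeUnit is java.util.concurrent.TimeUnit):

// Retire idle connections and recycle long-lived ones before the firewall
// has a chance to silently drop them.
config.setIdleTimeout(TimeUnit.MINUTES.toMillis(10));
config.setMaxLifetime(TimeUnit.MINUTES.toMillis(29));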
Under-provisioned Connection Pools
Whenever all the connections in the pool are in use, the pool will not be able to service further incoming requests for connections. Depending on the implementation, the datasource will either start queuing up or dropping the calls. Both outcomes quickly become a burden for end users – either via intolerably slow execution times or outright failures in functionality.
Before jumping to the seemingly obvious solution of increasing the pool size, you should take a step back and check whether or not the active connections are used in the best possible way. An under-provisioned pool is often just a symptom of the connections being held up by slow JDBC statement executions. If you increase the size of the pool in such a case, the main performance bottleneck will not actually be alleviated.
Leaking Connections
Another issue that quickly exhausts the available connections in the pool is JDBC connection leakage. If a connection is not closed, the connection pool does not know that it is no longer being used by the borrowing thread and will consider it to still be in use.
The best solution is to make sure your codebase does not leak connections. Using Spring JDBC templates, for instance, takes care of this for you. As a safety fallback, many JDBC driver vendors and connection pool implementations also offer the ability to forcefully close connections (or notify you about connection leakage) under certain conditions.
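The simplest way to guarantee that a connection always finds its way back to the pool is try-with-resources. A sketch, where dataSource is any pooled javax.sql.DataSource, the query and the customerId variable are made up for illustration, and the java.sql imports are omitted for brevity:

// Connection, statement and result set are all closed automatically,
// even if an exception is thrown, so the pool always gets its connection back.
try (Connection connection = dataSource.getConnection();
     PreparedStatement statement = connection.prepareStatement("SELECT name FROM customer WHERE id = ?")) {
    statement.setLong(1, customerId);
    try (ResultSet resultSet = statement.executeQuery()) {
        while (resultSet.next()) {
            // process the row
        }
    }
}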
As an example of such a fallback, the Tomcat connection pool has the logAbandoned and suspectTimeout properties for configuring the pool to log possible leak suspects.
HikariCP takes a similar approach: you can tweak the leakDetectionThreshold property to be warned about leaking connections.
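For instance, warning about connections held out of the pool for longer than, say, 30 seconds takes a single line of configuration (the threshold here is an arbitrary example):

// Log a warning if a connection is not returned to the pool within 30 seconds.
config.setLeakDetectionThreshold(TimeUnit.SECONDS.toMillis(30));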
Pool implementations also offer truly weird “solutions” to the problem. For example, it is possible to force the Tomcat connection pool to automatically remove abandoned connections from the pool and replace them with fresh ones. Notice that a “solution” like this renders the connection pool close to useless: if you end up recreating connections in the pool, why would you even use the pool in the first place?
Take-Away
Should you now be worried enough to drop everything and start rebuilding the way your application handles JDBC connections? Most likely not. As always with performance, it all boils down to the “measure, do not guess” approach.
And being an engineer at Plumbr, I can only recommend my craft – grab our 14-day trial and find out whether any of your end users suffer from expensive JDBC connection retrieval or not.