Most popular relational databases
It has been more than five months since Plumbr started monitoring Java Virtual Machines to detect expensive JDBC operations. That has given us time to gather some interesting insights about database integrations from the JVM standpoint. As always, we are happy to share them with you.
The data originates from 1,272 unique Java Virtual Machines monitored by Plumbr which matched all the following criteria:
- The JVM was active during June – Oct 2015;
- The JVM had the JDBC monitoring feature enabled;
- The JVM used JDBC Type 3 or Type 4 drivers to communicate with the database;
- The JVM made at least one call to a database using the JDBC Connection API.
What are the most popular database vendors?
The first question we sought to answer concerned vendor popularity. Our initial guess was that the big enterprise players Oracle and IBM would have a strong presence, with popular open-source alternatives following closely. To verify this, we extracted the vendor information by calling java.sql.Connection.getMetaData().getDatabaseProductName() on the first call to the database. This resulted in the following popularity ranking:
The top four players in the list are not too surprising in hindsight: MySQL, PostgreSQL, Oracle and Microsoft SQL Server claim the top places in most rankings of relational databases. So much for our initial guesses.
What is surprising, though, is the massive lead MySQL holds over the others: 40% of the JVMs connected to at least one MySQL instance. This is double the share of the other podium finishers, PostgreSQL and Oracle, each present in about 20% of the JVMs monitored. Microsoft is further behind, with 13% of deployments connecting to at least one MS SQL Server instance.
Another anomaly on the chart is in 9th place, where IBM is hiding with its DB2. Here we acknowledge a potential bias. Plumbr started supporting IBM JVMs only two months ago, and IBM DB2 deployments tend to go together with IBM J9 JVM installations. Given the low number of IBM JVMs monitored during the period, the low number of IBM databases detected is potentially caused by the late support for J9. All in all, we detected 20 different vendors in those 1,272 JVMs. The long tail, where a vendor was detected in three or fewer JVMs, is consolidated under the “Other” category.
Careful readers will have noticed that the numbers on the chart add up to more than 100%. This is indeed the case, because a single JVM can be integrated with several different databases in the background. This lays the foundation for the next question we wanted to answer.
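For reference, the vendor lookup described at the start of this section can be sketched in Java roughly as follows. This is a minimal illustration and not Plumbr's actual agent code: the class name VendorProbe and the method vendorOf are made up for the example, and only the java.sql calls are standard JDBC API.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.SQLException;

// Hypothetical helper, named for this example only.
final class VendorProbe {
    private VendorProbe() {}

    /**
     * Returns the product name the JDBC driver reports for an open connection,
     * for example "MySQL" or "PostgreSQL". The exact string is driver-specific.
     */
    static String vendorOf(Connection connection) throws SQLException {
        DatabaseMetaData metaData = connection.getMetaData();
        return metaData.getDatabaseProductName();
    }
}
```

The returned string comes straight from the driver, so any ranking built on it groups databases by whatever name each driver chooses to report.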
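To make the "more than 100%" effect concrete, here is a small sketch of how per-vendor shares can be computed when one JVM may talk to several vendors. The class VendorShare, its method, and the input shape (JVM id mapped to the set of vendors seen in that JVM) are assumptions made for this illustration; the post does not describe how the real aggregation was implemented.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical aggregation, named for this example only.
final class VendorShare {
    private VendorShare() {}

    /**
     * For each vendor, returns the percentage of JVMs that connected to it
     * at least once. A JVM with several vendors is counted once per vendor,
     * which is why the percentages can add up to more than 100.
     */
    static Map<String, Double> shareByVendor(Map<String, Set<String>> vendorsByJvm) {
        Map<String, Integer> jvmCount = new TreeMap<>();
        for (Set<String> vendors : vendorsByJvm.values()) {
            for (String vendor : vendors) {
                jvmCount.merge(vendor, 1, Integer::sum);
            }
        }
        int totalJvms = vendorsByJvm.size();
        Map<String, Double> share = new TreeMap<>();
        for (Map.Entry<String, Integer> entry : jvmCount.entrySet()) {
            share.put(entry.getKey(), 100.0 * entry.getValue() / totalJvms);
        }
        return share;
    }
}
```

With a made-up sample of four JVMs, where one uses only MySQL, one uses MySQL and Oracle, one uses only PostgreSQL and one uses Oracle and PostgreSQL, each vendor ends up with a 50% share and the shares sum to 150%.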
How many data sources per JVM?
Another interesting way to look at this dataset is to see how many different data sources a JVM is typically integrated with. In this data set, the distribution of the JDBC data sources per JVM looked like the following:
From the above it is clear that in the majority of cases just a single data source was present in the JVM. More than 50% (close to 700 in total) of the JVMs integrated with just a single JDBC data source.
On the other end of the spectrum, we find 18 JVMs with more than 10 different JDBC data source integrations. The record holder in this regard managed to integrate with 35 different JDBC data sources from a single JVM.
All in all, the numbers confirmed our expectation that a lot of applications integrate with many downstream data sources. I can only say I am impatient to re-run this analysis next year, after Plumbr has learned to detect all the NoSQL players as downstream integrations in addition to the current relational database world.
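A distribution like the one discussed above can be derived with a simple histogram. The sketch below assumes that each data source is identified by a string such as its JDBC URL and that the input maps a JVM id to the set of data sources seen in that JVM; both are assumptions for the example, as is the class name DataSourceHistogram.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, named for this example only.
final class DataSourceHistogram {
    private DataSourceHistogram() {}

    /**
     * Maps "number of distinct data sources" to "number of JVMs that
     * integrate with exactly that many data sources".
     */
    static Map<Integer, Integer> histogram(Map<String, Set<String>> dataSourcesByJvm) {
        Map<Integer, Integer> jvmsPerCount = new TreeMap<>();
        for (Set<String> dataSources : dataSourcesByJvm.values()) {
            jvmsPerCount.merge(dataSources.size(), 1, Integer::sum);
        }
        return jvmsPerCount;
    }
}
```

A TreeMap keeps the buckets sorted by data source count, so the single-data-source bucket comes first and outliers such as the 35-data-source JVM end up at the tail.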
Are the database versions being updated?
One more question we looked into was the pace at which database engines get upgraded. For this we went through a more detailed analysis, but for clarity let us look at just the most popular vendor’s data, the MySQL versions out there:
The best news is barely visible on the chart. We detected just a single 4.1 instance among all the 857 MySQL databases the JVMs connected to. Apparently databases get upgraded faster than JVMs, with just one ancient database engine from 2004 still hanging around.
The next two versions are not that awesome, considering that 5.0 was released a decade ago and 5.1 is also more than seven years old now. We suspect that part of the reason is the default bundling policy of Linux distros; for example, Oracle Linux bundled MySQL 5.1 as the default MySQL version as recently as 2014.
The most popular versions, 5.5 with 34% and 5.6 with 44% of the instances, can be considered up to date if you think about the slow and careful pace at which enterprises tend to upgrade their software stacks. More good news is peeking at us from just above 2% of the data sources: the brand new MySQL 5.7, released less than a month ago, is already picking up speed. Good to see some early adopters in the enterprise field.
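Grouping instances into families such as 5.5 or 5.6 requires reducing the full version string to its major and minor parts. The post does not say how the versions were collected; the sketch below assumes the raw string comes from the standard java.sql.DatabaseMetaData.getDatabaseProductVersion() call and looks something like "5.6.27-log". The class name VersionFamily and the "unknown" fallback are made up for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
final class VersionFamily {
    // Leading "major.minor" digits; anything after them is ignored.
    private static final Pattern MAJOR_MINOR = Pattern.compile("^(\\d+)\\.(\\d+)");

    private VersionFamily() {}

    /** Reduces a full version string to "major.minor", or "unknown" if it cannot be parsed. */
    static String of(String productVersion) {
        if (productVersion == null) {
            return "unknown";
        }
        Matcher matcher = MAJOR_MINOR.matcher(productVersion.trim());
        return matcher.find() ? matcher.group(1) + "." + matcher.group(2) : "unknown";
    }
}
```

JDBC also offers DatabaseMetaData.getDatabaseMajorVersion() and getDatabaseMinorVersion(), which avoid string parsing when the driver implements them.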
Summary
I am not fully convinced that you can apply the knowledge from this post in your daily practice tomorrow. However, I am almost certain that you got an interesting peek into a data set from the field, representing the status quo in the world of relational databases. If you enjoyed the content, subscribe to our Twitter feed; we will keep posting interesting stuff from the performance monitoring field.