
Slow Lucene Operations

Lucene indexes are the backbone of almost any full-text search service implemented in Java. Solr, Elasticsearch and Liferay are just some of the products built on Lucene indexes. Querying large indexes or updating complex index entries can significantly degrade the performance of such applications.
Get a full view of Lucene transactions by attaching Plumbr to your application.

How Plumbr will help you

The screenshot above is taken from a real-world application suffering from poorly performing Lucene indexes. Plumbr has detected that the search operation via MMapDirectory.search(?,?,?) was too slow, taking 5,587 milliseconds to complete. It is also visible that this was not a single incident: during the monitored period, there were 228 cases in total where the same operation contributed to a poor user experience.

The information needed to alleviate the problem is embedded in the following two blocks. The first of these exposes the specific call to the index, along with the exact parameters used for the access:

  "query" : "+(+formLocale:6751 -status:-1 -status:0) +(+(+(dataBundleId:[2246949 TO 2246949] dataBundleId:[2296374 TO 2296374] dataBundleId:[2296375 TO 2296375] dataBundleId:[2296376 TO 2296376] dataBundleId:[2296381 TO 2296381] dataBundleId:[2296392 TO 2296392] dataBundleId:[2296433 TO 2296433] dataBundleId:[2299809 TO 2299809])))",
  "n" : "99",
  "sort" : "[{\"Type\":\"CUSTOM\"}]"

The block next to it exposes the call stack through which the index was accessed:


In addition, Plumbr gives you a full view of the transaction in which this particular index was accessed. Opening the transaction link in the header reveals the following information:

From the above it is clear that the particular call to MMapDirectory.search(?,?,?) contributed 5,587 ms, roughly 70%, of the entire transaction duration of 7,992 ms.

The Solution

The solution to the particular issue described in the example involved stricter validation of the search criteria, as the search in question accessed 27,708 documents in the index. The business case did not require such a large scan of the index, so the solution was moderately easy.
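As an illustration of what such validation might look like, the sketch below rejects a search before it reaches the index when the criteria are too broad. The class name, the limits, and the choice to cap the number of dataBundleId values are assumptions made for this example; they are not the actual fix applied in the application above and are not part of Plumbr or Lucene.

```java
import java.util.List;

/**
 * Hypothetical pre-flight check run before a query is sent to the index.
 * The limits are illustrative and should be derived from the business case.
 */
final class SearchCriteriaValidator {
    private final int maxIds;
    private final int maxResults;

    SearchCriteriaValidator(int maxIds, int maxResults) {
        this.maxIds = maxIds;
        this.maxResults = maxResults;
    }

    /** Throws IllegalArgumentException when the criteria would cause an overly broad scan. */
    void validate(List<Long> dataBundleIds, int requestedResults) {
        if (dataBundleIds == null || dataBundleIds.isEmpty()) {
            throw new IllegalArgumentException("At least one dataBundleId is required");
        }
        if (dataBundleIds.size() > maxIds) {
            throw new IllegalArgumentException(
                "Too many dataBundleIds: " + dataBundleIds.size() + " > " + maxIds);
        }
        if (requestedResults < 1 || requestedResults > maxResults) {
            throw new IllegalArgumentException(
                "Requested result count must be between 1 and " + maxResults);
        }
    }
}
```

Failing fast in the calling code keeps an overly broad query from ever reaching the index, which is usually cheaper than tuning the index to tolerate it.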

In general, the solutions to performance issues revolving around Lucene indexes tend to include:

  • Making sure the indexes are small enough to be cached. Indexes that are too large will overflow to disk, making index scans significantly slower.
  • Using the correct caching policies (LRU, MRU, etc.), based on the business case at hand.
  • Verifying the presence of the search fields in the index. If some of the fields in the query are not indexed, the queries would not perform as expected.
  • Rethinking the architecture of the indexes, potentially separating the index updates from read operations and propagating changes from the single master node to read-only nodes.
  • Decoupling index updates from the transaction context by invoking the actual update asynchronously in a different thread. Some consistency issues aside, this will improve the perceived latency.
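
The last item in the list, decoupling index updates from the transaction context, can be sketched with nothing but the JDK. The example below is a minimal illustration under stated assumptions: AsyncIndexUpdater is a hypothetical name, and the Consumer passed to it stands in for whatever code actually writes to the index in your application.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/**
 * Hypothetical helper that hands index updates to a background thread,
 * so the calling transaction does not wait for the index write.
 */
final class AsyncIndexUpdater<T> implements AutoCloseable {
    // A single worker thread applies updates in submission order.
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final Consumer<T> indexWriter;

    AsyncIndexUpdater(Consumer<T> indexWriter) {
        this.indexWriter = indexWriter;
    }

    /** Returns immediately; the future completes once the update has been applied. */
    CompletableFuture<Void> submit(T document) {
        return CompletableFuture.runAsync(() -> indexWriter.accept(document), executor);
    }

    /** Stops accepting new updates and waits up to 30 seconds for pending ones. */
    @Override
    public void close() {
        executor.shutdown();
        try {
            if (!executor.awaitTermination(30, TimeUnit.SECONDS)) {
                executor.shutdownNow();
            }
        } catch (InterruptedException e) {
            executor.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}
```

The single-threaded executor is a deliberate choice in this sketch: it preserves update order at the cost of throughput. As noted above, readers may briefly see stale results until a submitted update has been applied.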