To blog |

Achieve the three pillars of observability – link Plumbr distributed traces with logs

July 8, 2020 by Vladimir Šor Filed under: DevOps Java Monitoring Plumbr Tracing

The modern understanding of how to do observability right stands on the “three pillars of observability” – metrics, logs, and distributed traces. In order to really benefit from these three pillars, you need to implement a holistic solution that ties data from all three together.

Plumbr gives metrics and distributed traces out of the box – just attach our monitoring agents and enjoy metrics directly correlated with end-user experience and distributed traces for all failed and slow interactions or API calls. However, as Plumbr is not a log monitoring tool, we’ve lately started getting questions from our customers around how to link Plumbr data to the logs that the monitored applications produce.

The good news is, for Java applications this can be easily achieved with the help of our Java Agent API. The following blog post will show you how.

To link an entry in a log file with the Plumbr trace (or transaction, as we call it) we need to log the trace ID with all log entries produced while servicing the request by all downstream nodes.

The standard approach for such a task is to use a Mapped Diagnostic Context (MDC) of the logging framework. MDC is essentially a thread local map, where custom code can put entries and which can be referenced in the logger configuration to add values from that map to the logger output. In the Java world all major logging frameworks, like slf4j, log4j and logback, support MDC. So, once we get our trace ID to the MDC, the rest is just a matter of logger configuration.

In order to do that, we need to integrate three components:

  1. Plumbr Java API – Get the ongoing trace ID. 
  2. Web framework – Run our integration code for every request as early as possible.
  3. Logger – Put the trace Id to the MDC and configure the logger to output the trace Id.

Using the Plumbr Java API is simple – add the dependency and you’re good to go. Just copy and paste the suitable snippet for your build system from the Bintray () or Maven Central. To use the library in the code, the following import must be added to your source file:

import eu.plumbr.api.Plumbr;

Trace ID can be obtained like this:

Plumbr.getCurrentSpan().getTransactionId()

The call chain is null-safe – if there is no Plumbr Java agent attached to the JVM, the call will return not-monitored-with-plumbr. You can check whether there is an ongoing Plumbr transaction by calling Plumbr.getCurrentSpan().isNull().

Next, web framework integration. For servlet-based frameworks the integration choice is to use a servlet filter that will run first. Servlet filters can be registered either directly in the web.xml or using the web framework’s capabilities. Following is an example of such filter for Spring MVC/Boot:

@Component
@Order(Ordered.HIGHEST_PRECEDENCE)
public class LoggingMdcFilter implements Filter {

 @Override
 public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain) {
   MDC.put("PlumbrTxId", Plumbr.getCurrentSpan().getTransactionId());
   try {
     chain.doFilter(request, response);
   } finally {
     MDC.remove("PlumbrTxId");
   }
 }
}

@Component is Spring’s annotation to register the Spring Bean.

@Order(Ordered.HIGHEST_PRECEDENCE) is forcing the filter to be the first in the filter chain, preceding other common filters like Spring Security filters.

MDC in this example is from the slf4j.

The last task is to configure the logger to output the trace ID. For the Spring Boot and logback, one can configure this via application.properties:

logging.pattern.file=%d [%X{PlumbrTxId}] %-5level [%thread] %logger{35} - %msg%n

%X{PlumbrTxId} will insert the value from the MDC to the log.

With the above changes in place, if the Plumbr Java agent is attached to the application, the log lines of the application will start looking like this:

2020-07-08 11:58:01,453 [b59b5b1f-2771-f245-1f63-fd7d2983a1d0] INFO  [http-nio-8080-exec-5] e.p.d.o.FindOwnersForm - querying database.

Voilà – the UUID in square brackets in the above log is the ID of the distributed trace as captured by Plumbr.

The last task is to correlate the logs with distributed traces. Plumbr Agents will take care of transmitting the trace id between monitored services, so all nodes participating in the distributed request will have the same trace ID. Whenever the API call is slow or failed, Plumbr will make the full trace available in the UI:

To find all logs related to this slow trace, you can copy the trace ID from the URL in the browser’s location bar and search for it in your logs.

Finally, a note from our product team. If you plan to integrate Plumbr with your logs, or have already done so – please reach out to us at support@plumbr.io. We are planning to make log integration even smoother in the future and are looking for real life customer stories. Understanding your needs will help us come up with the best solution.

ADD COMMENT