Manual

Introduction

What is Plumbr?

Plumbr is designed to monitor the user experience in web applications. It focuses on two aspects of the application: performance and availability. To do so, Plumbr monitors every user interaction within the user interface.

Such interactions are called transactions in Plumbr lingo, exposing:

  • the application the user interacted with;
  • what the user did in the application;
  • the duration and outcome of the interaction;
  • the user performing the interaction;
  • the root cause of the failure or slowness when a transaction does not finish successfully.

With the help of Plumbr, you can easily answer questions like the following:

  • How many unique users used my site yesterday? Who were they?
  • Which features in my application performed the worst?
  • How many users experienced failures on my site last week?
  • Which root causes have degraded the user experience the most today?

How does Plumbr work?

Plumbr deployments contain up to three different modules:

  • Browser Agent, capturing the end user experience in the device used by your customers. This is the module you should be installing first.
  • Java Agent, tracing the user interaction through the back-end services and composing it into one distributed transaction. This Agent also detects root causes in the back-end layer. You should install this module if your back-end is running on Java Virtual Machines.
  • Server, responsible for receiving and processing the data collected by the Agents. The Server also exposes the UI. We recommend that you do not install this module yourself. Instead, when you use the Server as our SaaS solution, the installed Agents connect to https://app.plumbr.io and the user interface is also served via https://app.plumbr.io.

If you only use the Browser Agent, the deployment model involves injecting our JavaScript Agent into the <HEAD> section of all the HTML pages in your application. After doing so, the Agent can start listening to the end user interactions in the browser:

Plumbr Browser concept

The data captured is sent to Plumbr Servers, which are responsible for assembling the transactions and allowing you to run analysis on the data.

In case you are after end-to-end transparency and deploy both the Browser and Java Agents, the transactions are monitored from the browser to all the JVM-based nodes in the backend:

Plumbr Java and Browser Agent

The Plumbr Java Agent is packaged as a standard -javaagent, and attaching it does not require any changes to your application code. The only required change is pointing the JVM to the location of the Agent in the file system by adding -javaagent:/path/to/plumbr.jar to the JVM startup scripts.

The Agents located in the nodes processing the transaction pass along the transaction ID as call metadata, so that all the nodes servicing the transaction can be assembled in the Plumbr Server into a single distributed transaction.

Transactions

What are Transactions?

A transaction is a single end user interaction performed in an application monitored by Plumbr. Transaction tracking behaves differently depending on the deployment model of the Plumbr Agents in your application. See the following two chapters to understand how transaction detection works for your deployment.

Transactions in Browser

In cases where the Plumbr Browser Agent is present in the end user’s browser, transaction monitoring starts in the browser. The Plumbr Browser Agent keeps track of the interactions performed by the end user. The tracked interactions include mouse and keyboard events, scrolling, tapping on touch screens and many other events through which modern web applications expose their functionality.

Only some of the user interactions start a transaction. A transaction is started whenever the user interaction results in at least one HTTP request being sent to the backend. So for example, a click on an HTML anchor scrolling the page to the location of the anchor will not start a transaction, as no server-side requests are made.

All HTTP requests (called spans in Plumbr lingo) triggered by the interaction are also monitored and linked with the transaction the spans belong to. All spans register the start and end timestamp of the request, the URL to which the request was sent and the HTTP response code of the request.

A browser transaction ends when the last span in the browser is completed. Based on the duration of the transaction and the response codes of the individual spans, the transaction is flagged either as successful, too slow, or having failed to deliver the expected outcome.
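The relationship between spans and the end of a browser transaction can be sketched in Java as follows. This is an illustrative model only, not Plumbr's implementation: the class, field and method names are hypothetical, and measuring the duration from the moment of the user interaction is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative model only; all names here are hypothetical.
final class BrowserTransactionSketch {

    // One HTTP request triggered by the user interaction.
    static final class Span {
        final String url;
        final long startMillis;
        final long endMillis;
        final int responseCode;

        Span(String url, long startMillis, long endMillis, int responseCode) {
            this.url = url;
            this.startMillis = startMillis;
            this.endMillis = endMillis;
            this.responseCode = responseCode;
        }
    }

    // The transaction ends when its last span completes.
    static long endMillis(List<Span> spans) {
        long end = Long.MIN_VALUE;
        for (Span span : spans) {
            end = Math.max(end, span.endMillis);
        }
        return end;
    }

    // Assumption: the duration is measured from the moment of the user interaction.
    static long durationMillis(long interactionStartMillis, List<Span> spans) {
        return endMillis(spans) - interactionStartMillis;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("/api/cart", 5, 250, 200),
                new Span("/api/recommendations", 10, 900, 200));
        System.out.println("Transaction ended at " + endMillis(spans) + " ms");
        System.out.println("Duration: " + durationMillis(0, spans) + " ms");
    }
}
```

In this sketch the slowest span determines when the transaction ends, regardless of the order in which the requests were started.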

User interaction monitored by APM

Transactions also keep track of the user performing the interaction, the application in which the interaction was performed, and the functionality that the interaction consumed. This allows you to keep track of what the particular user was actually doing within the application.

Every transaction starting in a browser thus monitors and exposes the following information:

  • the ID of the transaction
  • the ID of the user performing the interaction
  • the start and end timestamps of the interaction
  • the application to which the interaction belonged
  • the functionality of the application used
  • the outcome of the interaction (successful/slow/failed)

Transactions in JVM

Transactions recorded in the JVM can behave in two ways, depending on whether or not the transaction originated in a browser with the Plumbr Browser Agent.

If the Browser Agent was present, the JVM processing the request joins the existing transaction. All the processing done by the server-side JVM(s) is then linked to the ongoing transaction as spans. In this case the outcome is the end-to-end transparency of the transaction throughout the infrastructure.

If no transaction is started in the end user’s browser, the transaction is started the moment the request arrives at the JVM. This happens when:

  • the browser triggering the request does not have the Plumbr Browser Agent attached
  • the request is not made by a human using a browser, but by another system using the web services published by the JVM
  • the inbound operation arriving at the JVM is not an HTTP request. For example, such transactions can be initiated by remote calls to EJB modules or by Swing event listeners in Swing-based desktop applications.

The main benefit of adding the Java Agent to the monitoring stack is that more detailed root causes are exposed. Every user interaction that is slow due to server-side issues is now linked to the exact root cause in the JVM.

Transaction Status

Each transaction gets assigned a status based on the duration and outcome of the transaction. The duration of the transaction is calculated at the time when the last span in the transaction is closed.

  • Successful transactions. This is the desired status of a transaction, meaning the transaction did complete with the expected outcome and was fast enough not to impact end user experience.
  • Slow transactions. If the duration of the transaction exceeds a predetermined threshold, the transaction is flagged as slow.
  • Stuck transactions. When the duration of an ongoing transaction exceeds 100 times the predetermined slow threshold, Plumbr assumes the transaction will never complete and flags it as stuck. Plumbr stops monitoring the transaction once it has been flagged as stuck.
  • Failed transactions. When a transaction fails to complete with the expected result, the transaction is flagged as failed. Whether or not the transaction failed to complete is decided using different means for different application types:
    • for web applications monitored by the Browser Agent, every transaction including spans with 400 or 500-series HTTP response codes will be flagged as failed
    • for web applications monitored by the Java Agent alone, all transactions returning a 500-series response will be flagged as failed
    • for EJB modules, all remote method invocations that result in an exception being thrown are flagged as failed
    • for Swing/AWT applications, all Action Listener events which result in an exception being thrown are flagged as failed
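The status rules above can be summarized in a short Java sketch. It is a hypothetical illustration rather than Plumbr's actual logic: the names are invented, the threshold is passed in as a parameter because the manual does not state its value, and letting a failure take precedence over slowness is an assumption.

```java
// Illustrative sketch of the status rules; all names are hypothetical.
final class TransactionStatusSketch {

    enum Status { SUCCESSFUL, SLOW, FAILED }

    // With the Browser Agent, 400- and 500-series codes count as failures;
    // with the Java Agent alone, only 500-series codes do.
    static boolean isFailureCode(int code, boolean browserAgentPresent) {
        if (code >= 500 && code <= 599) {
            return true;
        }
        return browserAgentPresent && code >= 400 && code <= 499;
    }

    // Status of a completed web transaction.
    // Assumption: a failed span outranks slowness.
    static Status classifyCompleted(long durationMillis, int[] responseCodes,
                                    boolean browserAgentPresent, long slowThresholdMillis) {
        for (int code : responseCodes) {
            if (isFailureCode(code, browserAgentPresent)) {
                return Status.FAILED;
            }
        }
        return durationMillis > slowThresholdMillis ? Status.SLOW : Status.SUCCESSFUL;
    }

    // An ongoing transaction is considered stuck once it exceeds 100x the slow threshold.
    static boolean isStuck(long elapsedMillis, long slowThresholdMillis) {
        return elapsedMillis > 100 * slowThresholdMillis;
    }

    public static void main(String[] args) {
        long threshold = 1_000; // hypothetical threshold value
        System.out.println(classifyCompleted(300, new int[] {200, 404}, true, threshold));
        System.out.println(classifyCompleted(300, new int[] {200, 404}, false, threshold));
        System.out.println(classifyCompleted(1_500, new int[] {200}, true, threshold));
        System.out.println(isStuck(100_001, threshold));
    }
}
```

Note how the same 404 response leads to a failed transaction when the Browser Agent is present, but not when the transaction is monitored by the Java Agent alone.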

Applications

What are applications?

An application is an attribute of a transaction, used to group together transactions within the same software. The application is identified differently for different application types:

  • For web applications monitored by Plumbr Browser Agent, the application is captured from the URL displayed to the end user of the application, extracting the value of location.href attribute.
  • For web applications monitored by Plumbr Java Agent, the application is detected from the domain, protocol and port combination of the inbound traffic.
  • For non-web applications monitored by Plumbr Java Agent, the application name is equivalent to the Java Virtual Machine ID.
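As a rough illustration of the protocol, domain and port combination used for web applications monitored by the Java Agent, the following Java sketch derives such a name from a request URL. The class and method names are hypothetical, and the exact naming format is an assumption inferred from names such as http://testbilling.example.com:8080 used in this manual.

```java
import java.net.URI;

// Illustrative sketch only; not Plumbr's actual detection code.
final class ApplicationNameSketch {

    // Builds "protocol://domain[:port]" from a full request URL.
    static String applicationOf(String url) {
        URI uri = URI.create(url);
        StringBuilder name = new StringBuilder();
        name.append(uri.getScheme()).append("://").append(uri.getHost());
        if (uri.getPort() != -1) {
            // Only an explicitly stated port becomes part of the name.
            name.append(':').append(uri.getPort());
        }
        return name.toString();
    }

    public static void main(String[] args) {
        System.out.println(applicationOf("http://testbilling.example.com:8080/invoices?id=42"));
        System.out.println(applicationOf("http://billing.com/pay"));
        System.out.println(applicationOf("https://billing.com/pay"));
    }
}
```

Under this scheme the HTTP and HTTPS variants of the same domain yield two different application names, which is why merging applications can be useful.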

The examples below help to understand the concept better.

When the default application detection is not suited to your needs, you can rename, merge or split the existing applications detected by Plumbr based on the instructions in the following chapters.

Renaming applications

If the default name assigned by Plumbr does not suit you, you can rename an existing application. For example, Plumbr might have detected the application as http://testbilling.example.com:8080 but your team is used to calling this software bundle the Billing Test.

You can rename an application via the Plumbr user interface accessible at https://app.plumbr.io. Click on the Settings icon next to the application name in the application detail view and change the name to one you prefer.

With this change, the existing transactions also get a new application name. So for example if you had 10 transactions within the http://testbilling.example.com:8080 application that you renamed as the “Billing Test”, then all the existing 10 transactions will refer to “Billing Test” application along with all the new transactions yet to arrive.

Merging applications

In case Plumbr application detection has detected two or more applications where in reality there is just one, you can merge them into a single application. There are two ways to do this.

You can merge applications via the user interface by renaming one or more applications to bear the same name. For example, applications http://billing.com and https://billing.com might be detected by Plumbr. If you wish to make sure the transactions accessing the billing.com domain are linked to the same application, regardless of whether it was accessed over HTTP or HTTPS, rename both applications in the UI to Billing Live. After this change, previous interactions with these applications will also refer to Billing Live.

In situations where you have several applications to merge, this approach might not be the best choice due to the sheer number of rename operations involved. This can happen, for example, when you expose your application to different clients, each through a dedicated virtual host. In such a case, instead of accessing http://billing.com, each client has its own unique virtual host (http://company1.billing.com, http://company2.billing.com, etc.).

Faced with this situation, you can merge applications via a configuration in the plumbr.properties file located next to the Plumbr Java Agent. You can use this file to modify (or add, if not present) the parameter appId, referring to the identity of the application you wish to use, similarly to the following example:

appId=Billing Live

This change will only be applied to new transactions arriving after the change. All existing transactions will still refer to previous applications.

Splitting applications

In case Plumbr has bundled together different types of software under the same application, you might wish to split it into two or more applications. This can happen, for example, when Plumbr Agents capture data accessed via the http://localhost:8080 interface from different web applications deployed on different developer machines.

You can split applications by altering the contents of the plumbr.properties file located next to the Plumbr Java Agent. You can use this file to modify (or add, if not present) the parameter appId referring to the identity of the application you wish to use, similarly to the following example:

appId=Billing Test

After this change, all new transactions arriving at this JVM will be linked to the ID you chose. The existing transactions will not be affected.

JVM Root Causes

What are Root Causes?

When a transaction is flagged as slow or failed, Plumbr looks for the root cause of the underlying issue. Plumbr is able to explicitly track the problematic transactions to the actual root cause in the source code or configuration.

Root cause detection works only on the server side, meaning that only accounts with Plumbr Java Agents benefit from it.

To be able to detect the root causes for slow transactions, Plumbr monitors the JVM for various performance issues. Whenever a particular issue contributes more than 1,000ms to the transaction duration, Plumbr links such issues to the transaction as a root cause.

To detect root causes for failed transactions, Plumbr captures all the exceptions thrown while the transaction is being executed. When such an exception is detected in the context of a failed transaction, it is linked to the transaction as the root cause. An example of such a root cause is a NullPointerException triggered during a particular transaction, aborting the normal flow of operations and returning a 500-series error to the end user instead.

Transaction Snapshots

Plumbr is capable of monitoring for a large number of specific root causes explicitly. Unfortunately, the variety of technologies used in the real world means that the number of ways a particular piece of code can perform poorly is effectively unlimited. Thus a fallback is implemented to cover the cases where the explicit root cause cannot be determined. In such situations, the Plumbr Agent captures snapshot(s) from the suspicious transaction.

Snapshots are effectively thread dumps taken from the thread executing the transaction. Snapshots are captured at increasing intervals during the transaction lifespan, up to a limit of 10 snapshots. The snapshots are linked to the transaction if the transaction is eventually flagged as Slow or Stuck. When the transaction ends up being successful, the snapshots are discarded.

To expose this information in a useful way, Plumbr aggregates those call stacks into a tree-like structure. Call stacks occurring most frequently are ranked higher in a tree. To reduce noise, non-repetitive occurrences are hidden, enabling you to focus on the most frequently captured snapshots first.
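The aggregation described above could look roughly like the following Java sketch. It is not Plumbr's implementation: the names are hypothetical, and treating "non-repetitive" as "seen only once" is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of merging call stacks into a frequency tree; names are hypothetical.
final class SnapshotTreeSketch {

    static final class Node {
        final String frame;
        int count;
        final Map<String, Node> children = new LinkedHashMap<>();

        Node(String frame) {
            this.frame = frame;
        }
    }

    // Each stack lists its frames from the outermost call to the innermost one.
    static Node aggregate(List<List<String>> stacks) {
        Node root = new Node("<root>");
        for (List<String> stack : stacks) {
            root.count++;
            Node current = root;
            for (String frame : stack) {
                current = current.children.computeIfAbsent(frame, Node::new);
                current.count++;
            }
        }
        return root;
    }

    // Children seen more than once, most frequent first.
    // Assumption: "non-repetitive" means a frame that was captured only once.
    static List<Node> rankedChildren(Node node) {
        List<Node> result = new ArrayList<>();
        for (Node child : node.children.values()) {
            if (child.count > 1) {
                result.add(child);
            }
        }
        result.sort((a, b) -> Integer.compare(b.count, a.count));
        return result;
    }

    static void print(Node node, String indent) {
        for (Node child : rankedChildren(node)) {
            System.out.println(indent + child.count + "x " + child.frame);
            print(child, indent + "  ");
        }
    }

    public static void main(String[] args) {
        Node root = aggregate(Arrays.asList(
                Arrays.asList("Servlet.service", "OrderService.load", "Dao.query"),
                Arrays.asList("Servlet.service", "OrderService.load", "Cache.get"),
                Arrays.asList("Servlet.service", "OrderService.load", "Dao.query"),
                Arrays.asList("Servlet.service", "Logger.flush")));
        print(root, "");
    }
}
```

Frames shared by many snapshots end up near the top of the tree with high counts, which is what makes the most frequently captured code paths stand out.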

Slow JDBC Calls

The Plumbr Agent monitors every JDBC Type 3 and Type 4 driver detected in the application. This means that Plumbr supports almost every database vendor exposing the data storage via JDBC, including but not limited to the most widely used MySQL, Oracle, Postgres and IBM DB2 databases.

The Agent instruments all the JDBC calls which connect to databases via the Statement, PreparedStatement and CallableStatement APIs. When a call via such an API starts affecting the end user experience, the offending query is listed as a root cause, exposing the JDBC operation executed along with the call stack from the thread executing the query. In this way, you get access to the root cause of expensive JDBC operations down to the single line in the source code responsible for executing such queries.

In order to reduce noise and get a prioritized list of expensive database operations, Plumbr groups expensive operations triggered by the same root cause together, allowing you to rank the expensive operations based on the number of times they are detected.

Slow Lucene Operations

Plumbr monitors Lucene indexes by instrumenting all implementations of the org.apache.lucene.search.IndexSearcher and org.apache.lucene.index.IndexWriter interfaces. Doing so allows Plumbr to track all the operations modifying the index or reading from the index. This support is implemented and tested on Lucene 4 and 5 releases.

By monitoring the behavior of said interfaces, Plumbr is capable of exposing:

  • The impact poorly performing Lucene indexes have on your end users
  • Actual root cause, down to a single line in source code accessing the index
  • Information about the index accessed, including the index size, accessed fields, accessor methods and more.

Locked Threads

The Plumbr Agent monitors all JVM threads for lock contention events. Plumbr monitors both synchronized block/method access and java.util.concurrent locks.

For synchronized blocks/methods, Plumbr tracks the situations where a thread in the JVM executes code in a synchronized block or method and another thread tries to enter the same synchronized block/method.

For java.util.concurrent locks Plumbr will detect the situations where threads are forced to wait for events originating from the use of various java.util.concurrent classes, ranging from ReentrantLock to ArrayBlockingQueue.

When the wait time in either case exceeds a predetermined threshold, the root cause is exposed, containing the following:

  • How long the thread was forced to wait before getting access to the synchronized block/method.
  • The monitor used to lock the method/code block (for synchronized usage only).
  • The name and call stack from the thread trying to enter the synchronized block/method.
  • The name and a snapshot of the call stack of the thread whose code was running in the synchronized block. The snapshot of the call stack is taken when the waiting time for the blocked thread is about to exceed the configured threshold.

Having such information allows you to zoom in to the underlying root cause with the precision of a single line in the source code, skipping the tedious and complex process of troubleshooting concurrency issues. Notice that Plumbr also binds together similar lock contention events, allowing you to rank the severity of the performance issues based on the frequency of the underlying root cause.
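To make the synchronized-block scenario concrete, the following self-contained Java sketch reproduces the kind of contention described above: one thread holds a monitor while a second thread is forced to wait for it. The class and method names are hypothetical, and the program only measures the wait time itself; it does not use or depend on Plumbr.

```java
import java.util.concurrent.CountDownLatch;

// Illustrative reproduction of lock contention on a synchronized block; names are hypothetical.
final class LockContentionSketch {

    private static final Object MONITOR = new Object();

    // Returns how long (in ms) the calling thread waited to enter the synchronized block
    // while another thread was holding the monitor for roughly holdMillis.
    static long measureWaitMillis(long holdMillis) {
        CountDownLatch lockHeld = new CountDownLatch(1);
        Thread holder = new Thread(() -> {
            synchronized (MONITOR) {
                lockHeld.countDown(); // signal that the monitor is now taken
                try {
                    Thread.sleep(holdMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "holder");
        holder.start();
        long waited;
        try {
            lockHeld.await(); // the holder is inside the synchronized block from here on
            long start = System.nanoTime();
            synchronized (MONITOR) { // blocks until the holder leaves the block
                waited = (System.nanoTime() - start) / 1_000_000;
            }
            holder.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return waited;
    }

    public static void main(String[] args) {
        System.out.println("Waited " + measureWaitMillis(300) + " ms for the monitor");
    }
}
```

In a real application the holding thread would typically be doing slow work inside the synchronized block, and the call stacks of both threads point to where that happens.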

JDBC Multi-Queries

The Agent instruments all the JDBC calls which connect to databases via the Statement, PreparedStatement and CallableStatement APIs. When a single JDBC statement executed through such APIs impacts user experience, a Slow JDBC Call is detected as the root cause. In situations where many database calls take place during a single transaction and the accumulated duration of such calls is the reason why the transaction is flagged as slow, the multi-query root cause is exposed instead.

In the details of this root cause you will find the offending queries along with the call stacks from the threads executing such queries. To minimize overhead, smart sampling is applied when exposing this data.

The Plumbr Agent monitors every JDBC Type 3 and Type 4 driver detected in the application. This means that Plumbr is able to monitor communication with almost every database vendor exposing the data storage via JDBC, including but not limited to the most widely used MySQL, Oracle, Postgres and IBM DB2 databases.

GC Pauses

The Plumbr Agent monitors all stop-the-world Garbage Collection pauses that take place in the JVM. If the duration of such a pause exceeds a configured threshold, an incident is created. In addition to the time and duration of the pause, a Plumbr incident contains insights that would help reduce either the duration or frequency of the long GC pauses, for example:

  • Plumbr captures memory snapshots, exposing the most memory-hungry data structures in memory. This allows you to proceed with trimming the most resource-hungry data structures.
  • Allocation and promotion rates exposed by Plumbr, along with the memory consumption in different memory pools, give you clues about poorly allocated heap structures.

Excessive Number of …

“Excessive number of …” root causes are exposed in situations where many similar operations take place during a single transaction and the accumulated duration of such operations is the reason why the transaction ends up being slow.

For example, when just a single HTTP call is impacting user experience, it will be exposed as a Slow HTTP Call. In situations where many HTTP calls take place during a single transaction and the accumulated duration of such calls is the reason why the transaction ends up being slow, the Excessive Number of HTTP Calls root cause is exposed instead.
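The distinction between a single slow operation and an excessive number of operations can be sketched as follows. The decision rule, the names and the example threshold are assumptions made for illustration; the manual does not spell out the exact algorithm.

```java
import java.util.Arrays;

// Illustrative sketch only; the exact rule used by Plumbr is not documented here.
final class RootCauseKindSketch {

    enum Kind { NONE, SLOW_CALL, EXCESSIVE_NUMBER_OF_CALLS }

    // Assumption: a single call above the threshold is reported as a slow call;
    // otherwise the accumulated duration of all calls is compared to the threshold.
    static Kind classify(long[] callDurationsMillis, long thresholdMillis) {
        long total = 0;
        for (long duration : callDurationsMillis) {
            if (duration > thresholdMillis) {
                return Kind.SLOW_CALL;
            }
            total += duration;
        }
        return total > thresholdMillis ? Kind.EXCESSIVE_NUMBER_OF_CALLS : Kind.NONE;
    }

    public static void main(String[] args) {
        long threshold = 1_000; // hypothetical threshold in milliseconds

        long[] oneSlowCall = {1_500};
        long[] manyFastCalls = new long[200];
        Arrays.fill(manyFastCalls, 10L); // 200 calls of 10 ms each, 2,000 ms in total

        System.out.println(classify(oneSlowCall, threshold));
        System.out.println(classify(manyFastCalls, threshold));
    }
}
```

The second case is the interesting one: no individual call is slow, yet the transaction spends two seconds in calls that could potentially be reduced or batched.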

In most cases, the solution for such problems requires a change in application code. The performance gains can often be achieved by applying either of the following guidelines:

  • Reducing the number of operations invoked by changing the amount of data requested.
  • Batching the operations together instead of launching each one via a separate call.

OutOfMemoryErrors

When memory leak detection does not spot any abnormal data structure growth that looks like a memory leak, the second line of defense is set to capture OutOfMemoryError events and analyze the contents of memory when such an event occurs. When the event is captured, the native code of the Plumbr Agent captures the snapshot of statistics from JVM memory and sends it to the Plumbr Server for analysis. When the analysis completes, an incident is created containing the following information:

  • What are the “fattest” data structures currently in memory (measured in MB).
  • What is currently referencing such data structures, blocking them from being GC-d.
  • Of what do these “fat” data structures consist.
  • Where were these data structures created.

Having this information allows you to quickly understand the most likely reason why the OutOfMemoryError was triggered. In the vast majority of cases, the culprit is staring right at you in one of the top three memory consumers.

The information might look somewhat similar to the dominator tree one could acquire via heap dumps, but at a closer look you will see that Plumbr exposes a lot more information than one could capture via heap dumps (such as the allocation points and the full reference chain). In addition, the relevant information is presented in a lot more user-friendly way, saving you from days spent trying to figure out why some byte arrays seem to occupy most of the heap inside your heap dumps.

File Stream Operations

The Plumbr Agent monitors file reading and writing operations performed by using FileInputStream and FileOutputStream classes. When the wait time for the read and write operations starts impacting end user experience, Plumbr recognizes this and links a File Stream Operation root cause with the slow transaction. The root cause exposed will contain the following information:

  • File(s) being read/written, along with their path in file system, size and other relevant attributes.
  • Call stack from the thread executing the operation, zooming you right to the line in source code accessing the file system.

Several common problems occur when reading from or writing to a file stream and lead to slow transactions:

  • Lack of buffering: each read or write operation incurs overhead, depending on the operating system, file system and hardware. Instead of reading or writing one byte at a time, a much more performant approach is to do it in bulk. A simple way to achieve this is to use a BufferedInputStream or BufferedOutputStream.
  • System issues: as noted above, the performance of file operations depends on the operating system, the file system and the hardware. Sometimes one of these becomes the bottleneck, and even a single file stream operation can take tens of seconds.
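As a minimal sketch of the buffering advice above, the following Java example wraps a FileInputStream in a BufferedInputStream, so that byte-by-byte reads are served from an in-memory buffer instead of hitting the file system on every call. The class name, helper methods and file size are hypothetical and exist only to make the example runnable.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Illustrative sketch of buffered file reading; names are hypothetical.
final class BufferedReadSketch {

    // Counts the bytes in a file one read() at a time.
    // Thanks to the buffer, most read() calls do not reach the operating system.
    static long countBytes(Path file) {
        try (InputStream in = new BufferedInputStream(new FileInputStream(file.toFile()))) {
            long count = 0;
            while (in.read() != -1) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Helper for the demo: creates a temporary file of the given size.
    static Path writeTempFile(int sizeBytes) {
        try {
            Path file = Files.createTempFile("buffered-read-sketch", ".bin");
            file.toFile().deleteOnExit();
            Files.write(file, new byte[sizeBytes]);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTempFile(1_000_000);
        System.out.println("Read " + countBytes(file) + " bytes");
    }
}
```

Removing the BufferedInputStream wrapper leaves the result unchanged but turns every read() into a separate call to the underlying file stream, which is the pattern that tends to show up as a File Stream Operation root cause.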

Memory Leaks

Memory leak detection monitors all object creation and collection events in order to detect patterns indicating that the growth of a certain data structure is caused by a memory leak. When such a data structure is detected, Plumbr exposes the root cause along with the following information:

  • The size of the leak (in MB) and the speed at which the leak is growing (in MB/h).
  • The objects that are leaking.
  • What is currently referencing the leaked objects, blocking them from being GC-d.
  • The line in source code where the leaking objects were created.

File Attribute Operations

The Plumbr Agent monitors file attribute querying performed by using methods such as File.exists(), File.isDirectory(), File.canWrite() and so on. While individual operations like that are usually handled very quickly, typically under a few microseconds, having a large number of them may result in a slow transaction. One of the most common cases is recursively walking a large directory that contains millions of files.

When the wait time for the attribute checking starts impacting end user experience, Plumbr recognizes this and links a File Attribute Operations root cause with the slow transaction. The root cause exposed will contain the following information:

  • File(s) being accessed, along with the operation performed (exists(), isDirectory(), etc)
  • Call stack from the thread executing the operation, zooming you right to the line in source code accessing the file system.

Exceptions

The Plumbr Agent monitors the creation of Exceptions during a transaction. Whenever a transaction is flagged as failed, the chronologically last Exception is linked to the transaction as a root cause. The Exception contains the full stack trace, allowing you to zoom in to the source code and quickly fix the underlying error.

Exceptions that do not affect any transactions and exceptions used to steer control flow are not exposed. Exceptions are grouped together into root causes by Exception class, so for example all ArrayIndexOutOfBoundsExceptions are grouped together as instances of a single root cause. The different call stacks are visible in the root cause details, so you can verify whether the source code needs patches in multiple locations.
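The two rules above, linking the chronologically last exception to a failed transaction and grouping exceptions by class, can be illustrated with a short Java sketch. The names are hypothetical and the code is a simplified model, not Plumbr's implementation.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch of exception grouping; names are hypothetical.
final class ExceptionGroupingSketch {

    // The chronologically last exception is the one linked to the failed transaction.
    static Throwable rootCauseOf(List<Throwable> exceptionsInOrder) {
        return exceptionsInOrder.isEmpty()
                ? null
                : exceptionsInOrder.get(exceptionsInOrder.size() - 1);
    }

    // Counts root cause occurrences per exception class name.
    static Map<String, Integer> groupByClass(List<Throwable> rootCauses) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Throwable rootCause : rootCauses) {
            counts.merge(rootCause.getClass().getName(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Throwable> rootCauses = Arrays.<Throwable>asList(
                new ArrayIndexOutOfBoundsException("index 5"),
                new NullPointerException(),
                new ArrayIndexOutOfBoundsException("index 9"));
        System.out.println(groupByClass(rootCauses));
    }
}
```

Both ArrayIndexOutOfBoundsException instances end up under one entry even though they may originate from different lines of code, which is why the individual call stacks remain worth checking.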

Underprovisioned Thread Pools

The Plumbr Agent monitors thread pools embedded in the application to detect situations where submitted tasks/requests end up waiting in a queue for an available executor. When the wait time in such a queue starts impacting the end user experience, an Under-provisioned Thread Pool root cause is registered. The root cause exposes to Plumbr users the call stack from the thread waiting in the queue.

Thread pools the Plumbr Java Agent is able to monitor:

  • ThreadPoolExecutor embedded in the Java SDK
  • org.apache.catalina.core.StandardThreadExecutor from the Tomcat application server (configured through the “tomcatThreadPool” Executor)

Under-provisioned thread pools are surfaced as a root cause in situations where the thread pool is not able to provide enough free threads to cope with the incoming workload. This can happen for any of the following reasons:

  • Work done by such threads is taking unusually long to complete. The solution for such cases is to optimize the code executed by the threads in the pool.
  • The number of tasks/requests submitted to the pool is higher than usual. In such situations the solution is either to limit the load or to balance it across more nodes.
  • Last, but not least, situations where the thread pool configuration is not providing enough threads to match the regular load. In such situations the solution is as easy as increasing the number of threads in the pool configuration.
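The effect of an under-provisioned pool can be reproduced with a plain ThreadPoolExecutor. The sketch below submits several tasks to a pool and measures how long the unluckiest task sat in the queue before a thread picked it up. The class and method names are hypothetical, and the measurement is done by the example itself rather than by Plumbr.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch of queue wait time in a thread pool; names are hypothetical.
final class ThreadPoolQueueWaitSketch {

    // Submits `tasks` tasks of `taskMillis` each to a fixed pool of `poolSize` threads
    // and returns the longest time (in ms) any task spent waiting in the queue.
    static long longestQueueWaitMillis(int poolSize, int tasks, long taskMillis) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<Runnable>());
        AtomicLong longest = new AtomicLong();
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                final long submitted = System.nanoTime();
                futures.add(pool.submit(() -> {
                    // Time between submission and the moment a pool thread starts the task.
                    long waited = (System.nanoTime() - submitted) / 1_000_000;
                    longest.accumulateAndGet(waited, Math::max);
                    sleepQuietly(taskMillis);
                }));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return longest.get();
    }

    private static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("1 thread:  " + longestQueueWaitMillis(1, 3, 100) + " ms in queue");
        System.out.println("3 threads: " + longestQueueWaitMillis(3, 3, 100) + " ms in queue");
    }
}
```

With a single thread, the third task has to wait for the first two to finish; with three threads, every task can start almost immediately. This mirrors the last scenario in the list above, where increasing the pool size removes the wait.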

Slow HTTP Calls

The Plumbr Agent monitors different HTTP client libraries used for connecting to remote systems over HTTP. When the HTTP calls to such remote endpoints start affecting the end user experience, the offending HTTP query is linked to user transactions as a root cause, exposing the outgoing HTTP request along with the call stack from the thread executing the query.

HTTP calls tend to perform poorly because the remote system does not respond to the call from the JVM quickly enough. To solve the problem, the system being accessed via HTTP needs to be tuned for latency. If this is not an option, caching the results can also be used to reduce the number of such operations.

The supported list of HTTP client libraries includes:

Slow MongoDB Operations

To detect expensive calls to a MongoDB instance, Plumbr monitors DBCollection and MongoCollection interface methods such as find() and findAndModify(). When a call via such an API starts affecting the end user experience, the operation is listed as a root cause exposing the MongoDB operation called along with the call stack from the thread executing the operation.

In order to reduce noise and get a prioritized list of expensive MongoDB operations, Plumbr groups expensive operations triggered by the same root cause together, allowing you to rank the expensive operations based on the number of times they are detected.

Plumbr supports and monitors both the 2.x and 3.x versions of MongoDB drivers.

Slow JDBC Connection Acquisition

Plumbr detects slow JDBC Connection Acquisition when JDBC connection retrieval via DataSource.getConnection() or DriverManager.getConnection() is affecting end user experience. In such cases, Plumbr exposes the number of transactions affected along with the time the transactions spent waiting for the connection to be retrieved.

Slow connection retrieval can be caused by any of the following:

  • Missing connection pool. Creating JDBC connections is expensive, so in this case consider pooling the connections.
  • Uninitialized connection pool. In cases where pooled connections are initialized lazily, the first requests to the empty pool are slow. Consider initializing the pool during application startup.
  • Under-provisioned connection pool. When the number of available connections in the pool is smaller than the demand, requests have to wait in a queue for a connection. Consider increasing the pool size to match the number of concurrent requests to the data source.
  • Leaking connection pool. If connections are not closed, the connection pool does not know that the connection is no longer being used by the borrower thread. To fix this, add pool-specific options to the pool configuration to spot leaks in the pool.
  • Testing connections. To prevent unused connections in the pool from becoming stale, pool implementations often test the connection before handing it off to the executor thread. When the test query is expensive, this can result in poor performance. Consider simplifying or dropping the tests if possible.

Slow ResultSet Processing

Slow ResultSet processing is detected when the result set fetched from the database over JDBC is processed in a way that affects end user experience. In such cases, Plumbr exposes the number of transactions affected along with the time the transactions spent waiting for the JDBC result set to be processed.

To monitor the time it takes to process the result set, the Plumbr Agent monitors the cumulative duration of each java.sql.ResultSet.next() iteration. When the cumulative time of the iterations starts impacting end user experience, the Slow ResultSet Processing root cause is created. This root cause exposes the query whose results were processed along with the call stack through which the results were processed.

Slow ResultSet processing is usually detected when large result sets are fetched from the database. To improve the situation, consider either switching to more fine-grained queries or using database-backed paging to limit the size of the result sets.

Browser Root Causes

What are Browser Root Causes?

Along with JVM Root Causes, Plumbr exposes Browser Root Causes: availability or performance issues that occurred during the communication between the user’s browser and the backend systems. The most common of them are Browser Request Failures, Browser Request Waits and DNS lookups, among others.

The availability issues are exposed by Plumbr as Browser – Request Failure meaning that particular HTTP request was failed to execute.

To better understand performance issues background we need to look at an illustration of a standard HTTP call here:

Source: W3 Resource Timing specification

Based on this illustration Plumbr is bringing the following Browser Root Causes reflecting the HTTP call phases shown on the picture:

  • Browser – Redirect – there were many redirects, or some of the redirects were slow
  • Browser – Cache Fetch – the resource requested by the browser was already cached on disk, but reading it from the disk was slow
  • Browser – DNS lookup – performing the DNS lookup was slow
  • Browser – TCP connect – establishing a connection to the backend server was slow
  • Browser – SSL handshake – establishing a secure connection to the backend server was slow
  • Browser – Request Wait (Assets, XHR, Pageload) – the backend server was slow to start responding to a request for an asset (CSS, JS, etc.), an XHR or the page HTML
  • Browser – Download (Assets, XHR, Pageload) – downloading an asset, XHR response or page HTML from the backend server was slow
  • Browser – HTTP Call (Assets, XHR, Pageload) – an HTTP request for an asset, XHR or page HTML was slow, but Plumbr was not able to determine which phase was slow
  • Browser – Queue Wait – when a page fetches many images or other resources from the same domain, the browser can place requests to that domain into a waiting queue
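
The phases listed above can be derived from the timestamps of a Resource Timing entry. The following is a minimal sketch in Java, assuming the standard attribute names of the W3C Resource Timing specification. The ResourceTiming class and the phase names used as map keys are illustrative and do not describe how Plumbr computes the phases internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the timestamps (milliseconds) of one Resource Timing entry.
// Field names follow the attributes of the W3C Resource Timing specification.
final class ResourceTiming {
    double redirectStart, redirectEnd;
    double domainLookupStart, domainLookupEnd;
    double connectStart, secureConnectionStart, connectEnd;
    double requestStart, responseStart, responseEnd;

    // Duration of each phase in milliseconds, keyed by an illustrative phase name.
    Map<String, Double> phases() {
        Map<String, Double> phases = new LinkedHashMap<>();
        phases.put("Redirect", redirectEnd - redirectStart);
        phases.put("DNS lookup", domainLookupEnd - domainLookupStart);
        // secureConnectionStart is 0 when the connection is not secure
        boolean secure = secureConnectionStart > 0;
        phases.put("TCP connect", (secure ? secureConnectionStart : connectEnd) - connectStart);
        phases.put("SSL handshake", secure ? connectEnd - secureConnectionStart : 0.0);
        phases.put("Request Wait", responseStart - requestStart);
        phases.put("Download", responseEnd - responseStart);
        return phases;
    }
}
```

Comparing each of these durations against a threshold is then enough to decide which phase, if any, slowed the request down.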

How data is collected

By combining observation of the document tree for changes (Mutation Observer) with instrumentation of the native methods used for requests (such as XMLHttpRequest), we are able to keep track of when resource loading starts and ends. However, as these are basic start and end metrics and can be inaccurate, we additionally use the Resource Timing API, whenever available, to get a detailed breakdown of the request’s life cycle from the browser.

Browser – Request Failure

The Request Failure Root Cause is exposed when at least one of the spans in the transaction received a response code in the 4xx or 5xx series. There are some exceptions: spans with response codes 400, 401, 403, 409, 418, 426 and 451 are not considered failed, because these codes are widely used in REST APIs to express business rules.

A span can fail for quite different reasons. Here are some examples:

  • A request for a resource (CSS, image) fails with code 404 because the resource was not found
  • Code 502 may indicate that a proxy server the browser is communicating with cannot establish a connection with the backend server processing the requests (for example, an nginx server in front of a JVM)
  • Code 500 means that the server failed to produce a response; additional information needs to be acquired from the backend server logs
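
The classification rule described in this section can be sketched as follows. This is an illustration in Java, not Plumbr's implementation; the class and method names are hypothetical, and only the list of excluded codes comes from the text above.

```java
import java.util.Set;

final class RequestFailureCheck {
    // Response codes that are not treated as failures, as listed in this section
    private static final Set<Integer> NOT_FAILURES = Set.of(400, 401, 403, 409, 418, 426, 451);

    // A span counts as failed when its code is in the 4xx or 5xx series and not excluded.
    static boolean isFailure(int statusCode) {
        return statusCode >= 400 && statusCode <= 599 && !NOT_FAILURES.contains(statusCode);
    }

    // A transaction is affected when at least one of its spans failed.
    static boolean transactionFailed(int... spanStatusCodes) {
        for (int code : spanStatusCodes) {
            if (isFailure(code)) {
                return true;
            }
        }
        return false;
    }
}
```

With this rule, a transaction with span codes 200, 403 and 500 is flagged because of the 500, while one with only 200 and 403 is not.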

Browser – Redirect

Having a lot of redirects on your site can hurt its performance, as the browser needs to follow them one by one. Plumbr takes the time spent on redirects into account and, if it exceeds the given threshold, exposes the Redirect Root Cause with the full URL where the redirect chain ended.

As a general rule, avoid redirects as much as possible: they can drastically slow down page loading, especially on mobile devices. When a redirect is unavoidable, try to limit it to a single one.

Browser – Cache Fetch

If the resource queried by the browser was requested recently, its response may already be located in the local cache and can be retrieved from there. When the local disk is under high load, retrieving the resource from the cache can take longer than expected. If that time exceeds the threshold, the Cache Fetch Root Cause is registered with the static text “Cache Fetch”. This is expected to be a very rare Root Cause, assuming that client machines nowadays are powerful enough to avoid such issues.

Browser – DNS lookup

Another issue Plumbr exposes is slowness caused by the browser making a request to a DNS server to translate a domain name to an IP address. The DNS lookup Root Cause is registered if this phase exceeds the threshold. Plumbr groups all requests to the same domain into one DNS lookup Root Cause referencing that domain.

The usual reasons for a slow DNS lookup are the following:

  • the network between the client and the DNS server is poor
  • the DNS server is under heavy load and cannot process requests fast enough
  • the DNS server is not configured properly for fast lookups
  • a short DNS TTL setting causes servers to frequently check whether their cached value is up to date

To get rid of this problem, try to improve network conditions or change DNS servers if possible.

Browser – TCP connect

A bad network can cause one more issue exposed by Plumbr: the TCP connect Root Cause. This Root Cause is registered when establishing a connection from the browser to the backend systems takes too long. As with the DNS lookup Root Cause, such requests are grouped by the domain name to which establishing the connection was slow.

Browser – SSL handshake

During the connection establishment phase, the browser also performs an SSL handshake if the connection is secure. The SSL handshake Root Cause can be caused by poor network conditions or by a slow client or server. The domain name is again used to group all requests affected by this Root Cause.

As it is not always possible to improve network conditions, pay attention to server performance: check that its CPU is not under heavy load and that there is sufficient RAM to keep previous connections alive. It is also good to avoid certificates with very long keys (RSA 2048 should be sufficient).

Browser – Request Wait (Assets, XHR, Pageload)

The request waiting phase is the time from when the browser starts sending the request until it receives the first bytes of the response from the backend. We distinguish three different Root Causes: for assets (CSS, images, etc.), XHR and full page load requests. Root Cause grouping is done using:

  • domain name for Assets and full page load requests
  • shortened URL for XHR requests

While the main reason behind all the previous Root Causes can be the network, here the reason is more likely to be in the backend system, which needs to be performance tested and debugged to find the real cause.

Browser – Download (Assets, XHR, Pageload)

The request download phase is the time during which the browser is receiving the response from the backend system. Here we also distinguish three different Root Causes: for assets (CSS, images, etc.), XHR and full page load requests.

Root Cause grouping is done using:

  • domain name for Assets and full page load requests
  • shortened URL for XHR requests

To solve the issue, in most cases you need to optimize either the backend system endpoint (for example, bringing it closer to the user with a CDN) or the size of the content it returns.

Browser – HTTP Call (Assets, XHR, Pageload)

If the request is slow but we cannot determine the particular phase (because resource timings were not available), the HTTP Call Root Cause is registered. Here we also distinguish three different Root Causes: for assets (CSS, images, etc.), XHR and full page load requests.

Root Cause grouping is done using:

  • domain name for Assets and full page load requests
  • shortened URL for XHR requests

Browser – Queue Wait

Less critical resources such as images can be put into a queue by the browser and processed later. Plumbr registers such requests under the Queue Wait Root Cause if the time spent in the queue exceeds the threshold. Possible reasons for queueing are:

  • lack of TCP connections
  • too many requests being processed in parallel (by default, a browser usually processes at most 6 requests to the same domain in parallel)
  • some requests being postponed because the browser considers them not critical

Grouping of Queue Wait Root Causes is done by domain name.

To solve this issue:

  • use domain sharding
  • reduce the number of requests by:
    • concatenating CSS and JS files
    • using CSS sprites for images

Services

What are Services?

A service is an attribute of a transaction exposing the operation the user was doing. By exposing this, services group together similar transactions performing the same operation (such as paying an invoice or adding an item to a shopping cart).

A service is also used to set a threshold for the transactions consuming this service. Transactions exceeding this threshold would be flagged as slow. Out-of-the-box, Plumbr sets a default 5,000 ms threshold for all transactions. This default threshold is configurable via the menu Settings – Thresholds. You can also set a threshold for a particular service overriding the default limit for transactions consuming this service.
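
The relationship between the default threshold and per-service overrides can be sketched as follows. This is a minimal Java illustration of the rule described above; the class and method names are hypothetical, and only the 5,000 ms default comes from the text.

```java
import java.util.HashMap;
import java.util.Map;

final class SlowTransactionCheck {
    // Out-of-the-box default threshold for all transactions
    static final long DEFAULT_THRESHOLD_MS = 5_000;

    // Thresholds set for particular services, overriding the default
    private final Map<String, Long> serviceThresholds = new HashMap<>();

    void setThreshold(String service, long thresholdMs) {
        serviceThresholds.put(service, thresholdMs);
    }

    // A transaction is flagged as slow when its duration exceeds the applicable threshold.
    boolean isSlow(String service, long durationMs) {
        long threshold = serviceThresholds.getOrDefault(service, DEFAULT_THRESHOLD_MS);
        return durationMs > threshold;
    }
}
```

For example, after setting a 1,000 ms threshold for an invoice payment service, a 1,200 ms payment would be flagged as slow, while a 1,200 ms transaction of any other service would not.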

Service detection works differently depending on what type of application Plumbr is monitoring.

Services in web applications

When the application is monitored by the Plumbr Browser Agent, service detection builds upon three components:

  • the URL at which the user was before the interaction
  • the interaction the user performed
  • the URL at which the user ended up after the interaction completed

[Figure: monitoring web applications]

In the example above, the user was viewing an invoice with the ID 123 and decided to pay the invoice by clicking the button Pay. The application processed the payment, after which the user remained at the same URL with the confirmation message that the invoice was paid successfully.

Plumbr identifies the service from this transaction as click “Pay” on /invoice/view/{1}. As seen from this example, all invoice payments carried out in the /invoice/view screen are grouped together under the same service.

Improving the service detection in web applications

There are known cases where the out-of-the-box Plumbr configuration ends up exposing services under cryptic names. As a result, you would see services similar to the following in the Plumbr UI:

Key pressed on “div > list > filter > input#filter” at /user/search

This happens when the element on which the user triggered the event has no human-readable attributes to use as the identifier. In such cases, Plumbr falls back to exposing the branch of the DOM tree where the event took place. To replace this with a human-readable version, use the “aria-label” attribute on such elements, so instead of

<input type="text"/>

you would use

<input type="text" aria-label="Name"/>

After this change, you would end up with nicely formatted services such as Key pressed on “Name” at /user/search. As a side effect, users of screen readers also get better access to the content of your site, as this is what the aria-label attribute was originally designed for.

Services in JVMs

Whenever a transaction that arrives at the JVM has not started in a browser monitored by the Plumbr Browser Agent, service detection is the responsibility of the JVM accepting the HTTP call. Service detection in JVM is done differently for different applications:

  • HTTP calls. If the transaction arrives via the HTTP protocol, Plumbr extracts the service from either of the two:
    • MVC framework metadata. If Plumbr supports the particular Java MVC framework used to process the incoming HTTP request, service detection uses the class/method name of the controller invoked by the transaction
    • For the transactions not processed by a supported MVC framework, the service is detected using the information encoded in the URL
  • EJB methods. If the transaction arriving at the JVM is a remote EJB call, Plumbr uses the EJB class and method name as the service
  • Swing event listeners. If the transaction was captured in a Swing application, Plumbr uses the Swing event and action listeners as the service

Detecting a service from MVC

When the JVM monitored by Plumbr exposes the services via an MVC framework supported by Plumbr, the service name is extracted from the controller processing the transaction. For example, when an HTTP request such as

http://www.example.com/payments?actionId=payInvoice&invoiceId=411121

is mapped and processed by a Struts controller, the service is extracted from the controller. An example of such a controller would then be visible in the Plumbr interface similar to:

com.example.payments.PaymentAction.execute().

The controllers from the following MVC frameworks are currently supported for service detection:

  • Spring MVC
  • Struts 1 & 2
  • GWT 2.x
  • JSF 1.1+
  • Vaadin 6+
  • ZK 7+

Detecting a service from a URL

Whenever the HTTP request is not processed by an MVC framework known to Plumbr, service detection falls back to capturing the service from the information encoded in the URL.

Let us explain this approach using a transaction arriving at the following URL as an example:

http://www.example.com/shop/cart/add/iPhone6

Service detection parses the URL to use /shop/cart/add/iPhone6 as the input. As seen, the last token identifying the product added to the shopping cart (iPhone6) is actually a parameter of the service. In order to group all interactions adding items to the shopping cart under the same service, Plumbr replaces the iPhone6 token in the URL with the placeholder {1}. As a result, the service detected from the transaction is

/shop/cart/add/{1}.

Using this approach makes it possible to group transactions accessing the same /shop/cart/add service together, regardless of the product you added to the shopping cart.
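
The placeholder substitution can be illustrated with a short Java sketch. How Plumbr decides which URL tokens are parameters is not described here, so the sketch assumes a simple heuristic: any path segment containing a digit is treated as a parameter. The class and method names are hypothetical.

```java
import java.net.URI;

final class UrlServiceNames {
    // Derives a service name from a URL by replacing parameter-like path segments
    // with numbered placeholders. Assumption: a segment containing a digit is a parameter.
    static String serviceFromUrl(String url) {
        String path = URI.create(url).getPath();
        StringBuilder service = new StringBuilder();
        int placeholder = 0;
        for (String segment : path.split("/")) {
            if (segment.isEmpty()) {
                continue; // skip the empty token before the leading slash
            }
            boolean looksLikeParameter = segment.chars().anyMatch(Character::isDigit);
            service.append('/')
                   .append(looksLikeParameter ? "{" + (++placeholder) + "}" : segment);
        }
        return service.length() == 0 ? "/" : service.toString();
    }
}
```

Under this assumption, both http://www.example.com/shop/cart/add/iPhone6 and a URL ending in /shop/cart/add/GalaxyS7 map to the same service, /shop/cart/add/{1}.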

Users

What are Users?

A user is an attribute of a transaction, which associates the transaction with the end user interacting with the application. Users in Plumbr are built upon two concepts: user tracking to distinguish one user from another and user identification to expose the identity of the user.

Tracking Users

User tracking works for web applications monitored by the Plumbr Browser Agent. User tracking is based on generating and storing a random unique string in the end user’s browser. This random string is then submitted along with each request from the particular browser.

The unique string is stored in the browser’s cookies, and so subsequent visits to the same site can be associated with the same user. The cookie used for this purpose is named plumbr_user_tracker.

In a Plumbr deployment where users are tracked but not identified, a unique user is counted each time your content is accessed from a different device or browser.

For example, the following journey appears as three different users in cases where user identification is not implemented:

  1. Searching for a product on a tablet or a phone one day,
  2. Purchasing the product on a desktop the next day,
  3. Filing a complaint about the purchased product on a laptop a week later.

Even if all these interactions were performed by an authenticated user, Plumbr would track it as three different users. While you can collect data about each of these interactions and devices, you cannot determine if any relationships exist. You only see independent data points.

The very same journey would be linked to a single user if users could be identified. In such a case, the interactions on different devices would be connected with the same user identity.
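
The difference between the two ways of counting can be sketched in Java. The Transaction class below is a hypothetical simplification that carries only the tracking cookie value and, when identification is in place, the identity of the user.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class UniqueUsers {
    // Hypothetical, simplified view of a transaction
    static final class Transaction {
        final String trackerId; // value of the tracking cookie in the browser
        final String identity;  // for example an email address; null when not identified

        Transaction(String trackerId, String identity) {
            this.trackerId = trackerId;
            this.identity = identity;
        }
    }

    // Counts users by identity when available, otherwise by the tracking cookie value.
    static int count(List<Transaction> transactions) {
        Set<String> users = new HashSet<>();
        for (Transaction t : transactions) {
            users.add(t.identity != null ? "identity:" + t.identity : "tracker:" + t.trackerId);
        }
        return users.size();
    }
}
```

Applied to the journey above, three transactions from three devices count as three users without an identity and as one user when all three carry the same identity.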

Plumbr user tracking only works for web applications. So, if you are monitoring EJB modules or Swing applications, Plumbr will be incapable of tracking the users in such deployments.

Identifying Users

In order to expose the journey of a specific user and track the same user across multiple devices, Plumbr also embeds the possibility of identifying users. The exposed identity can be in any form that the particular application can handle. Typical examples of identity are the username or email address of the user.

User identity is automatically linked to a transaction in applications where Plumbr is capable of determining the location of the identity. In cases where Plumbr fails to detect the identity automatically, you can configure the location of the identity yourself.

By default, Plumbr supports the following frameworks for capturing identity:

  • JWT Bearer tokens. If your application passes the identity of the user in the HTTP request headers using JWT Bearer tokens, Plumbr will use the value of the subject extracted from the token as the identity of the user.
  • Spring Security. If the application monitored by Plumbr uses the authentication built into the Spring Security library, Plumbr will extract the user’s identity from security.core.userdetails.UserDetails.getUsername().
  • Java Authentication and Authorization Service (JAAS). If the application Plumbr is monitoring stores the principal instances in the HTTP Session, Plumbr will extract the identity from security.Principal.getName().

The JWT Bearer token approach can be used for applications monitored only by Plumbr Browser Agents. Spring Security and JAAS detection only works in settings where the application is monitored by the Java Agent (regardless of whether or not the Browser Agent is used).
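
To illustrate what extracting the subject from a JWT Bearer token involves, here is a minimal Java sketch. It assumes the token is passed in an Authorization header of the form "Bearer <token>" and reads the standard "sub" claim from the payload. It does not verify the signature and uses a regular expression instead of a JSON parser to stay dependency-free, so it is a simplification and not Plumbr's implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JwtSubject {
    // Matches a string-valued "sub" claim in the decoded JSON payload
    private static final Pattern SUB = Pattern.compile("\"sub\"\\s*:\\s*\"([^\"]*)\"");

    static Optional<String> fromAuthorizationHeader(String header) {
        if (header == null || !header.startsWith("Bearer ")) {
            return Optional.empty();
        }
        // A JWT consists of three dot-separated parts: header.payload.signature
        String[] parts = header.substring("Bearer ".length()).trim().split("\\.");
        if (parts.length < 2) {
            return Optional.empty();
        }
        try {
            byte[] decoded = Base64.getUrlDecoder().decode(parts[1]);
            Matcher matcher = SUB.matcher(new String(decoded, StandardCharsets.UTF_8));
            return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
        } catch (IllegalArgumentException e) {
            return Optional.empty(); // payload is not valid Base64URL
        }
    }
}
```

The returned value, for example an email address stored in the subject claim, is what would be shown as the identity of the user.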

In case Plumbr has not been able to detect the user’s identity, you can help Plumbr locate the identity yourself via configuring an Identity Detection Rule. The steps needed to achieve this are explained in the following chapter.

Identity Detection Rules

If Plumbr has not been able to identify the users, you will need to help it find the location of the stored identities. You can do this by creating a new Identity Detection Rule that specifies the location of the user identity.

Identity Detection Rules can look for the identity of users from two different locations.

If your application passes along the identity of the user via HTTP Request Headers, you will need to configure an HTTP Header Rule. In this case, all you need to specify is the name of the HTTP Header from which Plumbr can extract the identity. The value for the Header would then be similar to the following:

X-User-Authentication

In case your application does not use HTTP Headers to pass along the identity, you will need to configure a Session Attribute Rule instead. In such a case, you will need to configure two parameters: Attribute Name and Extraction Path.

  • Attribute Name is the name of the attribute in the session context storing the user’s identity. For example, Spring Security adds its security context under the SPRING_SECURITY_CONTEXT attribute. When the specified attribute is not found in a particular HTTP Session, either because an incorrect attribute name was provided or because the application user is not yet authenticated, the Plumbr Agent will not proceed to detect the user’s identity from the Extraction Path.
  • In the Extraction Path field, you should define the exact path where the user identity is stored. For example, if Spring Security is used, the extraction path will be getAuthentication().getPrincipal().getUsername().

Combining the two parameters allows Plumbr to look for the identity. Using Spring Security as an example again, combining the two examples above results in Plumbr looking for:

session.getAttribute("SPRING_SECURITY_CONTEXT").getAuthentication().getPrincipal().getUsername();

Configuration: Example

To explain how you can configure the location of the user identity, consider the following example. In this example application, a successful authentication results in storing the user’s identity in the HTTP Session as:

request.getSession(true).setAttribute("USER_CONTEXT", new UserContext(ipAddress, username));

where request is an instance of javax.servlet.http.HttpServletRequest.

Let’s also assume that the UserContext class would be designed as:

public class UserContext {
    private final String ipAddress;
    private final User user;

    public UserContext(String ipAddress, String username) {
        this.ipAddress = ipAddress;
        this.user = new User(username);
    }

    public String getIpAddress() {
        return ipAddress;
    }

    // First step of the Extraction Path: getUser()
    public User getUser() {
        return user;
    }

    private final class User {
        // Private field: the username is reachable only through getUsername()
        private final String username;

        public User(String username) {
            this.username = username;
        }

        // Second step of the Extraction Path: getUsername()
        public String getUsername() {
            return username;
        }
    }
}

So we are adding an instance of UserContext to the session attributes under the key USER_CONTEXT. By default, the Plumbr Agent does not look for the identity in this location. To teach the Plumbr Agent how to extract the user identity in this case, we would specify the configuration as follows:

  • Attribute name: USER_CONTEXT
  • Extraction Path: getUser().getUsername()

Equipped with this knowledge, the Plumbr Agent will now monitor setAttribute() calls on all HttpSession instances. Whenever such a call occurs and the attribute set is “USER_CONTEXT”, Plumbr starts capturing the identity.

The identity itself is extracted by invoking getUser().getUsername() on the UserContext object stored under the “USER_CONTEXT” key.
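
To make the role of the Extraction Path more concrete, the following Java sketch evaluates such a path with reflection. It is only an illustration of the idea and not the Agent's actual code: the ExtractionPath class is hypothetical, it supports only chains of no-argument method calls, and the UserContext class is a compact stand-in for the one shown above.

```java
import java.lang.reflect.Method;

final class ExtractionPath {
    // Evaluates a chain of no-argument method calls, e.g. "getUser().getUsername()".
    static Object evaluate(Object root, String path) {
        Object current = root;
        try {
            for (String step : path.split("\\.")) {
                if (current == null) {
                    return null; // nothing stored at this point of the path
                }
                if (!step.endsWith("()")) {
                    throw new NoSuchMethodException(step + " is not a method call");
                }
                String methodName = step.substring(0, step.length() - 2);
                Method method = current.getClass().getMethod(methodName);
                method.setAccessible(true); // the declaring class may not be public
                current = method.invoke(current);
            }
            return current;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException(
                    "Failed to extract user identity with path: " + path, e);
        }
    }
}

// Compact stand-in for the UserContext class of the example above.
final class UserContext {
    private final User user;

    UserContext(String ipAddress, String username) {
        this.user = new User(username);
    }

    public User getUser() {
        return user;
    }

    static final class User {
        private final String username;

        User(String username) {
            this.username = username;
        }

        public String getUsername() {
            return username;
        }
    }
}
```

In this sketch, evaluating "getUser().getUsername()" on a UserContext created for the user jane yields "jane", while a path that does not resolve to public methods is rejected with an exception.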

If the Plumbr Agent fails to extract the identity with the defined Extraction Path, you will see the following banner in your application logs:

**********************************************************************
* Failed to extract user identity with path: getUser().username      *
* Please check your configuration here:                              *
* https://app.plumbr.io/settings/identity-detection                  *
**********************************************************************

Given the example above, the error message becomes clear: the username attribute in the User class is declared private and thus cannot be accessed. To fix this, change the Extraction Path to getUser().getUsername().

Alerts

What are Alerts?

Alerts are used to communicate health and uptime degradations. Alerts are designed to be actionable, triggering only when the performance impact is high enough to require action.

Each alert contains the following information:

  • The alert rule that triggered the alert.
  • The time when the alert was triggered.
  • The JVM or service that triggered the violation.
  • The severity/impact of the alert.

Alerts are triggered when the conditions in alert rules are matched. Alerts are sent to alert channels, which you can configure separately.

Alert Rules

Alert rules are used to create actionable alerts and send them to the correct channel when a particular JVM or service has been unhealthy or down for a certain period. Some examples of such rules are:

  • Send a message to [HipChat room MyCompany Sysadmins] when [any service] has been unhealthy for more than [1 hour].
  • Send a message to [HipChat room MyCompany Sysadmins] when [any JVM] has been down for more than [5 minutes].
  • Send [an email to admins@mycompany.com] when the [e-shop@example.com JVM] has been unhealthy for more than [5 minutes].
  • Send [an email to bigboss@mycompany.com] when the [e-shop@example.com JVM] has been unhealthy for more than [30 minutes].

Each such alert rule will trigger the creation of an alert. Alerts are sent out to alert channels, which you can configure separately.

To avoid alert fatigue and flapping, Plumbr sends out an alert triggered by the same JVM or service only once in 24 hours.
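
The once-in-24-hours behaviour can be sketched as a small Java class. This is an illustration under assumptions: the AlertThrottle name is hypothetical, and the sketch keys the suppression on the JVM or service name only, whereas the exact key Plumbr uses is not specified here.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

final class AlertThrottle {
    private static final Duration WINDOW = Duration.ofHours(24);

    // Time of the last alert sent for each source (JVM or service)
    private final Map<String, Instant> lastSent = new HashMap<>();

    // Returns true when an alert for the given source should be sent at the given time.
    boolean shouldSend(String source, Instant now) {
        Instant previous = lastSent.get(source);
        if (previous != null && Duration.between(previous, now).compareTo(WINDOW) < 0) {
            return false; // an alert for this source was already sent within 24 hours
        }
        lastSent.put(source, now);
        return true;
    }
}
```

With this logic, a JVM that keeps flapping between healthy and unhealthy produces one alert per day instead of one alert per state change.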

By default, Plumbr alert rules are set to trigger an alert when:

  • Any JVM has been unhealthy for more than 10 minutes.
  • Any service has been unhealthy for more than 15 minutes.
  • Any JVM has been down for more than 10 minutes.

You can either modify existing rules or add new rules based on your specific needs.

Alert Channels

Alert channels are used to communicate alert rule violations to external channels. Examples of such channels include email, SMS, issue trackers, chatrooms and other monitoring solutions. Adding, deleting and changing channels is possible via Settings – Alert Channels menu. Plumbr currently supports the following channels:

  • Email
  • HipChat
  • JIRA
  • PagerDuty
  • Slack

When you have configured a new alert channel, the newly created channel can be used to create new alert rules sending alerts to the channel.

Email

The Email channel sends alerts to an email address whenever any of the alert rules using the channel is triggered. When configuring the channel, all you need to do is provide the email address to which the alerts are sent. After this, all the alerts triggered by Plumbr will be sent to the email address provided, according to the alert rules present in the settings.

Email is also the alert channel configured by default during the sign-up. The e-mail address provided during the sign-up process is used as the endpoint to send alerts to.

HipChat

The HipChat channel sends alerts to a HipChat room whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide an existing HipChat room name and a notification token. To acquire the token, follow the steps below:

  1. Go to https://hipchat.com/ and sign in
  2. Navigate to “Rooms” section
  3. Select the room where you would like to receive Alerts from Plumbr
  4. Navigate to “Tokens” menu
  5. Fill the form under “Create new token” section
  6. Choose type “Send Notification” from the drop down menu
  7. Enter “Plumbr” to the label input
  8. Click “Create”
  9. Copy the created token to the “Notification token” input field

When the channel is created all the alerts triggered by Plumbr will be sent to the HipChat room specified according to the alert rules present in the settings.

JIRA

The JIRA channel creates an issue in JIRA whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide the following information about the issues created:

  • JIRA Base URL. The URL where your JIRA is located. You can acquire this URL from https://your-company-name.atlassian.net -> Settings -> System
  • Username. User present in your JIRA who will create the ticket. We recommend
  • Password. Password for the user specified.
  • Project Key. The key of the project the issue will be created in.
  • Issue Type. The type of the issue created in JIRA (Bug, Task, …)

When the channel is created all the alerts triggered by Plumbr will trigger creation of new issues in the JIRA instance specified according to the alert rules present in the settings.

PagerDuty

The PagerDuty channel sends alerts to PagerDuty whenever any of the alert rules using the channel is triggered. To set up the channel, you need to provide the PagerDuty integration key, which you can acquire via the following steps:

  1. Go to https://your-company-name.pagerduty.com/services/new
  2. Fill “General Settings” form
  3. In “Integration Settings” section choose “Use our API directly”
  4. Choose desired “Incident Settings”
  5. Click on “Add Service” button
  6. Copy generated “Integration Key” to corresponding input field

When the channel is created all the alerts triggered by Plumbr will be sent to the PagerDuty channel specified according to the alert rules present in the settings.

Slack

The Slack channel sends alerts to the specified Slack channel via a webhook. To set up the channel, you need to provide the name of an existing Slack channel and a webhook URL. To acquire the webhook URL, follow the steps below:

  1. Go to integration settings in your Slack account
  2. Choose a desired channel from “Post to Channel” dropdown
  3. Click on “Add Incoming WebHooks Integration” button
  4. Copy generated URL to the “Webhook URL” input field

When the channel is created all the alerts triggered by Plumbr will be sent to the Slack channel specified according to the alert rules present in the settings.
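
If you want to check that the webhook URL you copied accepts messages before relying on it for alerts, you can post a test message to it yourself. The sketch below uses the HTTP client of Java 11 and the JSON payload format of Slack incoming webhooks, an object with a "text" field. The SlackAlert class is hypothetical, and the format of the messages Plumbr itself sends is not shown here.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class SlackAlert {
    // Escapes the characters that must be escaped inside a JSON string.
    static String jsonEscape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"': out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Payload of a Slack incoming webhook: a JSON object with a "text" field.
    static String payload(String message) {
        return "{\"text\":\"" + jsonEscape(message) + "\"}";
    }

    // Builds (but does not send) the POST request for the given webhook URL.
    static HttpRequest request(String webhookUrl, String message) {
        return HttpRequest.newBuilder(URI.create(webhookUrl))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(payload(message)))
                .build();
    }
}
```

The request can then be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()); a 200 response indicates that the webhook accepted the message.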

Configuration

Java Agent Configuration

Java Agent configuration is stored in the file plumbr.properties located next to the Plumbr Java Agent .jar file. This configuration is used to monitor a single JVM on one machine. When monitoring multiple JVMs on the same machine, make sure that every JVM uses a different Plumbr installation to avoid clashes in the configuration.

If you cannot use a property file, an alternative is to specify the parameters in the JVM startup script using -D parameters, prefixing each parameter with the “plumbr.” namespace. For example, you could specify the accountId, jvmId and serverUrl parameters for your JVM via:

java -Dplumbr.accountId=a8nd2bar -Dplumbr.jvmId=PtJ -Dplumbr.serverUrl=https://app.plumbr.io

Please note that if you do not use a property file, you will need to pass the following properties via parameters: accountId, jvmId, serverUrl, logFile, logLevel. Copy their values from the property file that you have.
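
The naming convention of the two configuration sources can be illustrated with a small Java sketch: a parameter called jvmId in the property file is called plumbr.jvmId when passed as a -D parameter. The AgentSettings class is hypothetical, and the order of precedence it applies (JVM parameter first, property file second) is an assumption made for the illustration, not documented Agent behaviour.

```java
import java.util.Properties;

final class AgentSettings {
    // Looks up a parameter as a -Dplumbr.<name> JVM parameter first and,
    // if it is not set, in the properties loaded from the property file.
    static String resolve(String name, Properties fileProperties) {
        String fromJvmParameter = System.getProperty("plumbr." + name);
        return fromJvmParameter != null ? fromJvmParameter : fileProperties.getProperty(name);
    }
}
```

When the JVM is started with -Dplumbr.jvmId=PtJ, resolve("jvmId", fileProperties) returns PtJ in this sketch regardless of the content of the file.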

Basic configuration

Configuration parameters in this section are required for the Plumbr Agent to connect to the Plumbr Server, link the JVM to your account, and identify the JVM so that you could distinguish between the different JVMs monitored by Plumbr.

  • accountId – your account identifier, which binds this Agent to your account in the Plumbr Server. This identity is generated and embedded into the downloaded Agent configuration for you. During normal use, you should not change the value of the parameter.
  • jvmId – JVM identifier, binding data from this particular JVM to a correct JVM in the Plumbr Server. When this identity is not provided, the connected JVM gets assigned a temporary identifier, which will not survive over JVM restarts. In order to have the data connected to the same JVM, provide the identifier either as a value of this property or via the server-side UI.
  • appId – (optional) application name under which all transactions from this JVM will be reported. If not specified, Plumbr first tries to derive the application name from the URL of the HTTP request. If no URL is available, the application name falls back to jvmId.
  • clusterId – (optional) identifier of the JVM cluster. JVMs with the same clusterId are grouped together on the server side, where appropriate (like Architecture views). Setting clusterId is useful when individual JVMs run the same code and/or run in dynamically provisioned environments. If the value of clusterId is unspecified, no cluster grouping is applied to this JVM.
  • serverUrl – the server to which the Agent connects. If you are using On Demand Plumbr, make sure the value refers to https://app.plumbr.io. If the Agent is connecting to a Plumbr Server installed on your premises, make sure you have specified the correct server URL.

Proxy configuration

When your network configuration requires outgoing communication to pass through a proxy server, you can set up the communication between the Plumbr Agent and the Server via a proxy. Specifying the values for these parameters redirects the traffic from the Agent to the Server via the proxy server specified in proxyUrl.

  • proxyUrl – the proxy URL that you can use to connect to the Plumbr Server if a direct connection from the Agent is not possible. If a proxy is used, this setting is mandatory; the other proxy settings are optional. An example of the parameter: proxyUrl=http://squid.mycompany.com:3128.
  • proxyAuthUser – the username for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.
  • proxyAuthPassword – the password for proxy authentication. Note that the Plumbr Agent only supports Basic authentication.
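
For reference, Basic authentication means that the username and password are joined with a colon, Base64-encoded and sent to the proxy in the Proxy-Authorization header. The Java sketch below shows how such a header value is built. The ProxyBasicAuth class is hypothetical and only illustrates the scheme, which can be useful when checking proxy logs or testing the credentials by hand.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class ProxyBasicAuth {
    // Builds the value of a Proxy-Authorization header for Basic authentication.
    static String headerValue(String user, String password) {
        String credentials = user + ":" + password;
        String encoded = Base64.getEncoder()
                .encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        return "Basic " + encoded;
    }
}
```

Note that Base64 is an encoding and not encryption, so the credentials are only as protected as the connection to the proxy.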

Logging configuration

Parameters in logging configuration are used to tune the logging of the Plumbr Agent.

  • logConf – the location of the Logback configuration file the Plumbr Agent uses for logging purposes. Detailed logging configuration is embedded in the referred XML file, which you can tune to suit your needs.
  • tmpDir – the location of the temporary files generated by Plumbr during runtime. Such temporary data includes the buffered data the Agent has not yet sent to the Server and temporary data structures used during Agent-side analysis. The location is relative to the location of the Plumbr Agent in the file system.
  • doCleanup – whether or not Plumbr deletes the temporary files after they are no longer needed. Switch this to false only when told to do so by Plumbr support.

Network Configuration

When your network configuration is blocking Plumbr Agents from connecting to the Plumbr Server, you will see a message similar to the following in your JVM standard output logs:

****************************************************************
* Plumbr Server not responding -                               *
* cannot connect to https://app.plumbr.io.                     *
* Retrying in 60 seconds.                                      *
****************************************************************

Note that although the Server cannot be reached, the JVM will still start; it will just not be monitored by Plumbr.

To verify that the problem is related to network configuration only, try connecting to app.plumbr.io port 443 from the machine on which you are installing Plumbr to see whether the connection is allowed. This can, for example, be done via telnet, similar to the following:

$ telnet app.plumbr.io 443
Trying 54.171.1.110...
Connected to app.plumbr.io.
Escape character is '^]'.

When the connection is successful, you should see a message Connected to app.plumbr.io, similar to the example above. When the connection fails, the network configuration is blocking connections to app.plumbr.io port 443.
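
Where telnet is not available, the same check can be scripted. The following is a minimal sketch in Python and is not part of Plumbr; the function name can_connect and the five-second default timeout are illustrative assumptions. It only verifies that a direct TCP connection can be opened and does not go through a proxy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a direct TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and opens the socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Calling can_connect("app.plumbr.io", 443) on the machine running the JVM corresponds to the telnet command above: a result of False suggests that the network configuration is blocking the connection.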

To overcome the situation, the first two options we recommend are connecting via a proxy server or relaxing the firewall configuration. If you cannot change the network configuration, you should turn to our On Premise offering, where you can install the Server component in your own network.

Proxy configuration

When your network configuration requires outgoing communication to pass a proxy server, you can set up the communication between the Plumbr Agent and Server via proxy. Specifying the values for the following parameters in the plumbr.properties file located next to the Plumbr Agent redirects traffic from the Agent to the Server via the proxy server specified in the proxyUrl.

  • proxyUrl– the proxy URL that you can use to connect to the Plumbr server if direct connection is not possible. If proxy is used, this setting is mandatory; other proxy settings are optional. An example of the parameter: proxyUrl=http://squid.mycompany.com:3128.
  • proxyAuthUser– the username for proxy authentication. Note that the Plumbr agent only supports Basic authentication.
  • proxyAuthPassword– the password for proxy authentication. Note that the Plumbr agent only supports Basic authentication.
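
The rules above can be checked before starting the JVM. The following is a minimal sketch in Python and is not part of Plumbr; the function name proxy_config_problems is hypothetical, and the rule that the username and password should be set together is an assumption rather than documented behaviour. It expects the proxy parameters as a plain dictionary.

```python
from typing import Dict, List
from urllib.parse import urlsplit


def proxy_config_problems(properties: Dict[str, str]) -> List[str]:
    """Return human-readable problems found in the proxy settings."""
    problems = []
    url = properties.get("proxyUrl", "")
    has_user = "proxyAuthUser" in properties
    has_password = "proxyAuthPassword" in properties
    if not url:
        # proxyUrl is mandatory as soon as any other proxy setting is present.
        if has_user or has_password:
            problems.append("proxyUrl is mandatory when a proxy is used")
        return problems
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        problems.append("proxyUrl should look like http://host:port")
    # Assumption: Basic authentication needs both a username and a password.
    if has_user != has_password:
        problems.append("proxyAuthUser and proxyAuthPassword should be set together")
    return problems
```

An empty list means the settings are consistent with the description above; it does not prove that the proxy accepts the credentials.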

Firewall configuration

Another option for resolving connectivity issues is to check your firewall configuration. If the outgoing connections from the Plumbr Agent are blocked by a firewall, make sure the connection to app.plumbr.io:443 is allowed in your firewall configuration.

The exact configuration steps for this are firewall-specific. See your vendor manuals for further information.

Upgrading Java Agent

Automatic Upgrade

Starting from version 17.07.11, Plumbr Java Agents are capable of upgrading themselves automatically. Automatic upgrade is enabled by default. You can disable or re-enable it under the Settings menu if the default is not suitable for you, either due to the nature of your deployment (throwaway containers, for example) or company policy.

Whenever a new Plumbr Java Agent version is released, existing Agents connected to the Plumbr Server download the updated version. The switch to the newly downloaded version happens after the next JVM restart.

More detailed flow of the auto-update for those who want to take a peek under the hood:

  1. When a connection is established between the Java Agent and the Server, the Agent checks whether auto-update is enabled. If auto-update is disabled, the process is aborted.
  2. After the connection is established, a message is sent to the Java Agent if its build number is lower than the build number of the highest known Agent version. The message contains the version to update to, and the checksum and download URLs for that specific version.
    1. The very same message is broadcast to all connected Java Agents each time you change the setting from enabled to disabled or vice versa in the Settings menu.
  3. When the Java Agent receives the message, it downloads the new Agent using the URLs from the message and performs the checksum verification.
  4. The downloaded ZIP is unzipped into a separate directory inside the installation directory; the name of this directory starts with the version.
  5. On the next restart of the JVM the Agent is attached to, the wrapper automatically selects the new version because it has a higher build number than the previous one.
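
To illustrate steps 3 to 5, here is a minimal sketch in Python. It is not the Agent's actual implementation. The checksum algorithm (SHA-256), the YY.MM.DD prefix of the directory names and both function names are assumptions made for the example, and comparing version prefixes stands in for the build number comparison the wrapper performs.

```python
import hashlib
import re
from pathlib import Path
from typing import Optional

# Assumption: version directories start with a version such as "17.07.11".
VERSION_PREFIX = re.compile(r"^(\d+)\.(\d+)\.(\d+)")


def checksum_matches(archive: Path, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Compare the digest of the downloaded archive with the expected value."""
    digest = hashlib.new(algorithm)
    with archive.open("rb") as stream:
        # Read in chunks so large archives do not have to fit in memory.
        for chunk in iter(lambda: stream.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


def newest_version_dir(install_dir: Path) -> Optional[Path]:
    """Return the subdirectory whose name starts with the highest version."""
    candidates = []
    for child in install_dir.iterdir():
        match = VERSION_PREFIX.match(child.name)
        if child.is_dir() and match:
            version = tuple(int(part) for part in match.groups())
            candidates.append((version, child))
    if not candidates:
        return None
    return max(candidates, key=lambda item: item[0])[1]
```

Comparing the versions as tuples of integers avoids the pitfalls of comparing them as strings, where "17.10.01" would sort before "17.9.01".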

Notice that auto-updating is only possible for our On Demand customers. On Premises users must use the manual agent update process.

Manual Upgrade

If you need to upgrade the Plumbr Java Agent manually (either from a version older than 17.08.05 or because of your company policies), go through the following steps:

  1. Download the new Java Agent from here.
  2. Back up the current Agent installation.
  3. Unzip the newly downloaded Agent .zip file to the folder where you wish to install the Plumbr Java Agent.
  4. Copy plumbr.properties from the backup to the new Agent installation.
  5. Update the startup parameters of the JVM you are monitoring to point to the new Agent JAR file:
    -javaagent:path-to-new-agent/plumbr.jar
  6. Restart the JVM you want to monitor.

Upgrading Plumbr Server

The Plumbr Server update is based on building new Docker images and mounting the data to the newly built images. No data stored in the Docker images themselves is preserved, so you cannot expect any manual configuration changes made to existing Docker machines to survive the upgrade.

To upgrade Plumbr Server you need to go through the following steps.

  1. Download a new version of the Plumbr Server distribution from the Download Center.
  2. Extract the downloaded archive on top of the existing plumbr-server folder, replacing all existing files.
  3. Restart the Docker Compose project by running “./launch.sh” from the plumbr-server folder. This will download all updated Docker images and then recreate all affected containers.
  4. After the process completes, the new version of Plumbr Server is available at the same URL as before.

As a next step, the Plumbr Agents connecting to the Server need to be upgraded. You can do this independently of the Server update, but for consistency you need to eventually also upgrade the Agents.

On-Premise OOM analysis

Running analysis when an OutOfMemoryError occurs in an application is computationally intensive and requires an amount of RAM proportional to the number of objects in the JVM. Therefore, when running your own Plumbr Server on-premise, additional steps must be taken to find root causes for an OutOfMemoryError.

Semi-automatic analysis

The OutOfMemoryError meta-information snapshot is always sent automatically by the Plumbr Agent to the Plumbr Server after such an error occurs. A corresponding Root Cause screen will appear in Plumbr Server, prompting you to do the following:

  1. Select or create a machine with an amount of RAM as specified on the page
  2. Download a .jar file that would perform the analysis to that machine
  3. Run it, supplying the required amount of heap to the JVM by specifying the -Xmx argument

The program will automatically download the meta-information from Plumbr Server, run the analysis and upload a complete report back to Plumbr Server. This assumes two things:

  • You have specified the plumbr.server.url property in the server properties or set it in the web interface
  • The machine that runs the analysis has access to the machine where Plumbr Server is running

In case the first condition is not met, you can still run the analysis by supplying a property to the jar file: -Dportal.url=https://address-of-your-plumbr-server-installation.

Running analysis from behind a firewall

In case access to the Plumbr Server is restricted by a firewall, some additional manual actions are required:

  1. Click on “Detailed information” on the Root Cause page to follow the instructions
  2. Copy the meta-information files named oom_dump_v4.tbz2 and oom_dump_info.txt to the target machine
  3. Supply the path to the copied .tbz2 file to the jar running command
  4. Supply the path to where the report should be saved, e.g. report.bin
  5. Run the analysis, e.g. java -Xmx1g -jar analyze-oom ~/oom-analysis/1/oom_dump.tar.bz2 report.bin
  6. Upload the report.bin file to the corresponding form on the Root Cause page

Data retention

By default, meta-information files will be immediately deleted upon successful analysis. Data files that date back more than 30 days will also be deleted, even if no analysis was performed on them. At any time you may manually delete the dumps from the ${plumbr.server.home}/data/dumps folder.
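
If you want to review which dumps have passed the 30-day mark before deleting anything by hand, a small script can list them. The following is a minimal sketch in Python and is not part of Plumbr Server; the function name dumps_older_than is hypothetical, and using the file modification time as the age of a dump is an assumption.

```python
import time
from pathlib import Path
from typing import List, Optional


def dumps_older_than(dumps_dir: Path, days: int = 30, now: Optional[float] = None) -> List[Path]:
    """List files in dumps_dir whose modification time is more than `days` days ago."""
    reference = time.time() if now is None else now
    cutoff = reference - days * 86400  # 86400 seconds in a day
    return sorted(
        path
        for path in dumps_dir.iterdir()
        if path.is_file() and path.stat().st_mtime < cutoff
    )
```

The function only reports files and never deletes them; pass it the data/dumps folder of your Plumbr Server home directory.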

Identifying the JVM

When you configure your application to run with Plumbr, you have the option to identify the JVM with a name suitable for your deployment, for example “Payment Live” or “Reporting QA”. The ID can be assigned in three ways:

  • By setting the “jvmId” property in the plumbr.properties file. This file resides in the same directory as the “plumbr.jar” file. Find the line starting with “jvmId=” in that file and append the selected name, similar to the example: “jvmId=Payment”.
  • By providing your JVM with the “plumbr.jvmId” system property. Add “-Dplumbr.jvmId=Payment” to your application command line, e.g. “java -Dplumbr.jvmId=Payment -javaagent:/path/to/plumbr/plumbr.jar …”. This option is useful in dynamic environments, where JVMs are created and destroyed dynamically via scripts.
  • When your application is already running and is connected to the Plumbr Server, by going to the detail view of the JVM, clicking on the JVM name, providing a new name and clicking “Save”.

If you don’t manually identify your JVM as described above, it gets assigned an auto-generated ID. The generated ID will be ephemeral in the sense that it will not persist between JVM restarts. When you restart your application and have not provided the ID yourself, then a new JVM with a new identifier will be created in Plumbr Server.

Agent startup checks

While starting the Agent, Plumbr goes through the following checks to verify the integrity of the installation:

  1. Verifying file system permissions: whether or not the folder the Plumbr Agent resides in and its subdirectories are readable and writable by the user launching the JVM Plumbr is attached to.
  2. Verifying the configuration: whether or not all the required configuration parameters are present and valid.
  3. Verifying the support for the environment: whether the OS, JVM and application server used are supported by Plumbr.
  4. Verifying the connectivity: whether the Plumbr Agent can connect to the Server.
  5. Verifying the Agent version: whether the Agent is still supported by the Server the Agent is connecting to.
  6. Verifying the subscription: whether the account has an active subscription or the subscription period has expired.
  7. Other miscellaneous checks, for example including but not limited to:
    1. If a proxy server is used to connect to Plumbr Server, then whether the proxy can be connected with the credentials provided
    2. Whether or not the jvmId parameter used to identify the JVM connecting to Server is unique
    3. Whether or not multiple JVMs are using the same Plumbr installation.
    4. Whether or not the jvmId belongs to the account it tries to connect to.

Some of the steps can fail for various reasons. In such a case the Agent will not be attached and the JVM starts without the Plumbr Agent monitoring the end user experience. The reason for the failure will be exposed in the server’s standard output.

To find out whether or not the Agent failed to initialize, search the log files for the phrase “Plumbr”. In case you discover one of the error messages listed below, follow the instructions specified in the particular error message.

Verifying file system permissions

********************************************************************
* Plumbr encountered a filesystem permissions error.               *
* The user that runs your Java process has no write access to the  *
* /users/me/plumbr                                                 *
* Plumbr needs write access to the whole directory.                *
*                                                                  *
* Please ensure that the user that runs your Java process has      *
* read and write permissions for that directory,                   *
* its sub-directories and files inside it.                         *
*                                                                  *
* Check out https://plumbr.io/support/agent-configuration          *
* for more information or contact support@plumbr.io                *
********************************************************************

When you encounter this error message in log files, it indicates that the user launching the JVM Plumbr Agent is attached to does not have enough permissions to access the Plumbr Agent installation directory.

The Plumbr Agent needs to be able to read and write the folder the Agent is installed in (and its subdirectories). To proceed, ensure that the user running the Java process the Agent is attached to has read and write permissions for both the Agent’s installation folder and its subdirectories.
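
To find out which paths are affected, you can walk the installation folder and test each entry. The following is a minimal sketch in Python and is not part of Plumbr; the function name paths_lacking_read_write is hypothetical. It must be run as the same user that launches the JVM, because the check applies to the current user.

```python
import os
from pathlib import Path
from typing import List


def paths_lacking_read_write(agent_dir: Path) -> List[Path]:
    """List the folder, subdirectories and files the current user cannot read and write."""
    needed = os.R_OK | os.W_OK
    problems = [] if os.access(agent_dir, needed) else [agent_dir]
    for root, dirs, files in os.walk(agent_dir):
        # Check directories here as well: os.walk silently skips unreadable ones.
        for name in dirs + files:
            path = Path(root, name)
            if not os.access(path, needed):
                problems.append(path)
    return problems
```

An empty result for the folder named in the banner (/users/me/plumbr in the example above) means the current user has the access described in this section.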

Verifying the configuration

******************************************************************************
* Plumbr is missing the following required properties: serverUrl.            *
* Either make sure the plumbr.properties file is present next to plumbr.jar  *
* or specify individual properties via -D parameters in your startup script. *
*                                                                            *
* Check out https://plumbr.io/support/agent-configuration                    *
* for more information or contact support@plumbr.io                          *
******************************************************************************

When you see such a banner during JVM startup, either the entire configuration stored in the plumbr.properties file or some of the mandatory properties are missing. The plumbr.properties file is located next to the Plumbr Agent .jar file.

The first step to overcome the problem is to make sure the Plumbr installation directory is intact and that you have not extracted only the Agent’s plumbr.jar file. In case the plumbr.properties file is present, check the error message for the missing mandatory parameter(s) and add them to the file. When in doubt, check the Agent Configuration page in our support materials.

An alternative to keeping the configuration in the plumbr.properties file is to specify the parameters in the JVM startup script using -D parameters, prefixing each parameter with the “plumbr.” namespace. For example, you could specify the accountId, jvmId and serverUrl parameters for your JVM via:

java -Dplumbr.accountId=a8nd2bar -Dplumbr.jvmId=BillingProduction -Dplumbr.serverUrl=https://app.plumbr.io
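
When the configuration already exists as a properties file, the -D parameters can be generated from it. The following is a minimal sketch in Python and is not part of Plumbr; both function names are hypothetical, and the parser is deliberately simplified: it handles only key=value lines and # comments, not the full Java properties syntax.

```python
from typing import Dict, List


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        properties[key.strip()] = value.strip()
    return properties


def to_system_properties(properties: Dict[str, str]) -> List[str]:
    """Turn each parameter into a -D argument in the "plumbr." namespace."""
    return [f"-Dplumbr.{key}={value}" for key, value in properties.items()]
```

Joining the resulting list with spaces gives the arguments to place between java and the rest of the command line, in the same form as the example above.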

Verifying the support for the environment

**********************************************************************************************
* Environment you are trying to run Plumbr in is not supported.                              *
* Windows XP operation system is unsupported. Minimum supported Windows version is Windows 7.*
*                                                                                            *
* Check out the support page https://plumbr.io/support/is-my-environment-supported-by-plumbr *
* for the list of supported environments.                                                    *
**********************************************************************************************

When facing a banner similar to the one above in your log files, the environment the Plumbr Agent is running in is not supported by the Agent. The exact message will be different, depending on which unsupported operating system, JVM vendor/version or application server was detected in the environment.

To overcome the issue, consult the list of supported environments in our support documentation to find out whether or not you have a possibility to use Plumbr in an environment officially supported by Plumbr.

Verifying the connectivity

***********************************************
* Plumbr Server not responding -              *
* cannot connect to https://app.plumbr.io.    *
* Retrying in 60 seconds.                     *
***********************************************

***************************************************************
* Plumbr Server not responding -                              *
* cannot connect to https://app.plumbr.io.                    *
* Retrying in 60 seconds.                                     *
*                                                             *
* In case your network configuration is blocking connections  *
* to Plumbr servers, see how to configure proxy server and/or *
* firewall https://plumbr.io/support/network-configuration.   *
*                                                             *
* In case your company policy does not allow using externally *
* hosted services, try out Hosted Plumbr which does not       *
* require external network connections                        *
* https://plumbr.io/support/hosted-plumbr                     *
***************************************************************

When you see a banner similar to either of the above in your JVM log files, the deployed Agent cannot connect to the Plumbr Server over the network. Plumbr Agents will start without the presence of the Server, but as there is no endpoint to send the harvested data to, the Server cannot analyze the gathered data and you receive no value from Plumbr.

Note that when the Server endpoint is only temporarily unavailable, the Agent buffers the data locally. The Agent also periodically retries connecting to the Server, and when the Server (re)appears, the buffered data is sent to it.

To verify that the problem is related to network configuration only, try connecting to app.plumbr.io port 443 from the machine on which you are installing Plumbr to see whether the connection is allowed. This can, for example, be done via telnet, similar to the following:

$ telnet app.plumbr.io 443
Trying 54.171.1.110...
Connected to app.plumbr.io
Escape character is '^]'.

When the connection is successful, you should see a message Connected to app.plumbr.io, similar to the example above. When the connection fails, the network configuration is blocking connections to app.plumbr.io port 443.

To overcome the situation, the first two options we recommend are connecting via a proxy server or relaxing the firewall configuration. If you cannot change the network configuration, you should turn to our On Premise offering, where you can install the Server component in your own network.

Verifying Agent version

During startup, the Agent version is compared to the Server version to verify whether or not the Agent is still supported by the Server the Agent is connecting to. In general, we use the following policy for Agent version support:

  • Servers accept connections from Agents up to one year older than Servers. Agents older than one year will be rejected by Server.
  • Servers will not be compatible with Agents released later than the Server.

Recommending to upgrade

************************************************************************************************
* You are using version 16.08.02 of Plumbr. We recommend upgrading to the latest version 12356.   *
* Download the latest version of the agent here: https://app.plumbr.io/download/agent/16.09.20 *
************************************************************************************************

When facing a banner like the one above, the Agent you are currently using is between 1 and 3 months behind the latest Agent available. As we add new features almost every month, you should consider upgrading, but you have nothing to worry about yet.

Strongly recommending to upgrade

********************************************************************************************************
* You are using version 16.08.02 of Plumbr, which will be supported only until 2017-01-01                 *
* Download the latest version (16.12.12) of the agent here: https://app.plumbr.io/download/agent/16.12.12 *
********************************************************************************************************

When facing the banner above, your current Agent is between 3 and 6 months older than the Server the Agent connects to. The Server will still support the connecting Agent, but you should start planning for the Agent version upgrade.

Deprecated Agent version

********************************************************************************************************
* You are using deprecated version 16.08.02 of the Plumbr agent.                                         *
* Download the latest version (17.04.02) of the agent here: https://app.plumbr.io/download/agent/17.04.02 *
********************************************************************************************************

When facing the message above, the Agent connecting to the Server is 6 to 12 months older than the Server it connects to. The version is already deprecated and will become unsupported once the 12-month limit is reached. You should plan the Agent version upgrade as soon as possible.

Unsupported Agent version

**************************************************************************************************************
* You are using unsupported version 16.08.02 of the Plumbr Agent, which can no longer connect to Plumbr Server. *
* Please upgrade to the latest version of the agent from: https://app.plumbr.io/settings/download-center *
**************************************************************************************************************

When seeing the error banner above, the Agent can no longer connect to the Server, as the Agent version is older than the oldest Agent version supported by this particular Server. You need to upgrade to a new Agent version to continue benefitting from Plumbr.

Agent version newer than Server version

*************************************************************************************
* You are using unsupported version 16.08.02 of the Plumbr Agent                       *
* Plumbr Server accepts only agents released before the server                     *
* Please refer to Download Center https://app.plumbr.io/settings/download-center *
* to get supported version of Plumbr Agent                                         *
*************************************************************************************

When facing the error message above in your JVM logs, the Agent connecting to the Server is newer than the Server. As the Server accepts connections only from Agents it knew existed when the Server was released, this “agent from the future” is not allowed to connect.

When facing this message, you are using our On Premise offering, where you have installed the Server yourself. This means you have two options:

  • Preferably, upgrade the Server, so that new Agents can apply all their new features in your deployment.
  • If this is not possible, use an older Agent version, so that the Agent is not released later than the Server it connects to.
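
The version policy described in this section can be summarised in a few lines of code. The following is a minimal sketch in Python and is not the Server's actual logic. It assumes that a version such as 16.08.02 encodes the release date 2016-08-02, approximates 1, 3, 6 and 12 months as 30, 90, 180 and 365 days, and compares the Agent only against the Server version; the function names and status strings are hypothetical.

```python
from datetime import date


def release_date(version: str) -> date:
    """Interpret a version such as "16.08.02" as the date 2016-08-02 (assumption)."""
    year, month, day = (int(part) for part in version.split("."))
    return date(2000 + year, month, day)


def support_status(agent_version: str, server_version: str) -> str:
    """Classify an Agent by its age relative to the Server it connects to."""
    age_days = (release_date(server_version) - release_date(agent_version)).days
    if age_days < 0:
        return "rejected: agent newer than server"
    if age_days > 365:
        return "unsupported"
    if age_days > 180:
        return "deprecated"
    if age_days > 90:
        return "upgrade strongly recommended"
    if age_days > 30:
        return "upgrade recommended"
    return "up to date"
```

Only the first two outcomes prevent the Agent from connecting; the remaining ones correspond to the informational banners shown earlier in this section.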

Verifying the subscription

Plumbr is subscription-based software with a 14-day free trial subscription available. When the subscription has expired, the data on your account is kept for 10 more days, but you can no longer monitor your applications with Plumbr. Once 10 days have passed since your subscription expired, the data on your account is permanently deleted.

Expiration warning

***************************************************************************************
* Your subscription will expire on 2016-01-01.                                        *
* From 2016-01-01 Plumbr will not monitor your JVM(s) any more.                       *
* To renew your subscription get in contact with Plumbr Sales at sales@plumbr.io.     *
***************************************************************************************

When encountering such a warning in your log files, your subscription will expire soon. If you wish to benefit from Plumbr after the subscription period, you should start planning for the subscription extension.

Paid account expired

**********************************************************************************************
* Your subscription expired on 2016-01-01 and Plumbr is not monitoring your JVM(s) any more. *
* Your data will be available for 10 days, after which your account will be deleted.         *
* To renew your subscription go to https://app.plumbr.io/payment                          *
**********************************************************************************************

If you encounter the message above, your subscription has expired. The data is still present on your account, but you can no longer monitor any JVMs. Once 10 days have passed since the subscription expired, the data will be permanently deleted.

To keep benefitting from Plumbr, extend your subscription.

Paid account deleted

******************************************************************************
* As you subscription was not renewed your Plumbr account has been deleted   *
* and you cannot monitor your JVM(s) with Plumbr any longer.                 *
* To start monitoring your JVM(s), sign up and purchase Plumbr               *
* one year subscription: https://app.plumbr.io/payment                    *
******************************************************************************

Encountering the message above indicates your subscription has expired and more than 10 days have passed from the expiration date. The data on your account has been deleted.

The way to start reusing Plumbr is to purchase a new subscription.

Trial account expired

***************************************************************************************
* Your free trial is now expired and Plumbr is not monitoring your JVM(s) any more.   *
* Your data will be available for 10 days, after which your account will be deleted.  *
* To activate your account go to https://app.plumbr.io/payment                        *
***************************************************************************************

The message above indicates that the free trial you used has expired. You can no longer monitor the JVMs with Plumbr. The data gathered during the trial is still available for you until 10 days have passed from the trial expiration. After this, the data on your account will be permanently deleted.

Trial account deleted

*************************************************************************************
* As your free trial expired your Plumbr account has been deleted and you cannot    *
* monitor your JVM(s) with Plumbr any longer. To start monitoring your JVM(s),      *
* sign up and purchase Plumbr subscription: https://app.plumbr.io/payment        *
*************************************************************************************

The banner indicates that your free trial has expired and the data on your account has been deleted.

If the trial demonstrated the value of Plumbr to you, then the way to keep using Plumbr is to switch to a paid subscription.

Miscellaneous checks

Besides the categories above, the Plumbr Agent performs a number of other checks, which can also result in errors or warnings being printed to the JVM standard output.

Connecting to wrong Server

***************************************************************************************************
* The account does not exist at https://my-plumbr-server-installation:8080/.                      *
* Check plumbr.properties to make sure you are connecting to the correct Plumbr Server instance.  *
* If indeed so, contact support@plumbr.io                                                         *
***************************************************************************************************

When facing the error above, the Agent is connecting to a Server using an accountId the Server is not aware of. This usually means you are connecting to an incorrect Plumbr Server. If this is the case, make sure the serverUrl in plumbr.properties points to the correct Server.

If this is not the case, contact support@plumbr.io and let us figure out the source of the problem.

Lambda support in early Java 8 releases

*******************************************************************************************************
* There is a known issue with Java versions 1.8.0 - 1.8.0_31 where using Java agents                  *
* together with code that uses dynamic invocation (such as lambdas or dynamic languages)              *
* may cause segmentation faults. If these are not used in your application, your JVM may be safe,     *
* but for production sites we do not recommend using Plumbr with Java 8 versions older than 1.8.0_40. *
* To make sure this problem will not occur, either:                                                   *
*   a) Upgrade your Java version to 1.8.0_40 or newer                                                 *
*   b) If upgrading Java version is not possible, turn off JIT compilation for java.lang.invoke       *
* package by specifying -XX:CompileCommand=exclude,java/lang/invoke/ in your JVM startup script.      *
*******************************************************************************************************

When facing the warning above, you are running on an early Java 8 build. Such builds are known to contain bugs that affect your JVM when you use lambdas or dynamic languages along with any Java Agents attached to the JVM.

The application might work fine, but to make sure you will not run into any issues, please consider either:

  • upgrading the Java version to 1.8.0_40 or newer
  • turning off JIT compilation, as specified in the error message.
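
To check a fleet of JVMs against the 1.8.0_40 threshold, the version string can be parsed. The following is a minimal sketch in Python and is not part of Plumbr; the function names are hypothetical, and the input is assumed to be in the form reported by the java.version system property, such as 1.8.0_31.

```python
import re
from typing import Optional

# Matches "1.8.0" optionally followed by an update number, e.g. "1.8.0_31".
_JAVA8 = re.compile(r"^1\.8\.0(?:_(\d+))?")


def java8_update(version: str) -> Optional[int]:
    """Return the update number of a 1.8.0 version string, or None for other versions."""
    match = _JAVA8.match(version)
    if not match:
        return None
    # A plain "1.8.0" has no update suffix and counts as update 0.
    return int(match.group(1) or 0)


def needs_workaround(version: str) -> bool:
    """True for Java 8 builds older than 1.8.0_40, the minimum recommended above."""
    update = java8_update(version)
    return update is not None and update < 40
```

The check deliberately uses the recommended minimum of 1.8.0_40 rather than the narrower 1.8.0 to 1.8.0_31 range named in the banner, so it errs on the side of caution.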

Native agent loading failure

*****************************************************************
* Native agent could not be loaded from                         *
* /users/me/plumbr                                              *
* This may be caused by missing read or execute permissions for *
* plumbr home directory or one of its subdirectories.           *
*                                                               *
* Check out https://plumbr.io/support/agent-configuration       *
* for more information or contact support@plumbr.io             *
*****************************************************************

When facing the error above, the native agents located in the lib/ folder next to the Agent’s plumbr.jar file are not readable or executable by the user launching the JVM Plumbr is attached to.

To solve the problem, make sure the user launching the JVM the Plumbr Agent is attached to has read and execute permissions for the lib/ folder and its subdirectories.

In all honesty, this is one of the cases where we do not fully understand how the situation can arise. So if you are facing it, we would really appreciate it if you could contact support@plumbr.io so we could understand how on earth this permission issue can even happen.

Proxy credentials missing

**********************************************************************************
* The Proxy server at your-proxy-server.ip:3039 is requesting a username and a password.           *
* Please add them to plumbr.properties file. You can find the instructions here: *
* https://plumbr.io/support/manual#network-configuration                         *
**********************************************************************************

If you see the banner above in your JVM standard output, the Agent is trying to connect to the Server through a proxy server. The proxy server requires authentication, but the configuration you provided in plumbr.properties does not contain a username and password.

To solve the issue, provide the correct username and password in the Plumbr configuration to access the proxy.

Multiple JVMs using the same jvmId

*************************************************************************************************************************
* The JVM ID "myjvmid" is already in use. This happens when multiple JVMs are connecting                                *
* to Plumbr Server using the same jvmId configuration parameter specified in plumbr.properties file.                    *
* In order to solve the problem, download new Plumbr agent from here: https://app.plumbr.io/settings/download-center *
* and make sure the agent location in the new JVM refers to a different Plumbr installation in file system.             *
*************************************************************************************************************************

When you see the error message above, the Plumbr Agent was not started, because another JVM is already connected to the Server using the same jvmId as the one specified for the rejected JVM.

This can happen when you have copied the Plumbr installation used by one JVM and are using it for a second JVM. The jvmId specified in the plumbr.properties file must be unique. To solve the issue, make sure that every JVM you want to monitor with Plumbr has a unique jvmId specified in its configuration (either in plumbr.properties or passed as a -D parameter).

Multiple JVMs accessing the same Plumbr installation

**************************************************************************
* Working directory is locked. This might happen when you launch several *
* applications from the same plumbrHome at the same time.                *
* In order to solve the problem, download new Plumbr agent from here:    *
* https://app.plumbr.io/settings/download-center                      *
* and make sure the agent location in the new JVM refers to a different  *
* Plumbr installation in file system                                     *
**************************************************************************

When you encounter the error above, you are trying to launch two JVMs that both load the Plumbr Agent from the same location in the filesystem. Note that each JVM monitored by Plumbr must use its own Plumbr installation.

To overcome the issue, create a separate Plumbr Agent installation for each monitored JVM and load the javaagent from a different location for each JVM.

Data backup & restoration: On Premises

Plumbr Server stores all persistent data in the PLUMBR_SERVER_HOME/data folder on the host server running the Docker containers. We have provided a sample backup script called “backup.sh” that can be used to preserve the most important data. Run it periodically (e.g. via a cron job) as follows:

cd $PLUMBR_SERVER_HOME

./backup.sh /my/backup/destination

Please note that only the final aggregated data presented in the Plumbr Server UI is preserved. Raw probe data sent by Plumbr Agents and all intermediate, partially processed data are not backed up.

To restore a Plumbr Server installation after a disaster or a relocation to another server, do the following:

  • Install Plumbr Server on the new server as described in the Plumbr Server Installation Manual.
  • Run Plumbr Server and wait for 5 minutes until it creates the required structures, both internally and on the file system in the data subfolder.
  • Run the provided script “restore.sh”.

Java Agent API

Introduction

Plumbr Agent API enables programmatic control over:

  1. Application naming (see description of an application for more details)
  2. Service naming (see description of a service for more details)
  3. Identification of users (see description of users for more details)
  4. Transaction boundary definition (see definition of a transaction for more details)

The following guide describes how to install the API dependency and how to use it.

Installation

To start using the Plumbr Agent API, agent-api.jar must be added as a dependency to your project. When running the application without the Plumbr Agent attached, all calls to the library will be silently ignored without any performance impact. When the Plumbr Agent sees the attached Agent API library, it will perform the requested integration calls.

The Agent API is published on Bintray and Maven Central.

To add the dependency, copy and paste the suitable snippet for your build system from the respective Bintray or Maven Central page.

To use the library in the code, the following import must be added to your source file:

import eu.plumbr.api.Plumbr;

Terminology

A span represents some time that the application has spent executing in one thread. Spans may be started and finished. Once a span starts, it becomes associated with the current processing thread, and all root causes detected within that thread are associated with the active span.

A span may contain any number of child spans. Child spans may be associated with threads either in the same JVM or in a different JVM that is also monitored by the Plumbr Java Agent.

A span may have metadata associated with it, which is shown in the single transaction view of an unhealthy transaction that the span belongs to.

A transaction is a tree of spans, consisting of a root span and all of its children. The transaction has some additional properties that describe that tree of spans. These properties include:

  • a transaction ID (a UUID, generated automatically)
  • an application name (taken from the root span)
  • a service name (taken from the root span)
  • an identifier of a user (taken from the root span)

Creating new transactions

When to use: Plumbr Agent fails to automatically discover transactions in a given application.

How: In this case, the transaction should be created manually by first calling eu.plumbr.api.Plumbr.newSpan(), then configuring the service name and application of the span and calling eu.plumbr.api.Span.start() to start it and eu.plumbr.api.Span.finish() to end it.

Example:

Plumbr
  .newSpan()
  .setAppName("My application")
  .setServiceName("My Service")
  .setUserId("user@domain.com")
  .start();

try {
  // do work
} catch (Exception e) {
  Plumbr.getCurrentSpan().fail(e);
} finally {
  Plumbr.getCurrentSpan().finish();
}

Setting transaction attributes

When to use: Plumbr Agent is able to detect transactions, but fails to assign meaningful service name, application or user ID to them.

How: In this case, eu.plumbr.api.Plumbr.getCurrentSpan() should be called to get a reference to the automatically created span and then the properties of that span be set with the corresponding methods in eu.plumbr.api.Span:

setServiceName(String serviceName)
setAppName(String appName)
setUserId(String userId)

getCurrentSpan() is null-safe: it never returns null. If there is no current span in the current thread, an instance of eu.plumbr.api.null.NullSpan is returned instead, which is in turn a null-safe implementation of Span. So even if no agent is attached, you can still call all the setters on the object returned by getCurrentSpan() without any additional null checks. In most cases this is sufficient.

If you really need to check whether there is a current Plumbr span within the current thread (for example, if the code you want to monitor can be called both from within a Plumbr transaction and outside one), call Span.isNull(): it returns true if the returned span is a null span, and false if there is a current span.

Example:

Plumbr.getCurrentSpan().setUserId("my precious user");
Plumbr.getCurrentSpan().setServiceName("my precious service");
Plumbr.getCurrentSpan().setAppName("my precious application");

Create failed transaction with a custom exception

When to use: Plumbr is able to detect transactions, but is unable to automatically detect if they fail or associate the correct exception with the failure.

How: In this case, eu.plumbr.api.Plumbr.getCurrentSpan() should be called to get a reference to the automatically created span and then eu.plumbr.api.Span.fail(Throwable) be called to mark the span as failed and to optionally associate a specific exception as a root cause for the failure.

Example:

try {
  // do something that throws a wrapped exception
} catch (Exception e) {
  // Report the wrapped (underlying) exception as the root cause
  Plumbr.getCurrentSpan().fail(e.getCause());
}

Join remote spans to existing transaction

When to use: A request made from a transaction starts a new transaction in a remote application, and you want to link it as a child span into the first transaction.

How: In this case, before calling the remote service, the caller should create a child span of the current span by calling eu.plumbr.api.Span.createChildSpan() and then serialize it using eu.plumbr.api.SpanSerializer. The serialized span can then be included in the request to the other application (which should also be monitored by the Plumbr Agent). There it is deserialized with eu.plumbr.api.SpanSerializer and must be started and finished manually by calling eu.plumbr.api.Span.start() and eu.plumbr.api.Span.finish() respectively. After the remote call finishes, the calling side must acknowledge this by calling eu.plumbr.api.Span.finishChildSpan(childSpan). See the full examples below.

Listing 1: Managing a child span in the parent process:

Span childSpan = Plumbr.getCurrentSpan().createChildSpan();
String serializedChildSpan = SpanSerializer.toBase64(childSpan);

// Transfer serializedChildSpan to another machine.
// See Listing 2 about what to do there.
try {
  try {
    // perform remote call
  } finally {
    // Always tell the parent span that the child has finished
    Plumbr.getCurrentSpan().finishChildSpan(childSpan);
  }
} catch (Exception e) {
  // If this failed remote call should fail the transaction:
  Plumbr.getCurrentSpan().fail(e);
}

Listing 2: Working with a child span on remote JVM:

String serializedChildSpan = … // obtain a serialized child span
Span span = SpanSerializer.fromBase64(serializedChildSpan);
span.start();
try {
  …
} catch (Exception e) {
  span.fail(e);
} finally {
  span.finish();
}

Browser Agent Configuration

Configuration of the browser agent is done in the data-plumbr attribute of the script tag used to load the agent:

<script 
  src="https://browser.plumbr.io/pa.js" 
  data-plumbr='{
    "accountId" : "abcde..", 
    "appName"   : "Marketing site", 
    ...
  }'>
</script>

Make sure that the quotes used to define the attribute are escaped properly. If the settings are generated dynamically (for example, to include user information), it is recommended to use the JSON and HTML entity encoding methods of your backend framework or language. For example, in an EJS template:

<% var plumbrSettings = {
    accountId: "abcde...",
    serviceName: "User's profile"
} %>

<script src="https://browser.plumbr.io/pa.js" data-plumbr="<%= JSON.stringify(plumbrSettings) %>"></script>
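
If your backend does not escape HTML automatically, the same two steps (JSON encoding, then HTML entity encoding) can be done by hand. The following is a minimal sketch for a Node.js backend; escapeHtmlAttribute and renderPlumbrScriptTag are hypothetical helper names and not part of Plumbr:

```javascript
// Hypothetical helper: escape the characters that are significant
// inside an HTML attribute value.
function escapeHtmlAttribute(text) {
    return String(text)
        .replace(/&/g, '&amp;')   // must run first, before other entities are added
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#39;');
}

// Hypothetical helper: build the agent script tag from a settings object.
function renderPlumbrScriptTag(settings) {
    const json = JSON.stringify(settings);          // step 1: JSON encoding
    return '<script src="https://browser.plumbr.io/pa.js" data-plumbr="' +
        escapeHtmlAttribute(json) +                 // step 2: HTML entity encoding
        '"></script>';
}

const tag = renderPlumbrScriptTag({
    accountId: 'abcde...',
    serviceName: "User's profile"
});
```

Because both quote characters are escaped, the resulting value is safe whether the attribute is delimited with single or double quotes.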

Basic Configuration

The following settings are required for the browser agent to run:

accountId – Your Plumbr account identifier. This is included in the embed code shown to you in the portal. During normal use you do not need to change this.
serverUrl – The server to which the browser agent sends data. If you are using on-demand Plumbr, make sure this value refers to https://bdr.plumbr.io. If the agent should connect to an on-premises Plumbr Server, make sure it is set accordingly.

Transaction Configuration

Transaction Defaults

The following configuration options apply to all of the transactions generated via the browser agent and are equivalent to calling the same Browser Agent API methods.

Application Name – appName

Set the Application Name of all transactions generated on this page.

Example

<script src="https://browser.plumbr.io/pa.js"
  data-plumbr='{
    "accountId": "abcde..",
    "serverUrl": "https://bdr.plumbr.io",
    "appName": "Public Store"
  }'>
</script>

User Identity – userId

Set the User Identity of all transactions generated on this page.

Example

<script src="https://browser.plumbr.io/pa.js"
  data-plumbr='{
    "accountId": "abcde..",
    "serverUrl": "https://bdr.plumbr.io",
    "userId": "John Doe"
  }'>
</script>

Page Loading Transaction

These configuration options apply to the transaction generated when the user loads the page that the browser agent is included on. For example, your loading of this page right now is such a transaction.

Response Code – responseCode

Due to technical limitations, the response code of the page load is inaccessible to the browser agent. To mark the page load span with the relevant HTTP response status, the status must be set in the configuration.

Example

<script src="https://browser.plumbr.io/pa.js"
  data-plumbr='{
    "accountId": "abcde..",
    "serverUrl": "https://bdr.plumbr.io",
    "responseCode": 404
  }'>
</script>

Service Name – serviceName

Set the Service Name of the transaction that is loading this page.

Example

<script src="https://browser.plumbr.io/pa.js"
  data-plumbr='{
    "accountId": "abcde..",
    "serverUrl": "https://bdr.plumbr.io",
    "serviceName": "Product Page"
  }'>
</script>

Compatibility Configuration

Unfortunately, libraries used on the page are occasionally incompatible with the way the browser agent works. For those cases there are options to enable compatibility fixes, which come with a performance hit.


Prototype.js versions < 1.7.1

Prototype.js is a library that extends native JavaScript objects by adding new methods to the prototype of the objects. However, as JavaScript evolves and new methods get added to the specification of the language (and therefore to the latest versions of browsers), the behaviour of the methods added by Prototype.js can differ from the one defined by the specification.

To fix these issues you can either upgrade to Prototype.js version 1.7.1 or newer or, if that is not feasible, configure the agent accordingly:

Version | Notes | Configuration
1.7.0 (inc. RC) | Has a non-ES5-compliant Function.prototype.bind implementation. | "compat":["bind"]
1.6.1 and older | In addition to bind, Prototype.js adds toJSON to all objects; ES5 uses toJSON to customise an object’s value when it is turned into a string. | "compat":["bind","json"]

Example of using Prototype.js 1.6.1 with browser agent:

<script src="https://browser.plumbr.io/pa.js"
    data-plumbr='{"accountId":"abcdef...","serverUrl":"https://bdr.plumbr.io","compat":["bind","json"]}'>
</script>
<script src="https://ajax.googleapis.com/ajax/libs/prototype/1.6.1.0/prototype.js"></script>
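
When the agent configuration is generated on the server, the version-to-setting mapping above can be encoded in a small function. The sketch below is only an illustration: compatForPrototype is a hypothetical helper, and it assumes the version is available as a string whose leading part is dotted numbers (for example "1.6.1", or "1.7_rc3" for a release candidate).

```javascript
// Hypothetical helper: map a Prototype.js version string to the "compat"
// values listed above.
function compatForPrototype(version) {
    // Keep only the leading numeric components, e.g. "1.7_rc3" -> "1.7".
    const match = /^\d+(\.\d+)*/.exec(String(version));
    if (!match) {
        throw new Error('Unrecognised Prototype.js version: ' + version);
    }
    // Missing components default to 0, so "1.7" is treated as 1.7.0.
    const [major = 0, minor = 0, patch = 0] = match[0].split('.').map(Number);

    // True if [major, minor, patch] is at least the version a.b.c.
    const atLeast = (a, b, c) =>
        major !== a ? major > a : minor !== b ? minor > b : patch >= c;

    if (atLeast(1, 7, 1)) return [];        // no compatibility fix needed
    if (atLeast(1, 7, 0)) return ['bind'];  // 1.7.0, including release candidates
    return ['bind', 'json'];                // 1.6.1 and older
}

// Settings object to be serialised into the data-plumbr attribute.
const prototypeCompatSettings = {
    accountId: 'abcdef...',
    serverUrl: 'https://bdr.plumbr.io',
    compat: compatForPrototype('1.6.1')
};
```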

 


In-HTML Event Handlers

Event listeners defined like <a onclick="doSomething(); return false"></a> are tightly coupled to the HTML, so we need to jump through some hoops in order to gather transactions from these interactions. As part of this, we move in-HTML event listeners to the corresponding property on the Element and replace the attribute with an instrumented_with_plumbr placeholder.

If you do not want this behaviour, you can add "inlineEvents": false to your browser agent config. (Note: Any activity happening in inline event listeners will then be linked to the previous transaction.)

Browser Agent API

Introduction

Plumbr Browser Agent API enables control over:

  1. Starting transactions (see this chapter for more details)
  2. Naming of Applications (see description of an application for more details)
  3. Naming of Services (see description of a service for more details)
  4. Naming of Users (see description of a user for more details)

The following guide describes how to install the Browser API and how to use it.

Installation

Plumbr Browser API is installed along with the Plumbr Browser Agent, so no additional installation is needed. See the Browser Agent Installation guide for more details.

Setting transaction attributes

Plumbr Browser API allows setting the following attributes of a transaction:

  • Application
  • Service
  • User

These attributes can be set in two alternative ways: programmatically or via configuration parameters of the browser agent script. Which way to use depends on the application type. Classic web applications, which generate HTML on the server side and have all the knowledge there, may find it easier to generate the required configuration values on the server side, eliminating the need for additional JavaScript code. Single-page web applications may find it more suitable to set up these attributes via direct JavaScript calls.

For information on configuration parameters, please see the section on Browser Agent Configuration.


Plumbr Browser API is exposed on window.PLUMBR, so it can be accessed globally as PLUMBR. It is recommended to wrap API calls in try…catch blocks to avoid situations where user-side blocking of the agent (such as privacy targeted browser addons) would crash your application.

document.getElementById('add-to-cart').addEventListener('click', function() {
    try {
        PLUMBR.setServiceName('add product to cart');
    } catch(err) {}

    // ajax call to add product to cart...
});
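
If you make many API calls, repeating the try…catch block can be avoided with a small wrapper. This is a sketch only: callPlumbr is a hypothetical helper name, not part of the Plumbr Browser API. It looks the agent up on the global object, which in a browser is window.

```javascript
// Hypothetical helper: call a PLUMBR API method if the agent is present,
// and swallow any error so the application keeps working without it.
// Returns true if the call was made successfully, false otherwise.
function callPlumbr(method, ...args) {
    try {
        const api = globalThis.PLUMBR;   // same object as window.PLUMBR in a browser
        if (api && typeof api[method] === 'function') {
            api[method](...args);
            return true;
        }
    } catch (err) {
        // The agent is blocked or failed: ignore and carry on.
    }
    return false;
}

// Usage: equivalent to the guarded PLUMBR.setServiceName(...) call above.
callPlumbr('setServiceName', 'add product to cart');
```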

The available configuration options and API calls are described below.

Application

In Configuration | API Call
{ "appName": "Marketing Site" } | PLUMBR.setAppName('Marketing Site')

Set the application name for the page. This is persistent across all transactions made on the page (such as soft navigations, ajax interactions), so setting it in the configuration means it doesn’t need to be set again through the API.

Service

In Configuration | API Call
{ "serviceName": "Product details" } | PLUMBR.setServiceName('Product details')

Set the service for the current transaction. The service name set in the configuration always applies to the transaction that represents loading the page, while an API call applies to the transaction that is active at the time of the call. For example, in a single-page application, when the API method is called after the user clicks on a link, it defines the service of the transaction started by that click.
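
As an illustration of the API-call case in a single-page application, the sketch below names the transaction started by a navigation click. The route table, serviceNameForPath and onLinkClick are hypothetical names made up for this example; only PLUMBR.setServiceName comes from the API described here.

```javascript
// Hypothetical route table: maps URL paths of the application to service names.
const SERVICE_NAMES = [
    { pattern: /^\/products\/[^/]+$/, name: 'Product details' },
    { pattern: /^\/cart$/, name: 'Shopping cart' }
];

// Return the service name for a path, or a fallback for unknown paths.
function serviceNameForPath(path) {
    const route = SERVICE_NAMES.find(function (r) {
        return r.pattern.test(path);
    });
    return route ? route.name : 'Unknown page';
}

// Call this from the click handler of a navigation link, so that the name
// is applied to the transaction started by that click.
function onLinkClick(path) {
    try {
        PLUMBR.setServiceName(serviceNameForPath(path));
    } catch (err) {}

    // ... fetch data and render the view for `path` ...
}
```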

User

In Configuration | API Call
{ "userId": "person@example.com" } | PLUMBR.setUserId('person@example.com')

Set the user of the page. This is persistent across all transactions made on the page (such as soft navigations, ajax interactions), so setting it in the configuration means it doesn’t need to be set again through the API.

Transaction Management

In most cases the browser agent is able to automatically detect all of the user interactions. However, there might be cases where you want to start transactions manually.

PLUMBR.startTransaction(serviceName)

Starts a new transaction under serviceName service.

Example usage: Starting a transaction when document receives an external-force event

document.addEventListener('external-force', function (event) {
    try {
        PLUMBR.startTransaction('External force');
    } catch(err) {}

    messageServerAboutExternalForce();
});

Note: Include the API call as close to the source of the event as possible. That way, when the browser agent starts detecting the interaction natively, it will try to avoid creating multiple transactions.

document.getElementById('sign-up').addEventListener('click', function (event) {
    try {
        PLUMBR.startTransaction('User signs up');
        // Because browser agent has already detected the click it will be translated to
        // PLUMBR.setServiceName('User signs up')
    } catch(err) {}

    registerUser({ /* ... */ });
});