Gain Control Over Production Incidents

Got a production incident in the middle of another production incident, in the middle of yet another? Engineers burning out from a stream of alerts every night? All while customers are calling to complain about errors in your web application? Spending more time in the war room than with your family?

Even if it’s only half as bad, you need a strategy to put an end to the constant firefighting. Read on for a battle-tested one.


The Strategy

  1. Capture the baseline
  2. Define what an “incident” is
  3. Get alerted to an incident
  4. Mitigate the incident’s impact
  5. Perform post-mortem analysis
  6. Follow up on action items
  7. Goto 2


The Tactics

See how Plumbr can help you efficiently execute your incident management strategy.

Track progress against the baseline

Plumbr clearly exposes the trend of user-visible errors over time, broken down by application and service. You will get monthly reports right in your inbox.

Get a clear signal that users are affected by the incident

Stop waking your engineers up at night with false alarms. Plumbr tells you exactly how many users are facing errors at any given time in any given service.
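
To make "a clear signal" concrete, here is a minimal sketch of an incident rule based on affected users rather than raw error counts. It is not Plumbr's API: the `ServiceWindow` shape, the `is_incident` function, and the default thresholds are all hypothetical, chosen only to illustrate steps 1 to 3 of the strategy (baseline, incident definition, alert).

```python
from dataclasses import dataclass


@dataclass
class ServiceWindow:
    """User counts for one service over one time window (hypothetical shape)."""
    service: str
    active_users: int
    users_with_errors: int


def is_incident(window: ServiceWindow, baseline_rate: float,
                tolerance: float = 3.0, min_affected_users: int = 5) -> bool:
    """Return True when the share of affected users clearly exceeds the baseline.

    baseline_rate is the normal fraction of users who see errors (step 1).
    tolerance and min_affected_users are assumed defaults, not recommendations.
    """
    if window.active_users == 0:
        return False  # nobody is using the service, so nobody is affected
    if window.users_with_errors < min_affected_users:
        return False  # too few affected users to page anyone at night
    rate = window.users_with_errors / window.active_users
    return rate > baseline_rate * tolerance
```

With a baseline of 1% of users seeing errors, a window with 12 affected users out of 1,000 stays quiet, while 80 affected users out of 1,000 counts as an incident. Tuning the two thresholds is how a team would revisit its incident definition on each pass through the loop.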

Save time on root cause analysis

Plumbr traces each failed user interaction to a detailed root cause right in the source code. Your MTTR will go down, and root cause analysis will be a breeze.

