A recent report by Everbridge, a firm specializing in unified critical communications, found that the average response time to an IT incident was 27 minutes, with some response times stretching to almost two and a half hours.
Given the report’s estimated cost of unplanned downtime at more than $8,600 per minute, delays in response can be expensive, as well as potentially damaging to user and customer confidence, company reputation and compliance. The cost and potential risk rise if the incident is related to mission-critical or customer-facing applications and services.
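To put those figures in perspective, here is a minimal sketch that multiplies the report's per-minute estimate by the two response times quoted above. It assumes, purely for illustration, that downtime lasts exactly as long as the response delay and that cost accrues linearly. The function name is hypothetical, and the report's figures are themselves approximate ("more than" $8,600, "almost" two and a half hours).

```python
# Per-minute cost of unplanned downtime, rounded to the report's stated floor (USD).
COST_PER_MINUTE = 8_600


def downtime_cost(minutes, cost_per_minute=COST_PER_MINUTE):
    """Estimate outage cost, assuming cost accrues linearly per minute (an assumption)."""
    return minutes * cost_per_minute


average_delay = downtime_cost(27)    # average response time in the report
longest_delay = downtime_cost(150)   # roughly two and a half hours

print(f"Average response delay: ${average_delay:,.0f}")   # $232,200
print(f"Longest response delay: ${longest_delay:,.0f}")   # $1,290,000
```

Even under these simplified assumptions, the gap between an average and a slow response is on the order of a million dollars for a single incident.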
Rapid Response Essential
Clearly, rapid response to incidents is essential, and automation can help in two ways. First, it can help IT teams detect incidents quickly, sometimes by spotting deteriorating performance before users are affected. Second, by streamlining communications and notifications, automation can reduce the time needed to get the right people in place to respond.
A documented incident management process is an essential first step. The process should cover the full cycle from identification to successful closure. The IT Infrastructure Library (ITIL) recommends four key steps: becoming aware of and identifying the incident, logging it, categorizing it, and prioritizing it.
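The four steps can be sketched as a small data model. This is an illustrative outline only, not an ITIL-defined schema: the category names, the priority levels, and the rule that customer-facing incidents are bumped up one level are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import IntEnum


class Priority(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical business rules: default priority per incident category.
CATEGORY_PRIORITY = {
    "desktop": Priority.LOW,
    "network": Priority.HIGH,
    "database": Priority.CRITICAL,
}


@dataclass
class Incident:
    summary: str
    source: str  # who or what identified it, e.g. "monitoring" or "helpdesk"
    logged_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    category: str = "uncategorized"
    priority: Priority = Priority.MEDIUM


def log_incident(summary, source):
    """Steps 1 and 2: record an identified incident with a timestamp."""
    return Incident(summary=summary, source=source)


def categorize(incident, category):
    """Step 3: assign a category."""
    incident.category = category
    return incident


def prioritize(incident, customer_facing=False):
    """Step 4: look up the category default, raising it one level if customer-facing."""
    level = CATEGORY_PRIORITY.get(incident.category, Priority.MEDIUM)
    if customer_facing and level < Priority.CRITICAL:
        level = Priority(level + 1)
    incident.priority = level
    return incident


incident = log_incident("Checkout page timing out", source="monitoring")
incident = prioritize(categorize(incident, "network"), customer_facing=True)
print(incident.category, incident.priority.name)
```

Keeping the rules in a lookup table rather than in code branches makes it easier to review them alongside the documented process.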
Monitoring Identifies Incidents Quickly
Automated monitoring tools play an important role here. They monitor infrastructure against pre-defined standards and can be programmed to provide automated alerts in the event of an incident or deteriorating performance. Automated tools are a valuable back-up to the incident reports that come from users or a helpdesk, and they can enable teams to respond proactively before the incident impacts users or customers.
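As a rough illustration of monitoring against pre-defined standards, the sketch below compares metric readings with warning and critical thresholds. The metric names and limits are invented for the example, and the code does not reflect any particular monitoring product.

```python
# Hypothetical thresholds; real values would come from the organization's standards.
THRESHOLDS = {
    "cpu_percent": {"warning": 80.0, "critical": 95.0},
    "response_ms": {"warning": 500.0, "critical": 2000.0},
}


def evaluate(metric, value):
    """Return 'ok', 'warning' or 'critical' for a single reading."""
    limits = THRESHOLDS[metric]
    if value >= limits["critical"]:
        return "critical"
    if value >= limits["warning"]:
        return "warning"
    return "ok"


def check_all(readings):
    """Return (metric, level, value) for every reading that breaches a threshold."""
    alerts = []
    for metric, value in readings.items():
        if metric not in THRESHOLDS:
            continue  # no standard defined for this metric, so nothing to compare
        level = evaluate(metric, value)
        if level != "ok":
            alerts.append((metric, level, value))
    return alerts


print(check_all({"cpu_percent": 85.0, "response_ms": 120.0}))
```

The warning level is what allows a proactive response: it flags deteriorating performance before the reading reaches the point that would count as an incident.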
Documented processes and business rules for categorizing and prioritizing incidents ensure that IT teams understand the type of response that is required. While automated monitoring can provide essential information to IT teams, the notification process can prove to be a weak link.
Automated Notification Eliminates Weak Links
The Everbridge report found that ticketing systems and manual calls remain the predominant methods of alerting response teams. While escalation systems are in place to ensure that response is monitored and managed, these can cause further delay and increase downtime.
To overcome these problems, Gartner and other commentators recommend the use of automated alerting tools integrated with other monitoring and management tools. Automated notification can reduce or eliminate any delay in reaching the most suitable team member and speed up mean time to resolution.
An automated notification system contains details of IT staff with their skills and experience in different incident categories. It incorporates up-to-date information on staff status and availability, together with a set of contact details for each person, such as phone, email, mobile or text.
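A staff directory of this kind might be modeled as follows. This is a minimal sketch, not the schema of any real notification product: staff status is reduced to a single availability flag, experience is an integer per incident category, and all names and contact details are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Technician:
    name: str
    skills: dict      # incident category -> experience level (higher is better)
    available: bool   # simplified stand-in for staff status
    contacts: dict = field(default_factory=dict)  # channel -> address or number


def best_responder(staff, category):
    """Pick the available technician with the most experience in the category."""
    candidates = [t for t in staff if t.available and category in t.skills]
    if not candidates:
        return None
    return max(candidates, key=lambda t: t.skills[category])


staff = [
    Technician("Ana", {"network": 5, "database": 2}, available=False,
               contacts={"text": "555-0101"}),
    Technician("Ben", {"network": 3}, available=True,
               contacts={"email": "ben@example.com", "text": "555-0102"}),
    Technician("Caz", {"database": 4}, available=True,
               contacts={"phone": "555-0103"}),
]

responder = best_responder(staff, "network")
print(responder.name if responder else "no one available")  # Ben
```

Ana has more network experience, but she is unavailable, so the lookup falls through to Ben. That is the behavior that keeps status information from being decorative.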
Right People in the Right Place
By linking the notification database to the monitoring and alert system, notifications can be issued automatically to the most suitable staff. The notifications provide incident details and priority, together with a reply mechanism that lets the incident management team know the technician is on the case. Some systems also incorporate features that allow the technician to provide status updates.
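The notification itself, with its reply mechanism and status updates, can be sketched as a small record. The field names and the idea of listing unacknowledged notifications for follow-up are assumptions made for illustration; actual systems will differ.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    incident_id: str
    priority: str
    details: str
    recipient: str
    acknowledged: bool = False
    updates: list = field(default_factory=list)

    def acknowledge(self):
        """Reply mechanism: the technician confirms they are on the case."""
        self.acknowledged = True

    def add_update(self, text):
        """Status update posted by the technician while working the incident."""
        self.updates.append(text)


def unacknowledged(notifications):
    """Notifications nobody has replied to yet, i.e. candidates for follow-up."""
    return [n for n in notifications if not n.acknowledged]


# Hypothetical incident identifiers and recipients.
sent = [
    Notification("INC-1", "critical", "Database cluster unreachable", "Caz"),
    Notification("INC-2", "high", "Packet loss on core switch", "Ben"),
]

sent[0].acknowledge()
sent[0].add_update("Failing over to standby node")

print([n.incident_id for n in unacknowledged(sent)])  # ['INC-2']
```

Tracking acknowledgement on the notification gives the incident management team a direct view of which incidents have an owner and which still need one.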
Automated monitoring and notification tools are essential. In the Everbridge survey, 90 percent of respondents reported at least one major incident per year. Resolving those issues quickly and efficiently is a priority.