About incident management

Assign an incident a priority

Each incident must be assigned a priority. Start by assessing the damage the incident causes to the company. Consider who will be affected and the potential financial, security, and compliance impacts. This will show you how serious the incident is and how quickly it needs to be resolved.

Define severity and priority levels in advance, before any incident occurs, so that incident handlers can assign a priority quickly.

If you’re not sure what priority to assign to an incident, round up and assign the higher of the priorities you are considering. It is better to overestimate a problem than to underestimate it.
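The priority rules above can be sketched in a few lines of Python. This is only an illustration under assumptions: the four level names, the idea of rating each impact area (affected users, financial, security, compliance) on the same scale, and the function names `assign_priority` and `round_up` are all hypothetical and not part of any standard.

```python
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical levels defined in advance; a higher value is more urgent."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def assign_priority(users: Priority, financial: Priority,
                    security: Priority, compliance: Priority) -> Priority:
    """Rate each impact area separately and let the worst one set the priority."""
    return max(users, financial, security, compliance)


def round_up(*candidates: Priority) -> Priority:
    """When unsure between several priorities, choose the highest of them."""
    return max(candidates)
```

For example, an incident that affects few users but has a critical security impact would be rated `Priority.CRITICAL` under this scheme.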

After prioritizing, work through all open incidents in order of priority. Most organizations have SLAs for each priority level, so customers know how quickly they will get a response and how quickly their problem will be resolved.
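A work queue ordered by priority, with a deadline derived from the SLA, might look like the sketch below. The SLA targets, the numeric priority scale (4 is the most urgent), and the names `Incident`, `SLA_TARGETS`, and `work_queue` are assumptions chosen for the example; real targets come from your own agreements.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical SLA resolution targets per priority (4 = most urgent).
SLA_TARGETS = {
    4: timedelta(hours=1),
    3: timedelta(hours=4),
    2: timedelta(hours=24),
    1: timedelta(hours=72),
}


@dataclass
class Incident:
    title: str
    priority: int
    opened_at: datetime

    @property
    def due_at(self) -> datetime:
        """Deadline implied by the SLA target for this priority."""
        return self.opened_at + SLA_TARGETS[self.priority]


def work_queue(incidents):
    """Order incidents by priority first, then by the earliest SLA deadline."""
    return sorted(incidents, key=lambda i: (-i.priority, i.due_at))
```

Sorting on the deadline as a second key means that, within one priority level, the incident closest to breaching its SLA is handled first.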

Respond

Incident response is a fairly broad term, so we’ll go over the specific, most likely steps to resolve an incident once it’s been detected, categorized, and prioritized.

Initial diagnosis

This step is comparable to triage in a hospital, where incoming patients are sorted by the severity of their condition. The help desk agent forms a hypothesis about what might have happened in order to decide whether the problem can be fixed right away or whether it is necessary to follow the established procedure and gather the resources needed to resolve the incident. Knowledge bases and diagnostic guides are very helpful at this stage.

If the first responding agent can resolve the incident based on the initial diagnosis, existing knowledge and tools, the incident is successfully resolved. Otherwise, it needs to be escalated.
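The resolve-or-escalate decision can be illustrated with a toy knowledge base lookup. Everything here is assumed for the sake of the example: the `KNOWLEDGE_BASE` entries, the simple substring matching, and the function names. A real help desk tool would use far richer search and diagnostics.

```python
from typing import Optional

# Hypothetical knowledge base: known symptom -> documented fix.
KNOWLEDGE_BASE = {
    "password expired": "Reset the password through the self-service portal.",
    "disk full": "Clear temporary files and rotate the logs.",
}


def initial_diagnosis(description: str) -> Optional[str]:
    """Return a documented fix if the description matches a known symptom."""
    text = description.lower()
    for symptom, fix in KNOWLEDGE_BASE.items():
        if symptom in text:
            return fix
    return None


def triage(description: str) -> str:
    """Resolve at the first line when a fix is known; otherwise escalate."""
    return "resolved" if initial_diagnosis(description) is not None else "escalate"
```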

Incident escalation

The service desk team that maintains contact with the customer must be able to resolve the most common incidents without escalation. However, if the problem is serious and cannot be resolved immediately, it is necessary to collect and record information about the incident so that qualified support specialists can resolve it quickly.
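One way to make sure the necessary information is recorded before a handover is to check the escalation record for gaps. The sketch below is an assumption about what such a record might contain; the field names and the `missing_fields` helper are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EscalationRecord:
    """Hypothetical set of details handed to the next support level."""
    incident_id: str
    reporter: str
    summary: str
    steps_tried: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """List the fields that still need to be filled in before escalating."""
        missing = [
            name for name in ("incident_id", "reporter", "summary")
            if not getattr(self, name).strip()
        ]
        if not self.steps_tried:
            missing.append("steps_tried")
        return missing
```

An empty result from `missing_fields()` means the specialist receiving the incident does not have to go back to the customer for basic details.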

Analysis and diagnostics

ITIL treats this as a separate step. In reality, this process occurs continuously throughout the life cycle of an incident.

The first person to respond to an incident is essentially analyzing it, gathering relevant information, and in some cases successfully diagnosing and even resolving the issue without escalating it. In this case, you can proceed to the next few steps: resolve, restore, and close the incident.

Otherwise, analysis and diagnostics continue at every stage, whether the incident is being escalated or external resources are being brought in to advise and help resolve it.

Resolution and recovery

Ultimately, you will diagnose the incident and do what is necessary to resolve it, ideally within the terms of the agreed service level agreements (SLAs). The key measure during recovery is the time needed to fully restore all functions, since some fixes (for example, bug fixes) may still need to be installed and tested after the service is restored.

Incident closure

If the incident was escalated, it is returned to the help desk for closure. Only support staff can close incidents, which helps maintain a high quality of service and consistency in problem solving. The incident owner should contact the person who reported the incident to confirm that the solution is satisfactory and that the incident can indeed be closed.
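The two closure conditions described above (the right role and the reporter's confirmation) can be expressed as a simple guard. The role name `"support"`, the status strings, and the `close_incident` function are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    status: str = "resolved"
    reporter_confirmed: bool = False


def close_incident(incident: Incident, closed_by_role: str) -> Incident:
    """Close an incident only when both closure conditions are met."""
    if closed_by_role != "support":
        raise PermissionError("Only support staff can close incidents.")
    if not incident.reporter_confirmed:
        raise ValueError("The reporter has not confirmed the solution yet.")
    incident.status = "closed"
    return incident
```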

Summary

The incident management process can seem like a no-brainer, especially if you work for a small company. However, the lifecycle of incidents is the same regardless of team structure, and escalation is often required. Don’t skip the steps of the process!

Incidents happen. A robust incident management process will help mitigate incident impacts and quickly resume services.