Incident Management Principles

PROCESSES

There are three main processes related to incident management and problem management:

  • incident management process
  • problem control process
  • error control process

These basic processes are present in almost all advanced organizations, although they may have other names.

Incident Management Process
This process is focused on restoring the interrupted service as soon as possible. Table 1 shows the main parameters of this process, and Figure 1 shows a diagram of its operation.

Problem Control Process

The problem control process focuses on identifying causes. Who takes part in the cause analysis, and how long the analysis takes, depend on the problem itself. The following guidelines generally hold:

  • If problems arise often enough, assign a permanent team. Otherwise, form a team when a problem arises, in much the same way as a team is formed for a project.
  • The team should almost always combine interdisciplinary experience and expertise; the exact mix depends on the nature of the problem.
  • An estimate of the time needed to determine the cause (in effect, a project plan) should be given when the problem occurs. The team's progress should be measured against this estimate.

Once resources have been allocated and prioritized, the actual mechanics of determining the cause can take many forms. Well-established methods for finding causes include Kepner-Tregoe analysis, Ishikawa diagrams, and Pareto diagrams.
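
As an illustration of one of the methods named above, a Pareto analysis can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed implementation: the function name `pareto`, the 80% threshold, and the sample incident causes are all assumptions made for the example.

```python
from collections import Counter


def pareto(causes, threshold=0.8):
    """Return the most frequent causes that together account for
    at least `threshold` of all recorded incidents."""
    counts = Counter(causes)
    total = sum(counts.values())
    vital, cumulative = [], 0
    # most_common() yields causes from most to least frequent
    for cause, n in counts.most_common():
        if cumulative / total >= threshold:
            break
        vital.append((cause, n))
        cumulative += n
    return vital


# Hypothetical incident log: one recorded cause per closed incident
incident_causes = (
    ["disk full"] * 5
    + ["expired certificate"] * 3
    + ["bad deploy"]
    + ["dns failure"]
)

for cause, n in pareto(incident_causes):
    print(f"{cause}: {n}")
```

With this sample data, two of the four causes account for 80% of the incidents, which suggests where a problem control team might look first.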

Error Control Process

Error control documents the ways to work around known problems and communicates these methods to support personnel. It also includes maintaining communication with other technical and development organizations, which helps to identify errors. In addition, error control prompts developers to implement fixes for known errors.

INTERACTIONS

Typically, interactions in this process take one of two forms. The first is a status message about an incident or problem, sent to groups or individuals according to approved rules and templates. The second is a request message that requires the recipient to take certain actions. In addition to the request itself, a request message usually contains a reference to the incident, such as its number, the user's phone number, or another identifier.

Many companies rely on the automated messaging capabilities of their incident management software.

Such messages are sent according to strict rules in order to support escalation. Status messages from software systems are usually generated from the data entered in the fields of the incident card. As a result, such messages are often incomplete and cryptic: the fields used to build them may be updated irregularly, or may be filled in automatically by monitoring software using error-message jargon.

To correct these deficiencies, automatic communication capabilities are supplemented, especially in the case of high-level incidents, with manually composed messages.
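
The way an automated status message is built from incident card fields, and why an incomplete card calls for a manual message, can be sketched as follows. This is an illustrative sketch only: the field names, the message template, and the function `build_status_message` are assumptions and do not correspond to any particular incident management tool.

```python
# Hypothetical set of card fields the status template depends on
REQUIRED_FIELDS = ("incident_id", "severity", "service", "status", "summary")

TEMPLATE = "[{severity}] Incident {incident_id} ({service}): {status}. {summary}"


def build_status_message(card):
    """Build a status message from an incident card (a dict).

    Returns (message, missing_fields). If any required field is empty,
    no message is produced, so that a person can compose one manually.
    """
    missing = [f for f in REQUIRED_FIELDS if not str(card.get(f) or "").strip()]
    if missing:
        return None, missing
    return TEMPLATE.format(**card), []


complete_card = {
    "incident_id": "INC-1",
    "severity": "P1",
    "service": "email",
    "status": "in progress",
    "summary": "Mail delivery is delayed.",
}
incomplete_card = dict(complete_card, summary="")

print(build_status_message(complete_card))
print(build_status_message(incomplete_card))
```

Refusing to send a message from an incomplete card is one possible design choice; a tool could equally send the partial message and flag it for review.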

ESCALATION

An escalation mechanism helps resolve an incident in a timely manner by increasing the staff capacity, effort, and priority devoted to it. The best organizations have well-defined escalation paths, with timelines and responsibilities clearly defined at every step. They use incident management tools to transfer responsibility automatically to successively higher levels of support according to timescales and complexity.

Escalation time frames and responsibilities vary greatly by organization, industry, and problem complexity. Leading organizations negotiate with end users to determine appropriate time frames and escalation responsibilities. The results of these negotiations are implemented as service level agreements, automated tools, lists, and templates.

Functional escalation

Functional escalation is the transfer of an incident to a higher level of support when knowledge or experience is insufficient or an agreed time interval has expired. Advanced organizations define a matrix of severity levels based on business impact, the time frame for resolving the incident, and the time intervals after which an incident should be escalated to a more advanced group.
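
A severity matrix of this kind can be sketched as a simple lookup table. The severity names, the time intervals, and the function `support_level` below are hypothetical values chosen for illustration; real values come from the agreements each organization negotiates. The sketch covers only the time-based trigger, not escalation due to insufficient knowledge or experience.

```python
from datetime import timedelta

# Hypothetical matrix: for each severity, the elapsed time since the
# incident was opened at which it moves to level 2 and then to level 3.
ESCALATION_MATRIX = {
    "critical": [timedelta(minutes=15), timedelta(hours=1)],
    "major": [timedelta(hours=1), timedelta(hours=4)],
    "minor": [timedelta(hours=8), timedelta(days=2)],
}


def support_level(severity, elapsed):
    """Return the support level (1, 2, or 3) that should own an
    unresolved incident of the given severity after `elapsed` time."""
    level = 1
    for limit in ESCALATION_MATRIX[severity]:
        if elapsed < limit:
            break
        level += 1
    return level


print(support_level("critical", timedelta(minutes=30)))
print(support_level("minor", timedelta(hours=1)))
```

In this example a critical incident that is still open after 30 minutes belongs to the second level of support, while a minor incident open for one hour stays with the first level.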

In most organizations, support groups of the first and second levels are focused on the operation of the existing infrastructure, while the third level of support is usually provided by groups that are responsible for infrastructure development planning and design. Therefore, careful planning of how responsibility will be functionally transferred to the third level is critical.