The key concepts and language of incident and problem management are shown in the figure. Incidents, problems and errors are related through a lifecycle: incidents are often the indicators of problems; investigating a problem identifies the root cause, the underlying error; errors are then systematically eliminated.
Incident Management
Incident management (IM) refers to activities undertaken to restore normal service operation as quickly as possible while minimizing adverse impact on business operations. IM is reactive and short-term: its focus is on restoring service, not on finding the underlying cause. IM activities include:
- Incident detection and recording
- Classification and initial support
- Investigation and diagnosis
- Resolution and recovery
- Closure
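The five IM activities above can be read as the stages an incident record passes through. The following is a minimal sketch of that reading, not a prescribed implementation: the class names `IncidentState` and `Incident`, the `advance` method, and the strictly linear ordering are all assumptions made for illustration.

```python
from enum import Enum


class IncidentState(Enum):
    """One member per IM activity listed above (names are hypothetical)."""
    RECORDED = "incident detection and recording"
    CLASSIFIED = "classification and initial support"
    DIAGNOSED = "investigation and diagnosis"
    RESOLVED = "resolution and recovery"
    CLOSED = "closure"


# Assumption: activities run strictly in the listed order.
_ORDER = list(IncidentState)


class Incident:
    def __init__(self, summary):
        self.summary = summary
        self.state = IncidentState.RECORDED
        self.history = [self.state]  # audit trail of stages reached

    def advance(self):
        """Move to the next activity; a closed incident cannot advance."""
        if self.state is IncidentState.CLOSED:
            raise ValueError("incident is already closed")
        self.state = _ORDER[_ORDER.index(self.state) + 1]
        self.history.append(self.state)
        return self.state
```

In practice an incident may loop back, for example from diagnosis to reclassification; the linear model here only mirrors the order of the list.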
Problem Management
Problem management (PM) refers to activities undertaken to minimize the adverse business impact of problems caused by errors within the IT infrastructure, and to prevent recurrence of incidents related to these errors. PM gets to the root cause of problems, identifies workarounds or permanent fixes, and eliminates errors. PM activities include:
- Problem control
- Error control
- Proactive problem prevention
- Major problem reviews
Problem Control
The purpose of problem control is to find the root cause of a problem by executing the following steps:
- Identifying and recording the problem
- Classifying the problem and prioritizing response activities
- Investigating and diagnosing root causes
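Because incidents are often the indicators of problems, the identification step can be illustrated by looking for components with repeated incidents. This is only a sketch under stated assumptions: the function name `identify_problems`, the `(component, description)` tuple layout, and the threshold of three incidents are hypothetical and not taken from the text.

```python
from collections import Counter


def identify_problems(incidents, threshold=3):
    """Return components whose incident count suggests an underlying problem.

    `incidents` is an iterable of (component, description) tuples.
    `threshold` is an assumed cut-off, not a value from any standard.
    """
    counts = Counter(component for component, _ in incidents)
    return sorted(c for c, n in counts.items() if n >= threshold)
```

A component returned here is only a candidate for a problem record; classification, prioritization and root-cause diagnosis still follow as separate steps.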
Error Control
Error control activities ensure that problems are fixed. They consist of the following steps:
- Identifying and recording known errors
- Assessing and prioritizing permanent fixes
- Recording temporary workarounds in service support tools
- Closure of known errors by implementing permanent fixes
- Monitoring known errors to determine if a change in priority is warranted
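The error control steps above map naturally onto a known-error record. The sketch below is one possible shape for such a record, assuming a numeric priority where 1 is highest and an escalation threshold of five linked incidents; the class `KnownError`, its fields, and both of those conventions are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class KnownError:
    description: str
    priority: int              # assumption: 1 = highest priority
    workaround: str = ""       # temporary workaround, if any
    permanent_fix: str = ""    # filled in when the error is closed
    linked_incidents: int = 0  # incidents attributed to this error
    closed: bool = False

    def record_workaround(self, text):
        """Record a temporary workaround for support staff."""
        self.workaround = text

    def close(self, fix):
        """Close the known error once a permanent fix is implemented."""
        if not fix:
            raise ValueError("a permanent fix is required to close")
        self.permanent_fix = fix
        self.closed = True

    def needs_priority_review(self, escalate_at=5):
        """Flag open errors whose incident count warrants a priority review."""
        return not self.closed and self.linked_incidents >= escalate_at
```

Keeping the workaround and the permanent fix as separate fields reflects the distinction the list draws between recording a workaround and closing the error.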
Problem Review
The purpose of a problem review is to improve IM and PM processes. This is accomplished by performing a post-mortem examination of the quality of the IM and PM response activities associated with a major incident or problem.