From the course: DevOps Foundations: Monitoring and Observability

Incident management

- [Ibukun] Incident response is often where observability carries the most weight. Understanding both the knowns and the unknowns of your system becomes even more critical when things are going wrong. Defining an incident response process simply means being prepared to make the best use of the signals and data you have, so that you reduce the impact of downtime and outages. To design an incident response process, you should take the following into consideration.

The incident response team. There are several roles to be played in incident management. These include the incident manager, who often leads the incident; liaisons, who handle communications; and subject matter experts, who understand the domain and can resolve the incident.

You should also consider the incident response process. These are the established procedures your team follows when resolving an incident. They include things like when and…
