Troubleshooting Outages in Kubernetes Applications

Moving your containerised applications to Kubernetes brings many advantages, from auto-healing and easy rollbacks to auto-scaling of workloads and simplified deployments. That flexibility comes with added operational complexity, so a good observability strategy becomes even more critical. Errors can now originate in many different places: within a single service, at the interfaces between services, or in the network itself.

In this workshop, we will deploy an application on Kubernetes and use Datadog to troubleshoot real-world issues: finding their root causes, fixing them, and returning the application to a stable state.

