Building Automated Remediations to Improve Availability and Reduce Toil (Sponsored by Shoreline)
As infrastructure grows, incidents increase in frequency and severity. Manual remediation processes don’t scale past a certain point, causing a rapid increase in downtime and worse, operator burnout. The only remedy is to introduce automation to help diagnose and repair incidents without manual intervention. However, automating repairs needs to be simple, as there are many ways a fleet can fail.
In this workshop, you will learn how to quickly diagnose and repair incidents by creating automated remediations, all from the Datadog UI. Leveraging parallelized command execution and remediations provided by the Shoreline integration, you will interactively debug multiple hosts to resolve a triggered monitor and build automation to repair whenever the incident reoccurs.