DASH NYC, June 9-10 | AI + Observability


From Alerts to Autonomy: Scaling Incident Management at PUBG with Automation and AI

About this Session

Modern game platforms operate across regions, cloud providers, and highly dynamic workloads. At Krafton (maker of PUBG: Battlegrounds), fast-moving teams building player-facing and backend systems faced a key challenge: speed without autonomy creates friction, but autonomy without guardrails creates risk.

Junghun Kim, Lead of the DevOps Team at Krafton, will share how his team transformed incident management from a centralized SRE function into a developer-centric platform powered by Datadog. By combining unified observability, high-signal monitors, and integrated workflows across Datadog Incident Management, On-Call, and Slack, teams can now detect, declare, and respond to incidents with greater ownership.

He will explore how automation and AI reduce cognitive load during incidents: automatically creating context-rich Slack war rooms, enforcing safeguards such as scale-in prevention, correlating signals to accelerate root cause analysis, and automating postmortem reviews.

Through a real-world incident walkthrough, attendees will gain a practical blueprint for building an incident response model that increases ownership, reduces alert fatigue, and enables faster detection, safer mitigation, and automated postmortems.
