RAG-Powered SRE Agent: Building Total Situational Awareness for a Gaming Platform
We built an autonomous SRE agent that connects to Datadog, Kubernetes, AWS, and Cloudflare simultaneously, then gave it RAG access to every runbook, post-mortem, and line of source code the company has ever written. MTTR dropped from 45 minutes to 8. Here's the architecture.