Fixing Claude with Claude: Anthropic reports on AI site reliability engineering
19.03.2026

Alex Palcuie, an AI reliability engineer at Anthropic, spoke at QCon London about the current limits of AI, specifically Claude, in site reliability engineering (SRE). Claude excels at rapidly processing logs and flagging potential issues, but it struggles to separate correlation from causation: it often attributes an incident to a capacity problem because request volume has increased, when the actual cause might be a broken KV cache. Palcuie emphasized that human SREs remain crucial for incident response because they can discern root causes, understand a system's history, and carry the "scar tissue" of past incidents. Although AI is increasingly useful in the observation phase of incident response, he cautioned against over-reliance, citing the Jevons paradox and the risk of skill atrophy among human engineers.
















