Microsoft asks AWS to help save GitHub — 1B → 14B code changes in a year
TL;DR
Microsoft is asking AWS for help keeping GitHub up. Code-change events go from 1B in 2025 to a projected 14B in 2026 — 14× — driven by AI coding agents. Eight outages already this year.
Microsoft is asking competitor AWS for help mitigating GitHub's outage crisis. Background: AI coding agents are exploding. GitHub COO Kyle Daigle confirmed: code change events go from 1B in 2025 to a projected 14B in 2026 — 14×. The traffic slope far outpaces GitHub's own infra scaling.
Eight outages already this year. Feb 2: security policy misconfig blocked backend storage, taking down GitHub Actions and Codespaces for ~6 hours. Feb 9: cache TTL shortened from 12h to 2h triggered mass cache rewrites that crushed the DB — combined with a new model release, Monday peak traffic, and client updates, five factors landed at once. March: three back-to-back — Redis write failures, Codespaces cross-region collapse, webhook latency spiking from 5s to 160s.
Why AWS, not Azure: Microsoft plans full GitHub migration to Azure by 2027, but that timeline can't bend now. Pulling AWS in temporarily is a pure engineering call — borrow a competitor's compute to bridge the demand spike until in-house scaling lands.
For context: Microsoft's SLA pays $262.50 per 4 hours of GitHub outage. Fifty developers' 4-hour productivity loss is ~$15,000. The numbers are nowhere near matched.
via NewsBytes
Eight outages already this year. Feb 2: security policy misconfig blocked backend storage, taking down GitHub Actions and Codespaces for ~6 hours. Feb 9: cache TTL shortened from 12h to 2h triggered mass cache rewrites that crushed the DB — combined with a new model release, Monday peak traffic, and client updates, five factors landed at once. March: three back-to-back — Redis write failures, Codespaces cross-region collapse, webhook latency spiking from 5s to 160s.
Why AWS, not Azure: Microsoft plans full GitHub migration to Azure by 2027, but that timeline can't bend now. Pulling AWS in temporarily is a pure engineering call — borrow a competitor's compute to bridge the demand spike until in-house scaling lands.
For context: Microsoft's SLA pays $262.50 per 4 hours of GitHub outage. Fifty developers' 4-hour productivity loss is ~$15,000. The numbers are nowhere near matched.
via NewsBytes
