Dissecting GitHub Outage - How databases are managed in production
arpit.substack.com
Managing databases in production is not easy and it does require a lot of tooling to keep it up and running. GitHub had an outage that gives us a glimpse of the toolings they use in production. So, here's what happened Incident Summary The GitHub team updated the ProxySQL and a week later a master node in a critical database cluster crashed. A replica was promoted as the new master, and it also crashed. The team tried to manually recover, but the manually promoted replica also crashed.
Dissecting GitHub Outage - How databases are managed in production
Dissecting GitHub Outage - How databases are…
Dissecting GitHub Outage - How databases are managed in production
Managing databases in production is not easy and it does require a lot of tooling to keep it up and running. GitHub had an outage that gives us a glimpse of the toolings they use in production. So, here's what happened Incident Summary The GitHub team updated the ProxySQL and a week later a master node in a critical database cluster crashed. A replica was promoted as the new master, and it also crashed. The team tried to manually recover, but the manually promoted replica also crashed.