How to handle database outages?
Why does a database go down?
An unexpected heavy load on your database can lead to a process crash or a massive slowdown.
Before jumping to the short-term and long-term solutions, ensure you monitor the database well: CPU, memory, disk, and connections should all be closely watched.
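To make the monitoring point concrete, here is a minimal sketch of a threshold check over those four signals. The `DbMetrics` shape, the `breached` helper, and the threshold values are all assumptions for illustration; in practice the numbers would come from whatever monitoring system you already use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DbMetrics:
    cpu_percent: float      # CPU utilisation, 0-100
    memory_percent: float   # memory utilisation, 0-100
    disk_percent: float     # disk space used, 0-100
    connections: int        # currently open connections
    max_connections: int    # configured connection limit


# Hypothetical alert thresholds (in percent); tune them for your own workload.
THRESHOLDS = {"cpu": 80.0, "memory": 85.0, "disk": 90.0, "connections": 80.0}


def breached(metrics: DbMetrics) -> List[str]:
    """Return the names of the signals that crossed their threshold."""
    conn_percent = 100.0 * metrics.connections / metrics.max_connections
    observed = {
        "cpu": metrics.cpu_percent,
        "memory": metrics.memory_percent,
        "disk": metrics.disk_percent,
        "connections": conn_percent,
    }
    return [name for name, value in observed.items() if value >= THRESHOLDS[name]]
```

Calling `breached(...)` on every scrape and alerting on a non-empty result is one simple way to notice trouble before the database actually goes down.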
Short-term solutions
Kill the queries that have been running for a long time
Quickly scale up your database if you have been seeing a consistent heavy usage
Check if the recent deployment is the culprit; if so, revert ASAP
Rebooting the database will calm the storm and buy you some time
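The first item above, killing long-running queries, can be sketched roughly as follows. This assumes PostgreSQL, where a snapshot of running queries can be read from the `pg_stat_activity` view and a backend can be stopped with `pg_terminate_backend(pid)`; MySQL would use `KILL <id>` instead. The `ActiveQuery` type, both helper functions, and the 300-second limit are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ActiveQuery:
    pid: int                 # backend process id of the session
    runtime_seconds: float   # how long the query has been running
    query: str               # the SQL text, kept for logging


def pick_queries_to_kill(active: List[ActiveQuery],
                         max_runtime_seconds: float) -> List[ActiveQuery]:
    """Return the queries running longer than the limit, longest first."""
    slow = [q for q in active if q.runtime_seconds > max_runtime_seconds]
    return sorted(slow, key=lambda q: q.runtime_seconds, reverse=True)


def kill_statements(queries: List[ActiveQuery]) -> List[str]:
    """Build the PostgreSQL statements that would terminate each backend."""
    return [f"SELECT pg_terminate_backend({q.pid});" for q in queries]


# Example snapshot; in a real incident these rows would come from the database.
snapshot = [
    ActiveQuery(101, 5.0, "SELECT 1"),
    ActiveQuery(102, 900.0, "SELECT * FROM orders"),
    ActiveQuery(103, 4000.0, "UPDATE users SET active = false"),
]
victims = pick_queries_to_kill(snapshot, max_runtime_seconds=300)
statements = kill_statements(victims)
```

Generating the statements first, rather than executing them right away, leaves room to eyeball the list during an incident before anything is terminated.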
Long-term solutions
Ensure the right set of indexes is in place
Tune your database default parameters to gain optimal performance
Check for the notorious N+1 Queries
Upgrade the database version to get the best that DB can offer
Evaluate the need for Horizontal scaling using Replicas and Sharding
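Two of the items above, indexes and N+1 queries, are easy to see in a small experiment. The sketch below uses Python's built-in `sqlite3` with a made-up `authors`/`posts` schema; the table names, the `CountingConnection` wrapper, and the index name are assumptions for illustration, and the exact query-plan text differs across databases.

```python
import sqlite3


class CountingConnection:
    """Thin wrapper that counts how many queries are sent to the database."""

    def __init__(self, conn):
        self.conn = conn
        self.count = 0

    def query(self, sql, params=()):
        self.count += 1
        return self.conn.execute(sql, params).fetchall()


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
        INSERT INTO authors (id, name) VALUES (1, 'a1'), (2, 'a2'), (3, 'a3');
        INSERT INTO posts (author_id, title)
            VALUES (1, 'p1'), (1, 'p2'), (2, 'p3'), (3, 'p4');
    """)
    return conn


def titles_n_plus_one(db):
    """One query for the authors, then one more query per author (N+1)."""
    result = {}
    for author_id, name in db.query("SELECT id, name FROM authors ORDER BY id"):
        rows = db.query(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        )
        result[name] = [title for (title,) in rows]
    return result


def titles_single_join(db):
    """Same data in a single query (authors without posts are omitted)."""
    rows = db.query(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    result = {}
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result


def plan(conn, sql, params=()):
    """Return SQLite's query plan; the last column of each row is the detail."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


db = CountingConnection(make_db())
slow_result = titles_n_plus_one(db)
n_plus_one_queries = db.count      # 1 query for authors + 1 per author

db.count = 0
fast_result = titles_single_join(db)
join_queries = db.count            # a single round trip

lookup = "SELECT title FROM posts WHERE author_id = ?"
plan_before = plan(db.conn, lookup, (1,))   # full scan of posts
db.conn.execute("CREATE INDEX idx_posts_author_id ON posts (author_id)")
plan_after = plan(db.conn, lookup, (1,))    # lookup through the new index
```

With three authors the N+1 version issues four queries where the join issues one, and the gap grows linearly with the number of authors. Comparing `plan_before` and `plan_after` shows the lookup switching from a table scan to the index.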
Here's the video of me explaining this in depth 👇 do check it out
In this video, we talk about why a database goes down, what happens when the database is down, a few short-term solutions to minimize the downtime, and a few long-term solutions that you should be doing to ensure that your database does not go down again.
Outline:
00:00 Why a database goes down?
06:10 What happens when a DB is down?
09:46 Short-term solutions to get your DB up
17:33 Long-term solutions to fix the database
You can also
Subscribe to the YT Channel Asli Engineering
Listen to this on the go on Spotify
Thank you so much for reading 🖖 If you found this helpful, do spread the word about it on social media; it would mean the world to me.
You can also follow me on your favourite social media: LinkedIn and Twitter.
Yours truly,
Arpit
arpitbhayani.me
Until next time, stay awesome :)