What happened
I was asked to investigate data in the production environment. Before the investigation, I was using MySQL Workbench to delete unnecessary data on the production DB.
When investigating the data in the production DB, I mistakenly executed delete instead of select in SQL in MySQL Workbench, and deleted an entire table.
DELETE FROM posts;
ON DELETE CASCADE was set for the parent table, so four more table data were lost one after another.
If you would like to know more about ON DELETE CASCADE.
https://www.geeksforgeeks.org/mysql-on-delete-cascade-constraint/
Table structure (table names are given as examples and may differ slightly from the actual ones)
Correct settings
When a user is deleted from the users table, the records in the child tables linked to that user_id are also deleted.
comments table
CONSTRAINT `comments_ibfk_1 ` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
likes table
CONSTRAINT `likes_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
points table
CONSTRAINT `points_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
posts table
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
Incorrect setting
When you delete an entire child table posts table of the users table, the users records linked to post_id are also deleted.
Furthermore, the child table records linked to that user_id are also deleted in a chain reaction.
users table
CONSTRAINT `users_ibfk _1` FOREIGN KEY (`comment_id`) REFERENCES `comments` (`id`) ON DELETE CASCADE
CONSTRAINT `users_ibfk_2` FOREIGN KEY (`post_id`) REFERENCES `posts` (`id`) ON DELETE CASCADE
CONSTRAINT `users_ibfk_3` FOREIGN KEY (`like_id`) REFERENCES `likes` (`id`) ON DELETE CASCADE
CONSTRAINT `users_ibfk_4` FOREIGN KEY (`point_id`) REFERENCES `points` (`id`) ON DELETE CASCADE
comments table
CONSTRAINT `comments_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
likes table
CONSTRAINT `likes_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
points table
CONSTRAINT `points_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
posts table
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`user_id`) REFERENCES `users` (`id`) ON DELETE CASCADE
Dealing with the problem
This time, we took a backup of RDS every day, so I was able to restore it to its original data.
The problem occurred at about 10:45 that morning, and I was able to restore RDS at about 13:00. The restoration time may vary depending on the amount of data.
It happened during a meeting, so I was able to report the problem immediately.
Backups are very important.
Measures to prevent a recurrence
- Only senior engineers or managers have editing rights (delete, etc.) to the production environment DB, and other members have read-only rights.
- It was bad to have ON DELETE CASCADE set in the parent table at first, so I removed it from the parent table. I don't know how it was implemented, but it's clearly an anti-pattern, so I removed it.
Thoughts
Since I became an engineer, I had never made a big mistake until this incident, so I think I was pretty relaxed.
Also, during a meeting, we needed to investigate the data, and it was not good that I looked at the production database while talking.
When I executed DELETE, my mind went blank,
but afterwards my colleague encouraged me by saying, "I've had a lot of experiences like that. Don't worry about it," which made me happy.
I think that this is how we grow as engineers, step by step, even as we make mistakes, so I hope I can use this failure as a stepping stone to grow myself!
Top comments (3)
Haha yeah, that's pretty terrible. Good job on minimizing the downtime though and hopefully not lose too much data.
A few notes that come to my mind :
The way to deal with this type of mistakes is to have a delayed replica. In case of a mistake like this (which is always possible), you can take a dump of the deleted data from the replica and load it into the master.
We learn from our mistakes.
Backups are critical to any database, first thing you should do is build a backup system.
I doubt there is many Devs who haven't made some epic mistake, not always the same kind of mistakes but I Think we all have some terrible story to tell how something went horribly wrong for us.
You are not alone on making mistakes, it's how you come out of it that matters, and what you learn from it.