MongoDB is a popular NoSQL database and is one of my top choices when starting a project. Recently, I came across a report that explained why MongoDB with its default configurations is unreliable and how it can lose data. In this post, I will be going through what makes MongoDB lose data and how we can fix it.
The fault of the defaults
With its default settings for write concern and read concern, MongoDB cannot guarantee that your data is saved durably.
Write concern defines how durably your data must be saved before MongoDB gives you a success message. The default is { w: 1 }, which means that MongoDB saves the data to the primary and gives you a success message even if the data hasn't yet been replicated to any of the secondaries. So, if the primary goes down for some reason before replication happens, the write will be rolled back.
Read concern defines how durable data must be before you can read it from the database. The default value is local, which means there's no guarantee that the data you read from the database will not be rolled back in the future.
So, in short, you can read and write data that can just vanish into thin air.
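To make the failure mode concrete, here is a minimal sketch of the behaviour described above. It is a toy in-memory model, not MongoDB's actual replication protocol: the `ToyReplicaSet` class, its methods, and the node names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    log: list = field(default_factory=list)  # writes this node has applied


class ToyReplicaSet:
    """Hypothetical three-node replica set: one primary, two secondaries."""

    def __init__(self):
        self.nodes = [Node("db0"), Node("db1"), Node("db2")]
        self.primary = self.nodes[0]

    def _majority(self):
        return len(self.nodes) // 2 + 1

    def write(self, value, w):
        """Apply a write on the primary, then acknowledge according to w."""
        self.primary.log.append(value)
        if w == "majority":
            # Copy the write to enough secondaries to reach a majority
            # (the primary counts as one member) before acknowledging.
            secondaries = [n for n in self.nodes if n is not self.primary]
            for node in secondaries[: self._majority() - 1]:
                node.log.append(value)
        # With w=1 the write is acknowledged before any replication.
        return "ok"

    def read(self, level):
        """Return what a reader on the primary sees at this read concern."""
        if level == "local":
            return list(self.primary.log)
        # "majority": only writes held by a majority of nodes are visible.
        return [
            v for v in self.primary.log
            if sum(v in n.log for n in self.nodes) >= self._majority()
        ]

    def fail_over(self):
        """The primary fails; the most up-to-date secondary takes over.

        When the old primary rejoins, it discards (rolls back) every
        write that the new primary never received.
        """
        old = self.primary
        candidates = [n for n in self.nodes if n is not old]
        self.primary = max(candidates, key=lambda n: len(n.log))
        old.log = list(self.primary.log)


rs = ToyReplicaSet()
rs.write("order-1", w="majority")  # replicated before the acknowledgement
rs.write("order-2", w=1)           # acknowledged, but only on the primary

local_before = rs.read("local")        # sees both writes
majority_before = rs.read("majority")  # sees only the replicated write

rs.fail_over()
after = rs.read("local")

print(local_before)     # ['order-1', 'order-2']
print(majority_before)  # ['order-1']
print(after)            # ['order-1']
```

In this model the w=1 write was acknowledged and was even readable with the local read concern, yet it is gone after the failover. The majority read never exposed it in the first place.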
Is there a solution?
Fortunately, yes. We need to tell MongoDB to make sure the data we save is replicated to a majority of the replica set members before giving us a success message, and that we only want to read data that has been replicated to a majority too.
This can be done by setting both the write concern and the read concern to majority in the connection string itself:
mongodb://db0.example.com,db1.example.com,db2.example.com/?w=majority&readConcernLevel=majority
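If the connection string is assembled in application code, the two options can be kept explicit instead of being buried in a long literal. The following is a small sketch using only the Python standard library. The `build_mongo_uri` helper is a hypothetical name introduced here, not part of any MongoDB driver, and the host names are the placeholders from the example above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_mongo_uri(hosts, **options):
    """Hypothetical helper: join hosts and options into a mongodb:// URI."""
    return f"mongodb://{','.join(hosts)}/?{urlencode(options)}"


uri = build_mongo_uri(
    ["db0.example.com", "db1.example.com", "db2.example.com"],
    w="majority",                 # write concern
    readConcernLevel="majority",  # read concern
)
print(uri)

# Parse the query string back to double-check which options are set.
options = parse_qs(urlsplit(uri).query)
print(options["w"], options["readConcernLevel"])
```

The resulting string is identical to the one shown above and can be passed to a driver unchanged, for example to PyMongo's `MongoClient(uri)`.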
Why isn't this the default then?
As MongoDB researchers reported in VLDB:
...users would prefer, of course, to use readConcern level "majority" and writeConcern w:"majority", since everyone wants safety. However, when users find stronger consistency levels to be too slow, they will switch to using weaker consistency levels. These decisions are often based on business requirements and SLAs rather than granular developer needs. As we argue throughout this paper, the decision to use weaker consistency levels often works in practice because failovers are infrequent and data loss from failovers is usually small.
Conclusion
Even though it's true that MongoDB can lose data if you use the default settings, failovers are infrequent enough that the loss is usually small. However, if your business requires safety to be the topmost priority, MongoDB provides configurable durability levels to suit your needs.