I am the tech lead for a project that revolves around multiple terabytes of trading data for one of the ten largest banks in the world. My team runs three 3-node MongoDB clusters, 3TB per node, where we keep a huge number of documents (mostly immutable, 1kB to 10kB in size).
Majority write/read concern exists precisely so that you don't lose data and don't observe writes that are going to be rolled back. It is important to understand this fact when you evaluate MongoDB for your solution. That it comes with additional downsides is hardly a surprise, otherwise there would be no reason to specify anything other than majority.
You just can't test lower levels of guarantees and then complain you did not get what higher levels of guarantees were designed to provide.
It is also obvious, when you use majority concern, that some of the nodes may accept the write but then have to roll back when the majority cannot acknowledge the write. It is obvious this may cause some of the writes to fail that would succeed should the write concern be configured to not require majority acknowledgment.
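To make the rollback scenario above concrete, here is a toy model of a 3-node replica set. It is only a sketch under stated assumptions: `Node`, `write`, and `fail_over` are hypothetical names invented for illustration, not MongoDB's replication protocol or any driver API, and replication is reduced to "append to every reachable node's log".

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    reachable: bool = True  # can the primary currently replicate to this node?
    log: list = field(default_factory=list)


def write(primary, secondaries, value, w):
    """Apply a write on the primary, copy it to reachable secondaries,
    and report whether enough nodes hold it to satisfy write concern `w`."""
    primary.log.append(value)
    acks = 1  # the primary itself counts as one acknowledgment
    for node in secondaries:
        if node.reachable:
            node.log.append(value)
            acks += 1
    total = 1 + len(secondaries)
    needed = total // 2 + 1 if w == "majority" else int(w)
    return acks >= needed


def fail_over(old_primary, new_primary):
    """Toy failover: the new primary's log wins, and the old primary
    rolls back every entry the new primary never received."""
    rolled_back = [v for v in old_primary.log if v not in new_primary.log]
    old_primary.log = list(new_primary.log)
    return rolled_back


# Primary "a" is partitioned away from both secondaries.
a = Node("a")
b = Node("b", reachable=False)
c = Node("c", reachable=False)

acked_w1 = write(a, [b, c], "t1", 1)                 # acknowledged by the primary alone
acked_majority = write(a, [b, c], "t2", "majority")  # cannot reach a majority
lost = fail_over(a, b)                               # "b" takes over with an empty log
```

In this model the `w=1` write is reported as successful and is still lost on failover, while the majority write is reported as failed even though the old primary had accepted it locally. That is the tradeoff being described, nothing more.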
The article simply misses the mark by trying to create sensation where there is none to be found.
The MongoDB documentation explains the architecture and guarantees well enough that you should be able to understand the various read/write concerns, and that anything below majority does not guarantee much. This is a tradeoff which you are allowed to make provided you understand the consequences.
To quote from the report: "Moreover, the snapshot read concern did not guarantee snapshot unless paired with write concern majority—even for read-only transactions."
Of course it doesn't work when you don't pair it with majority read/write concern. You can't expect to get a snapshot of data that hasn't yet been acknowledged by a majority of the cluster.
As to the quote you probably are referring to:
"Jepsen evaluated MongoDB version 4.2.6, and found that even at the strongest levels of read and write concern, it failed to preserve snapshot isolation."
I did not find any proof of this in the rest of the report. It seems to be mostly a complaint about what happens when you mix different read and write concerns.
I would also suggest thinking a little about the concept of a snapshot in the context of a distributed system. With MongoDB's architecture it is not possible to have the same kind of snapshot that you would get with a single-node application. MongoDB is a distributed system where you will get different results depending on which node you ask.
The only way you could get close to a global snapshot is if all nodes agreed on a single truth (for example a single log file, a blockchain, etc.), which would preclude reads and writes with a concern level lower than majority.
Did you see the part about "Operations in a transaction use the transaction-level read concern. That is, any read concern set at the collection and database level is ignored inside the transaction."?
"Transactions without an explicit read concern downgrade any requested read concern at the database or collection level to a default level of local, which offers “no guarantee that the data has been written to a majority of replicas (i.e. may be rolled back).”"
The big problem is that, even if somebody correctly sets the read and write concerns to something sensible, the moment they use a transaction these guarantees fly out the window, unless they read the docs carefully enough to realise they have to set the read and write concern for the transaction too. The defaults are very unintuitive; I can't imagine that somebody needing snapshot isolation in general but being fine with arbitrary data loss in transactions is a common case, compared to wanting to avoid data loss both generally and in transactions.
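The precedence rule in those two quotes can be written down as a tiny function. This is a simplified model of the behaviour as quoted above, not driver code: `effective_read_concern` and its parameters are hypothetical names, and the non-transaction branch (collection first, then database, then `local`) is my own simplifying assumption.

```python
def effective_read_concern(transaction=None, collection=None, database=None,
                           in_transaction=False):
    """Return the read concern an operation would run with in this model."""
    if in_transaction:
        # Per the quoted docs: collection- and database-level settings are
        # ignored inside a transaction; with nothing set on the transaction
        # itself, the level falls back to "local".
        return transaction or "local"
    # Assumption for this sketch: outside a transaction the most specific
    # setting wins, with "local" as the fallback.
    return collection or database or "local"


# A collection configured for "majority" silently gets "local" in a transaction.
surprise = effective_read_concern(collection="majority", in_transaction=True)

# Only an explicit transaction-level setting restores the stronger level.
explicit = effective_read_concern(transaction="snapshot", collection="majority",
                                  in_transaction=True)
```

The point of the sketch is that `surprise` comes out as `"local"` even though every setting the developer touched said `"majority"`.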
Not saying you're wrong. As an anecdotal data point: we read the docs (carefully) and spoke to MongoDB quite a bit when implementing transactions, including through their highest paid level of support, and still ran into this issue:
> transactions running with the strongest isolation levels can exhibit G1c: cyclic information flow.
As well as the Node.js API issue (I just checked randomly and their Python API has the same bug lol) listed above.
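For anyone unsure what the G1c quoted above means in practice: treat each transaction as a node, draw an edge T1 -> T2 whenever T2 read or overwrote something T1 wrote, and look for a cycle. A minimal sketch follows; `find_cycle` and the transaction names are hypothetical, and this is a plain depth-first search, not the checker Jepsen actually uses.

```python
def find_cycle(edges):
    """Return one cycle in a directed graph as a list of nodes, or None."""
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in graph}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GREY:
                # Back edge to a node on the current path: that is a cycle.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for node in graph:
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# T2 observed T1's write, and T1 observed T2's write: information flows in a circle.
cyclic = find_cycle([("T1", "T2"), ("T2", "T1")])
# A chain of dependencies with no way back is fine.
acyclic = find_cycle([("T1", "T2"), ("T2", "T3")])
```

Under snapshot isolation the first history should be impossible, since each transaction is supposed to read from a snapshot taken before the other one committed.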
It is not different. For a product like MongoDB, both the durability guarantees and the documentation explaining them are an integral part of the user experience. If I'm starting a project, I'm making decisions for a junior developer whom I'll hire in 2 years. I care what code that junior developer will be most nudged to write.
If the Stripe API had documentation that was needlessly unclear in a way which led people to lose a significant amount of money, that would be a bug.
Chief, it does not have to be this hard. 3.4 clearly states:
This anomaly occurred even with read concern snapshot and write concern majority
3.5: In this case, a test running with read concern snapshot and write concern majority executed a trio of transactions with the following dependency graph
3.6: Worse yet, transactions running with the strongest isolation levels can exhibit G1c: cyclic information flow.
3.7: It’s even possible for a single transaction to observe its own future effects. In this test run, four transactions, all executed at read concern snapshot and write concern majority, append 1, 2, 3, and 4 to key 586—but the transaction which wrote 1 observed [1 2 3 4] before it appended 1.
Like... if you had read any of these sections--or even their very first sentences--you wouldn't be in this position. They're also summarized both in the abstract and discussion sections, in case you skipped the results.
4.0: Finally, even with the strongest levels of read and write concern for both single-document and transactional operations, we observed cases of G-single (read skew), G1c (cyclic information flow), duplicated writes, and a sort of retrocausal internal consistency anomaly: within a single transaction, reads could observe that transaction’s own writes from the future. MongoDB appears to allow transactions to both observe and not observe prior transactions, and to observe one another’s writes. A single write could be applied multiple times, suggesting an error in MongoDB’s automatic retry mechanism. All of these behaviors are incompatible with MongoDB’s claims of snapshot isolation.
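The "reads from the future" anomaly from 3.7 is easy to state as a check over a single transaction. A rough sketch, assuming each appended element is unique per key and that a transaction is recorded as an ordered list of `("r", key, observed_list)` and `("append", key, element)` operations; the representation and the name `reads_own_future_write` are mine, not the report's.

```python
def reads_own_future_write(txn):
    """True if some read in `txn` observes an element that the same
    transaction only appends later on."""
    for i, op in enumerate(txn):
        if op[0] != "r":
            continue
        _, key, observed = op
        earlier = {e for kind, k, e in txn[:i] if kind == "append" and k == key}
        later = {e for kind, k, e in txn[i + 1:] if kind == "append" and k == key}
        # An element this transaction has not appended yet must not be
        # visible to its own earlier read.
        if any(e in observed for e in later - earlier):
            return True
    return False


# The case from 3.7: the transaction that appends 1 to key 586
# already sees [1 2 3 4] before its own append.
anomalous = reads_own_future_write([("r", 586, [1, 2, 3, 4]), ("append", 586, 1)])

# Reading your own write after making it is perfectly normal.
normal = reads_own_future_write([("append", 586, 1), ("r", 586, [1])])
```

No combination of read and write concern should let the first history through, which is why the report lists it separately from the weaker-concern findings.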
May I suggest alternative perspective on the matter?
Compared to a product like Oracle, transactions on MongoDB are very new, very niche functionality. Even MongoDB consultants openly advise against using them.
MongoDB is really meant to store and retrieve documents. That's where the majority read/write concern guarantees come from.
As long as you are storing and retrieving documents you are pretty safe.
Your article presents the situation as if MongoDB did not work correctly at all. That is simply not true; the most you can say is that a single (niche) feature doesn't work.
Have you ever tried distributed transactions with relational databases? Everybody knows these exist, but nobody of sound mind would ever architect their application to rely on them.
Any person with a bit of experience will understand that things don't come free and some things are just too good to be true. MongoDB marketing may be a bit trigger-happy with their advertisements, but that does not mean the product is unusable; they probably just promised a bit too much.
The world does not revolve around HN votes. If your first urge is to check whether the post gets downvoted or not, you might want to rethink your life a little bit.
I'm not "worried" nor experiencing an "urge." Please skip the concern trolling.
What I do have an interest in is HN's accepted decorum, which I admittedly stepped outside of when I implored you to stop digging yourself such a hole.
HN is far from perfect but there is a culture of respectful discourse here, which is part of the reason for its value IMO.
May I suggest the tiniest bit of consideration (such as reading the report) before jumping to conclusions and low-key offending the author? You should be embarrassed.
This comment looks a bit comical when compared with the one you started this whole thread with. You're an engineer, why are you siding with marketing over measured technical facts? Do you think denial will make your infrastructure any safer? Don't make excuses for MongoDB, just acknowledge the article as an appropriately well weighted response to their marketing claims and move on.
> May I suggest alternative perspective on the matter?
Can't reply to that since it's too nested, so I'll reply here. I warmly recommend getting off the tree you climbed and actually reading the article, because if you do, you will see you are not disagreeing on that part.
The article is a mostly technical analysis of the transaction isolation levels and where they hold. The main criticism is how MongoDB advertises itself. If they didn't claim the database is "fully ACID" then the article would have just been a technical analysis :]
> The article simply misses the mark by trying to create sensation where there is none to be found.
As someone who is a tech lead for a large database install, I'd urge you to read the rest of the Jepsen reports. They aren't intended to be hit pieces on technology - they're deep dives into the claims and guarantees of each database. IIRC MDB has explicitly reached out to OP in the past (I doubt they'll continue to do so after this).
Why that matters to the rest of us: once I learn all those dials and knobs I'm left wondering why I would choose Mongo over another technology, and how much the design of the default behavior and complexity of said dials/knobs are influenced by their core business.
I would also wonder about the surrounding ecosystem of tooling & libraries.
Imagine there was a programming language which had rather inconsistent naming, poor automated testing support, and a history of guiding its users toward security vulnerabilities. A culture would grow up around that language and the most successful members would be those who could best tolerate those properties. People generally self-select into language communities. So unless some powerful influence pushed random programmers to use the language or made it easier to add new tooling, the culture would continue to undervalue what the language originally lacked.
I suspect the same social dynamic would apply to a database.
I agree. MongoDB has a large number of peculiarities that you had better know before you buy in. It is definitely not as rosy as advertised. In particular, the product does not seem mature (especially if you come from the Oracle world) and the features seem slapped on as they go rather than thought through.
The documentation states that very clearly, and the attributes are part of every call to the database (as long as you are using the native driver).
In any case any person that has some experience with distributed systems will understand what it roughly means to get an acknowledgment from just a single node vs. waiting for the majority.
Oracle also does not use serializable as its default isolation level, yet it advertises it.
This is all part of the product functionality. Whenever you evaluate a product for your project you have to understand its various options, functionalities and their tradeoffs.
Defaults don't mean shit. In a complex clustered product you need to understand all the important knobs to decide the correct settings, and configurable guarantees are the most important knobs there are.
Since you just leaned all the way in, while repeatedly proving you either will not or cannot read the posted article at all, will you let us know what bank you support, so I can at least make sure I never use that bank?
Thanks,
Those of us who care about our banking and investing data.