
The first bug is a great reminder that even strict serializability doesn't imply idempotency. If you're doing non-idempotent operations like unconditional writes, you've got to think very carefully before you add any retries to a system. Even with conditional writes, you need to think carefully about ABA bugs.
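To make the retry hazard concrete, here is a minimal sketch in Python. It is a toy in-memory model, not etcd or jetcd: the `Store` class, the `call_with_lost_ack` helper, and the "req-1" request ID are all hypothetical names invented for illustration. It assumes the server applies the first attempt but the acknowledgement is lost, so the client retries.

```python
class Store:
    """Toy single-node store (hypothetical); operations apply one at a time."""

    def __init__(self):
        self.value = 0
        self.seen = set()  # request IDs that have already been applied

    def increment(self):
        # Non-idempotent: every delivery changes the state.
        self.value += 1

    def increment_once(self, request_id):
        # Idempotent variant: a replay with the same ID is a no-op.
        if request_id not in self.seen:
            self.seen.add(request_id)
            self.value += 1


def call_with_lost_ack(op, retries=1):
    """Simulate a client whose first attempt is applied but never acknowledged."""
    for attempt in range(retries + 1):
        op()              # the server applies the operation
        if attempt == 0:
            continue      # ack lost: the client sees a timeout and retries
        return            # a later attempt is acknowledged


naive = Store()
call_with_lost_ack(naive.increment)
print(naive.value)    # the single logical increment was applied twice

deduped = Store()
call_with_lost_ack(lambda: deduped.increment_once("req-1"))
print(deduped.value)  # the replay was recognized and ignored
```

The store itself behaves perfectly here; the duplicate comes entirely from the client's retry, which is why the retry policy has to be considered part of the system.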

Both of these bugs are a great reminder that distributed system behavior includes clients. From the application's perspective, a bug like this introduced by the client is practically no different from one introduced by the server: the same badness happens. A database needs to consider its properties end-to-end, starting from the application API.

It's also a great reminder that APIs that make it hard for clients to do the right thing will likely lead to bugs like this. Failures happen, and a good API needs to be designed in a way that allows the client to do something sensible following a failure. A great API makes it easy for a client to do something sensible, and hard for a client to do the wrong thing. Perhaps my favorite non-distributed example of this is AES-GCM, the ubiquitous AEAD crypto primitive: one tiny bug (reusing an IV) completely blows up the whole scheme.

And, as always, this is great stuff from Kyle. His Jepsen work has been moving the industry forward for years, and it's great to see him continue it (and continue to put the effort into writing up his findings so clearly).



> strict serializability doesn't imply idempotency

I think we're probably getting at the same thing, but I do want to clarify a bit. A Strict Serializable history, like a Serializable one, requires equivalence to a total order of transactions. That's clearly not true for etcd+jetcd: no possible order of transactions can allow (e.g.) a transaction to read from its own future. It's totally fine to submit non-idempotent transactions against a Serializable system: systems which actually provide Serializable will execute known-committed transactions exactly once.

Plenty of other databases pass this test; etcd+jetcd does not. This system is simply not Serializable.


Maybe what I should have said is "you can't just retry transactions against a strict serializable database and expect to still get strict serializability (from the application's perspective)". This is true of distributed system APIs more generally, too.
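A rough sketch of how that can go wrong even with conditional writes (the ABA case mentioned at the top of the thread), again in Python against a toy model. The `Register` class and its `cas_value` / `cas_revision` methods are hypothetical and do not correspond to any real etcd or jetcd API. The scenario assumed: client A's compare-and-set is applied but the ack is lost, client B then restores the old value, and A retries.

```python
class Register:
    """Toy register (hypothetical) with a revision bumped on every write."""

    def __init__(self, value):
        self.value = value
        self.revision = 0

    def _write(self, new):
        self.value = new
        self.revision += 1

    def cas_value(self, expected, new):
        # Conditional on the current value only: vulnerable to ABA.
        if self.value != expected:
            return False
        self._write(new)
        return True

    def cas_revision(self, expected_revision, new):
        # Conditional on the revision: any intervening write makes it fail.
        if self.revision != expected_revision:
            return False
        self._write(new)
        return True


# Value-based compare: the retry succeeds a second time.
r = Register("a")
r.cas_value("a", "b")                  # A's first attempt: applied, ack lost
r.cas_value("b", "a")                  # B restores the old value
retry_applied = r.cas_value("a", "b")  # A's retry matches "a" again
print(retry_applied, r.revision)       # A's one logical write ran twice

# Revision-based compare: the retry is rejected.
r2 = Register("a")
rev = r2.revision
r2.cas_revision(rev, "b")              # A's first attempt: applied, ack lost
r2.cas_revision(r2.revision, "a")      # B restores the old value
retry2 = r2.cas_revision(rev, "b")     # A's retry carries a stale revision
print(retry2, r2.value)
```

Comparing on a revision stops the double apply, but the rejected retry still leaves A unsure whether its first attempt took effect, so the client needs some other way to find out.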


Yeah, that's a good way of phrasing it! :-)



