I get that syncing clocks across systems is hard, and that when it goes awry there are unintended consequences.
"Riak is designed to accomodate (sp) server and network failures without ever losing committed writes, so this led to a quick response from Basho’s engineers."
As such, when I read documentation, losing a write means to me losing a create, an update, or a delete. Any side affecting operation, essentially. Anything that needs to write to disk to record a change...
I was concerned that might be interpreted as spin, but I hoped the rest of the article would reinforce the point that there is no way to guarantee an update is preserved in a distributed system without an approach more sophisticated than blindly trusting clocks.
Writes to a new object are inherently less problematic; while it's possible to temporarily receive a negative response about the presence of an object, the data will always be there, barring catastrophic multiple server failure.
Updates can be entirely lost, and that's something that developers and operations people need to be aware of.
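To make the lost-update case concrete, here is a minimal sketch of last-write-wins resolution when one node's clock runs fast. It is not Riak's code; the `Versioned` type, the `lww_merge` function, and the 5-second skew are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Versioned:
    value: str
    timestamp: float  # wall-clock time reported by the node that took the write


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Last-write-wins: keep whichever write carries the larger timestamp."""
    return a if a.timestamp >= b.timestamp else b


# Assumption for the example: node A's clock runs 5 seconds fast.
real_time = 1000.0
first = Versioned("written first, on node A", real_time + 5.0)

# One real second later, a client updates the same key through node B,
# whose clock is accurate.
second = Versioned("written second, on node B", real_time + 1.0)

winner = lww_merge(first, second)
print(winner.value)  # prints "written first, on node A"
```

The later update is acknowledged and then silently discarded, because the only evidence of ordering is a pair of timestamps from clocks that disagree.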
> If your distributed database relies on clocks to pick a winner, you’d better have rock-solid time synchronization, and even then, it’s unlikely your business needs are served well by blindly selecting the last write that happens to arrive.
More broadly, as someone who helps write our documentation, it's very difficult to figure out how to present enough detail about the proper ways to use Riak without forcing everyone to become an expert on distributed systems. Unfortunately there are incredibly subtle tradeoffs inherently involved in running a distributed database.
That should be on a cross-stitch sampler on the wall of every NOC.
"The key enabler of these properties is a new TrueTime API and its implementation. The API directly exposes clock uncertainty, and the guarantees on Spanner’s timestamps depend on the bounds that the implementation provides. If the uncertainty is large, Spanner slows down to wait out that uncertainty. Google’s cluster-management software provides an implementation of the TrueTime API. This implementation keeps uncertainty small (generally less than 10ms) by using multiple modern clock references (GPS and atomic clocks)."
[1] Spanner: Google's globally-distributed database https://www.usenix.org/system/files/conference/osdi12/osdi12...
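The "wait out that uncertainty" step in the quote can be sketched roughly as follows. This is only an illustration of the commit-wait idea, not Google's API: `UncertainClock`, `Interval`, and `commit` are hypothetical names, and the uncertainty bound is simply assumed rather than derived from GPS or atomic clocks.

```python
import time
from dataclasses import dataclass


@dataclass
class Interval:
    earliest: float  # true time is assumed to be no earlier than this
    latest: float    # ... and no later than this


class UncertainClock:
    """Hypothetical stand-in for a clock API that exposes its own uncertainty."""

    def __init__(self, epsilon: float):
        self.epsilon = epsilon  # assumed error bound, in seconds

    def now(self) -> Interval:
        t = time.monotonic()
        return Interval(t - self.epsilon, t + self.epsilon)


def commit(clock: UncertainClock) -> float:
    """Pick a timestamp, then wait until it is definitely in the past."""
    timestamp = clock.now().latest
    while clock.now().earliest <= timestamp:
        time.sleep(clock.epsilon / 10)
    return timestamp


clock = UncertainClock(epsilon=0.005)
start = time.monotonic()
ts = commit(clock)
elapsed = time.monotonic() - start
```

The wait is roughly twice the uncertainty bound, which is why keeping that bound small matters so much: a larger epsilon directly slows every commit.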
Of the well-known open-source AP systems, Riak is probably the leader here, since it implements well-understood techniques from the literature such as CRDTs and vclocks.
EDIT: removed my statement about Cassandra since it was a bit misleading and jbellis answered above in greater detail.
I am with op in that I consider an update a write.
"create/update" are both writes
"write/update" ... eh?
Other structures such as CRDTs/lattices might be more appropriate for your use case.
http://www.datastax.com/dev/blog/why-cassandra-doesnt-need-v...
http://www.datastax.com/dev/blog/cql3_collections
http://www.datastax.com/dev/blog/lightweight-transactions-in...
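For anyone unfamiliar with the CRDT idea mentioned above, a grow-only counter is about the smallest example. This is a generic sketch, not the implementation used by Riak or Cassandra; the function names and node labels are invented for illustration.

```python
def increment(counter: dict, node: str, amount: int = 1) -> dict:
    """Each node only ever bumps its own slot."""
    updated = dict(counter)
    updated[node] = updated.get(node, 0) + amount
    return updated


def merge(a: dict, b: dict) -> dict:
    """Per-node maximum: commutative, associative, and idempotent."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def value(counter: dict) -> int:
    return sum(counter.values())


# Two replicas accept increments independently during a partition.
replica_a = increment(increment({}, "node_a"), "node_a")
replica_b = increment({}, "node_b")

# After the partition heals, merging in either order gives the same answer.
merged = merge(replica_a, replica_b)
print(value(merged))  # prints 3
```

No clocks are consulted at any point, so clock skew cannot cause an increment to be dropped.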
If we could have traded away correctness, we could have optimized everything and gone home by now :)
There are people who have to. They run their own atomic clocks, and worry about things such as precision delivery of nuclear ordnance.
Then there's you. You should use vector clocks, with a built-in conflict resolution mechanism based on domain knowledge.
That's the point of the article.
Also, s/side affecting/side effecting/.
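A rough sketch of what "vector clocks plus domain-specific conflict resolution" can look like, assuming a shopping-cart style value where the sensible merge is a set union. The names `compare` and `resolve` are hypothetical and this is not Riak's client API.

```python
def compare(a: dict, b: dict) -> str:
    """Return 'before', 'after', 'equal', or 'concurrent' for clock a relative to b."""
    nodes = a.keys() | b.keys()
    a_ahead = any(a.get(n, 0) > b.get(n, 0) for n in nodes)
    b_ahead = any(b.get(n, 0) > a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "concurrent"
    if a_ahead:
        return "after"
    if b_ahead:
        return "before"
    return "equal"


def resolve(a_clock, a_value, b_clock, b_value, merge_values):
    """Keep the causally newer value; fall back to a domain merge on conflict."""
    order = compare(a_clock, b_clock)
    if order in ("after", "equal"):
        return a_clock, a_value
    if order == "before":
        return b_clock, b_value
    merged_clock = {
        n: max(a_clock.get(n, 0), b_clock.get(n, 0))
        for n in a_clock.keys() | b_clock.keys()
    }
    return merged_clock, merge_values(a_value, b_value)


# Two carts edited on different nodes without seeing each other's update.
clock, cart = resolve(
    {"node_a": 2, "node_b": 1}, {"book", "pen"},
    {"node_a": 1, "node_b": 2}, {"book", "lamp"},
    merge_values=lambda x, y: x | y,  # domain knowledge: union the carts
)
print(sorted(cart))  # prints ['book', 'lamp', 'pen']
```

The point is that the conflict is detected rather than guessed at, and the application decides what "merge" means. A real client would also bump its own entry in the clock when writing the merged value back.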
1) set last_write_wins=true (so all updates always apply, as described in the article)
2) avoid the "partition/rejoin may cause old values to stomp on new" issue by having "rejoin detection" which refuses to rejoin if clocks are "too out of sync"
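The "rejoin detection" in step 2 might look something like the sketch below. Everything here is an assumption for illustration: the `may_rejoin` name, the half-second threshold, and the use of the median of peer-reported times. It also ignores network latency when sampling peer clocks, which a real check would have to account for.

```python
from statistics import median

MAX_SKEW_SECONDS = 0.5  # hypothetical threshold for "too out of sync"


def may_rejoin(local_time: float, peer_times: list[float],
               max_skew: float = MAX_SKEW_SECONDS) -> bool:
    """Allow a rejoin only if the local clock is close to the peers' median."""
    if not peer_times:
        return False  # nothing to compare against, so refuse to be safe
    return abs(local_time - median(peer_times)) <= max_skew


peers = [1000.0, 1000.1, 999.9]
print(may_rejoin(1000.2, peers))  # prints True
print(may_rejoin(1003.0, peers))  # prints False
```

This only narrows the window in which an old value can stomp on a new one; writes that land within the allowed skew can still be ordered wrongly.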
As for how, it's a long story. At bottom we rely on Paxos for consistency across failures, but we only actually do Paxos when there are failures. (We use less costly synchronous techniques for replication in "happy times".)