It should be the opposite: with more messages you want to scale with independent consumers, and a monotonic counter is a disaster for that.
You also don’t need to worry about dropping old messages if you implement your processing to respect the commutative property.
For example, you can conceive of a software vendor that does the end-to-end of a real estate transaction: escrow, banking, signature, etc. The IT required to support the model of such a thing would be staggering. Does it make sense to do that kind of product development? That is inventing all of SAP on top of solving your actual problem. Or making the mistake of adopting Temporal, Trigger, etc., which believe they have a smaller problem than building all of SAP and spend considerable resources convincing you that they do.
The status quo is that everyone focuses on their own little part and does it as quickly as possible. The need for durable workflows is BAD. You should look at that problem as: make buying and selling homes much faster and simpler, or even change the order of things so that less durability is required; not re-enact the status quo as an IT-driven workflow.
Is there any method for uniqueness testing that works after fan-out?
> You also don’t need to worry about dropping old messages if you implement your processing to respect the commutative property.
The commutative property protects you if messages are received out of order. Handling duplicates requires idempotency.
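A minimal sketch of the distinction, using a hypothetical running-balance example (none of these names come from the article):

    # Crediting a balance is commutative: delivery order does not matter.
    # It is not idempotent: replaying the same credit changes the result.
    def apply_credit(balance: int, amount: int) -> int:
        return balance + amount

    # Out-of-order delivery is harmless.
    assert apply_credit(apply_credit(0, 10), 5) == apply_credit(apply_credit(0, 5), 10)

    # A duplicate is not, so deduplicate by message id before applying.
    seen: set[str] = set()
    balance = 0
    for msg_id, amount in [("m1", 10), ("m2", 5), ("m1", 10)]:  # "m1" delivered twice
        if msg_id in seen:
            continue  # drop the duplicate
        seen.add(msg_id)
        balance = apply_credit(balance, amount)
    assert balance == 15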
"How we've been doing things is wrong and I am going to redesign it in a way that no one else knows about so I don't have to implement the thing that's asked of me"
Why are real-estate transactions complex and full of paperwork? Because there are history books filled with fraud. There are other types of large transactions that also involve a lot of paperwork too, for the same reason.
Why does a company have extensive internal tracing of the progress of their business processes, and those of their customers? Same reason, usually. People want accountability and they want to discourage embezzlement and such things.
Businesses that require enterprise sales are probably the worst-performing category of seed investing. They encompass all of ed tech and health tech, which are the two worst industry verticals for VC; and Y Combinator has to focus on an index of B2B services for other programmers because without that constraint, nearly every "do what you are asked for" startup would fail. Most of the IT projects businesses run internally fail!
In fact I think the idea you are selling is even harder: doing B2B enterprise sales is much harder than knowing whether the thing you are making makes sense and is good.
This has a number of nice properties (sketched in code after the list):
1. You don’t need to store keys in any special way. Just make them a unique column of your db and the db will detect duplicates for you (and you can provide logic to handle them as required, e.g. ignoring the duplicate if the other input fields are the same, or raising an error if a message has the same idempotency key but different fields).
2. You can reliably generate new downstream keys from an incoming key without the need for coordination between consumers, getting an identical output key for a given input key regardless of consumer.
3. In the event of a replayed message it’s fine to republish downstream events because the system is now deterministic for a given input, so you’ll get identical output (including generated messages) for identical input, and generating duplicate outputs is not an issue because this will be detected and ignored by downstream consumers.
4. This parallelises well because consumers are deterministic and don’t require any coordination except by db transaction.
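A minimal sketch of properties 1-3, assuming Python with sqlite3 and a uuid5-style derivation (the table and namespace here are illustrative):

    import sqlite3
    import uuid

    # Any fixed namespace UUID works; a built-in one is used here for brevity.
    DOWNSTREAM_NS = uuid.NAMESPACE_URL

    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE processed (
            idempotency_key TEXT PRIMARY KEY,  -- property 1: the db detects duplicates
            payload         TEXT NOT NULL
        )
    """)

    def handle(incoming_key: str, payload: str) -> str:
        # Property 2: derive the downstream key deterministically from the incoming
        # key, so every consumer computes the same value without coordination.
        downstream_key = str(uuid.uuid5(DOWNSTREAM_NS, incoming_key))
        try:
            with db:  # insert the key and do the "processing" in one transaction
                db.execute(
                    "INSERT INTO processed (idempotency_key, payload) VALUES (?, ?)",
                    (incoming_key, payload),
                )
        except sqlite3.IntegrityError:
            # Property 3: a replayed message is detected and ignored; republishing
            # downstream is safe because the derived key is identical.
            pass
        return downstream_key

    assert handle("msg-42", "hello") == handle("msg-42", "hello")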
idempotency means something else to me.
Edit: Just looked it up... looks like this is basically what a uuid5 is, just a hash(salt+string)
> Critically, these two things must happen atomically, typically by wrapping them in a database transaction. Either the message gets processed and its idempotency key gets persisted. Or, the transaction gets rolled back and no changes are applied at all.
How do you do that when the processing isn’t persisted to the same database? I.e., what if the side effect is outside the transaction?
You can’t atomically rollback the transaction and external side effects.
If you could use a distributed database transaction already, then you wouldn’t need idempotency keys at all. The transaction itself is the guarantee.
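One common answer (not from the article) is a transactional outbox: record the intended external call in the same transaction as the idempotency key, then let a separate relay perform it at least once. A minimal sketch, with illustrative table names:

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE idempotency_keys (key TEXT PRIMARY KEY);
        CREATE TABLE outbox (
            id       INTEGER PRIMARY KEY AUTOINCREMENT,
            key      TEXT NOT NULL,   -- passed along so the receiver can deduplicate
            api_call TEXT NOT NULL,
            sent     INTEGER NOT NULL DEFAULT 0
        );
    """)

    def process(key: str, api_call: str) -> None:
        try:
            with db:  # atomic: either both rows land, or neither does
                db.execute("INSERT INTO idempotency_keys (key) VALUES (?)", (key,))
                db.execute("INSERT INTO outbox (key, api_call) VALUES (?, ?)", (key, api_call))
        except sqlite3.IntegrityError:
            pass  # duplicate message: already processed, nothing to do

    # A separate relay drains unsent outbox rows and performs the external call
    # at least once; exactly-once still depends on the far side deduplicating.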
This illustrates that the webdevs who write articles on "distributed systems" don't really understand what is already out there. These are all solved problems.
There is also two-phase commit, which is not without its own downsides.
All in all, I think the author makes the wrong point that exactly-once processing is somehow easier to solve than exactly-once delivery, when in fact it is exactly the same problem, just shaped differently. IDs here are secondary.
If you try to disambiguate those messages using, say, a timestamp or a unique transaction ID, you're back where you started: how do you avoid collisions of those fields? Better to use a random UUIDv4 in the first place.
Which obviously has its own set of tradeoffs.
Customer A can buy N units of product X as many times as they want.
Each unique purchase you process will have its own globally unique id.
Each duplicated source event you process (due to “at least once” guarantees) will generate the same unique id across the other duplicates - without needing to coordinate between consumers.
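A minimal sketch of that derivation, assuming the source event carries its own id (the "purchase:" salt is illustrative):

    import hashlib

    def purchase_id(source_event_id: str) -> str:
        # Deterministic: every consumer that sees this event, including any
        # redelivered copy, derives the same id without coordination.
        return hashlib.sha256(f"purchase:{source_event_id}".encode()).hexdigest()

    # Distinct purchases by the same customer get distinct ids...
    assert purchase_id("evt-001") != purchase_id("evt-002")
    # ...while a redelivery of the same source event maps to the same id.
    assert purchase_id("evt-001") == purchase_id("evt-001")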
> In distributed systems, there’s a common understanding that it is not possible to guarantee exactly-once delivery of messages.
This is not only a common understanding, it is a provably correct axiom. For a detailed discussion of the concepts involved, see the "Two Generals' Problem"[0]. Guaranteeing exactly-once processing requires a Single Point of Truth (SPoT) enforcing uniqueness, shared by all consumers, such as a transactional persistent store. Independently derived or generated "idempotency keys" cannot provide the same guarantee.
The author goes on to discuss using the PostgreSQL transaction log to create "idempotency keys", which is a specialization of the aforementioned SPoT approach. A more performant variation of this approach is the "hi/low" algorithm[1], which can reduce SPoT allocation of a unique "hi value" to 1 in 2,147,483,648 times when both are 32-bit signed integers having only positive values.
Still and all, none of the above establishes logical message uniqueness. That is a trait of the problem domain: whether two or more messages with the same content are considered distinct (thus mandating different "idempotency keys") or duplicates (thus mandating identical "idempotency keys") is a domain decision.
Pedantically, axioms by definition are assumed/defined without proof and are not provable; if something is provable from axioms/definitions, it is a theorem, not an axiom.
You'd think that a transaction means money is going from a source to a destination, but according to some banking APIs sometimes it just magically disappears into the aether.
The article doesn't propose anything especially different from Lamport clocks. What this article suggests is a way to deal with non-idempotent message handlers.
Also check out Lamport vector clocks. They solve this problem if your producers are a small, fixed number.
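A minimal vector clock sketch, assuming a small, fixed set of producer ids (illustrative, not what the article proposes):

    # Each process keeps a counter per known producer; receiving merges by
    # element-wise max, and comparing clocks reveals ordering or concurrency.
    PRODUCERS = ["p1", "p2", "p3"]  # assumed small and fixed

    def new_clock() -> dict[str, int]:
        return {p: 0 for p in PRODUCERS}

    def tick(clock: dict[str, int], me: str) -> None:
        clock[me] += 1  # local event or send

    def merge(clock: dict[str, int], other: dict[str, int], me: str) -> None:
        for p in PRODUCERS:  # on receive: element-wise max, then tick
            clock[p] = max(clock[p], other[p])
        clock[me] += 1

    def happened_before(a: dict[str, int], b: dict[str, int]) -> bool:
        return all(a[p] <= b[p] for p in PRODUCERS) and a != b

    a, b = new_clock(), new_clock()
    tick(a, "p1")      # p1 records an event and sends its clock along
    merge(b, a, "p2")  # p2 receives the message
    assert happened_before(a, b)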
I am discussing this approach, just not under that name:
> Gaps in the sequence are fine, hence it is possible to increment the persistent state of the sequence or counter in larger steps, and dispense the actual values from an in-memory copy.
In that model, a database sequence (e.g. fetched in increments of 100) represents the hi value, and local increments on top of the fetched sequence value are the low value.
However, unlike the log-based approach, this does not ensure monotonicity across multiple concurrent requests.
[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Reference/...
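A minimal hi/low-style sketch; the database sequence is simulated with an in-process counter here, but in practice fetching the hi value would be the only round-trip to the single point of truth:

    import itertools

    BLOCK_SIZE = 100

    # Stand-in for a database sequence: each next() call simulates one fetch.
    _sequence = itertools.count(start=0)

    class HiLoGenerator:
        def __init__(self) -> None:
            self._hi = 0
            self._lo = BLOCK_SIZE  # force a fetch on first use

        def next_id(self) -> int:
            if self._lo >= BLOCK_SIZE:
                self._hi = next(_sequence)  # one "round-trip" per BLOCK_SIZE ids
                self._lo = 0
            value = self._hi * BLOCK_SIZE + self._lo
            self._lo += 1
            return value

    gen = HiLoGenerator()
    assert [gen.next_id() for _ in range(5)] == [0, 1, 2, 3, 4]
    # As noted above, ids from concurrent generators are unique but not monotonic.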
More seriously, you are confident and very incorrect in your understanding of distributed systems. The easiest lift: you can fix being very incorrect (or at least appearing that way) by simply turning your statements into questions.
Personally, I recommend studying. Start with the two generals problem. Read Designing Data Intensive Applications; it is a great intro into real problems and real solutions. Very smart and very experienced people think there is something to distributed systems. They might be on to something.
The thing that webdevs want to solve is related but different, and whether the forest is missed for the trees is sometimes hard to tell.
What webdevs want to solve is data replication in a distributed system of transactions where availability is guaranteed, performance is evaluated horizontally, change is frequent and easy, barrier to entry is low, tooling is widely available, tech is heterogeneous, and the domain is complex relational objects.
Those requirements give you a different set of tradeoffs vs financial exchanges, which despite having their own enormous challenges, certainly have different goals to the above.
So does that mean this article is a good solution to the problem? I'm not sure; it's hard to tell sometimes whether all the distributed aircastles invented for web dev really pay off versus just having a tightly integrated low-level solution. But regardless of the hypothetical optimum, it's hard to argue that the proposed solution is probably a good fit for web dev culture vs UDP, which unfortunately is something very important to take into account if you want to get stuff done.
It's true that idempotency can sometimes be achieved without explicitly having message identity, but in cases where it cannot, a key is usually provided to solve this problem. This key indeed encodes the identity of the message, but it is usually called an "idempotency key" to signal its use.
The system then becomes idempotent not by having repeated executions result in identical state on some deeper level, but by detecting and avoiding repeated executions on the surface of the system.
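A minimal sketch of that "surface" idempotency, with an in-memory record of keys standing in for whatever store a real system would persist them to:

    _results: dict[str, object] = {}  # idempotency key -> recorded result

    def run_once(key: str, action):
        # The handler itself is not idempotent; the system becomes idempotent
        # "on the surface" by detecting a repeated key and skipping execution.
        if key not in _results:
            _results[key] = action()
        return _results[key]

    calls = []
    run_once("charge-42", lambda: calls.append("charged"))
    run_once("charge-42", lambda: calls.append("charged"))  # detected, skipped
    assert calls == ["charged"]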
Isn't that the situation inside a CPU across its multiple cores? Data is replicated (into caches) in a distributed system of transactions, because each core uses its own L2 cache with which it interacts, and has to be sent back to main memory for consistency. Works amazing.
Another even more complex system: a multi CPU motherboard supporting NUMA access: 2 CPUs coordinate their multiple cores to send over RAM from the other CPU. I have one of these "distributed systems" at home, works amazing.
For your specific question here: NUMA & cpu cores don't suffer from the P in CAP: network partitions. If one of your CPU cores randomly stops responding, your system crashes, and that's fine because it never happens. If one of your web servers stops responding, which may happen for very common reasons and so is something you should absolutely design for, your system should keep working because otherwise you cannot build a reliable system out of many disconnected components (and I do mean many).
Also note that there is no way to really check if systems are available, only that you cannot reach them, which is significantly different.
Then we haven't even reached the point that the CPU die makes communication extremely fast, whereas in a datacenter you're talking milliseconds; and if you are syncing with a different system across data centers, or even with clients, that story becomes wildly different.
Are you sure about that? I actually have no idea but I'm surprised by your claim.