> Oh but what about ORDERED queues? The only way to get ordered application of writes is to perform them one after the other.
This is another WTF. Talking about ordered queues is like talking about databases, because the data is structured. If you can feed data from concurrent, unordered sources into a system where access can be ordered, you have access to sorted data. You deal with out-of-order data at insertion, in a processing window, or in the consumers. "Write in order" is not a requirement, but an option. Talking about technical subjects on Twitter always results in some mind-numbingly idiotic statements for the sake of 140 characters.
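To make the "window in the processing" option concrete, here is a minimal sketch of a reorder buffer. It assumes each message carries a sequence number and that nothing arrives more than `window` positions late; the function name `reorder` and the sample data are hypothetical, not taken from any particular queue.

```python
import heapq
from typing import Iterable, Iterator, Tuple


def reorder(events: Iterable[Tuple[int, str]], window: int) -> Iterator[Tuple[int, str]]:
    """Yield (seq, payload) pairs in seq order, buffering at most `window` items.

    The output is fully sorted only if no item arrives more than `window`
    positions late; anything later than that is emitted out of order.
    """
    heap: list = []
    for event in events:
        heapq.heappush(heap, event)
        # Once the buffer is over capacity, the smallest seq is safe to release.
        if len(heap) > window:
            yield heapq.heappop(heap)
    # Input exhausted: drain whatever is still buffered, smallest first.
    while heap:
        yield heapq.heappop(heap)


# Writers deliver out of order; the consumer still sees ordered data.
arrivals = [(2, "b"), (1, "a"), (4, "d"), (3, "c"), (5, "e")]
ordered = list(reorder(arrivals, window=2))
```

The writes were never serialized; ordering is imposed on the read side, at the cost of holding up to `window` messages in memory.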
Naively, it would seem that S3 charges $5 per million POST requests, which is 12.5x worse than SQS's $0.40 per million.
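Working through the two prices quoted in that comment, the ratio comes out to 12.5. A quick check, using only those two figures and ignoring everything else (GET/LIST requests, storage, batching, free tiers):

```python
# Prices as quoted in the comment, in USD per million requests.
S3_POST_PER_MILLION = 5.00
SQS_PER_MILLION = 0.40

# How many times more expensive one S3 POST is than one SQS request.
ratio = S3_POST_PER_MILLION / SQS_PER_MILLION
```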
Interesting. Can you expand on this? How do you ensure that only one worker takes a message from S3? Or do you only use this setup when you have only one worker?
Just because you can, doesn’t mean you should.
The Perl implementation was the original AFAIK.
https://metacpan.org/release/Directory-Queue/source/lib/Dire...
So how does one lock a message in S3? Does S3 have a "createIfDoesNotExistOrError"? I'm still having difficulty understanding how the proposed system avoids race conditions.
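For reference, this is the primitive being asked about, sketched on a local filesystem rather than S3. It is only an analogy: it shows what "create if it does not exist, otherwise error" buys you for claiming a message, and says nothing about whether the proposed S3 setup has an equivalent. The helper name `try_claim` and the lock file name are hypothetical.

```python
import os
import tempfile


def try_claim(lock_path: str) -> bool:
    """Return True if this caller created the lock file, False if it already existed."""
    try:
        # O_CREAT | O_EXCL makes creation fail if the path already exists,
        # so at most one caller can succeed for a given path.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True


with tempfile.TemporaryDirectory() as queue_dir:
    lock = os.path.join(queue_dir, "message-0001.lock")
    first = try_claim(lock)   # this worker wins the message
    second = try_claim(lock)  # a second worker is turned away
```

Without an atomic check-and-create like this somewhere in the storage layer, two workers can both list the same message and both believe they own it.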
Say the outage results in a few million messages that need to be retried. Some subset of those few million will never succeed (aka they are “poisoned pills”). At the same time, new messages are arriving.
In your system, how do you maintain QoS for incoming messages as well as allow for the resolution of the few million retries while also preventing the poisoned pills from blocking the queue? How do you implement exponential backoff, which is the standard approach for this?
SQS gives you some simple yet powerful primitives such as the visibility timeout setting to address this scenario in a straightforward manner.
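As a rough sketch of what a home-grown queue would need to reproduce, here is exponential backoff with a retry limit, so that poisoned pills are set aside instead of retried forever. The function names, the base delay, the cap, and the attempt limit are all illustrative assumptions. With SQS, the computed delay would typically be applied by changing the message's visibility timeout, and the set-aside step by a dead-letter queue.

```python
import random
from typing import Optional, Tuple


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 900.0) -> float:
    """Upper bound, in seconds, on the wait after failed attempt `attempt` (1-based)."""
    return min(cap, base * 2 ** (attempt - 1))


def next_action(
    attempt: int,
    max_attempts: int = 5,
    rng: Optional[random.Random] = None,
) -> Tuple[str, float]:
    """Decide what to do with a message whose attempt number `attempt` just failed.

    Returns ("dead_letter", 0.0) once attempts are exhausted, otherwise
    ("retry", delay) with the delay drawn uniformly from [0, backoff_delay].
    """
    if attempt >= max_attempts:
        return ("dead_letter", 0.0)
    source = rng if rng is not None else random
    # Random jitter spreads retries out so a backlog does not return in lockstep.
    return ("retry", source.uniform(0, backoff_delay(attempt)))


action, delay = next_action(2, rng=random.Random(0))
```

Because retried messages stay hidden for the length of their delay, new arrivals are not stuck behind the backlog, and a message that keeps failing eventually leaves the main queue.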
I guess this is a viable cost optimization strategy if the engineering time to implement it, plus all the SQS features you might need on top of it, plus future maintenance of the custom solution, is cheaper than the cost of using SQS, and you have no other outstanding work that your engineering team should be doing instead.
But that's a whole lot of ifs, and most customers I've worked with are far better served just using SQS.
I can't vouch for the queueing code but I believe it's quite robust too.
Any middle tiering of the data before it reaches the consumer is still "the queue". You don't need to know the internals of SQS, any more than a consumer needs to know the black-box elements of how you collate the messages within your ad-hoc queue.