I don't see a downside to this approach. Perhaps some increased latency?
Each shard allows at most 5 GetRecords calls per second. If you want to fan out to many consumers, you will hit those limits quickly and be forced into a significant latency/throughput tradeoff to make it work.
For API limits, see: https://docs.aws.amazon.com/kinesis/latest/APIReference/API_...
Then any new Lambdas or other services that want to subscribe to those messages will need another queue, and another, and so on.
I haven't had a case where service groups were coming up and down; I'm struggling to think of a use case.
For example, an AWS Lambda triggered from SQS can scale to thousands of concurrent executions, each one pulling its own message from the queue.
But another consumer group, maybe a set of load-balanced EC2 instances, will need a separate queue.
In general, I don't know of cases where you want a single message duplicated across a variable number of consumer groups - services are not ephemeral things, even if their underlying processes are. You don't build a service, deploy it, and then tear it down the next day and throw away the code.
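The usual AWS pattern for this is SNS -> SQS fan-out: one topic, one queue per consumer group. A rough sketch assuming boto3; the queue policy helper is pure, and the topic/queue names are hypothetical:

```python
import json


def queue_policy(topic_arn: str, queue_arn: str) -> str:
    """IAM policy allowing an SNS topic to deliver into an SQS queue."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "sns.amazonaws.com"},
            "Action": "sqs:SendMessage",
            "Resource": queue_arn,
            "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
        }],
    })


def fan_out(topic_arn: str, group_names: list[str]) -> None:
    """Create one SQS queue per consumer group and subscribe each
    of them to the shared SNS topic."""
    import boto3  # only needed when actually provisioning

    sns, sqs = boto3.client("sns"), boto3.client("sqs")
    for name in group_names:
        queue_url = sqs.create_queue(QueueName=f"{name}-events")["QueueUrl"]
        queue_arn = sqs.get_queue_attributes(
            QueueUrl=queue_url, AttributeNames=["QueueArn"]
        )["Attributes"]["QueueArn"]
        sqs.set_queue_attributes(
            QueueUrl=queue_url,
            Attributes={"Policy": queue_policy(topic_arn, queue_arn)},
        )
        sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
```

Each new consumer group is one more queue and one more subscription, which is exactly the "another queue, and another" growth described above - manageable when groups are stable, annoying when they aren't.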
Google Cloud really outshines AWS here with its serverless PubSub - it's trivial to fan out, it's low latency, it has similar delivery semantics (I think), and IMHO better, easier APIs. It's a really impressive service.
But their only method of throttling is to scale up and down based on failures. And it has been very unpredictable for me.
Even though my webhook started failing and timing out on requests, pubsub just kept hammering my servers until they were brought completely to their knees. Logs on Google's end showed 1,500 failed attempts per second and 0.2 successes per second. It kept up that rate for half an hour.
Seems like their Push option really needs some work.
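For contrast, what you'd want on failures is exponential backoff - something a pull consumer can do for itself, but push delivery takes out of your hands. A minimal sketch (function and parameter names are mine):

```python
def next_delay(current: float, failed: bool,
               base: float = 0.1, cap: float = 60.0) -> float:
    """Exponential backoff: double the wait after each failed delivery,
    cap it, and reset to the base interval once a delivery succeeds."""
    if not failed:
        return base
    return min(max(current * 2, base), cap)
```

With this in the consumer's poll loop, 1,500 failures per second would rapidly stretch into one attempt a minute, instead of hammering a downed server at full rate for half an hour.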