atombe (OP), 2017-09-21 02:16:12
As much as I like queues, RabbitMQ has some downsides compared to a database.

First, you get almost zero visibility into what's in the queue. There's no practical way to peek inside a queue without consuming messages from it. Say one of the fields in your messages is customer_id: there's no way to count how many waiting messages relate to customer 123.

This leads to the next problem: if customer_id is something you want to partition by, you could create one queue per customer and use a routing key to route the messages. But Rabbit queues are rigid rather than fluid, and it's pretty inconvenient to move messages between them. So if you have one queue and want to split it into N queues, the only way is to drain the queue and republish each message back to the exchange. Rabbit provides no command-line or management tools to do this, and neither does anyone else that I know of.
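The drain-and-republish step can be sketched roughly like this, assuming the `pika` client and a payload carrying customer_id; the queue name `tasks`, exchange name `tasks_exchange`, and routing-key scheme are all illustrative:

```python
import json

def routing_key_for(body: bytes) -> str:
    """Derive a per-customer routing key from the message payload."""
    customer_id = json.loads(body)["customer_id"]
    return f"tasks.customer.{customer_id}"

def drain_and_republish(queue="tasks", exchange="tasks_exchange"):
    """Pull every message off one queue and republish it to the exchange
    under a per-customer key, so bindings can split it into N queues."""
    import pika  # imported lazily; requires a running RabbitMQ broker
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    while True:
        method, props, body = channel.basic_get(queue)
        if method is None:  # queue drained
            break
        channel.basic_publish(exchange, routing_key_for(body), body, props)
        channel.basic_ack(method.delivery_tag)
    conn.close()
```

The per-customer queues have to exist and be bound to the exchange under the new keys before you run this, or the republished messages are dropped.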

Lastly, Rabbit deletes acked messages. To get any visibility into the history of your processing -- or indeed to play back old messages -- you have to build that into your topology and apps, e.g. an exchange that dupes every message into an audit queue, plus a consumer that drains that queue into a database table or log file.
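A minimal sketch of that audit consumer, again assuming `pika`; the queue name `audit` and the log-file destination are illustrative (a database table works the same way):

```python
import json
import time

def audit_record(routing_key: str, body: bytes) -> str:
    """Render one message as a JSON log line: timestamp, key, payload."""
    return json.dumps({
        "ts": time.time(),
        "routing_key": routing_key,
        "body": body.decode("utf-8"),
    })

def run_audit_consumer(log_path="audit.log", queue="audit"):
    """Drain the audit queue into an append-only log file."""
    import pika  # imported lazily; requires a running RabbitMQ broker
    conn = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = conn.channel()
    with open(log_path, "a") as log:
        def on_message(ch, method, props, body):
            log.write(audit_record(method.routing_key, body) + "\n")
            log.flush()
            ch.basic_ack(method.delivery_tag)
        channel.basic_consume(queue, on_message)
        channel.start_consuming()
```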

I quite like the "log" approach to queueing, as popularized by Apache Kafka. But Kafka has its own issues, and sometimes a database table is better.

The pattern I like is to use Rabbit purely for queue orchestration. Make a task table in Postgres, use NOTIFY to signal that a row has been added (with the row's ID as payload), and have a worker LISTEN and stuff each task's ID into Rabbit. Consumers then get the Rabbit message, read (and lock!) the corresponding task row, perform the task, and mark it done. If you need to replay or retry failed tasks, just use SQL to emit the NOTIFYs again.
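A hedged sketch of the two halves of this pattern, assuming PostgreSQL (psycopg2-style connections) and pika; the table name `task`, queue name `task_ready`, and channel name `task_created` are all illustrative:

```python
import json
import select

# Worker-side claim: the row lock (FOR UPDATE) keeps two consumers from
# running the same task; SKIP LOCKED makes a second consumer see "no row"
# instead of blocking, which also makes duplicate deliveries harmless.
CLAIM_SQL = """
    SELECT id, payload FROM task
    WHERE id = %s AND status = 'pending'
    FOR UPDATE SKIP LOCKED
"""

def forward_notifications(pg_conn, channel, queue="task_ready"):
    """LISTEN for task_created NOTIFYs and stuff each task id into Rabbit."""
    with pg_conn.cursor() as cur:
        cur.execute("LISTEN task_created")
    pg_conn.commit()
    while True:
        if select.select([pg_conn], [], [], 5) == ([], [], []):
            continue  # timeout, nothing pending
        pg_conn.poll()
        while pg_conn.notifies:
            note = pg_conn.notifies.pop(0)
            channel.basic_publish("", queue, note.payload.encode())

def handle_task(pg_conn, task_id, perform):
    """Consumer side: read and lock the task, run it, mark it done."""
    with pg_conn:  # one transaction per task
        with pg_conn.cursor() as cur:
            cur.execute(CLAIM_SQL, (task_id,))
            row = cur.fetchone()
            if row is None:  # already claimed, done, or a stale replay
                return False
            payload = json.loads(row[1]) if isinstance(row[1], str) else row[1]
            perform(payload)
            cur.execute("UPDATE task SET status = 'done' WHERE id = %s",
                        (task_id,))
    return True

def replay_failed(pg_conn):
    """Re-emit NOTIFYs for failed tasks, straight from SQL."""
    with pg_conn, pg_conn.cursor() as cur:
        cur.execute("UPDATE task SET status = 'pending' WHERE status = 'failed'")
        cur.execute("SELECT pg_notify('task_created', id::text) "
                    "FROM task WHERE status = 'pending'")
```

The nice property is that Rabbit only carries IDs: the database stays the source of truth for payloads, state, and history, so all the visibility complaints above go away.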
