0. https://www.cloudamqp.com/blog/why-is-a-database-not-the-rig...
EDIT>> I am not suggesting people build their own rabbitmq infrastructure. Use a cloud service. The article is informational only.
A message queue is one of those things that is easy enough and worth the effort to do "right" early on, because it is not something you want to rip out and rewrite when you hit your scaling bottlenecks, given how critical it is and how many things it will end up touching.
Edit: also keep in mind that most queues do not like "slow consumers". If your workload is bursty with long processing times, a database might be a better fit (RabbitMQ, for example, does not like it).
Edit2: we implemented a queue with Postgres since we need ACID. Having 10k inserts per second is highly unlikely for us, since a customer upload takes longer than a second (we deal with files). We mostly have bursty workloads: a short period of high volume followed by long pauses (e.g. nobody uploads stuff at night).
IMO, the downsides of hosting a queue inside your primary relational DB are very much outweighed by the downsides of 1) having to run a new piece of infra like rabbit and 2) having to coordinate consistency between your message queue and your relational DB(s)
For high throughput (we had ad tech servers with 1E7 hits/s) we used a home-built low-latency queue that supported real time and persisted data. But for low throughput stuff, the DBaaQ worked fine.
And ultimately, maybe it was a lack of imagination on our part since Segment was successful with a mid-throughput DBaaQ https://segment.com/blog/introducing-centrifuge/
One of the biggest advantages comes when you start thinking about them in terms of transactions. Transactional guarantees are really useful here: guarantee that a message will be written to the queue if the transaction commits successfully, and guarantee that a message will NOT be written to the queue otherwise.
https://brandur.org/job-drain describes a great pattern for achieving that using PostgreSQL transactions.
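To make the transactional guarantee concrete, here is a minimal sketch of the idea. It is not the code from the linked article: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL so it runs anywhere, and the table names (uploads, staged_jobs) and the record_upload helper are made up for illustration. The point is only that the business row and the queue row are written in the same transaction, so they commit or roll back together.

```python
import sqlite3


def make_db():
    # In-memory SQLite stands in for PostgreSQL here (assumption for the sketch).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE uploads (id INTEGER PRIMARY KEY, filename TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE staged_jobs ("
        "id INTEGER PRIMARY KEY, job_name TEXT NOT NULL, job_args TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_upload(conn, filename, fail=False):
    """Insert the upload and its queue message in one transaction."""
    try:
        # `with conn` commits if the block succeeds and rolls back on exception.
        with conn:
            cur = conn.execute(
                "INSERT INTO uploads (filename) VALUES (?)", (filename,)
            )
            conn.execute(
                "INSERT INTO staged_jobs (job_name, job_args) VALUES (?, ?)",
                ("process_upload", str(cur.lastrowid)),
            )
            if fail:
                # Simulate the process failing before the commit.
                raise RuntimeError("simulated failure before commit")
    except RuntimeError:
        return False
    return True


def counts(conn):
    """Return (number of uploads, number of staged jobs)."""
    uploads = conn.execute("SELECT COUNT(*) FROM uploads").fetchone()[0]
    jobs = conn.execute("SELECT COUNT(*) FROM staged_jobs").fetchone()[0]
    return uploads, jobs
```

Calling record_upload(conn, "a.csv") and then record_upload(conn, "b.csv", fail=True) leaves counts(conn) at (1, 1): the failed call writes neither the upload nor the message. With an external broker you would need some extra mechanism to get the same all-or-nothing behaviour.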
At least on GCP PubSub, a subscription is a separate concept from a topic/queue. If you want different priorities, you create multiple topics. You create multiple subscriptions when you want to fan out a single message to multiple workers. As far as I know, multiple subscriptions have nothing to do with priorities. Can you explain?
>you're big enough to experience this failure fairly often IMO
Please explain how. You would either have to suffer from frequent network connectivity issues that affect only your db and not your queue, or your process must be mysteriously dying in the microseconds between those 2 operations. Neither of those cases is something I would consider to happen "fairly often," even if you were processing trillions of messages per day.
In my experience, the vast majority of message processing failures happen at the worker level.
You’re guaranteed to break the invariant sooner or later so you end up with all the usual complexity of keeping stuff in sync.
Edit>> I see you edited your post after I responded. None of those scenarios qualify as "fairly often."
How does it solve the additional code complexity problem?
Go back and read OP's link. They create new SQL types, tables, triggers, and functions, with non-trivial and very unforgiving atomic logic. And every system that needs to read from or write to this "db queue" needs to use specific queries. That's the complexity.
>vs writing code to support some other new system
You mean using a stable, well-maintained library with a clean and sensible interface to a queueing system? Yes, that is far simpler.
- No need to poll the database table
- No table-level locks and manual handling: row locks used for handling the work in progress
- "Manual cleanup" -- uhhh
Etc.
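For the row-lock point, a rough sketch of what a worker could look like. Everything here is an assumption rather than something from the linked post: the jobs table with id and payload columns is hypothetical, the SQL relies on PostgreSQL's FOR UPDATE SKIP LOCKED (available since 9.5), and conn is any DB-API style connection to Postgres (e.g. one opened with psycopg2) that the caller supplies.

```python
# Hypothetical schema: jobs(id, payload). The inner SELECT locks one row and
# skips rows already locked by other workers, so no table-level lock is needed.
CLAIM_SQL = """
DELETE FROM jobs
WHERE id = (
    SELECT id
    FROM jobs
    ORDER BY id
    FOR UPDATE SKIP LOCKED
    LIMIT 1
)
RETURNING id, payload
"""


def claim_and_process(conn, handler):
    """Claim one job, run handler on it, and commit.

    Returns True if a job was processed and False if none was available.
    If the handler raises, the transaction is rolled back, which releases
    the row lock and leaves the job in the table for another attempt.
    """
    cur = conn.cursor()
    cur.execute(CLAIM_SQL)
    row = cur.fetchone()
    if row is None:
        # Nothing to do (or every remaining row is locked by other workers).
        conn.rollback()
        return False
    try:
        handler(row)
    except Exception:
        conn.rollback()
        raise
    conn.commit()
    return True
```

The row stays locked for as long as the handler runs, so this shape suits jobs that finish quickly; long-running work would more likely mark the row as in progress and commit first.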
If this is insufficient for more complicated migrations, there's tooling to support it, e.g. Flyway.