zlacker

there's an important dimension of scalability that I think gets overlooked in a lot of these discussions about database-as-a-queue vs queue-system-as-a-queue:

are you queuing jobs, or are you queuing messages?

that's a fuzzy distinction, so somewhat equivalently, what's the expected time it takes for a worker to process a given queue item?

at one end, an item on the queue may take several seconds to a minute or longer to process. at the other end, an item might take only a few milliseconds to process. in that latter case, it's often useful to do micro-batching, where a single worker pulls 100 or 1000 items off the queue at once, and processes them as a batch (such as by writing them to a separate datastore)

the "larger" the items are (in terms of wall-clock processing time, not necessarily in terms of size in bytes of the serialized item payload) the more effective the database-as-a-queue solution is, in my experience.

as queue items get smaller / shorter to process, and start to feel more like "messages" rather than discrete "jobs", that's when I tend to reach for a queue system over a database.

for example, there's a RabbitMQ blog post [0] on cluster sizing where their recommendations start at 1000 messages/second. that same message volume on a database-as-a-queue would require, generally speaking, 3000 write transactions per second (if we assume one transaction to enqueue the message, one for a worker to claim it, and one for a worker to mark it as complete / delete it).

can Postgres and other relational databases be scaled & tuned to handle that write volume? yes, absolutely. however, how much write volume are you expecting from your queue workload, compared to the write volume from its "normal database" workload? [1]

I think that ends up being a useful heuristic when deciding whether or not to use a database-as-a-queue - will you have a relational database with a "side gig" of acting like a queue, or will you have a relational database that in terms of data volume is primarily acting like a queue, with "normal database" work relegated to "side gig" status?

0: https://blog.rabbitmq.com/posts/2020/06/cluster-sizing-and-o...

1: there's also a Postgres-specific consideration here where a lot of very short-lived "queue item" database rows can put excessive pressure on the autovacuum system.

replies(1): >>doctor+i2