https://www.pgcasts.com/episodes/the-skip-locked-feature-in-...
It’s not “web scale” but it easily extends to several thousand background jobs in my experience
I've used this for tasks at big organizations without issue. No need for any special deployments or new infra. Just spin up a few worker threads in your app. Perhaps a thread to reset abandoned tasks, but in three years that never actually happened, as everything was wrapped in a try/catch that would add the task back to the queue, and our Java app was damn stable.
Just curious. We maintained a custom background processing system for years but recently replaced it with off the shelf stuff, so I'm really interested in how others are doing similar stuff.
Our tasks were quick enough that all fetched tasks could always be completed before a scale-down / new deploy etc., but we stopped fetching new ones when the signal came, so the worker just finished what it had. I updated above: we did have logic to monitor whether a task got taken but never got a finished status, but I can't remember it ever actually reporting anything.
>Applied to job records, this feature enables simple queue processing queries, e.g. SELECT * FROM jobs ORDER BY created_at FOR UPDATE SKIP LOCKED LIMIT 1.
delete from task
where task_id in
( select task_id
from task
order by random() -- use tablesample for better performance
for update
skip locked
limit 1
)
returning task_id, task_type, params::jsonb as params
[1] https://taylor.town/pg-task
My approach was:
- Accept the inbound call
- Generate a 20 character random string (used as a signature)
- Execute a SQL query that selects the oldest job without a signature, writes the signature, and returns the primary key of the job that was updated (see the sketch after this list).
- If it errors for any reason, loop back and attempt again, but only 10 times, as beyond that some underlying issue exists (10 collisions is statistically improbable for my use case)
- Take the primary key returned by that SQL query, read the row back, and compare its signature to my random one.
- If a hit, return the job to the caller
- If a miss, loop back and start again, incrementing attempts by 1.
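For illustration, the claim step might look roughly like this; the table and column names (job, job_id, signature, created_at) are placeholders, not from the actual system:

  -- claim the oldest unassigned job by stamping it with our random signature
  update job
     set signature = 'abc123xyz...'     -- the 20 character random string
   where job_id = (select job_id
                     from job
                    where signature is null
                    order by created_at
                    limit 1)
  returning job_id;

  -- verification read: compare the stored signature to ours in the application;
  -- a mismatch means another worker won the race, so loop back and try again
  select job_id, signature
    from job
   where job_id = 1234;                 -- the key returned above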
The caller has to handle the possibility that a call to this web service won't return anything, either due to no jobs existing, or the collision/error threshold being reached.
In either case, the caller backs off for its configured time, then calls again.
Callers are usually in 'while true' loops, only exiting if they get an external signal to close or an uncontrolled crash.
If you take this approach, you will have a function or a web service that turns the SQL table into a job queue service. When you do that, you can build metrics on the number of collisions you get while trying to pull and assign jobs to workers.
I had built-in processes that would sweep through jobs that were assigned (had a job signature) but weren't marked as complete; those were re-actioned to handle the case of a crashed worker.
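A sweep like that can be as small as something along these lines; the status and assigned_at columns are hypothetical, and the timeout is arbitrary:

  -- release claims that never completed so a crashed worker's job gets picked up again
  update job
     set signature = null
   where signature is not null
     and status <> 'complete'
     and assigned_at < now() - interval '15 minutes';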
There are many, many other features that proper job queues offer, but that usually means more dependencies and code libraries / containers, so just build in the functionality you need.
If it is accurate, fast enough, and stable, you've got the best solution for you.
/edited for formatting
You can use locks to effectively break the queue into sub-queues so that each sub-queue is only being processed by one worker. Then you can order that sub-queue.
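One way to sketch that with Postgres advisory locks, assuming a hypothetical queue_key column that partitions the queue:

  -- true means this worker's session now owns the 'customer-42' sub-queue
  select pg_try_advisory_lock(hashtext('customer-42'));

  -- the owning worker can then drain its sub-queue in strict insertion order
  select *
    from task
   where queue_key = 'customer-42'
   order by created_at
   limit 1;

  -- release the sub-queue when done
  select pg_advisory_unlock(hashtext('customer-42'));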
> The task row will not be deleted if sendEmail fails. The PG transaction will be rolled back. The row and sendEmail will be retried.
For example, I run the above query to grab a queued email, send it using mailgun, then COMMIT. Nothing is changed in the DB unless the email is sent.
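Roughly, that flow is the delete query from above wrapped in an explicit transaction (the task_type filter is just illustrative):

  begin;

  delete from task
  where task_id in
  ( select task_id
    from task
    where task_type = 'send_email'
    order by random()
    for update
    skip locked
    limit 1
  )
  returning task_id, params::jsonb as params;

  -- the application sends the email here (e.g. via Mailgun);
  -- on failure it issues ROLLBACK instead, and the row stays queued

  commit;  -- only now does the task actually leave the table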
In the beginning you can do a naive UPDATE ... SET, which locks way too much. You can make the locking more efficient by doing UPDATE with a SELECT subquery for dequeues, or SELECT FOR UPDATE SKIP LOCKED, but eventually your dequeue queries will throttle each other's locks and your queue will grind to a halt. You can try to disable enqueues at that point to give your DB more breathing room, but then you have data loss on the dropped enqueues, and it's mostly your dequeues locking each other out anyway.
You can try to quickly shard out your task tables to avoid locking, and that may work, but it's brittle to roll out across multiple workers and can result in data loss. You can of course drop a random subset of tasks, but that too is data loss. Any of these options is not only highly stressful in a production scenario but also very hard to recover from without a ground-up rearchitecture.
Is this kind of a nightmare production scenario really worth choosing Boring Technology? Maybe if you have a handful of customers and are confident you'll be working at tens of tasks per second forever. Having been in the hot seat for one of these I will always choose a real queue technology over a database when possible.
Edit: To clarify, I mean `SELECT id WHERE used = 0` followed by `UPDATE ... SET used = 1 WHERE id = ... AND used = 0`
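In other words, something like this (the "used" flag and table name are just placeholders):

  begin;
  select id from task where used = 0 limit 1;
  -- under READ COMMITTED, two workers can both see the same id at this point;
  -- the "and used = 0" guard below means the loser updates zero rows and retries,
  -- but only after waiting on the winner's row lock
  update task set used = 1 where id = 42 and used = 0;   -- 42 = the id selected above
  commit;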
It's more like a few thousand per second, and enqueues win, not dequeues like you say... on very small hardware without tuning. If you're at tens of tasks per second, you have a whole lot of breathing room: don't build for 100x current requirements.
https://chbussler.medium.com/implementing-queues-in-postgres...
> eventually your dequeue queries will throttle each other's locks
This doesn't really make sense to me. To me, the main problem seems to be that you end up with having a lot of snapshots around.
This link is simply raw enqueue/dequeue performance. Factor in workers that perform work or execute remote calls and the numbers change. Also, I find that when your jobs have high variance in run times, performance degrades significantly.
> This doesn't really make sense to me. To me, the main problem seems to be that you end up with having a lot of snapshots around.
The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
> don't build for 100x current requirements
What happens if you get 100x traffic? Popularity spikes can do it, so can attacks. Is the answer to just accept data loss in those situations? Queue systems are super simple to use. I'm counting "NOTIFY/LISTEN" on Postgres as a queue, because it is a queue from the bottom up.
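For reference, the NOTIFY/LISTEN side can be as small as this; the channel name is arbitrary:

  listen task_queue;                -- each worker runs this once on its connection
  notify task_queue, 'new task';    -- the enqueuer fires this after inserting a row
  -- a woken worker then claims work with the usual FOR UPDATE SKIP LOCKED query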
Then you probably have to write complicated queries or use partitioning of some sort.
Or just stick to one thread polling for messages.
These don't occur on the database server, though... This merely affects the number of rows currently claimed.
> The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
These are just try locks, though-- the row locks are not contended. The big thing you run into is having lots of snapshots around and having to skip a lot of claimed rows for each dequeue.
> What happens if you get 100x traffic? Popularity spikes can do it, so can attacks.
If you get 100x the queueing activity for batch jobs, you're going to have stuff break well before the queue. It's probably not too easy to get 100x the drain rate, even if your queue system can handle it.
This scales well beyond 100M batch tasks per day (a bit over 1,000 tasks per second on average), which gets you to 1M users with 100 tasks/day each.
When grabbing a new message, it selects "Available, or (Locked with an Updated timestamp older than the configured timeout)". If successful, it immediately tries to set the Locked status and Updated timestamp and bumps the Version counter, where the previous values of Status and Version have to match. If the update fails it retries getting a new message.
If the Version counter is too high, it moves the message to the associated dead-letter table, and retries getting a new message.
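For illustration, the claim plus optimistic update might look roughly like this (table and column names are made up, and the timeout is arbitrary):

  -- pick a candidate message
  select id, status, version
    from message
   where status = 'Available'
      or (status = 'Locked' and updated < now() - interval '5 minutes')
   order by created_at
   limit 1;

  -- optimistic claim: the guard uses whatever Status and Version we just read,
  -- so zero rows updated means another consumer beat us and we select again
  update message
     set status = 'Locked',
         updated = now(),
         version = version + 1
   where id = 42                     -- the id selected above
     and status = 'Available'        -- the previously read status
     and version = 3;                -- the previously read version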
This isn't for high performance. I tested it and got 1000 messages/sec throughput with a handful of producers and consumers against a test db instance (limited hardware), which would be plenty for us.
I wrote it to be simple so we could easily move to something AMQP'ish like RabbitMQ or Azure Service Bus when needed. Overall it was quite easy to implement and has served us well so far.
Why would both transactions see `used = 0`? The DB server tries to isolate transactions and actively hides effects of other transactions that have not committed yet.
_Ideally_ the queuing technology is abstracted from the job-submitters/job-runners anyway. It's a bit more work if multiple services are just writing to the queue table directly.
I agree that the _moment_ the system comes to a screeching halt is definitely not fun.
Also, Postgres partial indexes can be quite helpful in situations where you want to persist and query intermediate job lifecycle state and don't want multiple rows or tables to track one type of job queue.
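For example, a partial index over only the rows still waiting to run keeps dequeue scans small as completed rows pile up (the status column is hypothetical):

  create index task_pending_idx
      on task (created_at)
   where status = 'pending';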
https://www.postgresql.org/docs/current/transaction-iso.html...
There is actually another possibility: there must be a way to check whether the receiving system has received the message. But this only works if there are no "rogue" senders.
If they are being technically precise, queue isn’t the correct term, but language changes with context and time. Either way the implementation isn’t wrong if strict start order has been considered and isn’t important.
Careful monitoring and tuning of parameters mentioned by the sibling comment to you can help mitigate this, though.
Ultimately at scale, no, an RDBMS shouldn't be a queue. But most have a long way to go before they hit that point.
Throttle the inputs. Rate-limiting doesn't belong in the data layer.
While throttling due to organic popularity isn’t great, I’d argue the tradeoffs might be worthwhile. If it looks like the spike will last, stand up Redis during the throttling, double-write, and throttle down the Postgres queue until it’s empty. If you really need to, take a 15 minute outage to just copy data over.
This line of reasoning is desirable for FAANGs, but can bankrupt startups that need to move fast and get shit done.
That's exactly what a queue means, not just in every day life, but specifically in computer science.
How does the system behave when the traffic rate is higher than what it was designed for or can currently handle? Because that number will always exist, even in a "scalable" system. One won't be able to add capacity at the same rate that work increases.
I’m not confusing anything. I’ve seen random-selection “job queues” implemented many times. As long as you truly don’t care about start order, it’s fine to trade it for increased throughput.
Queue: a list of data items, commands, etc., stored so as to be retrievable in a definite order, usually the order of insertion.
Note the term "usually", not "always".
Does that mean it doesn't have any order or that whoever writes the query doesn't care about order?
Also, we are arguing over whether pg suffices as a queue implementation, and you use pg itself as an example?
A queue always has an order, which is usually the order of insertion.
I’m not using pg itself as an example. I’m using a specific implementation of a “job queue” built with pg.
I’ve seen and you can search for and find many implementations of “job queues” using relational databases where job start order guarantees are traded away for throughput.