https://www.pgcasts.com/episodes/the-skip-locked-feature-in-...
It’s not “web scale” but it easily extends to several thousand background jobs in my experience
I've used this for tasks at big organizations without issue. No need for any special deployments or new infra. Just spin up a few worker threads in your app. Perhaps a thread to reset abandoned tasks. But in three years this never actually happened, as everything was contained in try/catch that would add it back to the queue, and our java app was damn stable.
I thought it was an interesting article, and I'd love to hear more from people using PG for queues in production (my intuition would say you'd get a lot of table bloat and/or vacuum latency, but I haven't tested it myself), but when it comes to the conclusion - "choosing boring technology should be one’s default choice" - I can't think of anything more boring (in a good sense, mostly) than Sidekiq + Redis for a Rails app.
Python has Celery, but maybe the author is looking for more choice between brokers. https://docs.celeryq.dev/en/stable/index.html
Just curious. We maintained a custom background processing system for years but recently replaced it with off the shelf stuff, so I'm really interested in how others are doing similar stuff.
Our tasks were quick enough that all fetched tasks could always be completed before a scale-down / new deploy, etc., but we stopped fetching new ones when the signal came, so it just finished what it had. I updated above: we did have logic to monitor if a task got taken but never reached a finished status, but I can't remember it ever actually reporting anything.
At some point you may want to refactor things to reduce tech debt, but it really is a "and" rather than "or" decision.
Use almost anything else to manage job scheduling….
https://github.com/bensheldon/good_job
Had it in production for about a quarter and it’s worked well.
Do you also have any idea of the concurrency? How many workers do you have pulling from Postgres?
I’ve used this approach before (ages ago) when Redis wasn’t even a thing, though not at high throughput requirements.
Do not use SKIP LOCKED unless it is a toy/low-throughput system.
Row locks require transactions and disk writes.
Advisory locks require neither. (However, you do have to stay inside the configurable memory budget.)
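For illustration, a minimal sketch of claiming a job with a session-level advisory lock (the key is just an arbitrary 64-bit integer; here I'm pretending it's the job's id):

SELECT pg_try_advisory_lock(42);   -- returns true for exactly one caller; no row lock, no disk write

-- ... process job 42 ...

SELECT pg_advisory_unlock(42);     -- release when done (or let the session close)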
That is, say you have a relatively complex backend mutation that needs to schedule some async work (e.g. sending an email after signup). With a Postgres queue, if you insert the job to send the email and then a later part of the transaction fails and the transaction rolls back, the email is never queued to be sent.
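A rough sketch of what that buys you (table and column names are made up):

BEGIN;
INSERT INTO users (email) VALUES ('new@example.com');
-- enqueue the welcome email inside the same transaction: if anything later
-- in the transaction fails and we roll back, the job row never becomes
-- visible to workers, so the email is never sent
INSERT INTO jobs (job_type, payload)
VALUES ('send_signup_email', '{"to": "new@example.com"}');
COMMIT;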
1. The main downside to using PostgreSQL as a pub/sub bus with LISTEN/NOTIFY is that LISTEN is a session feature, making it incompatible with statement level connection pooling.
2. If you are going to do this use advisory locks [0]. Other forms of explicit locking put more pressure on the database while advisory locks are deliberately very lightweight.
My favorite example implementation is que [1] which is ported to several languages.
[0] https://www.postgresql.org/docs/current/explicit-locking.htm...
I built a complete implementation in Python designed to work the same as SQS but be simpler:
https://github.com/starqueue/starqueue
Alternatively if you just want to quickly hack something into your application, here is a complete solution in one Python function with retries (ask ChatGPT to tell you what the table structure is):
import psycopg2
import psycopg2.extras
import random

db_params = {
    'database': 'jobs',
    'user': 'jobsuser',
    'password': 'superSecret',
    'host': '127.0.0.1',
    'port': '5432',
}

conn = psycopg2.connect(**db_params)
cur = conn.cursor(cursor_factory=psycopg2.extras.DictCursor)

def do_some_work(job_data):
    # stand-in for real work: randomly succeed or fail
    if random.choice([True, False]):
        print('do_some_work FAILED')
        raise Exception
    else:
        print('do_some_work SUCCESS')

def process_job():
    # claim the oldest 'new' queue item; SKIP LOCKED means concurrent workers
    # skip rows another transaction has already locked instead of blocking
    sql = """DELETE FROM message_queue
    WHERE id = (
        SELECT id
        FROM message_queue
        WHERE status = 'new'
        ORDER BY created ASC
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING *;
    """
    cur.execute(sql)
    queue_item = cur.fetchone()
    if queue_item is None:
        # queue is empty, nothing to do
        return
    print('message_queue says to process job id: ', queue_item['target_id'])
    sql = """SELECT * FROM jobs WHERE id = %s AND status = 'new_waiting' AND attempts <= 3 FOR UPDATE;"""
    cur.execute(sql, (queue_item['target_id'],))
    job_data = cur.fetchone()
    if job_data:
        try:
            do_some_work(job_data)
            sql = """UPDATE jobs SET status = 'complete' WHERE id = %s;"""
            cur.execute(sql, (queue_item['target_id'],))
        except Exception:
            sql = """UPDATE jobs SET status = 'failed', attempts = attempts + 1 WHERE id = %s;"""
            # if we want the job to run again, insert a new item to the message queue with this job id
            cur.execute(sql, (queue_item['target_id'],))
    else:
        print('no job found, did not get job id: ', queue_item['target_id'])
    conn.commit()

process_job()

cur.close()
conn.close()

Having transactional semantics around background jobs is incredibly convenient for things like scheduling email only if the transaction is successful, and so on.
You do need to do a little bit of autovacuum tuning, but once sorted it’s been great for us.
are you queuing jobs, or are you queuing messages?
that's a fuzzy distinction, so somewhat equivalently, what's the expected time it takes for a worker to process a given queue item?
at one end, an item on the queue may take several seconds to a minute or longer to process. at the other end, an item might take only a few milliseconds to process. in that latter case, it's often useful to do micro-batching, where a single worker pulls 100 or 1000 items off the queue at once, and processes them as a batch (such as by writing them to a separate datastore)
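In Postgres terms, that micro-batch claim can be a single statement, something like this sketch (table and column names assumed, mirroring the snippets elsewhere in this thread):

DELETE FROM message_queue
WHERE id IN (
    SELECT id
    FROM message_queue
    ORDER BY created ASC
    FOR UPDATE SKIP LOCKED   -- concurrent workers skip rows we've locked instead of blocking
    LIMIT 100                -- claim up to 100 items in one round trip
)
RETURNING *;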
the "larger" the items are (in terms of wall-clock processing time, not necessarily in terms of size in bytes of the serialized item payload) the more effective the database-as-a-queue solution is, in my experience.
as queue items get smaller / shorter to process, and start to feel more like "messages" rather than discrete "jobs", that's when I tend to reach for a queue system over a database.
for example, there's a RabbitMQ blog post [0] on cluster sizing where their recommendations start at 1000 messages/second. that same message volume on a database-as-a-queue would require, generally speaking, 3000 write transactions per second (if we assume one transaction to enqueue the message, one for a worker to claim it, and one for a worker to mark it as complete / delete it).
can Postgres and other relational databases be scaled & tuned to handle that write volume? yes, absolutely. however, how much write volume are you expecting from your queue workload, compared to the write volume from its "normal database" workload? [1]
I think that ends up being a useful heuristic when deciding whether or not to use a database-as-a-queue - will you have a relational database with a "side gig" of acting like a queue, or will you have a relational database that in terms of data volume is primarily acting like a queue, with "normal database" work relegated to "side gig" status?
0: https://blog.rabbitmq.com/posts/2020/06/cluster-sizing-and-o...
1: there's also a Postgres-specific consideration here where a lot of very short-lived "queue item" database rows can put excessive pressure on the autovacuum system.
QC (and equivs) use the same db, and same connection, so same transaction. Saves quite a bit of cruft.
My guess is that many people are implementing queuing mechanisms just for sending email.
You can see how this works in Arnie SMTP buffer server, a super simple queue just for emails, no database at all, just the file system.
I used it for a web automation system for an accounting client (automatically read files from a network share, lookup the clients on a database, submit the documents to government websites, using headless browsers, and put the resulting files in the directory). It allows for completely effortless deterministic programs that call workers that run the non deterministic code, with built in configurable retries (react to certain exception type, exponential back off) so you can write code that works almost like there were no issues with api connections, filesystem, etc.
This code has been running for 5 or more years, with barely any maintenance, with 0 issues so far. It keeps everything in postgres, so even full reboots and crashes have no impact, it will just move the work back to the queue and it will run when there's an available worker.
Advisory locks don’t work in this setup (and will start behaving in strange ways if you do try to use them.) Something to consider if you go this route.
I’d recommend that, if you have a Postgres database already, definitely use that instead. Your queues will be transactional and they will get backed up when the rest of your database does.
I’ve seen very high transaction rates from this arrangement, more than 20k messages/second.
>Applied to job records, this feature enables simple queue processing queries, e.g. SELECT * FROM jobs ORDER BY created_at FOR UPDATE SKIP LOCKED LIMIT 1.
The big advantage is for code paths which async modify your DB; these can be done fully transactionally with exactly-once semantics since the Job consumption and DB update are in the same transaction.
delete from task
where task_id in
( select task_id
from task
order by random() -- use tablesample for better performance
for update
skip locked
limit 1
)
returning task_id, task_type, params::jsonb as params
[1] https://taylor.town/pg-task
Transactionally Staged Job Drains in Postgres - https://brandur.org/job-drain
It's about the challenge of matching up transactions with queues - where you want a queue to be populated reliably if a transaction completes, and also reliably NOT be populated if it doesn't.
Brandur's pattern is to have an outgoing queue in a database table that gets updated as part of that transaction, and can then be separately drained to whatever queue system you like.
This is a really important point. I often end up using a combination of Postgres and SQS since SQS makes it easy to autoscale the job processing cluster.
In Postgres I have a transaction log table that includes columns for triggered events and the pg_current_xact_id() for the transaction. (You can also use the built-in xmin of the row, but then you have to worry about transaction wraparound.) Inserting into this table triggers a NOTIFY.
A background process runs in a loop. Selects all rows in the transaction table with a transaction id between the last run's xmin and the current pg_snapshot_xmin(pg_current_snapshot()). Maps those events to jobs and submits them to SQS. Records the xmin. LISTEN's to await the next NOTIFY.
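A rough sketch of that arrangement (assuming PostgreSQL 13+ for the xid8 functions; table and column names are made up, and :last_xmin stands for the value recorded on the previous pass):

CREATE TABLE event_log (
    id      bigserial PRIMARY KEY,
    event   jsonb NOT NULL,
    xact_id xid8  NOT NULL DEFAULT pg_current_xact_id()
);

-- drain pass: rows written by transactions committed since the last pass
-- and no longer in flight according to the current snapshot
SELECT id, event
FROM event_log
WHERE xact_id >= :last_xmin
  AND xact_id <  pg_snapshot_xmin(pg_current_snapshot());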
Oban is able to run over 1m jobs a minute, and the ultimate bottleneck is throttling in application code to prevent thrashing the database: https://getoban.pro/articles/one-million-jobs-a-minute-with-...
My approach was:
- Accept the inbound call
- Generate a 20 character random string (used as a signature)
- Execute a sql query that selects the oldest job without a signature, writes the signature, and returns the primary key of the job that was updated (sketched at the end of this comment).
- If it errors for any reason, loop back and attempt again, but only 10 times, since beyond that some underlying issue likely exists (10 collisions is statistically improbable for my use case)
- Read the row for the primary key returned by that sql query, comparing its signature to my random one.
- If a hit, return the job to the caller
- If a miss, loop back and start again, incrementing attempts by 1.
The caller has to handle the possibility that a call to this web service won't return anything, either due to no jobs existing, or the collision/error threshold being reached.
In either case, the caller backs off for its configured time, then calls again.
Callers are usually in 'while true' loops, only exiting if they get an external signal to close or an uncontrolled crash.
If you take this approach, you will have a function or a web service that converts the SQL table into a job queue service. When you do that, you can build metrics on the amount of collisions you get while trying to pull and assign jobs to workers.
I had inbuilt processes that would sweep through jobs that were assigned (had a job signature) and weren't marked as complete, it actioned those to handle the condition of a crashed worker.
There are many many other services the proper job queues offer, but that usually means more dependencies, and code libraries / containers, so just build in the functionality you need.
If it is accurate, fast enough, and stable, you've got the best solution for you.
/edited for formatting
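A minimal SQL sketch of the claim step in the approach above (column names are made up; the 20-character signature comes from the caller):

UPDATE jobs
SET signature = 'abcd1234efgh5678ijkl'   -- the caller's random signature
WHERE id = (
    SELECT id
    FROM jobs
    WHERE signature IS NULL    -- oldest job nobody has claimed yet
    ORDER BY created_at ASC
    LIMIT 1
)
RETURNING id;   -- read this row back and compare signatures to confirm the claim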
It seems foolish. I am a big fan of “use the dumbest tool”, but sometimes engineers take it too far and you’re left with the dumbest tool with caveats that don’t seem worth it given the mainstream alternative is relatively cheap and simple.
Basically I’m saying that there are just a number of potential footguns when using files as queues - I speak from experience! - which are trivially taken care of by a database, especially if you have one already.
I’m not saying that it’s not possible, just that for non trivial applications, it’s certainly more complex than just an atomic file move.
Storing the job state in the DB means you can query state nicely. It's not going to exactly show the state of things but it's helpful for working through a live incident (especially when most job queues just delete records as work is processed).
And if you make all the background tasks idempotent anyways then you're almost always safe with running a thing like "send a job to the task queue to handle this job".
If you rely _just_ on message queues, there are a lot of times where you can have performance issues, yet have a lot of trouble knowing what's going on (for example, rabbitMQ might tell you the size of your queues, but offer little-to-no inspection of the data inside them).
Unfortunately, Kafka carries with it enough complexity (due to the distributed nature) that it ends up not being worth it for most use-cases.
Personally I'd love something similar that's easier to operate. You'd probably be able to handle hundreds (if not thousands) of events per second on a single node, and without distributed complexity it'd be really nice.
And yes, in theory you could still use postgres for this (and just never delete rows). And maybe that's the answer.
Use the right tool for the job.
Having transactions is quite handy.
https://wiki.postgresql.org/wiki/SkyTools
I did a few talks on this at Sydpy as I used it at work quite a bit. It's handy when you already have postgresql running well and supported.
This said, I'd use a dedicated queue these days. Anything but RabbitMQ.
So please, go on and back up your bold statements with some specifics. Why specifically is it not OK to use a database as a message queue?
You can use locks to effectively break the queue into sub queues so that each sub queue is only being processed by 1 worker. Then you can order that sub queue.
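One way to sketch that with advisory locks (hashtext just turns the sub-queue name into an integer lock key; names are made up):

-- only one worker at a time wins the lock for this sub-queue
SELECT pg_try_advisory_lock(hashtext('subqueue:customer-123'));

-- the winner drains that sub-queue strictly in order
SELECT *
FROM tasks
WHERE subqueue = 'customer-123'
ORDER BY created ASC;

SELECT pg_advisory_unlock(hashtext('subqueue:customer-123'));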
I wish the industry was even half as concerned with efficiency as it was with scale. Bitcoin? Electron? 5MB web pages? 5/10/25GB downloads to run 20yr old CD-sized games on modern software?
> The task row will not be deleted if sendEmail fails. The PG transaction will be rolled back. The row and sendEmail will be retried.
For example, I run the above query to grab a queued email, send it using mailgun, then COMMIT. Nothing is changed in the DB unless the email is sent.
Would you mind elaborating on this? I'd be happy for others to chime in with their experiences/opinions, too.
Tangible example:
We have a video transcoder queue. The state of the video model in our db can change as the video is being finalized in various ways. The transcoder generates thumbnails and assets from the video and also updates its state in the db. So we store job information in the message about what thumbnails we want to generate and the video ID but nothing else. This allows us to look up the video row, see if the same media was already transcoded from the video (and cancel the job), and, if not, run the job and update the video row.
Also (and I know you’re not saying this), but I’ve never understood the argument that keeping queues in Postgres leads to higher data integrity via transaction guarantees. The job is still running on another process outside of the db. The only time this could be true is if the job itself mostly updates state in the db, in which case it’s the small minority of queued workloads (with the majority needing to do non-db compute work).
Bad experiences? If so, was it the tech itself or surrounding stuff? (aka maybe docs, maybe community, etc)
Recently I had to stop using it because after a while all NOTIFY/LISTENS would stop working, and only a database restart would fix the issue https://dba.stackexchange.com/questions/325104/error-could-n...
Taking on problems you don't (and will never) have because some vanishingly small minority has experienced them is nuts. Over-engineering is as incorrect as under-engineering. The correctly sized, correctly complicated answer is what we're after.
It works great until it doesn't, and the way it breaks puts you in a state that's very difficult to recover from. And if your excuse for using a database as a queue was that you were already running a database, that cuts both ways: congratulations, your queue mess has now brought down your primary datastore too.
Just don't put your queue tables/logic in the same DB instance as your datastore. There are still a lot of benefits to using the same tech even if you have segregated instances.
It's worth noting that these issues might have been due to my improper configuration. Nevertheless, if the configuration process is fraught with pitfalls, that's problematic in itself. I've had these experiences more than once.
Additionally, I found a critical race condition in the Python library, rendering it practically unusable for me. I submitted a bug report with a minimal example demonstrating the issue. I considered fixing it myself, but since using RabbitMQ wasn't crucial for my project, I switched to ZeroMQ, which didn't require a broker. The issue was acknowledged and fixed about a year later. At the time, I had to assume that nobody else was using the Python bindings.
Three years ago, I worked on a project that used the software for a Celery queue. Messages would occasionally go missing, although this could have been a configuration issue on our part. Ultimately, we replaced it with a Redis queue (not the best practice, I admit) and didn't look back. This was for a lower-availability use case where a single instance of Redis sufficed.
I agree, primary reason being that if you're in the cloud (thus this applies to a lot of people but obviously not everyone), all the cloud providers have extremely easy to use, and cheap, hosted queueing tech. Even if you're worried about vendor lockin, queueing primitives are so small (basically push and pop), that it's relatively easy to write things in a way so it would be easy to migrate if necessary.
This is similar to saying, 'if I mess up all the tables in one database I wreck the rest'. Just my opinion, but this is not actually a thing in databases.
Maybe compromised the performance of one database due to another loading things up? I think databases are developed with this as an important consideration. I can't say I have seen this, but admit it's a possibility.
Also, if you run one postgres, you won't have an issue running another if you have the experience in production.
If you have large teams, for sure, pick a proper queue that someone in the team knows well.
Well, maybe. Postgres is large and has lots of features. NOTIFY/LISTEN is a distinct thing with its own quirks, and just because you've been running a postgres database up until now doesn't mean you're going to know about the admin side of these other features.
If you mess up the tables in one database it doesn't affect others, but if you lock up the server where it can't respond to queries, that affects every database running on that server.
> Also, if you run one postgres, you won't have an issue running another if you have the experience in production.
We're talking about using a different feature that you presumably haven't used before, so you won't necessarily know about the admin side of that.
I know what you mean. The catch is figuring out 'the correctly sized' answer. Uncertainty is a huge factor here.
I think a lot of projects that are called 'over-engineered' have the benefit of hindsight. At least some of such projects were rationally designed based on what people knew at the time. There is often significant uncertainty. When engineering resources are hard to acquire, the powers-that-be often want to 'get it right' the first time. Translated: over-engineer it, because we don't know when we'll get resources for the next iteration.
I suppose my point is this: the next time somebody finds some 'over-engineered' code, it can be insightful to learn about the project history. (Example: some projects have overly optimistic prediction of usage. This perhaps suggests that pessimism is one antidote to over-engineering.)
One endpoint accepts work to a named queue, writes it to a file in an XFS directory. Another locks a mutex, moves the file to an in progress directory and unlocks the mutex before passing the content to the reader. A third and final endpoint deletes the in progress job file. There is a configurable timeout, after which they end up at a dead letter box. I am simplifying only a little bit. It's a couple hundred lines of Go.
The way this is set up means a message will only ever be handed to one worker. That simplifies things a lot. The workers ask for work when they want it, rather than being constantly listening.
It took a little tuning but we process a couple billion events a day this way and it's been basically zero maintenance for almost 10 years. The wizards in devops even figured out a way to autoscale it.
I offer this correction to the author: it is "disabuse" not "disimbue".
The industry standard for real-time was anything under 30s end-to-end. We hit many scaling limits as our platform grew. We worked on improving the system instead of looking at postgresql as a problem. We measured and measured and postgresql was never the problem when it came to scaling. Most of the problems that existed was mostly due to design decisions in the system or some technical debt that needed to be solved. When we solved those problems the system flew.
There was one incident where our PostgreSQL instance CPU usage went up to 100% during peak hours and would cause problems in the system. We wondered why, there was a lot of debate around the topic, PostgreSQL doesn't scale, we need a larger instance, or move to a dedicated queue system etc...
I didn't believe the hypothesis that PostgreSQL didn't scale, so I conducted a performance audit on the system myself and discovered that one of the engineers had forgotten to add an index on one of the columns of a major table with > 60m records. There was a lot of back and forth between engineers about whether we should add the index. I simply ran a small experiment, and PG's analyze output clearly showed a missing index on one of the key tables. After we added a single line of code and deployed the index, DB CPU usage during peak hours would not even exceed 20%, and we had to scale down our DB to save some $$.
PostgreSQL backed queue system does work.
I've been building distributed systems at large and small scales for over a decade. Before building systems with Elixir / Phoenix I also used to work with Rails, where the default, as mentioned in this article, is to just adopt Redis. While Redis does add overhead when it comes to managing the infrastructure, it also worked well.
Having been through both, I'd say a simple PostgreSQL-backed solution is the clear winner when it comes to not having to manage a 3rd dependency when you already have a complex system. There was no 3rd or 4th piece of the puzzle to theorize, hypothesize, or experiment with. You had just 2 pieces: it is either your application or your database that's the problem. That simplicity saved us a lot of time on debate and experiments, and it also squashed any dogmatic values and opinions that just didn't hold water.
We always wanted to open source it, but we got bought out by a big and very IP protective company before we got the chance.
The key retry problem is "What happens when a worker crashes?".
RabbitMQ solves this problem by tying "unacknowledged messages" to a tcp connection. If the connection dies, the in-flight messages are made available to other connections. This is a decent approach, but we hit a lot of issues with bugs in our code that would fail to acknowledge a message and the message would get stuck until that handler cycled. They've improved this over the past year or so with consumer timeouts, but we've already moved on.
The second problem we hit with RabbitMQ was that it uses one-erlang-process-per-queue and we found that big bursts of traffic could saturate a single CPU. There are ways to use sharded queues or re-architect to use dynamically created queues but the complexity led us towards SQS.
Sidekiq solves "What happens when a worker crashes?" by just not solving it. In the free version, those jobs are just lost. In Sidekiq Pro there are features that provide some guarantees that the jobs will not be lost, but no guarantees about when they will be processed (nor where they will be processed). Simply put, some worker sometime will see the orphaned job and decide to give it another shot. It's not super common, but it is worse in containerized environments where memory limits can trigger the OOM killer and cause a worker to die immediately.
The other issue with Sidekiq has been a general lack of hard constraints around resources. A single event thread in redis means that when things go sideways it breaks everything. We've had errant jobs enqueued with 100MB of json and seen it jam things up badly when Sidekiq tries to parse that with a lua script (on the event thread). While it's obvious that 100MB is too big to shove into a queue, mistakes happen and tools that limit the blast radius add a lot of value.
We've been leaning heavily on SQS the past few years and it is indeed Simple. It blocks us from doing even marginally dumb things (max message size of 256KB). The visibility timeout approach for handling crashing workers is easy to reason about. DLQ tooling has finally improved so you can redrive through standard aws tools. There are some gaps we struggle with (e.g. firing callbacks when a set of messages are fully processed) but sometimes simple tools force you to simplify things on your end and that ends up being a good thing.
Can you elaborate more on this? How do the workers know when they have to process a new job?
Also, am I right in assuming this is typically a single node setup only, as all the files are mounted on a non "share-able" XFS disk?
If there's no work for them there's a small timeout and they ask for more. Simple loop. It's all part of a library we built for building workers. For better or worse, it's all done over http.
You are right, though, it is one XFS volume per queue instance.
We just run multiple instances (EC2) on a load balancer. Each instance of the queue gets its own set of workers though, so the workers know the right server to report done to.
We want a way to have a single pool of workers, rather than a pool per queue instance, and have them talk to the load balancer rather than directly, but we haven't come up with a reasonable way to do that.
Anyone know if there are Celery or Celery-like tools that support Postgres as a broker?
As a side-note, if you want a simple no-frills task scheduler, APScheduler is a dead simple option. It's even more limited than the solution described in OP (you can only run one worker, so it's not distributed at all) but often it is all you need, especially for toy projects.
This also tells how important competence and knowledge of the system is. People that came in new and didn’t know the system like you do probably lacked the confidence/skills to just “get in” like that.
I've been using Sidekiq for 11+ years in production and I've never seen this happen. Sidekiq (free version) has a very robust retry workflow. What are you talking about here?
Been meaning to build an Orleans stream provider for Postgres.. I believe that's the main missing component that would allow everything to "JustWork" with Postgres until you outgrow it.
On the other hand, I have done a few experiments with postgres LISTEN/NOTIFY. While the feature seems nice at first glance, I concluded that it wasn't worth it for our use cases; maybe it is different in other languages, but on the JVM you have to dedicate a thread to polling for these notifications, which also keeps a connection busy.
What I ended up doing is leveraging akka-stream to stream the queue data directly from the db, which makes it simple to define throttling rules, this is super simple and effective.
For one such project, the message 'priority' feature was a life saver and a feature that is not super common in competing solutions.
How is it different from putting multiple queues on the same redis, where when one queue is locked up, the other queues are affected?
If that's a real risk, you can always put them into different instances. The solution is exactly the same for redis or postgresql
So if you update your offset to 4 before the transaction for 3 commits, you'll lose 3 forever (unless you control for this, eg by periodically detecting & requeueing orphaned jobs, or by using a strictly serial/gap free integer rather than the built in auto incrementing type).
Because different software has different requirements. Not having an external service requirement other than Postgres might be a feature of an on-prem/b2b appliance.
Because some software may be projected never outgrow the capabilities of Postgres, and if it does moving to another service can be made very easy.
Because you want a transitional job system and the simplicity of doing it in Postgres.
Do you have any experience with NATS, and how would you compare it to RMQ/SQS?
The authors claim it guarantees exactly-once delivery with its JetStream component, and it looks very alluring from the documentation, but looks can be deceiving.
1. It doesn't scale (there it is again)
2. Queuing with Postgres is super fiddly to get right
3. You're hacking a queue on top of something that isn't a queue
4. Running redis or rabbit isn't all that complicated
#1 as TFA argues, premature concern about scaling is the root of so much needless complexity. You should make scaling decisions like this: 1) assume boring tech like PG will satisfy your needs; 2) if it demonstrably does not, then find something that does.
#2 is obviously true; just look at this thread. There are battle-tested queuing libraries in most popular languages, but you do have to dig into the details of how they interact with things like pgbouncer.
#3 I guess so? But if the queue abstraction works and isn't leaky, what does it matter?
#4 can be debated. For several years I've been running a moderately complicated setup with 2 databases, redis, and kafka. There's no way I'm going to add another piece of tech unless there's no other choice. The cognitive cost is too high.
The main debate is in the tradeoff between #2 and #4. Personally, if I can use an existing piece of tech to solve a problem to avoid having to ops another piece of tech, then I'm going to do that every time.
It is all configurable, but I've scaled up to hundreds of workers at a time to blast through tasks and it wasn't expensive at all.
Workers being an HTTP endpoint makes them super easy to implement and even better... write tests for.
Admittedly I probably shouldn't have used it the way I did. I dumped many millions of tasks into it, then fanned out processes pulling from that queue that took a variable amount of time to run. Some ran in seconds, some hours.
I had picked RabbitMQ because I wanted that queue to be durable and resist workers dying or being restarted. However, long-lived tasks like this are not really what it was designed for (in my opinion). I kept running into issues where it would take a long time to restart, or would stop answering connections and need a restart to continue. I ended up having to write monitoring code to check for this and handle it, to make it slightly reliable.
I'm sure it works well for smaller, short-lived messages, but considering the issues I bumped into I would be hesitant to try it. I'd probably reach for redis first, with wrappers allowing me to swap out to any other queue as required.
Jeez. What was the idea behind not adding? Disk space I presume?
In the beginning you can do a naive UPDATE ... SET, which locks way too much. While you can make your locking more efficient, doing UPDATE with SELECT subqueries for dequeues and SELECT FOR UPDATE SKIP LOCKED, eventually your dequeue queries will throttle each other's locks and your queue will grind to a halt. You can try to disable enqueues at that point to give your DB more breathing room but you'll have data loss on lost enqueues and it'll mostly be your dequeues locking each other out.
You can try very quickly to shard out your task tables to avoid locking and that may work but it's brittle to roll out across multiple workers and can result in data loss. You can of course drop a random subset of tasks but this will cause data loss. Any of these options is not only highly stressful in a production scenario but also very hard to recover from without a ground-up rearchitecture.
Is this kind of a nightmare production scenario really worth choosing Boring Technology? Maybe if you have a handful of customers and are confident you'll be working at tens of tasks per second forever. Having been in the hot seat for one of these I will always choose a real queue technology over a database when possible.
Are there any open source implementations of Task Queues? It feels like something that has been missing for years.
The code that does this is maybe 100 lines at most. It’s very effective especially if you deploy your app in kubernetes where you can expect instances to be ephemeral. It’s one of the components of my apps that has never needed any updates since I first wrote it circa 2017.
Edit: To clarify, I mean `SELECT id WHERE used = 0` followed by `UPDATE ... SET used = 1 WHERE id = ... AND used = 0`
It's more like a few thousand per second, and enqueues win, not dequeues like you say... on very small hardware without tuning. If you're at tens of tasks per second, you have a whole lot of breathing room: don't build for 100x current requirements.
https://chbussler.medium.com/implementing-queues-in-postgres...
> eventually your dequeue queries will throttle each other's locks
This doesn't really make sense to me. To me, the main problem seems to be that you end up having a lot of snapshots around.
Based on this sentence, I interpreted that part as representing that the engineers did not believe the missing index was causing the problem (until the experiment was run).
What technology stack are you working in which doesn't support postgres?
> but in a microservices world it doesn't work where jobs can span multiple systems
The point of the queue system is to be able to span said microservices. You can have an OCaml service picking up from one queue, processing, then writing into another queue. That queue could then be processed by a TypeScript service.
This is an output for our oldest instance (legacy system running Ubuntu 12)
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
stats
OK 952
---
current-jobs-urgent: 0
current-jobs-ready: 765
current-jobs-reserved: 2
current-jobs-delayed: 946
current-jobs-buried: 0
cmd-put: 1188640739
cmd-peek: 2718986
cmd-peek-ready: 5052
cmd-peek-delayed: 797
cmd-peek-buried: 797
cmd-reserve: 0
cmd-reserve-with-timeout: 3245006799
cmd-delete: 1188639093
cmd-release: 43276760
cmd-use: 1137988211
cmd-watch: 166122
cmd-ignore: 166077
cmd-bury: 0
cmd-kick: 1
cmd-touch: 0
cmd-stats: 1260
cmd-stats-job: 43134828
cmd-stats-tube: 53942209
cmd-list-tubes: 3251625
cmd-list-tube-used: 0
cmd-list-tubes-watched: 0
cmd-pause-tube: 988
job-timeouts: 14084
total-jobs: 1188640739
max-job-size: 1048576
current-tubes: 44
current-connections: 63
current-producers: 20
current-workers: 47
current-waiting: 41
total-connections: 14996583
pid: 3959
version: 1.4.6
rusage-utime: 170303.331293
rusage-stime: 399435.543161
uptime: 321658179
binlog-oldest-index: 90539
binlog-current-index: 90983
binlog-max-size: 10485760
This link is simply raw enqueue/dequeue performance. Factor in workers that perform work or execute remote calls and the numbers change. Also, I find when your jobs have high variance in times, performance degrades significantly.
> This doesn't really make sense to me. To me, the main problem seems to be that you end up with having a lot of snapshots around.
The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
> don't build for 100x current requirements
What happens if you get 100x traffic? Popularity spikes can do it, so can attacks. Is the answer to just accept data loss in those situations? Queue systems are super simple to use. I'm counting "NOTIFY/LISTEN" on Postgres as a queue, because it is a queue from the bottom up.
Then you probably have to write complicated queries or use partitioning of some sort.
Or just stick to one thread polling the messages.
An option could be use a second connection and a separate transaction to insert data in the queue table.
300 LOC or so and fairly easy to test. Wouldn't take that approach every time, but definitely worth it when you're aiming for a simple architecture.
These don't occur on the database server, though... This merely affects the number of rows currently claimed.
> The dequeuer needs to know which tasks to "claim", so this requires some form of locking. Eventually this becomes a bottleneck.
These are just try locks, though-- the row locks are not contended. The big thing you run into is having lots of snapshots around and having to skip a lot of claimed rows for each dequeue.
> What happens if you get 100x traffic? Popularity spikes can do it, so can attacks.
If you get 100x the queueing activity for batch jobs, you're going to have stuff break well before the queue. It's probably not too easy to get 100x the drain rate, even if your queue system can handle it.
This scales well beyond 100M batch tasks per day, which gets you to 1M users with 100 tasks/day each.
> With a Postgres queue, if you insert the job to send the email and then in a later part of the transaction, something fails and the transaction rollbacks, the email is never queued to be sent.
This is true - definitely worth isolating what should be totally separate database code into different transactions. On the other hand, if your user is not created in the DB, you might not want your signup email. Just depends on the situation.
Oban's been great, especially if you pay for Web UI and Pro for the extra features [3]
The main issue we've noticed though is that due to its simple fetching mechanism using locks, jobs aren't distributed evenly across your workers due to the greedy `SELECT...LIMIT X` [2]
If you have long running and/or resource intensive jobs, this can be problematic. Let's say you have 3 workers with a local limit of 10 per node. If there are only 10 jobs in the queue, the first node to fetch available jobs will grab and lock all 10, with the other 2 nodes sitting idle.
[1] https://github.com/sorentwo/oban [2] https://github.com/sorentwo/oban/blob/main/lib/oban/engines/... [3] https://getoban.pro/#feature-comparison
But I think a lot of it is also about knowledge and documentation. If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it. Yes maintenance is higher, but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source.
If I want to use NOTIFY in postgres? I googled "SQLALchemy notify listen postgres" and I find a few unanswered stackoverflow questions and a github gist that has some code but no context.
I would honestly love to use this approach for a side project, but I don't have 2-3 days to figure it out on my own. The direct choice for me might seem to be
* simple, but not scalable (ie just use postgres)
* complex, but scalable (ie redis, sqs, whatever)
and then it's a tradeoff, and the argument goes that I am blinded by cool tech and FAANG and I'm choosing complex but scalable, even though I don't need scalable.
But taking into account guides and other resources, the choice for me is actually
* complex and not scalable (this, because I don't know how to implement it and I can't predict what pitfalls I might face if I try)
* simple and scalable (what everyone actually does)
and that makes the engineer's choice to follow faang look a lot more reasonable.
When grabbing a new message it selects "Available or (Locked with Updated timestamp older than the configured timeout)". If successful, it immediately tries to set the Locked status and Updated timestamp and bumps the Version counter, where the previous values of Status and Version have to match. If the update fails, it retries getting a new message.
If the Version counter is too high, it moves the message to the associated dead-letter table and retries getting a new message.
This isn't for high performance. I tested it and got 1000 messages/sec throughput with a handful of producers and consumers against a test db instance (limited hardware), which would be plenty for us.
I wrote it to be simple, so we could easily move to something AMQP-ish like RabbitMQ or Azure Service Bus when needed. Overall quite easy to implement and it has served us well so far.
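Roughly, as a sketch with made-up names, the claim described above is an optimistic update:

-- candidate: an available message, or a locked one whose lock has gone stale
SELECT id, status, version
FROM messages
WHERE status = 'Available'
   OR (status = 'Locked' AND updated < now() - interval '5 minutes')
ORDER BY created
LIMIT 1;

-- claim it: only succeeds if nobody else changed status/version in the meantime
UPDATE messages
SET status = 'Locked', updated = now(), version = version + 1
WHERE id = $1 AND status = $2 AND version = $3;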
Also: you may think that you may one day want to be hired by a FAANG.
Bottom line, there is no silver bullet.
Why would both transactions see `used = 0`? The DB server tries to isolate transactions and actively hides effects of other transactions that have not committed yet.
_Ideally_ the queuing technology is abstracted from the job-submitters/job-runners anyway. It's a bit more work if multiple services are just writing to the queue table directly.
I agree that the _moment_ the system comes to a screeching halt is definitely not fun.
The nice thing about "boring" tech like Postgres is that it has great documentation. So just peruse https://www.postgresql.org/docs/current/sql-notify.html . No need for google-fu.
SQLite does not support SKIP LOCKED.
If you want a job there, very relevant.
> I tend to reject FAANG recruiters because Leetcode
I understand the pain of leetcode interviews. They’re terrible. But optimizing your career based on the interview process seems… backwards?
FAANG companies (for example) are very relevant if you want to make a lot of money and live in Silicon Valley without being a successful founder/VC. Apple farmers… not so much. If you live in Tokyo, then FAANG companies might be less relevant.
Either way, doesn’t seem like the interview is where you should draw the line.
Also, postgres partial indexes can be quite helpful in situations where you want to persist and query intermediate job lifecycle state and don't want multiple rows or tables to track one type of job queue
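For example (a sketch), indexing only the rows workers actually poll keeps the index small even as the table accumulates finished jobs:

CREATE INDEX jobs_pending_idx
ON jobs (created_at)
WHERE status = 'pending';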
AWS white papers and engineering blogs tend to give me everything I need in one place, and I don't think there are any for apps built with NOTIFY.
UPDATE ... SET status = 'locked' ... RETURNING message_id
Or you can just use an IMMEDIATE transaction, SELECT the next message ID to retrieve, and UPDATE the row.
On top of that, if you want to be extra safe, you can do:
UPDATE Queue SET status = 'locked' WHERE status = 'ready' AND message_id = '....'
To make sure that the message you are trying to retrieve hasn't been locked already by another worker.
[0]: https://github.com/litements/litequeue/
[1]: https://github.com/litements/litequeue/blob/3fece7aa9e9a31e4...
This is what kills you if you're a small startup. Of course it gives you a lot too. But if you're belly up then it doesn't matter.
Of course go for whatever solution gives you the most benefits while not distracting you too much from your main goal.
I've seen a startup where devs spent around 80% of the time fighting their tools and infrastructure. They had a 3 month runway and today there's a massive hole at the end of that runway. I still shudder from just the thought of it.
https://www.postgresql.org/docs/current/transaction-iso.html...
Usually what I do is pull the records that were updated least recently (as in, they should be ahead of the queue). So if a previous worker locked the oldest X records, the second worker will pull the next batch, because the condition will exclude the previously updated (locked) records. There's a lot of flexibility you can add with just these controls: schedule, frequency, batch size, number of workers.
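Something like this sketch (names assumed), where the timestamp condition both orders the queue and excludes recently claimed rows:

UPDATE jobs
SET updated_at = now()   -- claiming bumps the timestamp, pushing the rows to the back
WHERE id IN (
    SELECT id
    FROM jobs
    WHERE status = 'pending'
      AND updated_at < now() - interval '10 minutes'
    ORDER BY updated_at ASC
    LIMIT 100
)
RETURNING *;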
Granted now you need 3 services instead of 1. I personally don't find the maintenance cost particularly high for this architecture, but it does depend on what your team is comfortable with.
For instance, if you use postgres with a low load, it is almost trivial to migrate schemas, add new constraints, do analytics etc.
If you use SQS, Cassandra, whatever, then you now get scalability/availability but it becomes much more time-consuming to change things if you figure out that your original design doesn't work. Say the business comes and says "please add constraint X: all users of type foo must never combine value bar at the same time."
It is possible to implement that without postgres, but it is not easy or simple, especially if you need to make changes.
Therefore, my take is that you either use postgres to stay flexible or you use both postgres and something else on top of it when you know that you won't have to change things. Of course this means additional infrastructure/maintenance overhead.
In the end it's always a trade-off, you just need to know when to trade which thing off against what.
There's a (sort of) objective trade-off to be made, but another dimension is how familiar you are with the solution and/or how quick can you implement it using documentation and examples.
If you happen to know exactly how to create a horizontally scalable microservice based hairball with nodejs, then maybe you are quicker with that than with some traditional django monolith using a nicely normalized sql database (or whatever).
In a startup, you are almost always squeezed for time, so making the objectively right tradeoff for your context is usually secondary to the simple question of 'when can you ship?' If the scalable-yet-inflexible option is what stack overflow abundantly recommends and documents, maybe that is quicker to get done now, whatever the consequences are in the longer run.
There is actually another possibility: there must be a way to check whether the receiving system has received the message. But this only works if there are no "rogue" senders.
This is a valid comment. I’ve chosen Postgres in the past for the features, not the performance. For example guaranteed at most once delivery (via row locks) and filtering of jobs based on attributes (it’s a database after all).
Especially given the emphasis on YAGNI, you don’t need a UUID primary key, with all the problems it brings for B+trees (the thing an RDBMS is built on), nor do you need the collision resistance of SHA256 - the odds of you creating a dupe job hash with MD5 are vanishingly small.
As to the actual topic, it’s fine IFF you carefully monitor for accumulating dead tuples, and adjust auto-vacuum for that table as necessary. While not something you’d run into at the start, at a modest scale you may start to see issues. May. You may also opt to switch to Redis or something else before that point anyway.
EDIT: if you choose ULID, UUIDv7, or some other k-sortable key, the problem isn’t nearly as bad, but you still don’t need it in this situation. Save yourself 8 bytes per key.
With your mentioned list, three of them are Python, so that significantly reduces the breadth.
It's even worse than you say, though. As someone who has used neither Postgres nor Redis for queueing, how am I supposed to know which is the "simple" solution here and whether it really solves my problem?
Almost everyone uses solution X. A few people are saying "no, just use solution Y, it's obviously enough and far simpler". Even if it is far simpler, how am I supposed to know whether there are some hidden gotchas here?
Much safer to bet on technology that is proven to work, given that large amounts of people are using it in production for this purpose.
I’ve experimented with making this easier via libraries that provide high-level APIs for using Postgres as a queue and manage the schemas, listen/notify, etc for you: https://github.com/adriangb/pgjobq
I don't think that there's anything wrong with using a database as a queue, however, I think that there probably could have been better ways to get across the idea, rather than just dismissing an honest opinion as BS. I don't necessarily agree with all of what was said there, but at the same time I can see why those arguments would be reasonable: >>20022572
For example:
> Because it is hacky from the perspective of a distributed system architecture. It's coupling 2 components that probably ought not be coupled because it's perceived as "convenient" to do so. The idea that your system's control and data planes are tightly coupled is a dangerous one if your system grows quickly.
To me, this makes perfect sense, if you're using the same database instance for the typical RDBMS use case AND also for the queue. Then again, that could be avoided by having separate database instances/clusters and treating those as separate services: prod-app-database and prod-queue-database.
That said, using something like RabbitMQ or another specialized queue solution might also have the additional benefit of bunches of tutorials and libraries, as well as other resources available, which is pretty much the case whenever you have a well known and a more niche technology, even when the latter might be in some ways better! After all, there is a reason why many would use Sidekiq, resque, rq, Hangfire, asynq and other libraries that were mentioned and already have lots of content around them.
Though whether the inherent complexity of the system or the complexity of your code that's needed to integrate with it is more important, is probably highly situational.
If someone says "I choose X over Y because I used X before (or because X is better documented" then fair enough - but I rarely hear that as an argument when choosing "FAANG-technology".
1. Many queueing solutions struggle with at most once delivery. I've used a queue I built on top of RDBMS to solve this before. Ironically, when I was a junior we used a key/value store to lock certain messages for at most once delivery, which doubly compounds my next point.
2. A queue is more infrastructure to manage, and hard to troubleshoot infrastructure at that.
3. Many queueing systems struggle with deserialization into native types, which makes the code on either end unnecessarily obtuse. Using RDBMS queues makes your serialization as good as your drivers.
If they are being technically precise, queue isn’t the correct term, but language changes with context and time. Either way the implementation isn’t wrong if strict start order has been considered and isn’t important.
Careful monitoring and tuning of parameters mentioned by the sibling comment to you can help mitigate this, though.
Ultimately at scale, no, RDBMS shouldn’t be a queue. But most have a long way to go before they hit that point.
If you need to handle 70 requests/second, design for 700. If you need to handle 20 servers running batch jobs, design for 200 servers. If you're in a startup that grows 100% per year, you'll be at 8x scale in 3 years. So you'll have time to rewrite as you grow!
Out of three job systems I built, the first one tried to avoid SQL for "scalability." Then we hit a bunch of edge cases that needed transactional integrity, and we were in a world of pain.
My two more recent distributed job systems use PostgreSQL as a coordinator. They're literally built around SELECT FOR UPDATE SKIP LOCKED. One of them routinely controls 350 workers and the other does elaborate prioritization for thousands of jobs. Both of them will continue to run just fine until they're earning millions of dollars a year—for example, the one controlling 350 workers should scale to about 2,000 CPUs with a little work.
Hyperscale technologies are fairly cheap right up until you discover you need transactions. At that point, faking transactional semantics on top of an eventually consistent data store becomes an engineering nightmare.
So sit down, and do the math. If your company was earning $100 million/year, how big would your distributed system need to be? Can you easily get a PostgreSQL instance big enough to handle that load? (Or could you just shard per client?) If so, strongly consider using PostgreSQL. It makes a hundred things trivial.
Throttle the inputs. Rate-limiting doesn't belong in the data layer.
While throttling due to organic popularity isn’t great, I’d argue the tradeoffs might be worthwhile. If it looks like the spike will last, stand up Redis during the throttling, double-write, and throttle down the Postgres queue until it’s empty. If you really need to, take a 15 minute outage to just copy data over.
This line of reasoning is desirable for FAANGS, but can bankrupt startups that need to move fast and get shit done.
Could it be because microservices and so-called "server-less" have been sold as cost-saving measures that increase the business' flexibility and decrease capital investment?
When... in reality a single deployed Docker container is way more manageable than a distributed system constructed with "lambdas" and requires fewer engineers in the long run?
What I'm trying to say is that FAANG cargo-culting is only part of why developers choose to build solutions that scale larger than they need. Another large part is the cloud development ecosystem writ-large and the consulting culture that has built up around it.
It's the same exact thing: most of the people do what is considered a "best practice" just to avoid any type of annoyance when something will break.
Advisory locks also exist, if you want to implement logic in the application to inform you of various row conditions without having the DB care about it.
But for the example given, you could do many things:
* Add some boolean columns for ack and complete. Performance due to cardinality of these will eventually start to suck if they’re being indexed, but by that point (millions of rows) you can have thought of another solution.
* Add ctime and atime columns, each of which can be handled by Postgres natively to update when created / written, respectively. This has the advantage of lending itself nicely to partitioning by date range, if that becomes necessary.
* Have three tables - available, in_progress, and completed. Use triggers or application logic to move entries in an atomic manner.
None of this is necessarily normalized, but if you’re receiving JSON payloads for the job, 1NF went out the window anyway.
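A sketch of the third option, moving a claimed row between tables in one atomic statement (table names made up):

WITH claimed AS (
    DELETE FROM jobs_available
    WHERE id = (
        SELECT id
        FROM jobs_available
        ORDER BY created ASC
        FOR UPDATE SKIP LOCKED
        LIMIT 1
    )
    RETURNING *
)
INSERT INTO jobs_in_progress
SELECT * FROM claimed;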
Good on you for actually empirically determining reality.
That's exactly what a queue means, not just in every day life, but specifically in computer science.
Everything is a tradeoff, optimize for the things that need optimizing, and determining what those are is the hallmark of a good engineer.
How does the system behave when the traffic rate is higher than what it was designed for or can currently handle? Because that number will always be there, even in a "scalable" system. One won't be able to add capacity at the same rate that work will increase.
Good thing I didn't listen to your advice... my DIY background task queue saved my website when Celery couldn't scale. Why are you against rolling your own task queue besides it seeming complicated?
https://wakatime.com/blog/56-building-a-background-task-queu...
Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I don't think it's ideal for everyone to be implementing bespoke, Postgres-backend (or any other queue for that matter) background job workers in their applications. There's a lot of nuance and implementation details to get wrong with background jobs, and for that reason I think background work should generally be done by more comprehensive, dedicated libraries or frameworks.
If every Rails application didn't have Sidekiq/Active Jobs and instead had bespoke background worker implementations, Rails applications would likely have a much less rosy reputation on account of their unreliability.
https://github.com/wakatime/wakaq
We currently process ~20 million tasks per day, and I don't have to worry about running VACUUM on my queue ;)
* Metrics, monitoring, alarming on depth, message age
* Autoscaling on your custom metrics
* Managing retries, dead lettering, backoff
* Managing the DB workload: it's update-heavy and may easily be more intensive than the rest of your app. You may need to repeatedly scale your tiny, startup-scale DB, causing customer disruptions, because of your homemade queue.
The arguments for it are either avoidance of lock-in, or the hand-wavy one in this article/many comments: "we're a small startup, we need to minimize complexity, and make things as simple as possible, maintenance will kill us!".
Lock-in makes sense (though other queue alternatives should still be considered), but the latter argument gets used to justify all kinds of harebrained, superficially-simple but actually-complicated engineering schemes. I generally put the PG skip locked approach in that bucket, particularly when the alternative on hand is Pubsub or SQS. If it's between a Postgres table and ActiveMQ I might feel more conflicted.
With the paid features to keep jobs from getting dropped, things can still be painful. We have a lot of different workers, all with different concurrency settings and resource limits. A memory-heavy worker might need a few GB of memory and be capped at a concurrency of 2, while a lightweight worker might only need 512MB and run a concurrency of 20. If the big memory worker crashes, its jobs might get picked up by the lightweight worker (possibly hours later), which will then OOM, and all 19 of its other in-flight jobs end up in the orphanage. And now your alerts are going off saying the lightweight worker is OOMing, and your team is scratching their heads because that doesn't make any sense. It just gets messy.
Sidekiq probably works great outside of a containerized environment. Many swear to me they've never encountered any of these problems. And maybe we should be questioning the containerization rather than Sidekiq, but ultimately our operations have been much simpler since we moved off of Sidekiq.
A lot of frameworks already have queue/job libraries with adapters (so you're not really locked in) and cloud providers are highly scalable and fault tolerant.
It seems silly to try to build into Postgres something that is already cheap and readily available, unless you find yourself in a situation where standing up additional infra is hard (embedded, certain on-prem deployments).
It's a good compromise but not suitable for every use case.
The thing I really don't like is that you need to be connected to the cloud even for local development and test.
but I can get set up with redis, SQS, any of the 'scalable' solutions within a few hours of copy-pasting commands and code and configuration from a reputable source [...] and that makes the engineer's choice to follow faang look a lot more reasonable.
I also agree with the linked article's overall point, but I think the specific "job queue" example from the article is actually a bad example because:
- "rolling your own" job queue is not rocket science but is nontrivial and easy to get wrong w.r.t. locking etc.
- the argument against taking additional dependencies is that now you have one more tool to master, understand, and manage. But my experience is that job queues like Sidekiq are not a significant overhead in terms of developer burden.
And that’s terrible advice, of course. You very likely will have time to scale things up (customer counts almost never increase dramatically from one day to the next), and even if you don’t, you’ll most likely never deliver a usable product if every component needs to be “scalable” from the beginning.
I love the article's point, and I tend to feel that the "chasing the cargo cult of 'scale'" is maybe the biggest problem I see in development teams today. It is certainly the biggest problem that I rarely hear anybody talking about.
Author here. I would say that my post is less targeted at someone like you (application developer, presumably) and more targeted at library developers.
I think the article might benefit from clarification on this point. Reading the HN comments, I see that I'm not the only person who came away with a misunderstanding there.
Again, I 100% love the overall point.
A small nitpick: FOR UPDATE SKIP LOCKED was mainstream (in a sense) way before it was copied from Oracle to Postgres. It's used under the hood in most non-trivial Oracle deployments, for things like replication or refreshing materialized views in parallel.
But, to make it work, you've got to know the database well enough to know that things like SELECT FOR UPDATE SKIP LOCKED exist in the first place. Which is a kind of knowledge that's getting quite rare these days, because more and more engineers grow up never having known a world where they aren't walled off from their DBMS's true capabilities by a heavyweight ORM.
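For readers who haven't run into it, the feature itself is tiny; a minimal sketch against a hypothetical jobs table (psycopg2 assumed):

    import psycopg2

    conn = psycopg2.connect("dbname=app")
    with conn, conn.cursor() as cur:   # the row lock is held until this transaction ends
        cur.execute("""
            SELECT id, payload FROM jobs
             ORDER BY id
             FOR UPDATE SKIP LOCKED
             LIMIT 1
        """)
        job = cur.fetchone()           # None: the queue is empty or every row is locked
        if job:
            # ... do the actual work here, then remove (or mark) the row ...
            cur.execute("DELETE FROM jobs WHERE id = %s", (job[0],))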
More generally, is there something like an "on-prem cloud" that just replicates, say, Cloud Tasks (but also other cloud APIs) using local compute as well as a local DB? For testing / development this would be very cool.
TTL typically deletes expired items within a few days. Depending on the size and activity level of a table, the actual delete operation of an expired item can vary. Because TTL is meant to be a background process, the nature of the capacity used to expire and delete items via TTL is variable (but free of charge). [0]
Because of that limitation, I would not use that approach. Instead I would use scheduled Lambdas to check for items every 15 minutes in a Serverless Aurora database and then add them to SQS with delays. I've had my eye on this problem for a few years and keep thinking that a simple SaaS that does one-shot scheduled actions would probably be a worthy side project. Not enough to build a company around, but maintenance would be low and there's probably some pricing that would attract enough customers to be sustainable.
[0] https://docs.aws.amazon.com/amazondynamodb/latest/developerg...
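A sketch of that scheduled-Lambda idea, assuming boto3 and psycopg2; the queue URL, table, and column names are invented, and the 15-minute figure mirrors the schedule above:

    import json
    import boto3
    import psycopg2

    sqs = boto3.client("sqs")
    QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/scheduled-actions"

    def handler(event, context):
        # Runs every 15 minutes: pull items due within the next window out of
        # Aurora and hand them to SQS, using DelaySeconds for finer precision.
        conn = psycopg2.connect("dbname=app")
        with conn, conn.cursor() as cur:
            cur.execute("""
                DELETE FROM scheduled_items
                 WHERE fire_at < now() + interval '15 minutes'
             RETURNING id, payload,
                       GREATEST(0, EXTRACT(EPOCH FROM fire_at - now()))::int
            """)
            for item_id, payload, delay in cur.fetchall():
                sqs.send_message(
                    QueueUrl=QUEUE_URL,
                    MessageBody=json.dumps({"id": item_id, "payload": payload}),
                    DelaySeconds=min(delay, 900),  # SQS caps DelaySeconds at 15 minutes
                )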
> If I want to copy FAANG or another startup, and set up an infinitely scalable queue-based architecture, I can find dozens of high quality guides, tutorials, white papers etc, showing me exactly how to do it.
I'm not sure about this either, though from reading typical developer blogs and listening to the hivemind, you do get the feeling that you must be scalable. Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
I find this definition has morphed from one meaningful to developers into one that queue implementations like to claim. I've learned this generally means "multiple inserts will be deduped into only one message in the queue".
The only guarantee this `exactly-once` delivery provides is that I won't have two workers given the exact same job. Which is a nice guarantee, but I still have to decide on my processing behavior and am faced with the classic "at most once or at least once" dilemma around partially failed jobs. If I'm building my system to be idempotent so I can safely retry partially failed messages it doesn't do much for me.
So yeah, I find a lot of the more complicated solutions to be simple, but mostly because it's well supported and not by just me.
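For what it's worth, the idempotency half of that trade-off doesn't have to be much code. A rough sketch, with a hypothetical processed_messages dedupe table and psycopg2:

    import psycopg2

    conn = psycopg2.connect("dbname=app")

    def apply_side_effects(cur, payload):
        ...  # the actual work, done in the same transaction as the dedupe insert

    def handle(message_id, payload):
        with conn, conn.cursor() as cur:
            # A unique message_id plus ON CONFLICT DO NOTHING makes redeliveries harmless.
            cur.execute(
                "INSERT INTO processed_messages (message_id) VALUES (%s) "
                "ON CONFLICT (message_id) DO NOTHING",
                (message_id,),
            )
            if cur.rowcount == 0:
                return  # already processed once; at-least-once delivery is now fine
            apply_side_effects(cur, payload)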
But my response would then be that this is a stupid example in the context of this whole submission, because the submission talks about Postgres, and trying to get Postgres to scale "infinitely", let alone fulfill other properties like extremely high uptime, is just... insane. No one in their right mind tries to do that with Postgres. It is one thing to do queueing with it, but "infinitely scalable" is a totally different thing.
Therefore I can only say: yeah, to set up "an infinitely scalable queue-based architecture" you should not use Postgres, and the author of the submission says the same thing.
> Devs often don't really know when (usually not) that becomes important and how far the vast majority of apps can go with monoliths in big boxes (quite far).
Right, they make the wrong trade-offs. That is exactly what I wanted to express with my response.
> There’s a good chance that you’re already using a relational database, and if that relational database is Postgres, you should consider it for queues before any other software
The point is, if you are already using Postgres, then the question becomes: should I use Postgres for queueing as well as everything else, or should I use Postgres for everything else and add a FAANG solution just for queueing on top of it?
Now, the FAANG solutions are great in certain ways: they let you scale a lot and give you extremely high availability. But that comes at a cost; for example, those solutions don't support transactions the way Postgres does. So if you need transactions (and often you don't know in advance how a startup's business will develop), you now have to build some technical workaround on top of the FAANG solution, which is much, much slower and more complicated than doing it in Postgres.
Even if you say that it's more difficult to set up and understand queueing in Postgres (and I agree), I would argue that in the end it is still faster, because you don't need to set up and maintain all the extra infrastructure (yeah, even if it runs in the cloud), unless this is a prototype where you don't care about security, documentation and all of that and will throw it away in the end anyway.
Your argument is that going with FAANG level designs saves time?
And the crux of your argument is that you're able to find a guide online?
I strongly suspect you don't have a healthy respect for complexity.
So pg_try_advisory_lock/pg_advisory_unlock can hold a lock across transactions, while FOR UPDATE SKIP LOCKED can't; with the latter you either need to keep a transaction open for the duration of the job or use a status column plus a job timeout (and in Postgres you should not hold long transactions).
Basically we use C#, but we looked into https://github.com/que-rb/que, which uses advisory locks. Since our jobs take anywhere from 1 minute to 2 hours, it was a no-brainer to use advisory locks. It's just not the best fit if you have thousands of fast jobs per second, but for a more moderate queue where you have something like 10,000 jobs per minute/10 minutes/30 minutes and each takes 1 minute to 2 hours, it's fine.
We also do not delete jobs; we don't care about storage since the job table doesn't take up much space, and we have plenty of time to catch up at night since we are only in Europe.
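For anyone curious, here is a rough sketch of the advisory-lock shape (this is not Que's actual implementation; table and column names are invented, psycopg2 assumed):

    import psycopg2

    conn = psycopg2.connect("dbname=app")
    conn.autocommit = True  # no transaction needs to stay open while the job runs

    def claim_job():
        with conn.cursor() as cur:
            # pg_try_advisory_lock is session-scoped, so the claim survives across
            # transactions for as long as this connection holds it. Caveat: the
            # planner may call the lock function on rows it later discards, which
            # is why Que wraps this in a more careful recursive CTE.
            cur.execute("""
                SELECT id, payload FROM jobs
                 WHERE status = 'pending'
                   AND pg_try_advisory_lock(id)
                 LIMIT 1
            """)
            return cur.fetchone()  # None when there is nothing claimable

    def finish_job(job_id):
        with conn.cursor() as cur:
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
            cur.execute("SELECT pg_advisory_unlock(%s)", (job_id,))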
Did you find it in a credible source? Which one(s)?
I've found nothing credible in Merriam-Webster, Etymology Online, nor _any_ other source I've searched. There is at least one low-quality ad-serving site that credits ChatGPT with a definition.
I'm happy to be enlightened.
The usual trick I use is to have a `jobs.state` field containing "pending", "running", "done", or "error" (or whatever states that job system needs). I only hold the SELECT FOR UPDATE SKIP LOCKED lock long enough to:
1. Transition from "pending" to "running". Or a second time, to transition from "running" to either "done" or "error".
2. Store the current worker ID (often the Kubernetes pod name).
Then I can build a watcher that wakes up every 5 minutes, looks for "running" jobs with no corresponding Kubernetes pod, and marks them as "error". I try to never hold a lock for more than a second or two, and to never lock more than one job at once. This gets me 80% of the way there.
The reason I don't hold a transaction open for the entire job is that every transaction requires a PostgreSQL connection, and connections are surprisingly expensive. In fact, you may want to run connections through PgBouncer or a stateless REST API to avoid holding open hundreds or thousands of connections. Everything except PostgreSQL itself should ideally be stateless and restartable.
You might also have a retry system, or jobs that recursively queue up child jobs, or jobs that depend on other jobs, or more elaborate state machines. You might have timeouts. Most of these things are solvable with some mix of transactions, SQL, CREATE INDEX, or CREATE VIEW. A database gives you so many things for free, if you're just a little careful about it.
And since Grafana supports SQL, you can easily build really nice dashboards for support and ops by just querying the tables used by the job system.
There are other ways to do it! But I'm fond of this general strategy for coarse-granularity jobs.
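A minimal sketch of that claim/transition/watcher shape, with invented table and column names (psycopg2 assumed, not a drop-in implementation):

    import os
    import psycopg2

    conn = psycopg2.connect("dbname=app")
    WORKER_ID = os.environ.get("HOSTNAME", "local")  # e.g. the Kubernetes pod name

    def claim_job():
        # Short transaction: flip pending -> running and record who took it.
        with conn, conn.cursor() as cur:
            cur.execute("""
                UPDATE jobs
                   SET state = 'running', worker_id = %s, updated_at = now()
                 WHERE id = (SELECT id FROM jobs
                              WHERE state = 'pending'
                              ORDER BY id
                              FOR UPDATE SKIP LOCKED
                              LIMIT 1)
             RETURNING id, payload
            """, (WORKER_ID,))
            return cur.fetchone()

    def finish_job(job_id, ok=True):
        # Second short transaction: running -> done or error.
        with conn, conn.cursor() as cur:
            cur.execute(
                "UPDATE jobs SET state = %s, updated_at = now() WHERE id = %s",
                ("done" if ok else "error", job_id),
            )

    def reap_orphans(live_pods):
        # The watcher: any 'running' job whose pod no longer exists becomes 'error'.
        with conn, conn.cursor() as cur:
            cur.execute(
                "UPDATE jobs SET state = 'error' "
                "WHERE state = 'running' AND NOT (worker_id = ANY(%s::text[]))",
                (list(live_pods),),
            )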
IDK maybe <1000 messages per minute
Not saying SKIP LOCKED can't work with that many. But you'll probably want to do something with lower overhead.
FWIW, Que uses advisory locks [1]
If your architecture is on AWS, you're most likely already using one of those solutions.
Queues by nature tend to be for tasks that you can tolerate delaying a bit. If your queues can impact your "live"/online processing, that's worse than just impacting other queues.
Also, something like Redis tends to be a lot simpler and less prone to locking up than the monster that is PostgreSQL.
Look up the features of Core if interested. No ORM needed, as it says in the docs.
Some languages/frameworks have this built in (a lot of ORMs do this out of the box). A lot of frameworks also have facilities for conditionally wiring in different implementations based on runtime config.
https://archive.org/download/completedictiona00falluoft
There are many uses in British literature of the 1800s, and a whole lot of uses in academic literature of the '70s and '80s. https://i.imgur.com/BhMv2nF.png "Disabuse" would fit into many of these slots, but not all.
The only common use now is RPG jargon; imbuing something with an attribute is something role-playing nerds talk about, and it really needs an antonym.
For starters, "the most scalable component is always the most difficult to integrate and use" isn't true, and neither is "whatever your team knows or doesn't know, the challenges tied to integrating and then operating a given component are always the same". There are many parameters. In some contexts, taking the team's subjective preferences into account is crucial.
There is no universal rule, à la "always go for the most scalable, neglecting any other consideration" or "the minimal immediate effort is always the best option".
What about availability, though? The distributed solution is also useful to avoid downtime in case of single node failure.
Is there an off-the-shelf solution that lets me do that with Postgres? I know the newest version (16) just added active-active replication, but I wouldn’t know how to use that to achieve resilience.
And unless your jobs are trivial, it's highly likely that they interact with your app in some way, so it doesn't really matter if your workers are distributed and up: they can't complete their work because your app is down due to a single-node Postgres.
With 99.99% you can have 4 minutes of downtime a month. If failover to a hot standby takes a minute then that shouldn't be a problem to achieve a 99.99% uptime SLO.
I’m not confusing anything. I’ve seen random-selection “job queues” implemented many times. As long as you truly don’t care about start order, it’s fine to trade it for increased throughput.
Queue: a list of data items, commands, etc., stored so as to be retrievable in a definite order, usually the order of insertion.
Note the term "usually", not "always".
It's not a small thing and it's not something you should be dismissing out of hand.
I'm certain you can work around those issues, but why if you can use a proper queue?
PREACH
If this is news to you, you just justified a week's worth of browsing HN on the clock reading this statement. Scribble it on a scrap of paper and keep it taped to your ATM card.
If background jobs need to be available while some other core application is down, that needs to be designed for, and that design can be achieved with any queue technology. Simply separate the queue system stack from the core application system stack.
> But if you're using Postgres as your queuing system because you're already using it as your core database technology
Note your own use of "database technology" and not "database server". It's common to have separate application and queue database servers when such an architectural constraint is present. Of course, this sacrifices the transactional guarantees you get when the application and background jobs share the same database.
Like I said in the post, technology (and architectural) choices are tradeoffs all the way down :)
Does that mean it doesn't have any order or that whoever writes the query doesn't care about order?
Also, we are arguing over whether PG suffices as a queue implementation, and you use PG itself as an example?
always has an order, which is usually the order of insertion.
I’m not using pg itself as an example. I’m using a specific implementation of a “job queue” built with pg.
I’ve seen and you can search for and find many implementations of “job queues” using relational databases where job start order guarantees are traded away for throughput.
And, frankly, even for the less-critical stuff that was only running on a single node, I still dealt with fewer availability problems back in the day than I do now that everything's gone distributed. I think a thing that's been forgotten over the years is that a lot of what distributed systems do to be reliable was more about digging oneself out of the hole created by running on lots of cheap hardware instead of using a single server with redundancy built in. I acknowledge that, past a certain scale, that's the only option that makes sense. But if you're not operating at that scale then there's a good chance it's all just verschlimmbessern (making things worse while trying to improve them).