Overall though I'd agree it can be good as a first step or default until lower cost or higher performance is needed.
But frankly, if 10k/s inserts is the scale you are talking about, even worrying about a pub/sub solution seems odd.
Introducing something like Kafka for anything less than an order of magnitude more than that seems like an architectural blunder. By the time you are there Postgres will have obviously disqualified itself.
We implemented a similar design to Que for a specific use case in our application that has a known low volume of jobs and for a variety of reasons benefits from this design over other solutions.
[1]: https://github.com/que-rb/que [2]: https://brandur.org/postgres-queues
0. https://www.cloudamqp.com/blog/why-is-a-database-not-the-rig...
EDIT>> I am not suggesting people build their own rabbitmq infrastructure. Use a cloud service. The article is informational only.
The below article goes over some downsides to using postgres for a job queue: namely that the data model isn't optimized to find and send new jobs to large numbers of subscribers. Actual messaging systems like Kafka/NATS are better at this.
https://www.2ndquadrant.com/en/blog/what-is-select-skip-lock...
Also, there are things that the dedicated messaging systems give you that postgres won't, such as transparent error handling and retry logic. If your worker grabs a job from postgres and fails (in some kind of transient way such that the task should be retried), you'll have to implement the retry logic yourself. A streaming platform like Kafka or NATS will notice that you haven't acknowledged the message and deliver it to someone else.
This is something you could pretty easily implement yourself with something like a process that just scans over the "processing" jobs and looks for timeouts, resetting them as it goes. But there are probably a few of these little papercuts that Kafka-like (…Kafkaesque?) systems would handle for you.
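That sweeper can be tiny. A rough sketch, assuming a hypothetical "jobs" table with "status" and "started_at" columns and a 10-minute timeout:

    import time
    import psycopg2

    RESET_STUCK_JOBS = """
        UPDATE jobs
           SET status = 'pending', started_at = NULL
         WHERE status = 'processing'
           AND started_at < now() - interval '10 minutes';
    """

    def sweep_forever(dsn):
        conn = psycopg2.connect(dsn)
        while True:
            with conn, conn.cursor() as cur:  # "with conn" commits the transaction
                cur.execute(RESET_STUCK_JOBS)
            time.sleep(60)                    # sweep once a minute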
So, I guess if you already have postgres set up in a reliable way, and there's no one around whose job it is to set up new production systems like Kafka, and you don't already have something like Kafka, and you only need basic job queue requirements or you're okay with implementing whatever you need on top of postgres yourself, and your incoming job rate is fewer than maybe a thousand per second, and you have fewer than maybe tens of consumers… postgres is probably a decent job queue.
The above paragraph seems snide but it probably does describe many people's environments.
Many times even when I do use something like Redis, I run into limitations.
For example, I have been using it for caching calculated data that users need, but now we need to be able to query the cached data by date which puts us back to needing to store the cached data in the relational database.
But once you get even just up to, say, 40 messages per second with 100 worker processes, you're now up to 4,000 updates per second just to see which worker gets to claim which job, and it only gets more untenable from there.
And Kafka isn't the only solution for it. There are many lightweight pub/sub and queuing systems which also don't involve needlessly adding abstraction layers and application code into an RDBMS.
A message queue is one of those things that is easy enough and worth the effort to do "right" early on, because it is not something you want to rip out and rewrite when you hit your scaling bottlenecks, given how critical it is and how many things it will end up touching.
Edit: also keep in mind that most queues do not like "slow consumers", i.e. if your workload is bursty with long processing times, a database might be a better fit (RabbitMQ, for example, does not handle this well)
Edit2: we implemented a queue with Postgres since we need ACID, and 10k inserts per second is highly unlikely for us because a customer upload takes longer than a second (we deal with files). We mostly have burst workloads: short periods of high volume followed by long pauses (i.e. nobody uploads stuff at night)
And I've seen a looot of bad designs and misconfigurations.
All that said, I'm a massive fan of Kafka, I'm the first to admit it's a complex tool, but it needs to be for the problem space it targets.
IMO, the downsides of hosting a queue inside your primary relational DB are very much outweighed by the downsides of 1) having to run a new piece of infra like rabbit and 2) having to coordinate consistency between your message queue and your relational DB(s)
For high throughput (we had ad tech servers with 1E7 hits/s) we used a home-built low-latency queue that supported real time and persisted data. But for low throughput stuff, the DBaaQ worked fine.
And ultimately, maybe it was a lack of imagination on our part since Segment was successful with a mid-throughput DBaaQ https://segment.com/blog/introducing-centrifuge/
- The service mentioned (now called https://webapp.io ) eventually made it into YC (S20) and still uses postgres as its pub/sub implementation, doing hundreds of thousands of messages per day. The postgres instance now runs on 32 cores and 128gb of memory and has scaled well.
- We bolstered Postgres's NOTIFY with Redis pub/sub for high-traffic code paths, but it's been nice having ACID guarantees as the default for less popular paths (e.g., webhook handling)
- This pattern only ever caused one operational incident, where a transaction held a lock, which caused the notification queue to start growing and eventually (silently) stop sending messages; starting Postgres with statement_timeout=(a few days) was enough to solve this
Previous discussion: https://news.ycombinator.com/item?id=21484215
Happy to answer any questions!
'easy to learn and use' is a downright lie.
edit: link to this same topic being discussed a few weeks ago: https://news.ycombinator.com/item?id=28903614#28904103
> statement_timeout=(a few days)
wouldn't you want this to be a few seconds or minutes? Maybe I'm missing the point of setting this to days...
> The postgres instance now runs on 32 cores and 128gb of memory and has scaled well.
Am I the only one?
Take a startup with a few users. The senior engineer decides they need pub/sub to ship a new feature. With Kafka, the team goes to learn about Kafka best practices, choose client libraries, and learn the Kafka quirks. They also need to spin up Kafka instances. They ship it in a month.
With postgres, they’ve got an MVP in a day, and shipped within a week.
One could also use DNS TXT as an RDBMS with some interesting fault tolerance and distribution schemes. That doesn't mean it's a good idea or the best way to solve a problem.
If you haven't seen them already, the Jepsen analyses are really worth a read: https://aphyr.com/posts/293-jepsen-kafka https://aphyr.com/posts/282-jepsen-postgres https://aphyr.com/tags/jepsen
One of the biggest advantages comes when you start thinking about them in terms of transactions. Transactional guarantees are really useful here: guarantee that a message will be written to the queue if the transaction commits successfully, and guarantee that a message will NOT be written to the queue otherwise.
https://brandur.org/job-drain describes a great pattern for achieving that using PostgreSQL transactions.
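To make that guarantee concrete, here's a minimal sketch of the enqueue side (table and column names are made up; the job-drain article layers a separate staging table and an enqueuer process on top of this idea):

    import json
    import psycopg2

    def create_user_with_welcome_email(dsn, email):
        # The job row is written in the same transaction as the business data,
        # so it exists if and only if the commit succeeds.
        conn = psycopg2.connect(dsn)
        try:
            with conn, conn.cursor() as cur:  # one transaction for both inserts
                cur.execute("INSERT INTO users (email) VALUES (%s) RETURNING id",
                            (email,))
                user_id = cur.fetchone()[0]
                cur.execute("INSERT INTO jobs (type, payload) VALUES (%s, %s)",
                            ("send_welcome_email", json.dumps({"user_id": user_id})))
            # on rollback, neither the user nor the job exists
        finally:
            conn.close()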
Not the case with a queue.
For an example, it handles stripe webhooks when users change their pricing tier - if you drop that message, users would be paying for something they wouldn't receive.
I actually wrote StarQueue for MySQL, Microsoft SQL server and also Postgres - they all work just fine as a message queue back end, because they all support SKIP LOCKED.
I started out including Oracle as a supported database but found that I loathed Oracle's complexity.
What kind of TPS are people seeing on queues based on psql?
edit: https://gist.github.com/chanks/7585810
10k/s here, but that was on a postgres from years ago - there have been like 4 or 5 major versions since then I think.
That's a good amount, and I'm betting you can push it higher. Furthermore, a queue is trivially sharded since messages are entirely isolated.
That said, Kafka can do hundreds of thousands if not millions of messages per second, so clearly there's room for optimizations.
Of course, if scalability is not a concern (or the connection limitations are eventually fixed in postgres - this has improved in 14), this would be a very viable approach.
At least on GCP PubSub, a subscription is a separate concept from a topic/queue. If you want different priorities, you create multiple topics. You create multiple subscriptions when you want to fan out a single message to multiple workers. As far as I know, multiple subscriptions have nothing to do with priorities. Can you explain?
Anyway, for a server that only does pub/sub with ACID guarantees, those specs are so large that there is certainly a bottleneck somewhere else before they matter. So it wouldn't be strange if somebody ends up with a server that can't even handle that; it would just mean there is some issue somewhere we don't see.
>you're big enough to experience this failure fairly often IMO
Please explain how? You would either have to suffer from frequent network connectivity issues that affect only your DB and not your queue, or your process must be mysteriously dying in the microseconds between those two operations. Neither of those is something I would consider to happen "fairly often," even if you were processing trillions of messages per day.
In my experience, the vast majority of message processing failures happen at the worker level.
And the next week they realize they want reader processes to block until there is work to do. Oops that's not supported. Now you have to code that feature yourself... and soon you're reinventing Kafka.
The benefit of having a distributed cron that is version controlled and type checked in my repo has been amazing. I am also confident that my pg_boss implementation has higher uptime than GitHub actions.
Of course, the database client and server together form a distributed system. The client might continue processing a job under the mistaken impression that it still holds the lock. Jobs still need to be idempotent, as with the streaming platforms.
You’re guaranteed to break the invariant sooner or later so you end up with all the usual complexity of keeping stuff in sync.
Edit>> I see you edited your post after I responded. None of those scenarios qualify as "fairly often."
What do you say to those who don't want Google to know their usage info?
And if your needs are simpler like in this case then there are dozens of smaller pub/sub/queue systems that you could compare this to.
Is this some magical database where you don't need to worry about access patterns, best practices, or how it is deployed?
You can now efficiently partition your job queue, just like you would do with Kafka to get higher scalability. You then "prepare" your "dequeue" query in Postgres and the planner will only look at the relevant partition, pruning all of the others. It's like having one logical queue to insert into and hundreds of actual physical queues, transparently partitioned, to pull from. You then assign your workers to specific partitions and plow through your jobs.
In PG 14, you can reasonably have a thousand partitions on a single job queue, each virtually independent performance-wise. As a bonus, you can have two-level partitions gated on the task's timestamp. Older partitions/tasks can then be detached and dropped without using DELETE, which makes it a fast operation in Postgres. Any index or table bloat from the older partitions disappears immediately. Pretty sweet.
Obviously, this takes more work to set up and there's ongoing operational stuff (cron job), but you retain all of the transactional guarantees of Postgres in the process and the performance is quite good. I like the overall operational simplicity of having everything in a single, inexpensive and rock-solid RDBMS. I like getting transaction-safe jobs "for free."
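To make that concrete, here's a rough sketch of one way to set it up: a LIST-partitioned jobs table where each worker dequeues only from its own partition, so the planner prunes everything else. All names (and the partition count) are made up:

    import psycopg2

    SETUP = """
        CREATE TABLE IF NOT EXISTS jobs (
            id        bigserial,
            part      int         NOT NULL,             -- which physical queue
            queued_at timestamptz NOT NULL DEFAULT now(),
            payload   jsonb       NOT NULL
        ) PARTITION BY LIST (part);
        CREATE TABLE IF NOT EXISTS jobs_p0 PARTITION OF jobs FOR VALUES IN (0);
        CREATE TABLE IF NOT EXISTS jobs_p1 PARTITION OF jobs FOR VALUES IN (1);
    """

    DEQUEUE = """
        DELETE FROM jobs
         WHERE part = %(part)s
           AND id = (SELECT id FROM jobs
                      WHERE part = %(part)s             -- prunes to one partition
                      ORDER BY queued_at
                      FOR UPDATE SKIP LOCKED
                      LIMIT 1)
         RETURNING payload;
    """

    def work_one(conn, part, handle):
        with conn, conn.cursor() as cur:  # job is deleted only if this commits
            cur.execute(DEQUEUE, {"part": part})
            row = cur.fetchone()
            if row:
                handle(row[0])            # if handle() raises, the job is kept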
It's so cheap, too. People on this thread talk about using SQS. I spend ~$1600/month on SQS to process 20M messages per day, going on three years now. I can do far more than that on our Postgres instance, a $5K machine bought two years ago, sitting in our co-lo in downtown LA with a cheap 10Gb Cogent connection that also runs the rest of the business ($2300/month for the entire rack of machines + Internet).
But I'm forced to pay for that damn SQS queue because that's how a business partner gets us data. Such a waste of money for something so cheap and easy to do reliably with Postgres. I've now spent over $50K on something I can do more or less for free on a 2012 Dell server. Such is business.
How does it solve the additional code complexity problem?
That’s what our workload is like for our SaaS code analysis platform. We create a few tasks (~10 max) for every customer submission (usually triggered by a code push). We replaced Kafka with a PostgreSQL table a couple of years ago.
We made the schema, functions, and Grafana dashboard open source [0]. I think it’s slightly out-of-date but mostly the same as what we have now in production, and has been running perfectly.
Go back and read OPs link. They create new SQL types, tables, triggers, and functions, with non-trivial and very unforgiving atomic logic. And every system that needs to read or write from this "db queue" needs to leverage specific queries. That's the complexity.
>vs writing code to support some other new system
You mean using a stable well maintained library with a clean and sensible interface to a queueing system? Yes, that is far more simple.
of _course_ the system architecture had to have a job queue and it had to be highly available (implemented with a rabbitmq cluster)
what we learned after a few months in production was the only time the rabbitmq cluster had outages was when it got confused* and thought (incorrectly) there was a network partition, and flipped into partition recovery mode, causing a partial outage until production support could manually recover the cluster
the funny thing about this is that our job throughput was incredibly low, and we would have had better availability if we had avoided adding the temperamental rabbitmq cluster and instead implemented the job queue in our non-HA postgres instance that was already in the design --- if our postgres server went down then the whole app was stuffed anyway!
* this was a rabbitmq defect when TLS encryption was enabled and very large messages were jammed through the queue -- rabbitmq would be so busy encrypting / decrypting large messages that it'd forget to heartbeat, then it'd timeout and panic that it hadn't got any heartbeats, then assume that no heartbeats implied a network partition, and cause an outage, needing manual recovery. i think rabbitmq fixed that a few years back
For fire and forget type jobs you can use lists instead of pub/sub: save a job to a list by a producer, pop it on the other end by a consumer and execute it. It's also very easy to scale, just start more producers and consumers.
We're currently using this technique, to process ~2M jobs per day, and we're just getting started. Redis needs very little memory for this, just a few mb.
Redis also supports ACID-style transactions.
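For reference, the list-as-queue approach is only a few lines with redis-py (a rough sketch; the queue name and job format are made up):

    import json
    import redis

    r = redis.Redis()

    def enqueue(job):
        r.lpush("jobs", json.dumps(job))   # producer: push onto the list

    def worker(handle):
        while True:
            _key, raw = r.brpop("jobs")    # consumer: block until a job arrives
            handle(json.loads(raw))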
I use it as a SQL database and full-text search for a little personal project I work on off and on, and it works great. Since I got a promotion months ago, I haven't touched it except to check for security updates every few weeks, and it, the Golang app server, and the Python scripts have had no issues just churning along, keeping a 30-day archive of links found via Reddit and Twitter. Postgres is great.
Kafka does more for streaming data, but doesn't do squat for relational data. You always need a database, but you sometimes can get by without a queuing system.
At least with Postgres you can scale up trivially. Postgres will efficiently take advantage of as many cores as you give it. For scale out you will need to move to a purpose built queuing solution.
I think Kafka is great, but it is absolutely not “easy to learn and use”.
Wait, what?? I can't keep doing things with my connection after I issue a LISTEN? That doesn't seem right! I would assume it would be isomorphic to how unix-y background programs occasionally write to the console (although I see how this might be hard to deal with at the driver level). Now I will have to go check.
> Listen to changes in a PostgreSQL Database and broadcasts them over WebSockets
[1]: https://github.com/supabase/realtime
Disclosure: I'm a Supabase employee.
> With the libpq library, the application issues LISTEN as an ordinary SQL command, and then must periodically call the function PQnotifies to find out whether any notification events have been received.
[1]: https://www.postgresql.org/docs/current/sql-listen.html
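In psycopg2 (which wraps libpq) that polling loop looks roughly like this; the channel name is made up:

    import select
    import psycopg2
    import psycopg2.extensions

    conn = psycopg2.connect("dbname=app")
    conn.set_isolation_level(psycopg2.extensions.ISOLATION_LEVEL_AUTOCOMMIT)
    cur = conn.cursor()
    cur.execute("LISTEN job_created;")

    while True:
        # wait until the connection's socket is readable, then collect notifications
        if select.select([conn], [], [], 5) == ([], [], []):
            continue                       # timed out, loop again
        conn.poll()
        while conn.notifies:
            note = conn.notifies.pop(0)
            print(note.channel, note.payload)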
- No need to poll the database table
- No table-level locks and manual handling: row locks used for handling the work in progress
- "Manual cleanup" -- uhhh
Etc.
> graphile-worker is horizontally scalable. Each instance has a customisable worker pool, this pool defaults to size 1 (only one job at a time on this worker) but depending on the nature of your tasks (i.e. assuming they're not compute-heavy) you will likely want to set this higher to benefit from Node.js' concurrency.
I've personally written real-time back-of-house order-tracking with rails and postgres pubsub (no redis!), and wrote a record synchronization queuing system with a table and some clever lock semantics that has been running in production for several years now -- which marketing relies upon as it oversees 10+ figures of yearly topline revenue.
Neither of those projects were FAANG scale, but they work fine for what is needed and scale relatively cleanly with postgres itself.
Besides, in a lot of environments corporate will only approve the use of certain tools. And if you already have one approved that does the job, then why not?
That said, it's been running for upwards of 4 years and accumulated an insane number of temperature readings inside and above heating vents (heat source is heat pump)
SELECT count() as count FROM temperatures : msg : Object { _msgid: "421b3777.908118", topic: "SELECT count() as count FROM …", payload: 23278637 }
Ok, I need therapy for my data hoarding - 23 million temp samples is not a good sign :-)
There are many options for scaling:
- vertically scale by adding more memory
- start a Redis instance on another port (takes ~1 MB) if you decide to add more cores on the same VM
- separate data into another vm
- sharding comes out of the box, but that would be my last resort
Most senior+ engineers that I know would hear that and recoil. Getting "clever" with concurrency handling in your home-rolled queuing system is not something that coworkers, especially more senior coworkers, will appreciate inheriting, adapting, and maintaining. Believe me.
I get that you're trying to flex some cool thing that you built, but it doesn't really have any bearing on the concept of "most cases" because it's an anecdote. Queuing systems are a thing for a reason, and in most cases, using them makes more sense than writing your own.
We use a transactional outbox for that: we insert a record about the event into the event table in the same transaction as the rest of the operation (providing atomicity), and then a special goroutine reads this table and pushes new events to the message broker on a different server. In our design, there are multiple services which might want to subscribe to the event, and the rule is that they shouldn't share their DBs (for proper scaling), so we can't handle event dispatch in some single central app instance. Of course we could implement our own pub/sub server in Go over a DB like Postgres if we wanted, but what's the point of reinventing the wheel when there are already battle-tested tools for that, considering you'd have to reimplement queues, exchanges, topics, delivery guarantees, proper error handling, monitoring, etc.
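For anyone curious, the relay part of that pattern is small. A rough Python sketch (not their Go code; the "events" table and the publish() callback are stand-ins for the real schema and broker client):

    import time
    import psycopg2

    def relay_forever(dsn, publish):
        conn = psycopg2.connect(dsn)
        while True:
            with conn, conn.cursor() as cur:  # one transaction per batch
                cur.execute("""
                    SELECT id, payload FROM events
                     WHERE dispatched = false
                     ORDER BY id
                     FOR UPDATE SKIP LOCKED
                     LIMIT 100
                """)
                for event_id, payload in cur.fetchall():
                    publish(payload)          # push to the broker (at-least-once)
                    cur.execute("UPDATE events SET dispatched = true WHERE id = %s",
                                (event_id,))
            time.sleep(1)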
I am both a "senior+ engineer" that has inherited such systems and an author of such systems. I think you're overreacting.
Concurrency Control (i.e., "lock semantics") exists for a reason: correctness. Using it for its designed purpose is not horror. Yes, like any tool, you need to use it correctly. But you don't just throw away correctness because you don't want to learn how to use the right tool properly.
I have inherited poorly designed concurrency systems (in the database); yes, I recoiled in horror and did not appreciate it. So you know what I did? I fixed the design, and documented it to show others how to do it correctly.
I have also inherited OOB "Queuing Systems" that could not possibly be correct because they weren't integrated into the DB's built-in and already-used correctness system: Transactions and Concurrency Control. Those were always more horrific than poorly-implemented in-DB solutions. Integrating two disparate stores is always more trouble than just fixing one single source.
----
> I get that you're trying to flex some cool thing that you built, but it doesn't really have any bearing on the concept of "most cases" because it's an anecdote. Queuing systems are a thing for a reason, and in most cases, using them makes more sense than writing your own.
I get that you're trying to flex that you use turnkey Queueing Systems, but it doesn't really have any bearing on the concept of "most cases", because all you've presented are assertions without backing. Queuing systems are good, for a specific kind of job, but when you need relational logic you better use one that supports it. And despite what MongoDB and the NoSQL crowd has been screaming hoarsely for the past decade, in most cases, you have relational logic.
But seriously though, postgres's relational logic implementation makes for a very good queueing system for most cases. It's not a hack that's bolted on top. I know that's how quite a few "DBs" are designed and implemented, and maybe you've been burned by too many of them, but Postgres is solid. I've seen it inside and out.
My point is that postgres is a swiss army knife and you and anyone else would be remiss to not fully understand what it is capable of and what you can do with it. Entire classes of software baggage can be eliminated for "most" use cases. One could even argue that reaching for all these extra fancy specialized tools is a premature optimization. Plus, who could possibly argue against having fewer moving parts?
Sometimes I agree with best tool for the job; if the constraints make something a very clear winner; if the difference is marginal for the particular case at hand, I pick what I/we know (I would actually argue that IS the best tool for the job; but in absolute 'what could happen in the future' terms it probably is not).
It's really simple. I used this snippet for a real-time application which can react to any kind of change in the database instantly.
Regarding horizontal scalability: that matters when you have heavy tasks (tasks that take a second or more to execute); you can use more instances to get higher throughput.
Hope this helps!
Transactions in MVCC are relatively cheap. The main resource of contention is a global txid that can disastrously wrap around if autovacuum is disabled. That process is responsible for a few other important tasks, like updating statistics for the query planner and maintaining BRIN indexes.
This is not much load at all, an iPhone running RabbitMQ could process many millions of messages per day. Even 1M messages per day is only 11 messages per second average. i.e. not taxing at all.
If you're already using Postgres, you avoid the extra operational complexity that comes with introducing another database. Less operational complexity means better availability.
You can atomically modify jobs and the rest of your database. For example, you can atomically create a row and create a job to do processing on it.
And Elixir is especially well suited for this type of workload. I actually extracted out much of the Supabase Realtime library so that I could work with the data directly in Elixir[2]
[1]: https://github.com/supabase/realtime#why-not-just-use-postgr...
But the queue grows precisely because some notifications aren’t getting delivered, right?
Scalability is always limited, no matter which solution you choose. This article argues that the scalability limit for this particular solution is acceptable for most people to begin with:
> It's rarely a mistake to start with Postgres and then switch out the most performance critical parts of your system when the time comes.
Only if you begin a transaction when your job starts that isn’t committed until you job finishes. As I understand it, this is not a good idea for long-running jobs, since you’ll have a long-running Postgres transaction. Am I missing something? The linked article doesn’t seem to use this approach.
Does the “FOR UPDATE” decrease the lock to just a single row, thus making it unproblematic?
[1]: https://dramatiq.io/
- Closed/lock-in vs. Open/lock-free
- Rigid data access pattern vs. Very flexible SQL access
- Managed by AWS vs. Managed by you/your team (although you could use one of those managed Postgres services to reduce the ops burden)
- Integrates well with other AWS services (e.g. Lambda, SNS, DynamoDB, etc.) vs. No integrations with the AWS ecosystem out of the box
I find it amusing that we happily play these AAA gaming experiences that are totally fantastical in their ability to deal with millions of things per frame and then turn around and pretend like hundreds of thousands of things per day is some kind of virtue.
If I'm building one shed then maybe I only need 5 tools, while a shed factory might use 100. Context matters.
That means you can't have Docker and different versions of Java, Node, and .NET all running in parallel.
You run a single process, and SQLite is a library that embeds the SQL engine and database inside that process. Your 'budget' is like 100 MB of RAM, because other stuff has to run too.
All the time-series databases I know are large, memory-hungry hippos built for distributed compute/Kubernetes. Just a very different use case. If one were built with minimalism in mind, then it could be used.
Granted I didn't even read the main article because it seems like such a casual headline.
Edit post-read: yeah, using it as a CI jobs database. He lists the alternatives, but seriously, Kafka? Kafka is for linear scaling pub/sub. This guy has a couple CI jobs infrequently run.
Sure this works if the entire thing is throwaway for a non critical pub/sub system.
"It's possible to scale Postgres to storing a billion 1KB rows entirely in memory - This means you could quickly run queries against the full name of everyone on the planet on commodity hardware and with little fine-tuning."
Yeah just because it can does not mean it is suited for this purpose.
Don't do this for any integration at even medium scale.
In fact, the article makes a very good point: just doing it in Postgres is great even though it doesn't really scale (because of ACID), and adapting it for scale after you actually need it will lead to a better design than what you would build if you started optimizing without any information.
The benchmark shows that the time to send notifications grows quadratically with the number of notifications per transaction without the patch, while it grows linearly with the patch applied and notification collapsing disabled.

Personally I do see the niceness of having a good pattern implemented using existing technology. Less deployment nonsense, less devops, less complexity, a few tables at most. I've done similar things in the past; it is nice.
For anyone who'd criticize, having complex deployments can be just about as much dev time, AND if implemented well, they can theoretically convert this whole thing to rabbitmq with minimal effort just by swapping the queueing system.
In any case, happy to see people mentioning how using existing simple tech can lead to fairly simple to manage systems, and still solve the problems you're trying to solve.
I should clarify. I meant holding a transaction for the duration of the message.
That's head of line blocking.
My experience with SQLite is that it can take you a long ways before needing to look elsewhere.
1. Postgres is easier to set up and run (than Kafka)
2. Most shops already have Postgres running (TFA is targeted at these shops)
3. Postgres is easier to adapt to changing access patterns (than Kafka).
----
> Is this some magical ...
Why must your adversary (Postgres) meet some mythical standard when your fighter (Kafka) doesn't meet even basic standards?
If this is insufficient for more complicated migrations, there's tooling to support it. e.g. Flyway.
The times I have done this, I end up with a workflow where the "select for update ... skip locked" is used only to initially mark the job as "taken" so that other processes do not start working on it. That update is committed in one transaction, and then the "work" is done in a second transaction.
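A rough sketch of that two-transaction workflow (table and column names invented; a separate sweeper would reset jobs stuck in 'taken' past some timeout):

    import psycopg2

    CLAIM = """
        UPDATE jobs
           SET status = 'taken', started_at = now()
         WHERE id = (SELECT id FROM jobs
                      WHERE status = 'pending'
                      ORDER BY queued_at
                      FOR UPDATE SKIP LOCKED
                      LIMIT 1)
         RETURNING id, payload;
    """

    def run_one(conn, handle):
        # Transaction 1: claim the job and commit right away, so no lock or
        # transaction is held open while the (possibly long) work runs.
        with conn, conn.cursor() as cur:
            cur.execute(CLAIM)
            row = cur.fetchone()
        if row is None:
            return False
        job_id, payload = row
        handle(payload)                   # the actual work, outside any transaction
        # Transaction 2: record completion.
        with conn, conn.cursor() as cur:
            cur.execute("UPDATE jobs SET status = 'done' WHERE id = %s", (job_id,))
        return True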