What about availability, though? The distributed solution is also useful to avoid downtime in case of single node failure.
Is there an off-the-shelf solution that lets me do that with Postgres? I know the newest version (16) just added active-active replication, but I wouldn’t know how to use that to achieve resilience.
And unless your jobs are trivial then it's highly likely that they interact with your app in some way so it doesn't really matter if your workers are distributed and up, they're not able to complete their work because your app is down because of a single-node Postgres.
With 99.99% you can have 4 minutes of downtime a month. If failover to a hot standby takes a minute then that shouldn't be a problem to achieve a 99.99% uptime SLO.
If background jobs need to be available while some other core application is down, that needs to be designed for, and that design can be achieved with any queue technology. Simply separate the queue system stack from the core application system stack.
> But if you're using Postgres as your queuing system because you're already using it as your core database technology
Note your own use of "database technology" and not "database server". It's common to have separate application and queue database servers when such an architectural constraint is present. Of course, this sacrifices the benefit of transactional guarantees when the application and background jobs run on the same server.
Like I said in the post, technology (and architectural) choices are tradeoffs all the way down :)
And, frankly, even for the less-critical stuff that was only running on a single node, I still dealt with fewer availability problems back in the day than I do now that everything's gone distributed. I think that a thing that's been forgotten over the years is that a lot of this stuff that distributed systems do to be reliable was more about digging oneself out of the hole that was created by running on lots of cheap hardware instead of using a single server with redundancy built-in. I acknowledge that, past a certain scale, that's the only option that makes sense. But if you're not operating at that scale then there's a good chance it's all just verschlimmbessern.