For example, Python has Celery and Ruby has Sidekiq. As far as I know, neither language has a library that is as battle-hardened, with comparable features, for background tasks using Postgres as a backend.
There's a big difference between getting something to work in a demo (achievable by skimming Postgres's docs and rolling your own job queue) and using something with tens of thousands of hours of development time and tons of real-world usage behind it.
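For a sense of scale: the core of a hand-rolled Postgres queue is usually a single `FOR UPDATE SKIP LOCKED` claim query, something like the sketch below. This is my own illustration, not any library's actual schema — table and column names are invented — and it's roughly where the "works in a demo" version stops.

```python
# Hypothetical sketch of the heart of a hand-rolled Postgres job queue.
# Table and column names are invented for illustration.
CLAIM_SQL = """
UPDATE jobs
SET status = 'running', started_at = now()
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'queued' AND run_at <= now()
    ORDER BY priority, run_at
    FOR UPDATE SKIP LOCKED  -- lets concurrent workers claim jobs safely
    LIMIT 1
)
RETURNING id, payload;
"""

def claim_job(conn):
    """Claim one ready job via any DB-API connection (e.g. psycopg2).

    Returns (id, payload), or None when no job is ready.
    """
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL)
        return cur.fetchone()
```

Everything in the feature list below — retries, rate limits, uniqueness, metrics — is what you'd still have to build around that one query.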
I'm all for using Postgres for things like full-text search when I can, because it drastically reduces operational complexity if you can avoid running Elasticsearch. Redis, on the other hand, is a Swiss Army knife of awesome. It's often used for caching or as a session back-end, so you probably have it in your stack already. It's also really easy to run, uses almost no resources, and is in the same tier as nginx in terms of how crazy efficient and reliable it is. I don't see avoiding Redis for a job queue as that big of a win.
A good job queue will have all or most of these features:
Prioritize jobs (queues, weights for certain jobs, etc.)
Scheduled jobs (running them once, X amount of time in the future)
Periodic jobs (running them every 2nd Tuesday at 3:33am)
Static and dynamic configuration (CRUD'ing jobs at runtime, like adding new scheduled tasks)
Retry jobs (customizable strategy, such as exponential backoff)
Rate limit jobs
Expire jobs
Cancel jobs
Unique jobs
Batch execution
Handling graceful shutdown (integration with your app server)
Get metrics (status, health, progress, etc.)
Browse job history
Web UI (nice to have)
And I'm sure I'm missing things too. These are just off the top of my head, based on features I tend to use in most applications. In other words, this isn't a laundry list of "nice to haves in theory"; most of these are core or essential features IMO. I use them in nearly every web app.

Rolling all of these on your own would be a massive undertaking. Tools like Celery and Sidekiq have been actively developed for ~10 years now and have likely processed hundreds of billions of jobs to iron out the kinks.
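Even a single bullet like "retry with exponential backoff" hides real design decisions. A minimal sketch of one common strategy, exponential backoff with full jitter — the function name and defaults here are my own, not from any of the libraries mentioned:

```python
import random

def next_retry_delay(attempt, base=1.0, cap=300.0):
    """Delay in seconds before retry number `attempt` (0-based).

    Exponential backoff with "full jitter": the ceiling doubles each
    attempt (capped at `cap`), and the actual delay is drawn uniformly
    from [0, ceiling] so that failed jobs don't all retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)
```

A real queue also has to persist the attempt count, distinguish retryable from fatal errors, and move jobs to a dead-letter state after the final attempt — all of which the mature libraries already handle.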
Even if you managed to do all of that and packaged it as a Postgres extension (which I think is doable on paper), that's only half the story. You'd still have to write language-specific clients to interface with it so you can create jobs from your application through a nice API. That would be a very welcome project, but I think we're talking a year-plus of full-time development to release something useful that supports a few languages — and that assumes you're already an expert at writing Postgres extensions, have extensive knowledge of job queues, and know a few popular programming languages well enough to release the initial clients.
It's also probably one of the easiest services to deploy and manage; often a one-liner.
Plus, like you said, it's a Swiss Army knife. It has so many uses. It's inevitable my stack will include Redis at some point, and my reaction is almost always "I should have just started with Redis in the first place."
Is Redis prone to golden-hammer syndrome? Of course. But as long as you aren't too ridiculous, I've found you can stretch it pretty far.
- You have a separate schema for your procs
- You define your procs in files
- You write tests for your procs
- You put your files into git
- Then, in a transaction, you drop your old schema and deploy the new one
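The steps above can be sketched as a deploy script that wraps the drop-and-recreate in one transaction. This is my own illustration of the described workflow, not a real tool — the function and schema names are made up:

```python
def build_deploy_sql(schema, proc_sources):
    """Assemble the transactional deploy described above: drop the
    procs-only schema and recreate it from versioned source files.

    `proc_sources` is a list of SQL strings, e.g. the contents of the
    proc files kept in git. Names here are illustrative.
    """
    parts = [
        "BEGIN;",
        f"DROP SCHEMA IF EXISTS {schema} CASCADE;",
        f"CREATE SCHEMA {schema};",
    ]
    # Each proc file becomes one statement block inside the transaction.
    parts.extend(src.strip() for src in proc_sources)
    parts.append("COMMIT;")
    return "\n".join(parts)
```

Because Postgres DDL is transactional, a failure anywhere before COMMIT rolls the whole deploy back, which is what makes the drop-and-recreate approach workable.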
How do you write tests? With what data are you testing?
How do you handle backwards compatibility? (For the code? For the data?)
Do you do this upgrade during deployment? Can you do blue/green deployments? What about canary deployments?
Is this transaction where you are dropping the old and creating the new part of the code or part of the database? (the actual code that does this)
How do you profile the performance of the said code?
In most other languages, you’re sending everything to the queue. With Elixir you only need a small subset of background work to go to a queue, which is usually work that would stress the database.
You can use pgTAP to write automated unit tests. For profiling, there are EXPLAIN, auto_explain, and plprofiler.
Blue/green and canary deployments are not currently possible with Postgres. On the other hand, Postgres has transactional DDL, which means there's no downtime during code deployment and an automatic rollback if something goes wrong.
The database is only for data and for stored procedures or functions, which are stored in separate schemas. Your deployment scripts, migration scripts, test scripts and everything else is not stored in the database, but in your source control system, e.g. Git.
For everything else, just use conventional software engineering practices. There is no reason to treat SQL code differently than Ruby or Java code.
Seriously: you have the actual data, you have the stored procedures, and you have the code. Each has its own version. I have never seen this work in a production environment. It's possible things have evolved and there's better tooling since I last tried performing this stunt.
If it works for you, that's great, and maybe you should write some sort of blog post (do people still blog?) describing the setup, allowing others to either 1) replicate and use it or 2) poke holes in it.
How do you handle this in other languages?
How does this nonsense add anything to the discussion?