For example, Python has Celery and Ruby has Sidekiq. As far as I know, neither language has a library that is as battle-hardened, with comparable features, for background tasks using Postgres as a backend.
There's a big difference between getting something to work in a demo (achievable by skimming Postgres's docs and rolling your own job queue) and using something with tens of thousands of hours of development time and tons of real-world usage behind it.
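For a sense of scale: the core of a hand-rolled Postgres queue is usually a single `FOR UPDATE SKIP LOCKED` claim query, something like the sketch below. This is my own illustration, not any library's actual schema — table and column names are invented — and it's roughly where the "works in a demo" version stops.

```python
# Hypothetical sketch of the heart of a hand-rolled Postgres job queue.
# Table and column names are invented for illustration.
CLAIM_SQL = """
UPDATE jobs
SET status = 'running', started_at = now()
WHERE id = (
    SELECT id
    FROM jobs
    WHERE status = 'queued' AND run_at <= now()
    ORDER BY priority, run_at
    FOR UPDATE SKIP LOCKED  -- lets concurrent workers claim jobs safely
    LIMIT 1
)
RETURNING id, payload;
"""

def claim_job(conn):
    """Claim one ready job via any DB-API connection (e.g. psycopg2).

    Returns (id, payload), or None when no job is ready.
    """
    with conn.cursor() as cur:
        cur.execute(CLAIM_SQL)
        return cur.fetchone()
```

Everything in the feature list below — retries, rate limits, uniqueness, metrics — is what you'd still have to build around that one query.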
I'm all for using Postgres for things like full-text search when I can, because it drastically reduces operational complexity if you can avoid running Elasticsearch. Redis, on the other hand, is a Swiss Army knife of awesome. It's often used for caching or as a session back-end, so you probably have it in your stack already. It's also really easy to run, uses almost no resources, and is in the same tier as nginx in terms of how crazy efficient and reliable it is. I don't see avoiding Redis for a job queue as that big of a win.
A good job queue will have all or most of these features:
Prioritize jobs (queues, weights for certain jobs, etc.)
Scheduled jobs (running them once, X amount of time in the future)
Periodic jobs (running them every 2nd Tuesday at 3:33am)
Static and dynamic configuration (CRUD'ing jobs at runtime, like adding new scheduled tasks)
Retry jobs (customizable strategy, such as exponential backoff)
Rate limit jobs
Expire jobs
Cancel jobs
Unique jobs
Batch execution
Handling graceful shutdown (integration with your app server)
Get metrics (status, health, progress, etc.)
Browse job history
Web UI (nice to have)
And I'm sure I'm missing things too. These are just off the top of my head, based on features I tend to use in most applications. In other words, this isn't a laundry list of "nice to haves in theory"; most of these are core or essential features IMO. I use them in nearly every web app.

Rolling all of these on your own would be a massive undertaking. Tools like Celery and Sidekiq have been actively developed for ~10 years now and have likely processed hundreds of billions of jobs to iron out the kinks.
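Even a single bullet like "retry with exponential backoff" hides real design decisions. A minimal sketch of one common strategy, exponential backoff with full jitter — the function name and defaults here are my own, not from any of the libraries mentioned:

```python
import random

def next_retry_delay(attempt, base=1.0, cap=300.0):
    """Delay in seconds before retry number `attempt` (0-based).

    Exponential backoff with "full jitter": the ceiling doubles each
    attempt (capped at `cap`), and the actual delay is drawn uniformly
    from [0, ceiling] so that failed jobs don't all retry in lockstep.
    """
    ceiling = min(cap, base * (2 ** attempt))
    return random.uniform(0, ceiling)
```

A real queue also has to persist the attempt count, distinguish retryable from fatal errors, and move jobs to a dead-letter state after the final attempt — all of which the mature libraries already handle.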
Even if you managed to do all of that and packaged it as a Postgres extension (which I think is doable on paper), that's only half the story. You'd still have to write language-specific clients to interface with it so you can create jobs from your application through a nice API. That would be a very welcome project, but I think we're talking a year-plus of full-time development to release something useful that supports a few languages — and that assumes you're already an expert at writing Postgres extensions, have extensive knowledge of job queues, and know a few popular programming languages well enough to release the initial clients.
It's also probably one of the easiest services to deploy and manage; often a one-liner.
Plus, like you said, it's a Swiss Army knife. It has so many uses. It's inevitable my stack will include Redis at some point, and my reaction is almost always "I should have just started with Redis in the first place."
Is Redis prone to golden-hammer syndrome? Of course. But as long as you aren't too ridiculous, I've found you can stretch it pretty far.
- You have a separate schema for your procs
- You define your procs in files
- You write tests for your procs
- You put your files into git
- Then, in a transaction, you drop your old schema and deploy the new one
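The steps above can be sketched as a deploy script that wraps the drop-and-recreate in one transaction. This is my own illustration of the described workflow, not a real tool — the function and schema names are made up:

```python
def build_deploy_sql(schema, proc_sources):
    """Assemble the transactional deploy described above: drop the
    procs-only schema and recreate it from versioned source files.

    `proc_sources` is a list of SQL strings, e.g. the contents of the
    proc files kept in git. Names here are illustrative.
    """
    parts = [
        "BEGIN;",
        f"DROP SCHEMA IF EXISTS {schema} CASCADE;",
        f"CREATE SCHEMA {schema};",
    ]
    # Each proc file becomes one statement block inside the transaction.
    parts.extend(src.strip() for src in proc_sources)
    parts.append("COMMIT;")
    return "\n".join(parts)
```

Because Postgres DDL is transactional, a failure anywhere before COMMIT rolls the whole deploy back, which is what makes the drop-and-recreate approach workable.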
How do you write tests? With what data are you testing?
How do you handle backwards compatibility? (For the code? For the data?)
Do you do this upgrade during deployment? Can you do blue/green deployments? What about canary deployments?
Is this transaction where you are dropping the old and creating the new part of the code or part of the database? (the actual code that does this)
How do you profile the performance of the said code?
In most other languages, you’re sending everything to the queue. With Elixir you only need a small subset of background work to go to a queue, which is usually work that would stress the database.
You can use pgTAP to write automated unit tests. For profiling, there are EXPLAIN, auto_explain, and plprofiler.
Blue/green and canary deployments are not currently possible with Postgres. On the other hand, Postgres has transactional DDL, which means there's no downtime during code deployment and an automatic rollback if something goes wrong.
The database is only for data and for stored procedures or functions, which are stored in separate schemas. Your deployment scripts, migration scripts, test scripts and everything else is not stored in the database, but in your source control system, e.g. Git.
For everything else, just use conventional software engineering practices. There is no reason to treat SQL code differently than Ruby or Java code.
Seriously: you have the actual data, you have the stored procedures, and you have the code. Each has its own version. I have never seen this work in a production environment. It's possible things have evolved and there's better tooling since I last tried performing this stunt.
If it works for you, that's great, and maybe you should write some sort of blog post (do people still blog?) describing the setup, allowing others to either 1) replicate and use it or 2) poke holes in it.
How do you handle this in other languages?
How does this nonsense add anything to the discussion?