It is such a better model for the majority of queues. All you're doing is storing a message, hitting an HTTP endpoint, and deleting the message on success. This makes it so much easier to scale, reason about, and test task execution.
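The loop behind that model is tiny. A sketch with made-up names (the HTTP POST is stood in for by a callable so it runs anywhere): store the message, invoke the endpoint, and delete only on success.

```python
def drain(messages, post):
    """Deliver each stored message to an endpoint; delete on success,
    keep for retry on failure. `post` stands in for an HTTP POST and
    returns True on a 2xx response."""
    remaining = []
    for msg in messages:
        try:
            ok = post(msg)
        except Exception:
            ok = False
        if not ok:
            remaining.append(msg)  # kept for a later retry
    return remaining

delivered = []

def fake_endpoint(msg):
    # Stand-in for the task handler's HTTP endpoint.
    if msg == "bad":
        return False  # simulate a 500 from the handler
    delivered.append(msg)
    return True

left = drain(["a", "bad", "b"], fake_endpoint)
```

Because the queue only ever sees "stored", "delivered", or "kept for retry", the worker itself stays stateless, which is where the scaling and testing wins come from.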
Update: since multiple people seem confused: I'm talking about the implementation of a job queue system, not suggesting that they use the GCP Tasks product. That said, I would have just used GCP Tasks too, assuming the use case dictated it; it's a fantastic and rock-solid product.
The trouble with hitting an HTTP API to queue a task is: what if it fails, or what if you're not sure whether it failed? You can continue to retry in-band (though there's a definite latency cost to doing so), but say you eventually give up: you can't be sure that no jobs were queued which you didn't get a proper ack for. In practice, this leads to a lot of uncertainty around the edges, and to operators having to reconcile things manually.
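One common mitigation for that ambiguous-failure case (a sketch, not part of the parent's setup; all names are made up) is a client-generated idempotency key on every enqueue, so retrying after an unclear failure can never create a duplicate job:

```python
import uuid

class IdempotentQueue:
    """Toy queue that deduplicates enqueues by idempotency key."""

    def __init__(self):
        self.jobs = {}  # idempotency_key -> payload

    def enqueue(self, idempotency_key, payload):
        # A retried enqueue with the same key is a no-op, so the client
        # can safely retry after a timeout or an unacked response.
        if idempotency_key not in self.jobs:
            self.jobs[idempotency_key] = payload
        return idempotency_key

q = IdempotentQueue()
key = str(uuid.uuid4())
q.enqueue(key, {"task": "send_email"})
# Ambiguous failure: the client never saw the ack, so it retries.
q.enqueue(key, {"task": "send_email"})
```

The retry is harmless because the server, not the client, decides whether the job already exists; the client just needs to reuse the same key.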
There are definite scaling benefits to throwing tasks at Google's limitless compute power, but there are a lot of cases where a smaller, more correct queue is plenty of power, especially where Postgres is already the database of choice.
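For reference, the usual Postgres-backed pattern is to claim jobs with `SELECT ... FOR UPDATE SKIP LOCKED` so concurrent workers never grab the same row. A rough sketch of the claim/complete cycle (table and column names are assumptions), demonstrated here with SQLite so it runs anywhere; the Postgres form of the claim query is shown in the comment:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (id INTEGER PRIMARY KEY, payload TEXT, status TEXT DEFAULT 'queued')"
)
conn.execute("INSERT INTO jobs (payload) VALUES ('send_email'), ('resize_image')")

def claim_job(conn):
    # In Postgres, this would be a single atomic statement:
    #   UPDATE jobs SET status = 'running'
    #   WHERE id = (SELECT id FROM jobs WHERE status = 'queued'
    #               ORDER BY id FOR UPDATE SKIP LOCKED LIMIT 1)
    #   RETURNING id, payload;
    row = conn.execute(
        "SELECT id, payload FROM jobs WHERE status = 'queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None
    conn.execute("UPDATE jobs SET status = 'running' WHERE id = ?", (row[0],))
    return row

def complete_job(conn, job_id):
    conn.execute("UPDATE jobs SET status = 'done' WHERE id = ?", (job_id,))

job = claim_job(conn)
complete_job(conn, job[0])
```

The appeal is transactional correctness for free: a job claim and the business writes it triggers can share one transaction, which no external queue gives you.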
Good luck with a long-running batch.
This is covered in the GCP Tasks documentation.
> There are definite scaling benefits to throwing tasks at Google's limitless compute power, but there are a lot of cases where a smaller, more correct queue is plenty of power, especially where Postgres is already the database of choice.
My post was about what I would implement if I were building my own queue, as the authors were. Not about using GCP Tasks.
Again, I'm getting downvoted. The whole point of my comment isn't about using GCP Tasks; it's about what I would do if I were implementing my own queue system, as the author did.
By the way, that 30-minute limitation can be worked around with checkpoints, or by breaking the task up into smaller chunks. That isn't a bad idea to do anyway: I've seen long-running tasks cause all sorts of downstream problems when they fail and then take forever to run again.
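As an illustration of that checkpointing idea (a sketch with assumed names, not anything from GCP's docs): persist a cursor after each item, so a restarted task resumes where it left off instead of redoing the whole batch.

```python
def process_batch(items, load_checkpoint, save_checkpoint, handle):
    """Process items resumably: skip everything before the saved cursor,
    and persist progress after each item so a crash costs one item at most."""
    start = load_checkpoint()  # index of the first unprocessed item
    for i in range(start, len(items)):
        handle(items[i])
        save_checkpoint(i + 1)  # durable in real life (DB row, object store, ...)

# Toy durable store: a one-element list standing in for a DB row.
state = [0]
done = []
items = list(range(10))

# First run dies after 4 items (simulated by only handing it a slice)...
process_batch(items[:4], lambda: state[0], lambda i: state.__setitem__(0, i), done.append)
# ...then the retry runs over the full batch and resumes at the checkpoint.
process_batch(items, lambda: state[0], lambda i: state.__setitem__(0, i), done.append)
```

The same trick is what makes the "break it into smaller chunks" advice cheap: each chunk boundary is a natural checkpoint.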
No, we don't operate like that. Call me out when I'm wrong technically, but don't tell me that because someone is some sort of celebrity that I should cut them some slack.
Everything he pointed out is literally covered in the GCP Tasks documentation.
You're being "called out" (ugh) incredibly politely mostly because you were being a bit rude; "tell me X without telling me" is just a bit unpleasant, and totally counterproductive.
> because someone is some sort of celebrity that I should cut them some slack.
No one mentioned a celebrity. You're not railing against the power of celebrity here; just a call for politeness.
> Everything he pointed out is literally covered in the GCP Tasks documentation.
Yes, e.g. as pitfalls.
The request to get a message returns a token that identifies this receive.
You use that token to delete the message when you are done.
Jobs that don’t succeed after N retries get marked as dead and go into the dead letter list.
This is the way AWS SQS works; it's tried and true.
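That receive-token model can be sketched in a few lines (an in-memory toy, not the SQS API; names are made up): receiving a message hands back a token identifying that particular receive, deleting with the token acknowledges it, and a message received too many times without being deleted moves to the dead-letter list.

```python
import uuid

class TokenQueue:
    def __init__(self, max_retries=3):
        self.pending = []     # messages waiting to be received
        self.in_flight = {}   # receipt token -> message
        self.dead = []        # messages that exhausted their retries
        self.receives = {}    # message id -> receive count
        self.max_retries = max_retries

    def send(self, body):
        self.pending.append({"id": str(uuid.uuid4()), "body": body})

    def receive(self):
        if not self.pending:
            return None, None
        msg = self.pending.pop(0)
        self.receives[msg["id"]] = self.receives.get(msg["id"], 0) + 1
        if self.receives[msg["id"]] > self.max_retries:
            self.dead.append(msg)        # dead-letter after N receives
            return self.receive()
        token = str(uuid.uuid4())        # identifies this particular receive
        self.in_flight[token] = msg
        return token, msg

    def delete(self, token):
        # Called by the worker on success; the message is gone for good.
        del self.in_flight[token]

    def timeout(self, token):
        # Visibility timeout expired without a delete: requeue for retry.
        self.pending.append(self.in_flight.pop(token))

q = TokenQueue(max_retries=2)
q.send("resize_image")
token, msg = q.receive()
q.timeout(token)     # worker crashed; message becomes visible again
token, msg = q.receive()
q.delete(token)      # second attempt succeeded
```

The token-per-receive detail matters: it stops a slow first worker from deleting a message that has since been redelivered to someone else.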
- HTTP libraries
- webservers
- application servers
- load balancers
- reverse proxy servers
- the cloud platform you're running on
- WAFs
It might be alright for smaller "tasks", but not for "jobs".