The nice thing about this is that you can use a runtime like Cloud Run or Lambda and let that runtime scale based on HTTP requests and also scale to zero.
Setting up autoscaling for workers can be a bit more finicky; e.g. in Kubernetes you might set up KEDA autoscaling based on queue-depth metrics, but those might need to be exported from RabbitMQ.
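For concreteness, a rough sketch of pulling that queue depth from the RabbitMQ management API so it can be exported as a scaling metric (host, credentials, vhost, and queue name are all placeholders):

    # Rough sketch: read queue depth via the RabbitMQ management API
    # so it can be exported as an autoscaling metric.
    # Host, credentials, vhost, and queue name are placeholders.
    import requests

    def queue_depth(host="rabbit.internal", vhost="%2F", queue="jobs"):
        resp = requests.get(
            f"http://{host}:15672/api/queues/{vhost}/{queue}",
            auth=("guest", "guest"),
            timeout=5,
        )
        resp.raise_for_status()
        return resp.json()["messages"]  # ready + unacknowledged messages

    if __name__ == "__main__":
        print(queue_depth())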
I suppose you could have a setup where your daemon worker makes HTTP requests and in that sense "pushes" jobs to the place where they actually run, but this adds another layer of complexity.
Is there any plan to support a push model, where you can push jobs over HTTP to daemons that hold the HTTP connections open?
The daemon feels fragile to me. Why not just shut down the worker client-side after some period of inactivity?
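Something like this is usually enough (a sketch; `queue.pop` and `process` stand in for whatever broker client and job handler you actually use):

    # Sketch of a client-side idle shutdown: the worker process exits
    # after `idle_timeout` seconds without receiving a job.
    # `process` and the `queue` object are placeholders.
    import time

    def process(job):
        print("handling", job)  # real work goes here

    def run_worker(queue, idle_timeout=300):
        last_job = time.monotonic()
        while time.monotonic() - last_job < idle_timeout:
            job = queue.pop(timeout=5)  # hypothetical blocking pop
            if job is None:
                continue
            process(job)
            last_job = time.monotonic()
        # returning here ends the process, so nothing sits idle

Once it exits, whatever supervises the worker can decide when to start another one.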
You can configure Cloud Run to always allocate CPU, but it's a lot more expensive. I don't think it would be a good autoscaling story, since autoscaling is based on HTTP requests being processed. (Maybe it can be done via CPU, but that may not be what you want; the work may not even be CPU-bound.)
Having HTTP targets means you get the rate limiting, middleware, and observability your regular application already uses, and you aren’t tied to whatever backend the task system supports.
Set up a separate scaling group and away you go.
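As a sketch of what that buys you, assuming a Flask app where `check_auth` stands in for whatever middleware you already run:

    # Sketch: the task target is just another route in your existing app,
    # so it goes through the same middleware, logging, and rate limiting.
    # `check_auth` and the route path are stand-ins.
    from flask import Flask, request

    app = Flask(__name__)

    @app.before_request
    def check_auth():
        ...  # your existing auth / tracing / rate-limit hooks run here too

    @app.route("/tasks/resize-image", methods=["POST"])
    def resize_image_task():
        payload = request.get_json()
        ...  # do the actual work
        return "", 200  # non-2xx tells the pusher the task failed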
That just means there's a lightweight worker that does the HTTP POST to your "subscriber", with retries etc., just like it's done here.
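Roughly (a sketch; the subscriber URL and payload shape are made up, and real code would tune the backoff):

    # Sketch of the lightweight worker: POST a job to the subscriber
    # and retry with exponential backoff on failure.
    # URL and payload shape are placeholders.
    import time
    import requests

    SUBSCRIBER_URL = "https://app.example.com/tasks/handle"

    def deliver(job, max_attempts=5):
        for attempt in range(max_attempts):
            try:
                resp = requests.post(SUBSCRIBER_URL, json=job, timeout=30)
                if resp.ok:              # non-error status: treat as delivered
                    return True
            except requests.RequestException:
                pass                     # network error: treat as retryable
            time.sleep(2 ** attempt)     # exponential backoff
        return False                     # give up / dead-letter the job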
You simply define a task using our API and we take care of pushing it to any HTTP endpoint, holding the connection open and using the HTTP status code to determine success/failure, whether or not we should retry, and so on.
Happy to answer any questions here or over email james@mergent.co