
I like that idea. Basically, the first HTTP request ensures the worker gets spun up on a lambda, and the task gets picked up on the next poll once the worker is running. We already have the underlying push model for our streaming feature: https://docs.hatchet.run/home/features/streaming. We can configure this to post to an HTTP endpoint pretty easily.

The daemon feels fragile to me. Why not just shut down the worker client-side after some period of inactivity?
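
To make the idea concrete, here is a rough sketch of what "wake on the first request, shut down after inactivity" could look like. It is not Hatchet's API: `IdleWorker`, `wake`, `idle_timeout`, and `poll_interval` are hypothetical names, and an in-process `queue.Queue` stands in for the real task queue.

```python
import queue
import threading
import time


class IdleWorker:
    """Polls a task queue and stops itself after `idle_timeout` seconds without work.

    Hypothetical illustration only; the queue stands in for a real task source.
    """

    def __init__(self, tasks, idle_timeout=30.0, poll_interval=1.0):
        self.tasks = tasks
        self.idle_timeout = idle_timeout
        self.poll_interval = poll_interval
        self.processed = []
        self._thread = None
        self._lock = threading.Lock()

    def wake(self):
        """Meant to be called from the HTTP "wake up" handler.

        Starts the poll loop if it is not already running.
        """
        with self._lock:
            if self._thread is None or not self._thread.is_alive():
                self._thread = threading.Thread(target=self._run, daemon=True)
                self._thread.start()

    def _run(self):
        last_activity = time.monotonic()
        # Keep polling until no task has arrived for `idle_timeout` seconds.
        while time.monotonic() - last_activity < self.idle_timeout:
            try:
                task = self.tasks.get(timeout=self.poll_interval)
            except queue.Empty:
                continue
            self.processed.append(task())
            last_activity = time.monotonic()

    def join(self, timeout=None):
        """Wait for the poll loop to exit (mainly useful in tests)."""
        if self._thread is not None:
            self._thread.join(timeout)
```

The HTTP handler would only need to call `wake()` and return. One caveat with this sketch: a `wake()` that arrives just as the loop is exiting can be missed, so a real version would probably want the sender to retry.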

replies(1): >>jerryg+V5

>>abelan+(OP)
I think it depends on the HTTP runtime. One of the things with Cloud Run is that if the server is not handling requests, it doesn't get CPU time. So even if the first request is "wake up", the worker wouldn't get any CPU to poll outside of the request-response cycle.

You can configure Cloud Run to always allocate CPU, but it's a lot more expensive. I don't think it would be a good autoscaling story, since autoscaling is based on HTTP requests being processed. (Maybe it can be done via CPU, but that may not be what you want; the workload may not even be CPU-bound.)