> Could not find any specifics on generative AI in your docs. Thanks
> How do you distribute inference across workers?
In Hatchet, "run inference" would be a task. By default, tasks are dequeued in FIFO order and assigned to whichever worker has available capacity, but we give you a few options for controlling how tasks get ordered and dispatched. For example, say you'd like to limit each user to 1 inference task at a time per session: you could do this by setting a concurrency key of "<session-id>" with `maxRuns=1` [1]. For each session key, only 1 inference task runs at a time; the goal is fairness across sessions.
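Roughly, in the TypeScript SDK that looks like the sketch below (field names like sessionId and the runModel helper are illustrative, and exact syntax may vary by SDK version):

    import { Workflow, ConcurrencyLimitStrategy } from "@hatchet-dev/typescript-sdk";

    // stand-in for your actual model call (hypothetical)
    async function runModel(prompt: string): Promise<string> {
      return `echo: ${prompt}`;
    }

    const inferenceWorkflow: Workflow = {
      id: "run-inference",
      on: { event: "inference:requested" },
      concurrency: {
        name: "inference-per-session",
        // group runs by session so the limit applies per session
        key: (ctx) => ctx.workflowInput().sessionId,
        maxRuns: 1, // at most 1 inference task in flight per session
        limitStrategy: ConcurrencyLimitStrategy.GROUP_ROUND_ROBIN,
      },
      steps: [
        {
          name: "run-inference",
          run: async (ctx) => {
            const { prompt } = ctx.workflowInput();
            // return value must be JSON-serializable
            return { completion: await runModel(prompt) };
          },
        },
      ],
    };

With GROUP_ROUND_ROBIN, queued tasks are drained one per session in turn, so a session that enqueues 100 requests can't starve the others.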
> Can one use just any protocol?
We handle the communication between the worker and the queue over a gRPC connection, and we assume you're passing JSON-serializable objects through the queue.
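So in practice a worker just registers tasks and opens that connection; you don't pick a wire protocol yourself. A sketch, reusing the inferenceWorkflow above (method names follow the TypeScript SDK as I understand it, so double-check them against the docs for your version):

    import Hatchet from "@hatchet-dev/typescript-sdk";

    const hatchet = Hatchet.init(); // connection config read from env vars

    async function main() {
      const worker = await hatchet.worker("inference-worker");
      await worker.registerWorkflow(inferenceWorkflow);
      worker.start(); // opens the long-lived gRPC connection to the queue

      // the event payload travels as JSON: plain objects, strings,
      // numbers, arrays -- pass large binary artifacts by reference
      // (e.g. an object-store key), not inline
      await hatchet.event.push("inference:requested", {
        sessionId: "abc-123",
        prompt: "Summarize this document.",
      });
    }

    main();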
[1] https://docs.hatchet.run/home/features/concurrency/round-rob...