zlacker

[parent] [thread] 2 comments
1. barrel+(OP)[view] [source] 2024-06-28 05:12:38
I have a bunch of different queues that used SAQ (~50 LoC for the whole setup) and deployed it to production. A lot of them use LLMs, and when one of them failed it was near impossible to debug. Every workflow has over a dozen connected tasks, and every task can run on over a dozen separate rows before completing... I was spending hours in log files (often unsuccessfully)

The dashboard in Hatchet has a great GUI where you can navigate between all the tasks, see how they all connect, see the data passed in to each one, see the return results from each task, and each one has a log box you can print information to. You can rerun tasks, override variables, trigger identical workflows, filter tasks by metadata

It's dramatically reduced the amount of time it takes me to spot, identify, and fix bugs. I miss the simplicity of SAQ but that's the reason I switched and it's paid off already

replies(1): >>yuppie+F4
2. yuppie+F4[view] [source] 2024-06-28 06:19:12
>>barrel+(OP)
Is that a problem with the underlying infrastructure though? Im not seeing how using postgres queues would solve your issue... Instead it seems like an issue with your client lib, SAQ not providing the appropriate tooling to debug.

FWIW, Ive used both dramatiq/celery with redis in heavy prod environments and never had an issue with debugging. And Im having a tough time understanding how switching the underlying queue infrastructure would have made my life easier.

replies(1): >>barrel+H6
◧◩
3. barrel+H6[view] [source] [discussion] 2024-06-28 06:52:28
>>yuppie+F4
No it's not a problem with the underlying infrastructure. I believe the OP was asking why use this product, not why is this specific infrastructure necessary. The infrastructure before was working fine (with SAQ at least, Celery was an absolute mess of SIGFAULTs), so that was not really part of my decision. I actually really liked SAQ and probably preferred it from an infra perspective.

It's nice to be running on Postgres (i.e. not really having to worry about payload size, I heard some people were passing images from task to task) but for me that is just a nicety and wasn't a reason to switch.

If you're happy with your current infra, happy with the visibility, and there's nothing lacking in the development perspective, then yeah probably not much point in switching your infra to begin with [1]. But if you're building complicated workflows, and just want your code to run with an extreme level of visibility, it's worth checking out Hatchet.

[1] I'm sure the founders would have more to say here, but as a consumer I'm not really deep in the architecture of the product. Best I could do could be to give you 100 reasons I will never use Celery again XD

[go to top]