One endpoint accepts work for a named queue and writes it to a file in a directory on XFS. Another locks a mutex, moves the file to an in-progress directory, and unlocks the mutex before passing the content to the reader. A third and final endpoint deletes the in-progress job file. There is a configurable timeout, after which jobs still in progress end up in a dead letter box. I am simplifying only a little bit. It's a couple hundred lines of Go.
The way this is set up means a message will only ever be handed to one worker. That simplifies things a lot. The workers ask for work when they want it, rather than listening constantly.
It took a little tuning, but we process a couple billion events a day this way, and it's been basically zero maintenance for almost 10 years. The wizards in devops even figured out a way to autoscale it.
We always wanted to open source it, but we got bought out by a big, very IP-protective company before we got the chance.