Sadly, tesser is not advertised as much as it should be; I find it much more flexible than transducers. E.g. you can parallelize tesser code over a Spark/Hadoop cluster.
https://github.com/johnmn3/injest
I can squeeze more performance out of tesser, but injest gives me a surprising boost with very little ceremony most of the time.