zlacker

Accuracy is a little wonky even with real speech to text toolkits like Kaldi (which I’ll mention is a pain to even get started with it).

I’ve had some decent results with the following:

https://cmusphinx.github.io/

I have to research how to hand tag my own samples to see if that offers significant accuracy improvements (let’s say I want to accurately transcribe one voice consistently).

Google and Watson APIs are not too free, and I believe Watson has a audio length limit (possibly limited by free tier, or possibly limited in general for all tiers).

Cool to see some real world attempts using this stuff.