It’s hit and miss, in my opinion. It’ll give you a good enough base to refine the transcript from, but I’ve yet to come across a transcript that doesn’t need editing. (Which is annoying, since Zoom doesn’t give you that option.) I’d say it’s more valuable having the tool than not, but don’t expect miracles.
(I’m not affiliated with Otter or Zoom in any way.)
My goal is to train my own models on the corrected transcriptions (I work in the speech recognition space) so I can transcribe many live feeds inexpensively.
I will reply here with a link (hopefully later today) once I've fixed a couple of remaining UX bugs.
Maybe there's something unique about how these low-quality radio transmissions sound that makes these ineffective?
Are you doing any kind of speaker identification?
This is definitely a very hard problem to solve.
https://www.broadcastify.com/listen/feed/32890
(edit: also, thank you for keeping this service up and running for so long, have been a regular user since the early RR days. Would love to have a comment/live chat option if your backlog is getting bare :))
Repo is here if you need to report (or just fix :D) bugs in the webapp: https://github.com/lunixbochs/feeds
[1] https://feeds.talonvoice.com
Repo is here if you need to report (or fix) bugs in the webapp: https://github.com/lunixbochs/feeds
If you want to help with development, reach out and I can onboard + give some test data.
I was wondering if you could estimate what it would cost to run always-on recording of all these radio conversations, plus the cost of running the speech-to-text ML and the cost of labeling the data?
I think having these rough estimates will make donations easier for people.
I think once my models are viable enough to do this at scale, the cost will basically be the cost of running a dedicated server per N streams. So maybe $100-300/mo per server, where N is probably at least 100 concurrent streams per server. I will know this better in "stage 2", when I attempt to scale this up. It's also a fairly distributed problem, so I can look into doing it Folding@home style, or even have the stream's originator run the transcription in some cases to keep costs down.
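To make that estimate a bit more concrete, here is a rough back-of-the-envelope sketch. The $100-300/mo and N = 100 figures are the guesses from above. The function names and the 250-stream example are made up purely for illustration, and this only counts server cost, not labeling or bandwidth.

```python
def cost_per_stream(server_cost_per_month: float, streams_per_server: int) -> float:
    """Monthly server cost attributable to one stream on a fully loaded server."""
    if streams_per_server <= 0:
        raise ValueError("streams_per_server must be positive")
    return server_cost_per_month / streams_per_server


def servers_needed(total_streams: int, streams_per_server: int) -> int:
    """Number of servers required, rounding up to a whole server."""
    if streams_per_server <= 0:
        raise ValueError("streams_per_server must be positive")
    return -(-total_streams // streams_per_server)  # ceiling division


# Guessed inputs from the estimate above (assumptions, not measurements).
STREAMS_PER_SERVER = 100
LOW_SERVER_COST, HIGH_SERVER_COST = 100.0, 300.0

low = cost_per_stream(LOW_SERVER_COST, STREAMS_PER_SERVER)
high = cost_per_stream(HIGH_SERVER_COST, STREAMS_PER_SERVER)
print(f"Per-stream cost: ${low:.2f} to ${high:.2f} per month")

# Hypothetical fleet size, just to show how the rounding works.
fleet = servers_needed(250, STREAMS_PER_SERVER)
print(f"250 streams -> {fleet} servers, "
      f"${fleet * LOW_SERVER_COST:.0f} to ${fleet * HIGH_SERVER_COST:.0f} per month")
```

With those guesses it comes out to roughly $1-3 per stream per month, assuming every server is fully loaded. If N turns out to be higher or lower in practice, the per-stream number scales inversely.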
(trunk-recorder + rdio scanner).
The UI is:
https://cvgscan.iwdo.xyz for the live stuff, but let me know if you're interested in the data -- my email is in my profile.