I'm the owner of Broadcastify.com, where these streams are presumably being transcribed from. We've dabbled in this space and looked at real-world approaches to taking something like this to market, but transcribing 7000+ streams to text seems like an expensive effort, both computationally and in dollars, that needs a lot of investigation.
Not to mention that the individual lexicons between streams are drastically different.
I wonder how the developer has done the integration to our streams... because I never heard from them :)
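For a rough sense of the scale behind "7000+ streams", here is a back-of-the-envelope sketch. It assumes every stream carries audio around the clock, which is an upper bound, and `price_per_minute` is a placeholder you would fill in yourself, not a quote from any real service. The function names are made up for illustration.

```python
def daily_audio_minutes(num_streams, hours_per_day=24.0):
    """Total minutes of audio produced per day across all streams."""
    return num_streams * hours_per_day * 60


def daily_cost(num_streams, price_per_minute, hours_per_day=24.0):
    """Daily transcription cost for a given (hypothetical) per-minute price."""
    return daily_audio_minutes(num_streams, hours_per_day) * price_per_minute


if __name__ == "__main__":
    minutes = daily_audio_minutes(7000)
    print(f"{minutes:,.0f} stream-minutes of audio per day")
    # price_per_minute below is a placeholder, not a real vendor price.
    print(f"cost at a placeholder rate: {daily_cost(7000, price_per_minute=0.01):,.2f}")
```

Lowering `hours_per_day` to reflect how much of the day a feed is actually active would shrink the estimate accordingly.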
https://rogueamoeba.com/loopback/
Someone clever enough could create containers to run the software locally and have many loops, each routing one stream into its own instance of the audio-to-text feature.
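The fan-out idea above (many streams, many transcription instances) might look roughly like this. It is only a sketch: `transcribe_stub` stands in for whatever audio-to-text tool each container would actually run, the stream names are invented, and a thread pool is used here in place of real containers just to show the one-stream-per-worker shape.

```python
from concurrent.futures import ThreadPoolExecutor


def assign_streams(stream_urls, num_workers):
    """Round-robin the streams across a fixed number of worker slots."""
    buckets = [[] for _ in range(num_workers)]
    for i, url in enumerate(stream_urls):
        buckets[i % num_workers].append(url)
    return buckets


def transcribe_stub(url):
    """Placeholder: a real worker would pull audio from `url` and run speech-to-text."""
    return f"transcript for {url}"


def run_all(stream_urls, num_workers, transcribe=transcribe_stub):
    """Run one transcription call per stream, at most `num_workers` at a time."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # pool.map returns results in the same order as the input streams.
        return dict(zip(stream_urls, pool.map(transcribe, stream_urls)))


if __name__ == "__main__":
    streams = [f"stream-{n}" for n in range(5)]  # hypothetical stream names
    print(assign_streams(streams, 2))
    print(run_all(streams, num_workers=2))
```

In a real setup each bucket from `assign_streams` would map to one container, and the stub would be replaced by the actual capture and transcription step.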
I’ve had some decent results with the following:
I have to research how to hand-tag my own samples to see if that offers significant accuracy improvements (let’s say I want to accurately transcribe one voice consistently).
Google and Watson APIs are not exactly free, and I believe Watson has an audio length limit (possibly only on the free tier, or possibly for all tiers).
Cool to see some real-world attempts using this stuff.