Thousands are monitoring police scanners during the George Floyd protests

>>eloran+(OP)
Hi there! I'm the owner and operator of Broadcastify, which is the platform that powers all the apps that provide police scanners and public safety communications online. I'm an active HN reader and would be glad to answer any questions folks have.

It's an interesting business to be in these days...

>>blanto+h7
Is there a text transcript feature for users who may want to search through the communications? I'm curious how well those speech-to-text tools work for the audio feeds.

>>autojo+ww1
It's a hard problem. I'm prototyping this here [1]. Any user can tweak or vote on transcriptions, so my goal is to use the user annotations to help train models and make it better.

[1] https://feeds.talonvoice.com

Repo is here if you need to report (or fix) bugs in the webapp: https://github.com/lunixbochs/feeds

If you want to help with development, reach out and I can onboard + give some test data.

>>lunixb+DP1
Great to see you working on this!

I was wondering if you could estimate what it would cost to run always-on recording of all these radio conversations, plus the cost of running the speech-to-text ML and the cost of labeling the data.

I think having these rough estimates will make donations easier for people.

>>dspoka+q32
Great question! Unfortunately the long-term costs aren't clear yet. Right now I'm using Google speech as a bootstrapping technique, but that is prohibitively expensive to run long term.

I think once my models are viable enough to do this at scale, the cost will basically be the cost of running a dedicated server per N streams. So maybe $100-300/mo per N streams, where N could be at least 100 concurrent streams per server. I'll know this better in "stage 2", where I'm attempting to scale this up. It's also a fairly distributed problem, so I can look into doing it folding@home style, or even have the stream's originator run transcription in some cases to keep costs down.
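
To make the arithmetic in that estimate concrete, here is a minimal back-of-the-envelope sketch. It only uses the rough figures quoted above ($100-300/mo per server, about 100 concurrent streams per server); the function names and the assumption that servers are rented in whole units are illustrative, not anything the project has committed to.

```python
import math

# Rough figures from the estimate above; all of these are assumptions
# that would need to be revised once real scaling data exists.
SERVER_COST_LOW = 100.0    # USD per server per month (low end)
SERVER_COST_HIGH = 300.0   # USD per server per month (high end)
STREAMS_PER_SERVER = 100   # "N": concurrent streams one server might handle


def monthly_cost_range(streams, streams_per_server=STREAMS_PER_SERVER):
    """Return (low, high) monthly cost in USD for a number of streams.

    Assumes servers are rented whole, so the count is rounded up.
    """
    servers = math.ceil(streams / streams_per_server)
    return servers * SERVER_COST_LOW, servers * SERVER_COST_HIGH


def per_stream_cost_range(streams_per_server=STREAMS_PER_SERVER):
    """Return (low, high) monthly cost per stream on a fully loaded server."""
    return (SERVER_COST_LOW / streams_per_server,
            SERVER_COST_HIGH / streams_per_server)


if __name__ == "__main__":
    low, high = monthly_cost_range(250)
    print(f"250 streams: ${low:.0f}-${high:.0f} per month")
    low, high = per_stream_cost_range()
    print(f"Per stream at full load: ${low:.2f}-${high:.2f} per month")
```

Under those assumptions a fully loaded server works out to roughly $1-3 per stream per month. Google speech bootstrapping and labeling costs are not included, since no figures are given for them.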
