Transcribed police scanners in real-time

>>illumi+(OP)
Hello Hacker News!!!

I am the developer of murph.live - I just want to thank all of you for taking the time to check it out and give us excellent feedback. I stumbled upon this post and now have goosebumps.

This started with me listening to police scanners throughout the night during the recent protests in Seattle, WA. I wanted to help, so I immediately put my credit card down for Google's Speech-to-Text API.

As for the inbound streams, @blantonl is spot on - we use the streams from a premium account on broadcastify.com (thank you for not sending a cease and desist yet!).

A few dockerized ffmpeg processes segment the streamed audio into 30-second WAV files. sox then removes silence from those files, since police scanners have quite a bit of downtime between transmissions. Running the recording and trimming in Docker containers scales well.
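A rough sketch of that record-and-trim step in Python, assuming ffmpeg and sox are on the PATH. The stream URL, output pattern, 16 kHz mono format, and silence thresholds are my own placeholder assumptions, not the values murph.live actually uses.

```python
import subprocess


def ffmpeg_segment_cmd(stream_url, out_pattern, seconds=30):
    """Build an ffmpeg command that cuts a live stream into fixed-length WAV files."""
    return [
        "ffmpeg",
        "-i", stream_url,               # inbound scanner stream (hypothetical URL)
        "-f", "segment",                # ffmpeg's segment muxer
        "-segment_time", str(seconds),  # length of each chunk in seconds
        "-ac", "1",                     # mono (assumption)
        "-ar", "16000",                 # 16 kHz sample rate (assumption)
        "-c:a", "pcm_s16le",            # 16-bit PCM audio
        out_pattern,                    # e.g. "chunk_%05d.wav"
    ]


def sox_trim_cmd(src, dst, min_sound="0.1", min_silence="0.5", threshold="1%"):
    """Build a sox command that strips silence at the start and between transmissions."""
    return [
        "sox", src, dst,
        "silence",
        "1", min_sound, threshold,      # drop leading silence
        "-1", min_silence, threshold,   # drop silent gaps throughout the file
    ]


def run(cmd):
    """Run one command and raise if it exits with a non-zero status."""
    subprocess.run(cmd, check=True)
```

Calling `run(ffmpeg_segment_cmd(url, "chunk_%05d.wav"))` in one container and `run(sox_trim_cmd(chunk, trimmed))` on each finished chunk in another would mirror the split described above.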

Currently, we pipe these trimmed wav files to the Google Speech API - as others have mentioned this is $$$. We are receiving donations, but this dependency on Google will eventually need to be eliminated.
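For anyone curious what handing a trimmed chunk to Google looks like, here is a hedged sketch that builds the JSON body for a synchronous request to the v1 REST endpoint (`https://speech.googleapis.com/v1/speech:recognize`). Whether murph.live uses the REST endpoint or a client library is not stated here, so treat this as an illustration only. The helper names and the `en-US` language code are assumptions, and the sketch stops short of sending anything.

```python
import base64
import io
import json
import wave


def build_recognize_body(wav_bytes, language_code="en-US"):
    """Build the JSON body for a synchronous speech:recognize request."""
    # Read the sample rate from the WAV header so the config matches the audio.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        sample_rate = wav.getframerate()
    return {
        "config": {
            "encoding": "LINEAR16",
            "sampleRateHertz": sample_rate,
            "languageCode": language_code,  # assumption: English-language scanners
        },
        # The REST API expects the audio bytes base64-encoded.
        "audio": {"content": base64.b64encode(wav_bytes).decode("ascii")},
    }


def make_silent_wav(seconds=1, sample_rate=16000):
    """Create a tiny silent WAV in memory so the sketch runs without real audio."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * sample_rate * seconds)
    return buf.getvalue()


body = build_recognize_body(make_silent_wav())
payload = json.dumps(body)  # this string would be POSTed with an API key or token
```

Since every chunk is a separate billed request, trimming silence before this step is what keeps the bill from being even larger.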

I have started looking into possible solutions using NLP and other acoustic models to bring the costs down. Honestly, speech processing is not my forte so I'm kind of shooting in the dark here. I am currently testing pre-trained models for wav2letter++, Kaldi, Vosk, and maybe DeepSpeech.
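One common way to compare candidate models like these is word error rate (WER): the word-level edit distance between a trusted transcript and the model's output, divided by the length of the trusted transcript. How the models are actually being evaluated is not stated here, so this is only a sketch, and the function name is my own.

```python
def word_error_rate(reference, hypothesis):
    """Return word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Running each model over the same hand-checked set of scanner clips and comparing the resulting WER would give a concrete number to weigh against the Google bill.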

We can all agree the quality of the transcripts leaves something to be desired and needs to improve. An incorrect transcription is potentially dangerous, but we nonetheless wanted to launch to give citizens a platform that provides transparency into our government. The idea is what counts right now.

Thanks again and I will be responding to a bunch of comments on here! You all rock!
