Transcribed police scanners in real-time

1. blanto+G2 2020-06-08 23:08:00
>>illumi+(OP)
This is very impressive.

I'm the owner of Broadcastify.com, which is presumably where these streams are being transcribed from. We've dabbled in this space and looked at real-world approaches to taking something like this to market, but transcribing 7000+ streams to text looks like an effort that's expensive both computationally and in dollars, and it needs a lot of investigation.

Not to mention that the individual lexicons differ drastically from stream to stream.

I wonder how the developer did the integration with our streams... because I never heard from them :)

2. lunixb+V4 2020-06-08 23:26:48
>>blanto+G2
I prototyped this concept too, at https://feeds.talonvoice.com, with prohibitively expensive Google speech recognition, but mine also has a feature for users to listen and fix transcriptions. If murph was anything like me, they probably paid for Broadcastify and tailed a couple of the static mp3 feeds.
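Roughly what I mean by tailing a feed, in case it helps; the URL and chunk size here are placeholders, not Broadcastify's actual endpoints:

    # Stream an mp3 feed over HTTP and hand fixed-size chunks to a recognizer.
    import requests

    FEED_URL = "https://example.com/feeds/12345.mp3"  # placeholder, not a real feed
    CHUNK_BYTES = 64 * 1024  # roughly a few seconds of audio at typical bitrates

    def tail_feed(url):
        # stream=True keeps the connection open so we read audio as it arrives
        with requests.get(url, stream=True, timeout=30) as resp:
            resp.raise_for_status()
            for chunk in resp.iter_content(chunk_size=CHUNK_BYTES):
                if chunk:
                    yield chunk

    for audio_chunk in tail_feed(FEED_URL):
        pass  # decode mp3 to PCM here and feed it to your ASR model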

My plan was to collect user transcription corrections on my site, then train my own inexpensive models on them. The open-source speech tech I work on can do passable transcription at close to 100x realtime on a quad-core desktop CPU (or 200 simultaneous streams per 4-core box at 50% activity; arithmetic below). With higher-quality transcription it's closer to 10-20x realtime.
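To spell out that throughput arithmetic:

    # A box that transcribes at 100x realtime chews through 100 hours of
    # audio per wall-clock hour; a scanner feed at 50% activity only
    # produces half an hour of audio per hour.
    speedup = 100        # transcription speed vs realtime on the 4-core box
    activity = 0.5       # fraction of each stream that is actual audio
    streams = speedup / activity
    print(streams)       # 200.0 simultaneous streams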

For your case, you could also try to push some of the computation down to the uploading machine; these models can run on a Raspberry Pi.

I think the biggest piece of work for a new effort here is going to be building local language models and collecting transcribed audio to train on. However, there have been a couple of incredible advances in the last year in semi-supervised learning for speech recognition, which could probably leverage your 1-year backlog as unsupervised training data while only a small portion of it is properly transcribed.

The current state-of-the-art paper uses around 100 hours of transcribed audio and 60,000 hours of unlabeled audio, and I bet you could push the 100h requirement down further with a good language model and by mixing in existing training data from non-radio sources.
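The rough shape of that loop, if it's useful; this is a self-training / pseudo-labeling sketch, and the function names and confidence floor are mine, not anything from the paper:

    # One round of self-training: `transcribe` and `train` stand in for
    # whatever ASR toolkit you use.
    from typing import Callable, Iterable, List, Tuple

    def self_training_round(
        transcribe: Callable[[bytes], Tuple[str, float]],   # clip -> (text, confidence)
        train: Callable[[List[Tuple[bytes, str]]], object], # pairs -> new model
        labeled: List[Tuple[bytes, str]],  # the small hand-checked set (~100h)
        unlabeled: Iterable[bytes],        # the year-long untranscribed backlog
        confidence_floor: float = 0.9,     # made-up threshold, tune it
    ):
        # transcribe the backlog and keep only the confident outputs
        pseudo = []
        for clip in unlabeled:
            text, confidence = transcribe(clip)
            if confidence >= confidence_floor:
                pseudo.append((clip, text))
        # retrain on hand-labeled data plus the confident pseudo-labels
        return train(labeled + pseudo)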

3. runawa+fy 2020-06-09 04:45:23
>>lunixb+V4
I’d love to read a write-up on this if you ever feel the urge.