Your prototype is amazing! The transcription quality is definitely better than what we get via Google.
After doing some legal research, we decided to avoid storing the recordings and instead keep only the transcription text. Giving people access to a platform where they can verify the transcriptions and in turn train the model is a great idea.
I have started getting some pre-trained models set up. I am trying to run them with wav2letter, DeepSpeech, Kaldi, Vosk, etc. - I just need to be pointed in the right direction.
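For reference, this is roughly the kind of thing I'm sketching out with Vosk's Python API (the model directory and WAV file name below are just placeholders, not anything final):

```python
# Minimal sketch: transcribe a 16 kHz mono, 16-bit PCM WAV file with a pre-trained Vosk model.
import wave
import json
from vosk import Model, KaldiRecognizer

model = Model("model")                  # path to the downloaded Vosk model directory (placeholder)
wf = wave.open("sample_audio.wav", "rb")  # placeholder input file; must be mono 16-bit PCM

rec = KaldiRecognizer(model, wf.getframerate())

results = []
while True:
    data = wf.readframes(4000)
    if len(data) == 0:
        break
    if rec.AcceptWaveform(data):
        # A complete utterance was recognized; collect its text
        results.append(json.loads(rec.Result())["text"])

# Flush whatever is left in the recognizer
results.append(json.loads(rec.FinalResult())["text"])

print(" ".join(results))
```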
Raspberry Pis were something I was considering as well - small energy footprint and powerful enough to run these models.
Do you have any advice on which ML or acoustic models to avoid? I am working with the 100-hour dataset now.
Thanks!
I have 30ish streams and keep 6 days' worth; I could keep more if you'd like to work together on this. I reached out to some of the people above (the Broadcastify guy, for example), and as mentioned they are already doing their own thing, so they didn't really care about what I wanted to share.