zlacker

[return to "Voxtral Transcribe 2"]
1. XCSme+4z 2026-02-04 17:43:21
>>meetpa+(OP)
Is it just me, or is an error rate of 3% really high?

If you transcribe a minute of conversation (roughly 150 words at a typical speaking pace), you'll have about 5 words transcribed wrongly. In an hour-long podcast, that is about 300 wrongly transcribed words.
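
For anyone who wants to check the arithmetic, here is a minimal sketch. The 150 words-per-minute speaking rate is an assumption (it is not stated in the thread), and `expected_errors` is a hypothetical helper name used only for illustration.

```python
def expected_errors(wer: float, words_per_minute: float, minutes: float) -> float:
    """Expected number of wrongly transcribed words for a given word error rate."""
    return wer * words_per_minute * minutes


# Assumed speaking rate; real conversations vary quite a bit.
WORDS_PER_MINUTE = 150
WER = 0.03  # the 3% error rate discussed above

per_minute = expected_errors(WER, WORDS_PER_MINUTE, 1)
per_hour = expected_errors(WER, WORDS_PER_MINUTE, 60)

print(f"{per_minute:.1f} errors per minute")
print(f"{per_hour:.0f} errors per hour")
```

Under that assumption this comes out to about 4.5 errors per minute and 270 per hour, so the "5 per minute, 300 per hour" figures are a rounded-up version of the same estimate.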

◧◩
2. cootsn+Vz 2026-02-04 17:46:47
>>XCSme+4z
The error rate for human transcription can be as high as 5%.
◧◩◪
3. qingch+vL1 2026-02-04 23:38:29
>>cootsn+Vz
I did transcription for a while in 2021. It is absurdly hard, especially as these days humans only get the difficult jobs that AI has already taken a stab at.

The hardest one I did was for a sports network: a motocross event where most of what you could hear was the roar of the bikes. There were two commentators I had to transcribe over the top of that mess, and they were using insider slang nicknames for all the riders, not their published names, so I had to sit and Google forums to find the riders' names while I was listening. I'm not sure how these local models would be able to handle that insanity, because they almost certainly lack the domain knowledge.
