zlacker

[parent] [thread] 5 comments
1. coder5+(OP)[view] [source] 2026-02-04 16:16:09
The diarization is on Voxtral Mini Transcribe V2, not Voxtral Mini 4B.
replies(2): >>observ+I3 >>sbroth+f9
2. observ+I3[view] [source] 2026-02-04 16:30:57
>>coder5+(OP)
Ahh, yeah, and it explicitly doesn't work for realtime streams. Good catch!
3. sbroth+f9[view] [source] 2026-02-04 16:54:40
>>coder5+(OP)
Do you have experience with that model for diarization? Does it feel accurate, and what's its realtime factor on a typical GPU? Diarization has been the biggest thorn in my side for a long time.
replies(2): >>coder5+cf >>ashenk+TD
4. coder5+cf[view] [source] [discussion] 2026-02-04 17:22:27
>>sbroth+f9
> Do you have experience with that model

No, I just heard about it this morning.

5. ashenk+TD[view] [source] [discussion] 2026-02-04 19:03:41
>>sbroth+f9
You can test it yourself for free at https://console.mistral.ai/build/audio/speech-to-text. I tried it on an English-language podcast episode, and apart from identifying one host as two different speakers (only once, for a few sentences at the start), the rest was flawless from what I could see.
replies(1): >>sbroth+Zh1
6. sbroth+Zh1[view] [source] [discussion] 2026-02-04 22:16:03
>>ashenk+TD
Amazing. Thank you.