zlacker

[return to "Voxtral Transcribe 2"]
1. satvik+Hk[view] [source] 2026-02-04 16:39:08
>>meetpa+(OP)
Looks like this model doesn't do realtime diarization. What model should I use if I want that? So far I've only seen paid models do diarization well. I've heard about Nvidia NeMo but haven't tried it and don't even know where to try it out.
2. breisa+yU[view] [source] 2026-02-04 19:10:54
>>satvik+Hk
Not sure if it's "realtime", but the recently released VibeVoice-ASR from Microsoft does do diarization: https://huggingface.co/microsoft/VibeVoice-ASR