I think it's nice to have specialized models for specific tasks that don't try to be generalists. Voxtral Transcript 2 is already extremely impressive, so imagine how much better it could be if it specialized in specific languages rather than cramming 14 languages into one model.
That said, generalist models definitely have their uses. I do want multilingual transcribing models to exist, I just also think that monolingual models could potentially achieve even better results for that specific language.