zlacker

GPT-4o mini transcribe is better and actually realtime. Whisper is trained to encode the entire audio (or at least 30s chunks) before decoding it, so it can't emit text while the audio is still arriving.
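
To make the 30s chunk point concrete, here is a rough sketch of the windowing step (not Whisper's actual code). The function name and defaults are my own, and the 16 kHz sample rate is an assumption about the input audio. The last window is zero-padded to the full length, so even a short clip costs a whole 30s pass through the encoder before any text comes out.

```python
# Illustrative only: split mono audio samples into fixed 30 s windows.
SAMPLE_RATE = 16_000   # assumed input sample rate in Hz
CHUNK_SECONDS = 30     # window length mentioned in the comment above


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_seconds=CHUNK_SECONDS):
    """Return a list of equal-length windows; the last one is zero-padded."""
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad the final, shorter window with silence up to the full length.
        chunk += [0.0] * (chunk_len - len(chunk))
        chunks.append(chunk)
    return chunks
```

Each window has to be complete (or padded) before it is encoded, which is why this style of model adds latency compared with one that streams.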

replies(2): >>emmett+s >>mdrzn+C

>>GaggiX+(OP)
The linked article claims the average word error rate for Voxtral mini v2 is lower than that of GPT-4o mini transcribe.

replies(1): >>GaggiX+N

>>GaggiX+(OP)
So "GPT-4o mini transcribe" is not just Whisper v3 under the hood? Btw, it's $0.006 / minute.

For an online Whisper API (with large v3), I've found "$0.00125 per compute second", which is the absolute cheapest I've found.

replies(2): >>GaggiX+d1 >>breisa+8G

>>emmett+s
GPT-4o mini transcribe is better than Whisper; the context is the parent comment.

>>mdrzn+C
>So it's not just whisper v3 under the hood?

>>mdrzn+C
Deepinfra offers Whisper V3 at $0.00045 / minute of transcribed audio.
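
The prices in this thread use different units (per minute of audio vs. per compute second), so they aren't directly comparable. A quick sketch of the arithmetic, using the figures quoted above as given and not verified: the `speedup` parameter (seconds of audio transcribed per second of compute) is hypothetical and depends entirely on the hardware and model setup.

```python
# Prices as quoted by commenters in this thread (USD); not verified rates.
GPT4O_MINI_PER_AUDIO_MINUTE = 0.006
DEEPINFRA_PER_AUDIO_MINUTE = 0.00045
WHISPER_PER_COMPUTE_SECOND = 0.00125


def cost_by_audio_minute(audio_minutes, price_per_minute):
    """Cost when billing is per minute of transcribed audio."""
    return audio_minutes * price_per_minute


def cost_by_compute_second(audio_minutes, price_per_second, speedup):
    """Cost when billing is per second of compute.

    speedup: seconds of audio processed per second of compute (assumed).
    """
    compute_seconds = audio_minutes * 60 / speedup
    return compute_seconds * price_per_second


def breakeven_speedup(price_per_second, price_per_minute):
    """Speedup at which per-compute-second billing matches a per-minute price."""
    return 60 * price_per_second / price_per_minute
```

With these numbers, the per-compute-second offer only matches $0.006 / minute if the service runs about 12.5x faster than realtime, and it would need roughly 167x to match $0.00045 / minute.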