zlacker

1. IanCal (OP) 2026-02-04 20:41:44
If you use something like yt-dlp you can download the audio from the meetings, and you could try things out in Mistral's AI Studio.
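For the download step, a rough sketch in Python that only builds the yt-dlp command line. The URL and output template are placeholders I made up, and it assumes yt-dlp and ffmpeg are installed and on your PATH:

```python
import subprocess


def audio_download_cmd(url, out_template="%(title)s.%(ext)s"):
    """Build a yt-dlp command that keeps only the audio track, as m4a."""
    return [
        "yt-dlp",
        "-x",                     # extract audio only, drop the video
        "--audio-format", "m4a",  # convert to m4a (this step needs ffmpeg)
        "-o", out_template,       # output filename template (placeholder)
        url,
    ]


if __name__ == "__main__":
    # Placeholder URL: swap in the real meeting recording.
    cmd = audio_download_cmd("https://example.com/meeting-video")
    print(" ".join(cmd))
    # subprocess.run(cmd, check=True)  # uncomment to actually download
```

Building the argument list separately makes it easy to loop over a list of meeting URLs before committing to the downloads.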

You could use their API (they have this snippet):

```
curl -X POST "https://api.mistral.ai/v1/audio/transcriptions" \
  -H "Authorization: Bearer $MISTRAL_API_KEY" \
  -F model="voxtral-mini-latest" \
  -F file=@"your-file.m4a" \
  -F diarize=true \
  -F timestamp_granularities="segment"
```

Through the API it took 18 s to transcribe a 20-minute audio file I had lying around, in which someone reviews a product.

There will, I'm sure, be ways of running this locally soon (if the weights aren't on Hugging Face already), but the API is $0.003/min. For something like 120 meetings (10 years of monthly ones) at 1 hr each, that's roughly $20. Depending on whether they're 1 or 10 hours long (or whether they're weekly, or monthly but with 10 parallel sessions, or something), this might be a price you're willing to pay if you get the results back in an afternoon.
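To make the back-of-envelope numbers easy to redo for other meeting counts, here is a small Python sketch. The function names are my own, the price is the $0.003/min quoted above, and the time estimate assumes throughput scales linearly from my single 18 s per 20 min run with files processed one after another, which may well not hold:

```python
PRICE_PER_MINUTE = 0.003   # USD per audio minute, as quoted above
SECONDS_PER_20_MIN = 18    # from one test run; an assumption, not a benchmark


def transcription_cost(meetings, hours_each, parallel_sessions=1):
    """Estimated API cost in USD for the whole archive."""
    audio_minutes = meetings * hours_each * 60 * parallel_sessions
    return audio_minutes * PRICE_PER_MINUTE


def transcription_hours(meetings, hours_each, parallel_sessions=1):
    """Estimated wall-clock hours, assuming linear scaling and no parallelism."""
    audio_minutes = meetings * hours_each * 60 * parallel_sessions
    return audio_minutes / 20 * SECONDS_PER_20_MIN / 3600


if __name__ == "__main__":
    # 10 years of monthly meetings, 1 hr each
    print(f"${transcription_cost(120, 1):.2f}")       # $21.60
    print(f"{transcription_hours(120, 1):.1f} hours")  # 1.8 hours
    # Same count, but 10 hr meetings
    print(f"${transcription_cost(120, 10):.2f}")      # $216.00
```

So the 1 hr case lands a little over $20 and comfortably inside an afternoon, and the 10 hr case is ten times both.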

Edit: their realtime model can be run with vLLM; the batch model is not open.
