zlacker

[parent] [thread] 9 comments
1. dgello+(OP)[view] [source] 2024-02-14 09:36:12
I interact with ChatGPT by voice pretty often; they have the best speech recognition I’ve ever seen. I can switch between languages (English, French, German) mid-sentence, think aloud, stop mid-sentence, then correct what I just said, and use highly technical terms (even describe code). I don’t even double-check anymore because it’s almost always transcribed correctly. They can ~easily evolve the product into a more generalized conversation UX instead of just a text-based chat.
replies(4): >>Lehere+V5 >>clbrmb+17 >>vwkd+Ca >>danpal+Qa
2. Lehere+V5[view] [source] 2024-02-14 10:54:13
>>dgello+(OP)
If only something like that were available on Android. I cannot dictate messages because my phone is set to English, but most of my messages are in German or French. It's also almost impossible to search for a non-English song when driving.

Multi-language support would be so useful for me.

3. clbrmb+17[view] [source] 2024-02-14 11:11:47
>>dgello+(OP)
This. Whisper is phenomenal. Have you tried the conversational mode? I would love to be able to use that in a more customized agent. I know you can use the conversation mode with a custom GPT, but I’d prefer to write dynamic prompts programmatically. It would be great for a generalized personal assistant that can take notes, send/read email, texts, etc. Could it also be a good filter on social notifications?

Though the TTS side has some trouble switching languages if only single words are embedded. A single German word inside an English sentence can really get butchered. More training needed on multilingual texts (and perhaps preserving italics). But anyways this is really only an issue for early language learning applications in my experience.

replies(1): >>dgello+gG5
4. vwkd+Ca[view] [source] 2024-02-14 11:49:04
>>dgello+(OP)
Do you use the voice chat in the ChatGPT app?

In my experience, pausing even for a moment already makes it submit. This makes a real conversation with pauses for thought difficult, because of the need to hurry before it cuts off.

replies(2): >>killth+Ad1 >>dgello+NE5
5. danpal+Qa[view] [source] 2024-02-14 11:50:48
>>dgello+(OP)
For me, voice is just a different UX for the same underlying model of chat. I'm sure it's good, but I'm not going to sit at my computer talking to it, and in fact I think talking may have a worse signal-to-noise ratio than typing, as I can easily use shortcuts with written text.
replies(1): >>dgello+hF5
6. killth+Ad1[view] [source] [discussion] 2024-02-14 17:53:09
>>vwkd+Ca
FWIW if you hold down the big white button it won't submit until you release it. I had no idea this was a thing until seeing someone tweet about it.
replies(1): >>dgello+ZE5
7. dgello+NE5[view] [source] [discussion] 2024-02-15 22:14:08
>>vwkd+Ca
I think you’re describing the conversation mode (started via the headphones icon); I also have issues using it. But you can also dictate a message: on iOS it’s the little gray wave icon to the right of the text input. With this mode there is no auto-submission.
8. dgello+ZE5[view] [source] [discussion] 2024-02-15 22:14:56
>>killth+Ad1
Thanks, I had no idea!
9. dgello+hF5[view] [source] [discussion] 2024-02-15 22:16:08
>>danpal+Qa
Not when you’re on your computer, but you can do it on your phone when you’re walking in the street or commuting.

You can easily talk while you’re doing something else.

10. dgello+gG5[view] [source] [discussion] 2024-02-15 22:20:56
>>clbrmb+17
The conversational mode is fascinating. But it’s frustrating to use for the same reasons ChatGPT can be annoying: it doesn’t remember previous messages that well, so you end up in weird Alzheimer-ish discussions where the interlocutor speaks perfectly but has the memory of a clownfish.