zlacker

[parent] [thread] 3 comments
1. leumon+(OP)[view] [source] 2024-05-21 00:31:35
I think it is more than a simple TTS engine. At least from the demo they showed, it can control the speed and it can sing when requested. Maybe it's still a separate speech engine, but more closely connected to the LLM.
replies(3): >>sooheo+V3 >>kromem+Lg >>nabaki+cj
2. sooheo+V3[view] [source] 2024-05-21 00:54:35
>>leumon+(OP)
TTS with separate channels for style would do it, no?
3. kromem+Lg[view] [source] 2024-05-21 02:56:42
>>leumon+(OP)
Most impressive was the incredulity in the 'okay' during the counting demo after the nth interruption.

It was quickly apparent that text alone is a poor medium for the variety and scope of signals that these multimodal networks could communicate.

4. nabaki+cj[view] [source] 2024-05-21 03:19:24
>>leumon+(OP)
Azure Speech TTS is capable of doing this with SSML. I wouldn't be surprised if it's what OpenAI is using on the backend.
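
For anyone who hasn't seen SSML, here is a rough sketch of the kind of markup involved. It uses only the standard W3C SSML `<prosody>` element with its `rate` and `pitch` attributes. The helper name `build_ssml` and its defaults are made up for illustration, and nothing here says this is what OpenAI or Azure actually does internally.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, rate="medium", pitch="medium", lang="en-US"):
    """Hypothetical helper: wrap text in an SSML <prosody> element.

    rate and pitch take SSML values such as "slow", "fast", "low", "high".
    """
    speak = ET.Element(
        "speak",
        {"version": "1.0", "xmlns": SSML_NS, "xml:lang": lang},
    )
    prosody = ET.SubElement(speak, "prosody", {"rate": rate, "pitch": pitch})
    prosody.text = text  # ElementTree escapes &, <, > on serialization
    return ET.tostring(speak, encoding="unicode")


# Ask for slower speech; the resulting string is what a TTS service
# that accepts SSML would receive instead of plain text.
print(build_ssml("One, two, three", rate="slow"))
```

An LLM would only need to emit the text plus a couple of style values for a separate speech engine to vary the delivery this way. Prosody settings cover speed and pitch; this sketch does not attempt singing.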