zlacker

[parent] [thread] 3 comments

I think it is more than a simple TTS engine. At least in the demo, they showed that it can control its speed and can sing when requested. Maybe it's still a separate speech engine, but one more closely connected to the LLM.

replies(3): >>sooheo+V3 >>kromem+Lg >>nabaki+cj

>>leumon+(OP)
TTS with separate channels for style would do it, no?

>>leumon+(OP)
Most impressive was the incredulity in the 'okay' during the counting demo after the nth interruption.

It was quickly apparent that text alone is a poor medium for the variety and scope of signals that these multimodal networks could communicate.

>>leumon+(OP)
Azure Speech TTS is capable of doing this with SSML. I wouldn't be surprised if it's what OpenAI is using on the backend.
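
For anyone who hasn't seen SSML, here is a rough sketch of the kind of markup I mean, covering speaking rate and style only. The helper name `build_ssml` is made up for illustration, and the voice name, style, and rate values are assumptions on my part, so check them against the Azure docs before relying on them. It only builds the SSML string and does not call any speech service.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text, voice="en-US-JennyNeural", style="cheerful", rate="+20%"):
    """Return an SSML document asking for a speaking style and rate.

    Hypothetical helper: the default voice, style, and rate are
    illustrative assumptions, not values taken from the demo.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        'xml:lang="en-US">'
        # Voice selection; the name is escaped and quoted as an attribute.
        f"<voice name={quoteattr(voice)}>"
        # Azure-specific extension element for an expressive style.
        f"<mstts:express-as style={quoteattr(style)}>"
        # Standard SSML prosody element; rate changes the speaking speed.
        f"<prosody rate={quoteattr(rate)}>{escape(text)}</prosody>"
        "</mstts:express-as>"
        "</voice>"
        "</speak>"
    )
```

Something like `build_ssml("One, two, three", rate="-30%")` would request slower speech. The resulting string is what you would hand to a speech synthesizer that accepts SSML input.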
