zlacker

1. astran+(OP)[view] [source] 2022-05-23 22:38:54
I believe we’re lacking someone training up a large music model here, but GPT-style transformers can produce music.

gwern can maybe comment here.

An actually scary thing is that AIs are getting okay at reproducing people’s voices.

replies(1): >>gwern+Di
2. gwern+Di[view] [source] 2022-05-24 01:12:14
>>astran+(OP)
Voice synthesis has been going steady. Lots of commercial and hobbyist interest: you can use 15.ai for crackerjack SaaS voice synthesis in a slick free UI; and if you want to run the models yourselves, Tortoise just released a FLOSS stack of remarkable quality.

Music, I'm afraid, appears stuck in the doldrums of small one-offs doing stuff like MIDI. Nothing like the breadth & quality of Jukebox has come out since it, even though it's super-obvious that there is a big overhang there and applying diffusion & other new methods would give you something much like DALL-E 2 / Imagen for general music.

replies(1): >>thorum+cx
3. thorum+cx[view] [source] [discussion] 2022-05-24 03:51:48
>>gwern+Di
The developer behind Tortoise is experimenting with using diffusion for music generation:

https://nonint.com/2022/05/04/friends-dont-let-friends-train...
