I wouldn’t go as far as your last statement. While shocking, it’s not inconceivable that there’s native token I/O for audio. In fact, tokenizing audio directly seems more efficient, since the tokenization could be local.
Nevertheless, this is still incredibly embarrassing for OpenAI, and it totally hurts the company’s aspiration to be good for humanity.