I don't. I assume it would need to be running constantly to know when it wants to speak, and there would be multiple actors on screen at any given time. Do we have actual estimates of how much a single response costs in ChatGPT? All I know is that it takes a lot of video cards to power that system.
> If you have some optimized LLMs running on the client
Do these currently exist? I was under the impression that the tech to date is compute-intensive if you're looking for near-real-time interaction.