I operate a SaaS product that uses generative AI. Many users have been raising concerns about response speed. Interestingly, responses seem to be slower during the day and faster at night. Once again, this is fantastic information!
For API switching based on speed: do you mean specifically between identical or nearly identical models like gpt-4 and gpt-4-0613, or do you mean across non-identical models too?