Why do you think cloud providers can undercut OpenAI? From what I know, Llama 70B is more expensive to run than GPT-3.5 unless you can sustain a 70%+ utilization rate on your GPUs, which is hard to do.
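To make the utilization argument concrete, here's a rough back-of-envelope sketch. Every number in it (GPU hourly price, GPUs per replica, peak throughput, API price) is an illustrative assumption I'm plugging in, not a measured figure; the point is only that self-hosted cost per token scales inversely with utilization, since you pay for idle GPU time either way:

```python
# Back-of-envelope: self-hosted Llama 70B vs. a GPT-3.5-class API.
# All numbers below are illustrative assumptions, not measurements.

GPU_HOURLY_COST = 2.0        # assumed $/hr per A100-class GPU (cloud rental)
GPUS_PER_REPLICA = 2         # assumed GPUs needed to serve a 70B model
PEAK_TOKENS_PER_SEC = 800    # assumed aggregate throughput at full load

API_PRICE_PER_1K_TOKENS = 0.002  # approximate GPT-3.5-turbo-era list price

def self_hosted_cost_per_1k_tokens(utilization: float) -> float:
    """Cost per 1K generated tokens at a given average utilization (0-1].

    Idle GPU time is still billed, so effective cost per token
    rises as utilization drops.
    """
    hourly_cost = GPU_HOURLY_COST * GPUS_PER_REPLICA
    tokens_per_hour = PEAK_TOKENS_PER_SEC * 3600 * utilization
    return hourly_cost / tokens_per_hour * 1000

for u in (0.3, 0.5, 0.7, 0.9):
    cost = self_hosted_cost_per_1k_tokens(u)
    verdict = "cheaper" if cost < API_PRICE_PER_1K_TOKENS else "more expensive"
    print(f"utilization {u:.0%}: ${cost:.4f}/1K tokens ({verdict} than the API)")
```

With these particular assumed numbers, self-hosting only breaks even somewhere between 50% and 70% utilization, which is why the utilization threshold matters so much: below it, the API wins on price.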
So far we don't have any open-source models that come close to GPT-4, so we don't know what it takes to serve one at similar speeds.