Phones are never going to run the largest models locally because they just don't have the capacity, but capability at small sizes keeps improving over time. You can run a model on your phone now that matches what would have required hundreds of billions of parameters less than 6 years ago.
I suppose the future will look exactly like now: some mixture of local and non-local.
I guess my argument is that a market dominated by local doesn't seem right, and I think the balance will look similar to what it is right now.