I think the adage that "a solution needs to be 10x better than existing solutions to make someone switch" applies here.
Saying something performs slightly better than the industry-standard offering (OpenAI) means that OpenAI is going to laugh all the way to the bank. Everyone will just use their APIs over anything else.
I'm excited about the LLM space and I can barely keep up with the model names, much less all the techniques for fine-tuning. A customer is going to have an even worse time.
No one will ever get fired for buying OpenAI (now that IBM is dead in that role, and probably sad that Watson never made a dent).
I do use Mistral for all my personal projects, but I'm not sure that will have the same effect on the industry that open source software had in the past.
My pet theory is that OpenAI is cooking up high-quality user data by empowering GPT with all these toys plus a human in the loop. The purpose is to use this data as a sort of continual evaluation, sifting for weak points and enhancing their fine-tuning datasets.
Every human response can carry positive or negative connotation, and the model can use that as a reward signal. They claim 100M users; at, say, 10K tokens per user per month, that's 1T tokens a month. Over a whole year they generate roughly as much text as the original 13T-token training dataset. And we know that LLMs can benefit a lot from synthetic data when it is filtered/engineered for quality.
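The arithmetic is easy to sanity-check. This is just a sketch of the estimate; the user count and per-user token figures are the assumptions from my comment, not official numbers:

```python
# Back-of-envelope check of the token estimate above.
# Both inputs are assumptions, not official OpenAI figures.
users = 100_000_000              # claimed user count
tokens_per_user_month = 10_000   # assumed average tokens per user per month

monthly_tokens = users * tokens_per_user_month
yearly_tokens = monthly_tokens * 12

print(f"{monthly_tokens:.0e} tokens/month")  # 1e+12, i.e. ~1T
print(f"{yearly_tokens:.1e} tokens/year")    # 1.2e+13, close to the ~13T original corpus
```

So even with a conservative per-user figure, a year of usage lands in the same order of magnitude as the original pretraining corpus.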
So I think OpenAI's moat is the data they generate.