It doesn't matter whether OpenAI or AWS or GCP encoded the entire works of Shakespeare in their LLM; they can all write/complete a valid limerick about "There once was a man from Nantucket".
I seriously doubt AWS is going to license OpenAI's technology when they can just copy the functionality, royalty free, and charge users for it. Maybe they will? But I doubt it. To the end user it's just another locally hosted API. Like DNS.
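To illustrate the "just another API" point, here's a minimal sketch of what the end user actually touches, assuming some OpenAI-compatible server is listening locally (the localhost URL and model name are hypothetical). Point the same call at a different base URL and you've effectively swapped providers:

```python
# Minimal sketch: to the caller, an LLM is just an HTTP endpoint.
# Assumes an OpenAI-compatible server (e.g. a locally hosted model)
# is listening at BASE_URL; the URL and model name are placeholders.
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical local host

payload = {
    "model": "any-model",  # whatever the server actually exposes
    "messages": [
        {"role": "user",
         "content": "Finish this limerick: There once was a man from Nantucket"}
    ],
}

req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
print(body["choices"][0]["message"]["content"])
```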
More likely, they're running it as a loss leader and generating publicity by making it as cheap as possible.
_Everything_ we've seen come out of Silicon Valley does this, so why would they suddenly be charging the right price?
I thought there was a somewhat clear consensus that OpenAI is currently running inference at a loss?
Quake and Counter-Strike in the 1990s ran like garbage in software-rendering mode. I remember having to run Counter-Strike on my Pentium 90 at the lowest resolution, and then disable upscaling, to get 15fps, and even then smoke grenades and other effects would drop the framerate into the single digits. It wasn't until almost two years after Quake's release that dedicated 3D video cards (the Voodoo 1 and 2 were accelerators that depended on a separate 2D VGA graphics card to feed them) began to hit the market.
Nowadays you can run those games (and their sequels) at thousands (tens of thousands?) of frames per second on a top-end modern card. I would imagine similar hardware advances will transpire with LLMs. OpenAI is already prototyping its own hardware to train and run LLMs, and I would imagine Nvidia hasn't been sitting on its hands either.
You mean like they already do on Amazon Bedrock?
So far we don't have any open-source models that are close to GPT-4, so we don't know what it takes to run them at similar speeds.
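For what it's worth, "similar speeds" is at least measurable: a rough tokens-per-second check against whatever model you can run locally might look something like this (the localhost URL and model name are hypothetical; the `usage.completion_tokens` field is the usual OpenAI-compatible response shape, but your server may differ):

```python
# Rough sketch: compare "similar speeds" as tokens/second.
# Assumes the same hypothetical OpenAI-compatible local endpoint as above.
import json
import time
import urllib.request

BASE_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical

payload = {
    "model": "any-model",  # placeholder
    "messages": [{"role": "user", "content": "Write a limerick about GPUs."}],
    "max_tokens": 256,
}

start = time.perf_counter()
req = urllib.request.Request(
    BASE_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
elapsed = time.perf_counter() - start

# Most OpenAI-compatible servers report token counts under "usage".
tokens = body.get("usage", {}).get("completion_tokens", 0)
print(f"{tokens} tokens in {elapsed:.2f}s -> {tokens / elapsed:.1f} tok/s")
```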