If you're using these models to generate code daily, the costs add up.
Sure, I'll give a really tough problem to o3 (and probably through ChatGPT rather than the API), but on general code tasks there really isn't a meaningful enough difference to justify 4x the cost.
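
To put rough numbers on "the costs add up", here is a back-of-the-envelope sketch. Only the 4x multiplier comes from the comparison above. The base price, request volume, and tokens per request are made-up placeholders, not real pricing for any model, so swap in your own figures.

```python
# Back-of-the-envelope monthly API cost comparison.
# All inputs except the 4x multiplier are hypothetical placeholders.

def monthly_cost(requests_per_day, tokens_per_request,
                 price_per_million_tokens, days=30):
    """Return the estimated monthly spend in dollars."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million_tokens

BASE_PRICE = 2.0        # hypothetical $ per million tokens for the cheaper model
COST_MULTIPLIER = 4     # the "4x the cost" figure from the text
REQUESTS_PER_DAY = 200  # hypothetical daily usage
TOKENS_PER_REQUEST = 3_000  # hypothetical average, input plus output

cheaper = monthly_cost(REQUESTS_PER_DAY, TOKENS_PER_REQUEST, BASE_PRICE)
pricier = monthly_cost(REQUESTS_PER_DAY, TOKENS_PER_REQUEST,
                       BASE_PRICE * COST_MULTIPLIER)

print(f"cheaper model: ${cheaper:.2f}/month")
print(f"pricier model: ${pricier:.2f}/month")
print(f"extra spend:   ${pricier - cheaper:.2f}/month")
```

The absolute numbers depend entirely on the placeholders, but the gap scales linearly with usage: whatever the cheaper model costs you per month, the pricier one costs three more of those on top.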