You'll never see the real economics of switching to open models without running your own hardware. That's the whole point. The price difference is orders of magnitude: a single V100/3090 instance can run llama2-70b inference for ~$0.50/hr.
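To put that in per-token terms, here's a back-of-envelope sketch; the throughput figure is an assumption for illustration, not a benchmark:

```python
# Rough cost per million output tokens for self-hosted inference.
# Assumed numbers: $0.50/hr instance, ~15 tokens/s sustained generation on a
# quantized llama2-70b (both illustrative assumptions, not measurements).
instance_cost_per_hour = 0.50      # USD/hr, from the rate quoted above
tokens_per_second = 15             # assumed sustained generation rate

tokens_per_hour = tokens_per_second * 3600
cost_per_million_tokens = instance_cost_per_hour / tokens_per_hour * 1_000_000
print(f"~${cost_per_million_tokens:.2f} per 1M output tokens")  # ~$9.26
```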
Core speed and memory bandwidth matter a lot for CPU inference. This is on a Ryzen 7950 with DDR5.
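The reason bandwidth dominates: generating each token streams essentially all of the model weights from RAM, so the ceiling is roughly bandwidth divided by model size. A small sketch with assumed figures (the DDR5 bandwidth and 4-bit model size are illustrative, not measured on this box):

```python
# Bandwidth-bound upper limit on CPU token generation:
#   tokens/s  <=  memory_bandwidth / model_size_in_bytes
# Assumed figures below are illustrative, not measurements.
memory_bandwidth_gb_s = 60   # ~sustained dual-channel DDR5, assumed
model_size_gb = 40           # ~llama2-70b at 4-bit quantization, assumed

max_tokens_per_second = memory_bandwidth_gb_s / model_size_gb
print(f"bandwidth-bound ceiling: ~{max_tokens_per_second:.1f} tokens/s")  # ~1.5
```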