Because honestly, I don't care for 0.2 tps in my use cases, although I've spoken with many people who are fine with numbers like that.
At least, the people I've talked to say that if they have a very high confidence score that the model will succeed, they don't mind the wait.
Essentially, if task failure is 1 in 10, I want to monitor and retry.
If it's 1 in 1000, then I can walk away.
The reality is most people don't have a handle on what this order of magnitude actually is for a given task. So unless you have high confidence in your confidence score, slow is useless.
But sometimes you do...
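
To put rough numbers on the 1 in 10 versus 1 in 1000 point, here's a quick sketch. It assumes every run fails independently at a fixed rate (a simplification), and the batch of 20 unattended tasks is just a number I picked for illustration:

```python
def p_all_succeed(failure_rate: float, n_tasks: int) -> float:
    """Chance that n_tasks independent runs all succeed with no retries."""
    return (1.0 - failure_rate) ** n_tasks


def expected_attempts(failure_rate: float) -> float:
    """Mean number of attempts until the first success (geometric distribution)."""
    return 1.0 / (1.0 - failure_rate)


# Hypothetical batch size: 20 slow tasks left running unattended.
N_TASKS = 20

for rate in (1 / 10, 1 / 1000):
    print(
        f"failure rate {rate:g}: "
        f"P(all {N_TASKS} succeed) = {p_all_succeed(rate, N_TASKS):.3f}, "
        f"expected attempts per task = {expected_attempts(rate):.3f}"
    )
```

Under those assumptions, a 1 in 10 failure rate gives only about a 12% chance that all 20 finish cleanly, so you have to watch and retry. At 1 in 1000 it's about 98%, which is the "walk away" case. None of this helps if the failure rate you plug in is a guess.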