FWIW, Twitter did what you're describing; we had 4 or 5 thousand hosts running the Ruby stack at its peak. Unicorns and Rainbows, oh my. Then it started shrinking until it shrank to nothing. That period was actually the relatively stable period. The crazy period, the one that I wasn't there for, was probably impossible to architect your way out of, because it was simply a crazy amount of growth in a really short amount of time, and there were a number of ways in which unpredictable events could bring the system to its knees. You needed the existing architecture to stay functional for more than a week at a time for a solid 6 months before you could start taking load off the system and putting it onto more scalable software.
Any startup would be making a mistake to architect for Twitter scale. Some startups have "embarrassingly parallel" problems -- Salesforce had one of these, although they had growing pains that customers mostly didn't notice in the 2004 timeframe. Dropbox is another one. If you're lucky enough to be able to horizontally scale forever, then great, throw money at the problem. Twitter, at certain points in its evolution (remember, AWS was not a thing yet), was literally out of room/power in its data centers. That happened twice, with two different providers.