I'm seeing an increasing trend of pushback against this norm. An early example was David Crawshaw's one-process programming notes [1]. Running the database in the same process as the application server, using SQLite, is getting more popular with the rise of Litestream [2]. Earlier this year, I found the post "One machine can go pretty far if you build things properly" [3] quite refreshing.
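To make the "one process" idea concrete, here is a minimal sketch of an application server and its database sharing a single OS process, using Python's stdlib sqlite3 and http.server. The port, table name, and file path are placeholders of mine, and replication (the part Litestream would add, from outside the process) is left out entirely:

    # Minimal sketch: HTTP server and database in one OS process.
    # SQLite runs as an in-process library; there is no separate DB server.
    # (Litestream, if used, would replicate app.db from outside this process.)
    import sqlite3
    from http.server import BaseHTTPRequestHandler, HTTPServer

    db = sqlite3.connect("app.db", check_same_thread=False)
    db.execute("CREATE TABLE IF NOT EXISTS visits (ts TEXT DEFAULT CURRENT_TIMESTAMP)")

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            # Each request is an in-process library call, not a network hop
            # to a database running elsewhere.
            db.execute("INSERT INTO visits DEFAULT VALUES")
            db.commit()
            (count,) = db.execute("SELECT count(*) FROM visits").fetchone()
            body = f"visit #{count}\n".encode()
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        HTTPServer(("", 8000), Handler).serve_forever()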
Most of us can ignore FAANG-scale problems and keep right on using POSIX on a handful of machines.
But his architecture does seem to be consistent with a "minutes of downtime" model. He's using AWS, and has his database on a separate EBS volume with a sane backup strategy. So he's not manually fixing servers, and has reasonable migration routes for most disaster scenarios.
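The post doesn't spell out the exact backup mechanics, but a "sane backup strategy" for a database kept on its own EBS volume can be as small as a scheduled snapshot. Here's a hedged sketch using boto3; the volume ID, region, and tags are placeholders of mine, not details from [3]:

    # Hypothetical sketch: periodic EBS snapshot of the database volume,
    # run from cron or any scheduler. Volume ID and region are placeholders.
    import boto3

    def snapshot_db_volume(volume_id: str = "vol-0123456789abcdef0") -> str:
        ec2 = boto3.client("ec2", region_name="us-east-1")
        # Snapshots of a live volume are crash-consistent; for a cleaner copy
        # you'd quiesce writes first or lean on the database's own recovery.
        snap = ec2.create_snapshot(
            VolumeId=volume_id,
            Description="nightly database volume backup",
            TagSpecifications=[{
                "ResourceType": "snapshot",
                "Tags": [{"Key": "purpose", "Value": "db-backup"}],
            }],
        )
        return snap["SnapshotId"]

    if __name__ == "__main__":
        print("created snapshot:", snapshot_db_volume())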
Except for PEBKAC (problem exists between keyboard and chair), which is what really kills most servers. And HA setups are more vulnerable to that, since they're more complicated.
I posit that this kind of "deal killer" is most often a wish-list item, not a true need. Teams without a working product tend to treat these theoretical reliability issues as "deal killers" as a form of premature optimization.
I worked at a FANG on a product where we thought the availability problems of having each session "owned" by a single server were a deal killer -- i.e., any one machine could crash at any time and people would notice. We spent the better part of a year designing and implementing a fancy, fully distributed system where sessions could migrate seamlessly, and so on.
Then, before we finished, a PM orchestrated the purchase of a startup that had a launched product with similar functionality. Its design held per-user session state on a single server and was thus much simpler -- almost laughably simple compared to what we were attempting. The kind of design you'd write on a napkin over a burrito lunch as minimally viable, then quickly code up -- just what you'd do in a startup.
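For illustration only, here is roughly what that napkin design amounts to: hash the user ID to pick the one server that owns the session, and route everything for that user there. The host names and hashing choice are hypothetical, not taken from the actual product:

    # Hypothetical napkin design: each user's session lives on exactly one
    # server, chosen by hashing the user ID. If that box dies, its users
    # notice until they're reassigned; nothing migrates seamlessly.
    import hashlib

    SERVERS = ["app-1.internal", "app-2.internal", "app-3.internal"]  # placeholders

    def owner_for(user_id: str, servers: list[str] = SERVERS) -> str:
        """Deterministically pick the single server owning this user's session."""
        digest = hashlib.sha256(user_id.encode()).digest()
        return servers[int.from_bytes(digest[:8], "big") % len(servers)]

    print(owner_for("alice"))  # always routes "alice" to the same host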
After the acquisition there were big arguments between our team and the startup's about which core technology the FANG should go forward with. We'd point at math and theory about availability and failure rates; they'd point at happy users and a working product. It ended with a VP pointing at the startup's launched product and saying "we're going with what is working now." Within months the product was live on the FANG's production infrastructure, and it has run almost unchanged architecturally for over a decade. Is the system theoretically less reliable than our fancier would-be system? Yes. Does anybody actually notice or care? No.
That's not at all to say it's a deal breaker for everyone, but it certainly will be for some companies.