This is in practice not true at all. Vertical scaling is typically a sublinear cost increase (up to a point, but that point is a ridiculous beast of a machine), since you're usually upgrading just the CPU, just the RAM, or just the storage, not all of them at once.
There are instances where you can get nearly 10x the machine for 2x the cost.
(and fwiw, wikipedia agrees with this definition: https://en.wikipedia.org/wiki/Scalability#Horizontal_(scale_... )
> Vertical scaling — a bigger, exponentially more expensive server
> Horizontal scaling — distribute the load over more servers
The fact that AMD chooses to package the "nodes" together on one die vs multiple doesn't change that.
> typically involving the addition of CPUs, memory or storage to a single computer.
The distinction between horizontal and vertical scaling becomes nonsense if we accept your definitions, because literally nobody does that sort of vertical scaling. In practice, vertical scaling means things like:
* Replace the CPU with a faster one, but with the same number of cores. Or simply run the same one at a higher clock rate.
* Add memory, or use faster memory.
* Add storage, or use faster storage.
These are all forms of vertical scaling because they reduce the time it takes to process a single transaction, either by reducing waits or by increasing computation speed.
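To make that concrete, here's a toy sketch (hypothetical numbers, mine, not from the thread): model a transaction as some CPU work plus a storage wait, and note that each vertical upgrade shortens the single-transaction time on its own:

    # Hypothetical workload: each transaction does some compute and some I/O.
    CPU_OPS = 1e9      # operations per transaction
    IO_BYTES = 1e8     # bytes read per transaction

    def txn_seconds(ops_per_sec, bytes_per_sec):
        # time for ONE transaction: compute time + wait time
        return CPU_OPS / ops_per_sec + IO_BYTES / bytes_per_sec

    print(txn_seconds(1e9, 1e8))  # baseline: 1s compute + 1s I/O = 2.0
    print(txn_seconds(2e9, 1e8))  # faster CPU only   -> 1.5
    print(txn_seconds(1e9, 5e8))  # faster storage only -> 1.2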
> It's also contradicted by both the article we're discussing and the wikipedia article
The article agrees with this definition. Transaction latency decreases iff vertical scale increases. Transaction throughput increases with either form of scaling. Without this interpretation, the analogy to conveyor belts makes no sense.
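One way to make the asymmetry explicit (my own toy model, not from either article): with n identical workers each taking t seconds per transaction, latency stays at t no matter what n is, while throughput scales with n:

    def latency(t, n):
        return t          # one transaction still takes t seconds

    def throughput(t, n):
        return n / t      # transactions finished per second

    # Vertical scaling (halve t): both metrics improve.
    assert latency(0.5, 1) < latency(1.0, 1)
    assert throughput(0.5, 1) > throughput(1.0, 1)

    # Horizontal scaling (double n): throughput improves, latency doesn't.
    assert latency(1.0, 2) == latency(1.0, 1)
    assert throughput(1.0, 2) > throughput(1.0, 1)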
Also, it's worth pointing out that increasing the number of processing units often _does_ decrease latency. In Factorio you need 3 advanced circuits for the chemical science pack. If your science lab can produce 1 science pack every 24 seconds but your pipeline takes 16 seconds to produce one advanced circuit, your whole pipeline is going to have a latency of 48 seconds from start to finish due to being bottlenecked by the advanced circuit pipeline. Doubling the number of processing units in each step of the circuit pipeline will double your throughput and bring your latency down to 24 seconds, as it should be. And if you have room for those extra processing units, you can do that without adding more belts.
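A quick sanity check of that arithmetic, treating the stated crafting times as steady-state rates (a simplification; real Factorio timing has more wrinkles):

    CIRCUITS_PER_PACK = 3   # advanced circuits per chemical science pack
    CIRCUIT_SECONDS = 16    # seconds per circuit from one pipeline
    LAB_SECONDS = 24        # seconds per science pack at the lab stage

    def seconds_per_pack(pipelines):
        # the slower stage sets the pace
        circuit_stage = CIRCUITS_PER_PACK * CIRCUIT_SECONDS / pipelines
        return max(LAB_SECONDS, circuit_stage)

    print(seconds_per_pack(1))  # 48.0 -> bottlenecked on circuits
    print(seconds_per_pack(2))  # 24.0 -> circuits now keep up with the lab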
The idea that serial speed is equivalent to latency breaks down when you consider what your computer's hardware is really doing under the hood, too. Your CPU is constantly doing all manner of things in parallel: prefetching data from memory, reordering instructions and running them in parallel, speculatively executing branches, et cetera. None of these things decrease the fundamental latency of reading a single byte from memory with a cold cache, but it doesn't really matter, because at the end of the day we're measuring some application-specific metric like transaction latency.