I think it's fair to say that vertical scaling normally means increasing the RAM and CPU of a single machine with a single address space and switch/bus, while horizontal scaling normally means adding more machines (additional address spaces and switches/buses). Historically the distinction exists because RAM-to-CPU performance (throughput and latency) within a single address space and bus greatly exceeds the performance of any NIC connecting machines with distinct address spaces and buses. That comparison mostly ignores effects like the performance cost of swapping/paging when you don't have enough RAM.
I haven't really seen many systems where horizontal scaling is truly linear, unless the problem is embarrassingly parallel, like serving static content.
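
To put rough numbers on why horizontal scaling is rarely linear, here is a minimal sketch using Amdahl's law. That model is my own choice for illustration and is not the only way to reason about this. The function name `amdahl_speedup` and the parameter `parallel_fraction` are hypothetical, and the fractions tried are made-up inputs, not measurements.

```python
def amdahl_speedup(parallel_fraction: float, machines: int) -> float:
    """Ideal speedup from Amdahl's law.

    parallel_fraction: share of the work that can be spread across machines
                       (1.0 means embarrassingly parallel). Assumed input.
    machines:          number of identical machines doing the work.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if machines < 1:
        raise ValueError("machines must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part does not shrink; only the parallel part is divided up.
    return 1.0 / (serial_fraction + parallel_fraction / machines)


if __name__ == "__main__":
    # Illustrative fractions only; real workloads have to be measured.
    for p in (1.0, 0.95, 0.75):
        row = ", ".join(
            f"{n} machines: {amdahl_speedup(p, n):.2f}x" for n in (1, 2, 4, 8, 16)
        )
        print(f"parallel fraction {p:.2f} -> {row}")
```

Under this model, only a parallel fraction of 1.0 scales linearly. With 95% of the work parallelizable, 16 machines give a speedup of roughly 9x rather than 16x. The model also charges nothing for communication between machines, so real systems that have to coordinate over a NIC will usually do worse than this.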