zlacker

[return to "We replaced Firecracker with QEMU"]
1. london+f4 2023-07-10 14:33:11
>>hugodu+(OP)
I really want VMs to integrate 'smarter' with the host.

For example, if I'm running 5 VMs, there is a good chance that many of their pages are identical. Not only do I want those pages to be deduplicated, but I want the sharing to be zero-copy (i.e. not deduplicated after the fact by some daemon).

To do that, the guest block cache needs to be integrated with the host block cache, so that whenever a guest application tries to map data from disk, the host notices that another virtual machine has already caused this data to be loaded and can just map the same page of already-loaded data into the VM that is asking.
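
A minimal sketch of that idea, assuming both guests read from the same read-only backing image. `HostPageCache`, `Guest`, `map_block`, and `base.img` are hypothetical names invented for illustration, not any hypervisor's API, and a Python object stands in for a physical page:

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096


@dataclass
class HostPageCache:
    """Toy host cache keyed by (backing file, block number)."""
    pages: dict = field(default_factory=dict)
    disk_reads: int = 0

    def get_page(self, backing_file: str, block: int) -> bytearray:
        key = (backing_file, block)
        if key not in self.pages:
            # First request for this block: pretend to read it from disk.
            self.disk_reads += 1
            self.pages[key] = bytearray(PAGE_SIZE)
        # Later requests from any guest get the same page object back.
        return self.pages[key]


@dataclass
class Guest:
    name: str
    cache: HostPageCache
    mappings: dict = field(default_factory=dict)

    def map_block(self, backing_file: str, block: int) -> bytearray:
        # "Mapping" here just records a reference to the host's page.
        page = self.cache.get_page(backing_file, block)
        self.mappings[(backing_file, block)] = page
        return page


host = HostPageCache()
vm_a = Guest("vm-a", host)
vm_b = Guest("vm-b", host)

page_a = vm_a.map_block("base.img", 42)
page_b = vm_b.map_block("base.img", 42)

print(page_a is page_b)   # both guests reference one page
print(host.disk_reads)    # the block was loaded once
```

The sketch only covers the read-only case: the moment either guest writes to the page, the host would need copy-on-write to keep the guests isolated, and the sharing only happens at all because both guests name the same backing file and block.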

2. hamand+zh 2023-07-10 15:30:16
>>london+f4
If you already know so much about your application(s), are you sure you need virtualization?
3. drbawb+lq1 2023-07-10 20:11:05
>>hamand+zh
The second I read "shared block cache" my brain went to containers.

If you want data colocated on the same filesystem, then put it on the same filesystem. VMs suck; nobody spins up a whole fabricated IBM-compatible PC and gaslights their executable because they want to.[1] They do it because (a) their OS doesn't have containers, (b) it doesn't provide strong enough isolation between containers, or (c) the host kernel can't run their workload (different ISA, different syscalls, different executable format, etc.).

Anyone who has ever tried to run heavyweight VMs atop a snapshotting volume already knows the idea of "shared blocks" is a fantasy; as soon as you do one large update inside the guest, the delta between your volume clones and the base snapshot grows immensely. That's why Docker et al. have a concept of layers, and you describe your desired state as a series of idempotent instructions applied to those layers. That's possible because Docker operates semantically on a filesystem; it's much harder to do at the level of a block device.
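
To make the clone-divergence point concrete, here is a toy copy-on-write model. It is a sketch under stated assumptions, not how any particular volume manager works: `Pool`, `shared`, and `same_content` are made-up names, the block counts are arbitrary, and the pool tracks sharing by block identity only (no content deduplication).

```python
class Pool:
    """Toy copy-on-write pool: a volume maps logical blocks to physical ids."""

    def __init__(self):
        self.blocks = {}   # physical id -> payload
        self.next_id = 0

    def _alloc(self, payload):
        pid = self.next_id
        self.next_id += 1
        self.blocks[pid] = payload
        return pid

    def create(self, payloads):
        return {lba: self._alloc(p) for lba, p in enumerate(payloads)}

    def clone(self, volume):
        # A clone copies the mapping only; it allocates no new blocks.
        return dict(volume)

    def write(self, volume, lba, payload):
        # Every write lands in a freshly allocated physical block.
        volume[lba] = self._alloc(payload)


def shared(vol_a, vol_b):
    """Count logical blocks backed by the same physical block."""
    return sum(1 for lba in vol_a if vol_b.get(lba) == vol_a[lba])


def same_content(pool, vol_a, vol_b):
    """Count logical blocks whose bytes match, shared or not."""
    return sum(
        1 for lba in vol_a
        if lba in vol_b and pool.blocks[vol_a[lba]] == pool.blocks[vol_b[lba]]
    )


pool = Pool()
base = pool.create([b"base-%d" % i for i in range(1000)])
vm_a = pool.clone(base)
vm_b = pool.clone(base)

# Both guests apply the same "large update" to their first 600 blocks.
for lba in range(600):
    pool.write(vm_a, lba, b"update-%d" % lba)
    pool.write(vm_b, lba, b"update-%d" % lba)

print(shared(base, vm_a))                # 400
print(shared(vm_a, vm_b))                # 400
print(same_content(pool, vm_a, vm_b))    # 1000
print(len(pool.blocks))                  # 2200
```

In this model the two clones end up byte-for-byte identical, yet they share only the 400 blocks neither of them touched, because the block layer sees writes rather than files or packages. A layer-based tool can express "both images applied the same update" once; the snapshot layer here cannot.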

Is a block containing b"hello, world" part of a program's text section, or part of a user's document? You don't know, because the guest is asking you for an LBA, not a path, not modes, not an ACL, etc. If you don't know that, the host kernel has no idea how the page should be mapped into memory. Furthermore, storing the information needed to dedup common blocks is non-trivial: go look at the manpage for ZFS's deduplication and you'll find it littered with warnings about the performance, memory, and storage implications of dealing with the dedup table.
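
A back-of-the-envelope sketch of why the dedup table gets expensive. The 320 bytes per entry used below is a commonly quoted rule of thumb for an in-memory ZFS dedup table entry and is treated here as an assumption, not a measured value; the pool size and block sizes are likewise arbitrary, and the estimate is a worst case in which every block is unique.

```python
KIB = 1024
GIB = 1024 ** 3
TIB = 1024 ** 4


def dedup_table_bytes(pool_bytes, block_size, entry_bytes):
    """Worst-case table size: one entry per block, every block unique."""
    unique_blocks = pool_bytes // block_size
    return unique_blocks * entry_bytes


ENTRY_BYTES = 320  # assumed rule-of-thumb size per table entry

# Page-sized blocks, as page-level sharing between guests would need.
page_granular = dedup_table_bytes(1 * TIB, 4 * KIB, ENTRY_BYTES)

# Much larger blocks, for comparison.
large_blocks = dedup_table_bytes(1 * TIB, 128 * KIB, ENTRY_BYTES)

print(page_granular / GIB)   # 80.0
print(large_blocks / GIB)    # 2.5
```

Under these assumptions, tracking 1 TiB at 4 KiB granularity needs on the order of 80 GiB of table, versus about 2.5 GiB at 128 KiB. The finer the granularity, the better the match with memory pages and the worse the bookkeeping cost.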

[1]: https://www.youtube.com/watch?v=coFIEH3vXPw
