For example, if I'm running 5 VMs, there is a good chance that many of the pages are identical. Not only do I want those pages to be deduplicated, but I want them to be zero-copy (i.e., not deduplicated after-the-fact by some daemon).
To do that, the guest block cache needs to be integrated with the host block-cache, so that whenever some guest application tries to map data from disk, the host notices that another virtual machine has already caused this data to be loaded, so we can just map the same page of already loaded data into the VM that is asking.
This just works on Hyper-V Linux guests btw. For all the crap MS gets they do some things very right.
Okay.
An OS isn't large. Your Spotify/Slack/browser instance is of comparable size. That says more about browser-based apps, but still.
Although I’ll note that the line between a VMM and a hypervisor is not always clear. E.g., KVM includes some things that other hypervisors delegate to the VMM (such as instruction completion). And macOS’s Hypervisor.framework is almost a pass-through to the CPU’s raw capabilities.
This is why paying for dedicated memory is often more expensive than its counterpart: dedicated memory isn't considered part of the pool.
Is it good to think of libvirt as a virtual machine monitor, or is that more "virtual machine management"?
Is there any article that tells the difference and relationship between KVM, QEMU, libvirt, virt-manager, Xen, Proxmox etc. with their typical use cases?
...firecracker does fine what it was designed to - short running fast start workloads.
(oh, and the article starts by slightly misusing a bunch of technical terms, firecracker's not technically a hypervisor per se)
Use case: a Proxmox web interface exposed on your local network on a KVM Linux box that uses QEMU to manage VMs. Proxmox will let you do that from the web. QEMU is great for a single machine or a small fleet but should be automated for any heavy lifting; Proxmox will do that.
Firecracker has a balloon device you can inflate (i.e., have the balloon driver grab as much memory inside the VM as possible, letting the host reclaim those pages) and later deflate to hand the memory back to the guest. You can do this while the VM is running.
https://github.com/firecracker-microvm/firecracker/blob/main...
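For the curious, here's a rough sketch of poking that balloon over the API socket from Python (the socket path and sizes are placeholders, and it assumes the balloon device was added with PUT /balloon before boot):

```python
import http.client, json, socket

class FirecrackerAPI(http.client.HTTPConnection):
    """Plain HTTP over Firecracker's API Unix socket."""
    def __init__(self, socket_path):
        super().__init__("localhost")
        self.socket_path = socket_path
    def connect(self):
        self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.sock.connect(self.socket_path)

def set_balloon(socket_path, amount_mib):
    # PATCH /balloon resizes the balloon while the VM is running:
    # inflating (bigger amount_mib) pulls memory out of the guest so the
    # host can reclaim it; deflating (smaller amount_mib) gives it back.
    conn = FirecrackerAPI(socket_path)
    conn.request("PATCH", "/balloon",
                 json.dumps({"amount_mib": amount_mib}),
                 {"Content-Type": "application/json"})
    resp = conn.getresponse()
    return resp.status, resp.read()

set_balloon("/tmp/firecracker.sock", 512)  # inflate: reclaim ~512 MiB
set_balloon("/tmp/firecracker.sock", 0)    # deflate: return it to the guest
```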
By 'zero copy', I mean that when a guest tries to read a page, if another guest has that page in RAM, then no copy operation is done to get it into the memory space of the 2nd guest.
The host block cache will end up deduplicating it automatically because all the 'copies' lead back to the same block on disk.
I understand that they need to sell their product, but jeez. Don't leave us hanging like that.
Better to not make copies in the first place.
And remember that as well as RAM savings, you also get 'instant loading' because there is no need to do slow SSD accesses to load hundreds of megabytes of a chromium binary to get slack running...
EDIT: Okay, https://www.geekwire.com/2018/firecracker-amazon-web-service... says my "pretty sure" memory is in fact correct.
In really simple terms, so simple that I'm not 100% sure they are correct:
* KVM is a hypervisor, or rather it lets you turn Linux into a hypervisor [1], which will let you run VMs on your machine. I've heard KVM is rather hard to work with (steep learning curve). (Xen is also a hypervisor.)
* QEMU is a wrapper-of-a-sorts (a "machine emulator and virtualizer" [2]) which can be used on top of KVM (or Xen). "When used as a virtualizer, QEMU achieves near native performance by executing the guest code directly on the host CPU. QEMU supports virtualization when executing under the Xen hypervisor or using the KVM kernel module in Linux." [2]
* libvirt "is a toolkit to manage virtualization platforms" [3] and is used, e.g., by VDSM to communicate with QEMU.
* virt-manager is "a desktop user interface for managing virtual machines through libvirt" [4]. The screenshots on the project page should give an idea of what its typical use-case is - think VirtualBox and similar solutions.
* Proxmox is the above toolstack (-ish) but as one product.
---
[1] https://www.redhat.com/en/topics/virtualization/what-is-KVM
[2] https://www.qemu.org/
[3] https://libvirt.org/
[4] https://virt-manager.org/
Raises a few questions for me:
Can you use KVM/do KVM stuff without QEMU?
Can you do libvirt stuff without QEMU?
Hoping the answers to both aren't useless/"technically, but why would you want to?"
Qemu is a user space system emulator. It can emulate in software different architectures like ARM, x86, etc. It can also emulate drivers, networking, disks, etc. It is invoked via the command line.
The reason you'll see Qemu/KVM a lot is because Qemu is the emulator, the thing actually running the VM. And it utilizes KVM (on Linux; macOS has HVF, for example) to accelerate the VM when the host architecture matches the VM's.
Libvirt is an XML based API on top of Qemu (and others). It allows you to define networks, VMs (it calls them domains), and much more with a unified XML schema through libvirtd.
Virsh is a CLI tool to manage libvirtd. Virt-manager is a GUI to do the same.
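To give a feel for that layering, here's a rough sketch using the libvirt Python bindings (the domain name is a placeholder; it assumes the domain was already defined, e.g. with virt-manager or `virsh define`):

```python
import libvirt  # the libvirt-python bindings, talking to libvirtd

# Connect to the local system QEMU/KVM driver, the same endpoint
# `virsh -c qemu:///system` uses.
conn = libvirt.open("qemu:///system")

# List everything libvirt knows about, running or not.
for dom in conn.listAllDomains():
    print(dom.name(), "running" if dom.isActive() else "shut off")

# Start a previously defined domain; "testvm" is a placeholder name.
dom = conn.lookupByName("testvm")
if not dom.isActive():
    dom.create()  # equivalent to `virsh start testvm`

conn.close()
```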
Proxmox is Debian under the hood with Qemu/KVM running the VMs. It provides a robust web UI and easy clustering capabilities, along with nice-to-haves like easy management of disks, Ceph, etc. You can also manage it through an API with Terraform.
Xen is an alternative hypervisor (like ESXi). Instead of running on top of Linux, Xen has its own microkernel. This means less flexibility (there's no full Linux underneath running things), but it's also simpler to manage and has less attack surface. I haven't played much with Xen though; KVM is kind of the de facto standard. But IIRC AWS used to use a modified Xen before KVM came along and ate Xen's lunch.
On the first path you are likely going to be just fine with VirtualBox, VMWare Workstation or Hyper-V (Windows only) / Parallels (Mac intended). Which one you should pick depends on your desired use of the machines.
On the second path you would go with a solution that deals with the nitty-gritty details, such as Proxmox, oVirt, Hyper-V, ESXi, or any of the other many available options - granted you are not going full cloud-based, which opens up a whole lot of different options too.
You would generally never need to worry about which components are needed where and why. I've had to worry about it once or twice before, because I've had to debug why an oVirt solution was not behaving like I wanted it to behave. Knowing the inner workings helps in that case.
I didn't know it existed until they posted, but QEMU has a Firecracker-inspired target:
> microvm is a machine type inspired by Firecracker and constructed after its machine model.
> It’s a minimalist machine type without PCI nor ACPI support, designed for short-lived guests. microvm also establishes a baseline for benchmarking and optimizing both QEMU and guest operating systems, since it is optimized for both boot time and footprint.
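I haven't tried it myself, but invoking it presumably looks something like this rough, untested sketch (kernel image, rootfs path, and sizes are placeholders):

```python
import subprocess

# Boot a guest with QEMU's Firecracker-inspired "microvm" machine type:
# no PCI, no ACPI, virtio devices attached over MMIO.
subprocess.run([
    "qemu-system-x86_64",
    "-M", "microvm",
    "-enable-kvm",
    "-m", "512",
    "-nodefaults", "-no-user-config", "-nographic",
    "-kernel", "vmlinux",                        # uncompressed guest kernel
    "-append", "console=ttyS0 root=/dev/vda",
    "-serial", "stdio",
    "-drive", "id=root,file=rootfs.img,format=raw,if=none",
    "-device", "virtio-blk-device,drive=root",   # MMIO virtio-blk
], check=True)
```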
QEMU is a low level process that represents the virtual machine. It has no equivalent in Xen. Using QEMU directly is not a good idea unless your needs for VM configurations change all the time and you hardly reuse VMs.
Libvirt is at a higher level than QEMU. It manages the QEMU processes and gives them access to system resources (image files, network interfaces, pass-through PCI devices). It also makes it easy to manage the configuration of your virtual machines and the resources they use.
Higher still is virt-manager, which is a GUI interface for libvirt. Proxmox sits at roughly the same level as virt-manager.
Yes, there are a few things out there like Firecracker that use KVM without using QEMU. I'm not aware of all of them, but they do exist.
> Can you do libvirt stuff without QEMU?
Yes, it can also manage LXC containers and a few other types like Xen, bhyve, and Virtuozzo, as well as QEMU without KVM. The without-KVM part is important because it lets you run VMs that emulate architectures other than the native one.
For a good bit of this it is "why would you want to?", but there are definitely real cases where you'd want to be able to do this. The LXC or Virtuozzo support means you can run lighter-weight containers (essentially the same underlying tech as Docker) through the same orchestration/management that you use for virtual machines. And the bhyve support lets you do the same thing on top of FreeBSD (though I've never used it this way), so a heterogeneous mix of hosts can be managed through the same interfaces.
It makes sure that if your VM and/or QEMU are broken out of, there are extra layers to prevent getting access to the whole physical machine. For example it runs QEMU as a very limited user and, if you're using SELinux, the QEMU process can hardly read any file other than the vm image file.
By contrast, the method in the Arch wiki runs QEMU as root. QEMU is exposed to all sorts of untrusted input, so you really don't want it running as root.
Libvirt also handles cross machine operations such as live migration, and makes it easier to query a bunch of things from QEMU.
For more info see https://www.redhat.com/en/blog/all-you-need-know-about-kvm-u...
See (OpenVZ) "Containers share dynamic libraries, which greatly saves memory." It's just 1 Linux kernel when you are running OpenVZ containers.
https://docs.openvz.org/openvz_users_guide.webhelp/_openvz_c...
See (KVM/KSM): "KSM enables the kernel to examine two or more already running programs and compare their memory. If any memory regions or pages are identical, KSM reduces multiple identical memory pages to a single page. This page is then marked copy on write."
https://access.redhat.com/documentation/en-us/red_hat_enterp...
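If you want to check what KSM is actually saving on a host, the counters live in sysfs; here's a rough sketch (assumes KSM is enabled via /sys/kernel/mm/ksm/run and 4 KiB pages):

```python
from pathlib import Path

ksm = Path("/sys/kernel/mm/ksm")
shared  = int((ksm / "pages_shared").read_text())   # unique pages KSM keeps
sharing = int((ksm / "pages_sharing").read_text())  # mappings pointing at them
page_kib = 4                                        # assumed page size

# Every mapping beyond the first copy of a shared page is memory saved.
saved_mib = (sharing - shared) * page_kib / 1024
print(f"KSM: {shared} shared pages, ~{saved_mib:.1f} MiB saved")
```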
In KVM's defense, it supports a much wider range of OSes; OpenVZ only really does different versions of Linux, while KVM can run OpenBSD/FreeBSD/NetBSD/Windows and even OS/2 in addition to Linux.
Other than making sure we release unused memory to the host, we didn't customize QEMU that much. Although we do have a cool layered storage solution - basically a faster alternative to QCOW2 that's also VMM-independent. It's called overlaybd, and it was created at Alibaba. That will probably be another blog post. https://github.com/containerd/overlaybd
For those unfamiliar, the informal distinction between type-1 and type-2 is that type-1 hypervisors are in direct control of the allocation of all resources of the physical computer, while type-2 hypervisors operate as some combination of being “part of” / “running on” a host operating system, which owns and allocates the resources. KVM (for example) gives privileged directions to the Linux kernel and its virtualization kernel module for how to manage VMs, and the kernel then schedules and allocates the appropriate system resources. Yes, the type-2 hypervisor needs kernel-mode primitives for managing VMs, and the kernel runs right on the hardware, but those primitives aren’t making management decisions for the division of hardware resources and time between VMs. The type-2 hypervisor is making those decisions, and the hypervisor is scheduled by the OS like any other user-mode process.
It’s like with “isomorphic” code. That just sounds much cooler than “js that runs on the client and the server”.
> is KVM a hypervisor? is it type 1 or type 2? is QEMU a hypervisor, is it type 1 or type 2? if QEMU is using KVM, is QEMU then not a hypervisor in that use case?
Yes, KVM (Kernel-Based Virtual Machine) is indeed a hypervisor. It's a type 1 hypervisor, also known as a "bare metal" hypervisor. This is because KVM directly runs on the host's hardware to control the hardware and to manage guest operating systems. The fact that it's a Linux kernel module that allows the Linux kernel to function as a hypervisor makes it very efficient.
QEMU (Quick Emulator) is a bit more complex. By itself, it is technically a type 2 or "hosted" hypervisor, meaning it runs within a conventional operating system environment. QEMU is a generic, open-source machine emulator and virtualizer that can emulate a variety of hardware types and host a range of guest operating systems.
However, when QEMU is used with KVM, the picture changes somewhat. In this case, KVM provides the hardware virtualization: it exposes the CPU's virtualization extensions so guest code can run directly on the host CPU. QEMU then emulates the hardware devices and provides the user interface for the VM, which gives you both good performance and usability. It's this combination of KVM's hardware acceleration and QEMU's emulation capabilities that makes them so often used together.
In this case, QEMU is not acting purely as a hypervisor; it's providing hardware emulation and user interface for the VMs, while KVM is the part providing the hypervisor functionality. However, we often refer to the combination of "QEMU/KVM" as a unit when talking about this mode of operation.
And virt-manager indeed manages Libvirt machines so it's not at the level of QEMU as you wrote in the parent comment:
> Proxmox is a virtual machine manager (like QEMU, virt-manager)
If you actually played with Xen you'd know it's not actually easier to manage. And the increased-security claims are dubious at best, as the very thing that would be attacked (the dom0 that manages everything and runs Linux) has direct, unfettered access to the Xen microkernel. There is a reason many sites migrated away from Xen to KVM. Also, many Xen drivers de facto run as part of the Linux dom0 instance, so you don't even get that isolation.
We ran Xen for a few years, as KVM at first was still not as refined and Xen was first to mature, and it was just a million little annoying things.
KVM offers far simpler and more straightforward management. A VM is just a process. You can look at its CPU usage via normal tools. No magic. No driver problems.
"API to virtualization system" would probably be closest approximation but it also does some more advanced stuff like coordinating cross-host VM migration
e.g. your guest kernel is loading an application into memory, by reading some parts of an ELF file from disk. Presumably each VM has its own unique disk, so the hypervisor can't know that this is "the same" page of data as another VM has without actually reading it into memory first and calculating a hash or something.
If the VMs share a disk image (e.g. the image is copy-on-write), then I could see it being feasible - e.g. with KVM, even if your VMs are instantiated by distinct userspace processes, they would probably share the pages as they mmap the same disk image. You would still need your virtualised disk device to support copy-on-write, which may or may not be possible depending on your use case.
But your copy-on-write disk images will probably quickly diverge in a way that makes most pages not shareable, unless you use some sort of filesystem optimised for that.
Lastly, since you mentioned Chromium or Slack in another comment - I'm sure you'll find nearly all of the loading time there is not spent loading the executable from disk, but actually executing it (and all its startup/initialisation code). So this probably won't be the speedup you're imagining. It would just save memory.
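To make the page-cache-sharing point above concrete, here's a toy sketch (the image filename is a placeholder): two processes mapping the same read-only base image are served by the same physical pages, so the second mapping costs neither a disk read nor extra RAM.

```python
import mmap, os

fd = os.open("base-image.img", os.O_RDONLY)   # shared, read-only base image
size = os.fstat(fd).st_size
view = mmap.mmap(fd, size, prot=mmap.PROT_READ)

if os.fork() == 0:
    # Child: if these pages are already resident in the host page cache,
    # no second copy is read from disk or stored in RAM.
    _ = view[:4096]
    os._exit(0)

_ = view[:4096]   # Parent faults the pages in (or finds them cached).
os.wait()
```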
There are projects doing that, although QEMU is the de facto standard and the best bet if you don't need to boot your machines in 20 ms.
> Can you do libvirt stuff without QEMU?
Libvirt has many backends, so yes. IIRC it can even manage VirtualBox, although I'm not sure why anyone would want to.
> Hoping the answers to both aren't useless/"technically, but why would you want to?"
...why? Is there a problem kvm+qemu+libvirt doesn't solve for you?
Zero-copy is harder, as a system upgrade on one of them will trash it, but KSM is overall pretty effective at saving some memory on similar VMs.
HN is here for the technical details ;)
That depends on the workload and the maximum memory allocated to the guest OS.
A lot of workloads rely on the OS cache/buffers to manage IO so unless RAM is quite restricted you can call in to release that pretty easily prior to having the balloon driver do its thing. In fact I'd not be surprised to be told the balloon process does this automatically itself.
If the workload does its own IO management and memory allocation (something like SQL Server, which will eat what RAM it can and does its own IO caching), or the VM's memory allocation is too small for OS caching to be a significant use after the rest of the workload (you might pare memory down to the bare minimum like this for a “fairly static content” server that doesn't see much variation in memory needs and can be allowed to swap a little if things grow temporarily), then I'd believe it is more difficult. That is hardly the use case for Firecracker though, so if that is the sort of workload being run, perhaps reassessing the tool used for the job was the right call.
Having said that my use of VMs is generally such that I can give them a good static amount of RAM for their needs and don't need to worry about dynamic allocation, so I'm far from a subject expert here.
And isn't Firecracker more geared towards short-lived VMs, quick to spin up, do a job, spin down immediately (or after only a short idle timeout if the VM might answer another request that comes in immediately or is already queued)? So you are better off cycling VMs, which is probably happening anyway, than messing around with memory balloons. Again, I'm not talking from a position of personal experience here, so corrections/details welcome!
So startup time could be better than halved. Seems worth it.
btrfs on the host has support for deduplication of identical blocks in the disk images. It's true that a CPU-costly scan would be needed to identify newly shared blocks if, for example, two VMs are both updated to the latest distro release.
Just because you can doesn't mean you should.
What actually can change is the amount of work that the kernel-mode hypervisor leaves to a less privileged (user space) component.
For more detail see https://www.spinics.net/lists/kvm/msg150882.html
> The distinction between these two types is not always clear. For instance, KVM and bhyve are kernel modules[6] that effectively convert the host operating system to a type-1 hypervisor.[7] At the same time, since Linux distributions and FreeBSD are still general-purpose operating systems, with applications competing with each other for VM resources, KVM and bhyve can also be categorized as type-2 hypervisors.[8]
[1]: https://www.usenix.org/system/files/nsdi20-paper-agache.pdf
AWS switched to KVM, and even a lot of AWS systems that report themselves as Xen are running as KVM with a compat shim to say it's Xen.
We reclaim memory with a memory balloon device; for disk trimming we discard (and compress) the disk; and for I/O speed we use io_uring (which we only use for scratch disks, the project disks are network disks).
It's a tradeoff. It's more work and does require custom implementations. For us that made sense, because in return we get a lightweight VMM that we can more easily extend with functionality like memory snapshotting and live VM cloning [1][2].
[1]: https://codesandbox.io/blog/how-we-clone-a-running-vm-in-2-s...
[2]: https://codesandbox.io/blog/cloning-microvms-using-userfault...
It was never popularly used in a way accurate to the origin of the classification - the original paper by Popek and Goldberg talked about formal proofs for the two types, and they really have very little to do with how the terms began being used in the '90s and '00s. Things have changed a lot with computers since the '70s, when the paper was written and the terminology was coined.
So, language evolves, and Type-1 and Type-2 came to mean something else in common usage. And this might have made sense to differentiate something like esx from vmware workstation in their capabilities, but it's lost that utility in trying to differentiate Xen from KVM for the overwhelming majority of use cases.
Why would I say it's useless in trying to differentiate, say, Xen and KVM? Couple of reasons:
1) There's no performance benefit to type-1 - a lot of performance sits on the device emulation side, and both are going to default to qemu there. Other parts are based heavily on CPU extensions, and Xen and KVM have equal access there. Both can pass through hardware, support sr-iov, etc., as well.
2) There's no overhead benefit in Xen - you still need a dom0 VM, which is going to arguably be even more overhead than a stripped down KVM setup. There's been work on dom0less Xen, but it's frankly in a rough state and the related drawbacks make it challenging to use in a production environment.
Neither term provides any real advantage or benefit in reasoning between modern hypervisors.
Fly.io themselves admitted they’re oversubscribed, and AWS has been doing the same for years now.
You're going to need dom0 (a "control domain") on any Xen host. Gotta have something running xl and the rest of the toolstack for managing it. dom0less technically exists but the drawbacks mean it's not really usable by most people in a production situation.
Light is cool, but for many tasks that level of spartan is overkill.
If I’m investing time in going light, it might as well be Wasm tech.
Maybe it's because of the time I grew up in, but in my mind the prototypical Type-I hypervisor is VMWare ESX Server; and the prototypical Type-II hypervisor is VMWare Workstation.
It should be noted that VMWare Workstation always required a kernel module (either on Windows or Linux) to run; so the core "hypervisor-y" bit runs in kernel mode either way. So what's the difference?
The key difference between those two, to me is: Is the thing at the bottom designed exclusively to run VMs, such that every other factor gives way? Or does the thing at the bottom have to "play nice" with random other processes?
The scheduler for ESX Server is written explicitly to schedule VMs. The scheduler for Workstation is the Windows scheduler. Under ESX, your VMs are the star of the show; under Workstation, your VMs are competing with the random updater from the printer driver.
Xen is like ESX Sever: VMs are the star of the show. KVM is like Workstation: VMs are "just" processes, and are competing with whatever random bash script was created at startup.
KVM gets loads of benefits from being in Linux; like, it had hypervisor swap from day one, and as soon as anyone implements something new (like say, NUMA balancing) for Linux, KVM gets it "for free". But it's not really for free, because the cost is that KVM has to make accommodations to all the other use cases out there.
> There's no performance benefit to type-1 - a lot of performance sits on the device emulation side, and both are going to default to qemu there.
Er, both KVM and Xen try to switch to paravirtualized interfaces as fast as possible, to minimize the emulation that QEMU has to do.
And then they learn all hotels are doing exactly the same thing. One hotel doing it is a risk; all hotels doing it is an industry standard.
Airlines, hotels, restaurants, doctors and so on oversubscribe all the time. Whoever complains is free to move on, and add to their further disappointments.
My point is that these are largely appropriated terms - neither would fit the definitions of type 1 or type 2 from the early days when Popek and Goldberg were writing about them.
> Or does the thing at the bottom have to "play nice" with random other processes?
From this perspective, Xen doesn't count. You can have all sorts of issues from the dom0 side and competing with resources - you mention PV drivers later, and you can 100% run into issues with VMs because of how dom0 schedules blkback and netback when competing with other processes.
ESXi can also run plenty of unmodified linux binaries - go back in time 15 years and it's basically a fully featured OS. There's a lot running on it, too. Meanwhile, you can build a linux kernel with plenty of things switched off and a root filesystem with just the bare essentials for managing kvm and qemu that is even less useful for general purpose computing than esxi.
>Er, both KVM and Xen try to switch to paravirtualized interfaces as fast as possible, to minimize the emulation that QEMU has to do.
There are more things being emulated than there are PV drivers for, but this is a bit outside of my point.
For KVM, the vast majority of implementations are using qemu for managing their VirtIO devices as well - https://developer.ibm.com/articles/l-virtio/ - you'll notice that IBM even discusses these paravirtual drivers directly in context of "emulating" the device. Perhaps a better way to get the intent across here would be saying qemu handles the device model.
From a performance perspective, ideally you'd want to avoid PV here too and go with sr-iov devices or passthrough.
As a Linux user, why would you want to use VirtualBox or VMWare Workstation? They are not so well integrated with the system, and, frankly, VirtualBox is more of a toy VM player... just go for virt-manager. It gives a conceptually similar interface to VirtualBox, but better integration with the rest of the system. Especially, when it comes to stuff like sending different key combinations.
I honestly cannot think of a single benefit to using VirtualBox (and I'm less familiar with VMWare player) compared to virt-manager. My guess is that it's more often used because it's also a common choice on MS Windows, so, you get more hits if you are going to search the Web for questions associated to VMs / you'd get tutorials for how to set up a VM that use VirtualBox. But, if you apply yourself to learning how either one of these works, you'd see no reason to choose it.
Chances are you are using systems that do this and you haven't even noticed.
Malloc will happily “return” the 15 TiB you asked for (rough sketch after these examples).
If 10000 people called 911 at the same time, only a tiny fraction would get through (and even fewer would get help).
Evacuating a large city by road would result in giant traffic jams.
There are 5-8x as many parking spots as there are cars (and we still can’t find a goddamn spot).
And of course… the great toilet paper shortage of 2020.
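Here's that malloc/overcommit point as a rough, Linux-only sketch (it assumes overcommit is permitted, e.g. vm.overcommit_memory=1; under the default heuristic you'd need a smaller size):

```python
import mmap

# Ask the kernel for a huge anonymous mapping. Nothing is backed by
# physical RAM until a page is actually touched, so the "allocation"
# succeeds even though no machine has 15 TiB to give.
size = 15 * 1024**4  # 15 TiB of virtual address space
buf = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE,
                prot=mmap.PROT_READ | mmap.PROT_WRITE)

buf[0] = 1  # only now does the first 4 KiB page get faulted in
print("mapped", size, "bytes; resident set is still just a few pages")
```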
If you want data colocated on the same filesystem, then put it on the same filesystem. VMs suck, nobody spins up a whole fabricated IBM-compatible PC and gaslights their executable because they want to.[1] They do it because their OS (a) doesn't have containers, (b) doesn't provide strong enough isolation between containers, or (c) the host kernel can't run their workload. (Different ISA, different syscalls, different executable format, etc.)
Anyone who has ever tried to run heavyweight VMs atop a snapshotting volume already knows the idea of "shared blocks" is a fantasy; as soon as you do one large update inside the guest the delta between your volume clones and the base snapshot grows immensely. That's why Docker et al. has a concept of layers and you describe your desired state as a series of idempotent instructions applied to those layers. That's possible because Docker operates semantically on a filesystem; much harder to do at the level of a block device.
Is a block containing b"hello, world" part of a program's text section, or part of a user's document? You don't know, because the guest is asking you for an LBA, not a path, not modes, not an ACL, etc. If you don't know that, the host kernel has no idea how the page should be mapped into memory. Furthermore, storing the information to dedup common blocks is non-trivial: go look at the manpage for ZFS deduplication and it is littered with warnings about the performance, memory, and storage implications of dealing with the dedup table.
Straight from their site. QEMU is the user space interface, KVM the kernel space driver. It’s enough to run whatever OS. That’s the point.
For libvirt: https://libvirt.org/drivers.html
They support a bunch as well.
A fairly recent Windows 11 Pro image is ~26GB unpacked and 141k dirents. After finishing OOBE it's already running like >100 processes, >1000 threads, and >100k handles. My Chrome install is ~600MB and 115 dirents. (Not including UserData.) It runs ~1 process per tab. Comparable in scope and complexity? That's debatable, but I tend to agree that modern browsers are pretty similar in scope to what an OS should be. (The other day my "web browser" flashed the firmware on the microcontroller for my keyboard.)
They're not even close to "being comparable in size," although I guess that says more about Windows.
I think the complaints are perfectly valid.
for many mostly "general purpose" use cases it's quite viable, or else ~fly.io~ AWS Fargate wouldn't be able to use it
this doesn't mean it's easy to implement the necessary automated tooling etc.
so depending on your dev resources and priorities it might be a bad choice
still, I feel the article was quite a bit subtly judgmental, while moving some parts quite relevant to the content of the article into a footnote and also omitting that this "supposedly unusable tool" is used successfully by various other companies...
like it was written by an engineer being overly defensive about their decision, due to having to defend it for the 100th time because shareholders, customers, and higher-level management just wouldn't shut up about "but that uses Firecracker"
so while Firecracker was designed for things running just a few seconds, there are many places running it with jobs running way longer than that
the problem is, if you want to make it work with long-running general-purpose images you don't control, you have to put a ton of work into making it work nicely at all levels of your infrastructure and code ... which is costly ... which a startup competing on an online dev environment (compared to e.g. a VM hosting service) probably shouldn't waste time on
So AFAIK the decision in the article makes sense, but the reasons listed for the decision are oversimplified to the point that you could say they aren't quite right. Idk why; could be anything from the engineer believing that, to them avoiding issues with some shareholder/project lead who is obsessed with "we need to do Firecracker because the competition does so too".
I will never understand the whole virtual machine and cloud craze. Your operating system is better than any hypervisor at sharing resources efficiently.
mainly it's optimized to run code only briefly (init time max 10s, max usage 15 min, and default max request time 130s AFAIK)
also it's focused on thin serverless functions, e.g. deserialize some request, run some thin, simple business logic and then delegate to other lambdas based on it. These kinds of functions often have similar memory usage per call, and if a call is an outlier it can just discard the VM instance soon after (i.e. at most after starting up a new instance, i.e. at most 10s later)
Automatic scaling is great. Cloud parallelization (a.k.a fork) is absolutely wild once you get it rolling. Code deployments are incredibly simple. Never having to worry about physical machines or variable traffic loads is worth the small overhead they charge me for the wrapper. The generic system wide permissions model is an absolute joy once you get over the learning curve.
I would be scared to let unknown persons use QEMU that bind mounts volumes as that is a huge security risk. Firecracker, I think, was designed from the start to run un-sanitized workloads, hence, no bind mounting.
Most dangerous 12-word sentence.
> i/o speed we use io_uring
custom io_uring based driver for the VM block devices? or what do you mean here?
Maybe they are async footnotes and there is a race condition. /s
It's absolutely usable in practice, it just makes oversubscription more challenging.
E.g. using the firecracker jailer: https://github.com/firecracker-microvm/firecracker/blob/main...
And if you're running untrusted code, then using a virtualized environment is the easiest (I'd even say best) way to go about it.
https://learn.microsoft.com/en-us/windows/security/applicati...
The fact of the matter is that it's just inefficient, slow and expensive.
Bare metal is simple, fast, and keeps you in control.
Interesting. I guess we are reading a different website.
> custom io_uring based driver for the VM block devices? or what do you mean here?
We're using the async io backend that's shipped with Firecracker for our scratch disks.
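If I remember the drive API right, that's selected per drive when the VM is configured; a hedged sketch of what the request body might look like (drive id and path are placeholders, and the exact field names should be checked against the Firecracker docs):

```python
import json

# PUT /drives/{drive_id} before boot; "io_engine": "Async" selects the
# io_uring-backed block backend instead of the default synchronous one.
drive_config = {
    "drive_id": "scratch",
    "path_on_host": "/var/lib/vm/scratch.img",
    "is_root_device": False,
    "is_read_only": False,
    "io_engine": "Async",
}
print(json.dumps(drive_config, indent=2))
```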
cloud VMs have low capex and high opex
which one is more expensive is a function of many variables
Otherwise it's three times more expensive.
Here's a post of someone using KVM from Python (raw, without needing a kvm library or anything): https://www.devever.net/~hl/kvm
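In that spirit, here's a minimal sketch of poking /dev/kvm with nothing but the standard library (the ioctl numbers are the ones from <linux/kvm.h>; an actual runnable guest needs memory regions, a vCPU, and a run loop on top of this):

```python
import fcntl, os

KVM_GET_API_VERSION = 0xAE00  # _IO(0xAE, 0x00)
KVM_CREATE_VM       = 0xAE01  # _IO(0xAE, 0x01)

kvm = os.open("/dev/kvm", os.O_RDWR)
print("KVM API version:", fcntl.ioctl(kvm, KVM_GET_API_VERSION, 0))  # expect 12

vm_fd = fcntl.ioctl(kvm, KVM_CREATE_VM, 0)  # new fd representing the VM
print("created VM fd:", vm_fd)

os.close(vm_fd)
os.close(kvm)
```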
The entire vibe of this thread is
1) everyone is doing it
2) efficiency drives cost down (to the vendor) but those savings are not passed to the consumer
3) "hardware you pay for when you need it", all predicated on the at IF and it doesn't happen.
Oversubscription should always be opt-in, otherwise it is an underhanded scam.