Microsandbox: Virtual Machines that feel and perform like containers

I'd like to see a formal container security grade that works like:

  1) Curate a list of all known (container) exploits
  2) Run each exploit in environments of increasing security, such as permissions-based, jail, Docker, and emulator
  3) The percentage of prevented exploits would be the score from 0-100%

Under this scheme, I'd expect naive attempts at containerization with permissions and jails to score around 0%, while Docker might be above 50% and Microsandbox could potentially reach 100%.
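
As a rough sketch of how the grade could be computed (everything here is hypothetical: the `Exploit` type, the `make_exploit` helper, and the sample data are made-up stand-ins, not real exploits or measured results):

```python
from typing import Callable, Dict, List, Set

# Hypothetical model: an exploit is a callable that takes an environment
# name and returns True if the escape succeeded in that environment.
Exploit = Callable[[str], bool]


def security_score(exploits: List[Exploit], environment: str) -> float:
    """Return the percentage (0-100) of exploits the environment prevented."""
    if not exploits:
        raise ValueError("need at least one exploit to compute a score")
    prevented = sum(1 for exploit in exploits if not exploit(environment))
    return 100.0 * prevented / len(exploits)


def make_exploit(vulnerable: Set[str]) -> Exploit:
    """Build a fake exploit that succeeds only in the listed environments."""
    return lambda env: env in vulnerable


# Made-up data for illustration only; a real list would be curated from
# known container escapes and actually run against each environment.
EXPLOITS: List[Exploit] = [
    make_exploit({"permissions", "jail"}),
    make_exploit({"permissions", "jail", "docker"}),
    make_exploit({"permissions"}),
    make_exploit({"permissions", "jail", "docker"}),
]


def grade_all(environments: List[str]) -> Dict[str, float]:
    """Score every environment against the same exploit list."""
    return {env: security_score(EXPLOITS, env) for env in environments}


if __name__ == "__main__":
    for env, score in grade_all(["permissions", "jail", "docker", "emulator"]).items():
        print(f"{env}: {score:.0f}%")
```

The point of keeping the exploit list fixed across environments is that the scores stay comparable; the list itself would need versioning, since the grade changes whenever a new exploit is added.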

This might satisfy some of our intuition around questions like "why not just use a jail?". Also, the containers could run on a site on the open web as honeypots, with cash or crypto prizes for pwning them, to "prove" which containers achieve 100%.

We might also need to redefine what "secure" means, since exploits like Rowhammer and Spectre may make nearly all conventional and cloud computing insecure. Or maybe it's a moving target, like how 64-bit encryption might have once been considered secure but now we need 128-bit or higher.

Edit: the motivation behind this would be to find a container that's 100% secure without emulation, for performance and cost-savings benefits, as well as gaining insights into how to secure operating systems by containerizing their various services.

>>zackmo+EL
You cannot build a secure container runtime (against malicious containers) because underlying it is the Linux kernel.

The only way to make Linux containers a meaningful sandbox is to drastically restrict the syscall API surface available to the sandboxee, which quickly reduces its value. It's no longer a "generic platform that you can throw any workload onto" but instead a bespoke thing that needs to be tuned and reconfigured for every use case.
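
To make that concrete, here is a minimal sketch of generating a deny-by-default seccomp profile in the JSON format Docker accepts via `--security-opt seccomp=<file>`. The `ALLOWED` list and the `build_seccomp_profile` helper are assumptions for illustration; a list this short is almost certainly too small to start a real container, which is exactly the tuning problem:

```python
import json
from typing import Dict, Iterable, List

# Illustrative allowlist only (an assumption, not a vetted profile).
# Real workloads need far more syscalls, and the set differs per workload.
ALLOWED: List[str] = ["read", "write", "close", "exit", "exit_group", "brk", "mmap"]


def build_seccomp_profile(allowed_syscalls: Iterable[str]) -> Dict:
    """Build a profile that rejects every syscall not explicitly allowed."""
    return {
        # Anything not matched by a rule below fails with an errno.
        "defaultAction": "SCMP_ACT_ERRNO",
        "syscalls": [
            {
                "names": sorted(set(allowed_syscalls)),
                "action": "SCMP_ACT_ALLOW",
            }
        ],
    }


if __name__ == "__main__":
    # Write the output to a file and pass it to
    # `docker run --security-opt seccomp=<file>`.
    print(json.dumps(build_seccomp_profile(ALLOWED), indent=2))
```

Every workload that needs one more syscall means another edit to the allowlist, and each addition re-exposes a bit more of the kernel.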

This is why you need virtualization. Until we have a properly hardened and memory safe OS, it's the only way. And if we do build such an OS it's unclear to me whether it will be faster than running MicroVMs on a Linux host.

>>bjackm+BR
> ... drastically restrict the syscall API surface available to the sandboxee, which quickly reduces its value ...

Depends, I guess, as Android has had quite a bit of success with seccomp-bpf and an Android-specific flavour of SELinux [0].

> Until we have a properly hardened and memory safe OS ... faster than running MicroVMs on a Linux host.

Andy Tanenbaum might say Micro Kernels would do just as well.

[0] https://youtu.be/WxbOq8IGEiE

>>ignora+NV
> Android

Exactly. Android pulls this off by being extremely constrained. It's dramatically less flexible than an OCI runtime. If you wanna run a random unenlightened workload on it you're probably gonna have a hard time.

> Micro Kernels would do just as well.

Yeah, this goes in the right direction. In the end, a lot of the kernel work I look at is basically about trying to retrofit the benefits of microkernels onto Linux.

Saying "we should just use an actual microkernel" is a bit like "Russia and Ukraine should just make peace" IMO though.
