Code and Let Live

>>usrme+(OP)
I'm really excited about https://sprites.dev/ - it hits two of my favourite problems at once:

1. Developer environment sandboxes. This is a cheap and convenient way to run Claude Code / Codex CLI / etc in YOLO mode in a persistent sandboxed VM with a restricted blast radius if something goes wrong.

2. Sandbox API. Fly now have a product that lets me make a simple JSON API call to run untrusted code in a new sandbox. There's even snapshotting support so I can roll back to a known state after running that code.

I wrote a bunch more about this here: https://simonwillison.net/2026/Jan/9/sprites-dev/

>>memset+yi1
Yeah that's about right.

It's a fast-starting and fast-pausing persistent VM, with a ton of built-in developer tools (including a preconfigured Claude Code) and an extra JSON API for executing commands within it so you can treat it as a sandbox.

You may find my writeup here useful: https://simonwillison.net/2026/Jan/9/sprites-dev/

>>dtkav+jo1
If you read Ben Toews's work on the tokenizer, you'll have a good sense of where I want Sprites to go with key leaks and prompt injection:

https://fly.io/blog/tokenized-tokens/

>>HumanO+zm1
Yes that’s certainly a great feature and they don’t have it currently. For what it’s worth, they do have a teaser about “Persistent disks with some really interesting work coming soon.”

https://blog.exe.dev/meet-exe.dev

>>simonw+QR
I have found container-use to be super useful for this.

https://container-use.com/quickstart

BTW Simon, I was super happy when I heard on Theo's podcast that he will be encouraging you to monetise your work more. I'm super appreciative of your work and I'm pretty convinced that the more you profit from it, the better the universe will be!!!

>>usrme+(OP)
I've been playing around with this for a short while. It is very neat, but a bunch of things are unclear or undocumented (I assume the documentation is coming, so I'm not faulting them for it not being there yet).

Some things that are unclear:

- How should I auth to github? sprite console doesn't use ssh (afaik) so I guess not agent forwarding?

- What on-machine APIs are available? Can I use the Fly OIDC provider[1]? There's a /.sprite/api.sock but curl'ing /v1/tokens/oidc gets a 404.

- How much is it going to cost me? I know there is pricing, but it's hard to figure out what actual usage would be like. Also, I don't see any usage info in the web UI right now.

[1]: https://fly.io/blog/oidc-cloud-roles/
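For anyone who wants to poke at the on-machine socket from code rather than curl, here is a minimal sketch of HTTP over a Unix domain socket using only the Python standard library. The socket path and the /v1/tokens/oidc path are the ones quoted above; nothing here is documented Sprites API, and the `UnixHTTPConnection` and `probe` names are made up for illustration.

```python
import http.client
import socket


class UnixHTTPConnection(http.client.HTTPConnection):
    """HTTPConnection that dials a Unix domain socket instead of TCP."""

    def __init__(self, socket_path, timeout=5.0):
        # "localhost" only fills the Host header; no TCP connection is made.
        super().__init__("localhost", timeout=timeout)
        self.socket_path = socket_path

    def connect(self):
        sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        sock.settimeout(self.timeout)
        sock.connect(self.socket_path)
        self.sock = sock


def probe(socket_path, paths):
    """Return {path: HTTP status} for a GET on each path over the socket."""
    results = {}
    for path in paths:
        conn = UnixHTTPConnection(socket_path)
        try:
            conn.request("GET", path)
            response = conn.getresponse()
            response.read()  # drain the body before closing
            results[path] = response.status
        finally:
            conn.close()
    return results
```

Calling `probe("/.sprite/api.sock", ["/v1/tokens/oidc"])` inside a sprite should reproduce the 404 reported above; passing a longer list of candidate paths is a quick way to see which ones answer at all.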

>>simonw+QR
I know you know this, as you posted it, but readers might want to look at this related thread:

Fly's Sprites.dev addresses dev environment sandboxes and API sandboxes together - >>46561089 - Jan 2026 (10 comments)

>>usrme+(OP)
I've been having so much fun working on sprites (and working with sprites) the last several months. There are some neat parts of the Elixir side of this we're going to open source soon.

Also check out the 5 min demo we put out where I walk thru some sprite basics: https://www.youtube.com/watch?v=7BfTLlwO4hw

>>usrme+(OP)
I really want to love this, but my experience in the first 20 seconds is unfortunately like some of my other experiences coding against Fly APIs: they're broken.

https://sprites.dev/api has this command:

$ curl -X POST "https://api.sprites.dev/v1/sprites" \
    -H "Authorization: Bearer $SPRITES_TOKEN" \
    -d '{"name": "my-sprite"}'

which responds with

{"error":"name is required"}

if you use the request body in the full "Create Sprite" documentation at https://sprites.dev/api/sprites#create then it does work.

can I live with some rough edges for some personal workflows that only impact me when things break? sure. however, I was thinking about playing with some CI/CD stuff using sprites that would impact our whole team if things broke and I'm really on the fence because of this experience in the first 20 seconds.

Fly team - please put some black box probes or just better testing on the example you give in the quick start. if you document it, test it.
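A black box probe for that quick-start call could be as small as the sketch below. It builds the same POST quoted above and reports failure if the API rejects it. Two things are assumptions rather than anything confirmed in this thread: the explicit Content-Type header (a guess at what a JSON API expects, not a diagnosed cause of the error), and every function name, which is hypothetical.

```python
import json
import urllib.error
import urllib.request

# Endpoint taken from the quick-start command quoted above.
API_URL = "https://api.sprites.dev/v1/sprites"


def build_request(token, name):
    """Build the quick-start POST with an explicit JSON content type (assumption)."""
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request):
    """Send the request over the network and return (status, decoded JSON body)."""
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, json.loads(response.read() or b"{}")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a body worth reporting.
        return err.code, json.loads(err.read() or b"{}")


def probe_quickstart(token, name, transport=send):
    """Return (ok, detail); ok is False when the documented call is rejected."""
    status, payload = transport(build_request(token, name))
    if status >= 400 or (isinstance(payload, dict) and "error" in payload):
        return False, f"HTTP {status}: {payload}"
    return True, f"HTTP {status}"
```

The `transport` parameter exists so the check can be exercised without touching the network; run with the default, it makes a real API call and would presumably create a sprite, so a scheduled probe would also need cleanup.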

>>yoavsh+F94
Check out LXC and the wider Incus set of projects: https://linuxcontainers.org/incus/.

Running IncusOS on some local hardware with ZFS underneath is a phenomenally powerful sandbox.

>>losved+AA4
You can deploy from Xcode to your iPhone, and it seems to behave like any other app when you do that. I do have a paid Apple developer account, and I think I read that if you don't then you have to re-sign the app every 7 days. If you wanted a small number of users then I don't think this would work. I think you could use TestFlight, which is Apple's method for distributing an unreleased version of an app, but I'm not sure what the review process would look like for that. Android would be much easier as long as you can still sideload APKs: you could just build the APK and send it to everyone to install. I read that there were some changes to sideloading APKs but I don't know the details.

In terms of actually making the app, I don't know Swift or iOS at all so it's all generated. Usual caveats, and I'm only running them on my own phone. I ask Claude (not code) to help me with the spec, I give it some bullet points and it asks a bunch of clarifying questions then gives me a spec. I put that in a new directory, fire up Claude and use the ralph-loop plugin (https://github.com/anthropics/claude-code/tree/main/plugins/...):

> /ralph-loop:ralph-loop "Implement the iOS app described in app-spec.md. You have access to xcode CLI tools. You should write tests and use them to verify your work. The task will be complete when the app is fully implemented, with all tests passing. Output <promise>COMPLETE</promise> when finished." --max-iterations 50 --completion-promise "COMPLETE"
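The control flow that command asks for can be sketched roughly as follows: feed the same prompt to the agent repeatedly until its output contains the promise tag or the iteration cap is reached. This is only an illustration of the idea, not the plugin's implementation; `run_until_promise` and the `run_agent` callable are hypothetical names.

```python
PROMISE_TEMPLATE = "<promise>{}</promise>"


def run_until_promise(run_agent, prompt, completion_promise, max_iterations):
    """Re-run the agent on the same prompt until it emits the promise tag.

    run_agent is any callable taking the prompt and returning the agent's
    text output. Returns the 1-based iteration that completed, or None if
    the iteration cap was reached first.
    """
    marker = PROMISE_TEMPLATE.format(completion_promise)
    for iteration in range(1, max_iterations + 1):
        output = run_agent(prompt)
        if marker in output:
            return iteration
    return None
```

With the values from the command above, that would be `run_until_promise(agent, prompt, "COMPLETE", 50)`, where `agent` is whatever wrapper invokes the coding tool. The cap matters because nothing else stops a run that never produces the tag.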

Once it's done you can open the app in Xcode, test it in a simulator, play with it and iterate a bit and then send it to your phone!

Editing to add because I can't edit the original post: I think the limiting factor here might be the concurrent sprites limit. It seems like if you're on pay-as-you-go then you can only have 3 running concurrently, and have to subscribe to get 10.

>>dangoo+KH4
Alright, nerd-snipe snooping research post happening now!

Seems like they are using JuiceFS under the hood, with an overlay root for your CoW semantics. JuiceFS gives them instant clone (because they're not cloning the whole rootfs), while the changes to the overlay are done as an overlayfs and probably synced back to S3 via a custom block device they have mounted into Firecracker.

You can also see they are using JuiceFS directly for the "policy" mount (which I'm assuming is the network policy functionality). iirc JuiceFS has support for block devices too, so maybe they are using that to back the rootfs overlay.

One concerning thing is the `/var/lib/docker` mount - i ran this in an ubuntu container, did they... attach it? Maybe that's a coincidence, but docker is not installed on the sprite by default. (the terminal is also super busted when used through an ubuntu container)

https://pastebin.com/raw/kt6q9fuA (edit: moved terminal output to pastebin because it was so ugly here)

I played with a similar stack recently, my guess is they are:

1. making some base VM, snapshotting it

2. when you create a VM, they just restore a copy and push metadata to it (probably via one of the mounts)

3. any changes that you make to the rootfs are stored on the JuiceFS block device (the overlay), which is relatively minimal compared to the base OS.

JuiceFS also supports snapshotting, so that's probably how they support memory + filesystem snapshot and restore so quickly.
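To make the guess above concrete, here is a toy in-memory model of a shared read-only base with a per-VM copy-on-write upper layer. It is not how Fly, JuiceFS, or overlayfs are implemented; the `Overlay` class and its methods are invented purely to show why clone, checkpoint, and restore only need to touch the small upper layer.

```python
class Overlay:
    """Toy copy-on-write view: a shared read-only base plus a private upper layer."""

    _DELETED = object()  # whiteout marker for paths removed in the upper layer

    def __init__(self, base, upper=None):
        self._base = base                # shared between VMs, never mutated
        self._upper = dict(upper or {})  # this VM's changes only

    def read(self, path):
        # The upper layer wins; fall through to the base when it has no entry.
        if path in self._upper:
            value = self._upper[path]
            if value is self._DELETED:
                raise FileNotFoundError(path)
            return value
        if path in self._base:
            return self._base[path]
        raise FileNotFoundError(path)

    def write(self, path, data):
        self._upper[path] = data

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the path is not visible
        self._upper[path] = self._DELETED

    def checkpoint(self):
        """Snapshot only the upper layer; the base is left untouched."""
        return dict(self._upper)

    def restore(self, snapshot):
        self._upper = dict(snapshot)

    def clone(self):
        """New view over the same base; cost scales with the upper layer only."""
        return Overlay(self._base, self._upper)
```

Creating a new VM in this model is `Overlay(base)`, which copies nothing from the base, and a rollback is `restore(snapshot)` on a dictionary that only holds the changes.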

interestingly, seems they provision maybe a max disk size of 100GB for total checkpoints?

```
NAME  TYPE SIZE FSTYPE MOUNTPOINTS
loop0 loop 100G        /.sprite/checkpoints/active
```

fuse is definitely being used within the VMM, i can see a fuse mount and id being assigned. They're probably using juicefs directly for the policy mount because that doesn't need to be local nvme-cached, just consistent. The local-nvme -> s3 write-through runs on the hypervisor through a custom block device they attach to the firecracker vmm. This might just be the --cache-dir + --writeback cache option in juicefs. Wild guess is just 1 file per block.
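The "local cache in front of object storage, one object per block" guess can be sketched as a toy write-through block device. Everything here is illustrative: plain dictionaries stand in for the NVMe cache and the S3-style store, and the block size, key format, and class name are assumptions rather than anything observed on a sprite.

```python
BLOCK_SIZE = 4096  # illustrative only; the real block size is unknown


class WriteThroughBlockDevice:
    """Toy block device: local cache in front of an object store, one object per block."""

    def __init__(self, object_store, block_size=BLOCK_SIZE):
        self._store = object_store  # dict standing in for S3-style storage
        self._cache = {}            # dict standing in for the local NVMe cache
        self._block_size = block_size

    @staticmethod
    def _key(block_no):
        # Hypothetical key layout: one object per block number.
        return f"blocks/{block_no:08d}"

    def write_block(self, block_no, data):
        if len(data) != self._block_size:
            raise ValueError("data must be exactly one block long")
        # Write-through: update the cache and the backing store together.
        self._cache[block_no] = data
        self._store[self._key(block_no)] = data

    def read_block(self, block_no):
        if block_no in self._cache:
            return self._cache[block_no]
        # Cache miss: fetch from the store; never-written blocks read as zeros.
        data = self._store.get(self._key(block_no), bytes(self._block_size))
        self._cache[block_no] = data
        return data

    def drop_cache(self):
        """Simulate losing the local cache, e.g. a VM resuming on another host."""
        self._cache.clear()
```

Because every write reaches the store before returning, dropping the cache loses nothing, which is the property that would let a paused VM resume elsewhere. A write-back variant would acknowledge writes from the cache first and flush to the store later.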

guessing the "s3" here is Tigris, since fly.io seems to have a relationship with them, and that probably keeps latency down for the filesystem

>>usrme+(OP)
Does anyone know of similar solutions that can be self-hosted? (without a 12 service stack like Daytona [1])

[1] https://www.daytona.io/docs/en/oss-deployment/

>>therea+AX3
Believe it or not, that's the only example that's not autogenerated from tests (yet).

https://github.com/superfly/sprites-js/tree/main/examples
https://github.com/superfly/sprites-go/tree/main/examples
https://github.com/superfly/sprites-py/tree/main/examples
https://github.com/superfly/sprites-ex/tree/main/examples

>>varyhe+VV4
Have you tried https://orbstack.dev/?

>>dmux+9N
Not sure if I've read such an article, but it would be a reasonable next step from the globally addressable processes of the BEAM VM.

As I understand it, Unison tries to do something like that, but I might be wrong.

https://www.unison-lang.org/

>>yoavsh+F94
If you are on a Mac, you can use Coderunner[1]. It will run locally on your machine and execute any AI-generated code in an Apple container.

1. Coderunner - https://github.com/instavm/coderunner

>>usrme+(OP)
In the documentation, you install the npm package via Anthropic: https://sprites.dev/api (on this page, if you select Node)

npm install @anthropic-ai/sprites

Is there some relationship between Anthropic and Fly.io that I didn't hear about?

>>mwcamp+k96
I like Partners In Health, myself. https://www.pih.org/

>>skrebb+2d4
Sorry, yes it is Theo Browne:

https://www.youtube.com/@t3dotgg/videos

It was in one of his videos from last week.

>>djmash+c86
I noticed the same thing; I suspect it's hallucinated. It seems like the correct package is @fly/sprites: https://www.npmjs.com/package/@fly/sprites

>>usrme+(OP)
Also check out https://shellbox.dev/ which has pure ssh access

>>CGames+ZC3
Sorry for the delayed response. I ended up posting on this [0] thread where they (Formal) are doing something similar.

Here's the repo [1]. I modified it a bit to post publicly and remove the details of my setup within my tailnet/flycast network.

[0] >>46605155

[1] https://github.com/dtkav/agent-creds
