Sandboxed CLI Agents and Missing Permissions

LLM coding CLIs are supposed to feel like a teammate in your terminal, but the default sandboxing can turn them into very polite spectators instead.

I've started experimenting with the Codex CLI. The default environment has no network access. That sounds fine on paper: it's safer, and it keeps the tool from exfiltrating secrets. But it also means the agent can't search for documentation, hit package registries, or run anything that needs to pull artifacts from the internet. When I ask it to run npm install, or npx next build for a project that pulls fonts at build time, the command just fails because the network isn't there.

The same thing happens with Docker. I want to let the agent build and run containers, but it doesn't have permissions outside its own workspace. It can edit Dockerfiles all day, but when it comes time to run docker build or docker compose up, the sandbox boundary shows up again: no socket, no daemon, no containers. I'm still not entirely sure how I got past this in practice. I ended up with Docker working without using the danger-full-access sandbox profile, and not knowing which change did it makes it even more important that the tool explain why something failed.

That's the part that grates: Codex CLI should know whether a command was blocked by sandbox policy, by missing permissions, or by something mundane like a typo. It should be able to say so up front instead of leaving you to infer whether you're fighting config, security, or your own mistakes.
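Until the tool reports this itself, the triage happens by hand. Here's a rough sketch of what that guesswork looks like when written down. Everything in it is my own assumption: the category labels, the function names, and the list of stderr fragments are hypothetical, and none of it reflects how Codex CLI works internally. It only leans on the POSIX shell convention that exit code 127 means the command wasn't found.

```python
import subprocess

# Labels are made up for this sketch; Codex CLI does not emit them.
NOT_FOUND = "command not found (possible typo)"
NETWORK = "network unavailable (possibly sandbox policy)"
PERMISSION = "missing permission"
OTHER = "ordinary failure"

# Heuristic stderr fragments; an assumption, not an exhaustive list.
NETWORK_HINTS = [
    "ENOTFOUND",
    "EAI_AGAIN",
    "Could not resolve host",
    "Temporary failure in name resolution",
    "Network is unreachable",
]
PERMISSION_HINTS = [
    "permission denied",
    "EACCES",
    "Operation not permitted",
]


def _matches(hints, text):
    """Case-insensitive substring check against a list of hints."""
    lowered = text.lower()
    return any(hint.lower() in lowered for hint in hints)


def classify_failure(exit_code, stderr):
    """Guess why a command failed from its exit code and stderr."""
    if exit_code == 0:
        return "ok"
    # POSIX shells report 127 when the command name cannot be found.
    if exit_code == 127:
        return NOT_FOUND
    if _matches(NETWORK_HINTS, stderr):
        return NETWORK
    if _matches(PERMISSION_HINTS, stderr):
        return PERMISSION
    return OTHER


def run_and_classify(command):
    """Run a shell command and return (exit_code, guessed_cause)."""
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.returncode, classify_failure(result.returncode, result.stderr)
```

Calling run_and_classify("npm install") inside a network-less sandbox would probably land in the network bucket, but only if the error text happens to match one of the hints. That fragility is the point: the agent already knows what its sandbox forbids, so it has better information than any stderr heuristic.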

None of this is surprising once you understand the threat model, but it's jarring in practice when you're just trying to get stuff done. The marketing promises an autonomous coding agent that can run commands, search docs, and wire everything together. The reality is closer to a very smart intern who's locked in a glass room with a whiteboard and no Wi-Fi. Until you unlock some doors.

I still think the constraints are the right default—you don't want a model with root on your machine by accident. But until the tooling grows a smoother way to grant scoped, explicit permissions (like "yes, you may run Docker for this repo" or "yes, you may hit npm for this project"), a lot of the promise of CLI agents is going to be lost in the gap between the demo videos and the sandbox.
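To make "scoped, explicit permissions" concrete, here's a minimal sketch of the shape I have in mind: a grant ties one capability to one repo, and everything else stays denied. The Grant and Policy names, the capability strings, and the paths are all hypothetical. This is not Codex CLI's configuration format, just an illustration of the idea.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Grant:
    """One explicit permission: a capability allowed under one repo root."""
    repo: str          # hypothetical repo root, e.g. "/home/me/app"
    capability: str    # hypothetical label, e.g. "docker"


@dataclass
class Policy:
    """Deny by default; allow only what has been granted explicitly."""
    grants: set = field(default_factory=set)

    def allow(self, repo, capability):
        self.grants.add(Grant(repo, capability))

    def is_allowed(self, cwd, capability):
        # A grant covers the repo root and any directory beneath it.
        path = PurePosixPath(cwd)
        return any(
            grant.capability == capability and path.is_relative_to(grant.repo)
            for grant in self.grants
        )


policy = Policy()
policy.allow("/home/me/app", "docker")
policy.allow("/home/me/app", "network:registry.npmjs.org")
```

With these grants, policy.is_allowed("/home/me/app/web", "docker") is true, while the same request from /home/me/other is refused. The comparison is done per path component, so a sibling directory like /home/me/app2 doesn't slip through on a shared prefix. PurePosixPath.is_relative_to needs Python 3.9 or newer.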

Codex also seems to like undoing changes I've made to code, but that's a fight for another day.