Letting Agents Run, Safely

Part three of a series on agent harnesses. Part one introduced the model-plus-harness idea, and part two split work across subagents. This piece is about the layer where agents stop just talking and start doing.

Up to now our agents have reasoned, called tools, and delegated to specialists. But a lot of useful work is messier than a clean tool call: reading a file, editing it, running a command, checking the output, trying again. For that, an agent needs somewhere to actually work: a folder it can write to and a shell it can run things in. That somewhere is a sandbox.

The catch is obvious the moment you say it out loud. Handing a model a shell is powerful and a little frightening. You want it to be able to run real commands, but not on anything it shouldn't touch. A sandbox is how you give an agent room to work while drawing a hard line around how far that work can reach. Flue gives you three levels of that line, and picking the right one is most of the job.

Why agents need a workspace

A tool is a single, defined action: look up an order, create a ticket. That covers a lot, but not everything. Some tasks are open-ended in a way no fixed tool can capture:

File work. Reviewing a document, editing code, generating a report. The agent needs to read inputs and write outputs, not just return text.
Command work. Running a test suite, installing a package, checking that a script actually runs. These are decided in the moment, not predefined.
Iterate-and-check loops. Run something, read the error, fix it, run it again. This is how real coding agents operate, and it needs a live environment.

A sandbox is that environment. The question is never whether to give the agent one, but how much it should be allowed to reach.

Three levels of reach

Flue offers three sandbox types, and they form a ladder from most contained to most powerful. The right choice depends entirely on how much you trust the work.

Start at the top of the ladder, with the most contained option, and only move down when the task genuinely needs it. Each step down hands the agent more reach into real systems.
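One way to make the ladder concrete is a small decision helper. This is only a sketch: `TaskProfile`, `chooseSandbox`, and every field name are hypothetical and not part of Flue. It simply encodes the rule of starting contained and escalating only when the task demands it, and where exactly you draw each line is a judgment call for your own system.

```typescript
// Hypothetical helper, not part of Flue: encodes the "narrowest that works" rule.
type SandboxKind = 'virtual' | 'local' | 'remote';

interface TaskProfile {
  trustedInput: boolean;       // do we trust whoever wrote the request?
  trustedHost: boolean;        // is the host disposable or otherwise safe to expose?
  needsHostFiles: boolean;     // must the agent touch a real checkout or installed commands?
  needsFullToolchain: boolean; // does the work need a full Linux environment?
}

function chooseSandbox(task: TaskProfile): SandboxKind {
  // Untrusted requests should not run on the application host at all.
  if (!task.trustedInput) return 'remote';
  // A full toolchain is more than the in-memory workspace is meant to provide.
  if (task.needsFullToolchain) return 'remote';
  // Host access is only acceptable when the host itself is trusted.
  if (task.needsHostFiles) return task.trustedHost ? 'local' : 'remote';
  // Default: the application stages exactly the files the agent needs.
  return 'virtual';
}
```

Writing the rule down this way also gives reviewers something to argue with: a task that wants `local` has to say which host files it needs and why that host is trusted.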

The virtual sandbox

By default, every agent works in a virtual sandbox: a lightweight, in-memory workspace. You do not configure anything to get it. A common pattern is a workflow that stages a file, lets the agent work on it, then collects the result:

import { createAgent, type FlueContext } from '@flue/runtime';

// No sandbox field: the agent gets the default virtual sandbox.
const reviewer = createAgent(() => ({
  model: 'anthropic/claude-sonnet-4-6',
  cwd: '/workspace',
}));

export async function run({ init, payload }: FlueContext<{ document: string }>) {
  const harness = await init(reviewer);
  // Stage the input in the otherwise empty workspace.
  await harness.fs.writeFile('document.md', payload.document);

  const session = await harness.session();
  await session.prompt('Review document.md and write your findings to review.md.');

  // Collect the agent's output before the workspace is discarded.
  return { review: await harness.fs.readFile('review.md') };
}

Notice there is no sandbox field. Leaving it out selects the virtual sandbox. Your application writes the input with harness.fs.writeFile, the agent does its work inside the workspace, and you read the output back with harness.fs.readFile. The cwd setting just tells those relative paths where to resolve.

Two things to keep in mind. The virtual sandbox starts empty, with none of your host files, and its contents vanish once the work is done. It is also not a network boundary, so do not treat it as a wall against outbound access. It is the right starting point when your application can hand the agent exactly the files it needs.
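To build intuition for those properties, here is a toy model of an in-memory workspace. It is an illustration only, not Flue's implementation: `Workspace` and `createMemoryWorkspace` are hypothetical names, and the methods are synchronous for brevity where the real `harness.fs` calls shown above are awaited.

```typescript
// Toy model only: illustrates the behaviour described above, not Flue's internals.
interface Workspace {
  writeFile(path: string, content: string): void;
  readFile(path: string): string;
  list(): string[];
}

function createMemoryWorkspace(cwd: string): Workspace {
  // Starts empty: nothing from the host is visible in here.
  const files = new Map<string, string>();

  // Relative paths resolve against cwd; absolute paths are used as given.
  const resolve = (path: string): string =>
    path.startsWith('/') ? path : `${cwd.replace(/\/+$/, '')}/${path}`;

  return {
    writeFile(path, content) {
      files.set(resolve(path), content);
    },
    readFile(path) {
      const content = files.get(resolve(path));
      if (content === undefined) throw new Error(`No such file: ${path}`);
      return content;
    },
    list() {
      return Array.from(files.keys()).sort();
    },
  };
}

const workspace = createMemoryWorkspace('/workspace');
workspace.writeFile('document.md', '# Draft');   // the application stages the input
workspace.writeFile('review.md', 'Looks fine.'); // stands in for the agent's output
```

Because the files live only in the `Map`, they disappear when `workspace` goes out of scope, which is the same reason the real virtual sandbox should be treated as temporary.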

The local sandbox

When a trusted agent should work directly on the real host, on an actual checkout or a CI runner, use local():

import { createAgent } from '@flue/runtime';
import { local } from '@flue/runtime/node';

export default createAgent(() => ({
  model: 'anthropic/claude-sonnet-4-6',
  // local() works directly on this host, with no isolation.
  sandbox: local(),
  // A real checkout on the host, not a staged copy.
  cwd: '/srv/checkouts/catalog-service',
  instructions: 'Inspect the requested change and run only relevant validation.',
}));

Now the agent can reach real files and installed commands through its workspace. This is genuinely useful for development tools and disposable runners, but be clear-eyed about what it is: local() gives the model access to the host, with no isolation between its work and the machine. Use it only where both the host and the input are already trusted.

Flue helps a little here by limiting host environment variables by default. If a command needs a specific value, you expose just that one through local({ env: { ... } }) rather than opening everything. The guiding instinct: when a narrow tool could do the job, prefer the tool over handing the model a broad shell.
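A small allowlist helper keeps that habit honest. The sketch below is an assumption-laden illustration: `pickEnv` is a hypothetical function, `hostEnv` is a stand-in for `process.env`, and it assumes the `env` option accepts a plain name-to-value object, as the `local({ env: { ... } })` shape above suggests.

```typescript
// Hypothetical helper, not part of Flue: copy only explicitly named variables.
function pickEnv(
  source: Record<string, string | undefined>,
  allowed: string[],
): Record<string, string> {
  const env: Record<string, string> = {};
  for (const name of allowed) {
    const value = source[name];
    // Fail loudly rather than silently running a command without a value it needs.
    if (value === undefined) {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    env[name] = value;
  }
  return env;
}

// Stand-in for process.env so the sketch runs anywhere.
const hostEnv = {
  PATH: '/usr/bin',
  NODE_ENV: 'test',
  AWS_SECRET_ACCESS_KEY: 'do-not-leak',
};

// Only NODE_ENV is copied; the secret never enters the object.
const exposed = pickEnv(hostEnv, ['NODE_ENV']);
```

The resulting object is what you would hand to the `env` option, so the list of names in the call becomes the complete, reviewable record of what the model-directed shell can see.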

Remote sandboxes

When the work should not run on your application host at all (untrusted requests, tenant-specific tasks, or anything needing a full Linux toolchain), reach for a remote sandbox. These run in an isolated environment with their own lifecycle, supplied through an integration such as Daytona or, on Cloudflare deployments, a container-backed Cloudflare Sandbox.

With a remote sandbox your application takes responsibility for the decisions that come with it: which workspace belongs to which agent, what credentials and network access it gets, whether it can be reused, and when it is torn down. That is more setup, but it is the correct trade when the work cannot be trusted to run anywhere near your own systems. One note worth checking: a remote integration may expose different workspace capabilities than the virtual and local sandboxes, so read its docs before assuming the same file and command tools are there.
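It can help to write those decisions down as data before wiring up any integration. The record below is purely illustrative: `RemoteWorkspacePolicy`, `shouldTearDown`, and all field names are hypothetical and do not correspond to the Daytona or Cloudflare APIs.

```typescript
// Hypothetical policy record: one field per decision the application now owns.
interface RemoteWorkspacePolicy {
  ownerAgentId: string;   // which agent this workspace belongs to
  secretNames: string[];  // names of credentials to inject, never the values
  allowedHosts: string[]; // outbound allowlist; an empty list means no egress
  reusable: boolean;      // may a later task pick this workspace up again?
  maxIdleMs: number;      // how long a reusable workspace may sit unused
}

interface WorkspaceState {
  taskFinished: boolean;
  idleMs: number;
}

function shouldTearDown(policy: RemoteWorkspacePolicy, state: WorkspaceState): boolean {
  // Single-use workspaces go away as soon as their task is done.
  if (!policy.reusable) return state.taskFinished;
  // Reusable workspaces are reclaimed once they have been idle long enough.
  return state.idleMs >= policy.maxIdleMs;
}
```

Keeping the policy separate from the integration code means the same decisions can be reviewed, and tested, without a remote environment running.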

Optimizing how you use sandboxes

A few habits keep sandboxes both safe and efficient.

Choose the narrowest sandbox that works. This is the single most important rule. Every step down the ladder expands what model-directed work can read, change, run, and reach. Start virtual, and only escalate when the task truly requires it.
Stage only what the agent needs. With the virtual sandbox you control exactly which files exist in the workspace. Hand over the input for this task and nothing more, rather than mounting a whole project the agent does not need.
Keep credentials out of the shell. A broad set of environment variables in a model-directed shell is a standing risk. Expose single values explicitly when required, and let bounded tools handle anything sensitive.
Match the sandbox to the agent, including subagents. As covered in part two, a subagent shares its parent's sandbox boundary. Scope that boundary with the delegated work in mind so a specialist cannot reach further than its job requires.

Best practices

A short checklist for working with sandboxes:

Separate persistence from security. A sandbox controls workspace and command access. It does not decide whether a conversation is remembered. A persisted session does not make a virtual sandbox durable, and a durable remote workspace does not preserve a conversation. Decide each one on its own.
Do not use local() as an isolation boundary. It is for trusted host work, not for untrusted or multi-tenant requests. When you need isolation, use a remote sandbox.
Treat the virtual sandbox as in-memory and temporary. If you need files to survive, plan for it through the workspace lifecycle you choose, not by assuming the sandbox keeps them.
Remember the virtual sandbox is not a network wall. If outbound access matters to your threat model, control it at the sandbox environment level, not by assuming the default contains it.
Let the workspace shape the instructions. An agent told to "run only relevant validation" in a scoped checkout behaves more predictably than one set loose in a broad environment. Narrow the space and the guidance together.

Where to go next

The best way to get a feel for this is to run the virtual sandbox example above, then change one thing: give the agent a slightly bigger task and watch how staging the right input keeps it focused. When a task finally needs the real host or a heavier toolchain, you will know, and stepping down the ladder will feel like a deliberate choice rather than a default. The Flue sandboxes guide has the full set of options and integrations when you are ready. With a model, structure, and a safe place to work now in place, the next article will look at how to see what your agents are actually doing once they are running.

If you ever need help or just want to chat, DM me on Twitter / X or LinkedIn.

Kartik Mehta
