Coding Agent Harness
Dispatch multi-turn coding tasks to external coding agents from within your Agentfield workflows
Under Active Development
The harness is being built as part of Epic #208. APIs and behavior described here reflect the design spec and may change before the stable release.
app.ai() is excellent at what it does: a single-turn call to a language model that returns structured output. You give it a prompt, it gives you back text or a typed Pydantic object. Fast, cheap, predictable.
But some tasks can't be solved in one turn.
Imagine you need an agent to find a bug in a codebase, trace it through three files, write a fix, run the test suite, and confirm the tests pass. That's not a single LLM call. That's a multi-turn session where the agent browses files, edits code, executes commands, reads output, and iterates. No amount of prompt engineering turns app.ai() into that.
That's what .harness() is for.
What .harness() Does
app.harness() dispatches a task to an external coding agent — Claude Code, Codex, Gemini CLI, or OpenCode — and waits for the result. The coding agent runs its full agentic loop: reading files, editing code, running tests, checking output, trying again if something fails. When it's done, .harness() returns a HarnessResult with the agent's output, metrics, and optionally a validated schema instance.
It's the bridge between your Agentfield workflow and the coding agents you already know.
```python
from agentfield import Agent, HarnessConfig

app = Agent(
    node_id="my-agent",
    harness_config=HarnessConfig(
        provider="claude-code",
        model="sonnet",
    ),
)

result = await app.harness("Fix the auth bug in src/auth.py")
print(result.text)
```

The coding agent handles everything in between: browsing the file, understanding the bug, writing the fix, running tests. Your code just dispatches the task and receives the result.
ai() vs harness()
These two methods solve different problems. Knowing when to use each is the key to building effective workflows.
| | app.ai() | app.harness() |
|---|---|---|
| Turns | Single turn | Multi-turn session |
| What it does | Calls an LLM, returns output | Runs a full coding agent loop |
| File access | No | Yes — reads, edits, creates files |
| Command execution | No | Yes — runs tests, builds, scripts |
| Iteration | No | Yes — agent retries until done |
| Latency | Seconds | Minutes |
| Cost | Low | Higher (many LLM calls internally) |
| Best for | Classification, extraction, summarization, routing | Bug fixes, refactors, code generation, test writing |
The mental model: app.ai() is a smart function call. app.harness() is hiring a contractor to do a job.
How It Works
You call app.harness() with a prompt
Your prompt describes the task in plain language. You can optionally pass a working directory, a Pydantic or Zod schema for structured output, tool permissions, and a cost cap.
```python
result = await app.harness(
    "Refactor the database layer to use async/await throughout",
    cwd="/my/project",
    max_turns=100,
    max_budget_usd=3.0,
)
```

Agentfield selects the provider and builds the execution context
The HarnessRunner resolves configuration (constructor defaults merged with per-call overrides), validates the provider, and prepares the prompt. If you passed a schema, it appends output requirements to the prompt instructing the agent to write a JSON file.
The coding agent executes
The external coding agent runs its full loop. It browses your codebase, edits files, runs commands, reads output, and iterates. This can take seconds or minutes depending on the task. The agent has access to the tools you've permitted: Read, Write, Edit, Bash, Glob, Grep.
Results come back as HarnessResult
When the agent finishes, .harness() returns a HarnessResult containing the agent's text output, execution metrics (cost, turns, duration), and — if you passed a schema — a validated typed instance.
```python
print(result.text)       # Agent's summary
print(result.cost_usd)   # What it cost
print(result.num_turns)  # How many iterations
print(result.parsed)     # Typed schema instance (if schema was passed)
```

The Provider Model
Agentfield supports four coding agent providers. You pick one when configuring your agent.
| Provider | Integration | Notes |
|---|---|---|
| claude-code | Native Python/TypeScript SDK | In-process, no binary dependency |
| codex | CLI subprocess (Python) / Native SDK (TypeScript) | OpenAI's coding agent |
| gemini | CLI subprocess | Gemini CLI with JSON output |
| opencode | CLI subprocess | Open-source coding agent |
Claude Code uses the claude_agent_sdk in Python and @anthropic-ai/claude-agent-sdk in TypeScript — running the agent in-process with no subprocess overhead. Codex, Gemini, and OpenCode run as CLI subprocesses, parsing their JSONL event streams.
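Parsing a JSONL event stream is conceptually simple: read stdout line by line, decode each line as JSON, and accumulate the events you care about. A hedged sketch of the idea; the `turn_completed` and `result` event names are made up for illustration, and each CLI defines its own event vocabulary:

```python
import json


def parse_events(jsonl_output: str) -> tuple[str, int]:
    """Illustrative sketch: extract (final_text, num_turns) from a JSONL event stream."""
    final_text = ""
    num_turns = 0
    for line in jsonl_output.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate non-JSON noise on stdout
        if event.get("type") == "turn_completed":
            num_turns += 1
        elif event.get("type") == "result":
            final_text = event.get("text", "")
    return final_text, num_turns
```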
Each provider requires its own authentication setup. See Provider Requirements for installation and auth details.
Schema-Constrained Output
When you need structured data back from a coding task, pass a Pydantic model (Python) or Zod schema (TypeScript). The harness instructs the coding agent to write its output as JSON to a file in the working directory, then reads and validates it after the session completes.
```python
from pydantic import BaseModel

class RefactorResult(BaseModel):
    files_changed: list[str]
    summary: str
    tests_added: bool
    breaking_changes: bool

result = await app.harness(
    "Refactor the auth module to use the new token format",
    schema=RefactorResult,
    cwd="/my/project",
)

print(result.parsed.files_changed)  # ["src/auth.py", "tests/test_auth.py"]
print(result.parsed.tests_added)    # True
```

This approach works identically across all four providers — the agent writes a file, the harness reads it. No provider-specific schema flags, no token limit issues with large schemas.
If the output doesn't validate on the first read, the harness applies an escalating recovery strategy: cosmetic repair (stripping markdown fences, fixing trailing commas), a follow-up prompt in the same session, and finally a full retry. In practice, the agent almost always gets it right the first time.
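Cosmetic repair, the first recovery step, typically amounts to stripping non-JSON wrapping before re-parsing. A minimal sketch of the idea, not the harness's actual repair code:

```python
import json
import re


def repair_json(raw: str) -> dict:
    """Illustrative sketch: make almost-valid JSON output parseable."""
    text = raw.strip()
    # Strip a surrounding markdown code fence like ```json ... ```
    fence = re.match(r"^```(?:json)?\s*(.*?)\s*```$", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    # Remove trailing commas before a closing brace or bracket
    text = re.sub(r",\s*([}\]])", r"\1", text)
    return json.loads(text)
```

Only if this fails does the harness escalate to the more expensive steps, which involve additional LLM calls.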
When to Use harness() vs ai()
Use app.ai() when:
- You need to classify, extract, or summarize content
- The task fits in a single prompt-response exchange
- You need fast, cheap, predictable output
- You're routing decisions or generating structured data for downstream logic
Use app.harness() when:
- The task requires reading multiple files to understand context
- You need the agent to edit code and verify the result
- The work involves running tests or build commands
- The task is open-ended enough that iteration is expected
- You're automating something a developer would do manually
A common pattern is using app.ai() to analyze and plan, then app.harness() to execute:
```python
@app.reasoner()
async def fix_github_issue(issue: dict) -> dict:
    # ai() to understand and plan (fast, cheap)
    plan = await app.ai(
        system="You are a senior engineer. Analyze this issue and describe the fix needed.",
        user=f"Issue: {issue['title']}\n\n{issue['body']}",
        schema=FixPlan,
    )

    # harness() to execute the plan (multi-turn, agentic)
    result = await app.harness(
        f"Implement this fix:\n\n{plan.description}",
        schema=FixResult,
        cwd=issue["repo_path"],
        max_turns=150,
    )
    return result.model_dump()
```

Quick Examples
```python
from agentfield import Agent, HarnessConfig
from pydantic import BaseModel

app = Agent(
    node_id="code-agent",
    harness_config=HarnessConfig(
        provider="claude-code",
        model="sonnet",
    ),
)

class BugFix(BaseModel):
    files_changed: list[str]
    summary: str
    tests_added: bool

@app.reasoner()
async def fix_issue(issue: dict) -> dict:
    fix = await app.harness(
        f"Fix: {issue['title']}\n\n{issue['description']}",
        schema=BugFix,
        cwd=issue["repo_path"],
        max_turns=100,
        tools=["Read", "Write", "Edit", "Bash", "Glob", "Grep"],
    )
    return fix.model_dump()
```

```typescript
import { Agent } from "@agentfield/sdk";
import { z } from "zod";

const agent = new Agent({
  nodeId: "code-agent",
  harnessConfig: {
    provider: "claude-code",
    model: "sonnet",
  },
});

const BugFix = z.object({
  filesChanged: z.array(z.string()),
  summary: z.string(),
  testsAdded: z.boolean(),
});

agent.reasoner("fixIssue", async (issue: Record<string, string>) => {
  const fix = await agent.harness(
    `Fix: ${issue.title}\n\n${issue.description}`,
    {
      schema: BugFix,
      cwd: issue.repoPath,
      maxTurns: 100,
    }
  );
  return fix.parsed;
});
```

Related
- Python SDK: app.harness() — Full parameter reference and return types
- TypeScript SDK: agent.harness() — TypeScript API reference
- Provider Requirements — Installation, authentication, and per-provider configuration