Forked Debug Pattern

When an agent run fails, you don't need to start over. Fork from the last good step, adjust the approach, and resume. The successful early steps are reused — no wasted tokens, no wasted time.

The Problem

An agent completes 7 steps (exploring files, reading code, understanding context, writing an implementation) and then fails at step 8. Without forking, you'd throw away every good step and start from scratch.

Step 1: Create a Run That Fails

forked_debug.py
from opentine import Agent
from opentine.core import PythonPolicy
from opentine.models.anthropic import Anthropic
from opentine.tools.fs import read, write, ls
from opentine.tools.python import execute as _execute

def execute_python(code: str) -> str:
    """Execute Python code in an isolated subprocess."""
    return _execute(code, policy=PythonPolicy(enabled=True))

agent = Agent(
    model=Anthropic("claude-sonnet-4-20250514"),
    tools=[read, write, ls, execute_python],
    system="You are a coding assistant. Implement the requested feature.",
    max_steps=30,
)

# This run fails at step 8: the agent wrote buggy code
run = agent.run_sync("Add CSV export to the reports module")
run.save("csv_export.tine")

# Check the run status
print(f"Status: {run.status.value}")  # Status: failed
print(f"Steps: {len(run.steps)}")     # Steps: 8
print(f"Cost: {run.total_cost:.4f}")  # Cost: 0.0287

Step 2: Inspect the Failure

Use the CLI to see exactly where things went wrong:

Terminal
tine show csv_export.tine

# Output:
# Run: csv_export
# Status: failed | Steps: 8 | Cost: $0.0287 | Duration: 12.4s
#
# [think]  Planning CSV export feature...
# [tool]   ls("src/")
# [tool]   read("src/reports/generator.py")
# [model]  Understanding the report structure...
# [tool]   write("src/reports/csv_export.py")      ← step 5
# [tool]   read("src/reports/csv_export.py")
# [tool]   execute_python("import subprocess, sys; subprocess.run([sys.executable, '-m', 'pytest', 'tests/'], check=True)")
# [error]  Test failed: csv_export missing header row

The agent explored the codebase (steps 1-4), wrote the implementation (step 5), re-read it (step 6), and ran the tests (step 7). The tests failed, and the failure is recorded as the error at step 8. The exploration steps are still perfectly good; only the implementation needs to change.
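The reasoning above can be sketched in plain Python. This snippet does not use opentine: it models the trace printed by `tine show` as a list of `(kind, summary)` tuples, a hypothetical stand-in for `run.steps`, and the helper names `first_error` and `last_write_before` are made up for illustration.

```python
# Hypothetical stand-in for the trace printed by `tine show`; not opentine objects.
TRACE = [
    ("think", "Planning CSV export feature"),
    ("tool", 'ls("src/")'),
    ("tool", 'read("src/reports/generator.py")'),
    ("model", "Understanding the report structure"),
    ("tool", 'write("src/reports/csv_export.py")'),
    ("tool", 'read("src/reports/csv_export.py")'),
    ("tool", "execute_python(<run pytest>)"),
    ("error", "Test failed: csv_export missing header row"),
]


def first_error(trace):
    """Return the 1-based step number of the first error, or None."""
    for number, (kind, _summary) in enumerate(trace, start=1):
        if kind == "error":
            return number
    return None


def last_write_before(trace, step_number):
    """Return the 1-based number of the last write() call before step_number, or None."""
    found = None
    for number, (kind, summary) in enumerate(trace[: step_number - 1], start=1):
        if kind == "tool" and summary.startswith("write("):
            found = number
    return found


error_step = first_error(TRACE)
write_step = last_write_before(TRACE, error_step)
print(f"error at step {error_step}, suspect write at step {write_step}")
print(f"safe to reuse: steps 1-{write_step - 1}")
```

Any step before the suspect write is a reasonable fork point; which one to pick depends on how much of the agent's earlier reasoning you want to keep.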

Step 3: Fork and Fix

Fork from step 3 (after understanding the codebase) and resume with a hint about what went wrong:

forked_debug.py
from opentine import Run, Agent
from opentine.core import PythonPolicy
from opentine.models.anthropic import Anthropic
from opentine.tools.fs import read, write, ls
from opentine.tools.python import execute as _execute

def execute_python(code: str) -> str:
    """Execute Python code in an isolated subprocess."""
    return _execute(code, policy=PythonPolicy(enabled=True))

# Load the failed run
run = Run.load("csv_export.tine")

# Fork from step 3, right after the agent understood the codebase
# but before it wrote the buggy implementation
fork_point = run.steps[2].id
forked = run.fork(from_step_id=fork_point)

# Create a new agent with a hint about the bug
agent = Agent(
    model=Anthropic("claude-sonnet-4-20250514"),
    tools=[read, write, ls, execute_python],
    system="""You are a coding assistant. Implement the requested feature.
Important: CSV files must include a header row as the first line.""",
)

# Resume from the fork point; steps 1-3 are reused
fixed = agent.resume_sync(
    forked,
    prompt="Continue the CSV export with the header-row requirement.",
)
fixed.save("csv_export_v2.tine")

print(f"Status: {fixed.status.value}")  # Status: completed
print(f"Cost: {fixed.total_cost:.4f}")  # Only charged for new steps

The forked run reuses steps 1 through 3 from the original. Only the new implementation and testing steps incur cost.
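One way to picture the reuse is as a prefix copy. The toy model below is an assumption for illustration only, not opentine's implementation: `ToyStep`, `ToyRun`, the `inherited` counter, and the per-step costs are all hypothetical. It shows a fork keeping the steps up to the fork point and charging only for what is appended afterwards.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ToyStep:
    id: str
    kind: str
    cost: float  # illustrative value, not a real price


@dataclass
class ToyRun:
    steps: List[ToyStep] = field(default_factory=list)
    inherited: int = 0  # number of leading steps copied from a parent run

    def fork(self, from_step_id: str) -> "ToyRun":
        """Return a new run holding the steps up to and including from_step_id."""
        for index, step in enumerate(self.steps):
            if step.id == from_step_id:
                prefix = self.steps[: index + 1]
                return ToyRun(steps=list(prefix), inherited=len(prefix))
        raise KeyError(f"no step with id {from_step_id!r}")

    @property
    def new_cost(self) -> float:
        """Cost of the steps added after the fork point."""
        return sum(step.cost for step in self.steps[self.inherited:])


# Eight steps in the original run, forked after the third.
original = ToyRun(steps=[ToyStep(f"s{i}", "tool", 0.5) for i in range(1, 9)])
forked = original.fork(from_step_id=original.steps[2].id)

# A new step appended to the fork is the only one that counts as new cost.
forked.steps.append(ToyStep("s4b", "tool", 0.25))
print(len(forked.steps), forked.new_cost)  # 4 0.25
print(len(original.steps))                 # 8 (the original run is untouched)
```

Copying the prefix into a new list, rather than sharing it, keeps the original run intact so it can be forked again from a different step.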

Step 4: Compare Runs

Use tine diff to see exactly how the two runs diverge:

Terminal
# Compare the original and forked runs
tine diff csv_export.tine csv_export_v2.tine

# Output shows which steps diverged and what changed
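
The idea of a divergence point can be illustrated without the CLI. The sketch below is not how `tine diff` is implemented and does not reproduce its output; it compares two hypothetical lists of step labels and reports where they stop matching. The `forked` list is invented for the example.

```python
from itertools import takewhile
from typing import Sequence


def shared_prefix_length(steps_a: Sequence[str], steps_b: Sequence[str]) -> int:
    """Count the leading steps that are identical in both runs."""
    matching = takewhile(lambda pair: pair[0] == pair[1], zip(steps_a, steps_b))
    return sum(1 for _ in matching)


original = ["think", "ls", "read", "model", "write", "read", "execute_python", "error"]
forked = ["think", "ls", "read", "write", "execute_python"]  # illustrative, not real output

shared = shared_prefix_length(original, forked)
print(f"runs share steps 1-{shared}; they diverge at step {shared + 1}")
```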

Automated Debug Loop

You can automate this pattern: fork from the last good step, feed the error message into the system prompt, and retry. This is especially useful for coding tasks where test failures provide clear feedback.

debug_loop.py
from opentine import Run, Agent
from opentine.core import PythonPolicy
from opentine.models.anthropic import Anthropic
from opentine.tools.fs import read, write, ls
from opentine.tools.python import execute as _execute

def execute_python(code: str) -> str:
    """Execute Python code in an isolated subprocess."""
    return _execute(code, policy=PythonPolicy(enabled=True))

def debug_loop(tine_file: str, max_attempts: int = 3):
    """Fork and retry until the run succeeds or we hit the attempt limit."""
    run = Run.load(tine_file)

    for attempt in range(1, max_attempts + 1):
        if run.status.value == "completed":
            # The loaded run already succeeded; nothing to retry
            return run

        # Find the last successful step
        steps = run.steps
        last_good = None
        for step in steps:
            if step.kind.value == "error":
                break
            last_good = step

        if last_good is None:
            print("No successful steps to fork from")
            return run

        # Fork from the last good step
        forked = run.fork(from_step_id=last_good.id)

        agent = Agent(
            model=Anthropic("claude-sonnet-4-20250514"),
            tools=[read, write, ls, execute_python],
            system=f"Previous attempt failed. Error: {steps[-1].error or steps[-1].outputs}. Try a different approach.",
        )

        run = agent.resume_sync(
            forked,
            prompt="Continue from the fork point with a different implementation.",
        )
        run.save(f"debug_attempt_{attempt}.tine")

        # Check right after resuming so the reported attempt number is the one that succeeded
        if run.status.value == "completed":
            print(f"Succeeded on attempt {attempt}")
            return run

    return run

result = debug_loop("csv_export.tine")

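Each iteration forks from the last successful step and includes the previous error in the system prompt. The agent gets a fresh chance to fix the issue without losing the work from earlier steps. Note that the last successful step can come after the faulty write (in the example above it is step 7, the test run), so the agent has to overwrite the earlier file. Fork from an earlier step if you want the write itself redone.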
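The control flow of the loop, separated from the agent calls, looks roughly like this. It is a minimal sketch under stated assumptions: `retry_with_feedback` and `fake_attempt` are hypothetical names, and the stub stands in for the "fork, build agent, resume" sequence so the attempt counting and error hand-off can be run on their own.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: takes the previous error and returns (succeeded, new_error).
AttemptFn = Callable[[str], Tuple[bool, Optional[str]]]


def retry_with_feedback(attempt_fn: AttemptFn, first_error: str, max_attempts: int = 3):
    """Call attempt_fn until it succeeds, passing the latest error into each attempt.

    Returns (attempts_used, last_error); last_error is None on success.
    """
    error = first_error
    for attempt in range(1, max_attempts + 1):
        succeeded, new_error = attempt_fn(error)
        if succeeded:
            return attempt, None
        error = new_error
    return max_attempts, error


# Stub attempt: succeeds only once it has been told about the header row.
def fake_attempt(previous_error):
    if "header" in previous_error:
        return True, None
    return False, "Test failed: csv_export missing header row"


print(retry_with_feedback(fake_attempt, first_error="Test failed: unknown"))  # (2, None)
```

The first attempt sees an uninformative error and fails with a specific one; the second attempt receives that specific error and succeeds, which is the feedback effect the loop relies on.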

When to Use This Pattern

  • Test failures: The agent wrote code that doesn't pass tests. Fork from before the write step and try again with the error context.
  • Wrong approach: The agent went down the wrong path at step 5 of 15. Fork from step 4 instead of re-running all 15 steps.
  • Model comparison: Fork the same run at the decision point and resume with different models to compare their approaches.
  • Prompt iteration: Fork from the same point with different system prompts to find the best instruction for a tricky step.
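
The model-comparison and prompt-iteration cases share one shape: several branches resumed from the same fork point. The sketch below shows that shape only. `fan_out` and `fake_resume` are hypothetical, steps are plain strings, and a real `resume` callable would wrap whatever agent call you use.

```python
from typing import Callable, Dict, List


def fan_out(
    shared_steps: List[str],
    variants: Dict[str, str],
    resume: Callable[[List[str], str], List[str]],
) -> Dict[str, List[str]]:
    """Resume one independent branch per variant from the same fork point."""
    branches = {}
    for name, variant in variants.items():
        # Copy the shared prefix so branches cannot modify each other's history.
        branches[name] = resume(list(shared_steps), variant)
    return branches


# Stub resume: a real one would call the agent; this only records the variant used.
def fake_resume(steps, variant):
    steps.append(f"write (guided by: {variant})")
    return steps


shared = ["think", "ls", "read"]
branches = fan_out(
    shared,
    {
        "terse": "Include a header row.",
        "detailed": "Write the column names first, then the data rows.",
    },
    fake_resume,
)
for name, steps in branches.items():
    print(name, len(steps), steps[-1])
```

Because every branch starts from its own copy of the shared prefix, the results can be compared side by side without one variant affecting another.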

Next Steps