The missing layer beneath every agent framework

0xcircuitbreaker·May 6, 2026·8 min read

Look at any agent framework and find the place where it records what the agent did. LangChain has callbacks and tracers. LlamaIndex has an observability module. CrewAI has its own task memory. AutoGen has a message history. Pydantic AI has tool-call logs. Every framework has reinvented this layer. None of them interoperate.

This is the missing substrate. A run log exists in every codebase, it is implemented poorly in all of them, and it solves the same problem every time.

The pattern is always the same: the framework starts by just executing agents. Then users ask "how do I see what happened?" So the framework adds a callback system. Then users ask "how do I save this for later?" So the framework adds serialization. Then users ask "how do I replay this?" So the framework adds a half-baked replay mode that doesn't quite work because the serialization wasn't designed for it. Then users ask "how do I share this with my teammate who uses a different framework?" And the answer is: you can't.

I watched this play out in three different frameworks in 2024 and 2025. Each one was competently built. Each one had thoughtful maintainers. Each one ended up with a run-log module that was bolted on, inconsistent with the rest of the design, and incompatible with every other framework's version of the same thing.

The reason is structural, not a failure of any individual project. Frameworks are optimizing for "make it easy to build an agent." The run log serves that goal as an aid; it is not the product. When your job is to ship a framework, the right amount of effort to spend on the run log is "the minimum that keeps users from complaining." And that minimum is far below what a properly designed run log would require.

So every framework ships a run log that is 30% correct and 0% compatible.

opentine is what you get when you design the run log as the primary thing and leave the agent-execution concerns to the existing frameworks.

Concretely this means:

A run is a first-class data structure, specified as a graph rather than emitted as a side effect of framework callbacks. The current Python package proves the shape with native agents, `.tine` load/save/fork/diff/replay, and scoped external CLI harnesses.

The storage format (.tine) is versioned, content-addressed, and framework-agnostic. It is a file on disk, not rows in a service's database.

Adding opentine to an existing workflow today means either using the native `Agent` API or wrapping a supported external CLI harness with `tine run --harness ...`. Broader framework tracers can come later, but the artifact contract has to be correct first.

The primitive layer has to stay focused, because it is trying to be boring infrastructure: the layer you stop thinking about. A sprawling run-log library cannot be the universal substrate because it carries too much opinion.
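To make the first two points concrete, here is a minimal sketch of a run stored as a graph of content-addressed steps and serialized to a versioned document. It is an illustration under stated assumptions, not opentine's actual API or the real `.tine` layout: the names `Step`, `Run`, `content_id`, `dumps`, `loads`, and `FORMAT_VERSION` are all hypothetical, and the choice of SHA-256 over sorted-key JSON is my assumption for the example.

```python
import hashlib
import json
from dataclasses import dataclass, field

FORMAT_VERSION = 1  # hypothetical; not the real .tine version number


def content_id(payload: dict) -> str:
    """Hash a canonical JSON encoding so identical content always gets the same id."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Step:
    """One node in the run graph (hypothetical shape)."""
    kind: str            # e.g. "model_call" or "tool_call"
    inputs: dict
    outputs: dict
    parents: tuple = ()  # ids of the steps this one depends on

    def to_dict(self) -> dict:
        return {"kind": self.kind, "inputs": self.inputs,
                "outputs": self.outputs, "parents": list(self.parents)}

    @property
    def id(self) -> str:
        # The id is derived from the content, never assigned by a framework.
        return content_id(self.to_dict())


@dataclass
class Run:
    steps: dict = field(default_factory=dict)  # step id -> Step

    def add(self, step: Step) -> str:
        missing = [p for p in step.parents if p not in self.steps]
        if missing:
            raise KeyError(f"unknown parent steps: {missing}")
        self.steps[step.id] = step
        return step.id


def dumps(run: Run) -> str:
    """Serialize the run as a versioned JSON document keyed by step id."""
    doc = {"version": FORMAT_VERSION,
           "steps": {sid: s.to_dict() for sid, s in run.steps.items()}}
    return json.dumps(doc, sort_keys=True, indent=2)


def loads(text: str) -> Run:
    """Parse a document, rejecting unknown versions and ids that do not match content."""
    doc = json.loads(text)
    if doc.get("version") != FORMAT_VERSION:
        raise ValueError(f"unsupported format version: {doc.get('version')!r}")
    run = Run()
    for sid, raw in doc["steps"].items():
        step = Step(raw["kind"], raw["inputs"], raw["outputs"], tuple(raw["parents"]))
        if step.id != sid:
            raise ValueError(f"content hash mismatch for step {sid}")
        run.steps[sid] = step
    return run
```

Because each id is derived from the step's content, two tools that record the same step agree on its name without coordinating, and a loader can detect an edited or corrupted file before trusting it.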

The analogy I keep coming back to is POSIX. POSIX didn't win because it was the most elegant system interface. It won because it was small enough to be portable, specified enough to be reimplementable, and neutral enough that every OS vendor could adopt it without conceding territory. That is the role I want .tine to play for agent runs.

The frameworks above us are doing hard, valuable work. Routing messages between agents, managing state machines, building graph execution: that work is complicated, and I don't want to rebuild it. But the run log underneath all of them should converge on shared artifacts: the same file format, the same hash scheme, and explicit replay/resume semantics that say what can be safely reused and what needs a native runtime or harness.
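As a rough illustration of what "explicit replay/resume semantics" could mean, the sketch below labels each step in a run graph as either reusable from the log or in need of re-execution once some step has changed. This is my own simplification, not opentine's real replay logic: `plan_replay` and the `"reuse"`/`"rerun"` labels are hypothetical, and the only rule assumed here is that a step must be rerun if it changed or if any of its ancestors did.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def plan_replay(parents: dict[str, list[str]], changed: set[str]) -> dict[str, str]:
    """Label each step "reuse" or "rerun".

    parents maps a step id to the ids it depends on.
    changed is the set of steps whose inputs were edited (e.g. at a fork point).
    """
    dirty: set[str] = set()
    plan: dict[str, str] = {}
    # static_order() yields every step after all of its parents.
    for step in TopologicalSorter(parents).static_order():
        if step in changed or any(p in dirty for p in parents.get(step, ())):
            dirty.add(step)
            plan[step] = "rerun"   # needs a native runtime or harness
        else:
            plan[step] = "reuse"   # recorded output can be taken from the log
    return plan


if __name__ == "__main__":
    graph = {"plan": [], "search": ["plan"], "summarize": ["search"], "lint": ["plan"]}
    for step, action in plan_replay(graph, changed={"search"}).items():
        print(step, action)
```

The point of writing the rule down is that any reader of the file, in any framework, reaches the same answer about which recorded outputs are still trustworthy.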

This is the bet. Not "opentine is a better framework." We're not a framework. opentine is the layer that exists once, underneath all of them, so the tenth agent framework doesn't have to rebuild the run log for the tenth time.