Why I made run graphs content-addressed
Content addressing is the idea that the identity of a piece of data is derived from its content, not from where it lives or when it was created. Git does this with commits. IPFS does it with files. opentine does it with agent steps.
In opentine, every Step has an ID that's the hash of its immutable payload: parent links, kind, inputs, outputs, model/tool metadata, and errors. Timestamps, duration, and cost are recorded, but they are not part of the step ID. That one decision has four practical consequences.
First: reusable prefixes. When you fork a run from step 7, steps 1-7 aren't re-executed. The forked run preserves the same content-addressed graph prefix and records the fork point explicitly.
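To make the fork behavior concrete, here is a minimal sketch of a linear run whose step IDs chain through their parents. It is an illustration under assumptions, not opentine's code: `step_id`, `build_chain`, the `forked_from_step` field, and the payload shapes are all hypothetical names.

```python
import hashlib
import json


def step_id(parent_ids, payload):
    """Hash the parent links together with the payload (hypothetical layout)."""
    body = json.dumps(
        {"parents": parent_ids, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def build_chain(payloads, prefix=()):
    """Append payloads to an existing chain of step IDs and return the new chain."""
    ids = list(prefix)
    for payload in payloads:
        parents = [ids[-1]] if ids else []
        ids.append(step_id(parents, payload))
    return ids


# An original run with ten steps.
original = build_chain([{"n": i} for i in range(1, 11)])

# Fork after step 7: reuse the first seven IDs as-is, then diverge.
fork_point = 7
fork = {
    "forked_from_step": fork_point,  # hypothetical field recording the fork point
    "steps": build_chain(
        [{"n": 8, "variant": "retry"}],
        prefix=original[:fork_point],
    ),
}

shared = fork["steps"][:fork_point] == original[:fork_point]
diverged = fork["steps"][fork_point] != original[fork_point]
```

Nothing in the prefix is recomputed or copied under a new name. The fork simply points at the same seven IDs, and only the new step gets a new hash.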
Second: caching. Model and tool calls also get semantic cache keys, stored separately from immutable step IDs. That cache provenance is what lets replay reuse recorded work without pretending every rerun is safe.
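A rough sketch of what keeping cache keys apart from step IDs could look like follows. The key layout, the `record` and `replay` helpers, and the in-memory `cache` dict are assumptions made for illustration. The source only says that cache keys are semantic and stored separately from step IDs.

```python
import hashlib
import json


def cache_key(kind, name, args):
    """Derive a key from what the call means, not from where it sits in the graph."""
    body = json.dumps(
        {"kind": kind, "name": name, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


# The cache lives beside the graph: key -> recorded output plus the step that produced it.
cache = {}


def record(kind, name, args, output, step_id):
    """Store a recorded result along with its provenance."""
    cache[cache_key(kind, name, args)] = {"output": output, "step_id": step_id}


def replay(kind, name, args):
    """Return the recorded entry, or None when the call has to run live."""
    return cache.get(cache_key(kind, name, args))


record("tool", "search", {"q": "tine"}, ["result"], step_id="abc123")

hit = replay("tool", "search", {"q": "tine"})    # same semantic call: reuse it
miss = replay("tool", "search", {"q": "other"})  # different call: no reuse
```

Because each entry carries the ID of the step that produced it, a replay can say exactly which recorded work it reused, and anything without a matching entry is visibly a live call.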
Third: integrity. Saved artifacts include a SHA-256 checksum computed over the redacted artifact body, not over the metadata. Use `Run.verify_integrity(...)` or `tine verify` before trusting a file copied through email, chat, or artifact storage. It is a checksum, not a signature: it detects corruption and accidental modification, but it does not prove who produced the file.
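The difference between a checksum and a signature is easy to show in a few lines. This sketch does not reproduce the real `.tine` layout or the behavior of `Run.verify_integrity(...)`. The `metadata`/`body` structure and the `seal` and `verify` helpers are hypothetical.

```python
import hashlib
import hmac
import json


def body_checksum(body):
    """SHA-256 over a canonical JSON encoding of the artifact body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(body):
    """Wrap a body with metadata that records its checksum (hypothetical layout)."""
    return {"metadata": {"sha256": body_checksum(body)}, "body": body}


def verify(artifact):
    """Recompute the checksum and compare it with the recorded one."""
    expected = artifact["metadata"].get("sha256", "")
    return hmac.compare_digest(expected, body_checksum(artifact["body"]))


artifact = seal({"steps": ["a", "b"]})
intact = verify(artifact)

# Simulate damage in transit: the body changes, the recorded checksum does not.
artifact["body"]["steps"].append("c")
damaged = verify(artifact)

# Anyone can recompute the checksum after editing, which is why this is not a signature.
artifact["metadata"]["sha256"] = body_checksum(artifact["body"])
resealed = verify(artifact)
```

The last three lines are the caveat in code form: a re-sealed file verifies cleanly, so the check tells you the file is internally consistent, not that it came from a trusted author.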
Fourth: portability. A .tine file is a self-contained run graph where every step is identified by content hash. Move it between machines, share it with teammates, store it in S3 — the hashes are the same everywhere. No database IDs, no server state.
The implementation is straightforward: opentine uses SHA-256 over a canonical JSON-compatible representation of the step's immutable payload. Because the hash covers only the data that defines step identity, identical payloads always produce identical IDs, so shared graph prefixes can be reused exactly.
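As a minimal sketch of that idea, assuming sorted keys and compact separators as the canonical form (the source does not spell out opentine's exact canonicalization), the field names below are illustrative and `step_id` is a hypothetical helper.

```python
import hashlib
import json


def step_id(payload):
    """SHA-256 over a canonical JSON encoding of the immutable payload."""
    canonical = json.dumps(
        payload,
        sort_keys=True,          # key order must not affect the hash
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


payload = {
    "parents": [],
    "kind": "model_call",
    "inputs": {"prompt": "hello"},
    "outputs": {"text": "hi"},
    "meta": {"model": "example-model"},  # placeholder model name
    "error": None,
}

# The same payload with keys in a different order hashes to the same ID.
reordered = dict(reversed(list(payload.items())))
same_id = step_id(payload) == step_id(reordered)

# Timestamps and duration sit next to the ID, not inside the hashed payload,
# so two executions at different times still share one step ID.
record_a = {"id": step_id(payload), "started_at": "2024-01-01T00:00:00Z", "duration_ms": 120}
record_b = {"id": step_id(payload), "started_at": "2024-06-01T00:00:00Z", "duration_ms": 95}

# Changing anything inside the payload changes the ID.
changed = step_id({**payload, "outputs": {"text": "bye"}})
```

The canonical form matters more than the hash function here. Without a fixed key order and whitespace policy, two machines could serialize the same payload differently and disagree about its ID.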
This is the same principle that makes Git fast and reliable, applied to agent runs instead of source code.