Ollama (Local)

The Ollama adapter connects opentine to locally running models via the Ollama HTTP API. This gives you zero API costs, full data privacy, and no rate limits. It supports tool use but does not support extended thinking.

Prerequisites

You need Ollama installed and running with at least one model pulled.

Terminal
# Install Ollama from https://ollama.com
# Then pull a model:
ollama pull llama3.1
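Before running an agent, it can help to confirm the model is actually installed. A minimal sketch (not part of opentine) that checks a model name against the JSON returned by Ollama's GET /api/tags endpoint, accounting for the implicit ":latest" tag:

check_model.py
```python
import json
import urllib.request

def has_model(tags: dict, name: str) -> bool:
    """Return True if `name` appears in an Ollama /api/tags payload.

    Ollama lists models as "name:tag" (e.g. "llama3.1:latest"), so a
    bare name also matches its ":latest" variant.
    """
    wanted = name if ":" in name else f"{name}:latest"
    return any(m["name"] == wanted for m in tags.get("models", []))

def list_local_models(host: str = "http://localhost:11434") -> dict:
    """Fetch the installed-model list from a running Ollama server."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return json.load(resp)
```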

Installation

The Ollama adapter communicates over HTTP, so no additional Python dependencies are required beyond the base opentine package.

Terminal
pip install opentine
# No extra needed — the Ollama adapter uses the HTTP API directly

Quick Start

agent.py
from opentine import Agent
from opentine.models.ollama import Ollama

# Connects to localhost:11434 by default
model = Ollama("llama3.1")

agent = Agent(model=model, tools=[...])
run = agent.run_sync("Explain the difference between TCP and UDP")


Constructor

Signature
Ollama(
    model: str = "llama3.1",
    host: str | None = None,  # Falls back to OLLAMA_HOST env var, default http://localhost:11434
)
  • model — The Ollama model name. Defaults to "llama3.1". Must be a model you've already pulled with ollama pull.
  • host — The Ollama server URL. If not provided, the adapter falls back to the OLLAMA_HOST environment variable, then defaults to http://localhost:11434.

Host Configuration

By default the adapter connects to Ollama on localhost:11434. You can override this for remote Ollama instances or non-standard ports.

host.py
from opentine.models.ollama import Ollama

# Default: connects to http://localhost:11434
model = Ollama("llama3.1")

# Custom host
model = Ollama("llama3.1", host="http://192.168.1.100:11434")

# Or set the environment variable
# export OLLAMA_HOST="http://192.168.1.100:11434"

Properties

properties.py
from opentine.models.ollama import Ollama

model = Ollama("llama3.1")

print(model.name)               # "ollama/llama3.1"
print(model.supports_tools)     # True
print(model.supports_thinking)  # False
  • name — Returns "ollama/{model}" (e.g. "ollama/llama3.1").
  • supports_tools — Always True. Ollama supports tool calling via its API.
  • supports_thinking — Always False. Extended thinking is not supported for local models.
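The three properties behave as if implemented like this (a hypothetical sketch for reference, not the actual opentine source):

```python
class OllamaCapabilities:
    """Mirrors the adapter's capability surface for a given model name."""

    supports_tools = True       # Ollama's chat API accepts tool definitions
    supports_thinking = False   # extended thinking is unavailable for local models

    def __init__(self, model: str):
        self.model = model

    @property
    def name(self) -> str:
        return f"ollama/{self.model}"
```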

Available Models

Ollama supports a wide range of open-source models. Pull any model before using it.

Terminal
# List available models on your Ollama instance
ollama list

# Pull new models
ollama pull llama3.1
ollama pull codellama
ollama pull mistral
ollama pull mixtral
Model      Size        Notes
llama3.1   8B          General purpose, good tool use
codellama  7B          Optimized for code generation
mistral    7B          Fast and capable
mixtral    47B (MoE)   Mixture of experts, high quality

Tool Use

Ollama models support tool calling. Pass your tools to the Agent as you would with any other adapter.

tool_use.py
from opentine import Agent
from opentine.models.ollama import Ollama
from opentine.tools.fs import read, write
from opentine.tools.python import execute

model = Ollama("llama3.1")

# Ollama models support tool use
agent = Agent(
    model=model,
    tools=[read, write, execute],
    system="You are a coding assistant. Use the tools to read, write, and run code.",
    max_steps=20,
)

run = agent.run_sync("Create a Python script that generates Fibonacci numbers")
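Under the hood, tool definitions reach Ollama through its /api/chat endpoint as JSON Schema function descriptions. A sketch of the request body the adapter plausibly assembles (the payload shape follows Ollama's chat API; the exact wiring inside opentine, and the `read` tool schema, are assumptions for illustration):

```python
def build_chat_request(model: str, messages: list, tools: list) -> dict:
    """Assemble a non-streaming /api/chat payload with tool schemas."""
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "stream": False,  # tool calls arrive in a single response
    }

# Each tool is described with a JSON Schema, e.g. a hypothetical `read` tool:
read_tool = {
    "type": "function",
    "function": {
        "name": "read",
        "description": "Read a file from disk",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}

payload = build_chat_request(
    "llama3.1",
    [{"role": "user", "content": "Read config.toml"}],
    [read_tool],
)
```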

Remote Ollama

You can point the adapter at an Ollama instance running on another machine, such as a dedicated GPU server.

remote.py
from opentine.models.ollama import Ollama

# Connect to Ollama running on another machine
model = Ollama("llama3.1", host="http://gpu-server.local:11434")

# Or use an SSH tunnel
# ssh -L 11434:localhost:11434 gpu-server.local
model = Ollama("llama3.1")  # Connects to forwarded port

Cost Tracking

Since Ollama runs models locally, the cost field in responses is always 0.0. Your run tree will still track step durations and other metadata as usual.

cost.py
from opentine.models.ollama import Ollama

model = Ollama("llama3.1")

# Ollama models run locally, so cost is always 0.0
result = await model.complete(
    messages=[{"role": "user", "content": "Hello!"}],
)
print(result["cost"])  # 0.0

Next Steps