Ollama (Local)

The Ollama adapter connects opentine to locally running models via the Ollama HTTP API. Running locally means zero API costs, full data privacy, and no rate limits. The adapter exposes tool schemas to the model and, when you pass think=, forwards Ollama's thinking control for model families that expose thinking output.

Prerequisites

You need Ollama installed and running with at least one model pulled.

Terminal
# Install Ollama from https://ollama.com
# Then pull a model:
ollama pull llama3.1
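
Before constructing the adapter, it can help to confirm that the server is reachable and the model has been pulled. The sketch below is not part of opentine. It assumes Ollama's GET /api/tags endpoint, which lists local models, and the helper names (pulled_models, is_pulled, fetch_tags) are hypothetical.

```python
import json
import urllib.request


def pulled_models(tags_payload: dict) -> list[str]:
    """Extract model names from an /api/tags response body."""
    return [entry["name"] for entry in tags_payload.get("models", [])]


def is_pulled(model: str, names: list[str]) -> bool:
    """Match 'llama3.1' against tagged names such as 'llama3.1:latest'."""
    # An untagged name is treated as the ':latest' tag.
    wanted = model if ":" in model else f"{model}:latest"
    return wanted in names


def fetch_tags(host: str = "http://localhost:11434") -> dict:
    """Query a running Ollama server; raises URLError if it is unreachable."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as response:
        return json.load(response)
```

With a server running, is_pulled("llama3.1", pulled_models(fetch_tags())) should report whether the pull above succeeded.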

Installation

The Ollama adapter communicates over HTTP, so no additional Python dependencies are required beyond the base opentine package.

Terminal
pip install opentine
# No extra needed — the Ollama adapter uses the HTTP API directly

Quick Start

agent.py
from opentine import Agent
from opentine.models.ollama import Ollama

# Connects to localhost:11434 by default
model = Ollama("llama3.1")

agent = Agent(model=model, tools=[...])
run = agent.run_sync("Explain the difference between TCP and UDP")

Constructor

Signature
Ollama(
    model: str = "llama3.1",
    host: str | None = None,  # Falls back to OLLAMA_HOST env var, default http://localhost:11434
    think: bool | str | None = None,
)
  • model — The Ollama model name. Defaults to "llama3.1". Must be a model you've already pulled with ollama pull.
  • host — The Ollama server URL. If not provided, the adapter falls back to the OLLAMA_HOST environment variable, then defaults to http://localhost:11434.
  • think — Optional Ollama thinking control. It is sent only when explicitly provided.

Host Configuration

By default the adapter connects to Ollama on localhost:11434. You can override this for remote Ollama instances or non-standard ports.

host.py
from opentine.models.ollama import Ollama

# Default: connects to http://localhost:11434
model = Ollama("llama3.1")

# Custom host
model = Ollama("llama3.1", host="http://192.168.1.100:11434")

# Or set the environment variable
# export OLLAMA_HOST="http://192.168.1.100:11434"
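
The lookup order described above (explicit host, then OLLAMA_HOST, then the default) can be sketched as a small function. This is an illustration of the documented precedence, not the adapter's actual code. The name resolve_host and the env parameter are hypothetical, and the sketch does not normalize values that lack a scheme.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

DEFAULT_HOST = "http://localhost:11434"


def resolve_host(host: str | None = None, env: Mapping[str, str] | None = None) -> str:
    """Return the explicit host, else OLLAMA_HOST, else the default URL."""
    # Accepting env as a parameter keeps the function easy to test.
    env = os.environ if env is None else env
    return host or env.get("OLLAMA_HOST") or DEFAULT_HOST
```

An explicit host argument wins even when OLLAMA_HOST is set, which matches the order in which the constructor parameters are described.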

Properties

properties.py
from opentine.models.ollama import Ollama

model = Ollama("llama3.1")

print(model.name)               # "ollama/llama3.1"
print(model.supports_tools)     # True
print(model.supports_thinking)  # False for llama3.1

thinking_model = Ollama("qwen3")
print(thinking_model.supports_thinking)  # True
  • name returns "ollama/{model}" (e.g. "ollama/llama3.1").
  • supports_tools is True in the adapter. Provider and model-specific tool behavior should be validated against your local Ollama model.
  • supports_thinking is True for model names starting with qwen3, deepseek-r1, deepseek-v3.1, or gpt-oss; otherwise False.
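
The prefix rule for supports_thinking and the format of name can be sketched as follows. This is a minimal illustration of the behavior described above, not the adapter's source. THINKING_PREFIXES, supports_thinking, and display_name are hypothetical names.

```python
# Prefixes listed in the property description above.
THINKING_PREFIXES = ("qwen3", "deepseek-r1", "deepseek-v3.1", "gpt-oss")


def supports_thinking(model: str) -> bool:
    """True when the model name starts with one of the known prefixes."""
    # str.startswith accepts a tuple and checks each prefix in turn.
    return model.startswith(THINKING_PREFIXES)


def display_name(model: str) -> str:
    """Format the name the same way the name property is described."""
    return f"ollama/{model}"
```

Because the check is prefix-based, a tagged name such as "qwen3:8b" also counts as a thinking model under this rule.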

Available Models

Ollama supports a wide range of open-source models. Pull any model before using it.

Terminal
# List available models on your Ollama instance
ollama list

# Pull new models
ollama pull llama3.1
ollama pull codellama
ollama pull mistral
ollama pull mixtral

Model     | Size      | Notes
llama3.1  | 8B        | General purpose, good tool use
codellama | 7B        | Optimized for code generation
mistral   | 7B        | Fast and capable
mixtral   | 47B (MoE) | Mixture of experts, high quality

Tool Use

The adapter exposes tool-calling support through Ollama's API. Pass your tools to the Agent as you would with any other adapter, then confirm that the local model you selected actually emits tool calls, since tool behavior varies by model.

tool_use.py
from opentine import Agent
from opentine.core import PythonPolicy
from opentine.models.ollama import Ollama
from opentine.tools.fs import read, write
from opentine.tools.python import execute as _execute

def execute_python(code: str) -> str:
    """Execute Python code in an isolated subprocess."""
    return _execute(code, policy=PythonPolicy(enabled=True))

model = Ollama("llama3.1")

# The adapter exposes tool-calling support; validate each local model.
agent = Agent(
    model=model,
    tools=[read, write, execute_python],
    system="You are a coding assistant. Use the tools to read, write, and run code.",
    max_steps=20,
)

run = agent.run_sync("Create a Python script that generates Fibonacci numbers")
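
For orientation, Ollama's chat API accepts tools as JSON function descriptions. The sketch below shows one way such a description could be derived from a typed Python function. How opentine builds its schemas is not documented here, so treat tool_schema and JSON_TYPES as hypothetical names and the mapping as an assumption. The stub execute_python stands in for the tool defined above.

```python
import inspect
from typing import Any, Callable, get_type_hints

# Simplified mapping from Python annotations to JSON Schema type names.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def tool_schema(func: Callable[..., Any]) -> dict:
    """Describe a function in the function-tool shape used by Ollama's chat API."""
    hints = get_type_hints(func)
    properties = {}
    required = []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def execute_python(code: str) -> str:
    """Execute Python code in an isolated subprocess."""
    raise NotImplementedError  # stub: only the signature and docstring are used
```

Printing tool_schema(execute_python) is a quick way to see what a model would be shown, which helps when checking whether a given local model handles the tool correctly.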

Thinking Controls

supports_thinking is a capability flag only; it does not send anything to Ollama by itself. To send Ollama's thinking control, pass think= explicitly when constructing the model.

thinking.py
from opentine.models.ollama import Ollama

# Passing think sends Ollama's explicit thinking control in the API payload.
model = Ollama("qwen3", think=True)
model = Ollama("deepseek-r1", think="high")
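
The "sent only when explicitly provided" behavior can be sketched as a payload builder. This is an assumption about the request shape based on Ollama's /api/chat fields (model, messages, stream, think), not the adapter's actual code, and build_chat_payload is a hypothetical name.

```python
from __future__ import annotations


def build_chat_payload(
    model: str,
    messages: list[dict],
    think: bool | str | None = None,
) -> dict:
    """Build a chat request body, adding 'think' only when it was provided."""
    payload: dict = {"model": model, "messages": messages, "stream": False}
    # Compare against None rather than truthiness so think=False is still sent.
    if think is not None:
        payload["think"] = think
    return payload
```

The None check matters: an explicit think=False is a request to disable thinking, which is different from leaving the field out and letting the server decide.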

Remote Ollama

You can point the adapter at an Ollama instance running on another machine, such as a dedicated GPU server.

remote.py
from opentine.models.ollama import Ollama

# Connect to Ollama running on another machine
model = Ollama("llama3.1", host="http://gpu-server.local:11434")

# Or use an SSH tunnel
# ssh -L 11434:localhost:11434 gpu-server.local
model = Ollama("llama3.1")  # Connects to forwarded port

Cost Tracking

Since Ollama runs models locally, the cost field in responses is always 0.0. Your run graph will still track step durations and other metadata as usual.

cost.py
import asyncio

from opentine.models.ollama import Ollama

model = Ollama("llama3.1")


async def main() -> None:
    # Ollama models run locally, so cost is always 0.0
    result = await model.complete(
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(result["cost"])  # 0.0


# complete() is awaited, so it needs a running event loop
asyncio.run(main())
