
Ollama

Run any open model locally. Zero API fees, total privacy.

Free · LLM Provider · Stable

Why Ollama

Ollama is Wilson's default AI backend. It downloads and runs open-weight language models on your hardware — no API keys, no cloud accounts, no per-token fees. Your financial data never leaves your machine.

Ollama supports Metal acceleration on macOS and CUDA on Linux and Windows, so inference is fast even on consumer hardware. Wilson auto-detects Ollama at startup and picks the best available model for each task.

The Ollama library has over 12,000 models. Wilson works best with small, fast models in the 2–7B parameter range — they handle categorization, anomaly detection, and structured output without needing a data-center GPU.
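Wilson's actual selection logic isn't documented here, but the idea can be sketched in a few lines of shell: walk a preference list of small models and take the first one that the local inventory reports installed. The function name and preference order below are illustrative assumptions, not Wilson's real code.

```shell
# Hypothetical sketch of model auto-selection: given a newline-separated
# inventory of installed models, return the first match from a fixed
# preference list of small, fast models.
pick_model() {
  installed="$1"   # e.g. the output of `ollama list`
  for candidate in qwen2.5:3b llama3.2:3b mistral:7b gemma2:2b phi3:mini; do
    if echo "$installed" | grep -q "^$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # nothing suitable installed
}

# Example with a hard-coded inventory:
pick_model "mistral:7b
llama3.2:3b"
# → llama3.2:3b (ranked above mistral:7b in the preference list)
```

Against a real install, `pick_model "$(ollama list)"` would feed the actual inventory into the same logic.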

What you get

12,000+ open models available from the Ollama library

Metal and CUDA GPU acceleration for fast inference

Native tool calling — Wilson's agent loop works out of the box

Deploy fine-tuned models via custom Modelfiles
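As a sketch of the last point: a Modelfile layers parameters and a system prompt on top of a base model. The model name `wilson-finance` and the settings below are hypothetical examples, not something Wilson ships.

```shell
# Write a minimal Modelfile: a base model plus a system prompt and a
# low temperature for more deterministic structured output.
cat > Modelfile <<'EOF'
FROM qwen2.5:3b
PARAMETER temperature 0.2
SYSTEM You are a concise assistant for financial categorization tasks.
EOF

# Build the custom model (skipped if Ollama isn't installed):
command -v ollama >/dev/null && ollama create wilson-finance -f Modelfile \
  || echo "ollama not installed; skipping create"
```

Once created, the custom model appears in `ollama list` like any other and can be pulled into Wilson's model selection.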

Getting started

1. Install Ollama

$ curl -fsSL https://ollama.com/install.sh | sh

2. Pull a model

$ ollama pull qwen2.5:3b

3. Run Wilson

$ wilson

> Ollama detected. Using qwen2.5:3b.
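If Wilson reports that Ollama was not found, you can confirm the server is reachable yourself. The helper below is an illustrative sketch; Ollama's HTTP API listens on port 11434 by default, and `/api/version` is one of its endpoints.

```shell
# Report whether a local Ollama server is answering on the default port.
ollama_status() {
  if curl -sf "http://localhost:11434/api/version" >/dev/null 2>&1; then
    echo "up"
  else
    echo "down"
  fi
}

ollama_status
```

If this prints `down`, start the server with `ollama serve` and try again.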

Recommended models

These models are tested with Wilson and work well for financial analysis tasks.

Model         Size     Best for
qwen2.5:3b    1.9 GB   Default — fast categorization and analysis
llama3.2:3b   2.0 GB   Strong general reasoning, good tool calling
mistral:7b    4.1 GB   Higher accuracy for complex financial analysis
gemma2:2b     1.6 GB   Lightweight — works on 8 GB RAM machines
phi3:mini     2.3 GB   Microsoft's small model — good structured output

Run AI on your terms.

Install Ollama, pull a model, and Wilson handles the rest. No cloud account required.