
Ollama

Run any open model locally. Zero API fees, total privacy.

Free · LLM Provider · Stable

Why Ollama

Ollama is Wilson's default AI backend. It downloads and runs open-weight language models on your hardware — no API keys, no cloud accounts, no per-token fees. Your financial data never leaves your machine.

Ollama supports Metal acceleration on macOS and CUDA on Linux and Windows, so inference is fast even on consumer hardware. Wilson auto-detects Ollama at startup and picks the best available model for each task.

The Ollama library has over 12,000 models. Wilson works best with small, fast models in the 2–7B parameter range — they handle categorization, anomaly detection, and structured output without needing a data-center GPU.
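Wilson's actual selection logic isn't documented here, but the idea can be sketched in a few lines of shell: walk a preference list of small models and take the first one that the local inventory reports installed. The function name and preference order below are illustrative assumptions, not Wilson's real code.

```shell
# Hypothetical sketch of model auto-selection: given a newline-separated
# inventory of installed models, return the first match from a fixed
# preference list of small, fast models.
pick_model() {
  installed="$1"   # e.g. the output of `ollama list`
  for candidate in qwen2.5:3b llama3.2:3b mistral:7b gemma2:2b phi3:mini; do
    if echo "$installed" | grep -q "^$candidate"; then
      echo "$candidate"
      return 0
    fi
  done
  return 1   # nothing suitable installed
}

# Example with a hard-coded inventory:
pick_model "mistral:7b
llama3.2:3b"
# → llama3.2:3b (ranked above mistral:7b in the preference list)
```

Against a real install, `pick_model "$(ollama list)"` would feed the actual inventory into the same logic.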

What you get

12,000+ open models available from the Ollama library

Metal and CUDA GPU acceleration for fast inference

Native tool calling — Wilson's agent loop works out of the box

Deploy fine-tuned models via custom Modelfiles
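As a sketch of the last point: a Modelfile layers parameters and a system prompt on top of a base model. The model name `wilson-finance` and the settings below are hypothetical examples, not something Wilson ships.

```shell
# Write a minimal Modelfile: a base model plus a system prompt and a
# low temperature for more deterministic structured output.
cat > Modelfile <<'EOF'
FROM qwen2.5:3b
PARAMETER temperature 0.2
SYSTEM You are a concise assistant for financial categorization tasks.
EOF

# Build the custom model (skipped if Ollama isn't installed):
command -v ollama >/dev/null && ollama create wilson-finance -f Modelfile \
  || echo "ollama not installed; skipping create"
```

Once created, the custom model appears in `ollama list` like any other and can be pulled into Wilson's model selection.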

Getting started

1. Install Ollama

$ curl -fsSL https://ollama.com/install.sh | sh

2. Pull a model

$ ollama pull qwen2.5:3b

3. Run Wilson

$ wilson

> Ollama detected. Using qwen2.5:3b.
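If Wilson reports that Ollama was not found, you can confirm the server is reachable yourself. The helper below is an illustrative sketch; Ollama's HTTP API listens on port 11434 by default, and `/api/version` is one of its endpoints.

```shell
# Report whether a local Ollama server is answering on the default port.
ollama_status() {
  if curl -sf "http://localhost:11434/api/version" >/dev/null 2>&1; then
    echo "up"
  else
    echo "down"
  fi
}

ollama_status
```

If this prints `down`, start the server with `ollama serve` and try again.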

Recommended models

These models are tested with Wilson and work well for financial analysis tasks.

Model         Size     Best for
qwen2.5:3b    1.9 GB   Default — fast categorization and analysis
llama3.2:3b   2.0 GB   Strong general reasoning, good tool calling
mistral:7b    4.1 GB   Higher accuracy for complex financial analysis
gemma2:2b     1.6 GB   Lightweight — works on 8 GB RAM machines
phi3:mini     2.3 GB   Microsoft's small model — good structured output

Run AI on your terms.

Install Ollama, pull a model, and Wilson handles the rest. No cloud account required.