
Transformers.js

In-process AI via WASM. No server, no setup, fully offline.

Free · LLM Provider · Beta

Why Transformers.js

Transformers.js is the simplest way to run AI in Wilson. It loads models directly into the Bun process via WebAssembly — no Ollama install, no Docker, no server management. Just install Wilson and go.

Models download automatically on first use and cache locally. After the initial download, everything runs fully offline. No network calls, no API keys, no accounts.

The trade-off is smaller model selection and slower inference on CPU compared to Ollama. But for basic categorization and embeddings, it's more than enough — and the zero-setup experience is hard to beat.

What you get

Zero config — models download automatically on first use

In-process via Bun — no external server to install or manage

WebGPU support for hardware-accelerated inference

Fully offline after initial model download

Supported models

Models run via ONNX Runtime. WebGPU models require a compatible GPU and Bun build with WebGPU support.

| Model | Runtime | Best for |
| --- | --- | --- |
| Xenova/qwen2.5-0.5b | CPU / WASM | Default — lightweight categorization |
| Xenova/all-MiniLM-L6-v2 | CPU / WASM | Embeddings and similarity search |
| onnx-community/Qwen2.5-1.5B | WebGPU | GPU-accelerated categorization |
| onnx-community/Phi-3.5-mini | WebGPU | Higher-accuracy analysis with GPU |

Ollama vs Transformers.js

|  | Ollama | Transformers.js |
| --- | --- | --- |
| Setup | Install Ollama + pull model | Zero — built in |
| Model size | 2–7B+ parameters | 0.5–1.5B parameters |
| Speed | Fast (GPU-accelerated) | Slower (WASM/WebGPU) |
| Accuracy | Higher | Good for basics |
| External server | Yes (localhost) | No — in-process |
| Best for | Daily use, complex analysis | Quick start, simple tasks |

Zero-setup AI.

Install Wilson and start analyzing. Transformers.js handles the rest.