
Transformers.js

In-process AI via WASM. No server, no setup, fully offline.

Free · LLM Provider · Beta

Why Transformers.js

Transformers.js is the simplest way to run AI in Wilson. It loads models directly into the Bun process via WebAssembly — no Ollama install, no Docker, no server management. Just install Wilson and go.

Models download automatically on first use and cache locally. After the initial download, everything runs fully offline. No network calls, no API keys, no accounts.

The trade-off is smaller model selection and slower inference on CPU compared to Ollama. But for basic categorization and embeddings, it's more than enough — and the zero-setup experience is hard to beat.

What you get

Zero config — models download automatically on first use

In-process via Bun — no external server to install or manage

WebGPU support for hardware-accelerated inference

Fully offline after initial model download

Supported models

Models run via ONNX Runtime. WebGPU models require a compatible GPU and Bun build with WebGPU support.

| Model | Runtime | Best for |
| --- | --- | --- |
| Xenova/qwen2.5-0.5b | CPU / WASM | Default — lightweight categorization |
| Xenova/all-MiniLM-L6-v2 | CPU / WASM | Embeddings and similarity search |
| onnx-community/Qwen2.5-1.5B | WebGPU | GPU-accelerated categorization |
| onnx-community/Phi-3.5-mini | WebGPU | Higher-accuracy analysis with GPU |

Ollama vs Transformers.js

|  | Ollama | Transformers.js |
| --- | --- | --- |
| Setup | Install Ollama + pull model | Zero — built in |
| Model size | 2–7B+ parameters | 0.5–1.5B parameters |
| Speed | Fast (GPU-accelerated) | Slower (WASM/WebGPU) |
| Accuracy | Higher | Good for basics |
| External server | Yes (localhost) | No — in-process |
| Best for | Daily use, complex analysis | Quick start, simple tasks |

Zero-setup AI.

Install Wilson and start analyzing. Transformers.js handles the rest.