8gent Code
Architecture

Overview

8gent does not use a single model for every task. The model router learns from benchmark results and past sessions to route each task to the model most likely to succeed. Different models excel at different domains, and the router exploits this automatically.

Experience-Based Routing

The model router maintains an experience database that tracks how each model performs across task categories. After every benchmark run and scored session, the router updates its records.

When a new task arrives, the router:

  1. Classifies the task into a domain (auth, data pipeline, frontend, etc.)
  2. Looks up historical performance for each available model in that domain
  3. Routes to the model with the highest recorded score
  4. Falls back to the default model if no experience data exists
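The four steps above can be sketched as follows. This is a minimal illustration: `ExperienceDB`, `routeTask`, and the domain names are assumptions for the sketch, not the actual 8gent API.

```typescript
// Hypothetical sketch of the experience-based routing decision.
type Domain = "auth" | "data-pipeline" | "frontend" | "general";

const DEFAULT_MODEL = "eight:latest";

// domain -> model -> average benchmark/session score
type ExperienceDB = Record<string, Record<string, number>>;

function routeTask(db: ExperienceDB, domain: Domain): string {
  const scores = db[domain];
  // Fall back to the default model if no experience data exists.
  if (!scores || Object.keys(scores).length === 0) return DEFAULT_MODEL;
  // Route to the model with the highest recorded score.
  return Object.entries(scores).reduce((best, cur) =>
    cur[1] > best[1] ? cur : best
  )[0];
}

const db: ExperienceDB = {
  auth: { "qwen3.5:latest": 94, "qwen3:14b": 94 },
  frontend: { "qwen3.5:latest": 90 },
};

console.log(routeTask(db, "frontend")); // qwen3.5:latest
console.log(routeTask(db, "general")); // eight:latest (no data)
```

Task classification (step 1) happens before this lookup and is omitted here.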

Multi-Model Fallback

8gent supports per-model fallback chains, scoped by channel. If the primary model is unavailable (provider down, model not pulled, timeout), the router walks the chain for that model and channel until it finds a healthy entry. Defaults live in packages/providers/failover.ts and can be overridden in ~/.8gent/failover.json.
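The walk itself amounts to taking the first healthy entry in order. In this sketch, `ChainEntry`, `resolveModel`, and the `isHealthy` callback are hypothetical names, not the actual failover.ts API:

```typescript
// Hypothetical sketch of walking a fallback chain until a healthy
// entry is found. The health check (provider reachable, model
// pulled, no timeout) is passed in as a predicate.
interface ChainEntry {
  model: string;
  provider: string;
}

function resolveModel(
  chain: ChainEntry[],
  isHealthy: (e: ChainEntry) => boolean
): ChainEntry | undefined {
  // Walk the chain in order; the first healthy entry wins.
  return chain.find(isHealthy);
}

const textChain: ChainEntry[] = [
  { model: "eight:latest", provider: "ollama" },
  { model: "qwen3.5:latest", provider: "ollama" },
  { model: "meta-llama/llama-3-8b-instruct:free", provider: "openrouter" },
];

// Simulate the local ollama models being unavailable.
const entry = resolveModel(textChain, (e) => e.provider === "openrouter");
console.log(entry?.model); // meta-llama/llama-3-8b-instruct:free
```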

Text channel (default)

Used by the TUI, daemon, and most session traffic. Each entry below is a primary model the router can be asked for; the indented chain is what it falls through to.

eight:latest (ollama)
    |-- timeout or error
    v
qwen3.5:latest (ollama)
    |-- timeout or error
    v
meta-llama/llama-3-8b-instruct:free (openrouter)

qwen3.5:latest (ollama)
    |-- timeout or error
    v
meta-llama/llama-3-8b-instruct:free (openrouter)

apple-foundationmodel (apple-foundation, when available)
    |-- timeout or unavailable
    v
eight-1.0-q3:14b (8gent)
    |-- timeout or error
    v
qwen3:14b (ollama)
    |-- timeout or error
    v
meta-llama/llama-3-8b-instruct:free (openrouter)

When the host qualifies for Apple Foundation (macOS 26+ Tahoe on Apple Silicon with the bridge binary installed), the router automatically prefixes every text chain with the on-device model.
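A minimal sketch of that prefixing, assuming the eligibility check reduces to a boolean (the real detection covers OS version, Apple Silicon, and the bridge binary):

```typescript
// Hypothetical sketch: prepend the on-device Apple Foundation model
// to a text chain when the host qualifies.
function withAppleFoundation(chain: string[], available: boolean): string[] {
  return available ? ["apple-foundationmodel", ...chain] : chain;
}

const chain = ["eight:latest", "qwen3.5:latest"];
console.log(withAppleFoundation(chain, true)[0]); // apple-foundationmodel
console.log(withAppleFoundation(chain, false)[0]); // eight:latest
```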

Computer channel

Used by the 8gent Computer surface, where vision and tool-calling matter. This chain prefers a small, fast chat tier, then falls through to a vision-capable local brain, then a heavier cloud model, then the free safety net.

apfel (apple-foundation chat tier)
    |-- vision prompt or error
    v
qwen3.6:27b (ollama)
    |-- timeout or error
    v
deepseek-v4-flash (deepseek)
    |-- timeout or error
    v
meta-llama/llama-3-8b-instruct:free (openrouter)

This chain is configured separately from the text channel; overrides go under the computer key in ~/.8gent/failover.json.
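As an illustration, an override for the computer channel might look like the following. The exact schema (entry shape, field names) is defined by packages/providers/failover.ts and is assumed here:

```json
{
  "computer": [
    { "model": "apfel", "provider": "apple-foundation" },
    { "model": "qwen3.6:27b", "provider": "ollama" },
    { "model": "meta-llama/llama-3-8b-instruct:free", "provider": "openrouter" }
  ]
}
```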

Benchmark-Driven Routing

The 39+ task benchmark suite provides concrete data for routing decisions. Here is a sample of how different models perform across domains:

Domain                        qwen3.5    qwen3:14b
Auth System (BT001)           94         94
Event Architecture (BT002)    92         92
State Machine (BT005)         92         92
SEO Audit (BT007)             96         96
Video Production (BT011)      100        100

The router uses this data to make informed decisions. A reasoning-heavy task might route to qwen3:14b while a general coding task routes to qwen3.5.

Dynamic Free Model Selection

For OpenRouter users, the getBestFreeModel() function in packages/providers/index.ts queries the OpenRouter API to find the best available free model:

  1. Fetches the full model list from /api/v1/models
  2. Filters for models with the :free suffix
  3. Sorts by context length (longer context = more capable for coding tasks)
  4. Caches results for one hour

This means 8gent always routes to the best free model currently available, even as OpenRouter's free model lineup changes.
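The selection logic can be sketched as a pure selection function plus a thin cached fetch wrapper. The response shape (a `data` array with `id` and `context_length` fields) follows OpenRouter's public API; the helper name `selectBestFree` and the cache shape are assumptions of this sketch:

```typescript
// Illustrative sketch of the free-model selection described above.
interface OpenRouterModel {
  id: string;
  context_length: number;
}

// Filter for :free models and prefer the longest context window.
function selectBestFree(models: OpenRouterModel[]): string | undefined {
  return models
    .filter((m) => m.id.endsWith(":free"))
    .sort((a, b) => b.context_length - a.context_length)[0]?.id;
}

let cache: { model: string; fetchedAt: number } | undefined;
const ONE_HOUR = 60 * 60 * 1000;

async function getBestFreeModel(): Promise<string | undefined> {
  // Serve cached results for one hour.
  if (cache && Date.now() - cache.fetchedAt < ONE_HOUR) return cache.model;

  // Fetch the full model list from the OpenRouter API.
  const res = await fetch("https://openrouter.ai/api/v1/models");
  const { data } = (await res.json()) as { data: OpenRouterModel[] };

  const best = selectBestFree(data);
  if (best) cache = { model: best, fetchedAt: Date.now() };
  return best;
}
```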

Spawn Agent Routing

The multi-agent orchestration system can spawn subagents with specific model preferences:

spawn_agent({
  task: "Build the data pipeline",
  model: "auto:free"  // automatically picks best free model
});

The auto:free identifier triggers dynamic model selection. Subagents inherit the parent's fallback chain unless explicitly overridden.

Integration with Kernel Fine-Tuning

When the kernel fine-tuning pipeline promotes a new checkpoint (e.g., eight-1.1-q3:14b), the model router automatically registers it as the primary model. The promotion only happens when the Gemini Flash judge confirms the fine-tuned model outperforms the current release on the autoresearch benchmark suite.

If a promoted model later shows declining performance (tracked via the kernel's health monitoring), the router can fall back to the previous version automatically.
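The promote-or-keep decision reduces to a score comparison. In this sketch, `Checkpoint` and `promote` are hypothetical names and the scores are made-up example values:

```typescript
// Hypothetical sketch of checkpoint promotion: a candidate replaces
// the current release only if the judge scores it strictly higher.
interface Checkpoint {
  model: string;
  judgeScore: number; // benchmark score from the judge model
}

function promote(current: Checkpoint, candidate: Checkpoint): Checkpoint {
  return candidate.judgeScore > current.judgeScore ? candidate : current;
}

const current: Checkpoint = { model: "eight-1.0-q3:14b", judgeScore: 91 };
const candidate: Checkpoint = { model: "eight-1.1-q3:14b", judgeScore: 93 };
console.log(promote(current, candidate).model); // eight-1.1-q3:14b
```

Automatic rollback on declining health is the same comparison run in the other direction against the previous release.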

Configuration

The model router does not require explicit configuration. It builds its experience database from benchmark runs and kernel scoring. To seed the experience database with benchmark data:

# Run benchmarks to populate experience data
bun run benchmark:v2

# Results feed into the model router automatically