SkillRank
Architecture · 8 min · Updated 2026-06-04

Model Routing and Fallback Design

Model routing is often what separates a demo from a system that can scale. A router decides which model should answer, when to retrieve context, when to ask a clarifying question, and when to escalate to a stronger model or a human.

Route by task shape

Group tasks by shape: classification, extraction, transformation, retrieval answer, creative generation, long reasoning, code edit, and tool-using agent. Each group has different model requirements.

A router can be a simple rules file at first. For example, route schema extraction to a fast model, legal or financial synthesis to a stronger model, and low-confidence retrieval answers to a human review queue.
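As a rough illustration of such a rules file, here is a minimal Python sketch. The route names (`fast-model`, `strong-model`, `human-review-queue`), the `domain` labels, the 0.5 retrieval threshold, and the default route for unlisted task shapes are all assumptions made for the example, not values from any particular system.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class TaskShape(Enum):
    CLASSIFICATION = "classification"
    EXTRACTION = "extraction"
    TRANSFORMATION = "transformation"
    RETRIEVAL_ANSWER = "retrieval_answer"
    CREATIVE_GENERATION = "creative_generation"
    LONG_REASONING = "long_reasoning"
    CODE_EDIT = "code_edit"
    TOOL_USING_AGENT = "tool_using_agent"


@dataclass(frozen=True)
class Task:
    shape: TaskShape
    domain: str = "general"                  # hypothetical label, e.g. "legal"
    retrieval_score: Optional[float] = None  # None when no retrieval was run


# Hypothetical route names; substitute whatever your system uses.
FAST_MODEL = "fast-model"
STRONG_MODEL = "strong-model"
HUMAN_REVIEW = "human-review-queue"

HIGH_STAKES_DOMAINS = {"legal", "financial"}
MIN_RETRIEVAL_SCORE = 0.5  # assumed threshold, tune on your own data


def route(task: Task) -> str:
    """Return a route name. Rules are checked in order; the first match wins."""
    # Low-confidence retrieval answers go to a human review queue.
    if (
        task.shape is TaskShape.RETRIEVAL_ANSWER
        and task.retrieval_score is not None
        and task.retrieval_score < MIN_RETRIEVAL_SCORE
    ):
        return HUMAN_REVIEW
    # Legal or financial work goes to the stronger model.
    if task.domain in HIGH_STAKES_DOMAINS:
        return STRONG_MODEL
    # Cheap, well-structured shapes go to the fast model.
    if task.shape in {TaskShape.CLASSIFICATION, TaskShape.EXTRACTION}:
        return FAST_MODEL
    # Assumed default: anything not covered above uses the stronger model.
    return STRONG_MODEL
```

Keeping the rules in one ordered function makes the precedence explicit: review-queue rules run first, so a high-stakes retrieval answer with a weak score still reaches a human.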

Use confidence signals carefully

Confidence should come from observable checks: valid JSON, citation coverage, retrieval score, policy match, evaluator score, or whether the model asked for missing information.

Do not rely only on the model saying it is confident. Self-reported confidence can be persuasive but wrong.
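One way to turn observable checks into a confidence signal is sketched below. The `Answer` fields, the thresholds, and the choice to require every check to pass are assumptions for illustration; a real system might weight checks differently or add policy and evaluator scores.

```python
import json
from dataclasses import dataclass
from typing import Dict


@dataclass
class Answer:
    text: str                 # raw model output, expected to be JSON in this sketch
    cited_claims: int         # claims backed by at least one citation
    total_claims: int         # all claims found in the answer
    retrieval_score: float    # assumed to be normalised to [0, 1]
    asked_for_missing_info: bool = False


def is_valid_json(text: str) -> bool:
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def observable_checks(
    answer: Answer,
    min_citation_coverage: float = 0.8,  # assumed threshold
    min_retrieval_score: float = 0.5,    # assumed threshold
) -> Dict[str, bool]:
    """Run each check independently so failures can be logged by name."""
    # Assumption: an answer with no countable claims is treated as uncovered.
    coverage = (
        answer.cited_claims / answer.total_claims if answer.total_claims else 0.0
    )
    return {
        "valid_json": is_valid_json(answer.text),
        "citation_coverage": coverage >= min_citation_coverage,
        "retrieval_score": answer.retrieval_score >= min_retrieval_score,
    }


def is_confident(answer: Answer) -> bool:
    """Confident only if every check passes and the model did not ask for more input."""
    if answer.asked_for_missing_info:
        return False
    return all(observable_checks(answer).values())
```

Note that nothing here reads the model's own statement of confidence; every input is something the system can verify after the fact.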

Design fallbacks for user trust

Fallbacks should preserve the user experience. If the first model fails, the system might ask a clarifying question, retrieve better context, escalate to a stronger model, or show a safe partial answer.

Avoid silent degradation. If a premium model is unavailable and a weaker one answers instead, tell the user or flag the answer rather than passing off a lower-quality result as normal.
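A fallback chain that never degrades silently might look like the sketch below. The `ModelUnavailable` exception, the `Result` fields, and the notice wording are hypothetical; the point is only that every answer records which step produced it and carries a user-facing notice whenever the primary step was skipped.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) model client when its provider is down."""


@dataclass
class Result:
    text: str
    served_by: str
    degraded: bool          # True when anything other than the first step answered
    notice: Optional[str]   # user-facing message, None on the normal path


# Each step is a (name, callable) pair; the callable takes a prompt and returns text.
Step = Tuple[str, Callable[[str], str]]


def answer_with_fallback(prompt: str, steps: List[Step]) -> Result:
    """Try each step in order and report openly when a fallback was used."""
    for index, (name, call) in enumerate(steps):
        try:
            text = call(prompt)
        except ModelUnavailable:
            continue  # move on to the next step in the chain
        degraded = index > 0
        notice = (
            f"The usual model was unavailable; this answer came from '{name}'."
            if degraded
            else None
        )
        return Result(text=text, served_by=name, degraded=degraded, notice=notice)
    # Every step failed: return an explicit message instead of an empty answer.
    return Result(
        text="",
        served_by="none",
        degraded=True,
        notice="No model is available right now. Please try again later.",
    )
```

Because the steps are plain callables, a step can just as easily ask a clarifying question or return a safe partial answer as call another model, and the same function can be driven with stub callables to rehearse provider outages.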

Practical checklist

  1. Classify tasks before routing.
  2. Use observable confidence checks.
  3. Escalate high-risk answers.
  4. Log routing decisions.
  5. Test provider outages and degraded modes.
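For the logging item, one possible shape is a single structured record per decision, as in this sketch. The field names and the `router` logger name are assumptions; the function uses only Python's standard `json`, `logging`, and `time` modules.

```python
import json
import logging
import time
from typing import Dict

logger = logging.getLogger("router")  # hypothetical logger name


def log_routing_decision(
    task_shape: str,
    route: str,
    reason: str,
    checks: Dict[str, bool],
) -> str:
    """Emit one JSON line per routing decision and return it for inspection."""
    record = {
        "ts": time.time(),         # seconds since the epoch
        "task_shape": task_shape,
        "route": route,
        "reason": reason,          # which rule or check triggered the route
        "checks": checks,          # named confidence checks and their outcomes
    }
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

Recording the reason alongside the route makes it possible to audit later why a given request was escalated or sent to a fallback.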
