Module 05

Intelligent Routing

Choosing the right path automatically

> Overview

Not every user request needs the same treatment. Routing classifies what the user wants and sends it to the best-fit handler: a cheaper model for simple questions, a specialized one for complex tasks. This is the single biggest lever for cost optimization in AI products and directly impacts your unit economics.
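As a rough illustration of the classify-then-dispatch idea, here is a minimal Python sketch. The category names, the keyword hints, the 40-word threshold, and the tier labels `small-model` and `frontier-model` are all hypothetical. Production routers often use a small classifier model rather than keyword rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    category: str
    model: str  # hypothetical tier label, not a real model name


# Hypothetical routing table: one handler per category.
ROUTES = {
    "faq": Route("faq", "small-model"),
    "analysis": Route("analysis", "frontier-model"),
}

# Hypothetical keyword hints that suggest a complex, multi-step request.
ANALYSIS_HINTS = ("compare", "analyze", "forecast", "step by step")


def classify(request: str) -> str:
    """Return a routing category using simple keyword and length rules."""
    text = request.lower()
    if any(hint in text for hint in ANALYSIS_HINTS) or len(text.split()) > 40:
        return "analysis"
    return "faq"


def route(request: str) -> Route:
    """Classify the request and look up its best-fit handler."""
    return ROUTES[classify(request)]
```

For example, `route("What are your opening hours?")` resolves to the small tier, while a request containing "compare" or "forecast" resolves to the frontier tier.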

> Why This Matters for Your Product

A simple FAQ lookup does not need a $0.03/request frontier model; a $0.001 model works fine. A complex multi-step analysis, however, does need the expensive model. Smart routing can cut AI costs by 60–80% while maintaining quality where it matters. This module teaches PMs how to define routing categories, set cost tiers, and monitor misrouting rates.
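To see where a 60–80% figure could come from, the arithmetic can be sketched as below. It assumes just two tiers at the per-request prices quoted above ($0.001 and $0.03) and ignores the cost of the classifier itself and of any misrouted requests. The function and parameter names are illustrative.

```python
def blended_cost(cheap_share: float,
                 cheap_cost: float = 0.001,
                 frontier_cost: float = 0.03) -> float:
    """Average cost per request when `cheap_share` of traffic uses the cheap tier."""
    if not 0.0 <= cheap_share <= 1.0:
        raise ValueError("cheap_share must be between 0 and 1")
    return cheap_share * cheap_cost + (1.0 - cheap_share) * frontier_cost


def savings_fraction(cheap_share: float,
                     cheap_cost: float = 0.001,
                     frontier_cost: float = 0.03) -> float:
    """Fraction saved versus sending every request to the frontier tier."""
    blended = blended_cost(cheap_share, cheap_cost, frontier_cost)
    return 1.0 - blended / frontier_cost
```

Under these assumptions, routing 70% of traffic to the cheaper model saves roughly 68%, and a 60–80% saving corresponds to sending about 62–83% of requests to the cheap tier.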

> Interactive & tools

Pricing (per 1K tokens)

Approximate cost per 1K tokens (early 2026)

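| Model            | Input ($) | Output ($) |
|------------------|-----------|------------|
| Claude 3.5 Haiku | 0.0008    | 0.004      |
| Claude Sonnet    | 0.003     | 0.015      |
| Claude Opus      | 0.015     | 0.075      |
| GPT-4o mini      | 0.00015   | 0.0006     |
| GPT-4o           | 0.0025    | 0.01       |
| Gemini Flash     | 0.000075  | 0.0003     |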

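At 100K requests/day, smart routing can save $50K–$200K annually compared with sending everything to a frontier model. The exact figure depends on request size and on how much traffic the cheaper tiers can handle.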

Prices change frequently; check provider pricing pages.
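To turn per-token prices into a per-request cost, multiply each price by the token counts for that request. The sketch below hard-codes the approximate prices from the table above. The function name and the example token counts are assumptions for illustration, and the prices should be refreshed from provider pricing pages before use.

```python
# Approximate (input, output) prices in USD per 1K tokens, copied from the table above.
PRICES_PER_1K = {
    "Claude 3.5 Haiku": (0.0008, 0.004),
    "Claude Sonnet": (0.003, 0.015),
    "Claude Opus": (0.015, 0.075),
    "GPT-4o mini": (0.00015, 0.0006),
    "GPT-4o": (0.0025, 0.01),
    "Gemini Flash": (0.000075, 0.0003),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request for the given model and token counts."""
    input_price, output_price = PRICES_PER_1K[model]
    return (input_tokens / 1000) * input_price + (output_tokens / 1000) * output_price


# Example: compare tiers for a hypothetical request of 1,000 input and 500 output tokens.
for name in ("Gemini Flash", "GPT-4o", "Claude Opus"):
    print(f"{name}: ${request_cost(name, 1000, 500):.6f}")
```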

Fallback chain

Request → Classifier → Tier 1 / 2 / 3 → quality check fails? → Escalate to next tier

If the first model’s output doesn’t meet the quality bar, the system automatically retries with a more capable model. This is not the same as reflection, where the same model retries its own output.
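A minimal sketch of this escalation loop follows, assuming each tier is simply a function from prompt to answer and the quality bar is a caller-supplied check. Both are placeholders here: real tiers would call model APIs, and real checks might be validators, heuristics, or a grader model.

```python
from typing import Callable, Sequence, Tuple

Model = Callable[[str], str]           # hypothetical: takes a prompt, returns an answer
QualityCheck = Callable[[str], bool]   # hypothetical: True if the answer meets the bar


def run_with_fallback(prompt: str,
                      tiers: Sequence[Model],
                      passes_check: QualityCheck) -> Tuple[int, str]:
    """Try tiers from cheapest to most capable; stop at the first acceptable answer.

    Returns (index of the tier that produced the answer, the answer).
    If no tier passes, the last tier's answer is returned as a best effort.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    answer = ""
    for index, model in enumerate(tiers):
        answer = model(prompt)
        if passes_check(answer):
            return index, answer
    return len(tiers) - 1, answer
```

Logging the returned tier index gives a simple signal for how often requests escalate, which is useful when monitoring misrouting rates.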

Related Engineering Patterns

These are the technical patterns your engineering team will implement. Understanding them helps you have better conversations.

Routing · Resource-Aware Optimization · Reasoning Techniques
