Task-aware routing for LLM agents
Your agent can't use Claude for every step. ModelPilot classifies each request (planning, coding, summarizing) and routes to the right model.
Drop-in replacement: change baseURL, keep your OpenAI code.
How it works
Each request is classified by task type, then routed to a model optimized for that specific task. No manual model selection.
Task classification
Prompts are classified into task types: code generation, planning, summarization, extraction, creative writing, and more. Each type routes to specialized models.
Model matching
Code tasks → DeepSeek, Codestral. Writing → Claude. Reasoning → o1, GPT-4. Simple tasks → GPT-4o-mini, Haiku. Based on benchmark data, not vibes.
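As an illustration (the task types, model names, and priorities here are examples, not ModelPilot's actual routing tables), the classify-then-route step can be sketched as:

```typescript
// Illustrative task-type → model mapping. Strongest candidate first.
type TaskType = "code" | "reasoning" | "writing" | "simple";

const routingTable: Record<TaskType, string[]> = {
  code: ["deepseek-chat", "codestral-latest"],
  reasoning: ["o1", "gpt-4"],
  writing: ["claude-sonnet"],
  simple: ["gpt-4o-mini", "claude-haiku"],
};

// Pick the preferred model for a classified task type.
function pickModel(task: TaskType): string {
  return routingTable[task][0];
}
```

In the real service this lookup happens server-side; your client never names a model.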
Automatic fallbacks
If a model fails or returns low-quality output, the request is automatically retried with a stronger model. Your agent loop doesn't break.
OpenAI SDK compatible
Same API as OpenAI. Works with LangChain, CrewAI, AutoGen, Vercel AI SDK, or any OpenAI client. Change two lines of config.
Integration in 2 minutes
Create a router
Configure optimization weights (cost, latency, quality) or use our defaults. Get a router ID and API key.
Change your baseURL
Point your OpenAI client to https://modelpilot.co/api/router/{routerId}. That's it.
Requests are classified and routed
Each request is analyzed, matched to the best model, and executed. Check the dashboard for routing decisions.
Works with any OpenAI client
If it uses the OpenAI SDK, it works with ModelPilot. No SDK changes, no new dependencies, no vendor lock-in.
Before:

```typescript
import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://api.openai.com/v1"
})
```

After:

```typescript
import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.MODELPILOT_API_KEY,
  baseURL: "https://modelpilot.co/api/router/{routerId}"
})
```

Router presets
Configure optimization weights or use a preset. Each router balances cost, latency, and quality differently.
Quality-first
Prefer stronger models
Use case: Production apps where output quality matters more than cost
Balanced
Default weights
Use case: General purpose agents, chatbots, most applications
Cost-optimized
Minimize spend
Use case: High-volume batch processing, internal tools, dev/test
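A preset is essentially a set of weights over cost, latency, and quality. A minimal sketch, assuming normalized 0-1 scores per candidate model (the weight values are illustrative, not ModelPilot's real presets):

```typescript
// Hypothetical preset weights; the actual values are configured per router.
interface RouterWeights { cost: number; latency: number; quality: number }

const presets: Record<string, RouterWeights> = {
  "quality-first":  { cost: 0.1,  latency: 0.2,  quality: 0.7 },
  "balanced":       { cost: 0.34, latency: 0.33, quality: 0.33 },
  "cost-optimized": { cost: 0.7,  latency: 0.2,  quality: 0.1 },
};

// Score a candidate model: higher is better. cost/latency inputs are
// normalized 0-1, so (1 - x) rewards cheaper and faster models.
function score(
  w: RouterWeights,
  m: { cost: number; latency: number; quality: number }
): number {
  return w.cost * (1 - m.cost) + w.latency * (1 - m.latency) + w.quality * m.quality;
}
```

Under "quality-first" an expensive frontier model outscores a cheap one; under "cost-optimized" the ranking flips.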
Fallbacks & retries
When a model returns an error or low-quality output, ModelPilot automatically retries with a different model. Your agent loop keeps running.
Provider failover
If OpenAI is down, route to Anthropic. If rate limited, try another provider. Configurable fallback chains.
Error detection
Detects 4xx/5xx errors, timeouts, malformed responses, and rate limits. Triggers automatic retry logic.
Model escalation
If a cheap model fails, retry with a stronger one. GPT-5-mini fails? Try Claude Sonnet. Still failing? GPT-5.
Handles edge cases
Rate limits
Timeouts
Provider outages
Malformed JSON
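The escalation behaviour described above amounts to walking a fallback chain until one call succeeds. A minimal sketch (the chain and error handling are illustrative, not ModelPilot's internals):

```typescript
// Try each model in order; stop at the first success.
// `callModel` stands in for the real provider call.
type CallFn = (model: string) => Promise<string>;

async function withFallbacks(models: string[], callModel: CallFn): Promise<string> {
  let lastError: unknown;
  for (const model of models) {
    try {
      return await callModel(model);   // success: stop escalating
    } catch (err) {
      lastError = err;                 // rate limit, timeout, 5xx, bad JSON…
    }
  }
  throw lastError;                     // every model in the chain failed
}
```

Because the retry happens inside the router, the caller sees one request and one response.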
Custom router training
Train a custom router on your historical request logs. The classifier learns which models work best for your specific prompts and use cases.
Static prompt optimization
Runtime optimizers add 2-3 seconds of latency per request. ModelPilot optimizes your prompt templates statically—zero added latency at runtime.
Zero-latency execution
Optimizations are applied to your templates at deploy time. No intermediate LLM calls in the hot path.
Model-specific formatting
Automatically formats prompts for the target model (e.g., XML tags for Claude, structured markers for GPT-4).
Static few-shot injection
We analyze your historical logs to find the best few-shot examples and bake them into your prompt templates.
Runtime optimizers: require an LLM roundtrip to rewrite each prompt.
ModelPilot: optimizations pre-computed at deploy time.

Example of Claude-specific formatting baked into a template:

<claude_formatting>
Use XML tags for clear separation...
</claude_formatting>
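The model-specific formatting step can be sketched as a template transform; the tag names below are illustrative, not ModelPilot's exact output:

```typescript
// Format the same instructions/context pair for different target models.
// Claude models tend to respond well to XML-tagged sections.
function formatPrompt(model: string, instructions: string, context: string): string {
  if (model.startsWith("claude")) {
    return `<instructions>${instructions}</instructions>\n<context>${context}</context>`;
  }
  // Other models: plain labeled sections.
  return `Instructions:\n${instructions}\n\nContext:\n${context}`;
}
```

Because this runs at deploy time over your templates, no extra LLM call is needed per request.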
Carbon tracking per request
Every API response includes estimated CO₂e based on model size, architecture (dense vs MoE), and provider region. Export reports for ESG compliance.
CO₂e estimates in response headers and dashboard analytics
Optionally weight routing decisions by environmental impact
Export monthly carbon reports for compliance documentation
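Reading the per-request estimate from a response might look like the sketch below; the header name is an assumption, so check the dashboard docs for the actual one:

```typescript
// Parse a CO₂e estimate (in grams) from response headers.
// "x-modelpilot-co2e-grams" is a hypothetical header name.
function co2eGrams(headers: Record<string, string>): number | null {
  const raw = headers["x-modelpilot-co2e-grams"];
  if (raw === undefined) return null;        // header absent
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}
```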
Start routing in 2 minutes
Create a router, change your baseURL, done. Free tier includes $5 in credits. No credit card required.
npm install modelpilot • OpenAI SDK compatible