Deploy agents like functions.
Call agents like models. Steps, skills, streaming—built in. Ship production agents in minutes, not months.
npm install agentlify • OpenAI SDK compatible
Why Agentlify?
Stop managing Python microservices and complex orchestrators. Deploy agents like functions. Ship faster.
Deployable
Deploy agents with a single command. Get an endpoint and instant visibility.
Composable Steps
Build with steps: Research → Plan → Execute → Review. Mix and match like building blocks.
Real-time Streaming
Stream tool calls, step outputs, and reasoning in real time, all in a clean console view.
Forkable Templates
Fork starter agents and skill packs. Deploy or customize with one click.
Integration in 2 minutes
Create a router
Configure optimization weights (cost, latency, quality) or use our defaults. Get a router ID and API key.
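If you prefer to do this from code, a minimal sketch of router creation might look like the following. The endpoint path and request body are assumptions for illustration, not documented API; the real flow may be dashboard-only.

// Hypothetical sketch: create a router with custom optimization weights.
// Endpoint and payload shape are assumed, not documented API.
const res = await fetch("https://agentlify.co/api/router", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.AGENTLIFY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    weights: { cost: 0.3, latency: 0.3, quality: 0.4 }, // omit to use defaults
  }),
})
const { routerId } = await res.json() // used in the baseURL below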
Change your baseURL
Point your OpenAI client to agentlify.co/api/router/{id}. That's it.
Requests are classified and routed
Each request is analyzed, matched to the best model, and executed. Check the dashboard for routing decisions.
Works with any OpenAI client
If it uses the OpenAI SDK, it works with Agentlify. No SDK changes, no new dependencies, no vendor lock-in.
Before:

import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://api.openai.com/v1"
})

After:

import OpenAI from "openai"
const client = new OpenAI({
  apiKey: process.env.AGENTLIFY_API_KEY,
  baseURL: "https://agentlify.co/api/router/{routerId}"
})
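Streaming works unchanged through the router, too. Here is a minimal sketch using the standard OpenAI SDK streaming API; the model name is illustrative, since the router makes the final model choice:

import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.AGENTLIFY_API_KEY,
  baseURL: "https://agentlify.co/api/router/{routerId}",
})

// Standard OpenAI streaming; chunks arrive as the routed model generates them.
const stream = await client.chat.completions.create({
  model: "gpt-5-mini", // illustrative; the router may select a different model
  messages: [{ role: "user", content: "Plan a research task" }],
  stream: true,
})

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "")
}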
Router presets

Configure optimization weights or use a preset. Each router balances cost, latency, and quality differently.
Quality-first
Prefer stronger models
Use case: Production apps where output quality matters more than cost
Balanced
Default weights
Use case: General purpose agents, chatbots, most applications
Cost-optimized
Minimize spend
Use case: High-volume batch processing, internal tools, dev/test
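For illustration, the three presets might map to weight vectors like these; the field names and numbers are assumptions inferred from the descriptions above, not documented configuration:

// Illustrative preset weights; real keys and values may differ.
const presets = {
  qualityFirst:  { cost: 0.1,  latency: 0.2,  quality: 0.7 },  // prefer stronger models
  balanced:      { cost: 0.34, latency: 0.33, quality: 0.33 }, // default weights
  costOptimized: { cost: 0.7,  latency: 0.2,  quality: 0.1 },  // minimize spend
}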
Fallbacks & retries
When a model returns an error or low-quality output, Agentlify automatically retries with a different model. Your agent loop keeps running.
Provider failover
If OpenAI is down, route to Anthropic. If rate limited, try another provider. Configurable fallback chains.
Error detection
Detects 4xx/5xx errors, timeouts, malformed responses, and rate limits. Triggers automatic retry logic.
Model escalation
If a cheap model fails, retry with a stronger one. GPT-5-mini fails? Try Claude Sonnet. Still failing? GPT-5.
Handles edge cases
Rate limits
Timeouts
Provider outages
Malformed JSON
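A sketch of what a configurable fallback chain could look like; every field name here is an illustrative assumption rather than a documented schema:

// Hypothetical fallback-chain config combining failover and escalation.
const fallbackConfig = {
  chain: ["gpt-5-mini", "claude-sonnet", "gpt-5"],             // escalate cheap to strong
  retryOn: ["rate_limit", "timeout", "5xx", "malformed_json"], // mirrors the cases above
  maxRetries: 3,
}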
Custom response memories
Train a custom router on your historical request logs. The classifier learns which models work best for your specific prompts and use cases.
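Conceptually, training might be kicked off against your exported logs roughly like this; the endpoint, payload fields, and training flow are all assumptions for illustration, not documented API:

// Hypothetical sketch: train a custom router on historical request logs.
await fetch("https://agentlify.co/api/router/{routerId}/train", {
  method: "POST",
  headers: {
    Authorization: `Bearer ${process.env.AGENTLIFY_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ source: "request_logs", window: "90d" }), // assumed fields
})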
Static prompt optimization
Runtime optimizers add 2-3 seconds of latency per request. Agentlify optimizes your prompt templates statically—zero added latency at runtime.
Zero-latency execution
Optimizations are applied to your templates at deploy time. No intermediate LLM calls in the hot path.
Model-specific formatting
Automatically formats prompts for the target model (e.g., XML tags for Claude, structured markers for GPT-4).
Static few-shot injection
We analyze your historical logs to find the best few-shot examples and bake them into your prompt templates.
Runtime optimizers: require an LLM roundtrip to rewrite each prompt
Agentlify: pre-computed at deploy time

Example of model-specific formatting baked into a template:

<claude_formatting>
Use XML tags for clear separation...
</claude_formatting>
Carbon tracking per request
Every API response includes estimated CO₂e based on model size, architecture (dense vs MoE), and provider region. Export reports for ESG compliance.
CO₂e estimates in response headers and dashboard analytics
Optionally weight routing decisions by environmental impact
Export monthly carbon reports for compliance documentation
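If the estimate is exposed as a response header, reading it from the OpenAI Node SDK could look like this. The .withResponse() helper is standard in the SDK; the header name x-agentlify-co2e is an assumption about where the estimate might appear:

import OpenAI from "openai"

const client = new OpenAI({
  apiKey: process.env.AGENTLIFY_API_KEY,
  baseURL: "https://agentlify.co/api/router/{routerId}",
})

// .withResponse() exposes the raw HTTP response alongside the parsed body.
const { data: completion, response } = await client.chat.completions
  .create({ model: "gpt-5-mini", messages: [{ role: "user", content: "Hello" }] })
  .withResponse()

console.log("CO2e estimate:", response.headers.get("x-agentlify-co2e")) // assumed header name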
Start routing in 2 minutes
Create a router, change your baseURL, done. Free tier includes $5 in credits. No credit card required.
npm install agentlify • OpenAI SDK compatible