AI Observability Designer
Designs monitoring and observability for LLM-powered systems — request logging, quality metrics, cost dashboards, latency tracking, and alerting with anti-noise rules. Use when building AI monitoring, tracking prompt version impact, detecting quality degradation, or designing AI-specific dashboards. AI monitoring, LLM observability, AI ops.
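The alerting-with-anti-noise-rules idea can be sketched as a rolling window over logged requests that refuses to fire until enough samples have accumulated. This is a minimal illustration, not the skill's actual implementation; the class name, window size, and thresholds are all hypothetical defaults.

```python
from collections import deque

class LLMMonitor:
    """Minimal LLM request logger with a noise-suppressed alert rule."""

    def __init__(self, window=20, error_threshold=0.3, min_samples=10):
        self.events = deque(maxlen=window)   # rolling window of recent requests
        self.error_threshold = error_threshold
        self.min_samples = min_samples

    def log(self, latency_ms, tokens, ok):
        """Record one request: latency, token count, and success flag."""
        self.events.append({"latency_ms": latency_ms, "tokens": tokens, "ok": ok})

    def error_rate(self):
        if not self.events:
            return 0.0
        return sum(1 for e in self.events if not e["ok"]) / len(self.events)

    def should_alert(self):
        # Anti-noise rule: require a minimum sample count before alerting,
        # so a single failed request never pages anyone.
        return (len(self.events) >= self.min_samples
                and self.error_rate() >= self.error_threshold)
```

A real system would add percentile latency tracking and per-prompt-version dimensions, but the same "minimum samples before alerting" gate applies.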
Token Optimizer
Reduces LLM token usage and API costs through prompt compression, context management, caching strategies, and model routing. Use when API costs are too high, context limits are being hit, or token budgets need optimization. Token reduction, cost optimization, prompt compression.
LLM Gateway Architect
Designs LLM API gateway infrastructure — provider abstraction, failover chains, rate limit management, response caching, and request routing. Use when building multi-provider resilience, managing API key pools, or abstracting LLM provider differences. Gateway, failover, load balancing, API proxy.
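The failover-chain pattern reduces to trying providers in priority order and falling through on failure. A minimal sketch, assuming each provider is a callable that raises on error (the `ProviderError` type and function names are invented for illustration):

```python
class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""

def call_with_failover(providers, prompt):
    """Try each provider in order; return the first successful response.

    `providers` is a list of callables taking a prompt string and returning
    a response string, raising ProviderError on failure.
    """
    last_err = None
    for provider in providers:
        try:
            return provider(prompt)
        except ProviderError as err:
            last_err = err        # fall through to the next provider
    raise last_err or ProviderError("no providers configured")
```

A real gateway would also track per-provider health, apply backoff before retrying a failed provider, and normalize each provider's response schema behind the abstraction.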
AI Cost Calculator
Estimates and optimizes LLM API costs with real-world multipliers — system prompt overhead, retry rates, caching ROI, model tier allocation, and scaling projections. Use when budgeting AI features, justifying costs to leadership, or diagnosing unexpected API bills. Cost estimation, token budget, AI spend, pricing.
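The multipliers mentioned above compose into a simple estimate: retries inflate request volume, cache hits deflate it, and the system prompt is billed on every call. A hedged sketch with made-up parameter names and example rates, not real provider pricing:

```python
def monthly_cost(requests_per_day, avg_input_tokens, avg_output_tokens,
                 system_prompt_tokens, price_in_per_1k, price_out_per_1k,
                 retry_rate=0.05, cache_hit_rate=0.0, days=30):
    """Estimate monthly API spend with retry and cache multipliers.

    Retries multiply request volume; cache hits remove requests entirely;
    the system prompt is re-sent (and billed) on every request.
    """
    effective_requests = (requests_per_day * days
                          * (1 + retry_rate)
                          * (1 - cache_hit_rate))
    input_cost = (effective_requests
                  * (system_prompt_tokens + avg_input_tokens) / 1000
                  * price_in_per_1k)
    output_cost = (effective_requests
                   * avg_output_tokens / 1000
                   * price_out_per_1k)
    return round(input_cost + output_cost, 2)
```

Plugging in hypothetical numbers, 1,000 requests/day with a 500-token system prompt, 500 input and 500 output tokens at $0.001/$0.002 per 1K, yields $60/month before retries; a 10% retry rate lifts it to $66, which is why the retry multiplier belongs in any budget.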
Model Selector
Recommends the right LLM model for a task based on a 5-dimension capability scoring matrix, with multi-model architecture patterns (router, cascade, draft+refine). Use when choosing between model tiers, designing model routing, or optimizing cost/quality tradeoffs. Model selection, model comparison, which model.
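A capability scoring matrix like the one described reduces to a weighted dot product between task priorities and per-model dimension scores. A minimal sketch; the five dimension names and the score values here are illustrative placeholders, not the skill's actual matrix:

```python
def select_model(task_weights, model_scores):
    """Pick the model whose capability scores best match the task weights.

    task_weights: dict of dimension -> importance weight (e.g. 0.0-1.0)
    model_scores: dict of model name -> dict of dimension -> score (e.g. 1-5)
    """
    def weighted(scores):
        # Weighted sum over the dimensions the task cares about.
        return sum(task_weights.get(dim, 0) * s for dim, s in scores.items())

    return max(model_scores, key=lambda name: weighted(model_scores[name]))
```

The same scoring function supports the router pattern described above: compute the weighted score per incoming request and dispatch each request to whichever tier wins for that task profile.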
Streaming UX Designer
Designs streaming AI interfaces — progressive rendering, markdown buffering, code block detection, cancellation handling, tool call indicators, and error recovery mid-stream. Use when building token-by-token chat UIs, streaming LLM responses, or handling partial content rendering. Streaming, SSE, real-time AI, chat UI.
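The markdown-buffering and code-block-detection ideas can be sketched as a generator that flushes plain text immediately but holds back anything inside an open ``` fence until the closing fence arrives, so the UI never renders a half-open code block. A simplified sketch: it assumes each ``` marker arrives whole within a token (a real buffer must also handle fences split across tokens).

```python
def chunk_stream(tokens):
    """Yield renderable chunks, holding back text inside an open code fence."""
    buffer = ""
    in_fence = False
    for tok in tokens:
        buffer += tok
        # Process every complete fence marker currently in the buffer.
        while "```" in buffer:
            head, _, rest = buffer.partition("```")
            if in_fence:
                # Closing fence: flush the whole code block at once.
                yield "```" + head + "```"
            elif head:
                yield head        # plain text before an opening fence
            in_fence = not in_fence
            buffer = rest
        if not in_fence and buffer:
            yield buffer          # outside a fence: render immediately
            buffer = ""
    if buffer:
        yield buffer              # flush whatever remains at end of stream
```

Cancellation and mid-stream error recovery hang off the same loop: on abort, flush the buffer as-is and append a visible truncation marker rather than silently dropping held-back code.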