Overall Score: 6.3/10 (Low, CAUTION)

LLM Router

Smart routing layer that sends AI tasks to the optimal model based on complexity and cost

Category: DevTools
Target: SaaS developers and B2B teams using LLMs in production who want to reduce API...
The Gap

SaaS builders are locked into expensive API providers for all tasks, even simple ones that don't need frontier models, creating cost unpredictability and vendor lock-in

Solution

A middleware/SDK that analyzes incoming AI requests and routes complex reasoning to Claude/GPT-4 while sending simple tasks (classification, extraction, internal tools) to local or inexpensive open models such as Gemma or Llama
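The core routing idea can be sketched in a few lines. Everything below is illustrative: the model names, thresholds, and keyword markers are assumptions, not a real SDK or a real classifier.

```python
def classify(prompt: str) -> str:
    """Crude complexity heuristic: long or reasoning-heavy prompts count as complex."""
    reasoning_markers = ("why", "explain", "prove", "step by step", "analyze")
    lowered = prompt.lower()
    if len(prompt.split()) > 200 or any(m in lowered for m in reasoning_markers):
        return "complex"
    return "simple"

# Illustrative route table: frontier model for hard reasoning,
# a cheap local model for classification/extraction-style tasks.
ROUTES = {
    "complex": "claude-sonnet",
    "simple": "ollama/llama3",
}

def route(prompt: str) -> str:
    """Pick a target model for the request based on the heuristic above."""
    return ROUTES[classify(prompt)]
```

A production router would replace `classify` with a learned model, but the interface — request in, model name out — stays the same.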

Revenue Model

Freemium — free tier for low volume, usage-based pricing for production workloads, enterprise tier with custom routing rules and analytics

Feasibility Scores
Pain Intensity: 7/10

Real pain but not yet hair-on-fire for most. Companies spending $5K+/month on LLM APIs feel this acutely, but many startups are still in low-volume prototyping where costs are manageable. The pain intensifies as products scale — it's a 'growing into pain' problem. The Reddit signals confirm developer frustration with vendor lock-in and per-token costs, but it's more anxiety about future costs than current bleeding.

Market Size: 7/10

TAM is substantial — every company using LLM APIs in production is a potential customer. Estimated 100K+ companies actively using LLM APIs by 2026, with the heavy-spend segment ($10K+/mo) being ~10-15K companies. Addressable market for routing/optimization tooling is likely $500M-$1B. However, this is an infrastructure layer — margins are thin unless you move up the value chain into observability and optimization.

Willingness to Pay: 5/10

This is the core challenge. Developers strongly prefer open-source for infrastructure layers (LiteLLM proves this). Willingness to pay exists mainly at enterprise scale where the savings justify the cost, but enterprises will want on-prem/self-hosted options. The value proposition is 'save money on your AI spend,' which means your pricing must be a fraction of the savings — a thin margin business. Hard to charge meaningful SaaS prices when the open-source alternatives are good enough for most.

Technical Feasibility: 6/10

The proxy/gateway part is straightforward — 2-3 weeks for an MVP. The HARD part is the intelligent routing: accurately classifying prompt complexity in real-time without adding meaningful latency or cost. This requires training or fine-tuning a classifier, building evaluation datasets, and handling edge cases where 'simple-looking' prompts actually need strong models. RouteLLM shows it's possible but took a research team months. A solo dev can build a rules-based MVP in 4-6 weeks, but the ML-driven 'smart' routing that differentiates from LiteLLM is a 3-6 month effort.
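One way to sidestep the hard up-front classification problem described above is cascade routing: call the cheap model first and escalate to the strong model only when a quality check fails. This is a sketch of that pattern under assumed stub functions, not an implementation of RouteLLM or any named system.

```python
def cascade(prompt, cheap, strong, passes_check):
    """Try the cheap model first; escalate only when its answer fails a quality check.

    `cheap` and `strong` are callables (prompt -> answer); `passes_check`
    is a callable (answer -> bool). All three are supplied by the caller.
    Returns the answer plus which tier produced it.
    """
    answer = cheap(prompt)
    if passes_check(answer):
        return answer, "cheap"
    return strong(prompt), "strong"
```

The tradeoff versus up-front classification: cascading never misroutes a hard prompt, but it pays for a wasted cheap call (and extra latency) whenever escalation happens.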

Competition Gap: 4/10

This is the biggest red flag. The space is crowded with well-funded (Martian) and well-loved open-source (LiteLLM, RouteLLM) alternatives. OpenRouter has massive distribution. The specific gap — local model routing + smart complexity analysis — exists but is narrow. LiteLLM is likely to add smarter routing, and Martian is already doing ML-based routing. You'd be entering a market where the incumbents have 12-24 month head starts and strong communities.

Recurring Potential: 8/10

Strong recurring potential. Once integrated into a production pipeline, switching costs are high. Usage-based pricing naturally scales with customer growth. Companies don't reduce their LLM usage over time — they increase it. This is sticky infrastructure with natural expansion revenue.

Strengths
  • +Genuine and growing pain point as LLM costs scale — the market tailwind is real
  • +Strong recurring revenue mechanics — sticky infrastructure that grows with customer usage
  • +Local/self-hosted model routing is an underserved niche that incumbents haven't nailed
  • +Developer empathy angle (open-source friendly, anti-vendor-lock-in) resonates strongly
Risks
  • !Extremely crowded space with well-funded competitors (Martian) and strong open-source alternatives (LiteLLM, RouteLLM) that are 1-2 years ahead
  • !LLM providers themselves are commoditizing — prices are dropping 50-70% annually, which erodes the core 'save money' value prop over time
  • !The intelligent routing problem (classifying prompt complexity accurately in real-time) is genuinely hard and is the only real differentiator — without it, you're just another LiteLLM
  • !Thin-margin infrastructure business — hard to build venture-scale returns as a solo founder
  • !Risk of being a feature, not a product: Anthropic, OpenAI, or cloud providers could add native multi-model routing
Competition
OpenRouter

Unified API gateway that provides access to 100+ LLMs

Pricing: Pay-per-token with a thin markup over provider pricing (~10-20% margin)
Gap: No intelligent cost-optimization routing — users must manually select models. No complexity analysis of prompts. It's a gateway, not a smart router. No local/self-hosted model support.
LiteLLM

Open-source Python SDK and proxy server that provides a unified interface to 100+ LLM providers, translating calls to OpenAI-compatible format

Pricing: Free and open-source (MIT license)
Gap: No automatic intelligent routing based on task complexity. Users define fallback rules manually. No built-in cost-optimization engine or prompt analysis. It's plumbing, not intelligence.
Martian (formerly Withmartian)

AI model router that automatically selects the best LLM for each request based on quality and cost targets, using a learned routing model

Pricing: Usage-based pricing, thin margin on top of model costs. Free tier for experimentation.
Gap: No support for routing to local/self-hosted models. Closed-source routing logic (black box). Limited customization of routing rules. Doesn't solve vendor lock-in — just shifts it to Martian.
Unify.ai

LLM routing platform that benchmarks models across quality, cost, and latency and routes requests to the optimal endpoint dynamically

Pricing: Free tier with limited requests. Pay-as-you-go for production. Custom enterprise pricing.
Gap: No local model routing. Limited SDK ecosystem. Smaller community than LiteLLM/OpenRouter. Still a dependency — you're trusting their routing decisions. No on-prem deployment option.
RouteLLM (by LMSys)

Open-source framework from the Chatbot Arena team that trains router models to dynamically route between strong and weak LLMs based on query complexity

Pricing: Free and open-source (research project from UC Berkeley/LMSys)
Gap: Research project, not production-ready. No managed service. Limited to binary strong/weak routing (not multi-tier). No enterprise features (analytics, custom rules, team management). Requires ML expertise to deploy and fine-tune.
MVP Suggestion

Don't build another gateway. Instead, build a lightweight open-source SDK (Python first) that wraps LiteLLM and adds a rules-based routing engine with sensible defaults: route by token count, keyword patterns, and task type tags. Ship with pre-built routing profiles (e.g., 'cost-optimized', 'quality-first', 'balanced'). Include a simple dashboard showing cost savings. Differentiate by being the easiest on-ramp — 3 lines of code to integrate, with local Ollama model support out of the box. Build community first, monetize with hosted analytics and enterprise routing rules later.
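The pre-built routing profiles suggested above could look like the sketch below. The profile names come from the MVP description; the model identifiers, tier structure, and `pick_model` helper are invented for illustration.

```python
# Hypothetical routing profiles: each maps a complexity tier to a model.
# "cost-optimized" defaults to a local Ollama model, escalating only
# for complex prompts; "quality-first" always uses a frontier model.
PROFILES = {
    "cost-optimized": {"default": "ollama/llama3", "complex": "claude-sonnet"},
    "quality-first":  {"default": "claude-sonnet", "complex": "claude-sonnet"},
    "balanced":       {"default": "gpt-4o-mini",   "complex": "gpt-4o"},
}

def pick_model(prompt: str, profile: str = "cost-optimized") -> str:
    """Select a model from a profile using a trivial length-based tier check."""
    tier = "complex" if len(prompt.split()) > 200 else "default"
    return PROFILES[profile][tier]
```

The returned model name would then be passed to LiteLLM's unified completion call, so the SDK adds routing without replacing the gateway layer.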

Monetization Path

Open-source SDK (free, build community and trust) -> Hosted dashboard with cost analytics and routing optimization suggestions ($49-199/mo) -> Enterprise tier with custom routing policies, team management, audit logs, and SLA guarantees ($500-2000/mo) -> Usage-based pricing for managed routing at scale (% of savings model)
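The final "% of savings" pricing step works out as follows; the 20% take rate and dollar figures are illustrative assumptions.

```python
def monthly_fee(spend_before: float, spend_after: float, take_rate: float = 0.2) -> float:
    """Charge a fixed percentage of the customer's realized monthly savings."""
    savings = max(spend_before - spend_after, 0.0)  # never charge if spend rose
    return savings * take_rate

# A customer spending $10K/mo whose routed spend drops to $4K saves $6K;
# at a 20% take rate the fee is $1,200/mo.
```

This structure keeps the pitch honest: the fee is always a fraction of verified savings, which matches the "save money on your AI spend" value proposition noted earlier.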

Time to Revenue

3-4 months to first dollar. Month 1-2: Build open-source SDK with rules-based routing and LiteLLM integration. Month 2-3: Launch on HN/Reddit, build early community. Month 3-4: Ship hosted dashboard as paid tier. Realistic first-year ARR for a solo founder: $5K-30K, heavily dependent on community traction.

What people are saying
  • "not being locked into api pricing that can change overnight"
  • "send complex reasoning to Claude/GPT-4, route simple tasks to Gemma/Llama locally"
  • "why lock yourself into one provider's API?"
  • "api dependency risk with openai"
  • "paying per token"