SaaS builders are locked into expensive API providers for every task, even simple ones that don't need frontier models. The result is unpredictable costs and vendor lock-in.
A middleware/SDK that analyzes incoming AI requests and routes complex reasoning to Claude or GPT-4 while sending simple tasks (classification, extraction, internal tools) to local or cheap open models like Gemma or Llama
Freemium — free tier for low volume, usage-based pricing for production workloads, enterprise tier with custom routing rules and analytics
Real pain but not yet hair-on-fire for most. Companies spending $5K+/month on LLM APIs feel this acutely, but many startups are still in low-volume prototyping where costs are manageable. The pain intensifies as products scale — it's a 'growing into pain' problem. The Reddit signals confirm developer frustration with vendor lock-in and per-token costs, but it's more anxiety about future costs than current bleeding.
TAM is substantial — every company using LLM APIs in production is a potential customer. Estimated 100K+ companies actively using LLM APIs by 2026, with the heavy-spend segment ($10K+/mo) being ~10-15K companies. Addressable market for routing/optimization tooling is likely $500M-$1B. However, this is an infrastructure layer — margins are thin unless you move up the value chain into observability and optimization.
This is the core challenge. Developers strongly prefer open-source for infrastructure layers (LiteLLM proves this). Willingness to pay exists mainly at enterprise scale where the savings justify the cost, but enterprises will want on-prem/self-hosted options. The value proposition is 'save money on your AI spend,' which means your pricing must be a fraction of the savings — a thin margin business. Hard to charge meaningful SaaS prices when the open-source alternatives are good enough for most.
The proxy/gateway part is straightforward — 2-3 weeks for an MVP. The HARD part is the intelligent routing: accurately classifying prompt complexity in real-time without adding meaningful latency or cost. This requires training or fine-tuning a classifier, building evaluation datasets, and handling edge cases where 'simple-looking' prompts actually need strong models. RouteLLM shows it's possible but took a research team months. A solo dev can build a rules-based MVP in 4-6 weeks, but the ML-driven 'smart' routing that differentiates from LiteLLM is a 3-6 month effort.
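The rules-based MVP described above can be sketched in a few lines; the token threshold and keyword list below are illustrative assumptions, not validated heuristics, and a real classifier would need evaluation data behind it.

```python
# Sketch of a rules-based prompt-complexity classifier for a routing MVP.
# The threshold and keyword set are assumptions, not tuned values.

COMPLEX_KEYWORDS = {"prove", "analyze", "step by step", "refactor", "architecture"}

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def classify_prompt(prompt: str, max_simple_tokens: int = 300) -> str:
    """Return 'complex' or 'simple' using cheap, near-zero-latency rules."""
    lowered = prompt.lower()
    if estimate_tokens(prompt) > max_simple_tokens:
        return "complex"
    if any(kw in lowered for kw in COMPLEX_KEYWORDS):
        return "complex"
    return "simple"

if __name__ == "__main__":
    print(classify_prompt("Extract the invoice number from this email."))
    print(classify_prompt("Analyze the trade-offs of this schema step by step."))
```

The edge cases mentioned above live exactly here: a short prompt like "Is P equal to NP?" sails past both rules, which is why the ML-driven version is a multi-month effort.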
This is the biggest red flag. The space is crowded with well-funded (Martian) and well-loved open-source (LiteLLM, RouteLLM) alternatives. OpenRouter has massive distribution. The specific gap — local model routing + smart complexity analysis — exists but is narrow. LiteLLM is likely to add smarter routing, and Martian is already doing ML-based routing. You'd be entering a market where the incumbents have 12-24 month head starts and strong communities.
Strong recurring potential. Once integrated into a production pipeline, switching costs are high. Usage-based pricing naturally scales with customer growth. Companies don't reduce their LLM usage over time — they increase it. This is sticky infrastructure with natural expansion revenue.
- +Genuine and growing pain point as LLM costs scale — the market tailwind is real
- +Strong recurring revenue mechanics — sticky infrastructure that grows with customer usage
- +Local/self-hosted model routing is an underserved niche that incumbents haven't nailed
- +Developer empathy angle (open-source friendly, anti-vendor-lock-in) resonates strongly
- !Extremely crowded space with well-funded competitors (Martian) and strong open-source alternatives (LiteLLM, RouteLLM) that are 1-2 years ahead
- !LLM providers themselves are commoditizing — prices are dropping 50-70% annually, which erodes the core 'save money' value prop over time
- !The intelligent routing problem (classifying prompt complexity accurately in real-time) is genuinely hard and is the only real differentiator — without it, you're just another LiteLLM
- !Thin-margin infrastructure business — hard to build venture-scale returns as a solo founder
- !Risk of being a feature, not a product: Anthropic, OpenAI, or cloud providers could add native multi-model routing
OpenRouter: Unified API gateway that provides access to 100+ LLMs
LiteLLM: Open-source Python SDK and proxy server that provides a unified interface to 100+ LLM providers, translating calls to OpenAI-compatible format
Martian: AI model router that automatically selects the best LLM for each request based on quality and cost targets, using a learned routing model
LLM routing platform that benchmarks models across quality, cost, and latency and routes requests to the optimal endpoint dynamically
RouteLLM: Open-source framework from the Chatbot Arena team that trains router models to dynamically route between strong and weak LLMs based on query complexity
Don't build another gateway. Instead, build a lightweight open-source SDK (Python first) that wraps LiteLLM and adds a rules-based routing engine with sensible defaults: route by token count, keyword patterns, and task type tags. Ship with pre-built routing profiles (e.g., 'cost-optimized', 'quality-first', 'balanced'). Include a simple dashboard showing cost savings. Differentiate by being the easiest on-ramp — 3 lines of code to integrate, with local Ollama model support out of the box. Build community first, monetize with hosted analytics and enterprise routing rules later.
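The "3 lines of code" surface recommended above might look like the following; `Router`, the profile names, and the model identifiers are hypothetical illustrations of the SDK shape, not an existing API.

```python
# Hypothetical SDK surface for the recommended wrapper. Class name,
# profiles, and model IDs are illustrative assumptions, not a real library.

PROFILES = {
    "cost-optimized": {"simple": "ollama/llama3", "complex": "gpt-4o"},
    "quality-first": {"simple": "gpt-4o-mini", "complex": "claude-sonnet"},
}

class Router:
    def __init__(self, profile: str = "cost-optimized"):
        self.routes = PROFILES[profile]

    def pick_model(self, prompt: str) -> str:
        # Rules-based default: long prompts (~300+ tokens at ~4 chars
        # per token) go to the strong model, everything else stays local.
        tier = "complex" if len(prompt) // 4 > 300 else "simple"
        return self.routes[tier]

# The promised three-line integration:
router = Router(profile="cost-optimized")
model = router.pick_model("Classify this support ticket: refund or bug?")
print(model)  # a short classification task resolves to the local model
```

Shipping profiles as plain dicts keeps custom routing rules (the enterprise tier) a config change rather than a code change.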
Open-source SDK (free, build community and trust) -> Hosted dashboard with cost analytics and routing optimization suggestions ($49-199/mo) -> Enterprise tier with custom routing policies, team management, audit logs, and SLA guarantees ($500-2000/mo) -> Usage-based pricing for managed routing at scale (% of savings model)
3-4 months to first dollar. Month 1-2: Build open-source SDK with rules-based routing and LiteLLM integration. Month 2-3: Launch on HN/Reddit, build early community. Month 3-4: Ship hosted dashboard as paid tier. Realistic first-year ARR for a solo founder: $5K-30K, heavily dependent on community traction.
- “not being locked into api pricing that can change overnight”
- “send complex reasoning to Claude/GPT-4, route simple tasks to Gemma/Llama locally”
- “why lock yourself into one provider's API?”
- “api dependency risk with openai”
- “paying per token”