Out of frustration, I built an AI API proxy that automatically routes each reque...

growt · 2025-11-26T20:42:03 1764189723

You should probably take the service down before the HN crowd maxes out your credit card with the already discovered security and auth issues. Then find a technical co founder of you still want to pursue this idea and build it from scratch.

h2o_wine · 2025-11-28T23:39:42 1764373182

My credit card isn't in there. The app was written 6 months ago where it stayed in Beta. I rolled it out as a way to reduce the cost of development. Use it or don't. It will be obsolete in another year or two when AI calls level in price.

kbaker · 2025-11-26T20:08:13 1764187693

Hi, curious, did you know about OpenRouter before building this?

> OpenRouter provides a unified API that gives you access to hundreds of AI models through a single endpoint, while automatically handling fallbacks and selecting the most cost-effective options. Get started with just a few lines of code using your preferred SDK or framework.

It isn't OpenAI API compatible as far as I know, but they have been providing this service for a while...

minimaxir · 2025-11-26T20:11:19 1764187879

OpenRouter can also prioritize providers by price: https://openrouter.ai/docs/guides/routing/provider-selection...

jasonsb · 2025-11-26T20:06:59 1764187619

> Typical savings: 60-90% on most requests, since Gemini Flash is often free/cheapest, but you still get Claude or GPT-4 when needed.

This claim seems overstated. Accurately routing arbitrary prompts to the cheapest viable model is a hard problem. If it were reliably solvable, it would fundamentally disrupt the pricing models of OpenAI and Anthropic. In practice, you'd either sacrifice quality on edge cases or end up re-running failed requests on pricier models anyway, eating into those "savings".

moduspol · 2025-11-26T20:11:02 1764187862

I genuinely wonder the use cases are where the required accuracy is so low (or I guess the prompts are so strong) that you don't need to vigorously use evals to prevent regressions with the model that works best--let alone actually just change models on the fly based on what's cheaper.

growt · 2025-11-26T20:31:20 1764189080

Yes and in addition for some reason that use case is also not a fit for some cheap OS model like qwen or kimi, but must be run on the cheapest model of the big three.