Ain't nobody got time to pick models and compare features. It's annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions. I'm paying $20/mo each to Anthropic for Claude Code and to OpenAI for Codex, and previously paid Cursor for... I don't even know what. I know Cursor lets you select a few different models under the covers, but I have no idea how they differ, nor do I care.
I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
Almost all my tooling is years (or decades) old and stable. But the code assistant LLM scene effectively didn't exist in any meaningful way until this year, and it changes almost daily. There is no stability in the tooling, and you're missing out if you don't switch to newer models at least every few weeks right now. Codex (OpenAI/ChatGPT CLI) didn't even exist a month ago, and it's a contender for the best option. Claude Code has only been out for a few months.
I use Neovim in tmux in a terminal and haven't changed my primary dev environment or tooling in any meaningful way since switching from Vim to Neovim years ago.
I'm still changing code AIs as soon as the next big thing comes out, because you're crippling yourself if you don't.
There is in fact a shit ton of hella good art being made with Photoshop 6, because it has fair feature parity for what people actually use (content-aware fill and puppet warp being the main missing features) while being really easy to crack, so it's a common version for people to install in third-world countries. Photoshop has been enshittified for about 20 years, though.
> Ain't nobody got time to pick models and compare features. ... Make it integrate in a generic way, ... , so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
I use GitHub Copilot Pro+ because this was my main requirement as well.
Pro+ gets the new models as they come out -- it just made Claude Haiku 4.5 available for selection. I have not yet had a problem with running out of the premium allowance, but from reading how others use these, I am also not a power-user type.
I have not yet tried the CLI version, but it looks interesting. Before the IntelliJ plugin improved, I would switch to VS Code to run certain types of prompts, then switch back afterward without issues. The web version has the `Spaces` thing that I find useful for niche tasks.
I have no idea how it compares to the individual offerings, and based on previous HN threads here, there was a lot of hate for GH Copilot. So maybe it's actually terrible and the individual offerings are light-years ahead -- but it stays out of my way until I want it, and it does its job well enough for my use.
> I use GitHub Copilot Pro+ because this was my main requirement as well.
Frankly, I do not even get how people run out of 1,500 requests. For a heavy coding session, my max is around 45 requests per day, and that covers a ton of code and alterations, with some wasted on fluff mini changes. Most days it's barely 10 to 20.
I noticed that you can really eat through your requests if you don't bother switching models for small tasks, or if you constantly do edit/ask. When you rely on agent mode, it can edit multiple files in a single request, so you're always saving requests versus doing it yourself manually.
To be honest, I wish that Copilot had a 600-request tier, instead of the massive jump to 1,500. The other option is to just use pay-per-request.
* Cheapest per request is Pro+: 1,500 requests, paid yearly, at around 1.8 cents/request.
* Pro: 300 requests, paid yearly, is around 2.4 cents/request.
* Overflow requests (so, without a subscription) are 4 cents/request.
Note: the Pro and Pro+ prices assume you use 100% of your requests. If you only use 700 requests on Pro+, you're paying about the same as the 4 cents/request overflow rate.
So ironically, you actually come out cheaper with a Pro (300 requests) subscription for the first 300, and then paying 4 cents/request for requests 301 through 700...
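To make that concrete, here's a quick back-of-the-envelope sketch in Python using the approximate per-request figures above (illustrative only, not official GitHub pricing):

```python
# Rough cost comparison for someone who uses ~700 requests/month, using the
# approximate per-request prices quoted above (assumed, not official pricing).
PRO_PLUS_RATE = 0.018   # ~1.8 cents/request: Pro+ paid yearly, all 1,500 used
PRO_RATE = 0.024        # ~2.4 cents/request: Pro paid yearly, all 300 used
OVERFLOW_RATE = 0.04    # ~4 cents/request: pay-as-you-go

used = 700

# Pro+ bills the full 1,500-request allotment whether you use it or not.
pro_plus_cost = 1500 * PRO_PLUS_RATE                               # ~$27.00
# Pro covers the first 300 requests, then overflow pricing for the rest.
pro_plus_overflow = 300 * PRO_RATE + (used - 300) * OVERFLOW_RATE  # ~$23.20

print(f"Pro+ for {used} requests: ${pro_plus_cost:.2f} "
      f"(~{100 * pro_plus_cost / used:.1f} cents each)")
print(f"Pro + overflow for {used} requests: ${pro_plus_overflow:.2f}")
```

At 700 requests, Pro+ works out to roughly 3.9 cents each, which is why it's about a wash with the overflow rate.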
> To be honest, I wish that Copilot had a 600-request tier, instead of the massive jump to 1,500. The other option is to just use pay-per-request.
Same here. Well, 900 would be a good middle option for me as well. I was switching to the unlimited model for the simple things, but since I don't use all of the premium allotment I started just leaving it on the one that is working best for the job that day.
I guess part of the "value" of Pro+ is the extra "Spark" credits, which I have zero use for. But I simply wanted something that integrated into my ecosystem instead of having to add to or change it. I also did not want to have to think about how many pennies I'm using (I appreciate that breakdown though! Good to know) -- I'll pay a reasonable convenience tax for my time and the mental space of not having to babysit usage.
Even if you pick one: first it's prompt-driven development, then context-driven. Then you should use a detailed spec. But no, now it's better to talk to it like a person and have a conversation. Hold up, why are you doing that? You should be doing example-driven. Look, I get that they probably all have their place, but since there isn't consensus on any of this, it's next to impossible to find good examples. Someone replied to me on an old post and called it bug-driven development, and that stuck with me: you get it to do something (any way), and then you have to fix all the bugs and errors.
Work it out, brother. If you can learn to code at a good level, then you should be able to learn how to increase your productivity with LLMs. When, where, and how to use them is the key.
I don't think it's appreciated enough how valuable it is to have a structured and consistent architecture combined with lots of specific custom context. Claude knows how my integration tests should look; it knows how my services should look, what dependencies they have, and how they interact with the database. It knows my entire DB schema with all foreign key relationships. If I'm starting a new feature, I can have it build 5 or 6 services (not without it first making suggestions on things I'm missing), with integration tests and raw SQL all generated by Claude, and run an integration-test loop until the services are doing what they should. I rarely have to step in and actually code. It shines for this use case, and the productivity boost is genuinely incredible.
Other situations I know doing it myself will be better and/or quicker than asking Claude.
I think it's a valid complaint. Who wants to constantly spend overhead keeping up with what's current, without clear definitions, while adding uncertainty to their tooling? It's a total PITA.
Then don't? I don't think it's a valid complaint at _all_.
It's totally fine to just pick one tool (ChatGPT, Claude, Gemini) and use whatever best default it allows. You'll get 90% of the benefits and not have to think at all.
AI is new and developing at breakneck pace. You can't complain that you want to get bleeding edge without having to do research or change workflows.
That's already unrealistic for "normal" fields. It's absurd to expect for AI.
We diverge on opinions here - most people's workflow doesn't revolve around the current distinctions between AI models. Having to incorporate that into their workflow is a PITA - I understand some people live off that, and that's their bread and butter. Good for them for finding their competitive advantage. For every other builder out there who builds instead of integrating the newest AI model - it's a PITA.
Right on, for code examples for my writing and my own ‘gentleman scientist’ experiments I stick with gemini-cli and codex.
For play time, I literally love experimenting with small local models. I am an old man, and I have always liked tools that ‘make me happy’ while programming like Emacs, Lisp languages, and using open source because I like to read other people’s code. But, for getting stuff done, for now gemini-cli and codex hit a sweet spot for me.
I love haiku 4.5, but you don't need it. It's like a motorcycle. Feels good, but doesn't do the heavy lifting.
Cursor has an auto mode for exactly your situation - it'll switch to something cost effective enough, fast enough, consistent enough, new enough. Cursor is on the ball most of the time and you're not stuck with degraded performance from OpenAI or Anthropic.
They're working on all that. I think "ACP" is supposed to be the answer. Then you can use the models in your IDEs, and they can all develop against the same spec so it'll be easy to pop into whatever model.
GPT-5 is supposed to cleverly decide when to think harder.
But ya we're not there yet and I'm tired of it too, but what can you do.
This is what opencode does for me. One harness for all models, standardized TUI, and they're rolling out a product to serve models via API with one bill through them
We're in the stage where the 8080, 8085, Z80, 6502, and 6809 CPUs are all on the market, and the relevant bus is S-100, with other buses not yet standardized.
You either live with what you’re using or you change around and fiddle with things constantly.
Unfortunately I already pay for and use both, because on the $20/mo plan, you get cut off after a few hours due to usage limits. Claude resets daily after "5 hours" (I can't determine what runs the clock, but it seems to be wall time (?!)), and Codex cuts you off for multiple days after a long session.
I use OpenRouter for similar reasons -- half to avoid lock-in, and the other half to reduce the switching pain, which is just a way to say "if I do get locked in, I want to move easily"
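In practice that mostly means pointing an OpenAI-compatible client at OpenRouter's endpoint and only changing the model string; a rough sketch (the model identifiers and environment variable name here are just examples):

```python
# Sketch: talk to multiple providers' models through OpenRouter's
# OpenAI-compatible endpoint, so switching is just a different model string.
# Model identifiers are illustrative; check OpenRouter's model list.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

for model in ("anthropic/claude-3.5-sonnet", "openai/gpt-4o"):
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize RAII in one sentence."}],
    )
    print(model, "->", reply.choices[0].message.content)
```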
as mentioned already by the others, using opencode [1] helps with this, if you like the cli workflow. it is good enough and does not need to exceed what the leaders are doing.
when combined with the ability to use github copilot to make the llm calls, i can play with almost any provider i need. also helps if you get its access through your work or school.
for example, Haiku is already offered by them and costs a third in credits.
> annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions
I use KiloCode, and what I find amazing is that it'll be working on a problem and then a message will come up about needing to top up the money in my account to continue (or switch to a free model), so I switch to a free model (currently their Code Supernova 1-million-context model) and it doesn't miss a beat and continues working on the problem. I don't know how they do this. It went from using a Claude Sonnet model to this Code Supernova model without missing a beat. Not sure if this is a KiloCode thing or if others do this as well. How does that even work? And this wasn't a trivial problem; it was adding a microcode debugger to a microcoded state machine system (coding in C++).
OK I understand what those words mean, but how exactly does that work? How does the new model 'know' what's being worked on when the old model was in the middle of working on a task and then a new model is switched to? (and where the task might be modifying a C++ file)
Generally speaking, agents send the entire previous conversation to the model on every message. That's why you have to do things like context compaction. So if you switch models midway, you are still sending the entire previous chat history to the new model.
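Put differently, the "conversation" is just an array of messages the client resends each turn, so nothing stops it from resending that same array to a different model. A minimal sketch using an OpenAI-style chat client (model names are placeholders; a real agent's history also carries tool calls and file contents):

```python
# Minimal illustration: conversation state lives client-side as a message list,
# so the same history can be replayed to a different model mid-task.
from openai import OpenAI

client = OpenAI()
history = [
    {"role": "system", "content": "You are a careful C++ refactoring assistant."},
    {"role": "user", "content": "Add a single-step mode to the microcode debugger."},
]

# First turn goes to one model...
first = client.chat.completions.create(model="gpt-4o", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})
history.append({"role": "user", "content": "Good. Now wire it into the CLI flags."})

# ...and the follow-up can go to a different model, which sees the full history
# and simply continues as if it had been there all along.
second = client.chat.completions.create(model="gpt-4o-mini", messages=history)
print(second.choices[0].message.content)
```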
In addition to the sibling comments: you can play with this yourself by sending raw API requests with fake history to gaslight the model into believing it said things it didn't. I use this sometimes to coerce it into specific behavior, on the hunch that maybe it will listen to itself more than to my prompt (though I never benchmarked it):
- do <fake task> and be succinct
- <fake curt reply>
- I love how succinct that was. Perfect. Now please do <real prompt>
The models don't have state, so they don't know they never said it. You're just asking "given this conversation, what is the most likely next token?"
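Concretely, that just means writing the assistant turn yourself before sending the real request; a rough sketch of the trick above (the fabricated exchange and model name are made up for illustration):

```python
# Sketch of the fake-history trick: the assistant turn below never happened,
# but the model can't tell, because all the state is in this list we control.
from openai import OpenAI

client = OpenAI()
messages = [
    {"role": "user", "content": "Describe the build layout and be succinct."},
    # Fabricated reply, written by us, to prime the model toward terse answers.
    {"role": "assistant", "content": "Three targets: core, cli, tests. CMake drives all of them."},
    {"role": "user", "content": "I love how succinct that was. Perfect. "
                                "Now summarize the failing test cases the same way."},
]

reply = client.chat.completions.create(model="gpt-4o", messages=messages)
print(reply.choices[0].message.content)
```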
The underlying LLM service provider APIs require sending the entire history with every request anyway; the state is entirely local to your client (or Kilocode or whatever), not in some "session" on the API side. (There are some APIs that will optionally handle that state for you, like OpenAI's more recent stuff — but those are the exception, not the rule.)
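For contrast, a sketch of what that exception looks like, assuming something like OpenAI's Responses-style API, where the server keeps the thread and you chain turns by referencing the previous response (treat the details as approximate):

```python
# Contrast: an API with optional server-side state. The provider stores the
# thread, and you chain turns by id instead of resending the whole history.
from openai import OpenAI

client = OpenAI()
first = client.responses.create(model="gpt-4o", input="Outline the debugger refactor.")
followup = client.responses.create(
    model="gpt-4o",
    input="Now list the files that change.",
    previous_response_id=first.id,   # server reconstructs the earlier context
)
print(followup.output_text)
```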
> I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.