Ain't nobody got time to pick models and compare features. It's annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions. I'm paying $20/mo each to Anthropic for Claude Code and to OpenAI for Codex, and previously paid Cursor for... I don't even know what. I know Cursor lets you select a few different models under the covers, but I have no idea how they differ, nor do I care.
I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
Almost all my tooling is years (or decades) old and stable. But the code assistant LLM scene effectively didn't exist in any meaningful way until this year, and it changes almost daily. There is no stability in the tooling, and you're missing out if you don't switch to newer models at least every few weeks right now. Codex (OpenAI/ChatGPT CLI) didn't even exist a month ago, and it's a contender for the best option. Claude Code has only been out for a few months.
I use Neovim in tmux in a terminal and haven't changed my primary dev environment or tooling in any meaningful way since switching from Vim to Neovim years ago.
I'm still changing code AIs as soon as the next big thing comes out, because you're crippling yourself if you don't.
There is in fact a shit ton of hella good art being made with Photoshop 6, because it has fair feature parity for what people actually use (content-aware fill and puppet warp being the main missing features) while being really easy to crack, so it's a common version for people to install in third-world countries. Photoshop has been enshittified for about 20 years, though.
> Ain't nobody got time to pick models and compare features. ... Make it integrate in a generic way, ... , so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.
I use GitHub Copilot Pro+ because this was my main requirement as well.
Pro+ gets the new models as they come out -- it just made Claude Haiku 4.5 available for selection. I have not yet had a problem with running out of the premium allowance, but from reading how others use these, I am also not a power-user type.
I have not yet tried the CLI version, but it looks interesting. Before the IntelliJ plugin improved, I would switch to VS Code to run certain types of prompts, then switch back afterward without issues. The web version has the `Spaces` thing that I find useful for niche tasks.
I have no idea how it compares to the individual offerings, and based on previous HN threads here, there was a lot of hate for GH Copilot. So maybe it's actually terrible and the individual offerings are light-years ahead -- but it stays out of my way until I want it, and it does its job well enough for my use.
> I use GitHub Copilot Pro+ because this was my main requirement as well.
Frankly, I do not even get how people run out of 1,500 requests. For a heavy coding session, my max is around 45 requests per day, and that covers a ton of code and alterations, with some wasted on fluff mini changes. Most days it's barely 10 to 20.
I noticed that you can really eat through your requests if you don't bother switching models for small tasks, or if you constantly do edit/ask. When you rely on agent mode, it can edit multiple files in a single request, so you're always saving requests versus doing it yourself manually.
To be honest, I wish that Copilot had a 600-request tier, instead of the massive jump to 1,500. The other option is to just use pay-per-request.
* Cheapest per request is Pro+: 1,500 requests, paid yearly, at around 1.8 cents/request.
* Pro: 300 requests, paid yearly, is around 2.4 cents/request.
* Overflow requests (so, without a subscription) are 4 cents/request.
Note: the Pro and Pro+ prices assume you use 100% of your requests. If you only use 700 requests on Pro+, you're paying about the same as the 4 cents/request overflow rate.
So ironically, you actually come out cheaper with a Pro (300 requests) subscription for the first 300, and then paying 4 cents/request for requests 301 through 700...
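To make that concrete, here's a quick back-of-the-envelope sketch in Python using the approximate per-request figures above (illustrative only, not official GitHub pricing):

```python
# Rough cost comparison for someone who uses ~700 requests/month, using the
# approximate per-request prices quoted above (assumed, not official pricing).
PRO_PLUS_RATE = 0.018   # ~1.8 cents/request: Pro+ paid yearly, all 1,500 used
PRO_RATE = 0.024        # ~2.4 cents/request: Pro paid yearly, all 300 used
OVERFLOW_RATE = 0.04    # ~4 cents/request: pay-as-you-go

used = 700

# Pro+ bills the full 1,500-request allotment whether you use it or not.
pro_plus_cost = 1500 * PRO_PLUS_RATE                               # ~$27.00
# Pro covers the first 300 requests, then overflow pricing for the rest.
pro_plus_overflow = 300 * PRO_RATE + (used - 300) * OVERFLOW_RATE  # ~$23.20

print(f"Pro+ for {used} requests: ${pro_plus_cost:.2f} "
      f"(~{100 * pro_plus_cost / used:.1f} cents each)")
print(f"Pro + overflow for {used} requests: ${pro_plus_overflow:.2f}")
```

At 700 requests, Pro+ works out to roughly 3.9 cents each, which is why it's about a wash with the overflow rate.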
> To be honest, I wish that Copilot had a 600-request tier, instead of the massive jump to 1,500. The other option is to just use pay-per-request.
Same here. Well, 900 would be a good middle option for me as well. I was switching to the unlimited model for the simple things, but since I don't use all of the premium allotment I started just leaving it on the one that is working best for the job that day.
I guess part of the "value" of Pro+ is the extra "Spark" credits, which I have zero use for. But I simply wanted something that integrated into my ecosystem instead of having to add to or change it. I also did not want to have to think about how many pennies I'm using (I appreciate that breakdown though! Good to know) -- I'll pay a reasonable convenience tax for my time and the mental space of not having to babysit usage.
Even if you pick one: first it's prompt-driven development, then context-driven. Then you should use a detailed spec. But no, now it's better to talk to it like a person and have a conversation. Hold up, why are you doing that? You should be doing example-driven. Look, I get that they probably all have their place, but since there isn't consensus on any of this, it's next to impossible to find good examples. Someone replied to me on an old post and called it bug-driven development, and that stuck with me: you get it to do something (any way), and then you have to fix all the bugs and errors.
Work it out, brother. If you can learn to code at a good level, then you should be able to learn how to increase your productivity with LLMs. When, where, and how to use them is the key.
I don't think it's appreciated enough how valuable it is to have a structured and consistent architecture combined with lots of specific custom context. Claude knows how my integration tests should look; it knows how my services should look, what dependencies they have, and how they interact with the database. It knows my entire DB schema with all foreign key relationships. If I'm starting a new feature, I can have it build 5 or 6 services (not without it first making suggestions on things I'm missing), with integration tests and raw SQL all generated by Claude, and run an integration-test loop until the services are doing what they should. I rarely have to step in and actually code. It shines for this use case, and the productivity boost is genuinely incredible.
Other situations I know doing it myself will be better and/or quicker than asking Claude.
I think it's a valid complaint. Who wants to constantly spend overhead keeping up with what's current, without clear definitions, while adding uncertainty to their tooling? It's a total PITA.
Then don't? I don't think it's a valid complaint at _all_.
It's totally fine to just pick one tool (ChatGPT, Claude, Gemini) and use whatever best default it allows. You'll get 90% of the benefits and not have to think at all.
AI is new and developing at breakneck pace. You can't complain that you want to get bleeding edge without having to do research or change workflows.
That's already unrealistic for "normal" fields. It's absurd to expect for AI.
We diverge on opinions here - most people's workflow doesn't revolve around the current distinctions between AI models. Having to incorporate that into their workflow is a PITA - I understand some people live off that, and that's their bread and butter. Good for them for finding their competitive advantage. For every other builder out there who builds instead of integrating the newest AI model - it's a PITA.
Right on, for code examples for my writing and my own ‘gentleman scientist’ experiments I stick with gemini-cli and codex.
For play time, I literally love experimenting with small local models. I am an old man, and I have always liked tools that ‘make me happy’ while programming like Emacs, Lisp languages, and using open source because I like to read other people’s code. But, for getting stuff done, for now gemini-cli and codex hit a sweet spot for me.
I love haiku 4.5, but you don't need it. It's like a motorcycle. Feels good, but doesn't do the heavy lifting.
Cursor has an auto mode for exactly your situation - it'll switch to something cost effective enough, fast enough, consistent enough, new enough. Cursor is on the ball most of the time and you're not stuck with degraded performance from OpenAI or Anthropic.
They're working on all that. I think "ACP" is supposed to be the answer. Then you can use the models in your IDEs, and they can all develop against the same spec so it'll be easy to pop into whatever model.
GPT-5 is supposed to cleverly decide when to think harder.
But ya we're not there yet and I'm tired of it too, but what can you do.
This is what opencode does for me. One harness for all models, standardized TUI, and they're rolling out a product to serve models via API with one bill through them
We're in the stage where the 8080, 8085, Z80, 6502, and 6809 CPUs are all on the market, and the relevant bus is S-100, with other buses not yet standardized.
You either live with what you’re using or you change around and fiddle with things constantly.
Unfortunately I already pay for and use both, because on the $20/mo plan, you get cut off after a few hours due to usage limits. Claude resets daily after "5 hours" (I can't determine what runs the clock, but it seems to be wall time (?!)), and Codex cuts you off for multiple days after a long session.
I use OpenRouter for similar reasons -- half to avoid lock-in, and the other half to reduce the switching pain, which is just a way to say "if I do get locked in, I want to move easily"
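In practice that mostly means pointing an OpenAI-compatible client at OpenRouter's endpoint and only changing the model string; a rough sketch (the model identifiers and environment variable name here are just examples):

```python
# Sketch: talk to multiple providers' models through OpenRouter's
# OpenAI-compatible endpoint, so switching is just a different model string.
# Model identifiers are illustrative; check OpenRouter's model list.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

for model in ("anthropic/claude-3.5-sonnet", "openai/gpt-4o"):
    reply = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Summarize RAII in one sentence."}],
    )
    print(model, "->", reply.choices[0].message.content)
```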
as mentioned already by the others, using opencode [1] helps with this, if you like the cli workflow. it is good enough and does not need to exceed what the leaders are doing.
when combined with the ability to use github copilot to make the llm calls, i can play with almost any provider i need. also helps if you get its access through your work or school.
for example, Haiku is already offered by them and costs a third in credits.
> annoying enough having to switch from one LLM ecosystem to another all the time due to vague usage restrictions
I use KiloCode, and what I find amazing is that it'll be working on a problem and then a message will come up about needing to top up the money in my account to continue (or switch to a free model), so I switch to a free model (currently their Code Supernova 1-million-context model) and it doesn't miss a beat and continues working on the problem. I don't know how they do this. It went from using a Claude Sonnet model to this Code Supernova model without missing a beat. Not sure if this is a KiloCode thing or if others do this as well. How does that even work? And this wasn't a trivial problem; it was adding a microcode debugger to a microcoded state machine system (coding in C++).
OK I understand what those words mean, but how exactly does that work? How does the new model 'know' what's being worked on when the old model was in the middle of working on a task and then a new model is switched to? (and where the task might be modifying a C++ file)
Generally speaking, agents send the entire previous conversation to the model on every message. That's why you have to do things like context compaction. So if you switch models midway, you are still sending the entire previous chat history to the new model.
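Put differently, the "conversation" is just an array of messages the client resends each turn, so nothing stops it from resending that same array to a different model. A minimal sketch using an OpenAI-style chat client (model names are placeholders; a real agent's history also carries tool calls and file contents):

```python
# Minimal illustration: conversation state lives client-side as a message list,
# so the same history can be replayed to a different model mid-task.
from openai import OpenAI

client = OpenAI()
history = [
    {"role": "system", "content": "You are a careful C++ refactoring assistant."},
    {"role": "user", "content": "Add a single-step mode to the microcode debugger."},
]

# First turn goes to one model...
first = client.chat.completions.create(model="gpt-4o", messages=history)
history.append({"role": "assistant", "content": first.choices[0].message.content})
history.append({"role": "user", "content": "Good. Now wire it into the CLI flags."})

# ...and the follow-up can go to a different model, which sees the full history
# and simply continues as if it had been there all along.
second = client.chat.completions.create(model="gpt-4o-mini", messages=history)
print(second.choices[0].message.content)
```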
In addition to the sibling comments: you can play with this yourself by sending raw API requests with fake history to gaslight the model into believing it said things it didn't. I use this sometimes to coerce it into specific behavior, on the hunch that maybe it will listen to itself more than to my prompt (though I never benchmarked it):
- do <fake task> and be succinct
- <fake curt reply>
- I love how succinct that was. Perfect. Now please do <real prompt>
The models don't have state, so they don't know they never said it. You're just asking "given this conversation, what is the most likely next token?"
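Concretely, that just means writing the assistant turn yourself before sending the real request; a rough sketch of the trick above (the fabricated exchange and model name are made up for illustration):

```python
# Sketch of the fake-history trick: the assistant turn below never happened,
# but the model can't tell, because all the state is in this list we control.
from openai import OpenAI

client = OpenAI()
messages = [
    {"role": "user", "content": "Describe the build layout and be succinct."},
    # Fabricated reply, written by us, to prime the model toward terse answers.
    {"role": "assistant", "content": "Three targets: core, cli, tests. CMake drives all of them."},
    {"role": "user", "content": "I love how succinct that was. Perfect. "
                                "Now summarize the failing test cases the same way."},
]

reply = client.chat.completions.create(model="gpt-4o", messages=messages)
print(reply.choices[0].message.content)
```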
The underlying LLM service provider APIs require sending the entire history with every request anyway; the state is entirely local to your client (or Kilocode or whatever), not in some "session" on the API side. (There are some APIs that will optionally handle that state for you, like OpenAI's more recent stuff — but those are the exception, not the rule.)
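For contrast, a sketch of what that exception looks like, assuming something like OpenAI's Responses-style API, where the server keeps the thread and you chain turns by referencing the previous response (treat the details as approximate):

```python
# Contrast: an API with optional server-side state. The provider stores the
# thread, and you chain turns by id instead of resending the whole history.
from openai import OpenAI

client = OpenAI()
first = client.responses.create(model="gpt-4o", input="Outline the debugger refactor.")
followup = client.responses.create(
    model="gpt-4o",
    input="Now list the files that change.",
    previous_response_id=first.id,   # server reconstructs the earlier context
)
print(followup.output_text)
```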
> I just want consistent tooling and I don't want to have to think about what's going on behind the scenes. Make it better. Make it better without me having to do research and pick and figure out what today's latest fashion is. Make it integrate in a generic way, like TLS servers, so that it doesn't matter whether I'm using a CLI or neovim or an IDE, and so that I don't have to constantly switch tooling.