My one shot rate for unattended prompts (triggered GitHub actions) has gone from...

Retric · 2025-10-14T13:52:01 1760449921

Requesting concrete examples isn’t a high bar. Autopilot got better tells me effectively nothing. Autopilot can now handle stoplights does.

kasey_junk · 2025-10-14T14:14:38 1760451278

“ but i cannot point to a single person who seems to think that they are accomplishing real-world tasks with GPT5 better than they were with GPT4.”

I don’t use OpenAI stuff but I seem to think Claude is getting better for accomplishing the real world tasks I ask of it.

Retric · 2025-10-14T17:15:20 1760462120

Specifics are worth talking about. I just felt it unfair to complain about raising the bar when you didn’t initially reach it.

In your own worlds: “You asked for a single anecdote of llms getting better at daily tasks.”

Which is already less specific than their request: “i'd love to hear tangible ways it's actually better.”

Saying “is getting better for accomplishing the real world tasks I ask of it” brings nothing to a discussion and was the kind of vague statement that they were initially complaining about. If LLM’s are really improving it’s not a major hurdle to say something meaningful about what specific is getting better. /tilting at windmills