
I agree with this, and actually Claude Code agrees with it too. About 10 times now I've had Codex CLI (gpt-5-codex high) and Claude Code (Sonnet 4.5, sometimes Opus 4.1) do the same lengthy task with the same prompt in cloned folders, then asked each to review the work in the other folder and determine who did the best job.

100% of the time, both Codex and Claude Code agree on review that Codex did the far better job: it met all the requirements, where Claude would leave things out, do them lazily or badly, and lose track overall.

Codex high just feels much smarter and more capable than Claude right now, and even though it's quite a bit slower, it produces work I don't have to go over again and again to get up to the standard I want.



I share your observations. It's strange to see Anthropic losing so much ground so fast - they seemed to be the first to crack long-horizon agentic tasks, via what I can only assume is an extremely exotic RL process.

Now, I will concede that for non-coding long-horizon tasks, GPT-5 is marginally worse than Sonnet 4.5 in my own scaffolds. But GPT-5 is cheaper, and Sonnet 4.5 is about 2 months newer. However, for coding in a CLI context, GPT-5-Codex is night-and-day better. I don't know how they did it.


Ever since 4.5, I can't get Claude to do anything that takes a while.

4.0 would chug along for 40 minutes. 4.5 sometimes refuses and straight up says the scope is too big.

My theory is that Anthropic is severely compute constrained, and even though 4.5 is smarter, the usage limits and its obsession with rushing to finish were put in mainly to save server compute.



