Has anyone switched to Gemini CLI? It's so important, but also exhausting, to keep up with which model is at the leading edge, especially since every model has different idiosyncrasies you have to learn in order to work with it effectively.
I haven't tried Gemini CLI with Gemini 3 Pro, but pretty much all the others. I usually run four agents at the same time, for each task, giving them the same prompt and then comparing their responses.
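For anyone who wants to script the "same prompt to several agents" workflow, here is a minimal sketch. It assumes each agent can be run non-interactively as a command that takes the prompt as its last argument. The `AGENTS` entries are hypothetical stand-ins (small Python one-liners), not real CLI invocations, so swap in whatever your tools actually support.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins: each entry maps a label to the argv prefix of a
# command that accepts the prompt as its final argument. Replace these with
# the non-interactive invocation your agent CLIs actually provide.
AGENTS = {
    "agent_a": [sys.executable, "-c", "import sys; print('A:', sys.argv[1])"],
    "agent_b": [sys.executable, "-c", "import sys; print('B:', sys.argv[1].upper())"],
}

def run_agent(name, argv_prefix, prompt, timeout=600):
    """Run one command with the prompt appended and capture its stdout."""
    proc = subprocess.run(
        argv_prefix + [prompt],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return name, proc.returncode, proc.stdout.strip()

def fan_out(prompt, agents=AGENTS):
    """Send the same prompt to every agent concurrently.

    Returns a dict of {name: (returncode, output)}.
    """
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        futures = [
            pool.submit(run_agent, name, argv, prompt)
            for name, argv in agents.items()
        ]
        return {name: (rc, out) for name, rc, out in (f.result() for f in futures)}

if __name__ == "__main__":
    # Print the responses one after another for manual comparison.
    for name, (rc, out) in fan_out("rename foo to bar").items():
        print(f"--- {name} (exit {rc}) ---\n{out}")
```

Threads are enough here because each worker just waits on a subprocess. Comparing the responses is still a manual step in this sketch.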
Gemini CLI has the lowest rate limits and the least ability to steer the models (not sure whether that's a model or tooling thing, but I cannot get any of the Google models to stop outputting code comments constantly and everywhere), and the API seemingly becomes unavailable frequently for some reason.
Claude Code is fast and easy to steer, but the quality degrades quickly and randomly, seemingly by time of day. I'm not sure if they're running differently quantized models at different times, but strangely there is a clear quality difference depending on when in the day I use it. I haven't found a way of verifying this though, ideas welcome.
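One idea for checking the time-of-day hunch: run the same fixed task repeatedly, record a timestamp and whether the result passed your own check, then look at pass rates per hour. A rough sketch follows, assuming you already have such a log. The `(timestamp, passed)` record format is an assumption and the sample records are invented purely for illustration.

```python
from collections import defaultdict
from datetime import datetime

def pass_rate_by_hour(records):
    """Group (ISO timestamp, passed) records by hour of day.

    Returns {hour: (passes, runs, pass_rate)} sorted by hour.
    """
    buckets = defaultdict(lambda: [0, 0])  # hour -> [passes, runs]
    for timestamp, passed in records:
        hour = datetime.fromisoformat(timestamp).hour
        buckets[hour][1] += 1
        if passed:
            buckets[hour][0] += 1
    return {
        hour: (passes, runs, passes / runs)
        for hour, (passes, runs) in sorted(buckets.items())
    }

if __name__ == "__main__":
    # Invented example data, not real measurements.
    sample = [
        ("2025-01-01T09:15:00", True),
        ("2025-01-01T09:45:00", False),
        ("2025-01-02T21:05:00", True),
    ]
    for hour, (passes, runs, rate) in pass_rate_by_hour(sample).items():
        print(f"{hour:02d}:00  {passes}/{runs}  {rate:.0%}")
```

With only a handful of runs per hour the differences will mostly be noise, so this only means something after many repetitions of the same task.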
Codex CLI is probably what I use the most, with "gpt-5+high". It's kind of slow, a lot slower than Claude Code, but it almost always gets it right on the first try, and seemingly no other model+tool combination follows instructions as well. Even if your AGENTS.md is almost overflowing with rules and requirements, it seems to nail things anyway.
Codex has gotten kind of nerfed with their weird choice to limit lines of code read to 250 and with how often it drops the middle of the context. None of the CLIs are performing well for me right now. I'm on Codex and Claude Max, btw. Disappointing.
For Gemini 3.0, the rate limits are very very generous. Google says rate limits refresh every five hours, and that only “a very small fraction of power users” will ever hit the limits.
Maybe these new releases bring some serious enhancements, but my experience with the Gemini CLI has been dreadful. It craps out at least half of the time. When it works it is ridiculously fast, so I keep trying it. But it has proven very inferior to the Claude Code experience in my usage.
Codex with gpt-5-high I trust to get things right without much effort. Claude is the best tool-using agent out there. It's very good at using tools to check whether changes are actually producing the intended outcomes.
Currently my ranking is:
* Cursor Composer: impressively fast and able but not tuned to be that agentic, so it's better for one-shot code changes than long-running tasks. Fantastic UI.
* Claude Code: Works great if you can set up a verifiable environment and a clear plan, then set it loose to build something for an hour.
* Grok: Similar to Cursor Composer but slower and more agentic. Not currently using.
* ChatGPT Codex, Gemini: Haven't tried yet.