If you believe another thread the benchmarks are comparing Gemini-3 (probably th... | Hacker News


HardCodedBias 37 days ago | on: Gemini 3

If you believe another thread, the benchmarks are comparing Gemini-3 (probably thinking) to GPT-5.1 without thinking. The person also claims that with thinking on, the gap narrows considerably. We'll probably have third-party benchmarks in a couple of days.

iamdelirium 37 days ago

It is easy to show that the numbers are for GPT 5.1 thinking high.

Just go to the leaderboard website and see for yourself: https://arcprize.org/leaderboard
