Hacker News | nico1207's comments

They are not even leading in Terminal-Bench... GPT-5.1-Codex is better than Gemini 3 Pro


You can just view the dataset on Hugging Face


GPT-4o with image output is not yet available. So what did you even test? DALL-E 3?


It's making images for me when I ask it to.

I'm using the web interface, if that helps. It doesn't have all the 4o options yet, but it does do pictures. I think they are the same as with 4.5.

I just noticed, after further testing, that the text it shows in images is nowhere near as accurate as in the article's demo, so maybe it's a hybrid they're using for now.


That's not 4o; that'd be 4o routing the request to DALL-E. AFAIK only text output is enabled so far.


Yes it likely is. I've had time to play around and see that so far it doesn't look any different (yet). I have a paid account, so apparently I'll be among the early folks getting all the things. Just not yet.

I definitely look forward to re-doing my Three Blind Mice test when it happens.

I noticed in their demo that the 4o text still has glitches, but nowhere near to the extent that the current DALL-E results do (the longer the text, the worse it gets). It's pretty important that they eventually get text right in the graphics.

