I just tried your trademark benchmark on the new 4o Image Output, though it's no...

jonomacd · 2025-03-26T07:46:47 1742975207

And the same thing with Gemini 2.0 Flash native image output.

https://imgur.com/a/V4YAkX5

It's sort of irrelevant though as the test is about SVGs.

Unroasted6154 · 2025-03-25T21:21:54 1742937714

Was that an actual SVG?

simonw · 2025-03-25T21:36:16 1742938576

No, that's GPT-4o native image output.

sebzim4500 · 2025-03-25T22:27:36 1742941656

I wonder how far away we are from models which, given this prompt, generate that image in the first step in their chain-of-thought and then use it as a reference to generate SVG code.

It could be useful for much more than just silly benchmarks. There's a reason physics students are taught to draw a diagram before attempting a problem.

simonw · 2025-03-25T22:46:50 1742942810

Someone managed to get ChatGPT to render the image using GPT-4o, then save that image to a Code Interpreter container and run Python code with OpenCV to trace the edges and produce an SVG: https://bsky.app/profile/btucker.net/post/3lla7extk5c2u
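
For anyone curious what that kind of pipeline might look like, here is a minimal sketch. It is not the code from the linked post: the function names, Canny thresholds, and simplification tolerance are all assumptions chosen for illustration, and it only draws black outlines (no fills or colors).

```python
def contours_to_svg(contours, width, height):
    """Serialize outlines (lists of (x, y) points) as closed SVG paths."""
    paths = []
    for points in contours:
        if len(points) < 2:
            continue  # a single point has no outline to draw
        start, rest = points[0], points[1:]
        segments = " ".join(f"L {x} {y}" for x, y in rest)
        d = f"M {start[0]} {start[1]} {segments} Z"
        paths.append(f'<path d="{d}" fill="none" stroke="black"/>')
    body = "\n".join(paths)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}" '
        f'viewBox="0 0 {width} {height}">\n{body}\n</svg>'
    )


def trace_image(path, low=100, high=200, epsilon=2.0):
    """Edge-trace a raster image and return SVG markup.

    low/high are Canny thresholds and epsilon is the polygon
    simplification tolerance in pixels (all illustrative defaults).
    """
    import cv2  # needs the opencv-python package; imported lazily

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    height, width = gray.shape

    edges = cv2.Canny(gray, low, high)
    # [-2] picks the contour list on both OpenCV 3.x and 4.x
    found = cv2.findContours(edges, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)[-2]

    contours = []
    for contour in found:
        approx = cv2.approxPolyDP(contour, epsilon, True)
        contours.append([(int(x), int(y)) for x, y in approx.reshape(-1, 2)])
    return contours_to_svg(contours, width, height)
```

Calling `trace_image("pelican.png")` (a hypothetical filename) returns the SVG as a string. The SVG writer is kept separate from the OpenCV step so it can be reused with any other source of outlines.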

qingcharles · 2025-03-26T16:31:56 1743006716

Does this match the rules of your test, or is it cheating? :)