This looks like the very complaint of "specification gaming". I was wondering ho...

TeMPOraL · 2025-03-27T21:28:00 1743110880

I'm gonna guess GP used a rather short prompt. At least that's what happens when people heavily underspecify what they want.

It's a communication issue, and it's true with LLMs as much as with humans. Situational context and life experience papers over a lot of this, and LLMs are getting better at the equivalent too. They get trained to better read absurdly underspecified, relationship-breaking requests of the "guess what I want" flavor - like when someone says, "make this test pass", they don't really mean "make this test pass", they mean "make this test into something that seems useful, which might include implementing the feature it's exercising if it doesn't exist yet".

polygot · 2025-03-28T21:15:41 1743196541

My prompt was pretty short, I think it was "Make these tests pass". Having said that, I wouldn't mind if it asked me for clarification before proceeding.