
The point is that if HumanEval was in GPT's training data, then this component improved memorization from mediocre to OK-ish, and actual coding skill is still not being tested.


In my experience and other people's, GPT-4 can do more than simply memorize. It can at least interpolate and reason a little bit too.

A few other tests show that GPT-4 would achieve much better results than 67% for something it has sufficient training data on like GRE Verbal and AP Macroeconomics.

https://openai.com/research/gpt-4

Yes, it still can’t generalize properly outside its training distribution. However, when armed with feedback and self-reflection, it seems better at that too.


Yup. We (the public) are far from getting access to the full model. That said, one of the commenters in the Twitter thread brings up how OpenAI isn't being fully forthcoming about their methods.

AI Explained has a good summary of many of these topics



