
What I'm getting from this thread is that people have their own private benchmarks. It's almost a cottage industry. Maybe someone should crowdsource those benchmarks, keep them completely secret, and build a new public benchmark out of people's private AGI tests. All they should release for a given model is the final average score.

