Hacker News

The same author thought there would be no scaling walls: https://stochasm.blog/posts/scaling_post/


Scaling hit a wall on all three axes: model size, compute, and data. Make the model too large, or too compute-hungry, and it becomes too expensive to use. As for the dataset: we benefited in one go from multiple decades of content accumulated online, but since late 2022 only about three years have passed. Organic text doesn't keep growing exponentially past that point; the one-time windfall only got us to roughly 50T tokens.
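The dataset point can be made concrete with back-of-envelope arithmetic using the comment's own figures (~50T tokens, "multiple decades" of accumulation, ~3 years since late 2022). The 25-year accumulation window and the linear production rate are illustrative assumptions, not claims from the comment:

```python
# Back-of-envelope sketch of the dataset-wall argument.
# Assumed, illustrative numbers: ~50T organic tokens (from the comment)
# accumulated over ~25 years ("multiple decades"), with roughly
# constant token production per year (an assumption).
STOCK_TOKENS_T = 50.0    # trillions of organic tokens at the late-2022 cutoff
ACCUMULATION_YEARS = 25  # assumed span of online content accumulation
YEARS_SINCE = 3          # late 2022 to now, per the comment

tokens_per_year = STOCK_TOKENS_T / ACCUMULATION_YEARS  # ~2T/year
new_tokens = tokens_per_year * YEARS_SINCE             # ~6T
growth = new_tokens / STOCK_TOKENS_T                   # ~12% of the corpus

print(f"~{new_tokens:.0f}T new tokens, ~{growth:.0%} corpus growth")
# → ~6T new tokens, ~12% corpus growth
```

Under these assumptions, three more years adds only about a tenth of the existing corpus, whereas each prior generation of models could draw on the entire accumulated stock at once.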



