
This is extremely interesting: the authors look at features of LLM output (like writing poetry or doing arithmetic), form hypotheses about the internal strategies used to achieve those results, and then experiment to test these hypotheses.

I wonder if there is an explanation somewhere of how the logical operations performed on a dataset end up producing these behaviors?



They also show how these behaviors differ when the language models are made larger.




