
> The complexity comes from the number of steps and the number of parameters.

Yes, it seems like a transformer model simple enough for us to understand isn't able to do anything interesting, and a transformer complex enough to do something interesting is too complex for us to understand.

I would love to study something in the middle, a model that is both simple enough to understand and complex enough to do something interesting.
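For anyone who wants something concrete to poke at: a rough sketch of about the smallest model that still has the transformer shape is below (plain PyTorch; the class name and dimensions are made up for illustration). It's one attention layer plus embeddings and no MLP, so each head's attention pattern can be read off directly:

    import torch
    import torch.nn as nn

    class TinyAttnOnlyLM(nn.Module):
        # One attention layer, no MLP: each head's pattern is directly inspectable.
        def __init__(self, vocab=1000, d_model=64, n_heads=4, max_len=128):
            super().__init__()
            self.tok = nn.Embedding(vocab, d_model)
            self.pos = nn.Embedding(max_len, d_model)
            self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
            self.unembed = nn.Linear(d_model, vocab, bias=False)

        def forward(self, tokens):  # tokens: (batch, seq)
            n = tokens.shape[1]
            x = self.tok(tokens) + self.pos(torch.arange(n, device=tokens.device))
            # Boolean causal mask: True means "may not attend here".
            causal = torch.triu(torch.ones(n, n, dtype=torch.bool,
                                           device=tokens.device), diagonal=1)
            out, pattern = self.attn(x, x, x, attn_mask=causal,
                                     average_attn_weights=False)
            return self.unembed(x + out), pattern  # pattern: (batch, heads, seq, seq)

    model = TinyAttnOnlyLM()
    logits, pattern = model(torch.randint(0, 1000, (1, 16)))
    print(pattern.shape)  # torch.Size([1, 4, 16, 16])

Models like this can only learn skip-trigram-style statistics (composition across heads needs a second layer), so it may already sit on the "too small to be interesting" side of the tradeoff.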



You might be interested, if you aren't already familiar, in some of the work going on in the mechanistic interpretability field. Neel Nanda has a lot of approachable work on the topic: https://www.neelnanda.io/mechanistic-interpretability


I was not familiar with it, and that does look fascinating, thank you. If anyone else is interested, this guide "Concrete Steps to Get Started in Transformer Mechanistic Interpretability" on his site looks like a great place to start:

https://www.neelnanda.io/mechanistic-interpretability/gettin...
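If I remember right, the getting-started exercises there lean on his TransformerLens library, and the basic loop is only a few lines. A minimal sketch, assuming transformer_lens is installed (the prompt is the standard indirect-object example from his demo notebooks):

    # pip install transformer_lens
    from transformer_lens import HookedTransformer

    # Small pretrained model with hooks exposing every internal activation.
    model = HookedTransformer.from_pretrained("gpt2-small")

    prompt = "When Mary and John went to the store, John gave a drink to"
    tokens = model.to_tokens(prompt)
    logits, cache = model.run_with_cache(tokens)

    # Per-head attention patterns in layer 0: (batch, n_heads, seq, seq);
    # ("pattern", 0) is shorthand for "blocks.0.attn.hook_pattern".
    print(cache["pattern", 0].shape)

    # Greedy next-token prediction.
    print(model.tokenizer.decode(logits[0, -1].argmax().item()))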


I would assume the two ranges overlap the wrong way round: anything in between those extremes is already too complex for a human to properly understand, yet still too small to do anything interesting.



