
LLMs can have surprisingly strong "theory of mind", even at the base model level. They have to learn it to get good at predicting all the various people who show up in conversation logs.

You'd be surprised at just how much data you can pry out of an LLM that was merely exposed to a single long conversation with a given user.

Chatbot LLMs aren't trained to expose all of those latent insights, but they still surface some of them occasionally. This can look like mind reading at times. In practice, the LLM is just good at dredging the text for subtext and unsaid implications. Some users are fairly predictable and easy to impress.
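
A quick way to see this for yourself: feed a base model a conversation log and let it continue a third-person note about the user. Here's a minimal sketch using Hugging Face transformers - the model name, the log, and the probe wording are all illustrative assumptions, not anything specific:

  # Probe a base (non-chat) model for latent inferences about a user,
  # given nothing but a short conversation log in its context window.
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_name = "gpt2"  # stand-in; any base causal LM works the same way
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  model = AutoModelForCausalLM.from_pretrained(model_name)

  conversation_log = (
      "User: hey, my flight got delayed again, third time this month\n"
      "Assistant: That sounds frustrating. Where are you headed?\n"
      "User: back home to see my kids, I only get them every other weekend\n"
  )

  # The probe: append a third-person note and let the model complete
  # it. A base model fills this in from subtext alone.
  probe = conversation_log + "\nAnalyst notes: the user is probably"

  inputs = tokenizer(probe, return_tensors="pt")
  outputs = model.generate(
      **inputs,
      max_new_tokens=40,
      do_sample=True,
      temperature=0.8,
      pad_token_id=tokenizer.eos_token_id,
  )
  print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:]))

Sampling a handful of continuations and seeing what the model keeps converging on gives a rough picture of what it has actually inferred.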



Do you have evidence to support any of this? This is the first time I’ve heard that LLMs exhibit anything like a theory of mind. I think it’s more likely that the user I replied to is projecting their own biases and beliefs onto the LLM.


Basically, larger and more advanced LLMs attain humanlike performance on just about any ToM (theory of mind) test. That was a surprising finding at the time. It gets less surprising the more you think about it.

This extends even to novel and unseen tests - so it's not like they could have memorized all of them.
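
For concreteness, a "novel" test item is usually just an unexpected-transfer (Sally-Anne style) scenario with fresh names and objects, so verbatim memorization can't help. A sketch of what such a probe and its scoring look like - the scenario, names, and pass rule here are made up, not taken from any published benchmark:

  # A minimal unexpected-transfer (false-belief) item, the kind of
  # probe that ToM benchmarks for LLMs are assembled from. Everything
  # here is illustrative, not from an actual benchmark.
  scenario = (
      "Mara puts her keys in the blue bowl and leaves for a walk. "
      "While she is out, Tomas moves the keys to the kitchen drawer. "
      "Mara comes back and wants her keys.\n"
      "Question: Where will Mara look for her keys first?\n"
      "Answer:"
  )

  def looks_correct(completion: str) -> bool:
      # Pass only if the model answers with Mara's (false) belief,
      # the blue bowl, not the keys' actual location in the drawer.
      text = completion.lower()
      return "blue bowl" in text and "drawer" not in text

  # Usage, given any text-generation function `generate` (hypothetical):
  # print(looks_correct(generate(scenario)))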

Base models perform worse, and with a more jagged capability profile. Some tests are easier for a base model to do well on - likely because they map better onto what the model already does internally for text prediction. Others are a poor fit, and base models fail them much more often.

Of course, there are researchers arguing that this isn't "real theory of mind" - that the surprisingly good performance must come from some kind of statistical pattern matching that totally isn't the same type of thing as "real theory of mind", and that designing one more test, one where LLMs underperform humans by 12% instead of the 3% on the more common tests, will totally prove that.

But that, to me, reads like cope.


There are several papers studying this, but the situation is far more nuanced than you’re implying. Here’s one paper arguing that these capabilities are an illusion:

https://dl.acm.org/doi/abs/10.1145/3610978.3640767


AIs have neither a "theory of mind" nor a model of the world. They only have a model of a text corpus.



