
> Is this based on analogising LLMs to animal mental capacities, or based on a scientific study of these capacities? ie., is this confirmation bias, or science?

- On how embeddings work;

- On the observation that in a very high-dimensional space you can encode a lot of information in the relative arrangement of things;

- On the observation that the end result (LLMs) is too good at talking and responding like people in a nuanced way for this to be uncorrelated;

- On noticing similarities between embeddings in high-dimensional spaces and what we arrive at when we try to express what we mean by "concept", "understanding" and "meaning", or even how we learn languages and acquire knowledge - there's a strong undertone of defining things in terms of similarity to other things, which are themselves defined the same way (recursively). Naively, it sounds like infinite regress, but it's exactly what embeddings are about (there's a small code sketch right after this list that tries to make this concrete).

- On the observation that the goal function for language model training is, effectively, "produce output that makes sense to humans", in the fully general meaning of that statement. Given constraints on size and compute, this is pressuring the model to develop structures that are at least functionally equivalent to our own thinking process; even if we're not there yet, we're definitely pushing the models in that direction.

- On the observation that most of the failure modes of LLMs also happen to humans, up to and including "hallucinations" - but they mostly happen at the "inner monologue" / "train of thought" level, and we do extra things (like explicit "system 2" reasoning, or tools) to fix them before we write, speak or act.

- And finally, on the fact that researchers have been dissecting and studying the inner workings of LLMs, and have managed to find direct evidence of them encoding concepts and using them in reasoning; see e.g. the couple of major Anthropic studies, in which they demonstrated the ability to identify concrete concepts, follow their "activations" during the inference process, and even control the inference outcome by actively suppressing or amplifying those activations (a crude sketch of that general steering idea follows a few paragraphs below); the results are basically what you'd expect if you believed the "concepts" inside LLMs were indeed concepts as we understand them.

- Plus a bunch of other related observations and introspections, including but not limited to paying close attention to how my own kids (currently 6yo, 4yo and 1.5yo) develop their cognitive skills, and what their failure modes are. I used to joke that GPT-4 is effectively a 4yo that memorized half the Internet, after I noticed that stories produced by LLMs of that time and those of my own kid followed eerily similar patterns, up to and including what happens when the beginning falls out of the context window. I estimated that at 4yo, my eldest daughter had a context window about 30 seconds long, and I could see it grow with each passing week :).
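
To make the embeddings point above a bit more concrete, here's a minimal sketch. The sentence-transformers package and the "all-MiniLM-L6-v2" model are my arbitrary choices (any embedding model would do); the point is just that a term's "meaning" in this representation is nothing but its pattern of similarities to every other term - the recursive "defined by similarity to other things" idea from the list:

  # Minimal sketch, assuming the sentence-transformers package and the
  # "all-MiniLM-L6-v2" model (arbitrary choices; any embedding model works).
  import numpy as np
  from sentence_transformers import SentenceTransformer

  model = SentenceTransformer("all-MiniLM-L6-v2")   # 384-dimensional embeddings

  terms = ["dog", "puppy", "wolf", "carburetor", "justice"]
  vectors = model.encode(terms)                     # shape: (5, 384)

  def cosine(a, b):
      # Cosine similarity: compares directions in the space, ignoring magnitude.
      return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

  # Each term's only "definition" here is how it sits relative to the others.
  for i, t in enumerate(terms):
      sims = sorted(((cosine(vectors[i], vectors[j]), u)
                     for j, u in enumerate(terms) if j != i), reverse=True)
      print(t, "->", [(u, round(s, 2)) for s, u in sims])

You should see "dog" land much closer to "puppy" and "wolf" than to "carburetor"; nothing symbolic is stored anywhere, just relative positions in a high-dimensional space.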

That, in a gist, is what adds up to my current perspective on LLMs. It might not be hard science, but I find a lot of things pointing in the direction of us narrowing down on the core functionality that also exists in our brains (though not the whole thing, obviously) - and very little that points otherwise.

(I actively worry that my mental model might be too "wishy-washy", letting me interpret anything in a way that fits it. So far, I haven't noticed any warning signs, but I did notice that none of the quirks or failure modes feel surprising.)
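
Coming back to the Anthropic interpretability point from the list: below is a crude sketch of the general "locate a concept as a direction in activation space, then amplify or suppress it" idea. To be clear about what's mine here: Anthropic's studies identify features with sparse autoencoders inside Claude; this snippet is the much simpler contrastive "steering vector" variant of the same idea, and GPT-2 small, the layer index and the scale are arbitrary choices so it runs anywhere:

  # Rough sketch of activation steering on GPT-2 small -- not Anthropic's
  # method, just the same underlying idea: treat a "concept" as a direction
  # in activation space and push the model along it during inference.
  import torch
  from transformers import GPT2LMHeadModel, GPT2Tokenizer

  tok = GPT2Tokenizer.from_pretrained("gpt2")
  model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
  LAYER, SCALE = 6, 8.0   # arbitrary illustration values

  def mean_hidden(text):
      # Mean hidden state after block LAYER -- a crude stand-in for "the concept".
      ids = tok(text, return_tensors="pt")
      with torch.no_grad():
          out = model(**ids, output_hidden_states=True)
      return out.hidden_states[LAYER + 1].mean(dim=1).squeeze(0)

  # Direction pointing from a neutral sentence towards the target concept.
  direction = mean_hidden("The ocean, the sea, waves and salt water.") \
            - mean_hidden("The weather today is completely ordinary.")
  direction = direction / direction.norm()

  def steer(module, inputs, output):
      # The block returns its hidden states first (possibly inside a tuple);
      # add the concept direction to the residual stream at every position.
      if isinstance(output, tuple):
          return (output[0] + SCALE * direction,) + output[1:]
      return output + SCALE * direction

  prompt = tok("My favourite thing to think about is", return_tensors="pt")
  handle = model.transformer.h[LAYER].register_forward_hook(steer)
  try:
      steered = model.generate(**prompt, max_new_tokens=30, do_sample=False)
  finally:
      handle.remove()
  print(tok.decode(steered[0]))   # completion gets nudged towards the concept

Cranking SCALE up (or flipping its sign) biases completions towards (or away from) the injected concept regardless of the prompt, which is the same qualitative effect those studies demonstrate at a much finer grain.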

--

I'm not sure if I got your videogame analogy the way you intended, but FWIW, we also learn and experience lots of things indirectly; the whole point of language and communication is to transfer understanding this way - and a lot of information is embodied in the larger patterns and structures of what we say (or don't say) and how we say it. LLM training data is not random; it's highly correlated with human experience, so the information needed for a general understanding of how we think and perceive the world is encoded there implicitly, and at least in theory the training process will pick up on it.

--

I don't have a firm opinion on some of the specifics you mention, just a couple of general heuristics/insights that tell me it could be possible we've narrowed down on the actual thing our own minds are doing:

1. We don't know what drives our own mental processes either. It might be that we discover LLMs are "cheating", but we might also discover they're converging on the same mechanisms/structures our own minds use. I don't have any strong reason to assume the former over the latter, because we're not designing LLMs to cheat.

2. Human brains are evolved, not designed. They're also the dumbest possible design evolution could arrive at - we're the first species to cross the threshold after which our knowledge-based technological evolution outpaced natural evolution by orders of magnitude. Everything we've achieved to date, we did with a brain that was nature's first prototype that worked.

3. Given the way evolution works - small, random, greedy increments that have to be incrementally useful at every step - it stands to reason that whatever the fundamental workings of a mind are, they must not be that complicated, and they can be built up incrementally through greedy optimization. Humans are living proof of that.

4. (most speculative) It's unlikely that there are multiple alternative implementations of thinking minds that are very different from each other, yet all equally easy to reach through a random walk, and that evolution just picked one of those and ran with it. It's more likely that, when we get to that point (we might already be there), we'll find the same computational design nature did. But even if not, diffing our solution against nature's will tell us much about ourselves.



> On the observation that most of the failure modes of LLMs also happen to humans

That's assuming that LLMs operate according to how we read their text. What you're doing is reading LLM chain-of-thought as if it were said by a human, and imputing the capacities that would be implied if a human had said it. But this is almost certainly not how LLMs work.

LLMs are replaying "linguistic behaviour", which we take, often accurately, to be dispositive of mental states in people. They are not evidence of mental capacities and states in LLMs, for seemingly obvious reasons. When a person says, "I am hungry", it is, in veridical cases, caused by their hunger. When an LLM says it, the cause is something like, "responding appropriately, according to a history of appropriate use of such words, on the occasion of a prompt which would, in ordinary historical cases, give this response".

The reason an LLM generates a text prima facie never involves any of the associated capacities which would have been required for that text to have been written in the first place. Overcoming this leap of logic requires vastly more than "it seems to me".

> On how embeddings work

The space of necessary capacities is not exhausted by "embedding", by which you mean a (weakly) continuous mapping of historical exemplars into a space. E.g., logical relationships, composition, recursion, etc. are not mental capacities which can be implemented this way.

> We don't know what drives our own mental processes either.

Sure we do. At the level of enumerating mental capacities, their operation and so on, we can give very exhaustive lists. We do not know how even the most basic of these is implemented biologically, save, I believe, that we can say quite a lot about how properties of complex biological systems generically enable this.

But we have a lot of extremely carefully designed experiments to show the existence of relevant capacities in other animals. None of these experiments can be used on an LLM because, by design, any experiment we would run would immediately reveal the facade: any measurement of the GPU running the LLM, and of its environmental behaviour, shows a total empirical lack of anything which could be experimentally measured.

We are, by the charlatan's design, only supposed to use token-in/token-out as "measurement". But this isn't a valid measure, because LLMs are constructed on historical cases of linguistic behaviour in people. We know, prior to any experiment, that the one thing designed to be a false measure is the linguistic behaviour of the LLM.

It's as if we had constructed a digital thermometer to always replay historical temperature readings -- we know, by design, that these "readings" are therefore never indicative of any actual capacity of the device to measure temperature.



