> it commits to that word, without knowing what the next word is going to be
Sounds like you may not have read the article, because it's exploring exactly that relationship and how LLMs will often have a 'target word' in mind that they're working toward.
Further, that's partially the point of thinking models: giving LLMs space to output tokens that they don't have to commit to in the final answer.
That makes no difference. At some point it decides that it has predicted the word, and outputs it, and then it will not backtrack over it. Internally it may have predicted some other words and backtracked over those. But the fact is, it accepts a word without being sure what the next one will be, or the one after that, and so on.
Externally, it manifests the generation of words one by one, with lengthy computation in between.
It isn't ruminating over, say, a five word sequence and then outputting five words together at once when that is settled.
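Mechanically, vanilla decoding really is just that loop: one forward pass, one token appended, never removed. Here's a minimal sketch, using greedy decoding and GPT-2 via the Hugging Face transformers library purely as stand-ins (the prompt and the token count are arbitrary):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # GPT-2 is used here only as a small stand-in for "an LLM".
    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    ids = tok("The capital of France is", return_tensors="pt").input_ids

    with torch.no_grad():
        for _ in range(5):
            logits = model(ids).logits          # one forward pass over the whole prefix
            next_id = logits[0, -1].argmax()    # pick the single most likely next token
            print(tok.decode(next_id.item()))   # the token is emitted here...
            ids = torch.cat([ids, next_id.view(1, 1)], dim=1)  # ...and appended for good;
                                                               # the loop never removes it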
> It isn't ruminating over, say, a five word sequence and then outputting five words together at once when that is settled.
True, and it's a good intuition that some words are much more complicated to generate than others and should require more computation. For example, if the user asks a yes/no question, ideally the answer should start with "Yes" or with "No", followed by some justification. To compute that first token, the model can only do a single forward pass and must decide the path to take.
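To make that concrete: in the plain setup, the Yes/No decision is entirely fixed by the distribution that comes out of that one forward pass over the prompt. A sketch, again with GPT-2 via transformers as a stand-in and a made-up prompt:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    # Hypothetical yes/no prompt; the phrasing is invented for the example.
    prompt = "Question: Is 91 a prime number? Answer:"
    ids = tok(prompt, return_tensors="pt").input_ids

    with torch.no_grad():
        dist = torch.softmax(model(ids).logits[0, -1], dim=-1)  # one forward pass

    yes_id = tok.encode(" Yes")[0]
    no_id = tok.encode(" No")[0]
    print("P(' Yes') =", dist[yes_id].item(), " P(' No') =", dist[no_id].item())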
But this is precisely why chain-of-thought was invented, and later "reasoning" models. These take it "step by step" and generate a sort of stream-of-consciousness monologue where each word follows more smoothly from the previous ones, rather than abruptly pinning down a Yes or a No right away.
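That is the whole trick, computationally: every intermediate token is one extra forward pass whose result the final answer token gets to condition on. A sketch of the prompted version (again GPT-2 via transformers as a stand-in; a base model like this will ramble rather than genuinely reason):

    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

    # Illustrative chain-of-thought prompt; the wording is made up and is not
    # any model's official reasoning format.
    prompt = (
        "Question: Is 91 a prime number?\n"
        "Let's think step by step before answering Yes or No.\n"
    )
    ids = tok(prompt, return_tensors="pt").input_ids

    # Each intermediate token generated here costs one more forward pass,
    # i.e. extra computation spent before the final Yes/No is committed.
    out = model.generate(ids, max_new_tokens=60, do_sample=False,
                         pad_token_id=tok.eos_token_id)
    print(tok.decode(out[0][ids.shape[1]:]))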
LLMs are an extremely well researched space where armies of researchers, engineers, grad and undergrad students, enthusiasts, and everyone in between have been coming up with all manner of ideas. It is highly unlikely that you can easily point to some obvious thing they missed.