
It seems like chain-of-thought works pretty well when backtracking isn't needed: the model can look up or guess the first step, and that gives it enough context to look up or guess the second, and so on.

(This can be helpful for people too.)

If it goes off track, though, it might have trouble recovering.

(And that's sometimes true of people too.)
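Here's a toy sketch of that failure mode in plain Python (not any real model API, and the train-schedule numbers are just made up for illustration): each step only builds on the step before it, so one wrong early guess gets carried forward and nothing downstream ever revisits it.

    # Toy illustration: each step trusts the previous step unconditionally,
    # so a bad guess at step 1 propagates through the rest of the chain.

    def solve_step_by_step(departure_min, trip_min, hour_step_guess):
        # Step 1: add the whole hours (this is where a bad guess can creep in).
        after_hours = departure_min + hour_step_guess
        # Step 2: add the leftover minutes, trusting step 1 unconditionally.
        return after_hours + (trip_min % 60)

    departure = 14 * 60 + 15   # a 2:15 pm departure, in minutes
    trip = 110                 # a 1 h 50 min trip

    good = solve_step_by_step(departure, trip, 60)  # right first step
    bad = solve_step_by_step(departure, trip, 70)   # wrong first step

    print(divmod(good, 60))  # (16, 5)  -> 4:05 pm
    print(divmod(bad, 60))   # (16, 15) -> 4:15 pm; the error just rides along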

I wonder if LoRA fine-tuning could be used to help the model detect when it's stuck, backtrack, and try another approach? Fine-tuning worked pretty well for teaching it to follow instructions.
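Rough sketch of what that could look like with Hugging Face's peft library. The LoRA wiring is real; the "stuck transcript, then backtrack" training data is entirely hypothetical, and "gpt2" is just a stand-in model name.

    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    base = "gpt2"  # stand-in; the real chat model would go here
    tokenizer = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base)

    # LoRA trains only small low-rank adapters on chosen weight matrices,
    # so the fine-tune is cheap compared to retraining the whole model.
    lora = LoraConfig(
        r=8,
        lora_alpha=16,
        target_modules=["c_attn"],  # module names depend on the architecture
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()

    # Hypothetical training data: transcripts that have gone off the rails,
    # each paired with a target continuation along the lines of
    # "wait, step 2 looks wrong -- back up and try a different approach."
    # Training itself would be an ordinary causal-LM loop over those pairs.

The appeal of LoRA for this is that the adapters are tiny, so you could experiment with this kind of behavioral nudge without touching the base weights.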

For now, it seems like it's up to the person chatting to notice that it's going the wrong way.


