anon373839 9 months ago \| on: Tracing the thoughts of a large language model

I think you’re referring to the DeepSeek-R1 branch of reasoning models, where a small number of SFT reasoning traces are used as a seed. But for non-“reasoning” models, SFT is very important and definitely imparts enhanced capabilities and reliability.