Fwiw, I assumed they were using o1 to train. But it doesn't matter: the big story here is that massive compute resources are unlikely to be as important in the future as we thought. It cuts the legs out from under Stargate etc. just as it's announced. The CCP must be highly entertained by the timeline.
That's what I assumed too. It's the narrative that's been running through all the major news outlets.
It didn't even occur to me that DeepSeek could have been training their models on the output of other models until I read this article.