ChatGPT outputs are all over the internet. It is hard to prove that DeepSeek specifically used o1 for training, rather than a lot of ChatGPT output simply ending up in the training set from other sources.
That's a good point, at least for the prompts I saw. For example, "do you have an app I can use" commonly shows up online next to "here's the ChatGPT app". And maybe they don't add anything telling DeepSeek that it's DeepSeek.