Actually the "our IP" argument is ridiculous. What they are doing is stealing data from all over the web, without people's consent for that data to be used in training ML models. If anything, "Open"AI should be sued and forced to publish their whole product. People should demand to know exactly what is going on with their data.
Another unresolved issue is how they will ever comply with a deletion request, should a model output someone's personal data. They are deep in a gray area with regard to what should be allowed. If anything, they should really shut up now.
They can still have IP while using copyrighted training materials - the actual model source code.
But DeepSeek presumably didn't use that (since it's secret). They definitely can't argue that using copyrighted material for training is fine, but using output from other commercial models isn't. That's too inconsistent.
> Only works with human authors can receive copyrights, U.S. District Judge Beryl Howell said[1]
IANAL but it seems to me that OpenAI wouldn’t be able to claim their outputs are IP since they are AI-generated.
It may be against their TOS, meaning OpenAI could refuse to provide service to DeepSeek in the future, but they can’t really sue them.
Did OpenAI ask all of the authors of the works they ingested to train their model for permission? Is OpenAI the biggest copyrighted works launderer in existence?
I don't think OpenAI should be able to make any IP claims on the AI-generated outputs, since those are based on other people's work, some of it copyrighted, which they keep hidden. They simply throw algorithms at data that was never theirs to begin with.
If I steal something, keep the exact thing I stole hidden, and sell a product that I could only have made based on the stolen thing, how can I expect that to be legal, let alone untouchable IP?
I think way too many people have dollar signs in their eyes. The whole thing is outrageous. If they could transparently prove that they use open data sets and adhere to licenses, then they would get to claim IP.
> [OpenAI] definitely can't argue that using copyrighted material for training is fine, but using output from other commercial models isn't. That's too inconsistent.
Well, they can argue that, if they're fine with being hypocrites.
If there's any litigation, a counterclaim would be interesting. But DeepSeek would need to partner with parties that have been damaged by OpenAI's scraping.
I'm getting popcorn ready for the trial where an apparatus of the Chinese Communist Party files a counterclaim in an American court together with the common people - millions of John Does - as litigants against an organization that has aggressively, and in many cases oppressively, scraped their websites (DDoS).
I would definitely pay for seeing that movie! Especially if it led to greedy tech giants becoming very careful about what data they gather and ingest for training of ML models.