
OpenAI is taking a position similar to saying that if you sell a cookbook, people are not allowed to teach the recipes to their kids or make better versions of them.

That is absurd.

Copyright law is designed to strike a balance between two interests. On the one hand, the creator's personality that's baked into the specific form of expression; on the other hand, society's interest in ideas being circulated, improved, and combined for the common good.

OpenAI built on the shoulders of almost every person who wrote text on a website, authored a book, or shared a video online. Now others build on the shoulders of OpenAI. How can the former be legal but not the latter?

Can’t have it both ways, Sam.

(IAAL, for what it’s worth.)



The stuff about copyright seems irrelevant.

OpenAI's future investments -- billions -- were just threatened to be undercut by several orders of magnitude by a competitor. It's in their best interests to cast doubt on that competitor's achievements. If they can do so by implying that OpenAI is in fact the source of most of DeepSeek's performance, then all the better.

It doesn't matter whether there's a compelling legal argument around copyright, or even if it's true that they actually copied. It just needs to be plausible enough that OpenAI can make a reasonable case for continuing investment at the levels it's historically attained.

And plausibility is something they've handily achieved with this announcement -- the sentiment on HN at least is that it is indeed plausible that DeepSeek trained on OpenAI. Which means there's now doubt that a DeepSeek-level model could be trained without making use of OpenAI's substantial levels of investment. Which is the only thing that OpenAI should be caring about.


> It's in their best interests to cast doubt on that competitor's achievements.

It is, but second-order reasoning says that if they are trying to cast doubt, it means they've got nothing better to offer, and casting doubt is the only move they have.

If I were an investor in OpenAI, this should be very scary, as it simply means I've overvalued it.


> It is, but second-order reasoning says that if they are trying to cast doubt, it means they've got nothing better to offer, and casting doubt is the only move they have.

This implies that when doubt is cast, the doubt is always false; if the doubt here is true, then it is a good offer.


If the doubt were true, it wouldn't be a doubt.


Something is true whether or not you doubt it; you then confirm your doubt as true or prove it false.

Commonly the phrase "sowing doubt" is used to imply that an argument someone has made is false, but that was evidently not what the parent poster meant, though it is probably how the comment I replied to interpreted it.

on edit: I believe what the parent poster meant is that whether or not OpenAI/Altman believes the doubts expressed, they are pretty much constrained to cast some doubt while they do whatever else they are planning in order to deal with the situation. From the outside we can't know whether they believe it or not.


DeepSeek is a card trick. They came up with a clever way to do multi-head attention; the rest is fluff. Janus-Pro-7B is a joke. It would have mattered a year ago, but it's also just a poor imitation of what's already on the market, especially when they've obfuscated that they're using a discrete encoder to downsample image generation.
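(For the curious: the "clever way to do multi-head attention" is multi-head latent attention, whose core trick is caching a small low-rank latent instead of the full K/V tensors. Here's a toy PyTorch sketch of just that idea; the dimensions and names are illustrative, not DeepSeek's actual architecture, which among other things handles rotary embeddings separately.)

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class LatentKVAttention(nn.Module):
        # Toy illustration of low-rank KV compression, not DeepSeek's real MLA.
        def __init__(self, d_model=512, n_heads=8, d_latent=64):
            super().__init__()
            self.n_heads, self.d_head = n_heads, d_model // n_heads
            self.q_proj = nn.Linear(d_model, d_model)
            self.kv_down = nn.Linear(d_model, d_latent)  # only this latent gets cached
            self.k_up = nn.Linear(d_latent, d_model)     # expanded back to per-head K...
            self.v_up = nn.Linear(d_latent, d_model)     # ...and V at attention time
            self.out = nn.Linear(d_model, d_model)

        def forward(self, x):  # x: (batch, seq, d_model)
            b, t, _ = x.shape
            c_kv = self.kv_down(x)  # the entire "KV cache": d_latent floats per token
            q, k, v = self.q_proj(x), self.k_up(c_kv), self.v_up(c_kv)
            heads = lambda z: z.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
            y = F.scaled_dot_product_attention(heads(q), heads(k), heads(v), is_causal=True)
            return self.out(y.transpose(1, 2).reshape(b, t, -1))

With these illustrative numbers the cache per token shrinks from 2 x 512 = 1024 floats (full K and V) to 64, which is the whole point of the trick.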


Like most illusions, if you can't tell the difference between the fake and the real, they're both real.


> It is, but second-order reasoning says that if they are trying to cast doubt, it means they've got nothing better to offer, and casting doubt is the only move they have.

I don't think this argument works, because the moves available to them are not mutually exclusive: they can cast doubt and still have something better to offer.


Even if that narrative is true, they were still undercut by DeepSeek. Maybe DeepSeek couldn't have succeeded without o1, but then it should have been even easier for OpenAI to do what DeepSeek did, since they have better access to o1.


This argument would excuse many kinds of intellectual property theft. "The person whose work I stole didn't deserve to have it protected, because I took their first draft and made a better second draft. Why didn't they just skip right to the second draft, like me?"


If DeepSeek "stole" from OpenAI, then OpenAI stole from everyone who ever contributed anything accessible on the internet.

I just don't see how OpenAI makes a legitimate copyright claim without stepping on its entire business model.


Isn't this precisely how so many open-source LLMs caught up with OpenAI so quickly: they could just train on actual ChatGPT output?
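In sketch form, that distillation loop is almost embarrassingly simple. Here's a toy illustration (the gpt2 student and the single hand-written prompt/completion pair are stand-ins, not anyone's actual pipeline):

    # Toy sketch of API-output distillation: fine-tune a small "student"
    # on completions collected from a stronger model's API.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    student = AutoModelForCausalLM.from_pretrained("gpt2")
    opt = torch.optim.AdamW(student.parameters(), lr=1e-5)

    # (prompt, teacher_completion) pairs scraped from the teacher
    pairs = [("Explain the TCP handshake.",
              " A TCP connection opens with SYN, SYN-ACK, then ACK.")]

    student.train()
    for prompt, completion in pairs:
        ids = tok(prompt + completion, return_tensors="pt").input_ids
        loss = student(input_ids=ids, labels=ids).loss  # next-token loss on teacher text
        loss.backward()
        opt.step()
        opt.zero_grad()

Scale that up to a few hundred thousand high-quality completions (ideally masking the loss on the prompt tokens) and you've captured a good chunk of the teacher's expensive post-training for the price of API calls.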


I’d take this argument more seriously if there weren’t billboards advocating hiring AI employees instead of human employees.

Sure, OpenAI invested billions banking on the livelihoods of everyday people being replaced, or as Sam says, "a renegotiation of the social contract."

So as an engineer being targeted by Meta and Salesforce under the "not hiring engineers" plan, all I have to say to OpenAI is: welcome to the social contract renegotiation table.


"It doesn't matter whether there's a compelling legal argument around copyright, or even if it's true they actually copied."

Indeed, it doesn't matter when the alleged infringer is outside US jurisdiction and not violating any local laws in the country where it's domiciled.

The fact that Microsoft cannot even get this app removed from "app stores" tells us all we need to know.

It will be OpenAI and others who will be copying DeepSeek.

Some of us would _love_ to see Microsoft try to assert copyright over an LLM. The question might not be decided in their favour, casting a spectre over all their investment. It is not a risk worth taking.

Anyone remember this one: https://en.wikipedia.org/wiki/Microsoft_Corp._v._Zamos


>It just needs to be plausible enough that OpenAI can make a reasonable case for continuing investment at the levels it's historically attained

>there's now doubt that a DeepSeek-level model could be trained without making use of OpenAI's substantial levels of investment.

But, this still seems to be a problem for OpenAI. Who wants to invest "substantially" in a company whose output can be used by competitors to build an equal or better offering for orders of magnitude less?

Seems they'd need to make that copyright stick. But, that's a very tall and ironic order, given how OpenAI obtained its data in the first place.

There's a scenario where this development is catastrophic for OpenAI's business model.


>There's a scenario where this development is catastrophic for OpenAI's business model.

Is there a scenario where it isn’t?

Either (1) a competitor is able to do it better without their work or (2) a competitor is able to use their output and develop a better product.

Either way, given the costs, how do you justify investing in OpenAI if the competitor is going to eat their lunch and you’ll never get a return on your investment?


The scenario to which I was alluding assumed the latter (2) and, further, that OpenAI was unable to prevent that—either technically or legally (i.e. via IP protection).

More specifically, on the legal side I don't see how they can protect their output without stepping on their own argument for ingesting everyone else's. And, if that were to indeed prove impossible, then that would be the catastrophic scenario.

On your point (1), I don't think that's necessarily catastrophic. That's just good old-fashioned competition, and OpenAI would have to simply best them on R&D.


OpenAI is going out of their way to demonstrate that they will willingly spend investors' money to the tune of hundreds of billions of dollars, only to then enable hundreds of derivative competitors that can be launched at a fraction of the cost.

Basically, in a roundabout way, OpenAI is going back to their roots and then some: they're something between a charity and Robin Hood, stealing the money of rich investors and giving it to poor and aspirational AI competitors.


Homogeneous systems kill innovation; with that in mind, I guess it's a good thing DeepSeek disregards licenses? Seems like sledding down an icy slope: slippery. And they suck.


I thought AI couldn't violate licenses? Isn't that the premise?


As another attorney, I would impart some more wisdom:

"Karma's a bitch, ain't it."


I quite agree. The NY Times must be feeling a lot of schadenfreude right now.


You can't copyright AI-generated works. OpenAI is barking up the wrong tree.


They're not making a legal claim; they're trying to establish provenance over DeepSeek in the public eye.


Yes; and trying to justify their own valuation by pointing out that DeepSeek cost more than advertised to create, if you count the cost of creating OpenAI's model.


Though I also think it’s extremely bad for OpenAI’s valuation.

If you give me $500B to train the best model in the world, and then a couple of people at a hedge fund in China can use my API to train a nearly equal model for a tiny fraction of what I paid, then it appears outrageously foolish to build new frontier models.

The only financial move that makes sense is to wait for someone else to burn hundreds of billions building a better model, and then clone it. OpenAI primarily exists to do one of the most foolish things you can possibly do with money. Seems like a really bad deal for investors.


Fortunately everyone who gave OpenAI money did it to further their stated mission of bettering humanity, and not for the chance at any financial gain.


At one time they were nonprofit by choice. Now they are nonprofit not by choice.


I am very doubtful that is why MS gave them so much.


I suspect (and hope) that this is satire of the claims about OpenAI's nonprofit goals and complicated legal structure.


Indeed those donors must be elated.



As it turns out, being first to market only matters if you can actually build a real moat, which OpenAI presently has no chance of doing.


Like going to the Moon.


If that's the case, then nearly every software company should be counting the cost of Linus Torvalds's development work.


It still doesn’t justify their valuation because it shows that their product is unprotectable.

In fact, I'd argue this is even worse, because no matter how much OpenAI improves their product (and Altman is prancing around claiming to need $7 trillion to improve it), someone else can replicate it for a few million.


> It still doesn't justify their valuation because it shows that their product is unprotectable.

First-mover advantage doesn't always have to pay off in the marketplace. FedEx probably has to schedule extra flights between SF and DC just to haul all of OpenAI's patent applications.

I suspect that it's going to end up like the early days of radio, when everybody had to license dozens of key patents from RCA ( https://reason.com/2020/08/05/how-the-government-created-rca... ). For the same reason, Microsoft is reputed to make more money from Android licenses than Google does.


Those patents won't do much to protect them from competitors abroad.


The public doesn't give a shit about any of these companies. There are no moats or loyalties in this space, just individuals and corporations looking for the cheapest tool that gets the job close to done.

OpenAI spent investor money to enable random Chinese AI startups to offer a better version of its own product at a fraction of the cost. In some ways this conclusion was inevitable, but I find the way we arrived at it particularly enjoyable to watch play out.


Is it predictable that they would seek to establish provenance? Sure.

Is it our job as a thinking public to decry it? Also sure. In fact, wildly yes.


If this is their goal, R1 is on par with the $200-a-month model. Most people don't give a shit.


>Most people don't give a shit.

I think it's more accurate to say most people can't comprehend (and don't care about) big monetary figures.

As far as Joe Average is concerned, ChatGPT cost $OoomphaDuuumpha and DeepSeek cost $RuuunphaBuuunpha. The only thing Joe Average will care about is the bill he gets after using it himself.


That's what I mean: Joe Average is going to go with free over $2,400 a year.


They use their terms of service as both a sword and a shield here. It's a little bit ridiculous.


Recipes, that is, lists of ingredients and preparation instructions, are specifically uncopyrightable. Perhaps that’s why they used it as an example.


Seriously. OpenAI consciously stole nearly all of The NY Times's news content. Everything about the company is shady.

Sam should focus on the product instead of trying to out-jerk Elon and his buddies.


Just to play devil's advocate, OAI can argue that they spent great effort creating and procuring annotated data. Such datasets are indeed their secret, and now DS gets them for free by distilling OAI's output. Besides, OAI's EULA explicitly forbids users from using the output of their API for model training. I'm not saying that OAI is right, of course. Just to present OAI's point of view.


This is an incomplete version of OpenAI’s point of view.

OpenAI has a legally submitted point of view that they believe the benefits of AI to humanity are so great that anyone creating AI should be allowed to trample all over copyright laws, Terms of Use, EULAs, etc.

But OpenAI’s version of benefit to humanity is that they should be allowed to trample over those laws so they can benefit humanity by closely guarding the output of trampling those laws and charging humanity an access fee.

Even if we accept all of OpenAI's criticisms of DeepSeek, they're arguing that DeepSeek doing the exact same thing, but releasing the output for free for anyone to use, is somehow less beneficial to humanity.


This goes back to my previous criticism of OAI: Stratechery said that Altman's greatest crime is seeking regulatory capture. I think that's spot on. Altman portrays himself as a visionary leader, a messiah of the AI age. Yet when the company was still small and progress in AI had just gotten started, his strategic move was to suffocate innovation in the name of AI safety. For that, I question his vision, motive, and leadership.


So a bank robber that manages to steal from Fort Knox gets to keep the gold bars because it was a very complicated job?


If the Fort Knox gold was originally stolen from the Incas


Feist comprehensively rejected that argument under US copyright law, and the attempts in the late 90s to pass a law in response establishing a sui generis prohibition on copying databases also failed in the US. The EU did adopt a directive to that effect, which may be why there are no significant European search engines.

However, OpenAI and Google are far more politically influential than the lobbyists of the 90s, so this time such a push is likely to succeed.


They aren’t actually getting the dataset though.


I am not a lawyer (I'm UK-based). You are a lawyer (probably local).

My understanding is that legal positions and arguments (within common law) need not be consistent across "cases" - they are considered in isolation with regard to the body of law extant at the time.

I think that Sam can quite happily argue two differing points of view to two courts. Until a judgement is made, those arguments are simply arguments and not "binding" or even "influential elsewhere" or whatever the correct terms are.

I think he can legitimately argue both ways but may not have it both ways.


> I think he can legitimately argue both ways

It would make sense that, if a trial comes up, all these arguments Sam Altman made for the other side would count against him and OpenAI.


OpenAI is taking a position similar to saying that if you sell a cookbook, people are not allowed to copy the recipes into their own book and claim they did it all on their own.


Nobody is copying their model parameters or inference code.

What people “suck out” of their API are the general ideas. And they do it specifically so they can reassemble them in their own way.

It’s like reading all the Jack Reacher novels and then creating your own hero living through similar situations, but with a different name.

You’ll read it and you’ll say, dang, that situation/metaphor/expression/character reminds me of that Reacher novel. But there’s nothing Lee Child can do about it.

And that’s perfectly fine. Because he himself took many of his ideas from others, like Le Carré.

It’s the Eternal Tao.



