There are a bunch of other fully open models, including the [Marin](https://marin.community/) series out of Stanford, and Nvidia regularly releases fully open models as well.
(I’m a researcher on the post-training team at Ai2.)
7B models are mostly useful for local use on consumer GPUs. 32B could be used for a lot of applications. There are a lot of companies using fine-tuned Qwen 3 models that might want to switch to Olmo now that we have released a 32B base model.
May I ask why you went for 7B and 32B dense models instead of a small MoE like Qwen3-30B-A3B or gpt-oss-20b, given how successful those MoE experiments have been?
MoEs have a lot of technical complexity and aren't well supported in the open source world. We plan to release a MoE soon(ish).
I do think that MoEs are clearly the future, and we will release more of them moving forward once we have the tech in place to do so efficiently. For every use case except local usage, MoEs are superior to dense models.
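(Not from the Olmo codebase, just a toy illustration of why the "A3B" in Qwen3-30B-A3B matters: a top-k routed MoE layer only runs a couple of experts per token, so the active parameter count is a small fraction of the total. All of the dimensions and the routing scheme below are made-up assumptions.)

```python
import torch
import torch.nn as nn

class ToyMoE(nn.Module):
    """Toy top-k mixture-of-experts FFN. Each token is routed through only k of
    the n_experts feed-forward blocks, which is why a 30B-total-parameter MoE
    can have roughly 3B active parameters per token."""
    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.router = nn.Linear(d_model, n_experts)
        self.k = k

    def forward(self, x):                        # x: (num_tokens, d_model)
        gates = self.router(x).softmax(dim=-1)   # routing probabilities per token
        weights, idx = gates.topk(self.k, dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e         # tokens whose slot-th pick is expert e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(10, 64)
print(ToyMoE()(x).shape)  # torch.Size([10, 64])
```

The technical complexity mentioned above is everything this toy skips: load-balancing losses, expert parallelism, capacity limits, and fast inference kernels.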
The 7B runs on my Intel MacBook Pro. There's a broad practical application served here: developers can figure out a project on their own hardware, which improves the time/cost/effort economy, before committing to a bigger model for the same project.
Are there quantized (e.g. 4-bit) models available yet? I assume the training was done in BF16, and it seems like most models are distributed for inference in BF16 until they're quantized.
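For local use in the meantime, BF16 checkpoints can usually be quantized on the fly at load time. A rough sketch with Hugging Face transformers plus bitsandbytes, assuming an earlier OLMo 2 checkpoint as a stand-in for whichever new release you mean (bitsandbytes needs a CUDA GPU, so on a Mac you'd want a GGUF quant with llama.cpp instead):

```python
# Rough sketch: on-the-fly 4-bit loading of a BF16 checkpoint with bitsandbytes.
# The model id below is an older OLMo 2 release used as a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "allenai/OLMo-2-1124-7B"  # placeholder; swap in the new release

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=quant_config,
    device_map="auto",
)

inputs = tokenizer("Fully open models are useful because", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=40)[0]))
```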
This is why I will never work somewhere with a short post-termination exercise period (PTEP). If it’s not at least 5 years, ideally 10, they don’t seriously consider equity something that employees are owed.
Can you explain? In most cases, preferences won’t come into play, assuming you raise at a standard 1x preference and sell for more than you have raised. In that case, owning 0.5% should roughly translate into $5M (modulo dilution).
There are plenty of valid scenarios where the company sells for a lot, but less than it raised. And 1x preferences are no longer standard post-ZIRP, afaik.
People are often not aware that the value of common is nonlinear, so the value of 0.5% in this case is zero. (For the ML fans out there, the common price per share has one or more ReLU activation layers. :) )
Even with 1x preferences, the company might have raised $2 billion but sell for $1 billion because the investors don't want to take any further losses.
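A toy waterfall makes the nonlinearity concrete (this ignores per-series conversion math, participation, option pools, and every other real cap-table wrinkle; the numbers are just the ones from this thread):

```python
def common_payout(sale_price, preference_stack, ownership):
    """Toy model of a 1x non-participating preference.
    If the sale clears the preference stack, preferred converts and everyone
    shares pro rata; if not, preferred soaks up the proceeds and common gets 0.
    Real cap tables are messier (per-series decisions, participation, pools)."""
    if sale_price <= preference_stack:
        return 0.0                      # the flat part of the "ReLU" mentioned above
    return ownership * sale_price       # simplification: assume full conversion

# Raised $2B, sold for $1B: 0.5% of common is worth nothing.
print(common_payout(1_000_000_000, 2_000_000_000, 0.005))   # 0.0

# Raised $200M, sold for $1B: the same 0.5% is roughly the $5M people expect.
print(common_payout(1_000_000_000, 200_000_000, 0.005))     # 5000000.0
```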
The general rule of thumb is that acquisitions are bad for employees, and IPOs are good, especially if the share price is stable for 6 months.
Also, for acquisitions, you'll often have to work at the acquiring company for some time to get money from your options. Or you might get options in the acquiring company instead (which again are worth nothing until some future possible equity event that hopefully translates into cash).
That would be the naive mathematical interpretation and how the system would work if engineers designed it. Lawyers designed it, though, and they probably know some tricks to make that not happen.
Business people hired lawyers to design means and methods to commit _implicit_ fraud and deceptive practices to improve the value of their capital assets.
Those lawyers then go on to sell this product to others.
I'm sure there are some lawyers out there shopping this stuff around, but it's Capitalism and Business that's the active agent, not Lawyers.
I hate how every company that I place an order with treats that as permission to send a constant drip of marketing emails. I send them straight to spam.
I think FAANG-like big IC offers right now are flying around for technical people familiar with the current hot "AI" methods. A lot of hype, a lot of situational ethics around copyright, and some investment scams, but some of the tech is ready for legitimate things people want to do, at acceptable quality levels for the application.
Also, one thing that happened early in the dotcom gold rush is that a ton of people swarmed in, all suddenly acting like experts and professionals, with little/no prior experience. Meanwhile, Internet people who were also prolific programmers were, like, who are all these people, and why do many of them have fashionable eyeglasses like nerds would never try to pull off. I don't know how much we'll see something like that this time.
agree 100% with this nice comment from a fellow dot-com crash observer.
Many HackerNews readers are young'uns who are seduced by FAANG salaries and the latest AI/ML bandwagon. They don't have the perspective one gets from having seen the highs and the lows.
I can bet there will be another rude awakening around the corner which will wipe out the "pretenders". After the dot-com bust, many pretenders left Silicon Valley with anecdotes of leased BMWs abandoned at SFO airport.
After the AI/ML hype deflates, there is likely to be a similar separation. Folks in it for real will be separated from the folks who came in for the riches.
And for the history buffs, this rhymes with the California gold rush circa 1849: some found it, most didn't, and Levi's profited selling denim to them.