
You will get a 20 GB model. Distillation is so compute-efficient that it's all but inevitable: if OpenAI doesn't do it, numerous other companies will.

I would rather have an open-weights model that's the best possible one I can run and fine-tune myself, allowing me to exceed SOTA models on the narrower domain my customers care about.


