
How does the outfit intend to fund the project? OpenAI spent millions on the compute needed to train its models.


Hey! One of the lead devs here. A cloud computing company called CoreWeave is giving us the compute for free in exchange for us releasing the model publicly. We're currently at the ~10B scale and are working on better understanding datacenter-scale parallelized training, but we expect to train the model on 300-500 V100s for 4-6 months.
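
For a rough sense of whether those numbers hang together, here's a back-of-envelope estimate. This is a sketch only: the parameter count, token count, GPU count, and utilization figure are all assumptions for illustration, not the team's actual plan.

    # Back-of-envelope check on the "300-500 V100s for 4-6 months" figure.
    # Every number below is an assumption for illustration, not the
    # project's actual configuration.
    params = 175e9        # GPT-3-sized target, parameters
    tokens = 300e9        # training tokens (GPT-3 reported ~300B)
    train_flops = 6 * params * tokens   # ~6 FLOPs per parameter per token

    n_gpus = 500          # upper end of the range quoted above
    peak_flops = 125e12   # V100 mixed-precision peak, FLOP/s
    utilization = 0.33    # assumed sustained efficiency at cluster scale

    seconds = train_flops / (n_gpus * peak_flops * utilization)
    print(f"~{seconds / (60 * 60 * 24 * 30):.1f} months")  # prints ~5.9 months

That lands near the top of the quoted range, so the estimate is at least self-consistent; a smaller cluster or lower utilization pushes it out quickly.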


I imagine recreating the model will be computationally cheaper, since they won't have to sift through the same huge hyperparameter space the original GPT-3 team did.


This is not true. The OpenAI team only trained one full-sized GPT-3, and conducted their hyperparameter sweep on significantly smaller models (see: https://arxiv.org/abs/2001.08361). The compute savings from not having to do the hyperparameter sweep are negligible and do not significantly change the feasibility of the project.
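
For context, the power-law fit from that paper is simple enough to write down. The two constants below are the published fits for loss versus non-embedding parameter count; treat this as an illustration of the extrapolation idea, not a planning tool.

    # Power-law fit from Kaplan et al. (arXiv:2001.08361): converged test
    # loss falls as L(N) = (N_c / N)**alpha_N in non-embedding parameter
    # count N, which is why sweeps on small models extrapolate to large ones.
    N_C = 8.8e13      # fitted constant (non-embedding parameters)
    ALPHA_N = 0.076   # fitted exponent

    def predicted_loss(n_params):
        """Predicted converged test loss (nats/token) for n_params."""
        return (N_C / n_params) ** ALPHA_N

    for n in (1e8, 1e9, 1e10, 175e9):
        print(f"{n:.0e} params -> predicted loss ~{predicted_loss(n):.2f}")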


Why is that?


The cloud computing company CoreWeave has agreed to provide the necessary GPU resources.



