
How does the outfit intend to fund the project? OpenAI spent millions on the compute needed to train its models.


Hey! One of the lead devs here. A cloud computing company called CoreWeave is giving us the compute for free in exchange for us releasing the model publicly. We're currently at the ~10B scale and are working on better understanding datacenter-scale parallelized training, but we expect to train the model on 300-500 V100s for 4-6 months.
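
For a rough sense of whether those numbers hang together, here's a back-of-envelope estimate. This is a sketch only: the parameter count, token count, GPU count, and utilization figure are all assumptions for illustration, not the team's actual plan.

    # Back-of-envelope check on the "300-500 V100s for 4-6 months" figure.
    # Every number below is an assumption for illustration, not the
    # project's actual configuration.
    params = 175e9        # GPT-3-sized target, parameters
    tokens = 300e9        # training tokens (GPT-3 reported ~300B)
    train_flops = 6 * params * tokens   # ~6 FLOPs per parameter per token

    n_gpus = 500          # upper end of the range quoted above
    peak_flops = 125e12   # V100 mixed-precision peak, FLOP/s
    utilization = 0.33    # assumed sustained efficiency at cluster scale

    seconds = train_flops / (n_gpus * peak_flops * utilization)
    print(f"~{seconds / (60 * 60 * 24 * 30):.1f} months")  # prints ~5.9 months

That lands near the top of the quoted range, so the estimate is at least self-consistent; a smaller cluster or lower utilization pushes it out quickly.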


I imagine recreating the model will be computationally cheaper, since they won't have to sift through the same huge hyperparameter space the original GPT-3 team did.


This is not true. The OpenAI team only trained one full-sized GPT-3, and conducted their hyperparameter sweep on significantly smaller models (see: https://arxiv.org/abs/2001.08361). The compute savings from not having to do the hyperparameter sweep are negligible and do not significantly change the feasibility of the project.
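
For context, the power-law fit from that paper is simple enough to write down. The two constants below are the published fits for loss versus non-embedding parameter count; treat this as an illustration of the extrapolation idea, not a planning tool.

    # Power-law fit from Kaplan et al. (arXiv:2001.08361): converged test
    # loss falls as L(N) = (N_c / N)**alpha_N in non-embedding parameter
    # count N, which is why sweeps on small models extrapolate to large ones.
    N_C = 8.8e13      # fitted constant (non-embedding parameters)
    ALPHA_N = 0.076   # fitted exponent

    def predicted_loss(n_params):
        """Predicted converged test loss (nats/token) for n_params."""
        return (N_C / n_params) ** ALPHA_N

    for n in (1e8, 1e9, 1e10, 175e9):
        print(f"{n:.0e} params -> predicted loss ~{predicted_loss(n):.2f}")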


Why is that?


The cloud computing company CoreWeave has agreed to provide the necessary GPU resources.



