
The rule of thumb is roughly 44 GB: most models are trained in bf16, which needs 16 bits (2 bytes) per parameter. You need a bit more for activations, so maybe 50 GB?

You need enough system RAM and GPU memory, so it's a constraint on both.
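The back-of-the-envelope math above can be sketched like this (the ~15% activation overhead is an assumption, not an exact figure):

```python
def model_memory_gb(params_billion, bits_per_param=16, overhead=1.15):
    """Rough memory estimate: parameters * bytes each, plus ~15% for activations."""
    weight_bytes = params_billion * 1e9 * (bits_per_param / 8)
    return weight_bytes * overhead / 1e9

# A 22B model in bf16 (16 bits/param) is 44 GB of weights alone:
print(model_memory_gb(22, overhead=1.0))  # 44.0
```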



Which GPU can I buy to run this model? Can it run on a consumer RTX 3090, or does it need specialized datacenter hardware?


A 3090 or 4090 will be able to run quantized 22B models.

Though realistically, for code completion, smaller models will be better due to speed.
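Quantization shrinks the footprint roughly in proportion to bit width. A minimal sketch, assuming 4-bit weights (ignoring KV cache and per-group scale overhead):

```python
def quantized_weights_gb(params_billion, bits):
    """Weight storage at a given bit width, ignoring scales/zero-points."""
    return params_billion * 1e9 * bits / 8 / 1e9

# 22B at 4 bits is ~11 GB of weights, which fits in a 24 GB 3090/4090:
print(quantized_weights_gb(22, 4))  # 11.0
```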


Easy.


Most GPUs still use GDDR I'm pretty sure, not HBM. Do you mean VRAM?



