you need enough RAM and HBM (GPU RAM) so it’s a constraint on both.
Though realistically for code completion smaller models will be better due to speed
you need enough RAM and HBM (GPU RAM) so it’s a constraint on both.