the GH/GB compute has LPDDR5X - a single or dual GPU shares 480GB, depending on whether it's GH or GB, in addition to the HBM, all connected over NVLink-C2C - it's not bad!
Essentially, the Grace CPU is a memory and IO expander that happens to have a bunch of ARM CPU cores filling in the interior of the die, while the perimeter is all PHYs for LPDDR5 and NVLink and PCIe.
Sure, but 72x Neoverse V2 (roughly Cortex-X3 class) is a choice that seems driven more by convenience than by any real need for an AI server to have tons of somewhat slow CPU cores.
If someone gave me one for free, I'd totally make it my daily driver. I don't do much AI, but I always wanted to have a machine with lots of puny cores since the Xeon Phi appeared.
The justification is that processor cores aren't getting much faster, but they are getting more numerous - even entry-level machines now have between 4 and 8 cores - and adapting code to run across multiple cores is important if we want to utilise them all.