
How what?

The fp64 GFLOPS-per-watt metric in the post is almost entirely meaningless for comparing these accelerators against NVIDIA GPUs. For example, it says:

> Hopper H200 is 47.9 gigaflops per watt at FP64 (33.5 teraflops divided by 700 watts)

But if you consider the H100 PCIe [0] instead, you get 26000/350 = 74.29 GFLOPS per watt. Look harder and you can find parts with even better on-paper fp64 performance: the AMD MI300X does 81.7 TFLOPS at a typical board power of "750W Peak", which works out to 108.9 GFLOPS per watt.
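The arithmetic above is just peak-TFLOPS divided by board power. A minimal sketch that recomputes all three quoted figures (the spec numbers are the ones cited in this thread, not verified against vendor datasheets):

```python
# GFLOPS per watt = peak fp64 TFLOPS * 1000 / board power (W).
# Spec numbers below are as quoted in the comment; treat them as assumptions.
specs = {
    "H200 (SXM)": (33.5, 700),  # fp64 TFLOPS, watts
    "H100 PCIe":  (26.0, 350),
    "MI300X":     (81.7, 750),
}

for name, (tflops, watts) in specs.items():
    gflops_per_watt = tflops * 1000 / watts
    print(f"{name}: {gflops_per_watt:.1f} GFLOPS/W")
```

Running this reproduces the 47.9, 74.3, and 108.9 GFLOPS/W figures, which is the point: the ranking flips depending on which SKU and which board-power number you pick.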

The truth is that the power allocation of most GPGPUs is heavily tilted toward Tensor-core usage, and that trend predates the B300 by a long way.

That's all for HPC.

And Pezy processors are certainly not designed for "AI" (i.e. linear algebra at lower input precision). For AI inference, since around 2020 everyone has been talking about T(FL)OPS per watt, not GFLOPS.

[0] which is a nerfed version of the H200's precursor.


