
Yeah, that sounds decent for some one-shots. The unified memory systems can have longer back-and-forth context chats, but at slower speed (at least on AMD).

I find Qwen 3 Coder quite usable; I get around 20 TPS on my AMD AI 350 system, as long as the net-new context isn't too big.



I was looking at the 5060 Ti 16GB: it has about half the memory bandwidth of the 5070 Ti, but at half the price here. With four of them you'd have 64 GB of VRAM, and it would still be a lot cheaper than a 5090. Should get around 20-25 TPS for Qwen 3 Coder 30B, which is within usable range.
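
For anyone who wants to redo the VRAM math with their own numbers, here's a rough back-of-the-envelope sketch. The 4-bit and 8-bit quantization levels are my assumptions, not something measured, and the function names are made up for illustration. It only counts weights, so KV cache and runtime overhead have to come out of the leftover headroom.

```python
def total_vram_gb(num_cards: int, vram_per_card_gb: float) -> float:
    """Combined VRAM across identical cards, in GB."""
    return num_cards * vram_per_card_gb


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (decimal), ignoring overhead.

    params_billion * 1e9 weights * (bits / 8) bytes each = that many GB.
    """
    return params_billion * bits_per_weight / 8


def headroom_gb(num_cards: int, vram_per_card_gb: float,
                params_billion: float, bits_per_weight: float) -> float:
    """VRAM left over for KV cache and runtime buffers after loading weights."""
    return (total_vram_gb(num_cards, vram_per_card_gb)
            - weights_gb(params_billion, bits_per_weight))


if __name__ == "__main__":
    # Four 16 GB cards, 30B-parameter model; quantization levels are assumed.
    for bits in (4, 8):
        print(f"{bits}-bit: weights ~{weights_gb(30, bits):.0f} GB, "
              f"headroom ~{headroom_gb(4, 16, 30, bits):.0f} GB")
```

This says nothing about speed, and splitting a model across four cards adds its own overhead, so treat the headroom as an upper bound.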

Need a big case tho or go bitcoin miner style.

Not seriously thinking about it, just playing around.



