I run 4 Mac Studio Ultras at work (they’re pricey when maxed out) for local-first AI dev services, but there are a few things that make me want to switch to the Spark. Networking is the biggest one: the Macs have Thunderbolt and Ethernet, but if I run distributed inference with EXO over Thunderbolt, the drop in tokens/second is massive. These Sparks get RDMA and can stack nicely. The other big one is access to CUDA. MLX has come a long way, but having CUDA and GPU access in containers would simplify the stack so nicely. If I had a USB-C/Thunderbolt backplane it might compare, but scaling with the Spark is likely a lot more straightforward.
I call the stack with Mac Studios “MacAIver” because it feels like a duct tape solution, but the Spark equivalent would likely be more elegant.
You'd have to stack 16 of these to get 2TB of VRAM, equivalent to four 512GB Mac Studios chained together.
16 compared to 4. Surely even much faster networking in the Spark would degrade with that many devices?
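For anyone checking the 16 vs. 4 figure, here is a rough sketch of the arithmetic. It assumes 128 GB of memory per Spark (implied by "16 of these to get 2TB", not stated outright in the thread) and 512 GB per maxed-out Mac Studio; the function and constant names are made up for illustration.

```python
import math

# Assumption: per-unit memory in GB. 128 GB per Spark is inferred from
# "16 of these to get 2TB"; 512 GB per Mac Studio is quoted in the thread.
SPARK_MEMORY_GB = 128
MAC_STUDIO_MEMORY_GB = 512

TARGET_GB = 2 * 1024  # 2 TB expressed in GB


def units_needed(target_gb: int, per_unit_gb: int) -> int:
    """Smallest number of boxes whose combined memory reaches target_gb."""
    return math.ceil(target_gb / per_unit_gb)


if __name__ == "__main__":
    sparks = units_needed(TARGET_GB, SPARK_MEMORY_GB)
    macs = units_needed(TARGET_GB, MAC_STUDIO_MEMORY_GB)
    print(f"Sparks needed: {sparks}, Mac Studios needed: {macs}")
```

This only counts pooled memory. It says nothing about how throughput holds up once 16 boxes have to talk to each other, which is the open question here.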
The biggest problem with Macs is that they don't have dedicated tensor cores in the GPU, which makes prompt processing very slow compared to Nvidia and AMD.
It’s $12k for each Mac Studio, and the networking makes them only effective individually (it’s like less than 15 tokens/s with EXO), while NVLink is very effective. The Spark is definitely more scalable, but the MLX and Metal teams are cooking, so honestly either way is still winning.
I mean, the Spark is $3,999 and the current M3 Ultra Studio (28-core CPU, 60-core GPU) is the same price. I would expect the refreshed Studio to stay around the same price.
Fly to a state with no sales tax. Portland, Oregon serves this purpose for high-end shoppers who come from out of state and out of the country. Folks fly in to buy their Rolex, Gucci, etc., with no tax.
1. Yes, they will if they suspect you (age group, clothes, newest phone, certain flights like LAX, LGA), as customs officers all over the world do. My bags have been searched every time I've entered the US.
US customs won't care if they find new electronics, so they're no problem (they are annoying, with the suitcase thing at port-of-entry, but no problem). As for German customs, I don't know, but: leave Germany by road first, where there are checks only in theory, and fly out from an airport not too far over the border. You can probably get a cheaper flight in the process (e.g. fly from Basel).
Because it's only a superior solution if you just want one box, and mostly for inference at that. Once you start scaling to larger loads, it's much trickier to get a cluster of Macs to process them efficiently in parallel, whereas datacenter GPUs are designed for clusters.