I’m hoping this isn’t as attractive as it sounds for non-hobbyists, because the performance won’t scale well to parallel workloads, or even to context processing, where parallelism can be put to better use.
Hopefully this makes it really nice for people who want to experiment with LLMs and have a local model, but means well-funded companies won’t have any reason to grab them all instead of GPUs.
No way buying a bunch of minis could be as efficient as much denser GPU racks. You have to consider all the logistics and power draw, and high-end Nvidia hardware, and probably even AMD’s, is faster than M-series GPUs.
What this does offer is a good alternative to GPUs for smaller scale use and research. At small scale it’s probably competitive.
Apple wants to dominate the pro and serious-amateur niches. Feels like they’re realizing that local LLMs and AI research are part of that: the kind of thing end users would want big machines for.
There have been rumors of Apple working on M-chips that have the GPU and CPU as discrete chiplets. The original rumor said this would happen with the M5 Pro, so it’s potentially on the roadmap.
Theoretically they could farm out the GPU to another company but it seems like they’re set on owning all of the hardware designs.
TSMC has a new technology that allows seamless integration of mini chiplets, i.e. you can add as many CPU/GPU cores in mini chiplets as you wish and glue them together seamlessly, at least in theory. The rumor is that TSMC had some issues with it, which is why the M5 Pro and M5 Max are delayed.
I’m not saying a Mac Pro with expansion slots, I’m saying a Mac Pro whose marketing angle is locally running AI models. A hungry market that would accept moderate performance and is already used to bloated price tags has to have them salivating.
I think the hold-up here is whether TSMC can actually deliver the M5 Pro/Ultra and whether the MLX team can give them a usable platform.
Power draw? An entire Mac Pro running flat out uses less power than a single 5090.
If you have a workload that needs a huge memory footprint, then the TCO of the Macs, even with their markup, may be lower.
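To make that memory-footprint argument concrete, here’s a back-of-envelope sketch in Python. Every price and power figure is a placeholder assumption made up for illustration (the 512 GB Mac configuration and the 4-GPU workstation are hypothetical), so substitute whatever your vendor actually quotes before drawing any conclusion.

    # Back-of-envelope sketch: cost and power per GB of model-capable memory.
    # All numbers below are illustrative assumptions, not quotes or benchmarks.

    mac = {
        "name": "Mac with 512 GB unified memory (assumed config)",
        "price_usd": 9_500,   # assumption: rough list price for a 512 GB build
        "memory_gb": 512,
        "power_w": 300,       # assumption: sustained draw under load
    }

    gpu_box = {
        "name": "Workstation with 4x 32 GB GPUs (assumed config)",
        "price_usd": 4 * 2_500 + 3_000,  # assumption: per-GPU price plus host
        "memory_gb": 4 * 32,
        "power_w": 4 * 575 + 300,        # assumption: per-GPU board power plus host
    }

    def summarize(box):
        usd_per_gb = box["price_usd"] / box["memory_gb"]
        w_per_gb = box["power_w"] / box["memory_gb"]
        print(f"{box['name']}: ${usd_per_gb:.0f}/GB, {w_per_gb:.2f} W/GB")

    for box in (mac, gpu_box):
        summarize(box)

    # This says nothing about throughput: the GPU box wins on raw speed,
    # the Mac wins when the model simply has to fit in memory at all.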
I haven’t looked yet but I might be a candidate for something like this, maybe. I’m RAM constrained and, to a lesser extent, CPU constrained. It would be nice to offload some of that. That said, I don’t think I would buy a cluster of Macs for that. I’d probably buy a machine that can take a GPU.
I’m not particularly interested in training models, but it would be nice to have eGPUs again. When Apple Silicon came out, support for them dried up. I sold my old BlackMagic eGPU.
That said, the need for them also faded. The new chips have performance every bit as good as the eGPU-enhanced Intel chips.
An eGPU with an Apple accelerator with a bunch of RAM and GPU cores could be really interesting, honestly. I’m pretty sure they are capable of designing something very competitive, especially in terms of performance per watt.
They are inseparable for Apple: CPUs, GPUs, memory. They can use chiplets to tweak the ratios, but I doubt they will change the underlying module format of everything packaged together.
My suggestion is to accept that format and just provide a way to network them at a low level, via PCIe or better.
The lack of official Linux/BSD support is enough to make it DOA for any serious large-scale deployment. Until Apple figures out what they're doing on that front, you've got nothing to worry about.
Having used both professionally, once you understand how to drive Apple's MDM, macOS is as easy to sysadmin as Linux. I'll grant you it's a steep learning curve, but so is Linux/BSD if you're coming at it fresh.
In certain ways it's easier - if you buy a device through Apple Business you can have it so that you (or someone working in a remote location) can take it out of the shrink wrap, connect it to the internet, and get a configured and managed device automatically. No PXE boot, no disk imaging, no having it shipped to you to configure and ship out again. If you've done it properly the user can't interrupt/corrupt the process.
The only thing they're really missing is an iLO; I can imagine how AWS solved that, but I'd love to know.
Where in the world are you working where MDM is the limiting factor on Linux deployments? North Korea?
Macs are a minority in the datacenter even compared to Windows server. The concept of a datacenter Mac would disappear completely if Apple let free OSes sign macOS/iOS apps.
I’m talking about using MDM with macOS (to take advantage of Apple Silicon, not licensing) in contrast to the tools we already have with other OSes. Probably you could do it to achieve a large-scale on-prem Linux deployment; fortunately I’ve never tried.
Well, be that as it may, it's quite unrelated to deploying Macs in the datacenter. It's definitely not a selling point to people putting Proxmox or k8s on their machines.
macOS is XNU-based. There is BSD code layered on top of the Mach core in the kernel and there are BSD tools in userland, but the kernel does not resemble BSD's architecture or adopt BSD's license.
This is an issue for some industry-standard software like CUDA, which does provide BSD drivers with ARM support that just never get adopted by Apple: https://www.nvidia.com/en-us/drivers/unix/
Because Apple already does...? There's still PowerPC and MIPS code that runs in macOS. Asking for CUDA compatibility is not somehow too hard for the trillion-dollar megacorp to handle.