The Mac Studio they talk about running the 685b model on costs 12 000 dollaridoos (after 20% taxes here). I get that it’s a consumer device and that it draws less power, but at that point you’d just get a server for less. The power consumption is an outlier because everything is on one chip.
But imagine what you’ll be able to run it on in four more months. That said, it’s stretching the definition of consumer hardware a bit.
You can use the smaller models on (beefy) consumer hardware already. That’s something, right? 😅
I want the full 1TB model running on my 10-year-old Linux laptop.
Just put your persistent memory as swap. Easy
I think the key part is that you can run these large scale models cheaply in terms of energy cost. The price of hardware will inevitably come down going forward, but now we know that there is no fundamental blocker for running models efficiently.
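For a sense of scale on the energy cost, here’s a quick back-of-envelope. The ~200 W draw is the figure quoted for the Mac Studio elsewhere in this thread; the $0.15/kWh electricity rate is my own assumption, so plug in your local tariff:

```python
# Rough electricity cost of running inference 24/7 on a ~200 W machine.
# The $0.15/kWh rate is an assumed figure, not from the article.
watts = 200
rate_usd_per_kwh = 0.15

daily_kwh = watts * 24 / 1000                # 4.8 kWh/day
daily_cost = daily_kwh * rate_usd_per_kwh    # ~$0.72/day
print(f"${daily_cost:.2f}/day, ${daily_cost * 365:.0f}/year")
```

Under a dollar a day to keep a frontier-scale model warm is the point: the energy side is already cheap even before hardware prices come down.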
I generally agree, but given how niche a powerful SoC like this is, I doubt it matters in the near term (<5 years). I understand it proves a point, but I wager there’s still a long-ish way to go before power-efficient hardware like this gets cheaper (and that will most likely come from China).
Yeah, a 5-year or so timeline before SoC designs become dominant is a good guess. There are other interesting ideas, like analog chips, that have the potential to drastically cut power usage for neural networks as well. The next few years will be interesting to watch.
deleted by creator
the key bit
This represents a potentially significant shift in AI deployment. While traditional AI infrastructure typically relies on multiple Nvidia GPUs consuming several kilowatts of power, the Mac Studio draws less than 200 watts during inference. This efficiency gap suggests the AI industry may need to rethink assumptions about infrastructure requirements for top-tier model performance.
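To put the quoted numbers side by side: only the sub-200 W Mac Studio figure comes from the article; the multi-GPU wattage is an assumed stand-in for “several kilowatts”:

```python
# Compare daily energy use: a hypothetical multi-GPU inference server
# (assumed here as 4 kW for "several kilowatts") vs. a Mac Studio at 200 W.
server_watts = 4000   # assumption, not a measured figure
mac_watts = 200       # "less than 200 watts during inference"

server_kwh_day = server_watts * 24 / 1000   # 96 kWh/day
mac_kwh_day = mac_watts * 24 / 1000         # 4.8 kWh/day
print(f"~{server_kwh_day / mac_kwh_day:.0f}x less energy")
```

Even with generous assumptions in the server’s favor, the gap is more than an order of magnitude, which is what makes the “rethink infrastructure” claim plausible.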
The takeaway from all this is that Western tech companies have gone full parasite and are inflating costs at every turn to funnel more and more VC and public money into their incestuous cesspool. Based on what DeepSeek demonstrated, it seems possible that OpenAI could cut their costs by a significant factor (by half, or maybe a lot more). That could let them turn a profit for the first time in their existence, but the thought doesn’t even cross their minds.
And that’s the problem with profit motive in a nutshell.
More than turning a profit, they could bend the arc of AI’s climate impact by competing on efficiency.
32 GPUs -> 1 GPU loosely translates to a 10x reduction in water and energy needed for inference, on top of the 100x reduction in training requirements.
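The 10x (rather than a raw 32x from the device count) presumably nets out per-device power against the single machine running slower. A sketch with assumed numbers, not measurements:

```python
# Why 32 GPUs -> 1 machine can land around ~10x energy savings rather
# than 32x. All figures below are assumptions for illustration.
cluster_gpus = 32
gpu_watts = 700          # assumed draw per data-center GPU
mac_watts = 200          # quoted Mac Studio draw during inference
slowdown = 10            # assumption: single machine ~10x slower per token

cluster_power = cluster_gpus * gpu_watts     # 22400 W while serving
mac_effective = mac_watts * slowdown         # 2000 W-equivalent per token
print(f"~{cluster_power / mac_effective:.0f}x less energy per token")
```

Same ballpark as the 10x figure: the single box’s slowness eats most of the raw 32x device advantage, but the per-watt win still dominates.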
And these are just the improvements within 4 months…
but have you considered tinysquare man massacre
Shit. I had not.