DeepSeek's free 685B-parameter AI model runs at 20 tokens per second on Apple's Mac Studio while drawing just 200 watts, outperforming Claude Sonnet and challenging OpenAI's cloud-dependent business model.
I think the key part is that you can run these large-scale models cheaply in terms of energy cost. The price of the hardware will inevitably come down, but now we know there is no fundamental blocker to running models efficiently.
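
To make "cheaply in terms of energy cost" concrete, here's a quick back-of-envelope sketch (Python) using the 200 W and 20 tokens/s figures from the article; the $0.15/kWh electricity price is my own assumption and varies a lot by region:

    # Back-of-envelope energy cost per token from the article's figures.
    power_watts = 200.0        # reported draw while generating
    tokens_per_second = 20.0   # reported generation speed
    price_per_kwh = 0.15       # assumed electricity price (USD); varies by region

    joules_per_token = power_watts / tokens_per_second    # 10 J per token
    kwh_per_million = joules_per_token * 1e6 / 3.6e6      # ~2.78 kWh per 1M tokens
    cost_per_million = kwh_per_million * price_per_kwh    # ~$0.42 per 1M tokens

    print(f"{joules_per_token:.1f} J/token")
    print(f"{kwh_per_million:.2f} kWh per 1M tokens")
    print(f"${cost_per_million:.2f} per 1M tokens")

That works out to well under a dollar of electricity per million tokens, which is the point: the marginal energy cost of inference on hardware like this is nearly negligible.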
I generally agree, but given how niche a powerful SoC like this is, I doubt it matters in the near term (<5 years). I understand it proves a point, but I'd wager there's still a long way to go before power-efficient hardware like this gets cheaper (and that will most likely come from within China).
Yeah, a five-year or so timeline before SoC designs become dominant is a good guess. There are other interesting ideas, like analog chips, that have the potential to drastically cut power usage for neural networks as well. The next few years will be interesting to watch.