As enterprise cloud inference costs skyrocket, Dell Technologies is betting that on-premises AI computing is the key to affordable agentic AI at scale.
The company's new Deskside Agentic AI systems position local compute as the essential foundation for agentic workflows. Research agents that burn $600 in a single cloud session become far more economical when the hardware is owned outright, according to Dell senior distinguished engineer Marc Hammons.
"The cloud is where frontier models go first," Hammons said. "It's also where your costs are going to be buried if you don't do something about that. The opportunity is to bring some of that compute locally on the machine and start to adjust the tokenomics of the situation."
Unlike simple prompt-response exchanges, agentic workflows iterate in loops that rapidly compound token consumption. Dell's Charlie Walker said the hardware investment can pay for itself in three to six months when those workloads move on-premises.
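The payback claim comes down to simple arithmetic: hardware cost divided by the monthly cloud spend it displaces. The sketch below is only an illustration of that calculation. The $600 per-run figure is the one cited above, while the hardware price, the number of runs per month, and the function name `payback_months` are hypothetical placeholders, not Dell pricing or Dell's methodology.

```python
def payback_months(hardware_cost, cloud_cost_per_run, runs_per_month,
                   local_cost_per_run=0.0):
    """Months until owned hardware pays for itself versus cloud inference.

    All inputs are assumptions supplied by the caller. local_cost_per_run
    stands in for power and maintenance and defaults to zero.
    """
    monthly_savings = (cloud_cost_per_run - local_cost_per_run) * runs_per_month
    if monthly_savings <= 0:
        raise ValueError("local runs must be cheaper than cloud runs")
    return hardware_cost / monthly_savings


# Hypothetical scenario: a $36,000 system replacing 20 cloud runs a month
# at the $600 per-run figure cited in the article.
print(payback_months(hardware_cost=36_000, cloud_cost_per_run=600,
                     runs_per_month=20))  # prints 3.0
```

Under these placeholder inputs the system pays for itself in three months. Halving the number of monthly runs doubles the payback period to six, which shows how sensitive the estimate is to actual agent usage.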
Dell's portfolio ranges from the Pro Max with GB10 for persistent local agents to the Pro Max with GB300, which delivers 20 petaFLOPS and 748 gigabytes of coherent memory in a deskside tower, bringing datacenter-scale compute to individual desks.
"You do need those frontier models in the cloud for repository-scale reasoning," Hammons said. "But then it can delegate that down to these machines that are deskside and really let it take over and drive the individual efforts."