Enterprises are grappling with the economics of generative AI: runaway token costs, data sovereignty demands, and a gap between pilots and production ROI. Dell Technologies and H2O.ai are offering a solution: vertical AI models running on-premises.

Satish Iyer, Dell's VP and CTO of technology innovation and ecosystems, said the goal is to bring AI to where customer data resides. Most enterprise data stays on-prem, and running large general-purpose models in the cloud consumes tokens at scale.

H2O.ai founder and CEO Sri Ambati said top developers burn $1,000 a day in tokens, yet many users see little measurable return. The answer lies in small language models and vertical AI models, which combine predictive and generative capabilities for specific industries. H2O.ai recently released TabH2O, a tabular foundation model for structured enterprise data that requires no expensive parameter tuning.

Dell has deployed more than 5,000 AI factories globally, with financial services, healthcare, and telecom leading adoption. By orchestrating across ecosystems (including Gemini, OpenAI models, and open-source models), enterprises gain predictable token consumption and pricing.

The path forward is distilling large models into deployable assets that run at the edge: in mining operations, hospitals, and contact centers, where decisions must be made locally.