
AI chip startup Groq rakes in $640M to grow LPU cloud

Even as at least some investors begin to question the return on investment of AI infrastructure and services, venture capitalists appear to be doubling down. On Monday, AI chip startup Groq — not to be confused with xAI's Grok chatbot — announced it had scored $640 million in Series D funding to bolster its inference cloud.

Founded in 2016, the Mountain View, California-based startup began its life as an AI chip slinger targeting high-throughput, low-cost inferencing as opposed to training. Since then, the company has transitioned to an AI infrastructure-as-a-service provider and walked away from selling hardware.

In total, Groq has raised more than $1 billion and now boasts a valuation of $2.8 billion, with its latest funding round led by BlackRock and joined by Neuberger Berman, Type One Ventures, Cisco Investments, Global Brain, and Samsung Catalyst.

The firm's main claim to fame is that its chips can generate more tokens faster, while using less energy, than GPU-based equipment. At the heart of all of this is Groq's Language Processing Unit (LPU), which approaches the problem of running LLMs a little differently.



As our sibling site The Next Platform previously explored, Groq's LPUs don't require gobs of pricey high-bandwidth memory or advanced packaging — both factors that have contributed to bottlenecks in the supply of AI infrastructure.

Instead, Groq's strategy is to stitch together hundreds of LPUs, each packed with on-die SRAM, using a fiber optic interconnect. Using a cluster of 576 LPUs, Groq claims it was able to achieve generation rates of more than 300 tokens per second on Meta's Llama 2 70B model, 10x that of an HGX H100 system with eight GPUs, while consuming a tenth of the power.
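
To put those figures in perspective, here's a back-of-the-envelope sketch in Python. The ~230 MB of on-die SRAM per first-gen LPU is a widely reported figure rather than something stated in this article, and the 8-bit weight format is our assumption; only the 10x throughput and one-tenth power ratios come from Groq's claim.

```python
# Rough numbers behind Groq's 576-LPU cluster claim. Assumptions
# (not from the article): ~230 MB SRAM per first-gen LPU, 8-bit
# weights for Llama 2 70B. The 10x throughput and 1/10th power
# ratios are Groq's own claims.

NUM_LPUS = 576                 # cluster size cited for the demo
SRAM_PER_LPU_GB = 0.23         # ~230 MB on-die SRAM (reported figure)
PARAMS_BILLIONS = 70           # Llama 2 70B
BYTES_PER_PARAM = 1            # assume INT8/FP8 weights

# 1. Do the model's weights fit in aggregate on-die SRAM, no HBM needed?
weights_gb = PARAMS_BILLIONS * BYTES_PER_PARAM    # ~70 GB
sram_gb = NUM_LPUS * SRAM_PER_LPU_GB              # ~132 GB
print(f"Weights ~{weights_gb} GB vs aggregate SRAM ~{sram_gb:.0f} GB")

# 2. What do the claimed ratios imply about energy efficiency?
throughput_ratio = 10    # "10x that of an HGX H100 system"
power_ratio = 0.1        # "a tenth of the power"
print(f"Implied tokens-per-joule advantage: "
      f"{throughput_ratio / power_ratio:.0f}x")
```

If those assumptions hold, a 70B-parameter model's weights can live entirely in SRAM spread across the cluster, and the two claimed ratios compound to roughly a 100x advantage in tokens per joule over the GPU baseline.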

Groq now intends to use its millions to expand headcount and bolster its inference cloud to support more customers. As it stands, Groq purports to have more than 360,000 developers building on GroqCloud, creating applications using openly available models.


"This funding will enable us to deploy more than 100,000 additional LPUs into GroqCloud," CEO Jonathan Ross said Monday.

"Training AI models is solved, now it's time to deploy these models so the world can use them. Having secured twice the funding sought, we now plan to significantly expand our talent density.

These won't, however, be Groq's next-gen LPUs. Instead, they'll be built on GlobalFoundries' 14nm process node and delivered by the end of Q1 2025. Nvidia's next-gen Blackwell GPUs, meanwhile, are expected to arrive within the next 12 or so months, depending on how delayed they turn out to be.

Groq is said to be working on two new generations of LPUs, which, last we heard, would utilize Samsung's 4nm process tech and deliver somewhere between 15x and 20x higher power efficiency.

You can find a deeper dive on Groq's LPU strategy and performance claims on The Next Platform.

VC money continues to flow into AI startups

Groq isn't the only infrastructure vendor that's managed to capitalize on all the AI hype. In fact, $640 million is far from the largest chunk of change we've seen startups walk away with in recent memory.

As you may recall, back in May, GPU bit barn CoreWeave scored $1.1 billion in Series C funding weeks before it managed to talk Blackstone, BlackRock, and others into a $7.5 billion loan using its GPUs as collateral.

Meanwhile, Lambda Labs, another GPU cloud operator, has used its cache of GPUs to secure a combined $820 million in fresh funding and debt financing since February, and it doesn't look like it is satisfied yet. Last month we learned Lambda was reportedly in talks with VCs for another $800 million in funding to support the deployment of yet more Nvidia GPUs.


While VC funding continues to flow into AI startups, it seems some on Wall Street are increasingly nervous about whether these multi-billion-dollar investments in AI infrastructure will ever pay off.

Still, that hasn't stopped ML upstarts such as Cerebras from pursuing an initial public offering (IPO). Last week the outfit, best known for its dinner-plate-sized accelerators aimed at model training, revealed it had confidentially filed for a public listing.


The size and price range of the IPO have yet to be determined. Cerebras' rather unusual approach to the problem of AI training has helped it win north of $900 million in commitments from the likes of G42.

Meanwhile, with the rather notable exception of Intel, which saw its profits plunge to a $1.6 billion net loss in Q2 amid plans to lay off at least 15 percent of its workforce, chip vendors and the cloud providers reselling access to their accelerators have been among the biggest beneficiaries of the AI boom. Last week, AMD revealed its MI300X GPUs accounted for more than $1 billion of its datacenter sales.

However, it appears that the real litmus test for whether the AI hype train is about to derail won't come until the market leader Nvidia announces its earnings and outlook later this month. ®

Source: theregister.com
