As Google eyes exponential surge in serving capability, analyst says we’re coming into ‘stage two of AI’

Google’s AI infrastructure boss warned the company needs to scale up its tech to accommodate a massive influx of users and complex requests being handled by AI products—and it may be a sign that fears of a bubble are overblown.

Amin Vahdat, a VP who leads the global AI and infrastructure team at Google, said during a presentation at a Nov. 6 all-hands meeting that the company needs to double its serving capacity every six months, with “the next 1000x in 4-5 years,” CNBC reported.
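As a rough check on that arithmetic (an illustration, not something taken from Google's presentation): doubling every six months means two doublings per year, so capacity grows by a factor of 2^(2 × years). A minimal Python sketch, where the function name `growth_factor` and its parameters are hypothetical:

```python
def growth_factor(years: float, doubling_period_years: float = 0.5) -> float:
    """Growth multiple after `years` if capacity doubles every `doubling_period_years`."""
    # Number of doublings is elapsed time divided by the doubling period.
    return 2 ** (years / doubling_period_years)


# Compare the two ends of the "4-5 years" range quoted in the report.
for years in (4, 5):
    print(f"{years} years: {growth_factor(years):,.0f}x")
# prints:
# 4 years: 256x
# 5 years: 1,024x
```

Under this simple assumption of steady doubling, the "1000x" figure corresponds to the five-year end of the range (ten doublings), while four years would yield roughly 256x.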

This refers to Google’s ability to ensure that Gemini and other AI products depending on Google Cloud can still work well when queried by a skyrocketing number of users. That’s different from compute, or the physical infrastructure involved in training AI.

A Google spokesperson told Fortune that “demand for AI services means we are being asked to provide significantly more computing capacity, which we are driving through efficiency across hardware, software, and model optimizations, in addition to new investments,” pointing to the company’s Ironwood chips as an example of its own hardware driving improvements in computing capacity.

In previous years, every hyperscaler—think Google Cloud but also Amazon and Microsoft Azure—rushed to increase compute in anticipation of an influx of AI users.

Now, the users are here, said Shay Boloor, chief market strategist at Futurum Equities. But as each company ratchets up its AI offerings, serving capacity is emerging as the next major challenge to tackle.

“We’re entering the stage two of AI where serving capacity matters even more than the compute capacity, because the compute creates the model, but serving capacity determines how widely and how quickly that model can actually reach the users,” he told Fortune.

Google, with its vast capital expenditures and past strategic moves to develop its own AI chips, is likely capable of doubling its serving capacity every six months, said Boloor. 

Yet Google and its competitors are still facing an uphill battle, he added, especially as AI products start to deal with more complex requests, including advanced search queries and video.

“The bottleneck is not ambition, it’s just truly the physical constraints, like the power, the cooling, the networking bandwidth and the time needed to build these energized data center capacities,” he said.

Still, the fact that Google seemingly faces enough demand for its AI infrastructure to push for doubling its serving capacity every six months may be a sign that AI pessimists' gloomy predictions aren't entirely accurate, said Boloor.

Such concerns sent all three major stock indexes down by 1.9% or more this past week—including the tech-heavy Nasdaq.

“This is not like speculative enthusiasm, it’s just unmet demand sitting in backlog,” he said. “If things are slowing down a bit more than a lot of people hope for, it’s because they’re all constrained on the compute and more serving capacity.”
