How much do foundation models matter?
It might seem like a silly question, but it’s come up a lot in my conversations with AI startups, which are increasingly comfortable with businesses that used to be dismissed as “GPT wrappers”: companies that build interfaces on top of existing AI models like ChatGPT. These days, startup teams are focused on customizing AI models for specific tasks and on interface work, and they see the foundation model as a commodity that can be swapped in and out as necessary. That approach was especially on display at last week’s BoxWorks conference, which seemed devoted entirely to the user-facing software built on top of AI models.
Part of what is driving this is that the scaling benefits of pre-training — the initial process of teaching AI models on massive datasets, which is the sole domain of foundation models — have slowed down. That doesn’t mean AI has stopped making progress, but the early benefits of hyperscaled foundation models have hit diminishing returns, and attention has turned to post-training and reinforcement learning as sources of future progress. If you want to make a better AI coding tool, you’re better off working on fine-tuning and interface design than spending another few billion dollars’ worth of server time on pre-training. As the success of Anthropic’s Claude Code shows, foundation model companies are quite good in these areas too — but that’s not as durable an advantage as it used to be.
In short, the competitive landscape of AI is changing in ways that undermine the advantages of the biggest AI labs. Instead of a race for an all-powerful AGI that could match or exceed human abilities across all cognitive tasks, the immediate future looks like a flurry of discrete businesses: software development, enterprise data management, image generation and so on. Aside from a first-mover advantage, it’s not clear that building a foundation model gives you any edge in those businesses. Worse, the abundance of open-source alternatives means that foundation models may not have any pricing leverage if they lose the competition at the application layer. This would turn companies like OpenAI and Anthropic into back-end suppliers in a low-margin commodity business — as one founder put it to me, “like selling coffee beans to Starbucks.”
It’s hard to overstate what a dramatic shift this would be for the business of AI. Throughout the contemporary boom, the success of AI has been inextricable from the success of the companies building foundation models — specifically, OpenAI, Anthropic, and Google. Being bullish on AI meant believing that AI’s transformative impact would make these into generationally important companies. We could argue about which company would come out on top, but it was clear that some foundation model company was going to end up with the keys to the kingdom.
At the time, there were lots of reasons to think this was true. For years, foundation model development was the only AI business there was — and the fast pace of progress made the leading labs’ head start seem insurmountable. And Silicon Valley has always had a deep-rooted love of platform advantage. The assumption was that, however AI models ended up making money, the lion’s share of the benefit would flow back to the foundation model companies, which had done the work that was hardest to replicate.
The past year has made that story more complicated. There are lots of successful third-party AI services, but they tend to use foundation models interchangeably. For startups, it no longer matters whether their product sits on top of GPT-5, Claude or Gemini, and they expect to be able to switch models mid-release without end users noticing the difference. Foundation models continue to make real progress, but it no longer seems plausible for any one company to maintain a large enough advantage to dominate the industry.
We already have plenty of indication that there is not much of a first-mover advantage. As venture capitalist Martin Casado of a16z pointed out on a recent podcast, OpenAI was the first lab to put out a coding model, as well as generative models for image and video — only to lose all three categories to competitors. “As far as we can tell, there is no inherent moat in the technology stack for AI,” Casado concluded.
Of course, we shouldn’t count foundation model companies out just yet. There are still lots of durable advantages on their side, including brand recognition, infrastructure, and unthinkably vast cash reserves. OpenAI’s consumer business may prove harder to replicate than its coding business, and other advantages may emerge as the sector matures. Given the fast pace of AI development, the current interest in post-training could easily reverse course in the next six months. Most uncertain of all, the race toward general intelligence could pay off with new breakthroughs in pharmaceuticals or materials science, radically shifting our ideas about what makes AI models valuable.
But in the meantime, the strategy of building ever-bigger foundation models looks a lot less appealing than it did last year — and Meta’s billion-dollar spending spree is starting to look awfully risky.