AI stack splits in three: $965B at the top, new competition in inference

Summary

The frontier is narrowing to 3-4 vendors: Anthropic raised $65B at a $965B post-money valuation, and Apple signed a $1B/year deal with Google for a custom Gemini model to power Siri.
Inference is fragmenting and getting cheaper: Groq is raising $650M after Nvidia’s ~$20B talent-and-licensing deal, and competition for compute is intensifying.
Even Apple gave up on building it itself: the world’s richest tech company is licensing a 1.2T-parameter Gemini because its own in-house models (150B parameters at most) cannot keep up.

Bottom line: the AI supply chain is splitting into layers with opposite dynamics — concentration at the top, cheaper competition in the middle. Your job as a leader is to know which layer you’re buying from and which you’re not.

1. Anthropic at $965B — frontier capital concentrates at the top

What changed. Anthropic raised $65B in Series H at a $965B post-money valuation. Lead investors: Altimeter, Dragoneer, Greenoaks, Sequoia. Annualized run rate now $47B — up from $9B in December. Alongside the round, the company released Claude Opus 4.8 with a 1M-token context window, 69.2% on SWE-Bench Pro (10 points above GPT-5.5), and a new “Dynamic Workflows” (ultracode) capability — a single task spawns hundreds of parallel subagents (demonstrated migrating a 750,000-line codebase in 6 days).

Why it matters. Two years ago there were a dozen serious frontier-model contenders. Today there are 3-4 — Anthropic, OpenAI, Google, maybe Mistral. The capital wall (the cash you need to stay at the frontier) is now beyond anyone who doesn’t have $60B+ in the bank. That means your choice of vendor over the next two years narrows sharply, and switching costs rise.

What to do this month.

List which AI models your workflows currently depend on (chat, embeddings, code, voice). How many vendors? How much of your data lives behind one?
Add integration for at least one alternative frontier model — even if you don’t use it today. Multi-vendor clauses in contracts will be standard next year.
Treat Anthropic’s Dynamic Workflows as a stress test: if orchestrated agents can finish a job the size of a 750,000-line codebase migration in 6 days, which of your workflows are worth automating first?
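
The inventory in the first step can start as a very small script. The sketch below is illustrative only: the workflow names and vendor assignments are hypothetical placeholders, and it measures concentration simply as each vendor's share of workflows, which is one possible proxy among many.

```python
from collections import Counter

# Hypothetical inventory: workflow -> (capability, vendor). Replace with your own.
workflows = {
    "support_chat": ("chat", "Anthropic"),
    "doc_search": ("embeddings", "OpenAI"),
    "code_review": ("code", "Anthropic"),
    "call_transcripts": ("voice", "Google"),
    "contract_summaries": ("chat", "Anthropic"),
}

def vendor_concentration(inventory):
    """Return (vendor, share of workflows) pairs, largest share first."""
    counts = Counter(vendor for _, vendor in inventory.values())
    total = sum(counts.values())
    return [(vendor, n / total) for vendor, n in counts.most_common()]

shares = vendor_concentration(workflows)
for vendor, share in shares:
    print(f"{vendor}: {share:.0%} of workflows")
```

If one vendor holds well over half of the list, that is the dependency to pair with an alternative integration first.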

What to expect.

Next 30-90 days: OpenAI and Google respond with their own price or context-window improvements.
6-month window: frontier API prices on input tokens drop 30-50% at the top tier.
By year-end: the first European enterprises start requiring “multi-vendor clauses” from SaaS vendors that white-label only one model.

2. Groq’s $650M and Nvidia’s $20B — the inference layer fragments

What changed. Groq, the AI inference-chip startup, is raising $650M from existing investors — Disruptive and Infinitium have agreed to cover any shortfall from other investors. This follows a December deal in which Nvidia paid ~$20B to bring senior Groq employees aboard and license Groq’s hardware technology, without a full acquisition. Groq is currently led by interim CEO Adam Winter and CFO Matt Eng. The key pivot: from chip manufacturing to inference-cloud services.

Why it matters. Inference — the compute that runs when your user hits “submit” and the AI responds — is now a larger market than model training. The fact that Nvidia paid $20B to absorb Groq’s talent, and that Groq’s investors are putting in another $650M after that, signals that inference prices are being pushed down hard. For a mid-size business, this is a clear message: do not sign 3-year fixed-price AI compute contracts right now; in 6-12 months, the price you pay per million output tokens may drop 40-60%.
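
To make the contract risk concrete, here is a rough worked comparison. Every input is an assumption chosen for illustration: a starting price of $10 per million output tokens, 100 million tokens per month, and a single 50% price drop after 12 months (the midpoint of the 40-60% range above). Real price curves will not be a single step.

```python
def total_cost(price_per_million, million_tokens_per_month, months,
               drop_fraction=0.0, drop_after_month=0):
    """Total spend over `months`.

    The price falls once, by `drop_fraction`, after `drop_after_month`
    months. A drop_fraction of 0.0 models a fixed-price contract.
    """
    total = 0.0
    for month in range(months):
        price = price_per_million
        if month >= drop_after_month:
            price *= 1 - drop_fraction
        total += price * million_tokens_per_month
    return total

# Assumed figures, not market data.
fixed = total_cost(10.0, 100, 36)
floating = total_cost(10.0, 100, 36, drop_fraction=0.5, drop_after_month=12)

print(f"3-year fixed contract: ${fixed:,.0f}")
print(f"Pay-as-you-go with one 50% drop: ${floating:,.0f}")
print(f"Overpayment from locking in: ${fixed - floating:,.0f}")
```

Under these assumptions the locked contract costs a third more over three years. Substitute your own volume and price to see how sensitive the gap is to the timing of the drop.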

What to do this month.

Review every AI cloud contract coming up for renewal this quarter and next. Defer long-term commitments by 3-6 months where you can.
Calculate your monthly AI inference bill (including every SaaS tool that uses OpenAI/Anthropic “under the hood”). Do you know the number? If not — that’s your first problem.
Identify one critical workflow where you are overpaying for AI right now, and find an alternative (different model pricing, on-prem model, or simply shorter context).
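
A first pass at the monthly bill does not need a finance system. The sketch below assumes you can list each AI-related cost by hand; the line items and amounts are hypothetical, and the `direct` flag simply separates API bills you pay a model vendor from SaaS tools that embed a model.

```python
from dataclasses import dataclass

@dataclass
class LineItem:
    name: str
    monthly_cost: float  # in your billing currency
    direct: bool         # True = API bill paid straight to a model vendor

# Hypothetical figures; replace with your own invoices.
items = [
    LineItem("Model API usage", 4200.0, direct=True),
    LineItem("Support-desk SaaS with AI assistant", 1800.0, direct=False),
    LineItem("Meeting-notes SaaS", 600.0, direct=False),
]

def monthly_ai_bill(line_items):
    """Split the monthly AI spend into direct API and SaaS-embedded costs."""
    direct = sum(i.monthly_cost for i in line_items if i.direct)
    embedded = sum(i.monthly_cost for i in line_items if not i.direct)
    return {"direct": direct, "embedded": embedded, "total": direct + embedded}

bill = monthly_ai_bill(items)
print(bill)
```

The embedded figure is usually the one nobody has written down, and it is where renegotiation tends to start.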

What to expect.

90 days: another one or two inference startups (Cerebras, SambaNova) announce major rounds.
6 months: the first public price war between Anthropic, OpenAI and Google, with input-token prices falling below $1 per million tokens.
Year-end: “consolidated AI bill” emerges as a line item in larger company finance reports.

3. Apple pays Google $1B/year for Gemini — even the giants stop building in-house

What changed. Apple and Google jointly announced on 12 January that Apple will gain access to a custom 1.2-trillion-parameter Gemini model, built specifically for Siri and Apple Intelligence. The deal is valued at roughly $1B/year (Bloomberg / Mark Gurman). Apple distills (compresses) the large Gemini into smaller models that run on-device on iPhone with no network connection. iOS 26.4 delivers the first Gemini-powered Siri features to 1.5 billion daily users. A full Siri redesign is expected with iOS 27 (WWDC 8 June, public release September).

Why it matters. Apple is the most resource-rich technology company in the world, with $200B+ in cash. For three years Apple built its own AI models — peaking at 150B parameters. The message from this deal: even with all that cash and talent, Apple admitted it could not catch up to the frontier in time. For mid-size businesses still spending budget on “our own AI team training our own model” — that’s wasted cost and time. Your competitive opening is not in the model — it’s in how you integrate existing models with your data and workflows.

What to do this month.

Look at how your AI budget splits: what percentage goes to “the model” (training, fine-tuning, our own LLM) vs. “integration” (your data, your workflows, your interface)? A healthy split today is 10/90 or 5/95 in favour of integration.
Identify 1-2 workflows where calling a frontier model directly (via API) is cheaper and faster than buying a SaaS tool that does the same thing.
If your team is working on “our own LLM fine-tune” — re-evaluate whether it’s still worth it compared to a frontier model used with your data in the context window.
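
The budget-split check in the first step is simple arithmetic. The sketch below uses made-up spend figures and returns the split in the same model/integration form as the 10/90 benchmark above.

```python
def budget_split(model_spend, integration_spend):
    """Return the (model %, integration %) split of the AI budget, rounded."""
    total = model_spend + integration_spend
    if total <= 0:
        raise ValueError("total AI budget must be positive")
    model_pct = 100 * model_spend / total
    return round(model_pct), round(100 - model_pct)

# Hypothetical budget: 300k on training and fine-tuning, 700k on integration.
split = budget_split(model_spend=300_000, integration_spend=700_000)
print(f"model/integration = {split[0]}/{split[1]}")
```

A 30/70 result like this one would sit well above the 10/90 benchmark and would be a prompt to revisit the fine-tuning line.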

What to expect.

30 days: more detail from WWDC 8 June on how Apple integrates Gemini in iOS 27.
90 days: the first regulated enterprises (banking, healthcare) start publicly debating why they cannot adopt the Gemini-powered Siri for corporate work.
Year-end: at least one major non-US tech company (Samsung, or an EU player) announces a similar licensing deal with Anthropic or Mistral.

Today’s picture

Three pieces of news (Anthropic $965B, Groq $650M, Apple-Google $1B/year) tell the same story: the AI supply chain is splitting into layers with opposite dynamics. At the top, in frontier models, capital and talent concentrate into 3-4 vendors. In the middle, in inference, competition rises and prices fall. At the bottom, in applications and integration, lies open ground, where most of your economic value will be created. Even Apple has retreated from building frontier models to focus on distillation and the application layer. That’s a hint for where you, with a smaller budget, should be focusing too.

Event	Consequence
Anthropic at $965B valuation, $47B run rate	Frontier vendor count narrows to 3-4; multi-vendor clauses become critical
Groq $650M after Nvidia $20B deal	Inference prices fall over next 6-12 months; don’t lock long-term now
Apple licenses Gemini at $1B/year	Even giants stop building in-house; redirect your budget from “our own model” to “integration”

Three questions for the leader:

Do you know your company’s total AI spend (every SaaS tool that uses OpenAI/Anthropic/Google under the hood)?
If Anthropic or OpenAI doubled their price tomorrow, which workflow would stop, and in how many days could you switch to an alternative?
Does your 2026 budget include a “build our own model” or “fine-tune our own LLM” line? If yes — would you like to discuss why Apple just gave up on doing the same?
