AI stack splits in three: $965B at the top, new competition in inference
Anthropic raises $65B at $965B, Groq seeks $650M, Apple pays Google $1B/year for Gemini — the AI supply chain splits into layers with opposite dynamics.
Summary
- The frontier will narrow to 3-4 vendors: Anthropic raised $65B at a $965B post-money valuation, and Apple signed a roughly $1B/year deal with Google for a Gemini model to power Siri.
- Inference is fragmenting and getting cheaper: Groq is raising $650M after Nvidia’s ~$20B partial deal, and competition among compute providers is intensifying.
- Even Apple gave up on building its own frontier model: the world’s richest tech company is licensing a 1.2T-parameter Gemini because its in-house models (max 150B parameters) cannot keep up.
Bottom line: the AI supply chain is splitting into layers with opposite dynamics — concentration at the top, cheaper competition in the middle. Your job as a leader is to know which layer you’re buying from and which you’re not.
1. Anthropic at $965B — frontier capital concentrates at the top
What changed. Anthropic raised $65B in Series H at a $965B post-money valuation. Lead investors: Altimeter, Dragoneer, Greenoaks, Sequoia. Annualized run rate now $47B — up from $9B in December. Alongside the round, the company released Claude Opus 4.8 with a 1M-token context window, 69.2% on SWE-Bench Pro (10 points above GPT-5.5), and a new “Dynamic Workflows” (ultracode) capability — a single task spawns hundreds of parallel subagents (demonstrated migrating a 750,000-line codebase in 6 days).
Why it matters. Two years ago there were a dozen serious frontier-model contenders. Today there are 3-4 — Anthropic, OpenAI, Google, maybe Mistral. The capital wall (the cash you need to stay at the frontier) is now beyond anyone who doesn’t have $60B+ in the bank. That means your choice of vendor over the next two years narrows sharply, and switching costs rise.
What to do this month.
- List which AI models your workflows currently depend on (chat, embeddings, code, voice). How many vendors? How much of your data lives behind one?
- Add integration for at least one alternative frontier model — even if you don’t use it today. Multi-vendor clauses in contracts will be standard next year.
- Treat Anthropic’s Dynamic Workflows as a stress test: if properly orchestrated agents can migrate a 750,000-line codebase in 6 days, which of your workflows are worth automating first?
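For the engineering side of "add integration for at least one alternative frontier model", a minimal sketch of what that can look like in code is a thin wrapper that tries vendors in order. Everything here is an assumption for illustration: the `Provider` class, `complete_with_fallback`, and the two stub functions are hypothetical names, and the stubs stand in for real vendor SDK calls that are deliberately not shown.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Provider:
    """Hypothetical wrapper around one vendor's text-completion call."""
    name: str
    complete: Callable[[str], str]  # takes a prompt, returns generated text


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has failed."""


def complete_with_fallback(prompt: str, providers: Sequence[Provider]) -> tuple[str, str]:
    """Try providers in order; return (provider_name, text) from the first success."""
    errors = []
    for provider in providers:
        try:
            return provider.name, provider.complete(prompt)
        except Exception as exc:  # real code should catch the vendor SDK's own error types
            errors.append(f"{provider.name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stand-in callables (assumptions): a real integration would call each vendor's SDK here.
def primary_stub(prompt: str) -> str:
    raise TimeoutError("primary vendor unavailable")


def secondary_stub(prompt: str) -> str:
    return f"[secondary] {prompt}"


if __name__ == "__main__":
    providers = [
        Provider("primary", primary_stub),
        Provider("secondary", secondary_stub),
    ]
    print(complete_with_fallback("Summarize this contract.", providers))
```

The point of the design is that workflows call `complete_with_fallback` rather than a vendor SDK directly, so adding or reordering vendors is a configuration change instead of a rewrite.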
What to expect.
- Next 30-90 days: OpenAI and Google respond with their own price or context-window improvements.
- 6-month window: frontier API prices on input tokens drop 30-50% at the top tier.
- By year-end: the first European enterprises start requiring “multi-vendor clauses” from SaaS vendors that white-label only one model.
2. Groq’s $650M and Nvidia’s $20B — the inference layer fragments
What changed. Groq, the AI inference-chip startup, is raising $650M from existing investors — Disruptive and Infinitium have agreed to cover any shortfall from other investors. This follows a December deal in which Nvidia paid ~$20B to bring senior Groq employees aboard and license Groq’s hardware technology, without a full acquisition. Groq is currently led by interim CEO Adam Winter and CFO Matt Eng. The key pivot: from chip manufacturing to inference-cloud services.
Why it matters. Inference — the compute that runs when your user hits “submit” and the AI responds — is now a larger market than model training. The fact that Nvidia paid $20B to absorb Groq’s talent, and that Groq’s investors are putting in another $650M after that, signals that inference prices are being pushed down hard. For a mid-size business, this is a clear message: do not sign 3-year fixed-price AI compute contracts right now; in 6-12 months, the price you pay per million output tokens may drop 40-60%.
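To make the contract arithmetic concrete, here is a rough sketch comparing a 3-year fixed-price commitment with a pay-as-you-go arrangement. It assumes constant usage and a single step drop in price, which is a simplification; the $10,000/month figure is a made-up example, and only the 40-60% and 6-12 month ranges come from the paragraph above.

```python
def fixed_total(monthly_cost: float, months: int, discount_pct: float = 0.0) -> float:
    """Total paid under a fixed-price contract, with an optional commitment discount."""
    return monthly_cost * (100 - discount_pct) / 100 * months


def flexible_total(
    monthly_cost: float, months: int, drop_pct: float, drop_after_months: int
) -> float:
    """Total paid month to month if the price falls by drop_pct after drop_after_months.

    Assumption: one step change in price, constant usage throughout.
    """
    months_before = min(drop_after_months, months)
    months_after = months - months_before
    reduced_cost = monthly_cost * (100 - drop_pct) / 100
    return monthly_cost * months_before + reduced_cost * months_after


if __name__ == "__main__":
    monthly = 10_000  # hypothetical current monthly inference spend in USD
    print("fixed, 36 months:", fixed_total(monthly, 36))
    print("flexible, 40% drop after 12 months:", flexible_total(monthly, 36, 40, 12))
    print("flexible, 60% drop after 6 months:", flexible_total(monthly, 36, 60, 6))
```

Under these assumptions the fixed contract costs $360,000 over three years, against $264,000 in the conservative case and $180,000 in the aggressive one. A vendor's commitment discount can be entered through `discount_pct` to see how large it would need to be to close that gap.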
What to do this month.
- Review every AI cloud contract coming up for renewal this quarter and next. Defer long-term commitments by 3-6 months where you can.
- Calculate your monthly AI inference bill (including every SaaS tool that uses OpenAI/Anthropic “under the hood”). Do you know the number? If not — that’s your first problem.
- Identify one critical workflow where you are overpaying for AI right now, and find an alternative (different model pricing, on-prem model, or simply shorter context).
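One way to arrive at the monthly number is a small script over a list of line items. The sketch below is illustrative only: the tools, vendors, and amounts are invented, and `ai_share_pct` (the share of a SaaS fee you attribute to the AI model underneath) is an estimate you would have to supply yourself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LineItem:
    tool: str
    underlying_vendor: str   # whose model runs "under the hood"
    monthly_cost_usd: float
    ai_share_pct: float      # estimated % of the fee attributable to AI usage


def ai_bill_by_vendor(items: list[LineItem]) -> dict[str, float]:
    """Sum the AI-attributable monthly cost per underlying model vendor."""
    totals: dict[str, float] = defaultdict(float)
    for item in items:
        totals[item.underlying_vendor] += item.monthly_cost_usd * item.ai_share_pct / 100
    return dict(totals)


def largest_vendor_share(by_vendor: dict[str, float]) -> tuple[str, float]:
    """Return the vendor with the largest bill and its percentage of total AI spend."""
    total = sum(by_vendor.values())
    if total == 0:
        raise ValueError("no AI spend recorded")
    vendor = max(by_vendor, key=by_vendor.get)
    return vendor, by_vendor[vendor] / total * 100


if __name__ == "__main__":
    # Hypothetical line items for illustration.
    items = [
        LineItem("Direct API usage", "Vendor A", 4000, 100),
        LineItem("Support chatbot SaaS", "Vendor A", 2000, 50),
        LineItem("Meeting notes SaaS", "Vendor B", 1000, 30),
    ]
    by_vendor = ai_bill_by_vendor(items)
    print(by_vendor, sum(by_vendor.values()))
    print(largest_vendor_share(by_vendor))
```

The per-vendor breakdown doubles as a concentration check: if one vendor accounts for most of the total, that is where a price change would hurt first.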
What to expect.
- 90 days: another one or two inference startups (Cerebras, SambaNova) announce major rounds.
- 6 months: the first public price war between Anthropic, OpenAI and Google pushes input-token prices below $1 per million.
- Year-end: a “consolidated AI bill” emerges as a line item in larger companies’ financial reports.
3. Apple pays Google $1B/year for Gemini — even the giants stop building in-house
What changed. Apple and Google jointly announced on 12 January that Apple will gain access to a custom 1.2-trillion-parameter Gemini model, built specifically for Siri and Apple Intelligence. The deal is valued at roughly $1B/year (Bloomberg / Mark Gurman). Apple distills (compresses) the large Gemini into smaller models that run on-device on iPhone with no network connection. iOS 26.4 delivers the first Gemini-powered Siri features to 1.5 billion daily users. A full Siri redesign is expected with iOS 27 (WWDC 8 June, public release September).
Why it matters. Apple is the most resource-rich technology company in the world, with $200B+ in cash. For three years Apple built its own AI models — peaking at 150B parameters. The message from this deal: even with all that cash and talent, Apple admitted it could not catch up to the frontier in time. For mid-size businesses still spending budget on “our own AI team training our own model” — that’s wasted cost and time. Your competitive opening is not in the model — it’s in how you integrate existing models with your data and workflows.
What to do this month.
- Look at how your AI budget splits: what percentage goes to “the model” (training, fine-tuning, our own LLM) vs. “integration” (your data, your workflows, your interface)? A healthy split today is 10/90 or 5/95 in favour of integration.
- Identify 1-2 workflows where calling a frontier model directly (via API) is cheaper and faster than buying a SaaS tool that does the same thing.
- If your team is working on “our own LLM fine-tune” — re-evaluate whether it’s still worth it compared to a frontier model used with your data in the context window.
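The budget-split check can be done on the back of an envelope, or with a few lines of code once the budget lines are tagged. In this sketch the budget lines and amounts are hypothetical, and the 10% ceiling for "model" spend is simply the 10/90 guideline from the first bullet expressed as a default parameter.

```python
def budget_split(lines: dict[str, tuple[str, float]]) -> dict[str, float]:
    """Return the percentage of spend per category ("model" or "integration").

    `lines` maps a budget line name to (category, amount).
    """
    totals = {"model": 0.0, "integration": 0.0}
    for name, (category, amount) in lines.items():
        if category not in totals:
            raise ValueError(f"unknown category for {name!r}: {category!r}")
        totals[category] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        raise ValueError("budget is empty")
    return {cat: round(amount / grand_total * 100, 1) for cat, amount in totals.items()}


def model_spend_too_high(split: dict[str, float], max_model_pct: float = 10.0) -> bool:
    """True if the model share exceeds the guideline (10/90 by default)."""
    return split["model"] > max_model_pct


if __name__ == "__main__":
    # Hypothetical annual budget lines in USD.
    budget = {
        "Fine-tuning our own LLM": ("model", 50_000),
        "Data pipelines and retrieval": ("integration", 90_000),
        "Workflow and UI integration": ("integration", 60_000),
    }
    split = budget_split(budget)
    print(split, "review model spend:", model_spend_too_high(split))
```

With these made-up numbers the split is 25/75, which exceeds the guideline and would flag the fine-tuning line for review.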
What to expect.
- 30 days: more detail from WWDC 8 June on how Apple integrates Gemini in iOS 27.
- 90 days: the first regulated enterprises (banking, healthcare) start publicly debating why they cannot adopt the Gemini-powered Siri for corporate use.
- Year-end: at least one major non-US tech company (Samsung, or an EU player) announces a similar licensing deal with Anthropic or Mistral.
Today’s picture
Three pieces of news (Anthropic $965B, Groq $650M, Apple-Google $1B/year) tell the same story: the AI supply chain is splitting into layers with opposite dynamics. At the top, in frontier models, capital and talent concentrate into 3-4 vendors. In the middle, in inference, competition rises and prices fall. At the bottom, in applications and integration, lies open ground, where most of your economic value will be created. Even Apple has retreated from building at the top layer to focus on applications and integration. That’s a hint for where you, with a smaller budget, should be focusing too.
| Event | Consequence |
|---|---|
| Anthropic at $965B valuation, $47B run rate | Frontier vendor count narrows to 3-4; multi-vendor clauses become critical |
| Groq $650M after Nvidia $20B deal | Inference prices fall over next 6-12 months; don’t lock long-term now |
| Apple licenses Gemini at $1B/year | Even giants stop building in-house; redirect your budget from “our own model” to “integration” |
Three questions for the leader:
- Do you know your company’s total AI spend (every SaaS tool that uses OpenAI/Anthropic/Google under the hood)?
- If Anthropic or OpenAI doubled their price tomorrow, which workflow would stop, and in how many days could you switch to an alternative?
- Does your 2026 budget include a “build our own model” or “fine-tune our own LLM” line? If yes — would you like to discuss why Apple just gave up on doing the same?
Sources
- Latent Space: Anthropic raises $965B Series H, releases Opus 4.8 and Dynamic Workflows/ultracode
- TechCrunch: After Nvidia’s $20B not-acqui-hire, AI chip startup Groq reportedly raising $650M
- Financial Content: Apple Inks $1 Billion Deal with Google to Power Gemini-Fueled Siri Revamp
- Ars Technica: Apple working to cram massive Gemini model into iPhone to power new Siri
- Koen van Gilst: Notes from the Mistral AI Now Summit