The AI Stack Weekly
Issue archive.
A weekly read on the flywheel between software, hardware, and networking. Boardroom-grade. Source-graded. Predictions scored.
Each issue runs to about seven minutes, ships every Friday, and is archived here the moment it lands. The thesis the brief defends is published separately and revised when evidence demands.
Subscribe via RSSPublished issues.
- Issue 07/Week 23 of 2026/
The AI factory became a power-and-fabric problem, not a model-release problem.
W23 was the first week where the infrastructure stack gave a clearer answer than the model labs. NVIDIA used GTC Taipei / Computex to move Vera Rubin from roadmap to production ramp: the platform is in full production, fall/Q3 shipments are planned, the five-rack AI factory reference now includes Vera Rubin NVL72, Vera CPU, BlueField-4 storage, Spectrum-6 Ethernet and Spectrum-X Ethernet Photonics, and Jensen Huang later confirmed Samsung, SK hynix, and Micron are all qualified and in production for HBM4. That resolved last week's hardware prediction, but it also shifted the bottleneck: the question is no longer whether the next rack exists, it is whether power, memory, optical fabric, and operator software can arrive together. On software, the closed frontier was quiet — Gemini 3.5 Pro still had not GA'd by the end of the window — while open weights widened in the efficient-agent layer: JetBrains Mellum2, NVIDIA Cosmos 3, and Holo3.1 all targeted deployable sub-agents, physical-AI reasoning, or local computer-use rather than a monolithic chatbot benchmark. On applications, Microsoft Scout, Salesforce Coworker, ServiceNow Otto, Wordsmith, and Stilta all pointed at the same control-plane fight: governed agents with identities, permissions, and workflow authority. Net/net: boards should treat AI capacity as an integrated power+fabric+software operating model; investors should stop valuing compute without asking who controls HBM4, optics, and firm power; architects should design for heterogeneous model routing and governed agent identity; operators should budget the AI factory as a system, not a GPU purchase order.
- Issue 06/Week 22 of 2026/
A quiet release week resolved last week's two biggest open questions — and moved the contested ground to optics and the agent control plane.
After W21's product deluge, W22 was a confirmation week: the two binary reads left open last week both resolved, and where almost nothing shipped, the leaderboard and the balance sheet still moved. On software, Claude Opus 4.8 (May 28) retook #1 on the Artificial Analysis Intelligence Index (61.4 vs GPT-5.5's 60.2) with a >10-point SWE-Bench Pro lead — at flat list pricing — while Gemini 3.5 Pro slipped to June, leaving Anthropic the unchallenged closed frontier. On capital, the W21 watch item resolved cleanly: Anthropic's round closed May 28 as a $65B Series H at a $965B post-money, the largest private AI raise on record (though ~$15B is previously-committed hyperscaler capital, not fresh cash). On hardware, the other open risk closed too — Samsung's union ratified its wage deal on May 27 (73.7% approval), formally averting the strike that threatened HBM4 packaging, so the binding constraint stays supply (CoWoS + HBM4 + surging DRAM pricing), not demand. The genuinely new front was networking: four independent vendors (GlobalFoundries, Wiwynn, Credo, Edgecore) pushed co-packaged optics and all-photonic fabric from demos toward deployable product in-window, with one analyst house putting a ~$154B TAM on optical interconnect — a textbook Gilder arc where per-GPU bandwidth, and the dollar content that carries it, scales faster than compute. So-what for the four audiences: boards should treat the HBM4 supply scare as closed and the custom-silicon counter-flywheel (ASIC shipments +44.6% YoY vs +16.1% for merchant GPUs) as the durable margin question; investors should read the Anthropic close as the answered binary and discount the louder unverified headlines (a '$45B Anthropic-pays-xAI-through-2029' filing the principal himself recut to a ~180-day lease; a secondary-sourced '$1T OpenAI September IPO'); architects should re-baseline coding-agent evals on Opus 4.8 but pin effort levels first; operators should put agent governance and the data-cloud control plane (Snowflake's Natoma/MCP buy, Salesforce's $1.2B Agentforce ARR) on this quarter's procurement roadmap.
- Issue 05/Week 21 of 2026/
The bull thesis got every confirmation it needed. The strategic battleground moved to runtimes.
W21 was the loudest week of Q2 across every lens at once. NVIDIA Q1 FY27 (Tuesday May 20) printed $81.6B revenue (+85% YoY) with Data Center at $75.2B (+92% YoY, +21% QoQ from $62.3B) and a Q2 guide of $91B with zero China DC compute assumed; total supply commitments stepped to $145B; $80B additional buyback authorized and the dividend lifted 25x. Huang said NVIDIA 'will be supply-constrained through the entire life of Vera Rubin' (Q3 2026 production) — the bear thesis that hyperscaler capex isn't translating to chip revenue is gone, and the binding constraint moved firmly to HBM4 and CoWoS packaging. The frontier-lab valuation curve reset 15x in 14 months. Bloomberg (Thursday May 22) reported Anthropic's $30B-plus round closing 'as soon as next week' at a $900-plus-billion valuation, vaulting past OpenAI's $852B March mark, with Sequoia, Dragoneer, Altimeter, and Greenoaks each writing ~$2B. NVIDIA's Q1 transcript explicitly named Anthropic alongside the hyperscalers as a Blackwell deployment customer, confirming the equity bid matches compute-side demand. Google I/O 2026 (May 19-22) compressed a normal three-month product cycle into one keynote — Gemini 3.5 Flash (AA Intelligence Index 55.3, Terminal-Bench 76.2%, 4x faster than incumbent frontier at ~280 tok/sec), Gemini Omni Flash (native text-and-image-and-audio-and-video into grounded video out, shipping immediately in Gemini app / Flow / YouTube Shorts), Antigravity 2.0 standalone agent IDE with managed Linux sandboxes, and TPU 8t/8i dual-chip 8th gen. Combined with OpenAI Codex 'Goal mode' GA + macOS Appshots (May 21), Anthropic's self-hosted Managed Agent sandboxes + MCP tunnels (May 19), and xAI Grok Build (W20), the strategic battleground shifted decisively from model intelligence to agent-runtime monetization and ecosystem lock-in. The specialist canopy kept widening at a fraction of frontier cost. Cohere Command A+ shipped the first-ever Apache 2.0 frontier-adjacent MoE (218B/25B-active, runs on 2x H100, τ²-Bench Telecom jumped 37%→85% generation-over-generation). Microsoft Research Fara1.5-27B (open weights, fine-tuned on Qwen3.5) hit 72% Online-Mind2Web — beating OpenAI Operator (58.3%) and Gemini Computer Use (57.3%) — collapsing the cost structure for browser-agent fleets. Alibaba Qwen3.7-Max (May 20) became the first Chinese model in the AA Intelligence Index top 5 (56.6, ahead of Gemini 3.5 Flash) with a demonstrated 35-hour autonomous tool-use run. The one bullet dodged: Samsung-union talks reached a tentative wage agreement late Wednesday May 20 — about an hour before the planned 18-day general strike — granting a 10.5% performance bonus pending ratification (vote May 27-28). The W20 downside scenario (Vera Rubin HBM4 supply at acute risk) is largely off the table conditional on the vote passing. So-what for the four audiences: boards should retire 'AI capex peak' theses and treat federal-procurement + capability-gating as durable vendor-risk dimensions; investors should treat the Anthropic close (not the headline valuation) as the binary read of the week and price NVIDIA on supply-not-demand; architects and operators should pull agent-runtime selection into procurement bake-offs now — model swaps inside someone else's harness are constrained.
- Issue 04/Week 20 of 2026/
Frontier text took a breath. The constraint stack moved to supply, regulation, and pricing structure.
The model layer took a breath this week. No new GPT-class or Claude refresh between W19 and Google I/O (May 19-20), no headline frontier-text GA — but the specialist canopy widened sharply: Perceptron Mk1 priced video and embodied reasoning 80-90% below Gemini Flash Lite on May 12; NVIDIA's open-weights SANA-WM put a minute-scale 720p world model on a single RTX 5090 on May 15; OpenBMB's MiniCPM-V 4.6 collapsed multimodal SLMs to 1.3B running on-device across iOS, Android, and HarmonyOS on May 11. Pricing structure became the dominant decision input. Anthropic emailed Max-20x subscribers a June 15 policy that moves Claude Agent SDK, `claude -p`, GitHub Actions, and third-party agents off subscription rate limits onto separate $20-$200/mo metered credit at API list prices — 12x-175x effective price increase per workload, ending Claude Code subscription arbitrage. OpenAI and Microsoft formalized a $38B cumulative cap on Microsoft's revenue-share through 2030 — the first hard ceiling on any hyperscaler's AI revenue-capture. xAI shipped Grok Build CLI and a SuperGrok Heavy tier at $300/mo. The unit of competition shifted from model intelligence to agent-runtime monetization. Demand pulled forward into prints. Cisco Q3 FY26 raised FY26 AI infrastructure orders guide from $5B to $9B (Q3 AI orders $1.9B vs $600M YoY; ~$300M from non-hyperscaler neocloud / sovereign / enterprise buyers). Applied Materials Q2 FY26 printed a record $7.91B with record 50% non-GAAP gross margin and guided calendar-2026 semicap growth above 30%. Anthropic in talks at over $900B post-money on $30B April ARR (Bloomberg May 12), target close end-May. OpenAI launched a $4B+ Deployment Company with TPG, Advent, Bain, and Brookfield. The supply chain wedged itself into the read. Samsung-union talks collapsed on May 13, locking in an 18-day general walkout May 21-June 7 across 50,000+ workers — Samsung is the sole HBM4 mass-shipper for NVIDIA's Vera Rubin platform, holds ~40% global DRAM share, and a single April 23 half-day rally already cut daily memory output 18.4%. TSMC's Taiwan Tech Symposium guided AI wafer demand 11x 2022-2026 and CoWoS capacity CAGR above 80% through 2027 — capacity is coming, but not before Q3. The so-what for the four audiences: boards should treat federal procurement (Canada-TELUS BC sovereign AI factory May 11; HMRC-Quantexa £175M sovereign data and AI May 14) and capability gating (UK AISI's 4.7-month doubling rate on autonomous cyber) as durable vendor-risk dimensions. Investors should treat the Anthropic round close (not the headline valuation) as the binary read of the week and watch for Samsung-walkout impact on NVIDIA Q1 FY27 (May 20) and H2 hyperscaler capex. Architects and operators running Claude Code at scale need to model true API burn before the June 15 cutover this week, lock HBM allocation and Q3 GPU delivery slots, and add at least one specialist physical-AI model (Perceptron Mk1, MolmoAct 2) to 2H roadmaps before procurement budgets close.
- Issue 03/Week 19 of 2026/
Capability stops being about parameters. The constraint stack flipped to power, fiber, and packaging.
W19 is the week the AI compute story stopped being told in parameter counts. No new closed-frontier text flagship shipped — GPT-5.5 and Opus 4.7 still anchor the AA Intelligence Index — yet the most consequential capability move was a 300 MW SpaceX deal that let Anthropic double Claude Code rate limits inside a week. OpenAI shipped a voice trio (GPT-Realtime-2 + Translate + Whisper) with GPT-5-class reasoning baked in, jumping 15 points on Big Bench Audio; xAI moved Grok 4.3 to GA with 1M context at $1.25/$2.50 per Mtok; Google promoted Gemini 3.1 Flash-Lite to GA at $0.25/$1.50. Meanwhile NVIDIA wrote two equity checks that buy zero transistors: $500M in Corning warrants for a 10x expansion of US optical capacity, and up to $2.1B in IREN warrants alongside a $3.4B 5-year contract underwriting 5 GW of NVIDIA-aligned AI infrastructure. On the same Monday, OpenAI / AMD / Broadcom / Intel / Microsoft / NVIDIA jointly published Multipath Reliable Connection (MRC) as an OCP-released open RDMA transport — already in production at the Stargate Abilene site, Microsoft Fairwater, and OCI — collapsing 100,000-GPU clusters from three or four switching tiers down to two. The procurement read: capability is being bought with photonics and power purchase agreements, not parameter counts. Boards should reset AI-vendor diligence to ask 'who's your power partner and which fabric standard do you ship on,' not 'what's your AA Index score.' Investors should treat NVIDIA's equity stakes in CoreWeave and now IREN as a circular-financing pattern that auditors will start framing — supplier financing customers who buy supplier product. Architects should add MRC compliance and AMD-as-substrate (ZAYA1's open frontier-class MoE trained end-to-end on MI300X this week) to procurement bake-offs that previously assumed NVIDIA + closed-fabric defaults. Operators in PJM territory should re-bid 5-7 year time-to-power for new Dominion-territory applications and pre-secure interconnect now.
- Issue 02/Week 18 of 2026/
Q1 broke the bear thesis. The marginal dollar started earning more — and the network still compounds.
The W17 read was 'the marginal dollar earns less at every layer except the network.' Q1 2026 hyperscaler prints (MSFT / GOOG / META / AMZN, all on Apr 29) reset the math. Aggregate 2026 capex stepped to $695-725B (+77% YoY) but AI revenue grew faster — Microsoft Cloud commercial RPO doubled YoY to $627B, Microsoft AI run rate +123% YoY to $37B, Google Cloud +63%, AWS +28% (a 15-quarter high). The hyperscaler capex / AI revenue lever compressed from ~5.5 to ~5.0-5.2 — the first Q-over-Q compression in five quarters. The network hypothesis got an audited proof point in the same 48 hours: Equinix Q1 Fabric revenue +26% YoY with bookings +70% YoY at a record 51% adjusted EBITDA margin, raised guidance, ~60% of the largest Q1 deals AI-related. Custom silicon at ~31% (Trainium's $225B forward book, Maia 200 going GA in Iowa, MTIA expanded with Broadcom and AMD lanes) is starting to compress merchant GPU pricing power. Capability gating became market structure rather than research-org concern: GPT-5.5-Cyber and Claude Security both shipped this week as gated, separately-priced enterprise products. And the earn-out got a federal proof point on Friday: the Pentagon's May 1 eight-vendor classified-network AI procurement slate puts frontier-grade AI on IL6/IL7 deployment paths for OpenAI, Google, Microsoft, AWS, NVIDIA, SpaceX, Reflection AI, and Oracle — explicitly excluding Anthropic on supply-chain-risk grounds with Mythos cited as a separate national security moment. Federal procurement is now a third distinct earn-out channel alongside enterprise platform pull-through and gated capability SKUs, and capability-gating just became a federal-disqualification risk, not only a research-org concern. The arc the buildout is now on: less hype, more earn-out — boards should reset their AI-ROI thresholds upward, investors should track the capex / AI-revenue ratio (now ~5.0, threshold >6.0) as the wedge it has stopped being and add federal-channel parity as a vendor-risk dimension, and architects should pull custom-silicon ASIC TCO into procurement bake-offs that previously assumed merchant GPU dominance.
- Issue 01/Week 17 of 2026/
The marginal dollar at every layer earns less. Except the network.
Capital is still flooding the AI stack, but the marginal dollar earns less than the prior dollar at every layer except one. Hyperscalers are deploying ~$700B of 2026 capex against ~$120B of AI-attributable revenue — a ratio that gets tested when Microsoft, Google, Meta, and Amazon all print Q1 between April 29 and May 6. Frontier labs are committing forward compute at multiples of their disclosed revenue. The on-prem floor moved up sharply this week with DeepSeek V4 (open weights, frontier-class on coding), reshaping enterprise procurement: the question is no longer whether to self-host frontier capability, but which workload to move and on what fabric. The durable winner of the buildout is the network and interconnect layer that connects cloud, neocloud, and on-prem for hybrid AI — the only layer where pricing power compounds rather than compresses.
The methodology.
- — Every quantitative claim is graded 1–5 on source quality. Anything 2 or below is flagged as noise.
- — Every prediction is falsifiable, time-bounded, and tied to a specific signal. Future issues score them hit, miss, partial, or pending.
- — Twelve levers are tracked week over week, with thresholds that materially change the read.
- — All sources are public. Independent analysis only.
Operate. Publish. Teach.