The AI Stack Weekly

Issue archive.

A weekly read on the flywheel between software, hardware, and networking. Boardroom-grade. Source-graded. Predictions scored.

Each issue runs to about seven minutes, ships every Friday, and is archived here the moment it lands. The thesis the brief defends is published separately and revised when evidence demands.

Subscribe via RSS

Published issues.

Issue 14/Week 30 of 2026/July 25, 2026
The mid-tier compressed frontier economics while the rack-and-power layer turned into a two-vendor race
Two curves moved toward each other this week. At the model layer, Anthropic put near-Fable intelligence into Claude Opus 5 at $5/$25 per million tokens — roughly half Fable 5's price and unchanged from Opus 4.8 — while Google pushed high-volume agent economics down with Gemini 3.6 Flash at $1.50/$7.50 and a claimed 17% reduction in output tokens versus 3.5 Flash. Flash-Lite set the throughput floor at roughly 350 output tokens per second for $0.30/$2.50. Intelligence is not free, but the premium for useful frontier work compressed sharply in four days. At the physical layer, AMD's Helios rack made the accelerator market look less like NVIDIA plus alternatives and more like a two-vendor rack race. The announced 72-GPU MI455X design uses OCP Open Rack Wide, UALoE over merchant Broadcom Tomahawk 6, 2.4 Tbps scale-out bandwidth per GPU, and a 225-245 kW power envelope. AMD claims 50% more HBM4 capacity and bandwidth than Vera Rubin and 15-25% better training performance on paper; neither claim is production evidence. CoreWeave's measured DeepSeek-R1 result adds a harder counterpoint: Vera Rubin NVL72 delivered 10x more tokens per second per megawatt than GB200 in its test. That is one operator and one workload, not a universal efficiency ratio, but it raises the evidentiary bar for Helios. The comparison buyers will make is delivered Helios rack versus measured Rubin rack, not MI455X versus an NVIDIA GPU. Capital is reinforcing that race. AMD and Anthropic announced up to $5B of AMD equity investment alongside deployment of up to 2 GW of MI450-series and Helios systems. Hut 8 separately put a 15-year, 352 MW, $9.8B base-term contract behind Beacon Point Phase 2. OpenAI's Camellia agreement adds a different constraint: a $20B campus with 3.2 GW of staged service, full infrastructure-cost recovery, and up to 1 GW of peak curtailment. The market is no longer funding chips in isolation; it is underwriting complete rack, site, power, and offtake systems. The decision implication is blunt. Model buyers should re-baseline routing now: Opus 5 for hard judgment, Flash for high-volume loops, and explicit telemetry for fallbacks, token use, and cache continuity. Infrastructure buyers should preserve competitive tension between two rack roadmaps but refuse paper-performance comparisons without delivered-cluster evidence. Site and utility planners should assume that multi-gigawatt AI load will increasingly come with long-term offtake, self-funded infrastructure, and curtailment obligations. The value is migrating away from a single best chip or model and toward the control plane that can route work, power, and capital across constrained tiers.
Download executive brief (PDF)
Issue 13/Week 29 of 2026/July 18, 2026
Model pricing is converging from both directions onto one frontier band — closed labs cutting down into $3/$15 while Chinese open-weight labs raise up into it — and the differentiation is shifting from price to who actually ships weights.
Here is the claim nobody else made this week: frontier model pricing is now converging from both directions onto a single $1-3 input / $6-15 output band. Last week the closed labs cut down into it (GPT-5.6 Terra at half GPT-5.5's rate, Grok 4.5 at $2/$6). This week the open-weight side raised up into it — Moonshot priced Kimi K3 at $0.30/$3/$15 per MTok, roughly 3x its predecessor and at Claude Sonnet 5's standard rate card (though Sonnet's $2/$10 intro price runs through Aug 31, leaving K3 ~50% above the live street price) — while DeepSeek added the first peak-hour 2x surge pricing from a major model API. The 'end of cheap Chinese AI' framing itself was already circulating on launch day (the-decoder; Simon Willison called K3 the most expensive Chinese-lab model to date); what the convergence adds is the falsifiable market-structure read: with price no longer the differentiator, the axis shifts to deployment control — and K3 launched hosted-only with a vendor-reported 2.8T parameters, an unpublished license, and weights promised by July 27, the same week Thinking Machines' 975B Inkling shipped Apache 2.0 weights on day one. 'Open' is becoming a license-and-weights fact, not a launch label. The capability side makes the pricing story matter: K3 is the first open-weights-slated model to reach the closed-frontier tier on independent evaluations — 57.1 on the Artificial Analysis Intelligence Index (#4 of 189, within three points of Claude Fable 5 and GPT-5.6 Sol) and #1 on LMArena's Frontend Code Arena. Meanwhile the physical layer tightened on schedule: TSMC posted a record $40.2B quarter, raised 2026 capex to $60-64B, added $100B to Arizona — and CEO C.C. Wei said advanced packaging capacity is 'so tight that now it's limiting my customers' growth.' PJM's 2028/29 capacity auction cleared at the FERC cap for the third straight year while 6,831 MW short (our house measurement below prices what the cap hides), and New York signed the first statewide data-center permit moratorium a day later. Compute itself began trading across former battle lines: Anthropic in reported talks to lease up to $10B of capacity from Meta, while GPU-backed credit institutionalized — a $775M Nebius facility at SOFR+2.50% (2.5 points over the floating benchmark rate), a Fitch-rated $2.6B CoreWeave delayed-draw term loan, and the first inference-ASIC-collateralized facility. What to do with it: boards should re-price single-vendor model dependency against a credible self-host option arriving July 27 — and treat the pricing convergence as the negotiating window. Investors should note credit and equity are telling different stories about GPU assets (cleanly at CoreWeave; the sector read is contested — Nebius equity rose on its debt deal). Architects and operators should treat packaging, power, and permits — not chips or capital — as the binding constraints for 2027 planning.
Download executive brief (PDF)
Issue 12/Week 28 of 2026/July 11, 2026
The harness became the product: OpenAI and Anthropic shipped rival work runtimes 48 hours apart — while memory and power marked the scarcity trade to market with $26.5B of real money.
Two launches this week redrew the enterprise AI procurement map. Anthropic pushed Claude Cowork to web and mobile on July 7 with cloud-run background sessions, citing 1.2 million sessions across 600,000+ organizations showing most Cowork use is non-coding knowledge work. OpenAI answered on July 9 with ChatGPT Work — the Codex task runtime generalized to all knowledge work, bundled into a desktop app available on every plan including Free, with 1 million of Codex's 5 million weekly users already working outside software development. The same week, two independent quantitative studies explained why the harness is where the fight moved: Databricks' merged-PR benchmark found the same model at the same effort costs over 2x more per task depending on harness choice — with open-weight GLM 5.2 statistically tied with Opus 4.8 at $1.28 vs $1.94 per task — and LangChain/NVIDIA showed harness tuning alone lifts an open model to near-Opus quality at roughly 10x lower cost. The model layer kept commoditizing on cue: GPT-5.6 went GA with Terra at $2.50/$15 (half of GPT-5.5's rate, resolving prediction p55 as a hit), and Grok 4.5 launched at $2/$6 positioning explicitly on cost-per-task. Meanwhile the physical layer banked the other side of the trade: SK hynix closed the largest-ever foreign US IPO at $26.5B with 7x demand and a +13% first-day pop, Samsung guided to a record ~KRW 89.4T quarter on AI memory, and Meta committed C$13B to a 1 GW Alberta campus where it must fund its own gas generation because the grid cannot host multiple large AI loads. Net/net: capability is commoditizing, harnesses are consolidating into suites, and margin keeps pooling in memory and power. Boards should treat agent-suite governance (defaults, budgets, audit) as this quarter's control gap; investors should note the scarcity trade just got public-market confirmation; architects should re-run agent cost benchmarks at the harness level, not the rate card.
Download executive brief (PDF)
Issue 11/Week 27 of 2026/July 4, 2026
The sell-side of AI compute gained two hyperscale balance sheets in 48 hours — just as the model layer started cutting prices.
Within two days, Bloomberg reported Meta is building "Meta Compute" to sell excess AI capacity (and possibly hosted Muse Spark access), knocking 12-15% off CoreWeave and Nebius in a morning, and SoftBank answered with SB Neo, a Delaware neocloud targeting 10GW by roughly 2030. The compute-rental market just gained two entrants with balance sheets larger than the entire listed neocloud category — exactly as the Epoch analysis showing hyperscaler free cash flow hitting zero around Q3 2026 became the consensus macro frame. The software lens ran the same economics in the other direction: Anthropic launched Claude Sonnet 5 at $2/$10 introductory pricing, and OpenAI's gated GPT-5.6 preview promises a Terra tier at half GPT-5.5's cost — the model layer is cutting prices into the very demand that is supposed to fill all this capacity. Meanwhile the frontier itself became a regulatory variable: GPT-5.6 previewed to only ~20 government-vetted organizations, while Claude Fable 5 returned globally on July 1 after export controls were withdrawn. Net/net: the industry's central question moved from "who has the best model" to "who can profitably fill capacity they have already financed" — boards should stress-test compute-vendor concentration, investors should separate scarce-input owners (memory, power) from capacity resellers, and architects should use the price war to renegotiate inference contracts now.
Download executive brief (PDF)
Issue 10/Week 26 of 2026/June 27, 2026
The week shifted from model announcements to the physical bottlenecks that decide who can serve agentic demand.
W26 was not a closed-frontier model week. It was an infrastructure-conversion week: the demand signal moved from benchmark tables into memory, inference capacity, optical fabric, and power policy. Micron's Jun 24 fiscal Q3 print showed AI memory moving from scarcity story to income statement, with HBM4 in high-volume shipments for a lead platform, HBM4 ramping faster than HBM3E, and record data-center margins. Groq raised $650M on Jun 22 to expand an inference cloud that already spans 13 data centers and targets 200MW by 2027, showing the neocloud story is shifting from training clusters to low-latency inference operations. NVIDIA's Vera Rubin production and Spectrum-X Ethernet Photonics messaging made 1.6T/CPO fabric part of the default AI-factory bill of materials, while FERC's large-load show-cause orders continued to reprice where gigawatt-scale campuses can be served and who pays for upgrades. The application and agent layers moved in parallel: Codex/Cursor-style automations and ServiceNow's governed build-agent pattern point to background software work becoming an enterprise control-plane problem, not a chat feature. Net/net: boards should read this as a capacity-allocation week, investors should separate demand winners from power/memory bottleneck owners, architects should design for multi-provider inference and budgeted agent loops, and operators should lock memory, power, optical fabric, and verification gates before assuming model access converts into production throughput.
Download executive brief (PDF)
Issue 09/Week 25 of 2026/June 20, 2026
Power policy and open weights became the week's leverage while the closed frontier stalled.
W25 moved at the two ends of the stack that the AI factory actually depends on — the grid and the open model layer — while the closed frontier marked time. The single most consequential event was regulatory: on Jun 18 FERC issued six simultaneous Section 206 show-cause orders telling PJM, MISO, SPP, CAISO, ISO-NE, and NYISO their tariffs 'appear unjust and unreasonable' for large data-center loads, forcing co-location/behind-the-meter rules and cost-shift transparency within 60 days. Developers kept routing around the interconnection queue entirely (Cummins–Circe's 2GW behind-the-meter gas) while hyperscalers added grid-connected capacity explicitly paying 100% of interconnection cost (Amazon's $10B Missouri campus, Google's $1.5B Alabama expansion). At the model layer the story was openness and stasis: Z.ai shipped GLM-5.2 under a genuine MIT license, which independent testing says beats GPT-5.5 on several long-horizon coding benchmarks at roughly one-sixth the cost; MiniMax-M3's sparse-attention weights matured; and Artificial Analysis rebased its Intelligence Index to v4.1 around agentic tasks — all while Anthropic's Claude Fable 5 stayed government-suspended the entire week (Opus 4.8 remained the working leader) and Gemini 3.5 Pro slipped to July. Hardware and networking were quiet (Supermicro opened Vera Rubin NVL72 order availability; Coherent expanded indium-phosphide laser capacity). Net/net: boards should treat power policy as a board-level siting and cost risk now being repriced by FERC; investors should watch the open-weight cost curve collapsing frontier-adjacent capability toward commodity inference; architects should pilot MIT-licensed GLM-5.2 for sovereign and self-host coding while keeping multi-model fallbacks given the closed frontier's demonstrated fragility; operators should lock generation and long-lead transformers before GPUs.
Download executive brief (PDF)
Issue 08/Week 24 of 2026/June 13, 2026
The frontier became a sovereign and capital-markets object, not just a benchmark race.
W24 was the week the model layer stopped behaving like a product cycle and started behaving like critical infrastructure with a kill switch. Anthropic shipped Claude Fable 5, a new "Mythos-class" tier above Opus, and it debuted #1 on the independent Artificial Analysis Intelligence Index at 64.9 — then, three days later (Jun 12), a U.S. government export-control directive forced Anthropic to disable Fable 5 and Mythos 5 for every customer because it could not filter foreign-national access in real time; queries now fall back to Opus 4.8. That is the first known government-forced takedown of a deployed frontier model, and it converts "which model is best" into "which model can be switched off by a third party." The same week, the capital story went public: SpaceX, with xAI embedded, priced the largest IPO ever (~$75B raised) and closed near a $2.1T valuation, giving the market its first traded read on a frontier lab, while OpenAI's confidential S-1 (Jun 8) kept the next two labs at the pre-public stage. Underneath, demand concentrated visibly — Oracle booked an RPO of $638B (+363% YoY), TSMC printed a record month with CoWoS sold out through 2027, and d-Matrix put a merchant inference ASIC into volume production explicitly to route around the HBM chokepoint. Net/net: boards should treat top-tier closed models as single points of failure and mandate hard fallbacks; investors now have a public frontier-lab proxy to price against; architects should design multi-model routing and second-source compute; operators should assume the binding constraints are regulatory access, power, and packaging, not raw intelligence.
Download executive brief (PDF)
Issue 07/Week 23 of 2026/June 6, 2026
The AI factory became a power-and-fabric problem, not a model-release problem.
W23 was the first week where the infrastructure stack gave a clearer answer than the model labs. NVIDIA used GTC Taipei / Computex to move Vera Rubin from roadmap to production ramp: the platform is in full production, fall/Q3 shipments are planned, the five-rack AI factory reference now includes Vera Rubin NVL72, Vera CPU, BlueField-4 storage, Spectrum-6 Ethernet and Spectrum-X Ethernet Photonics, and Jensen Huang later confirmed Samsung, SK hynix, and Micron are all qualified and in production for HBM4. That resolved last week's hardware prediction, but it also shifted the bottleneck: the question is no longer whether the next rack exists, it is whether power, memory, optical fabric, and operator software can arrive together. On software, the closed frontier was quiet — Gemini 3.5 Pro still had not GA'd by the end of the window — while open weights widened in the efficient-agent layer: JetBrains Mellum2, NVIDIA Cosmos 3, and Holo3.1 all targeted deployable sub-agents, physical-AI reasoning, or local computer-use rather than a monolithic chatbot benchmark. On applications, Microsoft Scout, Salesforce Coworker, ServiceNow Otto, Wordsmith, and Stilta all pointed at the same control-plane fight: governed agents with identities, permissions, and workflow authority. Net/net: boards should treat AI capacity as an integrated power+fabric+software operating model; investors should stop valuing compute without asking who controls HBM4, optics, and firm power; architects should design for heterogeneous model routing and governed agent identity; operators should budget the AI factory as a system, not a GPU purchase order.
Download executive brief (PDF)
Issue 06/Week 22 of 2026/May 30, 2026
A quiet release week resolved last week's two biggest open questions — and moved the contested ground to optics and the agent control plane.
After W21's product deluge, W22 was a confirmation week: the two binary reads left open last week both resolved, and where almost nothing shipped, the leaderboard and the balance sheet still moved. On software, Claude Opus 4.8 (May 28) retook #1 on the Artificial Analysis Intelligence Index (61.4 vs GPT-5.5's 60.2) with a >10-point SWE-Bench Pro lead — at flat list pricing — while Gemini 3.5 Pro slipped to June, leaving Anthropic the unchallenged closed frontier. On capital, the W21 watch item resolved cleanly: Anthropic's round closed May 28 as a $65B Series H at a $965B post-money, the largest private AI raise on record (though ~$15B is previously-committed hyperscaler capital, not fresh cash). On hardware, the other open risk closed too — Samsung's union ratified its wage deal on May 27 (73.7% approval), formally averting the strike that threatened HBM4 packaging, so the binding constraint stays supply (CoWoS + HBM4 + surging DRAM pricing), not demand. The genuinely new front was networking: four independent vendors (GlobalFoundries, Wiwynn, Credo, Edgecore) pushed co-packaged optics and all-photonic fabric from demos toward deployable product in-window, with one analyst house putting a ~$154B TAM on optical interconnect — a textbook Gilder arc where per-GPU bandwidth, and the dollar content that carries it, scales faster than compute. So-what for the four audiences: boards should treat the HBM4 supply scare as closed and the custom-silicon counter-flywheel (ASIC shipments +44.6% YoY vs +16.1% for merchant GPUs) as the durable margin question; investors should read the Anthropic close as the answered binary and discount the louder unverified headlines (a '$45B Anthropic-pays-xAI-through-2029' filing the principal himself recut to a ~180-day lease; a secondary-sourced '$1T OpenAI September IPO'); architects should re-baseline coding-agent evals on Opus 4.8 but pin effort levels first; operators should put agent governance and the data-cloud control plane (Snowflake's Natoma/MCP buy, Salesforce's $1.2B Agentforce ARR) on this quarter's procurement roadmap.
Download executive brief (PDF)
Issue 05/Week 21 of 2026/May 23, 2026
The bull thesis got every confirmation it needed. The strategic battleground moved to runtimes.
W21 was the loudest week of Q2 across every lens at once. NVIDIA Q1 FY27 (Tuesday May 20) printed $81.6B revenue (+85% YoY) with Data Center at $75.2B (+92% YoY, +21% QoQ from $62.3B) and a Q2 guide of $91B with zero China DC compute assumed; total supply commitments stepped to $145B; $80B additional buyback authorized and the dividend lifted 25x. Huang said NVIDIA 'will be supply-constrained through the entire life of Vera Rubin' (Q3 2026 production) — the bear thesis that hyperscaler capex isn't translating to chip revenue is gone, and the binding constraint moved firmly to HBM4 and CoWoS packaging. The frontier-lab valuation curve reset 15x in 14 months. Bloomberg (Thursday May 22) reported Anthropic's $30B-plus round closing 'as soon as next week' at a $900-plus-billion valuation, vaulting past OpenAI's $852B March mark, with Sequoia, Dragoneer, Altimeter, and Greenoaks each writing ~$2B. NVIDIA's Q1 transcript explicitly named Anthropic alongside the hyperscalers as a Blackwell deployment customer, confirming the equity bid matches compute-side demand. Google I/O 2026 (May 19-22) compressed a normal three-month product cycle into one keynote — Gemini 3.5 Flash (AA Intelligence Index 55.3, Terminal-Bench 76.2%, 4x faster than incumbent frontier at ~280 tok/sec), Gemini Omni Flash (native text-and-image-and-audio-and-video into grounded video out, shipping immediately in Gemini app / Flow / YouTube Shorts), Antigravity 2.0 standalone agent IDE with managed Linux sandboxes, and TPU 8t/8i dual-chip 8th gen. Combined with OpenAI Codex 'Goal mode' GA + macOS Appshots (May 21), Anthropic's self-hosted Managed Agent sandboxes + MCP tunnels (May 19), and xAI Grok Build (W20), the strategic battleground shifted decisively from model intelligence to agent-runtime monetization and ecosystem lock-in. The specialist canopy kept widening at a fraction of frontier cost. Cohere Command A+ shipped the first-ever Apache 2.0 frontier-adjacent MoE (218B/25B-active, runs on 2x H100, τ²-Bench Telecom jumped 37%→85% generation-over-generation). Microsoft Research Fara1.5-27B (open weights, fine-tuned on Qwen3.5) hit 72% Online-Mind2Web — beating OpenAI Operator (58.3%) and Gemini Computer Use (57.3%) — collapsing the cost structure for browser-agent fleets. Alibaba Qwen3.7-Max (May 20) became the first Chinese model in the AA Intelligence Index top 5 (56.6, ahead of Gemini 3.5 Flash) with a demonstrated 35-hour autonomous tool-use run. The one bullet dodged: Samsung-union talks reached a tentative wage agreement late Wednesday May 20 — about an hour before the planned 18-day general strike — granting a 10.5% performance bonus pending ratification (vote May 27-28). The W20 downside scenario (Vera Rubin HBM4 supply at acute risk) is largely off the table conditional on the vote passing. So-what for the four audiences: boards should retire 'AI capex peak' theses and treat federal-procurement + capability-gating as durable vendor-risk dimensions; investors should treat the Anthropic close (not the headline valuation) as the binary read of the week and price NVIDIA on supply-not-demand; architects and operators should pull agent-runtime selection into procurement bake-offs now — model swaps inside someone else's harness are constrained.
Download executive brief (PDF)
Issue 04/Week 20 of 2026/May 17, 2026
Frontier text took a breath. The constraint stack moved to supply, regulation, and pricing structure.
The model layer took a breath this week. No new GPT-class or Claude refresh between W19 and Google I/O (May 19-20), no headline frontier-text GA — but the specialist canopy widened sharply: Perceptron Mk1 priced video and embodied reasoning 80-90% below Gemini Flash Lite on May 12; NVIDIA's open-weights SANA-WM put a minute-scale 720p world model on a single RTX 5090 on May 15; OpenBMB's MiniCPM-V 4.6 collapsed multimodal SLMs to 1.3B running on-device across iOS, Android, and HarmonyOS on May 11. Pricing structure became the dominant decision input. Anthropic emailed Max-20x subscribers a June 15 policy that moves Claude Agent SDK, `claude -p`, GitHub Actions, and third-party agents off subscription rate limits onto separate $20-$200/mo metered credit at API list prices — 12x-175x effective price increase per workload, ending Claude Code subscription arbitrage. OpenAI and Microsoft formalized a $38B cumulative cap on Microsoft's revenue-share through 2030 — the first hard ceiling on any hyperscaler's AI revenue-capture. xAI shipped Grok Build CLI and a SuperGrok Heavy tier at $300/mo. The unit of competition shifted from model intelligence to agent-runtime monetization. Demand pulled forward into prints. Cisco Q3 FY26 raised FY26 AI infrastructure orders guide from $5B to $9B (Q3 AI orders $1.9B vs $600M YoY; ~$300M from non-hyperscaler neocloud / sovereign / enterprise buyers). Applied Materials Q2 FY26 printed a record $7.91B with record 50% non-GAAP gross margin and guided calendar-2026 semicap growth above 30%. Anthropic in talks at over $900B post-money on $30B April ARR (Bloomberg May 12), target close end-May. OpenAI launched a $4B+ Deployment Company with TPG, Advent, Bain, and Brookfield. The supply chain wedged itself into the read. Samsung-union talks collapsed on May 13, locking in an 18-day general walkout May 21-June 7 across 50,000+ workers — Samsung is the sole HBM4 mass-shipper for NVIDIA's Vera Rubin platform, holds ~40% global DRAM share, and a single April 23 half-day rally already cut daily memory output 18.4%. TSMC's Taiwan Tech Symposium guided AI wafer demand 11x 2022-2026 and CoWoS capacity CAGR above 80% through 2027 — capacity is coming, but not before Q3. The so-what for the four audiences: boards should treat federal procurement (Canada-TELUS BC sovereign AI factory May 11; HMRC-Quantexa £175M sovereign data and AI May 14) and capability gating (UK AISI's 4.7-month doubling rate on autonomous cyber) as durable vendor-risk dimensions. Investors should treat the Anthropic round close (not the headline valuation) as the binary read of the week and watch for Samsung-walkout impact on NVIDIA Q1 FY27 (May 20) and H2 hyperscaler capex. Architects and operators running Claude Code at scale need to model true API burn before the June 15 cutover this week, lock HBM allocation and Q3 GPU delivery slots, and add at least one specialist physical-AI model (Perceptron Mk1, MolmoAct 2) to 2H roadmaps before procurement budgets close.
Download executive brief (PDF)
Issue 03/Week 19 of 2026/May 8, 2026
Capability stops being about parameters. The constraint stack flipped to power, fiber, and packaging.
W19 is the week the AI compute story stopped being told in parameter counts. No new closed-frontier text flagship shipped — GPT-5.5 and Opus 4.7 still anchor the AA Intelligence Index — yet the most consequential capability move was a 300 MW SpaceX deal that let Anthropic double Claude Code rate limits inside a week. OpenAI shipped a voice trio (GPT-Realtime-2 + Translate + Whisper) with GPT-5-class reasoning baked in, jumping 15 points on Big Bench Audio; xAI moved Grok 4.3 to GA with 1M context at $1.25/$2.50 per Mtok; Google promoted Gemini 3.1 Flash-Lite to GA at $0.25/$1.50. Meanwhile NVIDIA wrote two equity checks that buy zero transistors: $500M in Corning warrants for a 10x expansion of US optical capacity, and up to $2.1B in IREN warrants alongside a $3.4B 5-year contract underwriting 5 GW of NVIDIA-aligned AI infrastructure. On the same Monday, OpenAI / AMD / Broadcom / Intel / Microsoft / NVIDIA jointly published Multipath Reliable Connection (MRC) as an OCP-released open RDMA transport — already in production at the Stargate Abilene site, Microsoft Fairwater, and OCI — collapsing 100,000-GPU clusters from three or four switching tiers down to two. The procurement read: capability is being bought with photonics and power purchase agreements, not parameter counts. Boards should reset AI-vendor diligence to ask 'who's your power partner and which fabric standard do you ship on,' not 'what's your AA Index score.' Investors should treat NVIDIA's equity stakes in CoreWeave and now IREN as a circular-financing pattern that auditors will start framing — supplier financing customers who buy supplier product. Architects should add MRC compliance and AMD-as-substrate (ZAYA1's open frontier-class MoE trained end-to-end on MI300X this week) to procurement bake-offs that previously assumed NVIDIA + closed-fabric defaults. Operators in PJM territory should re-bid 5-7 year time-to-power for new Dominion-territory applications and pre-secure interconnect now.
Download executive brief (PDF)
Issue 02/Week 18 of 2026/May 1, 2026
Q1 broke the bear thesis. The marginal dollar started earning more — and the network still compounds.
The W17 read was 'the marginal dollar earns less at every layer except the network.' Q1 2026 hyperscaler prints (MSFT / GOOG / META / AMZN, all on Apr 29) reset the math. Aggregate 2026 capex stepped to $695-725B (+77% YoY) but AI revenue grew faster — Microsoft Cloud commercial RPO doubled YoY to $627B, Microsoft AI run rate +123% YoY to $37B, Google Cloud +63%, AWS +28% (a 15-quarter high). The hyperscaler capex / AI revenue lever compressed from ~5.5 to ~5.0-5.2 — the first Q-over-Q compression in five quarters. The network hypothesis got an audited proof point in the same 48 hours: Equinix Q1 Fabric revenue +26% YoY with bookings +70% YoY at a record 51% adjusted EBITDA margin, raised guidance, ~60% of the largest Q1 deals AI-related. Custom silicon at ~31% (Trainium's $225B forward book, Maia 200 going GA in Iowa, MTIA expanded with Broadcom and AMD lanes) is starting to compress merchant GPU pricing power. Capability gating became market structure rather than research-org concern: GPT-5.5-Cyber and Claude Security both shipped this week as gated, separately-priced enterprise products. And the earn-out got a federal proof point on Friday: the Pentagon's May 1 eight-vendor classified-network AI procurement slate puts frontier-grade AI on IL6/IL7 deployment paths for OpenAI, Google, Microsoft, AWS, NVIDIA, SpaceX, Reflection AI, and Oracle — explicitly excluding Anthropic on supply-chain-risk grounds with Mythos cited as a separate national security moment. Federal procurement is now a third distinct earn-out channel alongside enterprise platform pull-through and gated capability SKUs, and capability-gating just became a federal-disqualification risk, not only a research-org concern. The arc the buildout is now on: less hype, more earn-out — boards should reset their AI-ROI thresholds upward, investors should track the capex / AI-revenue ratio (now ~5.0, threshold >6.0) as the wedge it has stopped being and add federal-channel parity as a vendor-risk dimension, and architects should pull custom-silicon ASIC TCO into procurement bake-offs that previously assumed merchant GPU dominance.
Download executive brief (PDF)
Issue 01/Week 17 of 2026/April 25, 2026
The marginal dollar at every layer earns less. Except the network.
Capital is still flooding the AI stack, but the marginal dollar earns less than the prior dollar at every layer except one. Hyperscalers are deploying ~$700B of 2026 capex against ~$120B of AI-attributable revenue — a ratio that gets tested when Microsoft, Google, Meta, and Amazon all print Q1 between April 29 and May 6. Frontier labs are committing forward compute at multiples of their disclosed revenue. The on-prem floor moved up sharply this week with DeepSeek V4 (open weights, frontier-class on coding), reshaping enterprise procurement: the question is no longer whether to self-host frontier capability, but which workload to move and on what fabric. The durable winner of the buildout is the network and interconnect layer that connects cloud, neocloud, and on-prem for hybrid AI — the only layer where pricing power compounds rather than compresses.
Download executive brief (PDF)

The methodology.

— Every quantitative claim is graded 1–5 on source quality. Anything 2 or below is flagged as noise.
— Every prediction is falsifiable, time-bounded, and tied to a specific signal. Future issues score them hit, miss, partial, or pending.
— Twelve levers are tracked week over week, with thresholds that materially change the read.
— All sources are public. Independent analysis only.

Operate. Publish. Teach.

Published issues.

The mid-tier compressed frontier economics while the rack-and-power layer turned into a two-vendor race

Model pricing is converging from both directions onto one frontier band — closed labs cutting down into $3/$15 while Chinese open-weight labs raise up into it — and the differentiation is shifting from price to who actually ships weights.

The harness became the product: OpenAI and Anthropic shipped rival work runtimes 48 hours apart — while memory and power marked the scarcity trade to market with $26.5B of real money.

The sell-side of AI compute gained two hyperscale balance sheets in 48 hours — just as the model layer started cutting prices.

The week shifted from model announcements to the physical bottlenecks that decide who can serve agentic demand.

Power policy and open weights became the week's leverage while the closed frontier stalled.

The frontier became a sovereign and capital-markets object, not just a benchmark race.

The AI factory became a power-and-fabric problem, not a model-release problem.

A quiet release week resolved last week's two biggest open questions — and moved the contested ground to optics and the agent control plane.

The bull thesis got every confirmation it needed. The strategic battleground moved to runtimes.

Frontier text took a breath. The constraint stack moved to supply, regulation, and pricing structure.

Capability stops being about parameters. The constraint stack flipped to power, fiber, and packaging.

Q1 broke the bear thesis. The marginal dollar started earning more — and the network still compounds.

The marginal dollar at every layer earns less. Except the network.

The methodology.