Frontier text models took a breath this week — no new GPT-class or Claude refresh between W19 and Google I/O (May 19-20), and LMArena top-of-board moved less than 1 Elo. But the specialist canopy widened sharply, with three model rows joining the tree across video / embodied, on-device multimodal, and open-source world modeling.
Perceptron Mk1 collapsed video and embodied reasoning costs by 80-90% on May 12, pricing frontier-grade physical-AI perception under Gemini Flash Lite ($0.15 / $1.50 per Mtok, 85.1 EmbSpatialBench, 72.4 RefSpatialBench) and forcing a re-cost of any 2H roadmap that ships robotics, surveillance, or screen-watching agents. NVIDIA's open-weights SANA-WM (May 15, Apache 2.0) put a minute-scale 720p world model on a single RTX 5090 in 34 seconds — robotics, AV simulation, and synthetic-data teams can now in-house what they previously rented from closed video APIs. OpenBMB's MiniCPM-V 4.6 1.3B (May 11) reset the low end of multimodal SLMs at 262K context with native iOS / Android / HarmonyOS deployment, compressing on-device feature timelines from quarters to weeks.
On the platform layer, the unit of competition shifted from model intelligence to agent-runtime monetization. Anthropic's Code with Claude conference (May 6-12) made Managed Agents — Dreaming (between-session memory consolidation), Multiagent Orchestration, Outcomes (rubric-graded), and signed Webhooks — a hosted service; Claude Platform on AWS hit GA May 11; xAI shipped Grok Build CLI on May 14 with a SuperGrok Heavy tier at $300/mo; OpenAI's realtime voice trio (GPT-Realtime-2 / Translate / Whisper, added W19) went broadly available May 11 with 128K context and adjustable reasoning effort. Architects building bespoke agent stacks should justify build vs buy explicitly in 2H plans.
The W18 Pentagon vendor-disqualification thread compounded. UK AISI published paired evaluations May 13 confirming autonomous offensive-cyber capability is now doubling every 4.7 months (aligned with METR's 4.2-month SWE figure), and Anthropic emailed Max-20x subscribers a June 15 policy that moves Claude Code third-party agents off subscription rate limits — 12x-175x effective price increase per workload. Capability gating, federal procurement, and pricing structure are now durable vendor-risk dimensions; bench scores alone no longer settle a procurement decision.
For the May 19-26 window, the load-bearing catalysts are Google I/O 2026 (Gemini 3.2 Flash / Pro confirmation, fabric session), NVIDIA Q1 FY27 (May 20), Microsoft Build (May 19-22), the Samsung HBM4 walkout starting May 21, and any Anthropic / OpenAI counter-launch following I/O.