Software.
OpenAI ships GPT-Realtime-2 + GPT-Realtime-Translate + GPT-Realtime-Whisper, exiting Realtime API beta — first voice model with GPT-5-class reasoning, 128K context, adjustable effort, priced at $32/$64 per Mtok audio (Translate $0.034/min, Whisper $0.017/min); Zillow, Priceline, Deutsche Telekom named as early enterprises
openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api, ChannelNewsAsia, 9to5Mac, TheNextWeb, MarkTechPost, Latent.Space
xAI moves Grok 4.3 to GA on the public API with 1M-token context, native video input, and aggressive pricing of $1.25 / $2.50 per Mtok (~40-60% below Grok 4.20); eight legacy Grok models scheduled for retirement on May 15
docs.x.ai/developers/models, VentureBeat, Artificial Analysis, help.apiyi.com
Anthropic doubles Claude Code 5-hour rate limits across Pro/Max/Team/Enterprise, removes peak-hours throttling for Pro/Max, raises Opus API limits — coordinated with the same-morning announcement of an Anthropic ↔ SpaceX deal: 300+ MW at Colossus 1 (Memphis), ~220K NVIDIA GPUs within the month; Claude Managed Agents ships Multiagent Orchestration (Netflix deployer), Outcomes (+8.4% docx / +10.1% pptx task success), and Dreaming (research preview) at the Code w/ Claude SF event
anthropic.com/news/higher-limits-spacex, claude.com/blog/new-in-claude-managed-agents, Bloomberg, CNBC, The Verge, Ars Technica, simonwillison.net
US Center for AI Standards and Innovation (CAISI, Department of Commerce) signs pre-deployment evaluation agreements with Google DeepMind, Microsoft, and xAI — completing perimeter coverage of the top-five US labs with OpenAI and Anthropic already renegotiated; labs will provide early access including model versions 'with lowered safety protections' for national-security testing
BBC News, CNBC, The Hill, neura.market
Google promotes Gemini 3.1 Flash-Lite to GA on Gemini Enterprise at $0.25 input / $1.50 output per Mtok with adjustable thinking levels (minimal/low/medium/high) and full multimodal (text, code, image, audio, video, PDF); Gladly, JetBrains, and Astrocade named as enterprise users
cloud.google.com/blog/products/ai-machine-learning/gemini-3-1-flash-lite-is-now-generally-available, blog.google, deepmind.google/models/gemini/flash-lite, simonwillison.net
What this means
W19 is the week frontier text went into maintenance and capability moved into voice, throughput, and price-band compression. OpenAI's voice trio + Inworld TTS-2 turn 'voice model' into a frontier tier with its own benchmarks (Big Bench Audio, Audio MultiChallenge, Speech Arena Elo) — boards and architects should treat voice/IVR replacement as a 90-day procurement decision, not a 2027 plan. Anthropic's response to a compute deal — doubling paid-seat throughput rather than cutting list price — is the inverse of every other vendor's quarter and signals that the binding constraint on Claude was always GPU supply, not unit economics; operators locked into Claude Code can plan against materially higher concurrent throughput before mid-summer. CAISI completing the top-five-lab pre-deployment perimeter means every US frontier model now ships through a federal capability/risk gate before public release; pair with The Model Pulse for the seven new tree rows this week (gpt-5-5-instant, the Realtime trio, inworld-realtime-tts-2, zaya1-8b, emo-1b14b).