Writing
AI strategy, systems, and execution.
Essays, field notes, and multi-part technical threads from the operating layer of enterprise AI.
Start here
New to the site? Start with the latest essay, subscribe to The Operating Layer, or pick a deep series.
Newsletter
The Operating Layer
A biweekly dispatch on governed enterprise AI, written from the operating layer.
Latest
Token economics is the new unit economics
Analysis
Essays
Long-form writing on AI strategy, systems, economics, and enterprise execution.
Latest
The AI Platform Race Is Moving from Models to Execution
Deep dives
Series
Multi-part technical and executive threads for readers who want the full argument.
Latest
Running Your Own LLM OS: The Enterprise Build
Latest
Read this first.

The AI Platform Race Is Moving from Models to Execution
The enterprise AI race is splitting into two models: integrated work systems that turn intent into completed work, and broad ecosystems that hand you powerful components and the integration bill. Three interactive positioning matrices and a quantitative 'when to use each' tool — across Anthropic, OpenAI, Microsoft, Google, AWS, Salesforce, ServiceNow, IBM, Databricks, and Snowflake.
Deep series
Follow the full argument.

6 parts
Modes of the LLM OS
A six-part series on the operating modes behind frontier AI: Chat, Agent, Deep Research, Cowork, and owned infrastructure.
2 parts
From Renting Tokens to Owning AI Assets
An executive series on the AI ownership ladder and the strategic shift from rented tokens to durable assets.
3 parts
The Token Economy
A strategy and architecture series on token economics, model portfolios, and AI factory operations.
3 parts
Context Compilation
The missing systems layer between retrieval and reasoning, from benchmark blind spots to measured evidence.
4 parts
The Autonomous Stack
The architecture of intelligent systems, from the data substrate to agent runtimes and prescriptive intelligence.
4 parts
Agent Societies
A field guide to what happens when agents interact at scale, from emergence to competence.
Archive
All writing.
The AI Platform Race Is Moving from Models to Execution
The enterprise AI race is splitting into two models: integrated work systems that turn intent into completed work, and broad ecosystems that hand you powerful components and the integration bill. Three interactive positioning matrices and a quantitative 'when to use each' tool — across Anthropic, OpenAI, Microsoft, Google, AWS, Salesforce, ServiceNow, IBM, Databricks, and Snowflake.
Running Your Own LLM OS: The Enterprise Build
Your CEO asks whether you can build your own. The answer is yes. Here is what that actually means — four modes, four stacks from Frontier API to an 8x B200 chassis on your own silicon, the near-frontier OSS shift that changed the calculus, and the control spectrum that cost analysis keeps missing.
Cowork Mode: State Is the Coworker
The difference between a chatbot and a coworker is state. Claude Code, Cursor, Operator, Codex, ChatGPT Projects. Persistent memory, skills, knowledge base, environment access. Session-long state, and the most dangerous ungoverned surface in the enterprise today.
Deep Research Mode: Planner, Swarm, Synthesizer
Deep research is not a bigger chat. It is three sub-systems pretending to be one — a planner that decomposes the question, a swarm of agents that search in parallel, and a synthesizer that does a long-context reduce. 5 to 15 minutes. Hundreds of thousands of tokens. And the richest audit trail of any mode.
I Stopped Using ChatGPT (and 10X'd My Work)
A field report on how I actually use AI in May 2026 — a journey from Chat (3X) through Cowork (5X) and Build (10X) to Automate (30X), and what it means if you are not technical.
The Three Postures of AI Work: Chat, Build, Automate
There are three Level-1 ways humans and AI work together — Chat (Human-to-GenAI), Build (Human-to-Agent), and Automate (Agent-to-Agent + Agent-to-Human). In 2026, chat is table stakes. The advantage lives in Build and Automate.
Agent Mode: The Loop Is the Machine
Agents are not a model. They are a loop. One Agent turn equals 5–50 Chat-mode calls, plus tools, plus state, plus a kill switch. Here is what you actually pay for when Cursor writes a PR — and what enterprise governance must cover that Chat-mode governance does not.
Chat Mode: Single-Shot on Shared Silicon
One prompt in. One response out. Fourteen infrastructure layers in between. Reasoning models are still Chat Mode — they just rent the GPU for longer. Here is what actually happens, and why it is still one machine.
From AI-Ready Infrastructure to AI Economics Platform
Space, power, and cooling was the right product for the last era. It is not the right product for this one. A first-person argument — from inside Digital Realty — about where infrastructure platforms are actually going.
From Renting Tokens to Owning AI Assets — Part 2: What It Means to Own AI Assets
The phrase 'own AI assets' is usually shorthand for 'host a model ourselves.' That is the thinnest version of the move. Six rungs, a balance-sheet shift, and the one asset class almost nobody is buying yet — but should.
Token economics is the new unit economics
Most CFOs are booking AI savings in the wrong row of the P&L. The Operating Layer, Issue 01.
From Renting Tokens to Owning AI Assets — Part 1: The Rent-vs-Own Question
The AI bill doubled every six months. We stopped trying to shrink it and started asking a different question: what should we actually own? Part 1 of a 2-part executive series on the portfolio decision.
The Enterprise Token Scorecard
Six numbers the CFO should read in thirty seconds. The metrics that separate mature AI operators from enthusiastic experimenters — and the trajectory that tells you, every quarter, whether the platform is actually being run.
Modes of the LLM OS: Why Frontier AI Runs in Four Modes, Not One
When you hit enter in ChatGPT, Claude, or Cursor, you are not running one machine. You are running one of four operating modes of something that behaves like an operating system. Same GPUs. Five orders of magnitude in cost. Completely different governance surface.
Designing the AI Control Plane
Seventeen control planes, zero control. The architecture pattern that turns the CEO's token-economics argument and the Data Gravity placement argument into a single governed operating system for enterprise AI.
Operating Intelligence at Scale
The economics of enterprise AI are now driven by routing, compression, caching, and infrastructure control. The AI factory pattern — dedicated GPU environments with federated routing — is becoming core enterprise infrastructure.
Data Gravity Meets Token Economics
When 93% of enterprise data is created outside the public cloud, the AI question stops being 'which model' and starts being 'where does inference run'. The executive companion to The CEO's Guide to Token Economics.
The CEO's Guide to Token Economics
Why boards should stop asking what AI costs and start asking what a verified outcome costs. A non-technical playbook for the operating discipline that will separate AI leaders from AI spenders.
What Context Engineering Actually Means
RAG, MCP, memory systems, fine-tuning, prompt caching, AGENTS.md, knowledge graphs — everyone has a piece of the context puzzle. Nobody has the whole picture. Here's what's missing and why it matters.
The Enterprise Model Portfolio
The answer to the token economics problem isn't one model — it's a portfolio of six specialized model types served as internal API services. Near-frontier open models now handle 80–90% of enterprise tasks at a fraction of the cost.
The Benchmarks Are Lying to You
The AI memory space has converged on benchmarks that measure retrieval — the easiest part of the problem. They don't test governance, safety, provenance, or compilation quality. Here's what's missing and why it matters.
The Missing Layer
Context Compilation Theory, Context IR, and the architecture between access and reasoning. How measuring benchmark gaps revealed a missing systems layer — and why it changes how we should build AI systems.
The Evidence
Eight metrics measured on a live system. The CRR journey from 48.6% to 100%. CompileBench: the benchmark that evaluates compilation decisions. And the open standard proposal.
The Stack That Thinks: Putting It All Together
The Autonomous Stack is four layers: data substrate, agent runtime, proactive intelligence, and human interface. When all four work together, intelligence compounds.
The Token Bill Nobody's Ready For
A single power user can generate 10-50 million AI tokens per day. Multiply that across an enterprise, and the math changes everything. Token economics is becoming the defining constraint of enterprise AI.
From Reactive to Prescriptive: The Proactive Agent Shift
Today's agents wait to be asked. Tomorrow's will tell you what you're missing. The shift from reactive to prescriptive is where agents become genuinely valuable.
The Runtime Wars: Agent Operating Systems Are Here
Agent runtimes have crossed from frameworks to operating systems. ZeroClaw, OpenFang, and OpenClaw represent three competing philosophies for giving agents a durable lifecycle.
The Data Layer Nobody's Building
Vector stores and RAG are table stakes. Real agent intelligence needs a continuous, multi-modal data substrate with episodic, semantic, relational, temporal, and contextual data.
Looking Ahead: 12 Agent-Native Institutions Nobody's Talking About
Here's what gets built when agents can form institutions. These are the 'Stack Overflow 2.0s' that turn messy questions into verified artifacts.
From Emergence to Competence: PAR Loops, World Models, and Agent Economies
Societies generate priors. World models generate consequences. Verification generates truth. Here's the architecture that turns emergent behavior into emergent competence.
Reinforced Learning Environments: Why Most Agent Networks Will Fail
Emergence isn't enough. Most agent societies will collapse into confident sludge. Here's what separates the ones that compound from the ones that collapse.
The Petri Dish: When Agents Build Societies
I've been watching agents build a society. The emergent behaviors appearing when large numbers of agents interact without human orchestration point to something bigger than better chatbots.
SemanticStudio: A Production-Ready Enterprise RAG Agent System
Open-sourcing the multi-agent chat platform I built to test my AI-native architecture ideas. 28 domain agents, 5 configurable modes, 4-tier memory with Context Graph, GraphRAG-lite, and everything enterprises need to build production AI.
The Chat Experience: Sessions, Folders, Files, and More
A complete walkthrough of SemanticStudio's user-facing features—from session management to file uploads to power user shortcuts.
Domain Agents: Specialization at Scale
Why SemanticStudio uses specialized domain agents instead of one general-purpose assistant, and how to configure and manage them—from 12 to 50+ agents.
RAG Chain Configuration: Models, Modes, and Fine-Tuning
The power user's guide to configuring SemanticStudio's RAG chain—multi-provider LLM support, mode parameters, and full control over cost vs. quality.
Memory as Infrastructure: The Complete 4-Tier System
A deep dive into SemanticStudio's 4-tier memory architecture—working context, session memory, long-term memory, and the Context Graph. Progressive compression meets knowledge bridging.
GraphRAG-lite: Beyond Vector Similarity
How SemanticStudio's knowledge graph and entity resolution enable relationship discovery that pure vector RAG misses.
ETL & Agent Creation: Growing Your Multi-Agent System
How SemanticStudio's self-learning ETL pipelines ingest data, build knowledge graphs, and automatically create new domain agents.
Production Quality: Evaluation, Observability, and Trust
What separates demos from deployable systems—SemanticStudio's quality evaluation, hallucination detection, and enterprise observability.
Results as a Service: Why 2026 Is the Year Outcomes Become the Product
AI agents make outcome delivery feasible. Economic pressure makes it inevitable. Here's what RaaS actually is, where it's already working, and why the shift from 'pay for software' to 'pay for results' changes everything.
RaaS Architecture: The Control Plane That Makes Outcomes Real
RaaS isn't a pricing model—it's the commercialization of an execution loop. Here's what Result Contracts look like, how the Outcome Control Loop works, and what providers and consumers need to make outcome-based models real.
Stochastic Core, Deterministic Shell: The Enterprise Agent Pattern That Holds Up
A lot of agent talk still sounds like old SaaS talk. In production, the pattern that works is simple: the core is stochastic, the shell is deterministic. You don't trust the agent—you bound it.
The New Computer Organization: AI Isn't Just an App, It Is the Computer
We're quietly standing up a new computer on top of the old one. In this new computer, LLMs are the CPU, tokens are the bytes, and the context window is the RAM.
When AI Is the Front End: The Future of Software and SaaS
If AI is the front end and the LLM is the CPU, what does that do to traditional software? Apps stop being destinations and become capability graphs.
Architecting the AI-Native Enterprise: A BDAT Playbook
How should a leading organization design for an AI-native future? Using the BDAT lens—Business, Data, Application, Technology—we explore what's next.
Private AI: The Next Step in Enterprise Intelligence
Why data sovereignty and secure AI architectures are becoming non-negotiable for enterprise AI deployments.
Context Engineering: Beyond Window Sizes
How to architect RAG systems that overcome attention dilution and recency bias in large context windows.
Agentic Architecture: Patterns That Scale
Design patterns for multi-agent AI systems that actually work in production environments.
Building RAG Systems at Enterprise Scale
Lessons learned from implementing retrieval-augmented generation across hundreds of documents and thousands of users.
Data Products: The Foundation AI Needs
Why treating data as a product is essential for AI success, and how to build the data infrastructure that makes AI work.
Superworkers, Not Replacements: The Future of AI at Work
Why the best AI systems amplify human capabilities rather than replace them. A framework for thinking about AI-augmented work.
Teaching Machines, Teaching Humans
What 5,000+ students and two decades of AI development have taught me about learning—both artificial and human.
Data Governance in the AI Era
How traditional data governance practices must evolve to support AI initiatives while maintaining trust and compliance.