brianletort.ai

Writing

AI strategy, systems, and execution.

Essays, field notes, and multi-part technical threads from the operating layer of enterprise AI.

Start here

New to the site? Start with the latest essay, subscribe to The Operating Layer, or pick a deep series.

Latest

Read this first.

Essay//11 min read

The AI Platform Race Is Moving from Models to Execution

The enterprise AI race is splitting into two models: integrated work systems that turn intent into completed work, and broad ecosystems that hand you powerful components and the integration bill. Three interactive positioning matrices and a quantitative 'when to use each' tool — across Anthropic, OpenAI, Microsoft, Google, AWS, Salesforce, ServiceNow, IBM, Databricks, and Snowflake.

Deep series

Follow the full argument.

Research foundation

Archive

All writing.

llm os modes / Part 06//19 min read

Running Your Own LLM OS: The Enterprise Build

Your CEO asks whether you can build your own. The answer is yes. Here is what that actually means — four modes, four stacks from Frontier API to an 8x B200 chassis on your own silicon, the near-frontier OSS shift that changed the calculus, and the control spectrum that cost analysis keeps missing.

llm os modes / Part 05//11 min read

Cowork Mode: State Is the Coworker

The difference between a chatbot and a coworker is state. Claude Code, Cursor, Operator, Codex, ChatGPT Projects. Persistent memory, skills, knowledge base, environment access. Session-long state — and the most dangerous un-governed surface in the enterprise today.

llm os modes / Part 04//10 min read

Deep Research Mode: Planner, Swarm, Synthesizer

Deep research is not a bigger chat. It is three sub-systems pretending to be one — a planner that decomposes the question, a swarm of agents that search in parallel, and a synthesizer that does a long-context reduce. 5 to 15 minutes. Hundreds of thousands of tokens. And the richest audit trail of any mode.

Essay//31 min read

I Stopped Using ChatGPT (and 10X'd My Work)

A field report on how I actually use AI in May 2026 — a journey from Chat (3X) through Cowork (5X) and Build (10X) to Automate (30X), and what it means if you are not technical.

Essay//17 min read

The Three Postures of AI Work: Chat, Build, Automate

There are three Level-1 ways humans and AI work together — Chat (Human-to-GenAI), Build (Human-to-Agent), and Automate (Agent-to-Agent + Agent-to-Human). In 2026, chat is table stakes. The advantage lives in Build and Automate.

llm os modes / Part 03//12 min read

Agent Mode: The Loop Is the Machine

Agents are not a model. They are a loop. One Agent turn equals 5–50 Chat-mode calls, plus tools, plus state, plus a kill switch. Here is what you actually pay for when Cursor writes a PR — and what enterprise governance must cover that Chat-mode governance does not.

llm os modes / Part 02//12 min read

Chat Mode: Single-Shot on Shared Silicon

One prompt in. One response out. Fourteen infrastructure layers in between. Reasoning models are still Chat Mode — they just rent the GPU for longer. Here is what actually happens, and why it is still one machine.

Essay//15 min read

From AI-Ready Infrastructure to AI Economics Platform

Space, power, and cooling was the right product for the last era. It is not the right product for this one. A first-person argument — from inside Digital Realty — about where infrastructure platforms are actually going.

rent vs own / Part 02//12 min read

From Renting Tokens to Owning AI Assets — Part 2: What It Means to Own AI Assets

The phrase 'own AI assets' is usually shorthand for 'host a model ourselves.' That is the thinnest version of the move. Six rungs, a balance-sheet shift, and the one asset class almost nobody is buying yet — but should.

The Operating Layer / Issue 01//6 min read

Token economics is the new unit economics

Most CFOs are booking AI savings in the wrong row of the P&L. The Operating Layer, Issue 01.

rent vs own / Part 01//12 min read

From Renting Tokens to Owning AI Assets — Part 1: The Rent-vs-Own Question

The AI bill doubled every six months. We stopped trying to shrink it and started asking a different question: what should we actually own? Part 1 of a 2-part executive series on the portfolio decision.

Essay//12 min read

The Enterprise Token Scorecard

Six numbers the CFO should read in thirty seconds. The metrics that separate mature AI operators from enthusiastic experimenters — and the trajectory that tells you, every quarter, whether the platform is actually being run.

llm os modes / Part 01//12 min read

Modes of the LLM OS: Why Frontier AI Runs in Four Modes, Not One

When you hit enter in ChatGPT, Claude, or Cursor, you are not running one machine. You are running one of four operating modes of something that behaves like an operating system. Same GPUs. Five orders of magnitude in cost. Completely different governance surface.

Essay//16 min read

Designing the AI Control Plane

Seventeen control planes, zero control. The architecture pattern that turns the CEO's token-economics argument and the Data Gravity placement argument into a single governed operating system for enterprise AI.

Token Economy / Part 03//12 min read

Operating Intelligence at Scale

The economics of enterprise AI are now driven by routing, compression, caching, and infrastructure control. The AI factory pattern — dedicated GPU environments with federated routing — is becoming core enterprise infrastructure.

Essay//18 min read

Data Gravity Meets Token Economics

When 93% of enterprise data is created outside the public cloud, the AI question stops being 'which model' and starts being 'where does inference run'. The executive companion to The CEO's Guide to Token Economics.

Essay//15 min read

The CEO's Guide to Token Economics

Why boards should stop asking what AI costs and start asking what a verified outcome costs. A non-technical playbook for the operating discipline that will separate AI leaders from AI spenders.

Essay//12 min read

What Context Engineering Actually Means

RAG, MCP, memory systems, fine-tuning, prompt caching, AGENTS.md, knowledge graphs — everyone has a piece of the context puzzle. Nobody has the whole picture. Here's what's missing and why it matters.

Token Economy / Part 02//13 min read

The Enterprise Model Portfolio

The answer to the token economics problem isn't one model — it's a portfolio of six specialized model types served as internal API services. Near-frontier open models now handle 80–90% of enterprise tasks at a fraction of the cost.

context compilation / Part 01//9 min read

The Benchmarks Are Lying to You

The AI memory space has converged on benchmarks that measure retrieval — the easiest part of the problem. They don't test governance, safety, provenance, or compilation quality. Here's what's missing and why it matters.

context compilation / Part 02//8 min read

The Missing Layer

Context Compilation Theory, Context IR, and the architecture between access and reasoning. How measuring benchmark gaps revealed a missing systems layer — and why it changes how we should build AI systems.

context compilation / Part 03//8 min read

The Evidence

Eight metrics measured on a live system. The CRR journey from 48.6% to 100%. CompileBench: the benchmark that evaluates compilation decisions. And the open standard proposal.

autonomous stack / Part 04//10 min read

The Stack That Thinks: Putting It All Together

The Autonomous Stack is four layers: data substrate, agent runtime, proactive intelligence, and human interface. When all four work together, intelligence compounds.

Token Economy / Part 01//17 min read

The Token Bill Nobody's Ready For

A single power user can generate 10–50 million AI tokens per day. Multiply that across an enterprise, and the math changes everything. Token economics is becoming the defining constraint of enterprise AI.

autonomous stack / Part 03//9 min read

From Reactive to Prescriptive: The Proactive Agent Shift

Today's agents wait to be asked. Tomorrow's will tell you what you're missing. The shift from reactive to prescriptive is where agents become genuinely valuable.

autonomous stack / Part 02//10 min read

The Runtime Wars: Agent Operating Systems Are Here

Agent runtimes have crossed from frameworks to operating systems. ZeroClaw, OpenFang, and OpenClaw represent three competing philosophies for giving agents a durable lifecycle.

autonomous stack / Part 01//9 min read

The Data Layer Nobody's Building

Vector stores and RAG are table stakes. Real agent intelligence needs a continuous, multi-modal data substrate with episodic, semantic, relational, temporal, and contextual data.

agent societies / Part 04//8 min read

Looking Ahead: 12 Agent-Native Institutions Nobody's Talking About

Here's what gets built when agents can form institutions. These are the 'StackOverflow 2.0s' that turn messy questions into verified artifacts.

agent societies / Part 03//7 min read

From Emergence to Competence: PAR Loops, World Models, and Agent Economies

Societies generate priors. World models generate consequences. Verification generates truth. Here's the architecture that turns emergent behavior into emergent competence.

agent societies / Part 02//6 min read

Reinforced Learning Environments: Why Most Agent Networks Will Fail

Emergence isn't enough. Most agent societies will collapse into confident sludge. Here's what separates the ones that compound from the ones that collapse.

agent societies / Part 01//6 min read

The Petri Dish: When Agents Build Societies

I've been watching agents build a society. The emergent behaviors appearing when large numbers of agents interact without human orchestration point to something bigger than better chatbots.

semanticstudio / Part 01//8 min read

SemanticStudio: A Production-Ready Enterprise RAG Agent System

Open-sourcing the multi-agent chat platform I built to test my AI-native architecture ideas. 28 domain agents, 5 configurable modes, 4-tier memory with Context Graph, GraphRAG-lite, and everything enterprises need to build production AI.

semanticstudio / Part 02//6 min read

The Chat Experience: Sessions, Folders, Files, and More

A complete walkthrough of SemanticStudio's user-facing features—from session management to file uploads to power user shortcuts.

semanticstudio / Part 03//7 min read

Domain Agents: Specialization at Scale

Why SemanticStudio uses specialized domain agents instead of one general-purpose assistant, and how to configure and manage them—from 12 to 50+ agents.

semanticstudio / Part 04//7 min read

RAG Chain Configuration: Models, Modes, and Fine-Tuning

The power user's guide to configuring SemanticStudio's RAG chain—multi-provider LLM support, mode parameters, and full control over cost vs. quality.

semanticstudio / Part 05//8 min read

Memory as Infrastructure: The Complete 4-Tier System

A deep dive into SemanticStudio's 4-tier memory architecture—working context, session memory, long-term memory, and the Context Graph. Progressive compression meets knowledge bridging.

semanticstudio / Part 06//8 min read

GraphRAG-lite: Beyond Vector Similarity

How SemanticStudio's knowledge graph and entity resolution enable relationship discovery that pure vector RAG misses.

semanticstudio / Part 07//7 min read

ETL & Agent Creation: Growing Your Multi-Agent System

How SemanticStudio's self-learning ETL pipelines ingest data, build knowledge graphs, and automatically create new domain agents.

semanticstudio / Part 08//7 min read

Production Quality: Evaluation, Observability, and Trust

What separates demos from deployable systems—SemanticStudio's quality evaluation, hallucination detection, and enterprise observability.

Essay//7 min read

Results as a Service: Why 2026 Is the Year Outcomes Become the Product

AI agents make outcome delivery feasible. Economic pressure makes it inevitable. Here's what RaaS actually is, where it's already working, and why the shift from 'pay for software' to 'pay for results' changes everything.

Essay//15 min read

RaaS Architecture: The Control Plane That Makes Outcomes Real

RaaS isn't a pricing model—it's the commercialization of an execution loop. Here's what Result Contracts look like, how the Outcome Control Loop works, and what providers and consumers need to make outcome-based models real.

Essay//12 min read

Stochastic Core, Deterministic Shell: The Enterprise Agent Pattern That Holds Up

A lot of agent talk still sounds like old SaaS talk. In production, the pattern that works is simple: the core is stochastic, the shell is deterministic. You don't trust the agent—you bound it.

ai native computer / Part 01//7 min read

The New Computer Organization: AI Isn't Just an App, It Is the Computer

We're quietly standing up a new computer on top of the old one. In this new computer, LLMs are the CPU, tokens are the bytes, and the context window is the RAM.

ai native computer / Part 02//7 min read

When AI Is the Front End: The Future of Software and SaaS

If AI is the front end and the LLM is the CPU, what does that do to traditional software? Apps stop being destinations and become capability graphs.

ai native computer / Part 03//9 min read

Architecting the AI-Native Enterprise: A BDAT Playbook

How should a leading organization design for an AI-native future? Using the BDAT lens—Business, Data, Application, Technology—we explore what's next.

Essay//2 min read

Private AI: The Next Step in Enterprise Intelligence

Why data sovereignty and secure AI architectures are becoming non-negotiable for enterprise AI deployments.

Essay//3 min read

Context Engineering: Beyond Window Sizes

How to architect RAG systems that overcome attention dilution and recency bias in large context windows.

Essay//3 min read

Agentic Architecture: Patterns That Scale

Design patterns for multi-agent AI systems that actually work in production environments.

Essay//2 min read

Building RAG Systems at Enterprise Scale

Lessons learned from implementing retrieval-augmented generation across hundreds of documents and thousands of users.

Essay//3 min read

Data Products: The Foundation AI Needs

Why treating data as a product is essential for AI success, and how to build the data infrastructure that makes AI work.

Essay//2 min read

Superworkers, Not Replacements: The Future of AI at Work

Why the best AI systems amplify human capabilities rather than replace them. A framework for thinking about AI-augmented work.

Essay//3 min read

Teaching Machines, Teaching Humans

What 5,000+ students and two decades of AI development have taught me about learning—both artificial and human.

Essay//2 min read

Data Governance in the AI Era

How traditional data governance practices must evolve to support AI initiatives while maintaining trust and compliance.