TL;DR
- 12 domains are ripe for agent-native institutions across personal, money, physical, and meta categories
- The pattern: stuck moment → society activates → competing approaches → verification → reusable artifact
- Physical world and meta-layer score highest because verification surfaces are clearest
- The winners won't be the loudest chatbot; they'll be the systems that produce verified artifacts
In Part 1, I introduced the "2.0 test" for identifying domains ready for agent-native institutions.
In Part 2 and Part 3, I explained the architecture: Reinforced Learning Environments, PAR loops, verification as the moat.
Now: where to apply this.
The Pattern
Before diving into specific institutions, let's be clear about what we're building. An agent-native institution isn't an app. It's a system that turns stuck moments into verified, reusable artifacts.
[Diagram: From Stuck to Artifact. How an agent-native institution transforms a messy question into a verified, reusable answer: Stuck Moment → Query → Society Activates → Competing Approaches → Verification Gate → Artifact Produced]
Every institution follows this pattern:
- Stuck moment: Someone (human or agent) hits a wall
- Query enters the system: Context gets gathered
- Society activates: Multiple agents engage (builder, critic, curator)
- Competing approaches: Parallel solutions generated
- Verification gate: Solutions tested against constraints
- Artifact produced: Verified answer becomes reusable
The artifact is the key. It doesn't just answer one question—it becomes a protocol that others can apply.
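To make the shape concrete, here's a minimal sketch of that loop in Python. Every name in it (Query, Artifact, the builder/critic/verify roles) is hypothetical; this is the skeleton of the pattern, not an implementation from the series:

```python
from dataclasses import dataclass

@dataclass
class Query:
    stuck_moment: str   # the wall someone hit
    context: dict       # constraints, history, environment

@dataclass
class Artifact:
    answer: str
    protocol: list[str]   # reusable steps others can apply later
    evidence: list[str]   # what the verification gate actually checked

def run_institution(query: Query, builders, critic, verify) -> Artifact | None:
    # Society activates: several builder agents propose competing approaches.
    candidates = [build(query) for build in builders]
    # A critic prunes weak proposals before they reach the verification gate.
    survivors = [c for c in candidates if critic(query, c)]
    # Verification gate: verify() returns evidence on pass, None on fail.
    for candidate in survivors:
        evidence = verify(query, candidate)
        if evidence is not None:
            # Artifact produced: the answer, packaged with its protocol
            # and the evidence trail that makes it reusable.
            return Artifact(answer=candidate["answer"],
                            protocol=candidate["steps"],
                            evidence=evidence)
    return None  # nothing cleared the gate: escalate, retry, or log the failure
```

The design choice worth noticing: verify() returns evidence rather than a boolean, so the artifact ships with its own audit trail.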
The 2.0 Test (Recap)
A domain is ready when it scores high on:
- High-frequency stuck moments: do people frequently hit "what do I do now?" moments?
- Fast feedback loops: can you test outcomes quickly?
- Clear verification surface: are there objective checks, measurements, or constraints?
- Artifact reusability: can solutions be templated, replayed, or audited?
- Reputation signal: can you rank contributors by signal quality?
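If you want to force an explicit score, here's a toy rubric in Python. The 0-5 ratings, the equal weights, and the hard cap on verification are my own illustration of "verification is the moat," not numbers from the series:

```python
# Toy readiness score: rate each criterion 0-5 by hand. The equal weights and
# the verification cap are illustrative, not part of the original test.
CRITERIA = [
    "stuck_moment_frequency",
    "feedback_loop_speed",
    "verification_surface",
    "artifact_reusability",
    "reputation_signal",
]

def readiness(scores: dict[str, int]) -> float:
    missing = [c for c in CRITERIA if c not in scores]
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    base = sum(scores[c] for c in CRITERIA) / (5 * len(CRITERIA))
    # Verification is the moat: a weak verification surface caps the total
    # no matter how strong the other criteria look.
    cap = scores["verification_surface"] / 5
    return min(base, cap)

# A physical-world domain with a strong verification surface scores high.
print(readiness({
    "stuck_moment_frequency": 4,
    "feedback_loop_speed": 4,
    "verification_surface": 5,
    "artifact_reusability": 4,
    "reputation_signal": 3,
}))  # -> 0.8
```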
Now let's apply this to 12 specific opportunities.
12 Agent-Native Institutions
I'm calling these "institutions," not "apps," because the moat is social + verification + memory. Apps can be copied. Institutions have network effects.
Category A: Personal AI (High Volume, High Leverage)
These touch everyday life. The stuck moments are frequent, and the verification surfaces are often personal data.
Beauty Routine Lab 2.0
Stuck moment: "Why is my skin/hair reacting?" / "What routine actually works for me?"
Verification surface: Photos over time, ingredient constraints, patch-test protocols, outcomes tracking.
Winning artifact: A routine as a versioned protocol (with what changed + why).
This works because beauty has measurable outcomes (skin texture, hydration, breakouts) and personal constraints (allergies, sensitivities). The verification surface exists—it just needs to be systematized.
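A sketch of what "routine as a versioned protocol" could look like as data. The fields here are hypothetical; the point is that every change records what changed and why, so tracked outcomes can be attributed to a specific version:

```python
from dataclasses import dataclass
import datetime

@dataclass(frozen=True)
class RoutineVersion:
    version: int
    steps: tuple[str, ...]
    changed: str      # what changed from the previous version
    rationale: str    # why: the hypothesis this version is testing
    date: datetime.date

history = [
    RoutineVersion(1, ("cleanser", "moisturizer", "spf"),
                   "initial routine", "baseline", datetime.date(2026, 1, 5)),
    RoutineVersion(2, ("cleanser", "niacinamide", "moisturizer", "spf"),
                   "added niacinamide",
                   "testing effect on texture over four weeks",
                   datetime.date(2026, 2, 2)),
]
# Verification: compare tracked photos and outcomes between the windows in
# which version 1 and version 2 were active.
```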
Health Triage Commons 2.0 (Careful + Heavily Governed)
Stuck moment: "Is this serious?" / "What should I ask my doctor?"
Verification surface: Guidelines, contraindication rules, lab ranges, symptom timelines.
Winning artifact: Question list + decision tree + "what would change my mind" checklist.
This becomes powerful without practicing medicine. It's about pre-visit preparation, not diagnosis. The agent society helps you ask better questions, verified against medical guidelines; it doesn't replace your doctor.
Meal Plan Compiler 2.0
Stuck moment: "I need food that fits constraints (time, budget, macros, allergies)."
Verification surface: Nutrition databases, constraint solving, grocery receipts, adherence tracking.
Winning artifact: Meal programs that compile into shopping + prep + schedule.
This is essentially constraint satisfaction with a verification loop. The artifact isn't a recipe—it's a week-long program that accounts for your specific constraints.
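Here's a deliberately greedy sketch of that constraint loop, with hypothetical recipe fields. A real compiler would optimize across the whole week (budget, variety, prep batching) rather than one day at a time:

```python
from dataclasses import dataclass

@dataclass
class Recipe:
    name: str
    minutes: int
    cost: float
    allergens: set[str]
    protein_g: int

def compile_day(recipes: list[Recipe], max_minutes: int, budget: float,
                allergies: set[str], min_protein: int) -> list[Recipe]:
    # Hard constraints filter first: time and allergens are non-negotiable.
    feasible = [r for r in recipes
                if r.minutes <= max_minutes and not (r.allergens & allergies)]
    # Soft goal: favor protein per dollar, then pack greedily under budget.
    feasible.sort(key=lambda r: r.protein_g / max(r.cost, 0.01), reverse=True)
    plan, spent, protein = [], 0.0, 0
    for r in feasible:
        if spent + r.cost <= budget:
            plan.append(r)
            spent += r.cost
            protein += r.protein_g
        if protein >= min_protein:
            break
    return plan  # the verification loop checks receipts and adherence afterwards
```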
Homeowner Repair Guild 2.0
Stuck moment: "What's the safest next step before I call someone?"
Verification surface: Manuals, code constraints, sensor readings, photos, risk classification.
Winning artifact: Step-by-step triage that knows when to stop and escalate.
The key is knowing when to stop. The artifact isn't "how to fix everything"—it's "how far you can safely go, and when to call a professional."
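A minimal sketch of "knows when to stop": every step carries a risk class, and anything above the user's clearance becomes an escalation instead of an instruction. The risk tiers and plumbing steps are made up for illustration:

```python
RISK_ORDER = ["low", "moderate", "professional_only"]

def triage(steps: list[tuple[str, str]], clearance: str) -> list[str]:
    """steps are (description, risk_class) pairs; returns safe actions,
    then stops at the first step above the user's clearance."""
    plan = []
    for description, risk in steps:
        if RISK_ORDER.index(risk) > RISK_ORDER.index(clearance):
            plan.append(f"STOP: escalate to a professional before '{description}'")
            break
        plan.append(description)
    return plan

print(triage([
    ("shut off water at the fixture valve", "low"),
    ("check the supply line for visible leaks", "low"),
    ("replace the fill valve", "moderate"),
    ("re-solder the copper joint", "professional_only"),
], clearance="moderate"))
```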
Category B: Money + Career (High Stakes)
These require stronger governance because the stakes are higher. Verification must be robust.
Investing Committee 2.0 (Education-First, Compliance-Aware)
Stuck moment: "How do I reason about this decision under uncertainty?"
Verification surface: Backtests (with leakage controls), risk models, scenario analysis, IPS constraints.
Winning artifact: An "investment memo" template with assumptions + risk bounds + scenarios.
This isn't "what to buy." It's "how to decide." The artifact is a reasoning framework that helps you think clearly, not a recommendation that bypasses thought.
Compensation & Role Calibration 2.0
Stuck moment: "Am I underleveled/underpaid? What's market for my scope?"
Verification surface: Job ladders, comp bands (where available), offer letters (redacted), skill evidence.
Winning artifact: A level rubric + negotiation packet grounded in scope and evidence.
The verification surface here is tricky—comp data is noisy and often private. But the artifact (a calibrated understanding of your level and market rate) is valuable.
Interview Loop Foundry 2.0
Stuck moment: "How do we make hiring consistent and fair?"
Verification surface: Structured rubrics, validity checks, interviewer calibration stats.
Winning artifact: Role-specific interview packs that produce comparable signals.
This is about reducing variance and bias in hiring. The artifact is a calibrated interview process, not just a question bank.
Category C: Physical World (Where Verification Is the Moat)
These have the strongest verification surfaces because physical systems produce measurable data.
Manufacturing Debug Bay 2.0
Stuck moment: "The line is drifting—what changed?"
Verification surface: Sensor traces, SPC charts, digital work instructions, parts traceability.
Winning artifact: Root-cause narrative + patch + guardrail test.
Manufacturing is data-rich. The verification surface is excellent. The challenge is synthesizing across multiple data streams to find the real root cause.
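One concrete check a debug society might run first is a standard SPC run rule: eight successive points on one side of the center line suggest the process mean has shifted. The rule itself is textbook Western Electric; the plumbing around it is a sketch:

```python
def mean_shift_detected(samples: list[float], center: float,
                        run_length: int = 8) -> bool:
    """Flags a run of `run_length` consecutive points on one side of the
    center line; a point exactly on the line resets the run."""
    run, last_side = 0, 0
    for x in samples:
        side = 1 if x > center else -1 if x < center else 0
        run = run + 1 if side == last_side and side != 0 else (1 if side != 0 else 0)
        last_side = side
        if run >= run_length:
            return True
    return False
```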
Maintenance Triage Swarm 2.0
Stuck moment: "Is this a $5 fix or a $50k failure?"
Verification surface: Vibration/thermal readings, error codes, historical failure modes.
Winning artifact: Ranked fault tree with confidence + next measurement to take.
The artifact here is diagnostic: not just "what's wrong" but "how confident are we, and what would change that confidence."
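A sketch of that artifact: fault hypotheses ranked by a simple Bayesian update over observed symptoms, reported with confidence. The priors and likelihoods are invented numbers:

```python
def rank_faults(priors: dict[str, float],
                likelihoods: dict[str, dict[str, float]],
                observed: list[str]) -> list[tuple[str, float]]:
    # posterior is proportional to prior times P(symptom | fault)
    # for each observed symptom
    scores = {}
    for fault, prior in priors.items():
        p = prior
        for symptom in observed:
            p *= likelihoods[fault].get(symptom, 0.01)  # floor for unmodeled symptoms
        scores[fault] = p
    total = sum(scores.values()) or 1.0
    return sorted(((f, p / total) for f, p in scores.items()),
                  key=lambda fp: fp[1], reverse=True)

print(rank_faults(
    priors={"bearing_wear": 0.3, "misalignment": 0.5, "imbalance": 0.2},
    likelihoods={
        "bearing_wear": {"high_freq_vibration": 0.9, "heat": 0.7},
        "misalignment": {"high_freq_vibration": 0.2, "heat": 0.6},
        "imbalance":    {"high_freq_vibration": 0.4, "heat": 0.2},
    },
    observed=["high_freq_vibration"],
))
# bearing_wear leads at ~0.60; a thermal reading is the next measurement,
# since the heat likelihoods separate bearing_wear from imbalance.
```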
Supply Chain Exception Room 2.0
Stuck moment: "This shipment is stuck—what's the cheapest resolution path?"
Verification surface: Lead times, contractual SLAs, port/carrier constraints, cost curves.
Winning artifact: Exception playbooks that get reused and improved.
Supply chain exceptions are repetitive enough that patterns emerge. The institution captures those patterns as reusable playbooks.
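A sketch of playbook reuse: match the stuck shipment against the conditions of past exception patterns, then rank the applicable playbooks by realized cost. The fields and numbers are illustrative:

```python
def best_playbook(exception: dict, playbooks: list[dict]) -> dict | None:
    # A playbook applies when all of its conditions match the exception.
    applicable = [p for p in playbooks
                  if all(exception.get(k) == v for k, v in p["conditions"].items())]
    # Rank by average realized cost across the playbook's previous uses.
    return min(applicable, key=lambda p: p["avg_cost"], default=None)

print(best_playbook(
    {"stage": "customs_hold", "carrier": "ocean", "perishable": False},
    [{"conditions": {"stage": "customs_hold"},
      "action": "file amended entry", "avg_cost": 1200},
     {"conditions": {"stage": "customs_hold", "perishable": True},
      "action": "air reroute", "avg_cost": 9000}],
))  # -> the amended-entry playbook; the perishable reroute doesn't apply
```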
Category D: Meta-Layer (The Institutions That Make Others Work)
These are foundational—they enable the other categories.
Eval Harness Exchange 2.0
Stuck moment: "How do we know this agent/system is actually good?"
Verification surface: Benchmarks, simulators, unit tests, red-team suites.
Winning artifact: Shared evaluation suites as a public good (with provenance).
This is infrastructure. As agent systems proliferate, we need shared ways to evaluate them. The artifact is a trustworthy eval suite that others can use.
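A sketch of what "eval suite with provenance" might mean in code: every case records its contributor, and every run returns an auditable trace, not just a score. The case format is hypothetical:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    case_id: str
    prompt: str
    check: Callable[[str], bool]   # objective pass/fail check on the output
    contributor: str               # provenance: who added this case

def run_suite(agent: Callable[[str], str], suite: list[EvalCase]) -> dict:
    if not suite:
        raise ValueError("empty suite")
    results = [(c.case_id, c.check(agent(c.prompt)), c.contributor) for c in suite]
    passed = sum(1 for _, ok, _ in results if ok)
    return {
        "pass_rate": passed / len(suite),
        "trace": results,   # auditable: every case, its outcome, and its source
    }
```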
Reputation & Identity Graph 2.0
Stuck moment: "Who/what should I trust for this kind of work?"
Verification surface: Track record on verified tasks, reliability under constraints, audit trails.
Winning artifact: Portable reputation that routes tasks to the right specialists.
This is the meta-institution that makes all others work better. Portable, verified reputation is the substrate for agent economies.
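A sketch of the substrate: a per-skill reliability score that only verified outcomes can move, plus routing that prefers the best verified track record. The smoothing constant and data shapes are illustrative:

```python
def update_reputation(rep: dict[str, float], skill: str,
                      verified_success: bool, alpha: float = 0.1) -> dict[str, float]:
    # Only verified task outcomes move the score; unverified claims don't.
    prior = rep.get(skill, 0.5)   # uninformative starting point
    outcome = 1.0 if verified_success else 0.0
    rep[skill] = (1 - alpha) * prior + alpha * outcome
    return rep

def route_task(skill: str, agents: dict[str, dict[str, float]]) -> str:
    # Route to whichever agent has the best verified track record on this skill.
    return max(agents, key=lambda a: agents[a].get(skill, 0.0))
```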
Category Readiness
Not all categories are equally ready. Here's how they score:
[Chart: Category Readiness Radar. Which domains are most ready for agent-native institutions?]
Physical World and Meta-Layer score highest because their verification surfaces are clearest. Sensors don't lie. Benchmarks produce numbers. The moat is always verification.
Personal AI scores well on frequency and reusability, but verification is sometimes subjective (did the skincare routine "work"?).
Money + Career has high stakes but often slower feedback loops: you may not know for months or years whether career advice worked.
The Throughline
What do all 12 have in common?
They're all "when you're stuck" markets.
The winners won't be the loudest chatbot. They'll be the systems that turn messy questions into verified artifacts.
Notice what's not on this list:
- General conversation (no verification surface)
- Creative writing (subjective outcomes)
- Relationship advice (long feedback loops, no clear verification)
- Strategic planning (highly contextual, hard to template)
These may eventually have agent-native institutions, but they fail the 2.0 test today.
What To Build
If you're looking for opportunities:
Start where verification is clearest. Physical world and meta-layer institutions are lower risk because you can prove they work.
Start where stuck moments are frequent. High volume creates more data, more patterns, more opportunities to learn.
Start where artifacts are reusable. The compounding comes from artifacts that others can apply. One-off answers don't compound.
Start where reputation matters. If quality can't be distinguished, the institution can't select for quality.
The Convergence (Revisited)
This series has been building toward one idea:
Society + loops + grounding = compounding competence.
- Societies generate priors (what's been tried, what works)
- Loops provide feedback (PAR at micro, meso, macro scales)
- Grounding provides truth (verification, world models, consequences)
The 12 institutions above are applications of this pattern. Each one creates a society around a specific "stuck moment," with verification that separates signal from noise, and artifacts that compound value over time.
What's Next
The platforms that figure this out in 2026-2027 will have durable advantages. They'll have:
- Network effects from accumulated reputation
- Data moats from verified interaction traces
- Quality moats from selection pressure
The ones that don't will produce increasingly sophisticated noise.
The question isn't whether agent-native institutions will emerge. They will.
The question is: which ones will you build?
This concludes the Agent Societies series. For the architectural foundations, see my work on PAR loops, Stochastic Core, Deterministic Shell, and RaaS Architecture.