5-Minute Introduction

The problem: AI agents can cause real harm. One powerful agent with full access is risky—if it fails or schemes, damage may be unbounded.

The proposed solution: Decompose into many limited components. Budget risk like money.


Instead of this (stylized—real systems exist on a spectrum):

User → [Powerful AI Agent] → Any Action
Full context
Full capability
Full autonomy

Consider this:

User → [Router] → [Narrow Component 1] → [Verifier]   → Limited Action 1
                → [Narrow Component 2] → [Verifier]   → Limited Action 2
                → [Narrow Component 3] → [Human Gate] → Sensitive Action

Each component has:

  • Limited capability (can only do specific things)
  • Limited context (doesn’t see the whole system)
  • Limited autonomy (verified or human-approved)
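The routing pattern above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Component` class, the `route` function, the `"kind: payload"` request format, and the `lookup` and `refund` components are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class Component:
    """A narrow component: one handler plus the check that gates its output."""
    handle: Callable[[str], str]    # limited capability: does one specific thing
    approve: Callable[[str], bool]  # limited autonomy: verifier or human gate


def route(request: str, components: Dict[str, Component]) -> str:
    """Send a "kind: payload" request to exactly one component, or refuse."""
    kind, _, payload = request.partition(":")
    component = components.get(kind.strip())
    if component is None:
        return "refused: no component for this request"
    # Limited context: the component sees only the payload, nothing else.
    result = component.handle(payload.strip())
    if not component.approve(result):
        return "blocked: output failed verification"
    return result


def human_gate(result: str) -> bool:
    """Placeholder human gate (assumption): deny until a person approves."""
    return False


COMPONENTS = {
    "lookup": Component(
        handle=lambda q: f"docs entry for {q!r}",
        approve=lambda out: len(out) < 200,  # trivial automated verifier
    ),
    "refund": Component(
        handle=lambda q: f"refund request: {q}",
        approve=human_gate,  # sensitive action waits for a human
    ),
}

print(route("lookup: vacation policy", COMPONENTS))
print(route("refund: order 42", COMPONENTS))
print(route("delete: everything", COMPONENTS))
```

In this sketch an unknown request type is refused outright, and the sensitive `refund` path stays blocked until the human gate is replaced with a real approval step.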

Delegation Risk = probability of bad outcome × damage

Summed over all possible failures:

Delegation Risk = Σ P(failure) × Damage

Example: Component has 1% chance of causing $10,000 damage → Delegation Risk = $100

Budget your Delegation Risk across components. If total exceeds budget, add safeguards.
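Written out as equations, the definition, the worked example, and the budget check might look like the following. The symbols are notation assumed for this sketch and do not come from the text: $P_i$ and $D_i$ are the probability and damage of failure $i$, $\mathrm{DR}_c$ is the Delegation Risk of component $c$, and $B$ is the budget.

```latex
% Definition: sum over possible failures i
\[
\mathrm{DR} = \sum_{i} P_i \times D_i
\]

% Worked example from the text: one failure, 1% chance, $10,000 damage
\[
\mathrm{DR} = 0.01 \times \$10{,}000 = \$100
\]

% Budget check across components c; add safeguards if this does not hold
\[
\sum_{c} \mathrm{DR}_c \le B
\]
```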


Apply “Least X” to every component:

| Principle | Meaning |
|---|---|
| Least Intelligence | Dumbest implementation that works |
| Least Privilege | Minimum permissions |
| Least Context | Minimum information |
| Least Persistence | No memory between calls |
| Least Autonomy | Human oversight where needed |
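One way to make these principles concrete is to declare them as an explicit per-component policy with restrictive defaults. The sketch below is only an illustration: `ComponentPolicy`, its field names, and the helper functions are hypothetical, and Least Intelligence is left out because it concerns which implementation you choose rather than a runtime setting.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet


@dataclass(frozen=True)
class ComponentPolicy:
    """Hypothetical per-component policy; defaults are the restrictive choice."""
    allowed_actions: FrozenSet[str]      # Least Privilege
    visible_fields: FrozenSet[str]       # Least Context
    keeps_memory: bool = False           # Least Persistence
    needs_human_approval: bool = True    # Least Autonomy


def filter_context(policy: ComponentPolicy, record: Dict[str, Any]) -> Dict[str, Any]:
    """Pass along only the fields this component is allowed to see."""
    return {k: v for k, v in record.items() if k in policy.visible_fields}


def is_allowed(policy: ComponentPolicy, action: str) -> bool:
    """Permit an action only if it is explicitly listed."""
    return action in policy.allowed_actions


poster = ComponentPolicy(
    allowed_actions=frozenset({"post_message"}),
    visible_fields=frozenset({"channel", "text"}),
)

record = {"channel": "#help", "text": "hi", "user_email": "a@example.com"}
print(filter_context(poster, record))
print(is_allowed(poster, "delete_channel"))
```

Anything not explicitly granted is denied or hidden, so loosening a constraint becomes a visible change to the policy rather than a silent default.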

Use the most verifiable option:

Best → Verified Code → Regular Code → Narrow Models → General LLMs → Worst

Don’t use GPT-4 for things a regex can do.
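As a small illustration of choosing the more verifiable option, here is a task that plain code handles fully: pulling order IDs out of a message. The `ORD-` plus six digits format and the function name are assumptions made up for this example.

```python
import re
from typing import List

# Assumed ID format for illustration: "ORD-" followed by exactly six digits.
ORDER_ID = re.compile(r"\bORD-\d{6}\b")


def extract_order_ids(message: str) -> List[str]:
    """Return every order ID in the message, in order of appearance."""
    return ORDER_ID.findall(message)


print(extract_order_ids("Where is ORD-123456? Also ORD-000042 and ORD-12."))
```

The regex is deterministic and easy to test exhaustively, which is the property the ordering above rewards.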


Task: Slack bot answering questions from docs

| Component | Implementation | Delegation Risk |
|---|---|---|
| Retriever | Code (vector search) | $5/mo |
| Answerer | Fine-tuned 7B | $50/mo |
| Poster | Code (rate limited) | $1/mo |
| Total | | $56/mo |

Budget: $500/mo → Within budget. Ship it.
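The budget check for this example is simple arithmetic, sketched here in Python. The per-component figures and the budget come from the example above; the function and variable names are hypothetical.

```python
from typing import Dict

# Monthly Delegation Risk per component, in dollars (figures from the example).
SLACK_BOT: Dict[str, float] = {
    "Retriever": 5.0,
    "Answerer": 50.0,
    "Poster": 1.0,
}
BUDGET = 500.0  # dollars per month


def total_risk(risks: Dict[str, float]) -> float:
    """Sum Delegation Risk across all components."""
    return sum(risks.values())


def within_budget(risks: Dict[str, float], budget: float) -> bool:
    """True if the total fits the budget; otherwise safeguards are needed."""
    return total_risk(risks) <= budget


def largest_contributor(risks: Dict[str, float]) -> str:
    """The component to look at first if the total has to come down."""
    return max(risks, key=lambda name: risks[name])


print(total_risk(SLACK_BOT), within_budget(SLACK_BOT, BUDGET))
print(largest_contributor(SLACK_BOT))
```

Tracking the largest contributor is a design choice for this sketch: here the fine-tuned model accounts for most of the total, so it is the natural place to add a verifier if the budget ever tightens.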


| Situation | Framework Value |
|---|---|
| Internal tools, recoverable mistakes | Light application |
| Customer-facing products | Medium rigor |
| Autonomous agents with real-world actions | Full application |
| Safety-critical systems | Essential |


TL;DR: Consider building many limited components instead of one powerful AI. Quantify risk. Budget it like money.