Capability-Trust Tradeoff
The fundamental tradeoff in AI system design: more capable systems require more trust, but trust is limited.
[Interactive chart: system configurations plotted by capability and trust, with the Pareto frontier and the dominated region marked. Legend: On Frontier (Efficient), Dominated (Inefficient).]
Understanding the Frontier
What is a Pareto Frontier?
A Pareto frontier (or efficient frontier) shows the best achievable tradeoffs between two objectives. Points on the frontier are “efficient”: you cannot improve one dimension without sacrificing the other.
[Diagram: filled points (●) trace the efficient frontier; hollow points (○) sit in the dominated region below it. Horizontal axis: capability.]
- Green points (on frontier): Efficient configurations
- Red points (below frontier): Dominated. Another configuration is at least as good on both capability and trust, and strictly better on at least one
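As a rough illustration of the dominance test, the following Python sketch splits a set of configurations into frontier and dominated points. The `Config` type, the 0-to-1 scores, and the example systems are all hypothetical, chosen only for illustration; nothing here is taken from the interactive chart's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    capability: float  # assumed scale: 0 (none) to 1 (maximal)
    trust: float       # assumed scale: 0 (none) to 1 (fully trusted)


def dominates(a: Config, b: Config) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    at_least_as_good = a.capability >= b.capability and a.trust >= b.trust
    strictly_better = a.capability > b.capability or a.trust > b.trust
    return at_least_as_good and strictly_better


def pareto_frontier(configs: list[Config]) -> list[Config]:
    """Return the non-dominated configurations, sorted by capability."""
    frontier = [c for c in configs if not any(dominates(o, c) for o in configs)]
    return sorted(frontier, key=lambda c: c.capability)


# Hypothetical example data, not measurements of real systems.
configs = [
    Config("chatbot", 0.2, 0.9),
    Config("code-assistant", 0.5, 0.7),
    Config("legacy-tool", 0.4, 0.5),  # dominated by code-assistant
    Config("autonomous-agent", 0.9, 0.3),
]

frontier = pareto_frontier(configs)
dominated = [c for c in configs if c not in frontier]
print([c.name for c in frontier])
print([c.name for c in dominated])
```

The pairwise check is quadratic in the number of configurations, which should be fine for the handful of systems a chart like this typically holds.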
The Capability-Trust Tradeoff
| More Capability | Trust Cost |
|---|---|
| Broader action space | More ways to cause harm |
| Less human oversight | Less chance to catch errors |
| Faster execution | Less time for verification |
| More autonomy | Higher trust exposure |
This is why the frontier slopes downward: gaining capability typically costs trust.
Strategic Positions
Conservative (High Trust, Lower Capability)
- Position: Upper-left of frontier
- Examples: Basic chatbots, constrained tools
- Tradeoff: Limited functionality, very safe
- Use when: Stakes are high, delegation risk budget is small
Aggressive (High Capability, Lower Trust)
- Position: Lower-right of frontier
- Examples: Autonomous agents, full-autonomy systems
- Tradeoff: Powerful but risky
- Use when: High potential value, substantial delegation risk budget
Balanced (Middle of Frontier)
- Position: Center of frontier
- Examples: Code assistants, research tools
- Tradeoff: Moderate capability and trust
- Use when: Need functionality without extreme risk
Moving the Frontier
The frontier isn’t fixed. You can shift it outward (better tradeoffs) through:
1. Better Verification
- Formal methods push frontier outward
- Same capability, higher trust
2. Architectural Improvements
- Smaller blast radius
- Defense in depth
- Human gates at critical points
3. Capability Restrictions
- Limit action space to safe subset
- Trade raw capability for trust
4. Better Monitoring
- Detect problems faster
- Reduce expected damage
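One way to make "shifting outward" concrete is to compare frontiers before and after an improvement. The sketch below assumes, purely for illustration, that better verification raises trust by a fixed amount at unchanged capability; the points and the `uplift` value are hypothetical. It then checks that every point on the old frontier is matched or beaten on both axes by some point on the new one.

```python
Point = tuple[float, float]  # (capability, trust), both assumed to lie in [0, 1]


def frontier(points: list[Point]) -> list[Point]:
    """Keep points that no other, different point matches or beats on both axes."""
    def dominated(p: Point) -> bool:
        return any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in points)
    return sorted(p for p in points if not dominated(p))


def shifted_outward(old: list[Point], new: list[Point]) -> bool:
    """True if each old frontier point is weakly dominated by a new frontier point."""
    new_frontier = frontier(new)
    return all(
        any(n[0] >= o[0] and n[1] >= o[1] for n in new_frontier)
        for o in frontier(old)
    )


# Hypothetical configurations before any improvement.
before = [(0.2, 0.9), (0.5, 0.7), (0.9, 0.3)]

# Assumed effect of better verification: same capability, higher trust (capped at 1).
uplift = 0.1
after = [(cap, min(1.0, trust + uplift)) for cap, trust in before]

print(shifted_outward(before, after))  # True
print(shifted_outward(after, before))  # False
```

The asymmetry between the two calls is the point: the improved set covers the original frontier, but not the other way around.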
Before: ●───●───●
After: ●───●───● (shifted outward)
Using the Budget Line
The orange dashed line (shown when the budget is greater than 0) marks configurations with approximately equal Delegation Risk. Because trust is on the vertical axis, a point below the line has less trust for the same capability, so points below this line exceed your risk budget.
Reading the chart:
- Configurations on the frontier AND on or above the budget line are optimal choices
- Dominated configurations on or above the budget line are within budget but inefficient: improve them
- Any configuration below the budget line exceeds acceptable risk
Practical Application
- Map your systems: Plot each AI system’s capability and trust
- Identify dominated systems: Why do they exist? Legacy? Oversight gap?
- Set budget line: What Delegation Risk can you accept?
- Choose position: Where on the frontier matches your needs?
- Improve frontier: Invest in verification/architecture to shift outward
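The first four steps above can be sketched end to end. This is a minimal sketch under stated assumptions: each system already has capability, trust, and Delegation Risk estimates (the `risk` field, in arbitrary units, is assumed to come from a separate assessment), and all names and numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    capability: float  # assumed 0-to-1 score
    trust: float       # assumed 0-to-1 score
    risk: float        # assumed Delegation Risk estimate, arbitrary units


def is_dominated(s: System, systems: list[System]) -> bool:
    """True if another system is at least as good on both axes and better on one."""
    return any(
        o.capability >= s.capability and o.trust >= s.trust
        and (o.capability > s.capability or o.trust > s.trust)
        for o in systems
    )


def review(systems: list[System], budget: float) -> dict[str, str]:
    """Label each system by budget status first, then by dominance."""
    labels = {}
    for s in systems:
        if s.risk > budget:
            labels[s.name] = "over budget"
        elif is_dominated(s, systems):
            labels[s.name] = "within budget, dominated: improve"
        else:
            labels[s.name] = "candidate"
    return labels


# Map your systems (hypothetical values).
systems = [
    System("chatbot", 0.2, 0.9, risk=1.0),
    System("code-assistant", 0.5, 0.7, risk=4.0),
    System("legacy-tool", 0.4, 0.5, risk=6.0),
    System("autonomous-agent", 0.9, 0.3, risk=20.0),
]

# Identify dominated systems, apply the budget, and list the remaining candidates.
labels = review(systems, budget=10.0)
for name, label in labels.items():
    print(f"{name}: {label}")
```

Dominance here is checked against every system, including those over budget; a stricter variant could compare only the systems that fit the budget. Choosing among the remaining candidates is still a judgment call about where on the frontier your needs sit.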
Next Steps
- Delegation Risk Calculator: Quantify your risk exposure
- Risk Inheritance: Model trust through delegation chains
- Decision Guide: Choose implementation approach