
Architecture Comparator

Compare the risk profiles of different delegation architectures to make informed design decisions.

This tool provides side-by-side comparison of delegation architectures:

  1. Component Breakdown: See which AI/human components each architecture uses
  2. Risk Visualization: Compare expected vs. worst-case risk
  3. Mitigation Impact: See how different safety measures affect overall risk
  4. Trade-off Analysis: Understand the cost-benefit of each approach

Architecture Comparison

[Bar chart: expected monthly risk (darker bars) and worst-case potential (lighter bars) for Baseline (Human Only), Simple AI Assist, Autonomous AI, and Full Automation.]

Trade-off Analysis

Architecture            Expected Risk   Worst Case   Components   Mitigations
Baseline (Human Only)   $72/mo*         $2,000       1            2
Simple AI Assist        $79/mo          $2,500       2            2
Autonomous AI           $151/mo         $4,800       3            3
Full Automation         $602/mo         $8,000       3            3

Component Type Legend
  • Deterministic (~0.1% base)
  • Narrow ML (~3.0% base)
  • General LLM (~10.0% base)
  • RL Agent (~20.0% base)
  • Human (~5.0% base)

Baseline (Human Only)

Traditional human-driven process with no AI delegation. Establishes the risk level you’re comparing against.

Characteristics:

  • Single point of failure (human error)
  • Predictable but expensive
  • Limited scalability
  • Well-understood failure modes

Simple AI Assist

AI provides suggestions and recommendations, but humans make all decisions. Common for high-stakes domains.

Characteristics:

  • AI errors caught by human review
  • Slower than autonomous but safer
  • Good for building trust in AI systems
  • Human bottleneck remains

Autonomous AI

AI handles routine tasks independently; humans handle exceptions. Balances efficiency with safety.

Characteristics:

  • Higher throughput for routine work
  • Complex failure modes (routing errors)
  • Requires robust exception handling
  • Multiple points of mitigation

Full Automation

End-to-end AI with minimal human intervention. Maximum efficiency, maximum risk.

Characteristics:

  • Highest potential damage
  • Requires extensive mitigation
  • Suitable only for well-understood domains
  • Fastest degradation if poorly designed

  • Expected Risk (dark bar): Average monthly risk given probability distributions
  • Worst Case (light bar): Maximum possible damage if everything fails

A wide gap between the two bars indicates high tail risk: a typical month looks cheap, but a rare failure could cost far more.
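One rough way to quantify that gap is the ratio of worst-case damage to expected monthly risk. The sketch below uses the figures from the Trade-off Analysis table. The `tail_ratio` metric and the function names are illustrative assumptions and not something the tool itself reports.

```python
# Expected monthly risk and worst-case damage in dollars,
# copied from the Trade-off Analysis table.
ARCHITECTURES = {
    "Baseline (Human Only)": (72, 2_000),
    "Simple AI Assist": (79, 2_500),
    "Autonomous AI": (151, 4_800),
    "Full Automation": (602, 8_000),
}


def tail_ratio(expected: float, worst_case: float) -> float:
    """Worst-case damage as a multiple of expected monthly risk.

    Hypothetical metric for illustration; a larger value means a wider gap.
    """
    if expected <= 0:
        raise ValueError("expected risk must be positive")
    return worst_case / expected


def tail_ratios() -> dict[str, float]:
    """Tail ratio for every architecture in the table."""
    return {name: tail_ratio(e, w) for name, (e, w) in ARCHITECTURES.items()}


for name, ratio in tail_ratios().items():
    print(f"{name:<24} {ratio:5.1f}x")
```

On these numbers the ratio is smallest for Full Automation (about 13x) even though its absolute figures are the largest, so the ratio is best read alongside the absolute worst case and not in place of it.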

Risk varies significantly by component type:

Type            Base Failure Rate   Typical Use
Deterministic   ~0.1%               Rule-based routing, validation
Narrow ML       ~3%                 Classification, detection
General LLM     ~10%                Generation, reasoning
RL Agent        ~20%                Autonomous decision-making
Human           ~5%                 Review, exception handling
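To get a feel for how these base rates might combine, here is a minimal sketch of two simple composition rules. Both rules are assumptions made for illustration: the page does not state how the tool aggregates component risk, and real components are rarely fully independent. The dictionary keys and function names are hypothetical.

```python
import math

# Base failure rates from the table above, as probabilities.
BASE_FAILURE_RATE = {
    "deterministic": 0.001,
    "narrow_ml": 0.03,
    "general_llm": 0.10,
    "rl_agent": 0.20,
    "human": 0.05,
}


def series_failure(components: list[str]) -> float:
    """P(at least one component fails).

    Assumes independent failures and a chain in which any single
    failure breaks the whole process.
    """
    survive = math.prod(1.0 - BASE_FAILURE_RATE[c] for c in components)
    return 1.0 - survive


def reviewed_failure(worker: str, reviewer: str) -> float:
    """P(worker fails and the reviewer misses it).

    Assumes independence and treats the reviewer's base rate as the
    chance of missing an error.
    """
    return BASE_FAILURE_RATE[worker] * BASE_FAILURE_RATE[reviewer]


print(f"LLM then human, in series: {series_failure(['general_llm', 'human']):.3f}")
print(f"LLM reviewed by human:     {reviewed_failure('general_llm', 'human'):.3f}")
```

The two rules give very different answers for the same pair of components, which is why the way components are wired together matters as much as their individual rates.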

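Each mitigation reduces risk multiplicatively. The formula below treats each mitigation as leaving 85% of the remaining risk in place (a 15% reduction each), so with 3 mitigations:

final_risk = base_risk × 0.85³ ≈ base_risk × 0.61

A mitigation that removed 85% of the risk would use a factor of 0.15 instead. Either way, more mitigations help, but with diminishing returns: each additional mitigation removes a smaller absolute amount of risk than the one before.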
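The multiplicative formula above can be sketched in a few lines. The per-mitigation factor of 0.85 is taken from that formula. The function names are hypothetical, and the base risk of 100 is an arbitrary illustrative value.

```python
def residual_risk(base_risk: float, n_mitigations: int, factor: float = 0.85) -> float:
    """Risk remaining after n mitigations, each multiplying risk by `factor`."""
    if n_mitigations < 0:
        raise ValueError("n_mitigations must be non-negative")
    return base_risk * factor ** n_mitigations


def marginal_reduction(base_risk: float, n: int, factor: float = 0.85) -> float:
    """Absolute risk removed by adding the n-th mitigation (n >= 1)."""
    return residual_risk(base_risk, n - 1, factor) - residual_risk(base_risk, n, factor)


base = 100.0  # arbitrary illustrative base risk
for n in range(1, 6):
    print(
        f"{n} mitigation(s): residual = {residual_risk(base, n):6.2f}, "
        f"removed by this one = {marginal_reduction(base, n):5.2f}"
    )
```

Each line of output removes less than the one before it, which is the diminishing-returns effect: the first mitigation removes 15 units of risk from a base of 100, the third removes under 11.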


Choose Baseline (Human Only) when:

  • Stakes are extremely high
  • Decisions require nuanced judgment
  • AI systems aren’t well-calibrated for your domain
  • Regulatory requirements demand human oversight

Choose Simple AI Assist when:

  • AI can improve human decision quality
  • Speed is important but not critical
  • Building organizational trust in AI
  • Failure costs are moderate

Choose Autonomous AI when:

  • High volume of routine decisions
  • Clear criteria for “routine” vs “exception”
  • Good monitoring and fallback in place
  • Moderate-to-high risk tolerance

Choose Full Automation when:

  • Domain is well-understood with clear boundaries
  • Extensive testing and validation completed
  • Strong mitigation stack in place
  • Benefits significantly outweigh risks

From Baseline to Simple AI Assist:

  1. Start with AI suggestions for low-stakes decisions
  2. Track AI accuracy vs. human decisions
  3. Gradually expand scope based on performance
  4. Maintain human review throughout

From Simple AI Assist to Autonomous AI:

  1. Identify truly routine tasks (over 95% predictable)
  2. Implement robust exception detection
  3. Add monitoring and alerting
  4. Pilot with limited scope, then expand

From Autonomous AI to Full Automation:

  1. Achieve consistent performance metrics
  2. Reduce human exception handling rate to less than 5%
  3. Implement comprehensive mitigation stack
  4. Establish clear rollback procedures
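The quantitative steps in these lists (tracking AI accuracy against human decisions, the 95% predictability bar, and the 5% exception-rate bar) can be turned into simple checks. The sketch below is one possible way to do that. The function names and the idea of reducing each step to a single number are assumptions, and the qualitative steps (monitoring, rollback procedures, mitigation stack) are not captured.

```python
from collections.abc import Sequence

ROUTINE_THRESHOLD = 0.95    # "over 95% predictable"
EXCEPTION_THRESHOLD = 0.05  # "less than 5%" human exception handling


def agreement_rate(ai_decisions: Sequence[str], human_decisions: Sequence[str]) -> float:
    """Fraction of cases where the AI suggestion matched the human decision."""
    if len(ai_decisions) != len(human_decisions):
        raise ValueError("decision lists must have the same length")
    if not ai_decisions:
        raise ValueError("need at least one decision to compare")
    matches = sum(a == h for a, h in zip(ai_decisions, human_decisions))
    return matches / len(ai_decisions)


def routine_enough(predictable_fraction: float) -> bool:
    """True if the task set clears the 'over 95% predictable' bar."""
    return predictable_fraction > ROUTINE_THRESHOLD


def exceptions_low_enough(exception_rate: float) -> bool:
    """True if the human exception handling rate is below 5%."""
    return exception_rate < EXCEPTION_THRESHOLD


# Hypothetical pilot data for illustration only.
ai = ["approve", "reject", "approve", "approve"]
human = ["approve", "reject", "reject", "approve"]
print(f"agreement rate: {agreement_rate(ai, human):.2f}")
print(f"routine enough at 96% predictable: {routine_enough(0.96)}")
print(f"exceptions low enough at 4%: {exceptions_low_enough(0.04)}")
```

Both thresholds are strict comparisons, matching the wording "over 95%" and "less than 5%", so a value exactly on the boundary does not pass.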

The default architectures serve as templates. To analyze your specific situation:

  1. Use the Risk Calculator to model each architecture
  2. Use the Sensitivity Dashboard to identify key parameters
  3. Use the Trust Updater to calibrate component reliability
  4. Document your analysis in the Decomposition Worksheet