Risk Calculator

This tool calculates your expected monthly delegation risk using Monte Carlo simulation with probabilistic inputs.

  1. Define Harm Modes: Each potential failure mode has a probability of occurring and an associated damage distribution
  2. Probabilistic Modeling: Probabilities use beta distributions; damages use lognormal distributions to capture uncertainty
  3. Monte Carlo Simulation: 10,000 scenarios are sampled to build the risk distribution
  4. Budget Analysis: Compare your risk tolerance against the simulated distribution
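The four steps above can be sketched in Python using only the standard library. This is a rough illustration, not the calculator's actual implementation: the harm-mode parameters and the $500 budget are copied from the generated Squiggle model on this page, while the function names, the fixed seed, and the simple index-based percentile rule are assumptions made for the example.

```python
import random
import statistics
from math import log

# (name, beta_a, beta_b, median_damage, sigma), copied from the page's default model
HARM_MODES = [
    ("Data processing error", 5, 95, 500, 0.6),
    ("Security vulnerability", 1, 99, 5000, 1.0),
    ("Incorrect recommendation", 8, 92, 1000, 0.8),
]


def simulate_total_risk(harm_modes, n_samples=10_000, seed=0):
    """Sample total monthly risk as the sum of prob * damage over harm modes."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatable output
    totals = []
    for _ in range(n_samples):
        total = 0.0
        for _name, a, b, median_damage, sigma in harm_modes:
            prob = rng.betavariate(a, b)
            damage = rng.lognormvariate(log(median_damage), sigma)
            total += prob * damage
        totals.append(total)
    return totals


def summarize(totals, budget):
    """Return the summary statistics the calculator reports."""
    ordered = sorted(totals)

    def percentile(q):
        # Simple index-based percentile; other interpolation rules differ slightly.
        return ordered[min(len(ordered) - 1, int(q * len(ordered)))]

    return {
        "expected": statistics.fmean(ordered),
        "median": percentile(0.50),
        "p5": percentile(0.05),
        "p95": percentile(0.95),
        "budget_confidence": sum(t <= budget for t in ordered) / len(ordered),
    }


totals = simulate_total_risk(HARM_MODES)
summary = summarize(totals, budget=500)
print(summary)
```

Because the inputs are sampled, the printed numbers vary with the seed and sample count, but they should land near the figures the calculator shows for the same inputs.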

Results for a budget of $500/month:

  • Budget Confidence: 93.4% (probability of staying within budget)
  • Expected Risk: $222/month
  • Median: $166/month
  • 5th Percentile: $58/month
  • 95th Percentile: $557/month

[Chart: Risk Distribution (10,000 samples), x-axis from $0 to $3,243. Green = under budget, yellow = at budget, red = over budget.]

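Harm Modes

  • Data processing error: 5.0% monthly probability, ~$25/mo
  • Security vulnerability: 1.0% monthly probability, ~$50/mo
  • Incorrect recommendation: 8.0% monthly probability, ~$80/mo

For each harm mode's uncertainty setting, higher = more uncertain.
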
View as Squiggle Code
// Delegation Risk Model
// Generated from calculator inputs


// Data processing error
prob_0 = beta(5, 95)
damage_0 = lognormal(log(500), 0.6)
risk_0 = prob_0 * damage_0

// Security vulnerability
prob_1 = beta(1, 99)
damage_1 = lognormal(log(5000), 1)
risk_1 = prob_1 * damage_1

// Incorrect recommendation
prob_2 = beta(8, 92)
damage_2 = lognormal(log(1000), 0.8)
risk_2 = prob_2 * damage_2


// Total monthly delegation risk
totalRisk = risk_0 + risk_1 + risk_2

// Budget analysis
budget = 500
budgetConfidence = cdf(totalRisk, budget)

// Summary statistics
mean(totalRisk)      // Expected risk
quantile(totalRisk, 0.5)   // Median
quantile(totalRisk, 0.05)  // 5th percentile
quantile(totalRisk, 0.95)  // 95th percentile
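
As a cross-check on the simulated mean, the expected value of the model above can also be computed in closed form. The sketch below assumes, as the model does, that each probability and damage are sampled independently, so the mean of their product is the product of their means. It uses the standard results that beta(a, b) has mean a / (a + b) and that a lognormal with median m and log-scale spread sigma has mean m * exp(sigma^2 / 2). The function name is hypothetical.

```python
from math import exp


def expected_risk(a, b, median_damage, sigma):
    """Mean of prob * damage for independent beta(a, b) and lognormal inputs."""
    mean_prob = a / (a + b)
    mean_damage = median_damage * exp(sigma ** 2 / 2)
    return mean_prob * mean_damage


# (beta_a, beta_b, median_damage, sigma) for the three harm modes in the model
modes = [(5, 95, 500, 0.6), (1, 99, 5000, 1.0), (8, 92, 1000, 0.8)]
per_mode = [expected_risk(*m) for m in modes]
total_expected = sum(per_mode)
print([round(r, 1) for r in per_mode], round(total_expected, 1))
# prints [29.9, 82.4, 110.2] 222.5
```

The closed-form total of about $222.5 agrees with the simulated Expected Risk of $222/month. It also shows that the security vulnerability mode, despite its 1% probability, contributes more than the 5% data processing mode because of its larger and heavier-tailed damage.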

Probability is the expected chance that this harm mode occurs each month. For example:

  • 0.05 = 5% chance of occurring
  • 0.01 = 1% chance of occurring

Uncertainty reflects how unsure you are of the probability estimate. Higher values create wider distributions around the mean.
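
One common way to turn a mean probability and a level of certainty into beta parameters is to fix the mean a / (a + b) and vary the total a + b. The page does not show how the calculator's uncertainty control maps to beta parameters, so the `concentration` argument below is an assumption; the generated model's beta(5, 95), beta(1, 99), and beta(8, 92) all happen to sum to 100.

```python
from math import sqrt


def beta_from_mean(mean, concentration):
    """Return (a, b) with a / (a + b) == mean and a + b == concentration."""
    if not 0 < mean < 1 or concentration <= 0:
        raise ValueError("mean must be in (0, 1) and concentration positive")
    return mean * concentration, (1 - mean) * concentration


def beta_std(a, b):
    """Standard deviation of a beta(a, b) distribution."""
    n = a + b
    return sqrt(a * b / (n * n * (n + 1)))


# Lower concentration means a wider distribution around the same 5% mean.
for concentration in (1000, 100, 20):
    a, b = beta_from_mean(0.05, concentration)
    print(concentration, round(a, 2), round(b, 2), round(beta_std(a, b), 4))
```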

Damage is the typical (median) cost when this harm mode occurs. It is modeled with a lognormal distribution, so actual damages could be significantly higher.

The spread parameter (the second argument to lognormal) controls the “tail thickness” of the damage distribution:

  • 0.5 = relatively predictable damages
  • 1.0 = moderate variation
  • 1.5 = high variation (rare events could be much larger)
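
To make these three settings concrete, the sketch below computes how far the mean and the 95th percentile of a lognormal sit above its median for each spread value. It relies on standard lognormal identities (mean / median = exp(sigma^2 / 2), and a quantile is the median times exp(z * sigma)); the function name is hypothetical.

```python
from math import exp
from statistics import NormalDist

# z-score of the 95th percentile of a standard normal (about 1.645)
Z95 = NormalDist().inv_cdf(0.95)


def tail_multipliers(sigma):
    """Ratios of the lognormal mean and 95th percentile to its median."""
    return {
        "mean_over_median": exp(sigma ** 2 / 2),
        "p95_over_median": exp(Z95 * sigma),
    }


for sigma in (0.5, 1.0, 1.5):
    m = tail_multipliers(sigma)
    print(sigma, round(m["mean_over_median"], 2), round(m["p95_over_median"], 2))
```

At a spread of 1.5 the 95th-percentile damage is more than ten times the median, which is why rare events dominate the expected cost at high settings.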

Budget confidence is the probability that your actual monthly risk stays within budget. Target levels:

  • 90%+: Conservative, suitable for risk-averse organizations
  • 70-90%: Moderate, typical for most applications
  • Below 70%: Aggressive, may need risk reduction measures

  • Expected (mean): Average of all simulations. Pulled up by tail events.
  • Median: Middle value. More representative of a “typical” month.

If the expected value is much greater than the median, you have significant tail risk.
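
The budget-confidence bands and the mean-versus-median comparison can be expressed as two small helpers. This is only an illustrative sketch: the function names are hypothetical, the band labels follow the target levels listed on this page, and the page gives no numeric threshold for "much greater", so the ratio is simply reported.

```python
def confidence_band(budget_confidence):
    """Map a budget confidence in [0, 1] to the page's target levels."""
    if budget_confidence >= 0.90:
        return "conservative"
    if budget_confidence >= 0.70:
        return "moderate"
    return "aggressive"


def tail_ratio(expected, median):
    """Expected risk divided by median risk; larger values mean heavier tails."""
    return expected / median


# Values taken from the results shown on this page
print(confidence_band(0.934), round(tail_ratio(222, 166), 2))
# prints conservative 1.34
```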

  • 5th percentile: Best-case realistic scenario
  • 95th percentile: Worst-case realistic scenario (1-in-20 months)

Use these starting points based on component type:

Component Type       Probability Prior     Damage Prior
Deterministic code   beta(1, 100) ~1%      lognormal(log(100), 0.5)
Narrow ML            beta(3, 100) ~3%      lognormal(log(500), 0.7)
General LLM          beta(8, 80) ~10%      lognormal(log(1000), 0.9)
RL/Agentic           beta(15, 60) ~20%     lognormal(log(2000), 1.1)
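
For comparison across component types, the priors above can be held in a small lookup table and summarized. The sketch below is a hypothetical helper, not part of the calculator: it computes the exact beta mean and the implied expected monthly risk for each row, assuming independent probability and damage as in the generated model.

```python
from math import exp

# Starting-point priors copied from the table on this page
PRIORS = {
    "Deterministic code": {"beta": (1, 100), "median_damage": 100, "sigma": 0.5},
    "Narrow ML": {"beta": (3, 100), "median_damage": 500, "sigma": 0.7},
    "General LLM": {"beta": (8, 80), "median_damage": 1000, "sigma": 0.9},
    "RL/Agentic": {"beta": (15, 60), "median_damage": 2000, "sigma": 1.1},
}


def prior_summary(prior):
    """Return the prior's mean probability and implied expected monthly risk."""
    a, b = prior["beta"]
    mean_prob = a / (a + b)
    mean_damage = prior["median_damage"] * exp(prior["sigma"] ** 2 / 2)
    return {"mean_prob": mean_prob, "expected_risk": mean_prob * mean_damage}


for component, prior in PRIORS.items():
    s = prior_summary(prior)
    print(component, round(s["mean_prob"], 3), round(s["expected_risk"], 2))
```

The percentages in the table are rounded: beta(8, 80), for instance, has an exact mean of 8/88, or about 9.1%.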

See Probability Priors for detailed distributions.


  1. Start conservative: Underestimating risk is worse than overestimating
  2. Use reference classes: How did similar systems perform historically?
  3. Decompose carefully: More specific harm modes are easier to estimate
  4. Update with data: Use the Trust Updater as you gather track record
  5. Sensitivity check: Which inputs matter most? See Sensitivity Dashboard
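
As an illustration of tip 4, a beta prior can be updated with an observed track record using the standard conjugate beta-binomial rule. This is a textbook method shown under stated assumptions, not a description of how the Trust Updater works: it treats each month as one independent trial in which the harm mode either occurred or did not, and the function name is hypothetical.

```python
def update_beta(a, b, incident_months, clean_months):
    """Conjugate update: add incident months to a and clean months to b."""
    if incident_months < 0 or clean_months < 0:
        raise ValueError("month counts must be non-negative")
    return a + incident_months, b + clean_months


# Start from the General LLM prior beta(8, 80), then observe one incident
# month and eleven clean months over a year.
a, b = update_beta(8, 80, incident_months=1, clean_months=11)
print(a, b, round(a / (a + b), 3))
# prints 9 91 0.09
```

The updated parameters can be fed back into the calculator as the probability input for that harm mode.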