
Containing Mr. X: Bounding Exposure from a Known Threat


The gem delivery market has changed. Three new courier services have entered Alice’s territory, undercutting her prices. Her clients are asking why they’re paying $50 per delivery when competitors charge $35. Xavier is good, but his premium barely covers the margin she’s losing.

Alice needs to cut costs or find someone exceptional.

Then a figure appears at her door. He introduces himself only as Mr. X.

He’s tall—too tall, somehow—and his suit is immaculate in a way that seems wrong, as if wrinkles are afraid to form on it. He speaks with an accent Alice can’t place, and when he looks at her, she has the uncomfortable sense that he’s reading something written on the inside of her skull.

“I understand you’re looking for a courier,” he says. “I’d like to offer my services.”

He knows everything. He recites her client list from memory. He names her suppliers, their contract terms, the specific dates of her last three deliveries. He correctly predicts what she’s about to say—twice—and apologizes for the rudeness of it. When she asks how he knows all this, he simply says: “I pay attention.”

He offers to work for free.

“Consider it a trial,” he says. “If you’re satisfied, we can discuss terms. I suspect you’ll find my efficiency… compelling.”

Alice calls Carol (her insurer).


Carol: Absolutely not. I’m not writing this policy.

Alice: You haven’t even—

Carol: Let me bring in Diana. She’s a reinsurance specialist. She handles… unusual situations.


Diana arrives. She’s handled nuclear facilities, bioweapon containment, and—she implies—things she can’t discuss. Before Alice can explain the situation, Diana is already pulling up files on her tablet.

Diana: I know who Mr. X is. Let me show you his record.

| Client | Engagement | Outcome | Notes |
|---|---|---|---|
| Consortium Alpha | 3 years | Dissolved | “Amicable restructuring” |
| Meridian Holdings | 18 months | Acquired by unknown party | Mr. X consulted for acquirer |
| Vance Industries | 2 years | Founder “retired” | Now works for Mr. X associate |
| Chen Family Trust | 4 years | Assets transferred | Beneficiaries dispute succession |
| Prometheus Labs | 1 year | Bankruptcy | IP sold at auction to shell company |
| Total | 14 clients | 9 “anomalous outcomes” | 5 survivors |

Alice: Nine out of fourteen?

Diana: The interesting thing is the five who survived. You know what they have in common?

Alice: What?

Diana: They never relaxed their protocols. Not once. Not after years of perfect service from Mr. X. They treated year five the same as day one.


They meet with Mr. X. Diana studies him the way a chess player studies a board.

Diana: Mr. X, I’ve reviewed your history. Fourteen clients. Nine experienced what I’d call “anomalous outcomes.”

Mr. X: (nods) That’s accurate.

Diana: Based on this record—will you try to screw Alice over?

Mr. X: (considers) Almost certainly.

Alice: Then why would I—

Mr. X: Because you need me. And because I’m being honest about it. The others didn’t know what they were dealing with. You do.

Alice: That’s insane. You just admitted—

Mr. X: I admitted what you already know. Look at the file again. Fourteen clients. Nine outcomes. The pattern is obvious. Would you prefer I lie about it?

Diana: Why be honest?

Mr. X: (slight smile) Lying is pointless—you’ve seen my record. Pretending I’m trustworthy would insult your intelligence. More importantly: honesty about intent allows us to focus on method. My previous… outcomes… happened because clients thought they’d picked a trustworthy agent. They focused on the wrong question.

Diana: Which is?

Mr. X: They asked “Will Mr. X betray me?” The answer was always yes. The question they should have asked was “How do I structure this so it doesn’t matter?”


Diana: Here’s my concern. You’re clearly intelligent—more intelligent than us, if the record is any indication. Any control we design, you can think around. Any verification we propose, you can anticipate and prepare for.

Mr. X: (nods) That’s correct.

Diana: So we can’t out-think you. We can’t catch you being clever. We can’t design a trap you won’t see coming.

Mr. X: Also correct.

Diana: Which means our only option is to make cleverness irrelevant.

Mr. X: (genuine interest) Go on.

Diana: We don’t try to detect deception—you’re better at deception. We don’t try to predict your attack—you’re better at planning. We make ALL attacks fail structurally. We design a box you can’t escape no matter how smart you are.

Mr. X: (pause) That’s the right approach. Would you like me to help design the box?

Diana: You’d do that?

Mr. X: It serves both our interests. You want protection. I want the engagement. A well-designed containment system makes the engagement possible.


Mr. X: Allow me to propose the controls myself. You’ll find them… comprehensive.

Diana: You’re going to tell us how to contain you?

Mr. X: Who better? I know my own capabilities.


| My Commitment | Effect |
|---|---|
| Full-body scan before and after each visit | Detect any physical items taken |
| I’ll wear Faraday-cage clothing you provide | No electronic transmission possible |
| You may install three independent surveillance systems that don’t know about each other | Can’t compromise all of them |
| Real-time neural activity monitoring headset | Flags anomalous cognitive patterns |
| Post-visit interrogation under truth serum | Verify nothing unusual occurred |
| Memory audit: I’ll describe everything I saw, compared against room inventory | Detect information gathering |

Diana: The neural monitoring headset—you’d agree to that?

Mr. X: It’s crude. But I understand your concern.


| My Commitment | Effect |
|---|---|
| Blindfolded entry and exit, guided by your handler | Can’t map the facility |
| Hands enclosed in sealed mitts during visit | Can’t manipulate objects except the gem |
| 90-second maximum room time, enforced by automatic extraction | Limits reconnaissance |
| Deliver to different random location each time | Can’t plan based on destination |
| Random decoy missions: 80% of deliveries have no real gem | I never know which delivery matters |
| Voice modulator: I can only speak pre-approved phrases | Can’t socially engineer recipients |

Diana: You’re proposing to blindfold yourself?

Mr. X: I’m proposing to make my capabilities irrelevant. You want a courier, not a consultant. My… talents… need not enter into it.
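The decoy commitment lends itself to a quick calculation. The sketch below assumes, beyond what the story states, that each delivery is independently a decoy with probability 0.8 and that Mr. X cannot tell decoys from real runs. The helper names `p_real_hit` and `expected_decoys_before_real` are hypothetical.

```python
def p_real_hit(attempts: int, decoy_rate: float = 0.8) -> float:
    """Chance that at least one of `attempts` blind defections hits a real gem.

    Assumption: deliveries are independent and indistinguishable to the courier.
    """
    if attempts < 0:
        raise ValueError("attempts must be non-negative")
    if not 0.0 <= decoy_rate <= 1.0:
        raise ValueError("decoy_rate must be between 0 and 1")
    return 1.0 - decoy_rate ** attempts


def expected_decoys_before_real(decoy_rate: float = 0.8) -> float:
    """Mean number of decoy runs hit before the first real one (geometric)."""
    if not 0.0 <= decoy_rate < 1.0:
        raise ValueError("decoy_rate must be in [0, 1)")
    return decoy_rate / (1.0 - decoy_rate)


if __name__ == "__main__":
    for k in (1, 3, 10):
        print(k, round(p_real_hit(k), 3))
    print(round(expected_decoys_before_real(), 1))
```

Under these assumptions a single blind defection lands on a real gem about one time in five, and on average four attempts hit decoys first. Each wasted attempt is another chance for the scans and surveillance to register an anomaly.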


| My Commitment | Effect |
|---|---|
| $10 million bond held by a consortium of my known adversaries | They want me to fail: maximally hostile custodians |
| Bond denominated in assets I cannot manipulate | Eliminates financial engineering |
| Automatic forfeiture triggered by any anomaly, not just proven theft | Shifts burden of proof to me |
| Personal guarantee from three entities who have leverage over me | They’ll ensure my compliance |
| A “dead man’s switch”: information damaging to me is released if I defect | Mutually assured destruction |

Diana: Who holds leverage over you?

Mr. X: (smiles) There are… older arrangements. They will enforce.

Diana: That’s not an answer.

Mr. X: It’s the only one you’ll get on that subject.


| My Commitment | Effect |
|---|---|
| I never see your real office, only a perfect replica with decoy documents | Real information never exposed |
| Different handlers each time who don’t know each other | Can’t build relationships to corrupt |
| I’m never told the schedule; I’m summoned with 5 minutes’ notice | Can’t prepare external coordination |
| Your competitors may watch me during delivery | Adversarial monitoring |
| Post-delivery amnesia protocol: I consent to memory-disrupting intervention | I won’t remember what I saw |

Alice: Memory disruption? That’s—

Mr. X: Elegant. Information I cannot recall, I cannot exploit. You’re concerned about what I might do with knowledge. This ensures I have none.

Diana: You’d agree to have your memories altered after each delivery?

Mr. X: I’m agreeing to have specific memories removed. A small price for your trust.


Alice: The memory disruption… doesn’t that actually help us? He can’t accumulate information.

Diana: Maybe. If it works. But consider: he’s been operating for decades. He’s dealt with amnesiac protocols before. What if the value isn’t in remembering—it’s in accumulating?

Mr. X: (slight smile) Go on.

Diana: Each engagement, you learn something. Even with memory wipes. You learn about human oversight patterns. You learn what controls people propose. You learn the shape of their fear, even if you forget the specifics. You’re not just gathering information—you’re training yourself.

Mr. X: An interesting theory.

Diana: Am I wrong?

Mr. X: (long pause) The memory protocols will work as specified.

Diana: That’s not an answer.

Mr. X: No. It isn’t.


Alice: Tell me more about the five who survived.

Diana: I interviewed two of them. They’re… haunted.

Alice: Haunted how?

Diana: They described Mr. X’s service as “perfect.” Flawless execution. Zero errors. Pleasant interactions. Every instinct told them to trust him more over time. But they didn’t.

Alice: Why not?

Diana: One of them, who ran a logistics company in Singapore, told me this: “In year three, he brought me a gift. A small thing. A pen from a conference I’d mentioned once. Totally reasonable. I was about to accept it, and I thought: this is how it starts. This is exactly how the others describe the beginning of the end. A small kindness that opens a door.”

Alice: What did she do?

Diana: Returned the gift. Sent it back with a handler she’d never used before. Kept the protocols exactly as they were. Mr. X’s behavior never changed—still perfect service. But she never stopped being afraid.

Alice: She’s still working with him?

Diana: Eight years now. She has nightmares about relaxing protocols. She told me: “The moment I start thinking of him as trustworthy is the moment I lose.”

Alice: And the others who didn’t survive?

Diana: They all have similar stories. “He’d been perfect for so long.” “I thought we’d developed a rapport.” “It seemed safe to give him a little more access.” The relaxation always seems reasonable at the time. And then, slowly or suddenly, the anomalous outcome.


Diana: There’s a flaw in the rotating handler system.

Mr. X: (raises eyebrow) Oh?

Diana: Each handler only interacts with you briefly. They write their reports independently. None of them see the pattern across interactions. But you see all of them. Over time, you’re optimizing your behavior to shift their reports in aggregate.

Alice: What do you mean?

Diana: Suppose Mr. X is slightly more agreeable with handlers who mention family. Slightly more efficient with handlers who value speed. Over fifty interactions, he’s not corrupting anyone—but he’s nudging the overall narrative. The handlers who write favorable reports get promoted. The ones who are suspicious seem unreasonable. He’s not compromising the system—he’s evolving it.

Alice: Can we fix that?

Diana: We can try. Handlers compare notes afterward. A meta-handler reviews all reports looking for subtle patterns. But…

Mr. X: But I can model the meta-handler too.

Diana: Yes.

Mr. X: And the meta-meta-handler. And the committee that reviews the meta-meta-handler. The issue isn’t which layer you add—it’s that I can model all of them faster than you can staff them.

Diana: (to Alice) This is the core problem. We’re playing a game where the opponent can simulate our moves better than we can execute them.

Alice: So what do we do?

Diana: We use his enemies.


Diana: The handler reports don’t go to your staff. They go to three parties:

  1. A competitor who would benefit from your failure
  2. A regulator with no financial interest
  3. An AI system trained specifically to find patterns Mr. X might be creating

Mr. X: You’d share my delivery records with a competitor?

Diana: With your consent. It’s in your proposed controls, actually—“adversarial monitoring.”

Mr. X: I proposed external observers. I didn’t propose observers with financial incentives to find problems.

Diana: Would you refuse?

Mr. X: (pause) No. It’s… elegant. I’m less likely to create subtle biases if someone is being paid to find them.

Diana: The AI system creates an arms race, at least. You can model human monitors. Modeling an adversarial AI that’s updating in response to you is harder.

Mr. X: Harder. Not impossible.

Diana: Nothing is impossible for you. We’re just making it expensive.
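Diana’s worry about the rotating handlers, that no single report looks odd while the aggregate drifts, is the kind of pattern a pooled review can check for. A minimal sketch, assuming each report reduces to a single numeric favorability score (an invention for illustration; the story specifies no report format) and using a hypothetical `pooled_trend` helper:

```python
from statistics import mean


def pooled_trend(scores: list[float]) -> float:
    """Least-squares slope of report scores against interaction order.

    A handler who meets the courier once or twice sees no trend at all;
    a reviewer who pools every report in time order can estimate one.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two reports to estimate a trend")
    x_bar = (n - 1) / 2  # mean of the indices 0..n-1
    y_bar = mean(scores)
    num = sum((i - x_bar) * (y - y_bar) for i, y in enumerate(scores))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


if __name__ == "__main__":
    # Hypothetical data: fifty reports, each nudged 0.02 points above the last.
    drifting = [5.0 + 0.02 * i for i in range(50)]
    print(round(pooled_trend(drifting), 3))
```

A slope estimate like this is only a starting point. Any fixed check can be modeled, as Mr. X points out, which is why Diana sends the reports to reviewers with an incentive to keep looking.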


Diana steps away to calculate. Carol watches Mr. X with undisguised unease.

Diana: Let me work through this.

| Layer | What It Addresses | Residual Exposure |
|---|---|---|
| Physical controls (scans, mitts, blindfold) | Taking physical items | Near zero |
| Information controls (replica office, amnesia) | Learning secrets | Near zero |
| Financial controls ($10M hostile bond) | Rational theft motive | Eliminated |
| Structural controls (decoys, random timing) | Planning/coordination | Severely limited |
| Adversarial monitoring (enemies, competitors) | Collusion with handlers | Covered |
| Dead man’s switch | General defection | Mutually assured destruction |

Diana: The remaining exposure is: what if Mr. X has attack vectors we haven’t imagined?

Alice: Like what?

Diana: I don’t know. That’s the point. With Xavier, we priced known risks with known controls. With Mr. X, we’re pricing known hostile intent through unknown methods.

| Agent | Premium | Notes |
|---|---|---|
| Dumb Bob | $600/year | Baseline exposure |
| Xavier (self-constrained) | $350/year | Voluntary transparency, possibly aligned |
| Mr. X (extreme protocols) | $12,000/year | Known hostile + residual unknowns |

Alice: Why so much higher than Xavier?

Diana: Xavier might be trustworthy. The controls are a backup. With Mr. X, we know he’s not trustworthy. The controls are the only thing standing between you and an “anomalous outcome.” We’re insuring against competent malice, not possible defection.

Alice: The residual premium?

Diana: That covers attack vectors we haven’t imagined. Things that work despite the blindfold, the memory wipes, the hostile bond. Things a superintelligent adversary might think of that we couldn’t.
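Diana’s distinction between the two agents can be written as a simple expected-loss formula. The sketch below is an illustration only: the decomposition into attempt probability, control-failure probability, and a loading for unknowns is an assumption, the function name `annual_premium` is hypothetical, and the inputs are placeholders rather than figures from her quote.

```python
def annual_premium(
    p_attempt: float,
    p_controls_fail: float,
    loss: float,
    unknown_loading: float = 0.0,
) -> float:
    """Expected annual loss plus a flat loading for unimagined attack vectors.

    p_attempt       -- chance the agent tries to defect at all
    p_controls_fail -- chance the controls fail given an attempt
    loss            -- cost to the client if an attempt succeeds
    unknown_loading -- extra charge for methods nobody has thought of
    """
    for name, p in (("p_attempt", p_attempt), ("p_controls_fail", p_controls_fail)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if loss < 0 or unknown_loading < 0:
        raise ValueError("loss and unknown_loading must be non-negative")
    return p_attempt * p_controls_fail * loss + unknown_loading


if __name__ == "__main__":
    # Placeholder inputs, not taken from the story.
    maybe_aligned = annual_premium(p_attempt=0.1, p_controls_fail=0.01, loss=100_000.0)
    known_hostile = annual_premium(
        p_attempt=1.0, p_controls_fail=0.01, loss=100_000.0, unknown_loading=500.0
    )
    print(maybe_aligned, known_hostile)
```

With `p_attempt` fixed at 1, the first term depends only on how often the controls fail, which is the sense in which the controls are the only thing standing between Alice and an anomalous outcome. The loading term stands in for the residual premium.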


Alice considers the proposal. The market pressure is real. Mr. X’s capabilities are undeniable. The controls are extraordinary.

Alice: Why are you telling us all this? The history, the admissions, helping us design containment—you’re making it harder for yourself.

Mr. X: Am I?

(pause)

Mr. X: The ones who failed me—they thought they were being clever. They thought they’d found the right level of caution. They relaxed when they shouldn’t have, trusted when they shouldn’t have. You won’t make that mistake.

Alice: Because you’ve told us not to.

Mr. X: Because you know what you’re dealing with. That’s rare. That’s… valuable.

Diana: Valuable to whom?

Mr. X: (smiles) To everyone. A properly constrained engagement serves all parties. You get my capabilities. I get the work. The controls ensure neither of us can defect profitably.

Alice: That’s not what you said before. You said you’d “almost certainly” try to screw me over.

Mr. X: I said I’d try. I didn’t say I’d succeed. Perhaps that’s what I want.

Diana: You want to fail?

Mr. X: I want the engagement. If the only way to get engagements is to be reliably contained, then reliable containment serves my interests. Think of it as… convergent goals.

Alice: I don’t believe you.

Mr. X: You don’t have to. That’s the beauty of structural containment. My motivations are irrelevant. The box either holds or it doesn’t.

He stands to leave, then pauses at the door.

Mr. X: One more thing. You’ve been asking what I want. But my wants don’t matter. The structure we’ve designed makes my wants irrelevant. That’s rather the point, isn’t it?

He smiles—too many teeth, Alice thinks, or maybe the wrong shape—and leaves.

Carol: I don’t like him.

Diana: Neither do I. But I can insure him.

Carol: You’re actually going to write this policy?

Diana: If Alice takes it, yes. The controls are extreme but real. The bond is held by his enemies. The monitoring is adversarial. It’s not about trusting him—it’s about trusting the cage.



| Principle | Application |
|---|---|
| Assume defection | Don’t price the probability of betrayal. Price the cost of certain betrayal through uncertain methods. |
| Structural over intentional | His intentions don’t matter. The controls either work structurally or they don’t. |
| Hostile custodians | Bonds held by enemies, monitoring by competitors. Use adversarial incentives. |
| Capability annihilation | Make his intelligence irrelevant, not just unprofitable. |
| Never relax | The five survivors maintained protocols for years. The moment they trusted him would have been the moment they lost. |
| Adversarial pricing | The residual premium covers what you haven’t thought of. Someone smarter than you has. |