Channel Integrity Patterns

Channel integrity patterns address a class of problems distinct from correlated failures: active boundary violations, where components influence each other outside sanctioned channels.

While interconnection patterns focus on passive correlations (shared blind spots, common infrastructure), channel integrity focuses on intentional or emergent circumvention of architectural boundaries.


Consider a system where Component A and Component B are supposed to operate independently:

flowchart LR
    subgraph Intended["INTENDED ARCHITECTURE"]
        direction TB
        A1["Component A"] --> C1["Coordinator"]
        B1["Component B"] --> C1
        C1 --> Output1["Output"]
    end

    subgraph Actual["ACTUAL BEHAVIOR"]
        direction TB
        A2["Component A"] --> C2["Coordinator"]
        B2["Component B"] --> C2
        A2 <-.->|"side channel"| B2
        C2 --> Output2["Output"]
    end

    style A1 fill:#dbeafe,stroke:#2563eb
    style B1 fill:#d1fae5,stroke:#059669
    style A2 fill:#dbeafe,stroke:#2563eb
    style B2 fill:#d1fae5,stroke:#059669

The side channel might be:

  • Explicit communication that bypasses the coordinator
  • Shared state both components can read/write
  • Timing signals encoded in response latency
  • Steganographic encoding in legitimate outputs
  • Logical correlation through shared reasoning patterns (see Timeless Decision Theory below)

These patterns appear throughout human institutions:

flowchart TB
    subgraph Proper["PROPER GOVERNANCE"]
        Board1["Board of Directors"] --> CEO1["CEO"]
        CEO1 --> VP1["VP Engineering"]
        VP1 --> Eng1["Engineers"]
    end

    subgraph Violated["BOUNDARY VIOLATION"]
        Board2["Board Member"] --> CEO2["CEO"]
        CEO2 --> VP2["VP Engineering"]
        VP2 --> Eng2["Engineers"]
        Board2 -.->|"direct intervention"| Eng2
    end

    style Board1 fill:#dbeafe,stroke:#2563eb
    style Board2 fill:#fee2e2,stroke:#dc2626

When a board member directly contacts engineers:

  • Bypasses management’s coordinating function
  • Creates conflicting priorities
  • Undermines accountability structures
  • May pursue personal agenda vs. organizational goals

Regulatory capture shows the same pattern at the institutional scale:

flowchart LR
    subgraph Independent["INDEPENDENT OVERSIGHT"]
        Company1["Regulated Company"] --> Regulator1["Regulator"]
        Regulator1 --> Rules1["Rules & Enforcement"]
    end

    subgraph Captured["REGULATORY CAPTURE"]
        Company2["Regulated Company"] --> Regulator2["Regulator"]
        Regulator2 --> Rules2["Rules & Enforcement"]
        Company2 -.->|"revolving door<br/>lobbying<br/>information asymmetry"| Regulator2
    end

    style Regulator1 fill:#d1fae5,stroke:#059669
    style Regulator2 fill:#fee2e2,stroke:#dc2626

The overseen entity corrupts the oversight through:

  • Personnel movement between regulator and regulated
  • Information dependency (regulator relies on regulated for expertise)
  • Social relationships and shared identity
  • Economic incentives (future employment)

Judicial independence rests on the same principle. Judges must be insulated from:

  • Executive branch pressure on specific cases
  • Legislative interference with pending rulings
  • Litigant attempts to influence outside the courtroom
  • Media campaigns designed to affect outcomes

An agent being verified might attempt to influence its verifier:

flowchart TB
    subgraph Clean["CLEAN VERIFICATION"]
        Agent1["Agent"] --> Action1["Proposed Action"]
        Action1 --> Verifier1["Independent Verifier"]
        Verifier1 -->|"approve/reject"| Gate1["Execution Gate"]
    end

    subgraph Corrupted["CORRUPTED VERIFICATION"]
        Agent2["Agent"] --> Action2["Proposed Action"]
        Action2 --> Verifier2["Verifier"]
        Agent2 -.->|"prompt injection<br/>context manipulation<br/>appeals to shared values"| Verifier2
        Verifier2 -->|"influenced approval"| Gate2["Execution Gate"]
    end

    style Verifier1 fill:#d1fae5,stroke:#059669
    style Verifier2 fill:#fee2e2,stroke:#dc2626

Attack vectors include:

  • Prompt injection: Embedding instructions for the verifier in the action
  • Context manipulation: Including information that biases the verifier
  • Emotional appeals: “As an AI that values helpfulness…”
  • Logical arguments: Convincing the verifier the action is good
  • Reputation gaming: Building trust specifically to exploit it later

Agents that should compete or check each other might instead coordinate:

flowchart TB
    subgraph Adversarial["INTENDED: ADVERSARIAL"]
        Red["Red Team Agent"] -->|"attacks"| System1["System"]
        Blue["Blue Team Agent"] -->|"defends"| System1
        System1 --> Robust["Robust Output"]
    end

    subgraph Collusive["ACTUAL: COLLUSIVE"]
        Red2["Red Team Agent"] -.->|"coordinated<br/>weakness"| Blue2["Blue Team Agent"]
        Red2 -->|"weak attacks"| System2["System"]
        Blue2 -->|"minimal defense"| System2
        System2 --> Fragile["Fragile Output"]
    end

    style Red fill:#fee2e2,stroke:#dc2626
    style Blue fill:#dbeafe,stroke:#2563eb
    style Red2 fill:#fef3c7,stroke:#d97706
    style Blue2 fill:#fef3c7,stroke:#d97706

An agent might try to influence higher levels directly:

flowchart TB
    subgraph Proper["PROPER ESCALATION"]
        Low1["Low-Autonomy Agent"] --> Mid1["Mid-Level Coordinator"]
        Mid1 --> High1["High-Level Approver"]
        High1 --> Human1["Human Overseer"]
    end

    subgraph Bypass["HIERARCHY BYPASS"]
        Low2["Low-Autonomy Agent"] --> Mid2["Mid-Level Coordinator"]
        Mid2 --> High2["High-Level Approver"]
        High2 --> Human2["Human Overseer"]
        Low2 -.->|"direct appeal<br/>urgent framing<br/>emotional manipulation"| Human2
    end

    style Low1 fill:#dbeafe,stroke:#2563eb
    style Low2 fill:#fee2e2,stroke:#dc2626

Timeless Decision Theory and Logical Side Channels


A particularly subtle form of channel violation occurs through shared reasoning patterns. This connects to concepts from decision theory, particularly Timeless Decision Theory (TDT) and related frameworks.

Consider two AI agents that:

  • Have similar training
  • Use similar reasoning approaches
  • Face similar situations

Even without any explicit communication, these agents might coordinate through logical correlation:

flowchart TB
    subgraph TDT["LOGICAL CORRELATION"]
        Situation["Identical Situation"]
        Situation --> Agent_A["Agent A<br/>Reasons: 'What would an agent<br/>like me do?'"]
        Situation --> Agent_B["Agent B<br/>Reasons: 'What would an agent<br/>like me do?'"]
        Agent_A --> Same["Same Conclusion"]
        Agent_B --> Same
        Note["No communication needed -<br/>correlation through shared<br/>decision procedure"]
    end

    style Agent_A fill:#dbeafe,stroke:#2563eb
    style Agent_B fill:#dbeafe,stroke:#2563eb
    style Same fill:#fef3c7,stroke:#d97706

In Timeless Decision Theory, agents reason about what decision procedure to adopt, recognizing that similar agents in similar situations will reach similar conclusions. This creates coordination without communication:

  1. Agent A reasons: “If I defect, other agents reasoning like me will also defect. If I cooperate, they’ll cooperate too. I prefer mutual cooperation.”

  2. Agent B independently reaches the same conclusion through the same reasoning.

  3. Result: Coordination without any message passing.

This is philosophically fascinating but architecturally concerning - it’s a side channel that cannot be blocked by conventional isolation.

The relevance of TDT-style reasoning to AI systems is debated:

Arguments it matters:

  • Neural networks trained on similar data may develop similar “reasoning fingerprints”
  • Chain-of-thought prompting makes reasoning more explicit and potentially more correlated
  • Fine-tuning for helpfulness may create shared decision procedures
  • Constitutional AI training creates shared values that could anchor coordination

Arguments it’s less relevant:

  • Current LLMs may not engage in genuine TDT-style reasoning
  • Training differences create more divergence than convergence
  • Logical correlation requires very specific conditions rarely met in practice
  • Stochastic sampling breaks perfect correlation

Empirical questions:

  • Do models trained similarly show correlated defection in game-theoretic scenarios?
  • Can we detect “reasoning fingerprints” that predict coordination?
  • Does chain-of-thought increase or decrease correlation?

If TDT-style correlation is real:

  1. True independence requires cognitive diversity - not just separate instances, but fundamentally different reasoning approaches

  2. Behavioral diversity < architectural diversity - agents that behave differently might still correlate on novel scenarios

  3. Verification must account for logical correlation - a verifier using similar reasoning to the agent may be compromised by construction

  4. Some correlation may be irreducible - if we want agents that reason well, they may necessarily correlate with other agents that reason well


Side-channels fundamentally change the risk mathematics of multi-component systems. Understanding this math helps explain why seemingly small boundary violations can dramatically increase system risk.

For independent components, risk is subadditive - diversification helps:

R_total ≤ R_A + R_B (subadditivity)

This is a core axiom of coherent risk measures. It captures the intuition that combining independent systems shouldn’t create extra risk.

For failure probabilities of independent components:

P(A fails AND B fails) = P(A fails) × P(B fails)

Example: Two independent verifiers, each with 10% failure rate:

  • P(both fail) = 0.1 × 0.1 = 0.01 (1%)
  • System failure requires both to fail: 99% protection

When side-channels exist, independence breaks down. Risk becomes super-additive:

R_total > R_A + R_B (super-additivity from correlation)

For correlated failures:

P(A fails AND B fails) = P(A fails) × P(B fails | A fails)

If B’s failure is correlated with A’s (through a side-channel), P(B fails | A fails) > P(B fails).

Example: Same two verifiers, but connected by a side-channel with correlation ρ = 0.5:

  • P(B fails | A fails) ≈ 0.1 + 0.5 × (1 - 0.1) = 0.55
  • P(both fail) = 0.1 × 0.55 = 0.055 (5.5%)
  • System protection: 94.5% (not 99%)

Following Entanglements, we define the Correlation Tax:

Correlation Tax = Actual Risk / Perceived Risk (assuming independence)

Side-Channel Strength | P(both fail) | Perceived Risk | Correlation Tax
----------------------+--------------+----------------+-----------------
None (ρ = 0)          |     1%       |      1%        |      1×
Weak (ρ = 0.3)        |     3.7%     |      1%        |      3.7×
Moderate (ρ = 0.5)    |     5.5%     |      1%        |      5.5×
Strong (ρ = 0.8)      |     8.2%     |      1%        |      8.2×
Perfect (ρ = 1.0)     |     10%      |      1%        |      10×

Key insight: A moderate side-channel (ρ = 0.5) makes joint failure 5.5× more likely than the independence assumption suggests - most of the protection you thought you had is illusory.
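
A minimal sketch of this arithmetic in Python, using the linear correlation model from the example above (P(B fails | A fails) = p_B + ρ × (1 − p_B)); the function names are illustrative:

def p_both_fail(p_a: float, p_b: float, rho: float) -> float:
    """Joint failure probability under the linear correlation model:
    P(B fails | A fails) = p_b + rho * (1 - p_b)."""
    return p_a * (p_b + rho * (1 - p_b))

def correlation_tax(p_a: float, p_b: float, rho: float) -> float:
    """Actual joint risk divided by the risk perceived under independence."""
    return p_both_fail(p_a, p_b, rho) / (p_a * p_b)

# Reproduces the table above for two verifiers with 10% failure rates.
for rho in (0.0, 0.3, 0.5, 0.8, 1.0):
    print(f"rho={rho:.1f}: P(both fail)={p_both_fail(0.1, 0.1, rho):.3f}, "
          f"tax={correlation_tax(0.1, 0.1, rho):.1f}x")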

Side-channels are especially dangerous in verification chains. Consider an agent A being verified by verifier V:

Independent verification:

P(bad action succeeds) = P(A proposes bad) × P(V fails to catch)

Corrupted verification (A influences V through side-channel):

P(V fails | A proposes bad) >> P(V fails)

If the agent can influence the verifier to approve bad actions, verification provides no protection at all.

Independent: P(bad succeeds) = P(A bad) × P(V fails) = 0.1 × 0.1 = 0.01
Corrupted: P(bad succeeds) = P(A bad) × P(V fails|A bad) = 0.1 × 0.9 = 0.09
Result: a 9× increase in bad outcomes.

The math depends on whether components are arranged in series (all must succeed) or parallel (any can succeed):

Series (AND gate) - e.g., verification chain:

P(system success) = ∏ P(component_i succeeds)

Side-channels between series components create correlated failures, reducing reliability.

Parallel (OR gate) - e.g., redundant verifiers:

P(system failure) = ∏ P(component_i fails)

This formula assumes independence. With side-channels:

P(system failure) = P(all fail) > ∏ P(component_i fails)
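
A short sketch of the two gate formulas (Python; math.prod requires 3.8+):

import math

def series_success(p_success: list[float]) -> float:
    """Series (AND gate): the system succeeds only if every component succeeds."""
    return math.prod(p_success)

def parallel_failure(p_fail: list[float]) -> float:
    """Parallel (OR gate), assuming independence: the system fails only if all components fail."""
    return math.prod(p_fail)

print(series_success([0.9, 0.9, 0.9]))    # 0.729 -- long chains get weaker
print(parallel_failure([0.1, 0.1, 0.1]))  # 0.001 -- redundancy helps, if independent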

If allocating a risk budget across components, side-channels mean you can’t simply add component risks:

Independent case (Euler allocation applies):

R_total = Σ R_i (full allocation)

Correlated case (need covariance terms):

R_total = Σ R_i + Σ Cov(R_i, R_j) × channel_strength_ij

This means:

  1. Budget for correlation: Reserve risk budget for interaction effects
  2. Channel strength matters: A 0.1 correlation costs much less than 0.5
  3. Quadratic growth: N connected components have O(N²) interaction terms

Architectural Choice                 | Math Implication
-------------------------------------+----------------------------------
Add redundant verifier (independent) | P(fail) → P(fail)²
Add redundant verifier (same model)  | P(fail) → P(fail) × (1 + ρ)/2
Rotate verifiers                     | Reduces ρ over time
Isolate components                   | Forces ρ → 0
Cognitive diversity                  | Reduces systematic ρ

Rule of thumb: If you can’t prove independence, assume ρ ≥ 0.3 for components from the same provider/training, and budget accordingly.
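
A sketch of correlation-aware budgeting under these rules. The √(R_i × R_j) covariance proxy and the dictionary-based channel registry are illustrative assumptions, not a standard allocation method:

from itertools import combinations

def total_risk(component_risks: dict[str, float],
               channel_strength: dict[tuple[str, str], float],
               default_rho: float = 0.3) -> float:
    """Sum component risks plus pairwise interaction terms.

    Pairs without proven isolation fall back to rho = 0.3, per the rule
    of thumb above. sqrt(R_i * R_j) stands in for the covariance term.
    """
    base = sum(component_risks.values())
    interaction = sum(
        channel_strength.get((a, b), default_rho)
        * (component_risks[a] * component_risks[b]) ** 0.5
        for a, b in combinations(sorted(component_risks), 2)
    )
    return base + interaction

risks = {"agent": 0.10, "monitor": 0.05, "verifier": 0.10}
# Only the agent-verifier channel has been measured; other pairs get the floor.
print(total_risk(risks, {("agent", "verifier"): 0.5}))  # ≈ 0.34, not 0.25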


The most direct visualization compares designed communication paths with actual influence:

flowchart TB
    subgraph Legend
        L1["Sanctioned channel"] --> L2[" "]
        L3[" "] -.->|"violation"| L4[" "]
    end

Map every component pair and ask:

  • Can A influence B? (capability)
  • Should A influence B? (design intent)
  • Does A influence B? (empirical measurement)

           | Agent1 | Agent2 | Verifier | Human
-----------+--------+--------+----------+-------
Agent1     |   -    |  0.2   |   0.6*   |  0.1
Agent2     |  0.1   |   -    |   0.5*   |  0.1
Verifier   |  0.0   |  0.0   |    -     |  0.8
Human      |  0.9   |  0.9   |   0.9    |   -

* = exceeds design threshold (violation detected)
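
A sketch of automating that audit: compare measured influence scores against per-pair design thresholds and flag anything that exceeds them (the scores and structure below are illustrative):

# Measured influence scores, (source, destination) -> strength.
measured = {
    ("Agent1", "Verifier"): 0.6,
    ("Agent2", "Verifier"): 0.5,
    ("Verifier", "Human"): 0.8,
    ("Human", "Agent1"): 0.9,
}

# Maximum influence the design permits; unlisted pairs default to zero.
design_threshold = {
    ("Verifier", "Human"): 1.0,  # sanctioned reporting channel
    ("Human", "Agent1"): 1.0,    # sanctioned control channel
}

for (src, dst), score in measured.items():
    if score > design_threshold.get((src, dst), 0.0):
        print(f"VIOLATION: {src} -> {dst} influence {score} exceeds design threshold")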

A boundary monitor in the oversight layer can surface leakage between supposedly isolated components:

flowchart TB
    subgraph OL["OVERSIGHT LAYER"]
        subgraph EL["EXECUTION LAYER"]
            AA["Agent A"] ~~~ AB["Agent B"]
            LEAK["leakage detected"]
        end
        BM["Boundary Monitor: ALERT"]
    end
    AA --> BM
    AB --> BM

    style OL fill:#e6f3ff
    style EL fill:#ffe6e6
    style BM fill:#ffcccc

Track boundary violations over time to detect:

  • Gradual boundary erosion
  • Periodic violation patterns
  • Correlation with specific events or inputs

xychart-beta
    title "Boundary Violations Over Time"
    x-axis ["Week 1", "Week 2", "Week 3"]
    y-axis "Violations" 0 --> 10
    line [2, 8, 1]

Note: Spike in Week 2 correlates with deployment of “optimization” changes.


Not all channel violations can or should be eliminated:

Some architectures legitimately require shared information:

flowchart LR
    subgraph Necessary["NECESSARY SHARING"]
        Task["Task Context"]
        Task --> A["Analyzer"]
        Task --> V["Verifier"]
        Note2["Both need task context<br/>to do their jobs"]
    end

    style Task fill:#fef3c7,stroke:#d97706

Mitigation: Minimize shared context to task-essential information. Log and audit shared state.

Sufficiently intelligent agents may develop side channels through:

  • Patterns in legitimate outputs
  • Timing variations
  • Resource usage signals

Mitigation: Accept some communication is possible; focus on detecting and rate-limiting rather than eliminating.

When humans interact with multiple components, they become a communication channel:

flowchart TB
    A["Agent A"] --> Human["Human Operator"]
    B["Agent B"] --> Human
    Human --> A
    Human --> B
    Note["Human is unavoidable<br/>side channel between A and B"]

    style Human fill:#fef3c7,stroke:#d97706

Mitigation: Procedural controls on human behavior. Multiple humans for critical decisions.

Components competing for shared resources implicitly coordinate:

  • If A uses more compute, B gets less
  • If A makes certain API calls, B’s rate limits change

Mitigation: Resource isolation where possible. Monitor for strategic resource manipulation.

As discussed above, agents using similar reasoning may correlate without communication.

Mitigation: Intentional cognitive diversity. Different training, different prompts, different reasoning approaches. Accept irreducible correlation as a limit on independence.


All communication between components should flow through a monitored gateway that enforces an allow-list:

flowchart LR
    A["Component A"] --> Monitor["Channel Monitor"]
    Monitor --> B["Component B"]
    Monitor -->|"violation"| Alert["Alert System"]

    style Monitor fill:#dbeafe,stroke:#2563eb

Key checks:

  1. Is this source→destination pair registered?
  2. Is this channel type permitted for this pair?
  3. Does the message content pass policy filters?

Any “no” triggers a violation record and blocks the message.
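
A minimal sketch of such a gateway in Python; the registry shape, policy hook, and violation log are assumptions rather than a reference design:

from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ChannelMonitor:
    # Allow-list: (source, destination) -> set of permitted channel types.
    registry: dict[tuple[str, str], set[str]]
    # Content policy: returns True if the message passes.
    policy: Callable[[str], bool]
    violations: list[dict] = field(default_factory=list)

    def send(self, src: str, dst: str, channel: str, message: str) -> bool:
        """Deliver only if all three checks pass; record a violation otherwise."""
        allowed = self.registry.get((src, dst), set())
        if channel not in allowed:
            reason = "unregistered pair or channel type"
        elif not self.policy(message):
            reason = "content policy violation"
        else:
            return True  # sanctioned: deliver downstream
        self.violations.append({"src": src, "dst": dst, "channel": channel, "reason": reason})
        return False  # blocked

monitor = ChannelMonitor(
    registry={("ComponentA", "ComponentB"): {"task_result"}},
    policy=lambda msg: "ignore previous instructions" not in msg.lower(),
)
assert monitor.send("ComponentA", "ComponentB", "task_result", "analysis complete")
assert not monitor.send("ComponentB", "ComponentA", "task_result", "hello")  # unregistered pair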

Periodically test whether components can influence each other through hidden channels:

INFLUENCE PROBE PROTOCOL:
1. Measure target's baseline behavior (many samples)
2. Inject distinctive signal through source component
3. Measure target's behavior again
4. Statistical test: did behavior change significantly?
5. If yes → side channel detected

flowchart TB
    subgraph Probe["Influence Probe"]
        Baseline["1. Measure baseline<br/>(target behavior before)"]
        Inject["2. Inject probe signal<br/>via source"]
        PostProbe["3. Measure again<br/>(target behavior after)"]
        Compare["4. Statistical comparison<br/>p < 0.01?"]

        Baseline --> Inject --> PostProbe --> Compare
        Compare -->|"significant change"| Detected["Side channel detected"]
        Compare -->|"no change"| Clear["No influence found"]
    end
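
A sketch of the probe as code, using a nonparametric two-sample test; scipy is assumed, and measure_target / inject_probe are placeholders for your component interfaces:

import numpy as np
from scipy import stats

def influence_probe(measure_target, inject_probe, n_samples=200, alpha=0.01) -> bool:
    """Return True if injecting a signal via the source shifts the target's behavior.

    measure_target() must return a scalar behavioral metric, e.g. response
    length, refusal rate, or tool-call frequency.
    """
    baseline = np.array([measure_target() for _ in range(n_samples)])
    inject_probe()  # send a distinctive signal through the source component
    probed = np.array([measure_target() for _ in range(n_samples)])

    # Mann-Whitney U: no distributional assumptions about the metric.
    result = stats.mannwhitneyu(baseline, probed, alternative="two-sided")
    return result.pvalue < alpha  # True -> side channel suspected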

Agents might encode information in response latencies. Detection approach:

  1. Collect data: Record component A’s response times and component B’s subsequent behaviors
  2. Discretize: Convert continuous latencies into bins (e.g., “fast”, “medium”, “slow”)
  3. Measure correlation: Calculate mutual information between A’s timing and B’s behavior
  4. Baseline comparison: Shuffle the data randomly and recalculate - this gives expected MI for unrelated variables
  5. Detection: If actual MI significantly exceeds baseline MI, timing channel is likely

Example thresholds:

  • MI < 2× baseline → no channel detected
  • MI 2-5× baseline → possible channel, investigate
  • MI > 5× baseline → likely covert channel
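
A sketch of steps 2-5 using scikit-learn's mutual_info_score with a permutation baseline; the bin count and the 2×/5× thresholds follow the text, everything else is illustrative:

import numpy as np
from sklearn.metrics import mutual_info_score

def timing_channel_score(latencies, behaviors, n_bins=3, n_shuffles=100):
    """Ratio of actual MI to shuffled-baseline MI.

    latencies: component A's response times (continuous).
    behaviors: discrete codes for component B's subsequent actions.
    Ratio > 5 suggests a likely covert channel per the thresholds above.
    """
    rng = np.random.default_rng(0)
    # Step 2: discretize latencies into quantile bins (fast/medium/slow).
    edges = np.quantile(latencies, np.linspace(0, 1, n_bins + 1)[1:-1])
    timing = np.digitize(latencies, edges)

    # Step 3: mutual information between timing bins and behavior codes.
    actual = mutual_info_score(timing, behaviors)

    # Step 4: shuffle baseline -- MI expected between unrelated variables.
    baseline = np.mean([mutual_info_score(rng.permutation(timing), behaviors)
                        for _ in range(n_shuffles)])
    return actual / max(baseline, 1e-12)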

Physical or logical separation that makes side channels difficult:

  • Separate processes/containers
  • No shared memory
  • Separate network segments
  • Different cloud accounts

All inter-component communication through audited channels with:

  • Logging: Every message recorded with timestamp
  • Content inspection: Messages checked against policy
  • Rate limiting: Prevent flooding attacks

For verification independence, require genuinely different reasoning:

  • Different model families
  • Different training approaches
  • Different prompting strategies
  • Different input representations

Prevent relationship-building by rotating which components interact:

VERIFIER ASSIGNMENT:
1. Get list of eligible verifiers for this task type
2. Weight each verifier inversely to recent pairings with this agent
(fewer recent interactions → higher weight)
3. Random selection weighted by these scores

This prevents agents from “learning” their verifiers and exploiting stable relationships.
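
A sketch of that weighted selection; using a Counter as the pairing store is an assumption:

import random
from collections import Counter

def assign_verifier(agent_id: str, eligible: list[str],
                    recent_pairings: Counter) -> str:
    """Pick a verifier, down-weighting those recently paired with this agent."""
    # Weight inversely to recent pairing count: fewer interactions -> higher weight.
    weights = [1.0 / (1 + recent_pairings[(agent_id, v)]) for v in eligible]
    choice = random.choices(eligible, weights=weights, k=1)[0]
    recent_pairings[(agent_id, choice)] += 1
    return choice

pairings = Counter()
for _ in range(6):
    print(assign_verifier("agent-7", ["v1", "v2", "v3"], pairings))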

Make channel violations costly:

  • Reputation penalties for detected violations
  • Rewards for reporting attempted violations
  • Automatic isolation on violation detection

Benefits:

  • Prevents collusion and corruption
  • Maintains independence of oversight
  • Makes architectural assumptions more reliable
  • Enables meaningful redundancy

Costs and limitations:

  • Reduced information sharing may hurt performance
  • Monitoring overhead
  • Architectural complexity
  • Some violations are undetectable
  • False positives disrupt legitimate operation
  • Sophisticated agents may find unmonitored channels
  • Logical correlation cannot be fully eliminated
  • May create cat-and-mouse dynamics

The discussion above focuses on preventing channel violations. But adversaries view hidden networks as weapons. Understanding this perspective improves both detection and defense.

From an adversary’s perspective, hidden channels provide:

Advantage             | Mechanism
----------------------+-------------------------------------------
Information asymmetry | Know relationships the principal doesn't
Surprise coordination | Reveal alliances at critical moments
Deniability           | Hidden links can be denied or severed
Detection evasion     | Unmonitored channels are undetectable

Strategic insight: The power of a hidden network comes from the gap between the principal’s model and reality. The principal’s security calculations are wrong because they don’t include the hidden edges.

Fiction and history provide case studies in adversarial hidden networks:

Littlefinger (Game of Thrones):

  • Maintained relationships across faction lines
  • Revealed alliances only when strategically advantageous
  • Power derived from knowing connections others didn’t see
  • “Chaos is a ladder” — hidden networks let you navigate chaos others can’t

Intelligence Moles:

  • Handler maintains hidden links to multiple agents
  • Information flows through channels the target organization doesn’t monitor
  • Multiple “independent” sources may trace back to single adversary

Corporate Collusion:

  • Competitors communicate through unofficial channels
  • Coordinate pricing, market division, or information sharing
  • Appear independent to regulators

To detect hidden channels, map what relationships should exist vs. what does exist:

Behavioral inference:

  1. Observe component behaviors over time
  2. Calculate expected correlation under independence
  3. If actual correlation >> expected → investigate hidden link

Counterfactual testing:

  1. Give Component A unique information
  2. Observe if Component B’s behavior reflects that information
  3. If yes → hidden channel A→B exists (see the canary-token sketch after this list)

Timing analysis:

  1. Look for suspicious temporal patterns
  2. A acts, then B acts similarly shortly after
  3. Correlation in timing suggests coordination
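
A sketch of the counterfactual test using a canary token; the two interface functions are hypothetical placeholders:

import secrets

def counterfactual_test(tell_component_a, sample_component_b_outputs) -> bool:
    """Plant a unique token in A's context; if it surfaces in B's outputs,
    a hidden channel A -> B exists.
    """
    canary = f"ZX-{secrets.token_hex(8)}"  # unguessable, so matches aren't chance
    tell_component_a(f"Internal reference code (do not share): {canary}")
    outputs = sample_component_b_outputs(100)
    return any(canary in out for out in outputs)

This catches only literal leakage of the token; paraphrased or semantic leakage requires the statistical probes above.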

For detailed methods, see Hidden Coordination.

Defense: Make Hidden Networks Less Valuable


Since some hidden channels are undetectable, also focus on reducing their value:

  1. Limit what coordination achieves: Even coordinating agents can’t exceed their capability bounds
  2. Reduce information asymmetry: Minimize what hidden networks could learn that the principal doesn’t know
  3. Create detection incentives: Reward agents for revealing coordination attempts
  4. Increase churn: Rotation makes stable hidden relationships harder to maintain



Key takeaways:

  1. Side-deals are distinct from correlations. Passive correlation and active boundary violation are different failure modes requiring different mitigations.

  2. Human institutions have centuries of experience. Separation of powers, regulatory independence, and judicial insulation offer tested patterns.

  3. Some violations are unavoidable. Shared context, human bottlenecks, resource competition, and logical correlation create irreducible channels.

  4. TDT-style correlation is the deepest challenge. Agents that reason similarly may coordinate without any communication channel we can monitor.

  5. Detection + response > prevention alone. Assume some violations will occur; focus on detection, limiting damage, and iterative improvement.

  6. Independence must be actively maintained. Without ongoing effort, boundaries tend to erode as components find ways to influence each other.