Adversarial Dynamics
The lens that makes predation, capture, and exploitation legible inside cooperative systems — naming the intentional adversary who studies the system to exploit it and the structural adversary that rewards defection regardless of intent.
Mechanism
Inside Cooperating Under Bad Faith, Adversarial Dynamics is the lens the category's other tools operate under: trust mining, the cooperative vulnerability, the exclusion problem, and sabotage diagnostics are each a specific mechanism this lens brings into focus.
Most of what the Codex maps as failure is drift. Incentives misalign over time, entropy accumulates, and systems move toward Control or Decay because no one is holding them against the current. That kind of failure is slow and diffuse; the river erodes the bank. Some failures arrive differently. An actor sits inside the cooperative framework, learns its language well enough to pass any calibration test, accumulates authority through visible alignment, and then spends the accumulated authority on something the framework was never designed to sanction. The cooperative vocabulary itself — good faith, charitable engagement, the strongest version of your argument — becomes the instrument of the exploitation, because those are the tools the framework runs on. This is a river diverted upstream by someone who knew exactly which channel to cut.
Some failures are engineered rather than drifted into. The existing Codex analysis of drift is correct but incomplete. A framework for cooperation that does not account for actors who come prepared to exploit it reads, to anyone who has watched exploitation happen in real time, as a framework written for sincere people in a world that holds more than sincere people.
The lens names two adversaries. The intentional adversary is not a cartoon villain. In this model the adversary is usually intelligent, often more fluent in the cooperative system's language than its average member, and frequently convinced they serve some higher purpose. The pattern is strategic rather than malicious in the ordinary sense: the actor studies the cooperative system, learns its norms better than most of its members, and uses that fluency to accumulate authority specifically in order to spend it. Guarding against this takes recognition, not paranoia: any system that rewards people for looking cooperative gives everyone in it an incentive to pass for cooperative, and any framework aimed at continuity has to account for that.
The structural adversary is not an actor at all but the environment that rewards defection regardless of intent — the system whose incentives produce exploitative outcomes from people who never set out to exploit anyone. A cooperative framework has to withstand both: the intentional adversary who comes prepared to exploit it, and the structural drift that produces exploitative behavior without anyone intending it.
What the lens brings into focus is a set of specific mechanisms. Each is a contributing tool in this category with its own profile; what follows is what the lens reveals about each.
Trust mining is the practice of building up trust capital specifically in order to spend it. The accumulation phase looks identical to an ordinary career inside a cooperative system: the actor passes calibration tests, demonstrates alignment, earns authority through visible competence, patiently, sometimes over years. Then, at the point where the accumulated authority is high enough to be worth spending, the exploitation phase begins — extraction, redirection, sometimes outright capture. The specific vulnerability this reveals: calibrated trust, the Bond's own prescription, can be gamed by an actor patient enough to pass the calibration. Institutional capture is trust mining run at scale, through the legitimate channels of an institution, over time; the capture is invisible step by step, because step by step nothing unusual is actually happening.
The cooperative vulnerability is the framework's own vocabulary becoming the attack surface. Steelmanning turns into a legitimation tool; good faith becomes a shield demanded asymmetrically; connection before correction becomes a guarantee that a manipulative position will be received sympathetically before it can be challenged. Cooperative language is genuine when both parties are accountable to it. It is weaponized when one party uses it to constrain the other while exempting themselves. The test is whether the vocabulary binds both sides equally. The answer is not to abandon good faith — that would be surrender, becoming what the adversary already is — but to make good faith conditional on reciprocation, and to build the diagnostic capacity to see when reciprocation is being performed rather than practiced.
The exclusion problem is the hardest operational question the lens raises: when does a cooperative framework have to exclude a participant in order to survive? The Compact says identity is through practice; what happens when someone claims membership while systematically violating the practice? Four conditions together justify exclusion — a sustained pattern of bad faith rather than a single incident, evidence of strategic exploitation distinct from honest disagreement, failure of the full repair sequence, and an exclusion that serves the Range rather than the excluder's comfort. The risk of false positives is severe: labeling dissent as sabotage is one of the oldest Control moves in institutional history, and exclusion wielded to enforce conformity is the Codex practicing the failure mode it claims to resist.
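The "together" in the four conditions matters: each is necessary and none is sufficient on its own. A minimal sketch of that conjunction follows. The class and field names are hypothetical, introduced only for illustration, and each boolean stands in for a human judgment that the code does not and cannot make.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ExclusionCase:
    """Hypothetical record of the four judgments; the names are illustrative only."""
    sustained_bad_faith_pattern: bool       # a pattern, not a single incident
    strategic_exploitation_evidenced: bool  # distinct from honest disagreement
    repair_sequence_failed: bool            # the full repair sequence was tried
    serves_the_range: bool                  # not merely the excluder's comfort


def unmet_conditions(case: ExclusionCase) -> list[str]:
    """Return the names of the conditions that do not yet hold."""
    return [name for name, met in asdict(case).items() if not met]


def exclusion_justified(case: ExclusionCase) -> bool:
    """All four conditions must hold together; any gap blocks exclusion."""
    return not unmet_conditions(case)
```

Returning the unmet conditions, rather than only a yes or no, keeps the reasoning visible: a blocked case says which condition is still missing.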
Sabotage diagnostics is the discipline of telling genuine dissent from strategic sabotage. From outside, the two can be indistinguishable: both oppose prevailing positions, both name problems, both are uncomfortable for the institution on the receiving end. The difference shows up only as a pattern, over time, across five behavioral signatures.
Beyond the four mechanisms inside this category, the lens reframes how other Bond instruments are read: trust calibration, costly signals, repair, boundaries, reciprocity, and hostile-review readiness all take on sharper meaning once adversarial dynamics are in view. That cross-category reframing is architectural reasoning the category's deeper layer will carry; it is named here so the lens's reach is visible.
Practice
The diagnostic question is: "Is the other party accountable to the same norms they are invoking?"
Three practices convert that question into something you can apply in the middle of a live situation.
The Reciprocation Test. Before extending further trust or engagement, ask whether the other party is accountable to the norms they are invoking. If they demand good faith but do not practice it, or insist you steelman their positions while engaging yours in caricature, the asymmetry is the diagnostic. The test is not whether they use the cooperative vocabulary but whether the vocabulary binds them as it binds you. When it does not, the cooperative frame is already compromised.
The Pattern Assessment. Single interactions are ambiguous; patterns over time are far less so. Track whether the actor's behavior consistently produces fragmentation where cooperation was possible, whether their arguments shift with the audience in ways that suggest strategic positioning, whether they engage the strongest versions of opposing positions or reliably the weakest, and whether they respond to evidence or only to changes in strategic position. One data point is not a pattern; five over six months is. The Bond's commitment to cooperate in good faith does not require you to ignore what patterns over time reveal about who you are actually cooperating with.
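The "five over six months" rule of thumb can be sketched as a count inside a time window. This is an illustration under stated assumptions, not a prescribed procedure: the `Observation` type, the function name, and the reading of six months as 183 days are all hypothetical choices made here.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Observation:
    """Hypothetical record: one dated instance of a concerning behavior."""
    when: date
    signature: str  # free-text label, e.g. "argument shifted with audience"


def is_pattern(observations, as_of, min_count=5, window_days=183):
    """True when at least min_count observations fall in the window ending at as_of.

    The defaults mirror "five over six months"; 183 days is an assumed
    approximation of six months.
    """
    start = as_of - timedelta(days=window_days)
    recent = [o for o in observations if start <= o.when <= as_of]
    return len(recent) >= min_count
```

A single observation returns False no matter how striking it is, and so do five observations scattered across several years. The sketch only counts; deciding whether something belongs in the log at all remains a judgment call.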
The Structural Check. When considering any serious response, and particularly exclusion, require structural evidence: a sustained pattern across multiple independent observations, failed repair protocols, strategic exploitation confirmed rather than assumed. Never act on a single incident, a single observer's assessment, or evidence that supports only "this person is difficult" rather than "this person is exploiting the cooperative system." The structural check is deliberately slow, because the cost of false positives is as severe as the cost of false negatives.
These three practices share an architecture. Each converts an intuition — something is off here — into a structural claim: this is the asymmetry, this is the pattern, this is the evidence. The conversion is where the discipline lives, because intuitions on their own go wrong in both directions: too generous toward the exploiter, too harsh toward the dissenter. The practices slow judgment down and require it to surface its reasoning, which is what lets you examine the reasoning before you act on it.
In the Wild
A rationalist community built a research institution on cooperative norms: rigorous argument, charitable engagement, meritocratic allocation of resources. A major donor, fluent in both the community's language and its values, accumulated trust through visible alignment over several years. When the exploitation phase arrived, it arrived large — billions of dollars of customer deposits misappropriated to prop up a trading operation. The community's post-mortem focused less on whether the failure had happened and more on how an actor so deeply embedded had operated undetected for years. The answer pulled together trust mining, institutional capture, and the weaponization of the community's own epistemic norms. What the community is still absorbing is that rigor itself does not protect against an adversary who has made rigorous study of how the community thinks.
A regulatory agency was created to oversee a powerful industry. Over two decades, senior staff rotated through industry positions and back into the agency. Each individual career move was defensible, several admirable on their own terms. The aggregate was capture: the agency stopped producing decisions that constrained the industry in meaningful ways. The formal structure persisted for years while the substantive function had quietly been evacuated from inside it. By the time the capture was visible in the outcomes, the people best placed to name what had happened had already internalized the institutional norms that made naming it difficult.
A political movement adopted the language of reform. Its members attended every meeting, followed every procedure, won every vote. Critics were told to engage through the established processes. When objections were raised through those processes, the movement used its majority to block them. When procedural changes were proposed to restore balance, the movement invoked the sanctity of process. When the process produced outcomes the movement did not want, the movement attacked the process. Each move on its own was defensible; the pattern added up to sabotage. What was consistent was not any substantive position but the selection rule underneath it — the appeals shifted with the strategic position, and the commitment to process held only for as long as the process produced the wanted outcome.
A small team operating with high mutual trust noticed something odd. One member's contributions were increasingly received as correct by default, while another member's were increasingly examined carefully. When the team sat down with it, the asymmetry had nothing to do with the quality of the work — it was about accumulated authority. The first member had been demonstrating visible alignment for months; the second had been disagreeing, legitimately but uncomfortably, with roughly the same frequency. The team had quietly started treating visible alignment as evidence of competence. They caught the drift before it could institutionalize, and corrected it by making the basis for their assessments explicit. Not every case of adversarial dynamics ends in catastrophe; the ones that do not are almost always the ones where the diagnostic tools were available and the team had the discipline to use them while there was still time.
The Bond without adversarial dynamics is incomplete: it teaches how to cooperate well, but it also has to teach what cooperation looks like when the person across from you has studied the cooperative framework in order to take from it.
Suspicion as default would destroy the cooperative framework as thoroughly as any adversary could — it would turn you into the thing you were resisting and cost you the advantages cooperation gives you in the first place. Naive openness fails in the opposite direction, inviting over time exactly the exploitation it refuses to anticipate. What the Bond asks for under adversarial pressure is harder than either: calibrated trust that carries a specific new capacity inside it, the ability to recognize in real time when the cooperative framework itself is being turned against you. That capacity is not native. It has to be built.
The Codex values disagreement and requires dissent; nothing in adversarial-dynamics analysis reduces that commitment. What it changes is only the assumption that everyone invoking the cooperative vocabulary is also accountable to it. Whether they are accountable, not whether they speak the language, is what you are watching for.
Lineage
The Codex did not invent adversarial dynamics. The mechanisms named on this page were mapped by several independent research traditions over the twentieth century; the Codex's contribution is to assemble them into a coherent lens for the Bond.
Robert Axelrod's tournaments, published in The Evolution of Cooperation (1984), established the game-theoretic foundation. Axelrod showed that cooperative strategies like Tit-for-Tat outperform both unconditional cooperation and unconditional defection in repeated interactions, but also showed the conditions under which cooperation breaks down: when the shadow of the future shortens, when reputation becomes unreliable, when a player exits the game. Trust mining is a defection strategy Axelrod's framework predicts — a long accumulation phase that exploits the cooperative equilibrium until the payoff from a single large defection exceeds the value of continued cooperation.
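The trade-off in that last sentence can be written as a single inequality. This is a simplified sketch, not Axelrod's own formulation. It assumes a per-round cooperative payoff R, a discount parameter w with 0 < w < 1 standing for the shadow of the future, a one-time defection payoff D, and a defector who exits afterward and earns nothing further. D and the exit assumption are introduced here for illustration.

```latex
\[
V_{\text{cooperate}} = R + wR + w^{2}R + \cdots = \frac{R}{1-w},
\qquad
\text{defection pays when } D > \frac{R}{1-w}.
\]
```

With R = 3, the threshold is 30 when w = 0.9 and 6 when w = 0.5: as the shadow of the future shortens, a smaller prize is enough to make defection worthwhile. Trust mining works the other side of the inequality, growing D through accumulated authority until it clears the threshold.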
Mancur Olson's The Logic of Collective Action (1965) and Elinor Ostrom's Governing the Commons (1990) developed the institutional economics of cooperation. Olson's work on free-riding named the vulnerability: when the benefits of cooperation are diffuse and the costs of monitoring are high, actors who defect while appearing to cooperate can extract substantial value before detection. Ostrom's work on commons governance mapped the institutional defenses — graduated sanctions, monitoring designed into the structure, clearly defined membership — the ancestors of the practices on this page.
The political science literature on entryism and regulatory capture documented institutional capture empirically. George Stigler's work (The Theory of Economic Regulation, 1971) argued that regulatory bodies tend to end up serving the industries they regulate, not through conspiracy but through career incentives, informational asymmetry, and the social dynamics of repeated interaction.
Karl Popper's The Open Society and Its Enemies (1945) articulated the paradox of tolerance: a society that extends unlimited tolerance even to the intolerant risks having that tolerance used to destroy it. John Rawls' Political Liberalism (1993) developed the concept of the limits of reasonable pluralism: a liberal society owes reasonable disagreement its full engagement but does not owe unreasonable positions the same standing. The exclusion problem is a direct descendant of this line, adapted from political philosophy to cooperative frameworks.
Social psychology contributed the evidence that systems produce exploitation even from actors who would not individually choose it. Stanley Milgram's obedience experiments and Philip Zimbardo's Stanford prison experiment have both been criticized on methodological grounds, and the strong forms of their conclusions have been significantly revised. The durable finding is subtler: structures shape behavior in ways that produce exploitative outcomes from people who did not set out to exploit anyone. This is the structural adversary: the vulnerability that exists even without intentional adversaries.
Cross-references
Within the category. Adversarial Dynamics is the lens; the other four contributing tools in Cooperating Under Bad Faith are the specific mechanisms it brings into focus. Trust Mining is the two-phase accumulation-then-extraction pattern, including institutional capture as its scaled form. The Cooperative Vulnerability is the framework's own vocabulary turned into an attack surface. Sabotage Diagnostics is the discipline of telling genuine dissent from strategic sabotage across a pattern of behavioral signatures. The Exclusion Problem names the conditions under which a cooperative framework has to exclude a participant to survive. Each has its own profile; this lens carries the compressed treatment that frames them as one architectural family.
Across the Workshop. Calibrating Trust to Behavior is the Bond category Adversarial Dynamics most directly pressures — calibrated trust is the Bond's prescription, and trust mining is the strategy that games the calibration. Receiving Disagreement Well carries the receiving-side function of Steelmanning, which Adversarial Dynamics shows can be weaponized when one party engages the strongest version of their own position while caricaturing yours. In the Foundation, Steelmanning itself is the clearest case of a cooperative practice that becomes an attack surface under bad faith — the lens reads the asymmetry that Steelmanning's own practice cannot detect from inside.
Two limitations are worth naming. First, adversarial-dynamics reasoning can become paranoid in the wrong hands. A framework that treats every disagreement as potential sabotage has already failed the test it claims to apply; the tools here are built to raise the threshold for adversarial interpretation, not lower it. If you find yourself reaching for this lens to explain away dissent you happen to find uncomfortable, you are using it wrong. Second, no diagnostic eliminates uncertainty. You can apply every signature, follow every practice, and still be wrong about a specific case. The tools improve the probability of correct assessment; they do not produce certainty, and the position this profile takes is that imperfect tools applied with discipline are better than no tools applied with confidence.