Multi-Agent Failure Modes

Multi-agent failure modes: the bugs that have no single-agent analogue.

A multi-agent system can fail in ways no individual agent can, and the 2025 literature is blunt about it: large-scale trace studies (the MAST taxonomy, Cemri et al., presented at NeurIPS 2025) catalog over a dozen system-level failure modes spanning specification, coordination, and verification, many of them impossible to detect by inspecting one agent in isolation. This essay is the field guide to the five that bite hardest in production: error propagation, groupthink, deadlock/livelock, cost explosion, and the observability gap that hides the other four.

STEP 1

Per-agent reliability does not compose; the system fails at the seams.

The instinct that "each agent is 95% reliable, so the system is roughly fine" is exactly wrong. What matters is not each agent's error rate but how errors propagate through the coupling structure. A pipeline of agents that are each individually strong can still fail systematically because the failures live in the handoffs and the coordination — the joints, not the parts. Stop evaluating agents in isolation and start evaluating the wiring.
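A quick way to see why per-agent reliability does not compose is to multiply it out. The sketch below assumes a linear pipeline in which stage failures are independent and any single failure sinks the run. Both assumptions are simplifications introduced here for illustration; real errors are correlated through the handoffs, which is the reason the wiring matters more than the per-agent number.

```python
def pipeline_success(per_agent_reliability: float, n_agents: int) -> float:
    """End-to-end success rate of a linear pipeline.

    Assumes independent stage failures and that any one failure
    corrupts the final result (illustrative simplifications).
    """
    if not 0.0 <= per_agent_reliability <= 1.0:
        raise ValueError("reliability must be in [0, 1]")
    if n_agents < 0:
        raise ValueError("n_agents must be non-negative")
    return per_agent_reliability ** n_agents


for n in (1, 5, 10, 20):
    rate = pipeline_success(0.95, n)
    print(f"{n:>2} agents at 95% each -> {rate:.1%} end to end")
```

Under these assumptions, ten stages at 95% each succeed end to end only about 60% of the time, and that is the optimistic case in which a bad output does not also mislead the stages after it.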

STEP 2

Error propagation: a soft mistake becomes everyone's ground truth.

The signature multi-agent failure. One agent produces a subtly wrong output — a hallucinated fact, a misread requirement, an over-confident summary. Because there is no exception, just plausible text, every downstream agent treats it as fact and builds on it. The error does not crash; it compounds silently, with no stack trace and no alert, until the final answer is confidently wrong and the cause is buried five hops back. The defense is structural: add explicit verification edges (a checker agent or a deterministic validator) at boundaries, pass uncertainty forward instead of laundering it into clean prose, and never let a worker's summary erase its own caveats.
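One way to make the structural defense concrete is to give every handoff a typed envelope that carries confidence and caveats, and to run a deterministic check at each boundary. The sketch below is a minimal illustration, not a prescribed design: `Handoff`, `verify_at_boundary`, `summarize`, `no_placeholders`, and the `min_confidence` threshold are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass, replace
from typing import Callable


@dataclass(frozen=True)
class Handoff:
    """Payload passed between agents; caveats travel with the content."""
    content: str
    confidence: float               # producer's self-reported confidence in [0, 1]
    caveats: tuple[str, ...] = ()
    verified: bool = False


# A validator returns a list of problems; an empty list means the check passed.
Validator = Callable[[str], list[str]]


def no_placeholders(text: str) -> list[str]:
    """Example deterministic validator (hypothetical rule)."""
    return ["contains unresolved TODO"] if "TODO" in text else []


def verify_at_boundary(msg: Handoff, validators: list[Validator],
                       min_confidence: float = 0.7) -> Handoff:
    """Verification edge: reject bad payloads, surface doubt on weak ones."""
    problems = [p for check in validators for p in check(msg.content)]
    if problems:
        raise ValueError(f"handoff rejected: {problems}")
    caveats = msg.caveats
    if msg.confidence < min_confidence:
        # Do not block, but make the uncertainty explicit downstream.
        caveats += (f"low upstream confidence ({msg.confidence:.2f})",)
    return replace(msg, caveats=caveats, verified=True)


def summarize(msg: Handoff, summary: str) -> Handoff:
    """A worker may shorten content but cannot drop caveats or confidence."""
    return replace(msg, content=summary, verified=False)


draft = Handoff("Revenue grew 12% year over year", 0.5, ("figure is unaudited",))
checked = verify_at_boundary(draft, [no_placeholders])
print(summarize(checked, "Revenue up 12%").caveats)
```

The design choice worth noting is that `summarize` resets `verified` and copies the caveats forward, so a rewritten summary has to pass the next boundary check again and cannot launder uncertainty into clean prose.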

The most dangerous property of error propagation is that it is invisible to per-component health checks. Every agent reports success; every interface returns well-formed data; the system is confidently, silently wrong. If your monitoring is binary pass/fail per agent, you cannot see this failure at all.

STEP 3

Groupthink: agents converge by agreeing, not by being right.

When agents read each other's outputs, they tend to conform. A confident early answer anchors the rest; dissent gets revised away; the system reaches consensus that looks like rigor and is actually an echo. 2026 work on multi-agent committees measures this as representational collapse — agents' reasoning becomes near-identical, so the committee carries the information of one agent at the cost of many. Counter it with engineered diversity (different models/prompts), asymmetric roles (a designated critic whose job is to disagree), and aggregation that weights calibrated confidence rather than counting conformist votes (see agent-debate-and-ensembles).
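To illustrate the difference between counting votes and weighting calibrated confidence, the sketch below compares a plain majority vote with a log-odds weighted aggregate. Log-odds weighting is one common choice picked here as an assumption, not the method of any work cited above, and it only helps if the confidences are actually calibrated and the agents are reasonably independent, which is exactly what conformity erodes. It complements engineered diversity rather than replacing it.

```python
import math
from collections import defaultdict

Vote = tuple[str, float]  # (answer, calibrated confidence in [0, 1])


def majority_vote(votes: list[Vote]) -> str:
    """Counts heads and ignores how sure each agent is."""
    counts: dict[str, int] = defaultdict(int)
    for answer, _ in votes:
        counts[answer] += 1
    return max(counts, key=lambda a: counts[a])


def confidence_weighted(votes: list[Vote], eps: float = 1e-6) -> str:
    """Sums log-odds per answer; confidence below 0.5 counts against it."""
    scores: dict[str, float] = defaultdict(float)
    for answer, p in votes:
        p = min(max(p, eps), 1.0 - eps)      # clamp to avoid log(0)
        scores[answer] += math.log(p / (1.0 - p))
    return max(scores, key=lambda a: scores[a])


# Three lukewarm agents that anchored on "A", one confident critic on "B".
votes = [("A", 0.55), ("A", 0.55), ("A", 0.55), ("B", 0.95)]
print("majority:", majority_vote(votes))
print("weighted:", confidence_weighted(votes))
```

In this toy committee the majority vote follows the three anchored agents, while the weighted aggregate follows the single well-calibrated dissenter.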

STEP 4

Deadlock and livelock: the system stops, or runs forever without progress.

Any topology with cycles can wedge. Deadlock: agent A waits on B while B waits on A, or two agents each hold a resource the other needs, and the run hangs with no error. Livelock: agents keep acting and messaging but make no forward progress, such as a pair endlessly revising in response to each other, a planner that re-plans the same step, or a debate that never converges. Livelock is the more insidious of the two because it looks busy and burns budget, while a deadlock at least stops. Defenses: a global progress monitor (no measurable advance in K steps → abort), turn/round caps, timeouts on every wait, and acyclic topologies wherever the task allows.

# Livelock guard: abort runs that keep acting without measurable progress.
# K is the number of consecutive steps tolerated with no advance.
if steps_since_progress > K:
    abort(run, reason="livelock: no progress")  # spending budget is not progress
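A slightly fuller version of the guard might look like the sketch below, which combines the stall check with a round cap. It assumes the orchestrator can compute some numeric progress score after each step (for example, subtasks completed or tests passing); that score, along with the `ProgressMonitor` and `RunAborted` names, is hypothetical and would need to be defined per task.

```python
class RunAborted(RuntimeError):
    """Raised when the orchestrator should stop the whole run."""


class ProgressMonitor:
    """Global guard: abort on stalled progress or too many rounds."""

    def __init__(self, max_stalled_steps: int, max_total_steps: int):
        self.max_stalled_steps = max_stalled_steps   # the K from the text
        self.max_total_steps = max_total_steps       # hard round cap
        self.best = float("-inf")
        self.steps_since_progress = 0
        self.total_steps = 0

    def record(self, progress_score: float) -> None:
        """Call once per step with the current progress score."""
        self.total_steps += 1
        if progress_score > self.best:
            self.best = progress_score
            self.steps_since_progress = 0
        else:
            self.steps_since_progress += 1
        if self.steps_since_progress > self.max_stalled_steps:
            raise RunAborted("livelock: no progress")
        if self.total_steps > self.max_total_steps:
            raise RunAborted("round cap reached")


monitor = ProgressMonitor(max_stalled_steps=3, max_total_steps=50)
try:
    # Two agents revising each other's draft: lots of activity, flat score.
    for score in [1, 2, 2, 2, 2, 2, 2]:
        monitor.record(score)
except RunAborted as exc:
    print(f"aborted after {monitor.total_steps} steps: {exc}")
```

The monitor only counts strict improvement as progress, so a pair of agents that keep rewriting the same draft without raising the score trips the guard no matter how many messages they exchange.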

STEP 5

Cost explosion: fan-out has no natural backpressure.

Multi-agent token use is multiplicative, not additive. Anthropic reported their multi-agent research system used on the order of 15× the tokens of a single chat; a recursive planner that fans out at each level is exponential, not linear. Nothing in the model stops a planner from deciding it needs 200 subagents, and retries on a failing run magnify spend instead of resolving it. Treat cost as a safety property: hard caps on fan-out factor and recursion depth (runtime limits, not prompt requests the model can ignore), a single shared budget that children draw down (not a fresh budget each), and a kill switch when joint spend crosses a ceiling — independent of whether the task "feels" almost done.
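The caps described above can be enforced in the runtime rather than requested in a prompt. With branching factor b and depth d, an uncapped recursive planner spawns on the order of b^d agents, so the sketch below clamps both and makes every agent draw from one shared pool. `SharedBudget`, `BudgetExceeded`, and the toy `run_agent` are hypothetical names for illustration, and the token figures are made up.

```python
import threading


class BudgetExceeded(RuntimeError):
    """Kill switch: raised when joint spend would cross the ceiling."""


class SharedBudget:
    """One budget for the whole run; children draw it down, never reset it."""

    def __init__(self, max_tokens: int, max_fanout: int, max_depth: int):
        self.remaining = max_tokens
        self.max_fanout = max_fanout
        self.max_depth = max_depth
        self._lock = threading.Lock()   # children may run concurrently

    def charge(self, tokens: int) -> None:
        with self._lock:
            if tokens > self.remaining:
                raise BudgetExceeded(f"need {tokens}, have {self.remaining}")
            self.remaining -= tokens

    def allowed_children(self, requested: int, depth: int) -> int:
        """Runtime cap: the model can ask for 200, it gets at most max_fanout."""
        if depth >= self.max_depth:
            return 0
        return min(requested, self.max_fanout)


def run_agent(budget: SharedBudget, depth: int = 0,
              wants_children: int = 200, tokens_per_call: int = 1_000) -> int:
    """Toy agent: spends tokens, then tries to fan out. Returns agents run."""
    budget.charge(tokens_per_call)
    n = budget.allowed_children(wants_children, depth)
    return 1 + sum(
        run_agent(budget, depth + 1, wants_children, tokens_per_call)
        for _ in range(n)
    )


budget = SharedBudget(max_tokens=50_000, max_fanout=3, max_depth=2)
print("agents run:", run_agent(budget), "tokens left:", budget.remaining)
```

Because the ceiling lives in `charge`, the run stops the moment joint spend would exceed it, regardless of how close to done any individual agent believes it is.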

STEP 6

The meta-failure: you cannot debug what you cannot see across agents.

Every failure above is invisible to single-agent tooling. Per-agent logs show local success while the system fails at the seams; there is no stack trace across a message boundary; the order of distributed events must be reconstructed. The non-negotiable instrumentation: a correlated trace that spans every agent and handoff, failure-propagation depth (how many downstream agents a single fault touched) as a first-class metric alongside end-to-end success, and joint cost attributed per run — plus checkpoints so a corrupted run resumes rather than restarts. The multi-agent tax is not just tokens — it is a new class of silent, system-level failures, and the only thing that makes them survivable is observability that crosses the agent boundary. Build that first, or do not go multi-agent.
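As a rough sketch of what the cross-agent instrumentation could compute, the example below records each handoff with a shared correlation id and derives two of the metrics named above: failure-propagation depth (counted here as the number of distinct downstream agents reachable from the faulty one) and joint cost per run. The `HandoffEvent` schema and agent names are hypothetical, and the reachability check ignores event ordering, so it is an upper bound on what a fault could have touched.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class HandoffEvent:
    run_id: str    # correlation id shared by every agent in one run
    src: str       # agent that produced the message
    dst: str       # agent that consumed it
    tokens: int    # tokens spent producing the message


def propagation_depth(events: list[HandoffEvent], run_id: str,
                      faulty_agent: str) -> int:
    """Number of distinct downstream agents a fault could have reached."""
    edges: dict[str, set[str]] = defaultdict(set)
    for e in events:
        if e.run_id == run_id:
            edges[e.src].add(e.dst)
    touched: set[str] = set()
    queue = deque([faulty_agent])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt != faulty_agent and nxt not in touched:
                touched.add(nxt)
                queue.append(nxt)
    return len(touched)


def run_cost(events: list[HandoffEvent], run_id: str) -> int:
    """Joint token cost attributed to a single run."""
    return sum(e.tokens for e in events if e.run_id == run_id)


trace = [
    HandoffEvent("run-1", "planner", "researcher", 800),
    HandoffEvent("run-1", "researcher", "writer", 2_400),
    HandoffEvent("run-1", "researcher", "checker", 600),
    HandoffEvent("run-1", "writer", "editor", 1_500),
    HandoffEvent("run-2", "planner", "writer", 700),
]
print("touched by researcher fault:", propagation_depth(trace, "run-1", "researcher"))
print("run-1 joint cost:", run_cost(trace, "run-1"))
```

None of this is possible from per-agent logs alone: both metrics depend on the `run_id` that ties every handoff in a run to the same trace.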