Designing for Failure & Recovery

Playbook · Agent UX & Human Interaction

The quality of an agent product is decided by what happens after it is wrong.

Every agent will fail visibly in front of a user. The differentiator is not failure rate alone — it is whether failure is graceful, recoverable, and trust-repairing rather than catastrophic, opaque, and trust-destroying. This essay covers failing safely, undo and rollback as the primary safety net, error messages a user can actually act on, and the specific work of rebuilding calibrated trust after a mistake the user watched happen.

STEP 1

Design the failure path with the same care as the success path.

Most agent UX effort goes into the happy path; the failure path is whatever the exception handler happens to do. That ratio is backwards — users form their durable judgment of an agent during failures, not successes. A graceful failure has three properties: it stops before compounding (one bad step, not ten built on it), it leaves the system in a known state (not half-applied), and it tells the user precisely where things stand. Failing loudly and early beats failing silently and late every time.

The dangerous failure is not the visible crash — it is the agent that quietly produces a plausible-but-wrong result and continues. Bias detection toward surfacing uncertainty and stopping, not toward pushing through.

STEP 2

Undo is the safety net that makes everything else affordable.

If actions can be cleanly undone, the entire cost of being wrong drops by an order of magnitude — and so can the friction everywhere else (this is the reversibility lever from H2, viewed from the failure side). Invest in undo before investing in better confirmations: a strong undo lets you safely lower approval friction, which reduces fatigue, which improves the oversight that catches the failures undo then cleans up.

# Make the action reversible; record how to reverse it
op = effect.apply(action)
journal.record(op.inverse)          # compensating action, kept ready
if user.undo(op) or monitor.detected_regression(op):
    await journal.compensate(op)     # roll back to known-good state

Prefer real reversibility (compensating actions) over a confirmation that merely warns.
Where true rollback is impossible, insert a delay window with cancel before the effect lands.
Make undo discoverable at the moment of regret, not buried in a menu.

STEP 3

An error message is a UI for the user's next action.

"Something went wrong" tells the user nothing they can use. An actionable failure message answers three questions: what state am I in now (was anything applied? is it consistent?), what can I do (retry / undo / take over / escalate), and what should I not assume (e.g., "the email was NOT sent"). Surface the durable facts — what completed, what was rolled back, what is pending — and put the recovery action the user most likely wants as the primary control, right there, not three screens away.

STEP 4

Fail closed on consequences, fail open on capability.

The direction of failure is a deliberate choice. On the consequence axis — money, data, external communication — fail closed: when uncertain or erroring, do less, not more; an action not taken is recoverable, an erroneous irreversible action often is not. On the capability axis — a tool is down, a source is unreachable — fail open toward honest degradation: return partial results clearly labeled, or hand back to the human with state intact, rather than fabricating to appear functional. The worst failure mode combines them: a confident, complete-looking, wrong, irreversible result.

STEP 5

Trust repair after a visible mistake is explicit work, not time.

A visible error does not just cost that task; it resets the user's calibration (H1) for the whole agent, often globally even though the failure was scoped. Trust does not recover by waiting — it recovers through deliberate repair: acknowledge the specific error plainly (no minimizing, no anthropomorphic apology theater), state the concrete change that makes a recurrence less likely, and — most powerfully — propose a temporarily lowered autonomy rung (H5) in exactly the scope that failed. A demotion the agent proposes itself reads as accountability and rebuilds calibrated trust faster than any worded apology.

After a visible failure, scope the trust impact for the user explicitly: "this affected refund handling; ticket labeling is unchanged." Unscoped failures silently poison trust in capabilities that never failed.

STEP 6

When NOT to auto-retry.

If the failure may have partially applied a non-idempotent effect, do not auto-retry — silently retrying over uncertain state turns one recoverable failure into a duplicated, irreversible one.