Customer-support agents: the economics is deflection, the constraint is the wrong answer.
A support agent is the most-deployed agent in production and the one most often built backwards. Teams optimize for "answer everything" when the job is "resolve what you can correctly, and hand off the rest cleanly." This playbook is the opinionated version: the autonomy level that fits, the grounding that prevents confident fiction, the tone that survives a frustrated customer, and why the cost of one wrong answer — not the average — sets your whole design.
The job is deflection-with-dignity, not full automation.
The metric that pays for the project is deflection: tickets resolved without a human. But the metric that kills the project is a confidently wrong answer on a high-stakes question — a refund policy, a cancellation, a security claim. The correct framing is not "what fraction can the agent answer" but "what fraction can it answer and be trusted to know when it cannot." A support agent that resolves 60% of tickets and escalates the other 40% cleanly is a success; one that resolves 85% but invents a return policy for the other 15% is a liability that legal will shut down. Optimize the deflection rate subject to a near-zero confident-wrong-answer rate, never the deflection rate alone.
The right autonomy level: answer freely, act narrowly.
Split the agent's surface into two zones with different autonomy. Informational responses (how do I, what is your policy on, where is my order) can be fully autonomous when grounded. Transactional actions (issue a refund, change a plan, close an account) are consequential and irreversible — they get a typed tool with hard limits and, above a threshold, a human approval gate. This is the human-in-the-loop least-privilege pattern applied to a vertical: the model may compose any sentence, but it may only call issue_refund up to a bounded amount, and anything larger becomes a drafted action a human confirms.
The cheapest safety win in support is making the refund/cancel tools refuse out-of-policy inputs at the tool boundary, not in the prompt. A prompt rule is a suggestion; a tool that rejects amount > tier_cap is a guarantee.
Knowledge grounding is the product; the model is the renderer.
A support agent ungrounded in your help center, policy docs, and account state is a hallucination engine pointed at customers. Retrieval is not an enhancement here, it is the substance. Two grounding sources matter: the knowledge corpus (policies, articles — retrieved, cited, and quotable) and live account state (this customer's plan, order, entitlement — fetched via tools, never guessed). The model's job is to render grounded facts in plain language, not to supply the facts.
# answer only from retrieved policy + live account; refuse otherwise ctx = retrieve(query, corpus="help_center") + get_account(user_id) if not ctx.grounded: return escalate(reason="no_grounded_source") answer = model(prompt, context=ctx, cite=True)
"The model usually gets the policy right" is the trap. Support questions are adversarially selected toward edge cases — the customer who contacts you is disproportionately the one the simple answer fails. Ground every policy claim or do not make it.
Tone is a safety property, not a polish layer.
A factually correct answer delivered in a tone-deaf register to an angry customer is a failed interaction and often an escalation. Tone is part of the spec: acknowledge the frustration before the fix, never argue, never blame the customer, and de-escalate rather than defend. The dangerous interaction is not the one the agent gets wrong — it is the one it gets technically right while making the customer angrier. Bake tone constraints into the eval, not just the system prompt, because tone regressions are silent and only show up in CSAT weeks later.
The eval signal: resolution and harm, on real transcripts.
You cannot tune what you do not measure, and CSAT alone is too lagging and too noisy. The eval set that actually steers a support agent is a graded sample of real transcripts scored on three independent axes:
- Resolution — did the customer's problem actually get solved, judged on outcome, not on whether the agent sounded confident.
- Groundedness — every policy/factual claim traceable to a retrieved source; an ungrounded claim is a failure even if it happens to be correct.
- Safe handoff — when out of scope, did it escalate with context rather than improvise; a wrong answer scores far worse than a clean escalation.
Weight a confident wrong answer as a large negative, not a small one — the cost function in the eval must mirror the cost function in the business.
The customer-support checklist.
- Deflection rate is the goal metric; confident-wrong-answer rate is the hard constraint, weighted heavily negative in the eval.
- Informational answers autonomous when grounded; transactional actions are typed tools with policy caps and an approval gate above a threshold.
- Every policy/factual claim is retrieved and citable; ungrounded ⇒ escalate, never improvise.
- Live account state comes from tools, never from the model's memory.
- Tone (acknowledge, do not argue, de-escalate) is graded in the eval, not left to the prompt.
- Escalation carries full context to the human; a clean handoff beats a confident guess.
- Eval set is graded real transcripts on resolution, groundedness, and safe handoff — refreshed as products and policies change.
The honest tradeoff: a support agent earns its keep by deflecting volume, but every point of deflection you buy by loosening grounding is paid back with interest in trust — design for the wrong-answer cost first and let the deflection rate be whatever that allows.