Tool-Design Anti-Patterns

Deep Dive · Tool & Capability Design

Most failing agents are not failing on the model — they are failing on four recurring tool-design anti-patterns.

The first five essays argued for principles; this one is the field guide to their violations. Each anti-pattern below is something you can spot in an existing tool definition in under a minute, each has a characteristic failure signature in the trace, and each has a mechanical fix. If an agent is mysteriously unreliable and the prompt looks fine, the tools are the first place to look — and these four are what you will usually find.

STEP 1

The kitchen-sink tool: one call with a mode flag that is really five tools.

The signature: a single tool with an action or operation argument and a schema where most fields are only conditionally relevant, with half applying only when action="create" and half only when action="delete". The model must first pick the tool, then pick the mode, then work out which fields that mode needs, and it either picks the wrong mode or fills in the wrong field set. The trace tell is the same tool called with internally contradictory arguments. The fix is K2's split: one tool per action, each with a tight schema.

# Anti-pattern: kitchen-sink tool; most fields apply to only one action
def manage_user(action: str, id=None, name=None,
                email=None, role=None, hard=None): ...

# Fix: one tool per action, each with a tight schema
def create_user(name, email, role): ...
def delete_user(id, hard=False): ...
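
The trace tell can be checked mechanically. The following is a minimal sketch, assuming the trace is available as a list of argument dicts; the `FIELDS_BY_ACTION` mapping and the `contradictory_args` helper are hypothetical names introduced here for illustration, not part of any real framework.

```python
# Hypothetical mapping: which arguments each manage_user action actually uses.
FIELDS_BY_ACTION = {
    "create": {"name", "email", "role"},
    "delete": {"id", "hard"},
}

def contradictory_args(call: dict) -> set:
    """Return the arguments that were set but do not belong to the chosen action."""
    allowed = FIELDS_BY_ACTION.get(call.get("action"), set())
    supplied = {k for k, v in call.items() if k != "action" and v is not None}
    return supplied - allowed

# Illustrative trace: the second call mixes create-only fields into a delete.
trace = [
    {"action": "create", "name": "Ada", "email": "ada@example.com", "role": "admin"},
    {"action": "delete", "id": "u_17", "name": "Ada", "role": "admin"},
]

flagged = [
    (index, sorted(contradictory_args(call)))
    for index, call in enumerate(trace)
    if contradictory_args(call)
]
```

Here `flagged` contains only the second call, with `name` and `role` reported as fields that have no meaning for a delete. After the split into `create_user` and `delete_user`, that call can no longer be expressed at all.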

STEP 2

Stringly-typed arguments: a free string where a structure belonged.

The signature: a parameter typed as string that actually expects a date, an enum value, a JSON blob, or a query DSL the model has to construct from nothing. Free strings are where the model puts plausible garbage: "next Tuesday" into a field that wanted 2026-05-26, "high priority" into a field that wanted P1. The fix is K3: replace the string with an enum, a typed field with a format, or a structured object so the bad value cannot be expressed.
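
As a sketch of what the typed version might look like on the tool's side of the boundary, the block below parses raw arguments into an enum and a real date. The names `Priority`, `CreateTicketArgs`, and `parse_create_ticket_args` are hypothetical, and the `P2` and `P3` values are assumed for illustration. The schema shown to the model would carry the same enum and date format; this covers only the parse that backs it.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum

class Priority(Enum):
    # P1 comes from the example above; P2 and P3 are assumed values.
    P1 = "P1"
    P2 = "P2"
    P3 = "P3"

@dataclass(frozen=True)
class CreateTicketArgs:
    due: date
    priority: Priority

def parse_create_ticket_args(raw: dict) -> CreateTicketArgs:
    """Parse raw arguments; raises ValueError on anything off-schema."""
    return CreateTicketArgs(
        due=date.fromisoformat(raw["due"]),   # rejects "next Tuesday"
        priority=Priority(raw["priority"]),   # rejects "high priority"
    )
```

Both bad values from the paragraph above raise `ValueError` at the boundary instead of flowing into the backend as plausible-looking strings.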

Grep your tool definitions for parameters typed as plain string with no enum, format, or pattern. Each one is a place where the model is currently free to hallucinate; most of them wanted a constrained type and nobody tightened it.
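
The audit can be scripted rather than done by eye. This is a minimal sketch, assuming tool definitions are stored as JSON-Schema-style dicts with the parameter schema under a `parameters` key; that layout, the `unconstrained_strings` helper, and the `create_ticket` definition are all assumptions made for illustration.

```python
CONSTRAINT_KEYS = ("enum", "format", "pattern")

def unconstrained_strings(tool: dict) -> list:
    """List parameter names typed as plain string with no enum, format, or pattern."""
    props = tool.get("parameters", {}).get("properties", {})
    return [
        name
        for name, spec in props.items()
        if spec.get("type") == "string"
        and not any(key in spec for key in CONSTRAINT_KEYS)
    ]

# Hypothetical tool definition, used only to exercise the check.
create_ticket = {
    "name": "create_ticket",
    "parameters": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due": {"type": "string", "format": "date"},
            "priority": {"type": "string", "enum": ["P1", "P2", "P3"]},
            "assignee": {"type": "string"},
        },
    },
}

suspects = unconstrained_strings(create_ticket)
```

For this definition `suspects` is `["title", "assignee"]`. A title is probably legitimate free text, while an assignee likely wanted an ID format or an enum; the script only produces the list to review, it does not decide.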

STEP 3

Silent failure: the tool returns success when nothing happened.

The signature: a tool that catches its own exceptions and returns an empty list, a default object, or a bare 200 instead of an error. This is the most dangerous anti-pattern because it produces no local symptom — the agent proceeds confidently on a false premise and the failure surfaces many steps later as behavior no trace can localize to a cause. The fix is K4: fail loudly, report partial success as partial, never let "nothing happened" look like "it worked." A tool that cannot fail visibly cannot be debugged.
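
The difference is easiest to see side by side. The sketch below is illustrative only: `ToolError`, `BatchResult`, and the three functions are hypothetical names, and the backend is passed in as a plain callable so the example stays self-contained. Whether a loud failure is raised or returned as a structured error object is a design choice; the sketch raises.

```python
from dataclasses import dataclass, field

class ToolError(Exception):
    """Raised so the failure shows up in the trace at the step that caused it."""

@dataclass
class BatchResult:
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # item id -> error message

    @property
    def status(self) -> str:
        if not self.failed:
            return "ok"
        return "partial" if self.succeeded else "failed"

def search_silent(backend, query: str) -> list:
    # Anti-pattern: an outage is indistinguishable from "no matches".
    try:
        return backend(query)
    except Exception:
        return []

def search_loud(backend, query: str) -> list:
    # Fix: an empty list now only ever means "searched, found nothing".
    try:
        return backend(query)
    except Exception as exc:
        raise ToolError(f"search failed for {query!r}: {exc}") from exc

def delete_many(delete_one, ids: list) -> BatchResult:
    # Partial success is reported as partial, item by item.
    result = BatchResult()
    for item_id in ids:
        try:
            delete_one(item_id)
            result.succeeded.append(item_id)
        except Exception as exc:
            result.failed[item_id] = str(exc)
    return result
```

With a backend that is down, `search_silent` returns `[]` and the agent concludes there were no results, whereas `search_loud` surfaces the outage at the step where it occurred. `delete_many` reports `"partial"` with the failing ids attached, so the agent can retry or tell the user exactly what was left undone.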

"Return empty on error so the agent doesn't crash" is the single most expensive convenience in agent engineering. The agent doesn't crash — it confidently does the wrong thing for ten more steps, and the postmortem cannot find where it went wrong because the tool erased the evidence.

STEP 4

The leaky abstraction: the tool exposes its implementation, not its job.

The signature: a tool whose arguments or outputs are shaped by the backend, not the task — pagination cursors the model has to thread, internal status codes, raw join keys, a db_query that makes the model author SQL. The model is forced to reason about your storage layer instead of the user's goal, and it reasons about it badly. The fix is K1: design the tool around the task and hide the machinery; if the model is constructing your internal query language, the abstraction has leaked and the tool is doing the wrong job.
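
As one illustration of hiding the machinery, the sketch below wraps a cursor-paginated backend in a task-shaped tool. Everything named here is hypothetical: `list_open_invoices`, the `fetch_page` callable, and its assumed contract of returning `(rows, next_cursor)` with `None` marking the last page.

```python
from typing import Callable, Optional

# Assumed backend contract: (customer_id, cursor) -> (rows, next_cursor or None).
FetchPage = Callable[[str, Optional[str]], tuple]

def list_open_invoices(fetch_page: FetchPage, customer_id: str, limit: int = 50) -> dict:
    """Task-shaped tool: the cursor loop stays inside, the model never threads it."""
    invoices, cursor = [], None
    while len(invoices) < limit:
        rows, cursor = fetch_page(customer_id, cursor)
        invoices.extend(rows)
        if cursor is None:
            break
    return {
        "invoices": invoices[:limit],
        # Say so explicitly when results were cut off rather than hiding it.
        "truncated": cursor is not None or len(invoices) > limit,
    }
```

The model supplies a customer and an optional limit, both of which are about the task. The cursor, the page size, and the query behind `fetch_page` never appear in the tool's arguments or output, and the `truncated` flag keeps the cut-off visible without exposing how paging works.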

STEP 5

The meta-tell: these rarely travel alone.

A kitchen-sink tool is usually also stringly-typed (the action arg) and often a leaky abstraction (it mirrors a CRUD endpoint), and it tends to fail silently because broad tools have broad, vague error handling. When you find one, audit the same tool for the other three — fixing the granularity (K2) frequently dissolves the typing and abstraction problems at the same time, because a task-shaped tool has nowhere to leak and nothing to mode-switch on.

STEP 6

When an "anti-pattern" is the pragmatic call.

These are defaults to violate consciously, not laws. A throwaway internal tool used once by a fixed harness can be stringly-typed and leaky and it does not matter. A genuinely general escape hatch — a sandboxed run_code — is a deliberate, contained leaky abstraction whose power is the point. The anti-pattern is shipping these by accident on a model-facing, freely-chosen, side-effecting tool — not choosing one knowingly, scoped, with the failure mode understood.