When to Use an Agent

When to use an agent — and when not to.

The most valuable agent skill is not building one — it is recognizing the minority of problems that actually need one. Agents are the most powerful pattern in this section and the most expensive on every axis that isn't capability. This entry is a decision procedure: the three properties a task must have to justify an agent, the honest cost of choosing one, the cheaper patterns that solve most "agent" problems, and a short list of cases where reaching for an agent is a mistake.

STEP 1

The three properties a task needs to justify an agent.

An agent earns its cost only when a task has all three of these. Two out of three means a workflow does it better; one or zero means an agent is the wrong tool entirely.

1. The path can't be enumerated in advance. The sequence of steps depends on results you only learn at runtime, and those results vary enough that you cannot draw the flowchart. Debugging an unfamiliar failure is an example: every clue redirects the next probe. If you can draw the flowchart, you have a workflow; build that, because it is cheaper and testable.
2. The environment gives genuine feedback. Each action produces an observation that meaningfully constrains the next decision — a test passes or fails, a search returns or doesn't, an API errors with a specific code. Agents convert feedback into progress; with no informative feedback there is nothing for the loop to chew on, and you have a generation task, not an agent task.
3. The value clears the unpredictability tax. The task is valuable enough, or frequent enough, that automating it is worth giving up the ability to know in advance exactly what the system will do. Automating a rare, low-value task with an unpredictable system is a bad trade no matter how cool the demo is.

The compressed test: "Would a competent human doing this task need to make judgment calls based on what they find as they go, and is it worth enough to let software make those calls?" Yes to both → candidate for an agent. A "no" on the first means a workflow. A "no" on the second means it's not worth the unpredictability, whatever the technology.
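
The three-property test above can be written down as a small scoring function. This is a minimal sketch, not a definitive rubric: the `TaskProfile` fields, the `recommend` function, and the `invoice_routing` example are hypothetical names introduced here for illustration, and the yes/no answers still have to come from human judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskProfile:
    """Hypothetical yes/no answers to the three questions above."""
    path_unknown_in_advance: bool  # property 1: steps depend on runtime results
    informative_feedback: bool     # property 2: each action yields a useful observation
    value_clears_tax: bool         # property 3: worth giving up predictability


def recommend(task: TaskProfile) -> str:
    """Map the number of satisfied properties to the pattern the text suggests."""
    score = sum([task.path_unknown_in_advance,
                 task.informative_feedback,
                 task.value_clears_tax])
    if score == 3:
        return "agent candidate"
    if score == 2:
        return "workflow"
    return "not an agent"


# Illustrative profiles only; the answers are assumptions, not measurements.
debugging = TaskProfile(True, True, True)
invoice_routing = TaskProfile(False, True, True)
print(recommend(debugging))        # agent candidate
print(recommend(invoice_routing))  # workflow
```

The value of writing it this way is that the default is visible: anything short of three yes answers falls through to a cheaper pattern.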

STEP 2

The honest cost of choosing an agent.

Pick an agent and you are signing up for all of the following, permanently. None of them is a bug to be fixed later; they are intrinsic to letting a model direct itself:

You cannot fully test it. There is no finite set of paths. You move from "tests pass" to "the eval suite shows an N% success rate," and you ship knowing some percentage of runs will be wrong.
You cannot fully predict its cost. The same input can take 3 turns or 30. Budget caps bound the worst case; they don't make cost predictable.
You debug behaviors, not lines. A failure is "on turn 9 of a 14-turn trace it made a judgment I disagree with." There is no stack trace pointing at a bug; there is a transcript to read.
Its failures are quiet and plausible. A workflow fails loudly (an exception). An agent fails by doing something reasonable-looking that's subtly wrong and continuing confidently. Quiet-wrong is the expensive kind.
Its blast radius is its toolbox. Every write tool is something the agent can do wrong, autonomously, possibly while manipulated. Capability and risk are the same surface.

If those costs are acceptable for the value at stake, an agent is a great choice. If reading that list made you wince for your use case, that wince is information — listen to it.
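
Two of the costs above, unpredictable turn counts and success measured as a rate, can be illustrated with a stub. The sketch below assumes a fake agent whose required turn count is drawn at random between 3 and 30 (the range quoted above); `run_agent` and `evaluate` are hypothetical helpers, and no real model is called.

```python
import random


def run_agent(needed_turns: int, max_turns: int) -> tuple[bool, int]:
    """Stand-in for an agent loop: returns (finished, turns_used).

    `needed_turns` simulates how long this particular run happens to take;
    a real agent would not know it in advance.
    """
    turns = 0
    while turns < max_turns:
        turns += 1
        if turns >= needed_turns:
            return True, turns
    return False, turns  # budget exhausted before the task finished


def evaluate(runs: int, max_turns: int, seed: int = 0) -> tuple[float, int, int]:
    """Run the stub many times; report success rate and min/max turns used."""
    rng = random.Random(seed)
    results = [run_agent(rng.randint(3, 30), max_turns) for _ in range(runs)]
    successes = sum(1 for ok, _ in results if ok)
    turns = [t for _, t in results]
    return successes / runs, min(turns), max(turns)


rate, fewest, most = evaluate(runs=200, max_turns=20)
print(f"success rate {rate:.0%}, turns ranged {fewest}..{most}")
```

The cap guarantees that no run exceeds 20 turns, but the turns used still vary from run to run, and every run that hits the cap shows up as a failure in the success rate. That is the sense in which a budget bounds cost without making it predictable.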

STEP 3

The cheaper patterns that solve most "agent" problems.

Before building an agent, confirm none of these simpler patterns does the job. In practice the large majority of "we need an agent" problems are one of these in disguise, and these patterns are more reliable, cheaper, and easier to test:

A single well-prompted model call. If the task is "transform this input into that output" with no need to act on the world, you don't need a loop at all. Summarize, classify, extract, rewrite: one call.
A prompt chain / pipeline. A fixed sequence of model calls, each feeding the next. "Outline → draft → critique → revise." Deterministic order, fully testable, no autonomy.
A workflow with a router. A model classifies the input; hardcoded branches decide what runs. This handles the overwhelming majority of "intelligent automation" needs and is dramatically more controllable than an agent.
Retrieval-augmented generation. If the real need is "answer using our documents," that is RAG, not an agent. Adding a loop to a lookup problem adds cost and failure modes for no benefit.
An assistant (human in the loop). If the model proposing and a human approving is fast enough, you get most of the value with almost none of the risk. Many "agents" are better as a great assistant a human drives.
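
As an illustration of the router pattern in the list above, here is a minimal sketch. It assumes a `classify` function that would be a single model call in practice; a keyword match stands in for it so the example runs offline. The labels, handlers, and messages are all hypothetical.

```python
from typing import Callable


def classify(message: str) -> str:
    """Placeholder for a single model call that returns one label.

    A keyword match stands in for the model so the sketch runs offline.
    """
    text = message.lower()
    if "refund" in text:
        return "refund"
    if "password" in text:
        return "account"
    return "other"


def handle_refund(message: str) -> str:
    return "refund: open a refund ticket"


def handle_account(message: str) -> str:
    return "account: send password reset link"


def handle_other(message: str) -> str:
    return "other: forward to a human"


# Hardcoded branches: the model picks a label, code decides what runs.
ROUTES: dict[str, Callable[[str], str]] = {
    "refund": handle_refund,
    "account": handle_account,
    "other": handle_other,
}


def route(message: str) -> str:
    label = classify(message)
    handler = ROUTES.get(label, handle_other)  # unknown labels fall back safely
    return handler(message)


print(route("I want a refund for my order"))  # refund: open a refund ticket
print(route("I forgot my password"))          # account: send password reset link
```

The model's only freedom is choosing a label. Every branch is ordinary code that can be unit-tested, which is where the controllability comes from.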

The rule of thumb that survives contact with production: find the simplest pattern that could possibly work, build that, and escalate to an agent only when you have observed the simpler thing failing for a reason an agent specifically fixes. "We might need the flexibility later" is not that reason. Escalate on evidence, not on anticipation: the unpredictability you take on is permanent, while the flexibility you might never use costs nothing to add later.

STEP 4

Clear "do not use an agent" cases.

Some tasks are actively wrong for an agent, no matter how capable the model. Recognizing these on sight saves projects:

The task is fully specifiable. You can write the exact steps and they don't change. That is code (or a workflow). Wrapping deterministic logic in a non-deterministic model adds cost, latency, and a failure mode in exchange for nothing.
Every consequential action is irreversible and high-stakes. Actions like that need human approval, so if every write the agent would make needs approval anyway, you have not built an agent; you have built a slow assistant with extra steps. Just build the assistant.
You cannot define "done." From the previous entry: no checkable completion criterion means no reliable termination. The agent will run until the budget burns and you'll hope. Don't ship a hope.
The cost of a quiet, plausible error is catastrophic and unbounded. Some domains cannot tolerate "wrong but confident, and it already happened" at any rate. There, autonomy is the wrong objective; the right design keeps a human as a required gate, which by definition is not an agent.
The task is rare and low value. The engineering, evaluation, monitoring, and incident cost of a production agent is large and ongoing. A task run twice a month does not amortize it. Do it by hand or with a script.
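
The amortization argument in the last case can be made concrete with back-of-the-envelope arithmetic. The sketch below is an assumption-laden illustration: `months_to_break_even` is a hypothetical helper, and every number passed to it is invented for the example, not measured.

```python
from typing import Optional


def months_to_break_even(build_cost: float,
                         monthly_upkeep: float,
                         runs_per_month: float,
                         saving_per_run: float) -> Optional[float]:
    """Months until cumulative savings cover the build cost; None if never.

    `monthly_upkeep` stands for the ongoing evaluation, monitoring,
    and incident cost of keeping the agent in production.
    """
    monthly_net = runs_per_month * saving_per_run - monthly_upkeep
    if monthly_net <= 0:
        return None  # upkeep eats the savings; the agent never pays for itself
    return build_cost / monthly_net


# Made-up figures in arbitrary currency units.
print(months_to_break_even(19000, 500, runs_per_month=2, saving_per_run=50))    # None
print(months_to_break_even(19000, 500, runs_per_month=2000, saving_per_run=5))  # 2.0
```

With these invented figures, the task run twice a month never recovers its upkeep, while the frequent task pays back in two months even though each run saves far less.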

The honest closing, and the bridge to the final entry: an agent is a precision instrument for a specific shape of problem — open-ended path, real feedback, value that clears the unpredictability tax. Used there, nothing else comes close. Used as a default because the word is exciting, it is slower, costlier, less reliable, and more dangerous than the boring pattern it replaced. Knowing the difference is the actual expertise. The last entry, "the risks and limits of agents," is what you must accept even when the answer is correctly "yes, use an agent."