Plan-and-execute — decompose, then run.
Plan-and-execute commits to a multi-step plan up front, then executes each step, replanning only when reality diverges. It trades ReAct's per-step adaptivity for global coherence and cost predictability. This essay covers the planner/executor split, replanning strategies, and when the up-front plan becomes a liability.
The planner/executor split.
Plan-and-execute separates two concerns that ReAct fuses. A planner — usually one strong-model call — reads the task and emits an explicit, ordered list of subtasks. An executor then runs each subtask, often with a cheaper model or a ReAct loop scoped to that single step. The plan is a first-class artifact: you can log it, show it to a user for approval, diff it across runs, and resume from a failed step.
The lineage runs from least-to-most and decomposition prompting through Plan-and-Solve (Wang et al., 2023), which showed that explicitly generating a plan before solving reduces missing-step and calculation errors relative to zero-shot chain-of-thought. The production value is less about raw accuracy and more about control: a plan is a contract you can inspect before any irreversible action runs.
    # Planner output: a typed, inspectable artifact
    plan = [
        {"id": 1, "task": "fetch customer's last 3 invoices", "tool": "billing_api"},
        {"id": 2, "task": "compute total overdue amount", "deps": [1]},
        {"id": 3, "task": "draft a payment-reminder email", "deps": [2]},
    ]
Make the plan a typed object with explicit deps, not a prose list. Dependencies let the executor parallelize independent steps and let you render a clean approval UI. A prose plan has to be parsed back into structure before it can be validated or parallelized, and that parsing is itself error-prone.
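One way to make "typed, with explicit deps" concrete is to validate the planner's output before anything executes. The sketch below is an illustration under assumptions, not a prescribed schema: the `Step` dataclass, its field names, and `validate_plan` are hypothetical, and the cycle check leans on the standard library's `graphlib`.

```python
from dataclasses import dataclass
from graphlib import CycleError, TopologicalSorter
from typing import Optional


@dataclass(frozen=True)
class Step:
    # Hypothetical schema mirroring the dict-based plan above.
    id: int
    task: str
    tool: Optional[str] = None
    deps: tuple = ()  # ids of steps that must finish first


def validate_plan(steps):
    """Return a list of problems; an empty list means the plan is well-formed."""
    problems = []
    ids = [s.id for s in steps]
    if len(ids) != len(set(ids)):
        problems.append("duplicate step ids")
    known = set(ids)
    for s in steps:
        for d in s.deps:
            if d not in known:
                problems.append(f"step {s.id} depends on unknown step {d}")
    if not problems:
        # Only check for cycles once every reference is known to resolve.
        try:
            tuple(TopologicalSorter({s.id: s.deps for s in steps}).static_order())
        except CycleError:
            problems.append("dependency cycle")
    return problems


plan = [
    Step(1, "fetch customer's last 3 invoices", tool="billing_api"),
    Step(2, "compute total overdue amount", deps=(1,)),
    Step(3, "draft a payment-reminder email", deps=(2,)),
]
print(validate_plan(plan))
```

Rejecting a malformed plan at this point costs nothing; rejecting it after step 2 has already run may not be possible.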
The execution loop and the replanning question.
The naive executor walks the plan in dependency order and stops on first failure. The interesting engineering is what happens when a step fails or returns something the plan did not anticipate. This is the central design decision of the pattern, and there are three established strategies:
- Static plan (no replanning). Execute exactly the plan; on failure, abort and surface the partial result. Cheapest and most predictable. Correct only when the environment is stable and steps rarely fail.
- Replan-on-failure. When a step fails, feed the planner the original task, the plan, and the failure, and ask for a revised remaining plan. Bounded by a replan budget. The pragmatic default for most production systems.
- Plan-and-reflect (continuous). After every step, a lightweight check asks "does the remaining plan still make sense given what we just learned?" and replans if not. Most adaptive, closest to ReAct, highest cost.
The ReWOO / LLMCompiler line of work optimizes the static-plan end instead: ReWOO plans all tool calls up front with placeholder arguments, and LLMCompiler resolves the resulting argument-dependency graph so independent calls run in parallel.
    # Replan-on-failure executor
    def run_plan(task, plan, planner):
        replans_used = 0
        while plan.has_pending():
            step = plan.next_ready()  # deps satisfied
            result = execute(step)
            if result.ok:
                plan.mark_done(step, result)
            elif replans_used < REPLAN_BUDGET:
                # Ask the planner for a revised remaining plan, at most REPLAN_BUDGET times
                plan = planner.revise(task, plan, failed=step, why=result.error)
                replans_used += 1
            else:
                return partial(plan)  # fail loud, with what we have
        return plan
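The loop above leaves `plan`, `execute`, and the planner abstract. A minimal self-contained sketch, with stand-ins for all three, might look like the following. Everything here is an assumption made for illustration: the `Plan` and `Result` classes, the budget of 2, and the stub `revise` function, which swaps in a fallback task where a real system would make a model call.

```python
from dataclasses import dataclass, field

REPLAN_BUDGET = 2  # assumed value, chosen only for the demo


@dataclass
class Result:
    ok: bool
    value: object = None
    error: str = ""


@dataclass
class Plan:
    steps: list  # dicts with "id", "task", and optional "deps"
    done: dict = field(default_factory=dict)

    def has_pending(self):
        return any(s["id"] not in self.done for s in self.steps)

    def next_ready(self):
        # First unfinished step whose dependencies have all completed.
        for s in self.steps:
            if s["id"] not in self.done and all(d in self.done for d in s.get("deps", [])):
                return s
        raise RuntimeError("pending steps exist but none is ready")

    def mark_done(self, step, result):
        self.done[step["id"]] = result


def run(task, plan, execute, revise):
    replans_used = 0
    while plan.has_pending():
        step = plan.next_ready()
        result = execute(step)
        if result.ok:
            plan.mark_done(step, result)
        elif replans_used < REPLAN_BUDGET:
            plan = revise(task, plan, step, result.error)
            replans_used += 1
        else:
            return {"status": "partial", "done": plan.done, "replans": replans_used}
    return {"status": "complete", "done": plan.done, "replans": replans_used}


def execute(step):
    # Stub executor: one task name always fails, everything else succeeds.
    if step["task"] == "call flaky_api":
        return Result(False, error="timeout")
    return Result(True, value=f"did {step['task']}")


def revise(task, plan, failed, why):
    # Stub planner: replace the failed step with a fallback, keep completed work.
    steps = [dict(s, task="read cached copy") if s["id"] == failed["id"] else s
             for s in plan.steps]
    return Plan(steps, dict(plan.done))


plan = Plan([{"id": 1, "task": "call flaky_api"},
             {"id": 2, "task": "summarize", "deps": [1]}])
outcome = run("summarize the feed", plan, execute, revise)
print(outcome["status"], outcome["replans"])
```

Note that the revised plan carries over `done`, so completed steps are not re-executed after a replan.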
When plan-and-execute pays off.
- Multi-stage tasks with stable structure — ETL-style pipelines, onboarding flows, report generation — where the shape is knowable up front and coherence across steps matters more than per-step improvisation.
- Cost-sensitive workloads. One expensive planner call plus many cheap executor calls is usually much cheaper than a strong model re-deliberating at every step as in ReAct, provided replans stay rare.
- Human-in-the-loop / high-stakes actions. The explicit plan is a natural approval checkpoint before anything irreversible (sending money, mutating production) runs.
- Parallelizable work. A dependency graph lets independent steps run concurrently — a structural latency win ReAct cannot match because it discovers steps one at a time.
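The parallelism point can be sketched with the standard library alone. The example below is a simplified illustration, not how any particular framework schedules work: `run_in_waves` and `run_step` are hypothetical names, and the scheduler runs the plan in waves, which is slightly conservative because each wave waits for its slowest step.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_in_waves(deps, run_step, max_workers=4):
    """deps maps step id -> prerequisite ids. Returns (results, waves)."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the plan is cyclic
    results, waves = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = sorted(sorter.get_ready())  # steps whose deps are all done
            waves.append(ready)
            # Steps within a wave are independent, so run them concurrently.
            for step_id, output in zip(ready, pool.map(run_step, ready)):
                results[step_id] = output
                sorter.done(step_id)
    return results, waves


# Steps 1 and 2 are independent; 3 needs both; 4 needs 3.
deps = {1: [], 2: [], 3: [1, 2], 4: [3]}
results, waves = run_in_waves(deps, lambda step_id: step_id * 10)
print(waves)
```

Four steps finish in three waves here. A loop that discovers steps one at a time would need four sequential rounds.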
When it actively hurts.
The up-front plan is the pattern's strength and its core liability. Concrete failure modes:
- Plan-reality divergence. On genuinely open-ended tasks the planner cannot foresee the steps, so it emits a plausible-looking but wrong plan, and the executor faithfully marches off a cliff. Symptom: high replan rate. If you are replanning every step, you have reinvented ReAct with extra latency — switch patterns.
- Stale-plan execution. A static-plan executor keeps running steps whose premises were invalidated by an earlier step's surprising result. This is the most dangerous mode because it fails silently with a confident wrong output.
- Planner overconfidence on under-specified tasks. Vague inputs yield vague plans the executor cannot ground. Mitigation: a clarification step before planning, or a planner that is allowed to emit "insufficient information."
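One possible guard against stale-plan execution is to attach a premise to each step and check it against the results so far before running the step. This is a sketch of that idea under assumptions, not an established API: the `premise` field, `run_guarded`, and the invoice data are all invented for illustration.

```python
def run_guarded(steps, execute):
    """Run steps in order; halt as soon as a step's premise no longer holds."""
    results = {}
    for step in steps:
        premise = step.get("premise")
        if premise is not None and not premise(results):
            # Surface the invalidated premise instead of producing output.
            return {"status": "stale", "halted_at": step["id"], "results": results}
        results[step["id"]] = execute(step, results)
    return {"status": "complete", "halted_at": None, "results": results}


def execute(step, results):
    # Stub executor with made-up data: the only invoice is already paid.
    if step["id"] == 1:
        return [{"amount": 120, "paid": True}]
    if step["id"] == 2:
        return sum(inv["amount"] for inv in results[1] if not inv["paid"])
    return "Reminder: your account is overdue."


steps = [
    {"id": 1, "task": "fetch invoices"},
    {"id": 2, "task": "compute total overdue", "premise": lambda r: len(r[1]) > 0},
    {"id": 3, "task": "draft reminder email", "premise": lambda r: r[2] > 0},
]
outcome = run_guarded(steps, execute)
print(outcome["status"], outcome["halted_at"])
```

Without the premise on step 3, a static executor would draft a reminder for a customer who owes nothing, which is the silent, confident failure described above. With it, the run halts at step 3 and hands the decision back to the planner or a human.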
The diagnostic metric for this pattern is replan rate, the fraction of executed steps that trigger a replan. Near 0% means a static plan is fine. Moderate and bounded is healthy replan-on-failure. Consistently high means the task is not actually decomposable in advance and you are paying planning overhead for nothing: drop to ReAct.
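As a rough sketch of how this metric could be tracked, assume each run logs how many steps it executed and how many replans it used. The thresholds below are placeholders picked for the example, not recommended values; `replan_rate` and `recommend` are hypothetical helpers.

```python
def replan_rate(runs):
    """runs: list of (steps_executed, replans) pairs. Returns replans per executed step."""
    total_steps = sum(steps for steps, _ in runs)
    total_replans = sum(replans for _, replans in runs)
    return total_replans / total_steps if total_steps else 0.0


def recommend(rate, low=0.02, high=0.30):
    # low and high are assumed cutoffs; tune them against your own workload.
    if rate <= low:
        return "static plan is probably sufficient"
    if rate < high:
        return "replan-on-failure looks healthy"
    return "consider switching to ReAct"


runs = [(10, 0), (10, 1)]  # made-up log: 20 steps executed, 1 replan
rate = replan_rate(runs)
print(rate, recommend(rate))
```

Watching the trend over time is likely more informative than any single threshold: a rate that climbs after a change to the task mix suggests the planner's assumptions no longer hold.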
The honest tradeoff.
Plan-and-execute buys global coherence, cost predictability, parallelism, and an inspectable approval point in exchange for adaptivity and an extra failure surface (the plan can be wrong before a single step runs). Choose it when the task's structure is more knowable than its details. Choose ReAct when the details are discoverable only by acting. In practice many systems nest them: a planner produces the skeleton, and each executor step is itself a small bounded ReAct loop — coherence at the macro scale, adaptivity at the micro scale.
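The nested arrangement can be sketched as an outer walk over the plan with a small, bounded retry loop inside each step. This is an illustrative skeleton under assumptions: `bounded_react`, `run_nested`, and the `act` callback (which returns a success flag and an observation) are hypothetical stand-ins for model and tool calls.

```python
def bounded_react(step, act, max_turns=3):
    """Inner loop: act, observe, adjust; give up after max_turns."""
    observation = None
    for turn in range(1, max_turns + 1):
        ok, observation = act(step, observation)
        if ok:
            return {"ok": True, "turns": turn, "output": observation}
    return {"ok": False, "turns": max_turns, "output": observation}


def run_nested(plan, act, max_turns=3):
    """Outer loop: follow the planner's skeleton, one bounded inner loop per step."""
    outputs = {}
    for step in plan:
        outputs[step] = bounded_react(step, act, max_turns)
        if not outputs[step]["ok"]:
            break  # hand control back to the macro level (replan or abort)
    return outputs


def act(step, last_observation):
    # Stub: the query fails on its first attempt, then succeeds once the
    # previous observation is available to correct it.
    if step == "query db" and last_observation is None:
        return False, "syntax error near FROM"
    return True, f"{step}: done"


outputs = run_nested(["query db", "write report"], act)
print({step: out["turns"] for step, out in outputs.items()})
```

The `max_turns` bound is what keeps the inner loop from quietly turning the whole system back into unbounded ReAct.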