crewai - ✅(Solved) Fix [FEATURE] Adaptive Re-planning When Task Results Deviate from Plan [1 pull request, 1 participant]

GitHub stats
crewAIInc/crewAI#4983 (fetched 2026-04-08 01:06:58)

  • Comments: 0
  • Participants: 1
  • Timeline: 3
  • Reactions: 0
  • Timeline (top): cross-referenced ×1, labeled ×1, referenced ×1

Error Message

  • An API call returns an unexpected format or error

Fix Action

Fixed

PR fix notes

PR #4984: feat: add adaptive re-planning for crew task execution (#4983)

Description (problem / solution / changelog)

Summary

Adds optional adaptive re-planning to crew execution. When replan_on_failure=True and planning=True, the crew evaluates each completed task's result against the original plan via a lightweight LLM call. If significant deviation is detected, remaining tasks get revised plans.

New components:

  • ReplanningEvaluator (replanning_evaluator.py): LLM-based evaluator that produces a structured ReplanDecision (should_replan, reason, affected_task_numbers)
  • CrewPlanner._handle_crew_replanning(): generates revised PlannerTaskPydanticOutput for remaining tasks given completed results and deviation reason
  • Crew._maybe_replan(): hook called after each sync task in both _execute_tasks() and _aexecute_tasks()
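
The structured decision could be modelled roughly as follows. This is a minimal sketch that uses a standard-library dataclass as a stand-in: the PR describes ReplanDecision only by its three field names, so the field types, the defaults, and the parse_decision helper are assumptions rather than the PR's code.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ReplanDecision:
    """Stand-in for the structured output named in the PR (types are assumed)."""
    should_replan: bool
    reason: str = ""
    affected_task_numbers: list[int] = field(default_factory=list)


def parse_decision(raw: str) -> ReplanDecision:
    """Hypothetical helper: turn an LLM's JSON reply into a ReplanDecision.

    Falls back to "no replan" when the reply is not a JSON object, so a
    malformed evaluation never triggers a plan rewrite.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ReplanDecision(False, "unparseable evaluator output")
    if not isinstance(data, dict):
        return ReplanDecision(False, "unexpected evaluator output shape")
    return ReplanDecision(
        should_replan=bool(data.get("should_replan", False)),
        reason=str(data.get("reason", "")),
        affected_task_numbers=[int(n) for n in data.get("affected_task_numbers", [])],
    )
```

Defaulting to "no replan" on bad output is a design choice made for this sketch. It keeps the original plan in force whenever the evaluator's reply cannot be trusted.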

New Crew fields (backwards compatible — both default to off/safe values):

  • replan_on_failure: bool = False
  • max_replans: int = 3
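
The way these two fields gate re-planning can be sketched as a single guard function. The function name and the remaining_tasks check are assumptions made for illustration. The PR only states that _maybe_replan has guard conditions and a max-replan cap.

```python
def can_replan(
    planning: bool,
    replan_on_failure: bool,
    replan_count: int,
    max_replans: int = 3,
    remaining_tasks: int = 1,
) -> bool:
    """Return True only when every precondition for a replan attempt holds."""
    return (
        planning                           # a plan must exist to deviate from
        and replan_on_failure              # the feature is opt-in
        and replan_count < max_replans     # cap prevents runaway loops
        and remaining_tasks > 0            # nothing to revise after the last task
    )
```

With the defaults listed above (replan_on_failure=False), can_replan always returns False, which is what makes the feature backwards compatible.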

33 new tests covering evaluator, planner replanning, crew field defaults, _maybe_replan guard conditions, plan application, max-replan cap, and execution integration.

Updates since last revision

  • Fixed ruff lint error: added explicit strict=False to zip() call in _create_completed_tasks_summary
  • Applied ruff formatting to all changed source files

Review & Testing Checklist for Human

  • Blocking sync call inside async path: _maybe_replan calls ReplanningEvaluator.evaluate() which internally uses task.execute_sync(). In _aexecute_tasks, this will block the event loop. Verify whether this is acceptable or if an async variant is needed.
  • Plan text extraction correctness: Plan portion is extracted via task.description[len(original_desc):]. Confirm this holds up if task descriptions are modified between planning and execution (e.g., by _interpolate_inputs or other hooks).
  • LLM prompt quality: All tests mock LLM responses. Manually verify that the evaluator and replanning prompts produce reliable structured output (ReplanDecision / PlannerTaskPydanticOutput) with a real model.
  • Cost/latency implications: Each completed task triggers an extra LLM evaluation call. For N tasks, that's up to N-1 evaluation calls plus up to max_replans replanning calls. Consider whether documentation or a warning is needed.
  • End-to-end test: Create a small crew with planning=True, replan_on_failure=True, mock one task to return a clearly deviating result, and confirm the remaining task descriptions get updated with revised plans during a full kickoff().
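
For the first checklist item, one common way to keep a blocking call from stalling the event loop is to hand it to a worker thread with asyncio.to_thread. The sketch below is generic Python, not code from the PR: evaluate_sync is a hypothetical stand-in for the synchronous evaluator call, and the heartbeat task exists only to show that the loop keeps running.

```python
import asyncio
import time


def evaluate_sync(result: str) -> bool:
    """Stand-in for a blocking evaluator call (e.g. a synchronous LLM request)."""
    time.sleep(0.05)  # simulates network latency
    return "error" in result.lower()


async def evaluate_async(result: str) -> bool:
    """Run the blocking evaluator in a worker thread so the loop stays free."""
    return await asyncio.to_thread(evaluate_sync, result)


async def main() -> tuple[bool, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets control during the evaluation.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    needs_replan = await evaluate_async("API returned an ERROR payload")
    hb.cancel()
    return needs_replan, ticks


needs_replan, ticks = asyncio.run(main())
```

If evaluate_sync were called directly inside main, the heartbeat would not advance until the call returned. Whether the PR should adopt this pattern or a native async evaluator is the open question the checklist raises.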

Notes

  • _original_task_descriptions is only populated inside _handle_crew_planning(). If replan_on_failure=True but planning=False, the feature is inert (no plan text → early return). This is intentional but undocumented.
  • The CrewPlanner instance in _maybe_replan is constructed with self.tasks (all tasks), but _handle_crew_replanning only uses its explicit arguments—self.tasks on the planner object is unused during replanning.
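
The slice-based plan extraction mentioned in the checklist depends on the stored original description still being an exact prefix of the current one. The sketch below illustrates that assumption and one possible defensive check. Both function names are hypothetical, and extract_plan_checked is not something the PR is known to contain.

```python
from typing import Optional


def extract_plan(current: str, original: str) -> str:
    """Slice-based extraction as described in the checklist."""
    return current[len(original):]


def extract_plan_checked(current: str, original: str) -> Optional[str]:
    """Defensive variant (an assumption): refuse to slice on a prefix mismatch."""
    if not current.startswith(original):
        return None
    return current[len(original):]


original = "Research {topic} pricing."
plan_text = "\n1. Search vendor sites."
planned = original + plan_text

# Works while the description is untouched after planning.
clean = extract_plan(planned, original)

# If interpolation rewrites the description afterwards, the stored length no
# longer marks where the plan starts, so the plain slice is misaligned.
interpolated = planned.replace("{topic}", "cloud storage")
misaligned = extract_plan(interpolated, original)
checked = extract_plan_checked(interpolated, original)
```

Returning None on a mismatch lets the caller skip the evaluation instead of comparing a task result against a corrupted plan fragment.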

Link to Devin session: https://app.devin.ai/sessions/19e57a876625492f86b149047b0103fb

Changed files

  • lib/crewai/src/crewai/crew.py (modified, +114/-0)
  • lib/crewai/src/crewai/utilities/planning_handler.py (modified, +112/-0)
  • lib/crewai/src/crewai/utilities/replanning_evaluator.py (added, +162/-0)
  • lib/crewai/tests/utilities/test_replanning.py (added, +572/-0)

Code Example

crew = Crew(
    agents=[...],
    tasks=[...],
    planning=True,           # existing flag, required
    replan_on_failure=True,  # new flag
    max_replans=2,           # new flag, prevents infinite loops
)

Feature Area

Core functionality

Is your feature request related to an existing bug? Please link it here.

Problem

When planning=True, AgentPlanner generates a step-by-step plan before the crew starts executing — which is great. However, the plan is static: once generated, it is never updated regardless of what actually happens during execution.

This becomes a problem when a task returns results that contradict the plan's assumptions. For example:

  • A research task finds no data where the plan assumed data would be available
  • An API call returns an unexpected format or error
  • An early task reveals that a planned approach is infeasible

In all these cases, the remaining agents continue following the original (now incorrect) plan, leading to compounding errors and poor final outputs.

Describe the solution you'd like

Proposed Solution

Add an optional replan_on_failure flag to the Crew class that enables adaptive re-planning during execution.

New API (fully backwards compatible — defaults to False):

crew = Crew(
    agents=[...],
    tasks=[...],
    planning=True,           # existing flag, required
    replan_on_failure=True,  # new flag
    max_replans=2,           # new flag, prevents infinite loops
)

How it would work:

  1. After each task completes, a lightweight ReplanningEvaluator makes a structured LLM call asking: "Does this result deviate significantly from what the plan assumed?"
  2. If yes (returns ReplanDecision(should_replan=True, reason=..., affected_steps=[...])), AgentPlanner.replan() is called with: original goal + completed results so far + the deviation reason
  3. A revised plan is generated for remaining tasks only and injected into their descriptions
  4. Execution continues with the updated plan
  5. A replan_count counter prevents runaway loops (capped at max_replans)
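
The five steps above can be sketched as a small, framework-free loop. Everything here is illustrative: Task, Decision, and run_with_replanning are hypothetical names, the evaluator and planner are plain callables instead of LLM calls, and the revised plan is stored in a separate plan field, whereas the proposal injects it into the task description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Decision:
    should_replan: bool
    reason: str = ""


@dataclass
class Task:
    description: str
    run: Callable[[], str]
    plan: str = ""  # revised plan text, filled in when a replan happens


def run_with_replanning(tasks, evaluate, replan, max_replans=2):
    """Run tasks in order and revise the remaining ones when a result deviates.

    evaluate(result, task) -> Decision
    replan(completed_results, reason, remaining_tasks) -> list of new plan texts
    """
    results = []
    replan_count = 0
    for i, task in enumerate(tasks):
        result = task.run()
        results.append(result)
        remaining = tasks[i + 1:]
        # Steps 1 and 5: evaluate only while tasks remain and the cap allows.
        if not remaining or replan_count >= max_replans:
            continue
        decision = evaluate(result, task)
        if decision.should_replan:
            replan_count += 1
            # Steps 2 and 3: revise plans for the remaining tasks only.
            new_plans = replan(results, decision.reason, remaining)
            for later_task, new_plan in zip(remaining, new_plans):
                later_task.plan = new_plan
    return results, replan_count


tasks = [
    Task("find data", lambda: "no data found"),
    Task("analyse data", lambda: "ok"),
    Task("write report", lambda: "done"),
]
results, count = run_with_replanning(
    tasks,
    evaluate=lambda result, task: Decision("no data" in result, "research returned nothing"),
    replan=lambda done, reason, remaining: [f"revised: {reason}" for _ in remaining],
)
```

In this run only the first task deviates, so one replan occurs and both later tasks receive the revised plan text.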

Files I'd change:

  • src/crewai/utilities/replanning_evaluator.py — new file
  • src/crewai/utilities/planning_handler.py — add replan() method
  • src/crewai/crew.py — add fields + hook into _execute_tasks()
  • tests/utilities/test_replanning_evaluator.py — new tests
  • docs/core-concepts/Planning.mdx — new section

Why Backwards Compatible

replan_on_failure defaults to False. All existing crews with planning=True are completely unaffected unless they explicitly opt in.

Questions for Maintainers

  1. Does this direction align with your vision for the planning feature?
  2. Any preference on where the evaluator logic lives — separate class vs. inside AgentPlanner?
  3. Should the ReplanningEvaluator be pluggable (i.e. users can pass a custom evaluator)? I'd lean yes for flexibility.
  4. Any preference on the parameter names (replan_on_failure, max_replans)?

Happy to open a draft PR once direction is confirmed. Let me know if you'd like me to adjust the approach.

Describe alternatives you've considered

No response

Additional context

No response

Willingness to Contribute

Yes, I'd be happy to submit a pull request

extent analysis

Fix Plan

To implement the adaptive re-planning feature, follow these steps:

  • Create a new file replanning_evaluator.py with a ReplanningEvaluator class:
class ReplanningEvaluator:
    def evaluate(self, task_result, plan_assumptions):
        # Make a structured LLM call to determine if the result deviates from the plan
        # Return ReplanDecision(should_replan=True, reason=..., affected_steps=[...]) if deviation is significant
        pass
  • Add a replan method to the planner class in planning_handler.py (PR #4984 names this class CrewPlanner and implements the method as _handle_crew_replanning):
class CrewPlanner:
    def replan(self, original_goal, completed_results, deviation_reason):
        # Generate a revised plan for the remaining tasks only
        # and inject it into their descriptions
        pass
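  • Modify the Crew class in crew.py to include the replan_on_failure flag, the max_replans cap, and a replan_count counter:
class Crew:
    def __init__(self, agents, tasks, planning, replan_on_failure=False, max_replans=3):
        self.agents = agents
        self.tasks = tasks
        self.planning = planning
        self.replan_on_failure = replan_on_failure
        self.max_replans = max_replans  # PR #4984 uses 3 as the default
        self.replan_count = 0
        self.completed_results = []

    def _execute_tasks(self):
        # plan_assumptions, original_goal and planning_handler are placeholders
        # that the planning step is assumed to have set up beforehand.
        evaluator = ReplanningEvaluator()
        for task in self.tasks:
            task_result = task.execute()
            self.completed_results.append(task_result)
            # Re-planning needs planning=True and stops once the cap is reached
            if not (self.planning and self.replan_on_failure):
                continue
            if self.replan_count >= self.max_replans:
                continue
            decision = evaluator.evaluate(task_result, self.plan_assumptions)
            if decision.should_replan:
                self.replan_count += 1
                self.planning_handler.replan(self.original_goal, self.completed_results, decision.reason)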
  • Create new tests in test_replanning_evaluator.py to cover the ReplanningEvaluator class.

Verification

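To verify the fix, run a Crew with planning=True and replan_on_failure=True, and mock one task so that its result clearly deviates from the plan. Check that the descriptions of the remaining tasks are updated with a revised plan, that execution continues with it, and that no more than max_replans replans occur.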

Extra Tips

  • Consider making the ReplanningEvaluator pluggable to allow users to pass a custom evaluator.
  • Use logging to track re-planning events and decisions.
  • Review the max_replans counter to ensure it prevents infinite loops in case of repeated task failures.
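
The first two tips can be combined in a short sketch: a typing.Protocol defines what a pluggable evaluator must provide, and a wrapper logs each decision. The Evaluator protocol, KeywordEvaluator, and check_result are hypothetical names. The issue only asks whether a custom evaluator should be supported and does not define an interface.

```python
import logging
from typing import Protocol

logger = logging.getLogger("replanning")


class Evaluator(Protocol):
    """Anything with this method could be plugged in as an evaluator."""

    def evaluate(self, task_result: str, plan: str) -> bool: ...


class KeywordEvaluator:
    """Toy custom evaluator: flags results that contain a trigger keyword."""

    def __init__(self, keywords: tuple[str, ...] = ("error", "no data")) -> None:
        self.keywords = keywords

    def evaluate(self, task_result: str, plan: str) -> bool:
        text = task_result.lower()
        return any(keyword in text for keyword in self.keywords)


def check_result(evaluator: Evaluator, task_result: str, plan: str) -> bool:
    """Ask the evaluator for a decision and log it for later inspection."""
    should_replan = evaluator.evaluate(task_result, plan)
    logger.info("replan decision=%s for result=%r", should_replan, task_result[:80])
    return should_replan
```

A rule-based evaluator like this one avoids the extra LLM call per task, at the cost of missing deviations that its keywords do not cover.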
