crewai - ✅(Solved) Fix [FEATURE] Adaptive Re-planning When Task Results Deviate from Plan [1 pull request, 1 participant]

NithiN-1808 · 2026-03-20T11:50:37Z



GitHub stats

crewAIInc/crewAI#4983•Fetched 2026-04-08 01:06:58


Error Message

An API call returns an unexpected format or error

Fix Action

Proposed fix (PR open, not yet merged)

Addressed by PR #4984: feat: add adaptive re-planning for crew task execution (#4983) (https://github.com/crewAIInc/crewAI/pull/4984)

PR fix notes

PR #4984: feat: add adaptive re-planning for crew task execution (#4983)

Repository: crewAIInc/crewAI
Author: devin-ai-integration[bot]
State: open | merged: False
Link: https://github.com/crewAIInc/crewAI/pull/4984

Description (problem / solution / changelog)

Summary

Adds optional adaptive re-planning to crew execution. When replan_on_failure=True and planning=True, the crew evaluates each completed task's result against the original plan via a lightweight LLM call. If significant deviation is detected, remaining tasks get revised plans.

New components:

ReplanningEvaluator (replanning_evaluator.py): LLM-based evaluator that produces a structured ReplanDecision (should_replan, reason, affected_task_numbers)
CrewPlanner._handle_crew_replanning(): generates revised PlannerTaskPydanticOutput for remaining tasks given completed results and deviation reason
Crew._maybe_replan(): hook called after each sync task in both _execute_tasks() and _aexecute_tasks()
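
The PR summary lists the fields of ReplanDecision but does not show its definition. As a rough sketch of what such a structure could look like, the block below uses a plain dataclass as a stand-in; the `parse_decision` helper and the JSON reply shape are assumptions made for illustration and are not taken from the PR.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ReplanDecision:
    """Hypothetical stand-in; field names follow the PR summary."""
    should_replan: bool
    reason: str = ""
    affected_task_numbers: list[int] = field(default_factory=list)


def parse_decision(raw: str) -> ReplanDecision:
    """Parse an evaluator's JSON reply; fall back to 'no replan' on malformed output."""
    try:
        data = json.loads(raw)
        return ReplanDecision(
            should_replan=bool(data["should_replan"]),
            reason=str(data.get("reason", "")),
            affected_task_numbers=[int(n) for n in data.get("affected_task_numbers", [])],
        )
    except (ValueError, KeyError, TypeError):
        # Malformed or incomplete output is treated as "keep the current plan".
        return ReplanDecision(should_replan=False, reason="unparseable evaluator output")
```

Defaulting to "no replan" on unparseable output is one possible design choice: it keeps a flaky evaluator from triggering extra re-planning calls.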

New Crew fields (backwards compatible — both default to off/safe values):

replan_on_failure: bool = False
max_replans: int = 3

33 new tests covering evaluator, planner replanning, crew field defaults, _maybe_replan guard conditions, plan application, max-replan cap, and execution integration.

Updates since last revision

Fixed ruff lint error: added explicit strict=False to zip() call in _create_completed_tasks_summary
Applied ruff formatting to all changed source files

Review & Testing Checklist for Human

Blocking sync call inside async path: _maybe_replan calls ReplanningEvaluator.evaluate() which internally uses task.execute_sync(). In _aexecute_tasks, this will block the event loop. Verify whether this is acceptable or if an async variant is needed.
Plan text extraction correctness: Plan portion is extracted via task.description[len(original_desc):]. Confirm this holds up if task descriptions are modified between planning and execution (e.g., by _interpolate_inputs or other hooks).
LLM prompt quality: All tests mock LLM responses. Manually verify that the evaluator and replanning prompts produce reliable structured output (ReplanDecision / PlannerTaskPydanticOutput) with a real model.
Cost/latency implications: Each completed task triggers an extra LLM evaluation call. For N tasks, that's up to N-1 evaluation calls plus up to max_replans replanning calls. Consider whether documentation or a warning is needed.
End-to-end test: Create a small crew with planning=True, replan_on_failure=True, mock one task to return a clearly deviating result, and confirm the remaining task descriptions get updated with revised plans during a full kickoff().
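
On the first checklist item, one common way to keep a synchronous call from blocking an event loop is to hand it to a worker thread. The sketch below illustrates that general technique with `asyncio.to_thread`; `evaluate_blocking` is a hypothetical stand-in for the evaluator, not the PR's code, and whether this approach suits crewAI's async path is for the maintainers to judge.

```python
import asyncio
import time


def evaluate_blocking(result: str) -> bool:
    """Stand-in for a synchronous evaluator call (e.g. a blocking LLM request)."""
    time.sleep(0.05)
    return "no data" in result.lower()


async def evaluate_async(result: str) -> bool:
    """Run the blocking evaluator in a worker thread so the event loop stays responsive."""
    return await asyncio.to_thread(evaluate_blocking, result)


async def main() -> tuple[bool, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the loop gets to run while the evaluator is busy.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    should_replan = await evaluate_async("Search returned no data")
    beat.cancel()
    return should_replan, ticks
```

Running `asyncio.run(main())` shows the heartbeat advancing while the evaluator sleeps, which would not happen if `evaluate_blocking` were called directly inside the coroutine.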

Notes

_original_task_descriptions is only populated inside _handle_crew_planning(). If replan_on_failure=True but planning=False, the feature is inert (no plan text → early return). This is intentional but undocumented.
The CrewPlanner instance in _maybe_replan is constructed with self.tasks (all tasks), but _handle_crew_replanning only uses its explicit arguments—self.tasks on the planner object is unused during replanning.
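
The checklist's concern about extracting plan text by slicing can be made concrete. Slicing with `task.description[len(original_desc):]` only yields the plan if the original description is still an unchanged prefix. A minimal sketch of a guarded version follows; `extract_plan_text` is a hypothetical helper, not part of the PR.

```python
def extract_plan_text(current_description: str, original_description: str) -> str:
    """Return the plan text appended to a task description, or "" if none can be identified.

    Slicing by length alone assumes the original text is still an unchanged prefix;
    checking the prefix first avoids returning a misaligned fragment when it is not.
    """
    if not current_description.startswith(original_description):
        return ""
    return current_description[len(original_description):].strip()
```

Returning an empty string in the mismatch case matches the "no plan text, early return" behavior described in the notes, so a modified description would make the feature inert rather than feed a garbled plan to the evaluator.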

Link to Devin session: https://app.devin.ai/sessions/19e57a876625492f86b149047b0103fb

Changed files

lib/crewai/src/crewai/crew.py (modified, +114/-0)
lib/crewai/src/crewai/utilities/planning_handler.py (modified, +112/-0)
lib/crewai/src/crewai/utilities/replanning_evaluator.py (added, +162/-0)
lib/crewai/tests/utilities/test_replanning.py (added, +572/-0)

Code Example

crew = Crew(
    agents=[...],
    tasks=[...],
    planning=True,           # existing flag, required
    replan_on_failure=True,  # new flag
    max_replans=2,           # new flag, prevents infinite loops
)


Feature Area

Core functionality

Is your feature request related to an existing bug? Please link it here.

Problem

When planning=True, AgentPlanner generates a step-by-step plan before the crew starts executing — which is great. However, the plan is static: once generated, it is never updated regardless of what actually happens during execution.

This becomes a problem when a task returns results that contradict the plan's assumptions. For example:

A research task finds no data where the plan assumed data would be available
An API call returns an unexpected format or error
An early task reveals that a planned approach is infeasible

In all these cases, the remaining agents continue following the original (now incorrect) plan, leading to compounding errors and poor final outputs.

Describe the solution you'd like

Proposed Solution

Add an optional replan_on_failure flag to the Crew class that enables adaptive re-planning during execution.

New API (fully backwards compatible — defaults to False):

crew = Crew(
    agents=[...],
    tasks=[...],
    planning=True,           # existing flag, required
    replan_on_failure=True,  # new flag
    max_replans=2,           # new flag, prevents infinite loops
)

How it would work:

After each task completes, a lightweight ReplanningEvaluator makes a structured LLM call asking: "Does this result deviate significantly from what the plan assumed?"
If yes (returns ReplanDecision(should_replan=True, reason=..., affected_steps=[...])), AgentPlanner.replan() is called with: original goal + completed results so far + the deviation reason
A revised plan is generated for remaining tasks only and injected into their descriptions
Execution continues with the updated plan
A replan_count counter prevents runaway loops (capped at max_replans)
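
The steps above can be sketched as a small, framework-free loop. Everything here (`Decision`, `SimpleTask`, `ReplanLoop`, and the callable signatures) is hypothetical and only mirrors the proposed control flow; it does not use crewAI classes or real LLM calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Decision:
    should_replan: bool
    reason: str = ""


@dataclass
class SimpleTask:
    description: str
    run: Callable[[], str]  # produces the task's result


@dataclass
class ReplanLoop:
    tasks: list[SimpleTask]
    evaluate: Callable[[str, str], Decision]  # (result, plan) -> Decision
    replan: Callable[[list[str], str, int], list[str]]  # (results, reason, n_remaining) -> plans
    max_replans: int = 3
    replan_count: int = 0
    results: list[str] = field(default_factory=list)

    def execute(self) -> list[str]:
        for i, task in enumerate(self.tasks):
            result = task.run()
            self.results.append(result)
            remaining = self.tasks[i + 1:]
            # Nothing left to revise, or the cap is reached: skip the evaluation call.
            if not remaining or self.replan_count >= self.max_replans:
                continue
            decision = self.evaluate(result, task.description)
            if decision.should_replan:
                self.replan_count += 1
                new_plans = self.replan(self.results, decision.reason, len(remaining))
                # Only the remaining tasks receive revised plans.
                for later, plan in zip(remaining, new_plans):
                    later.description = plan
        return self.results
```

With an evaluator that flags results containing "no data" and `max_replans=1`, the first deviating task rewrites the descriptions of all later tasks once, and later deviations are ignored because the cap has been reached.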

Files I'd change:

src/crewai/utilities/replanning_evaluator.py — new file
src/crewai/utilities/planning_handler.py — add replan() method
src/crewai/crew.py — add fields + hook into _execute_tasks()
tests/utilities/test_replanning_evaluator.py — new tests
docs/core-concepts/Planning.mdx — new section

Why Backwards Compatible

replan_on_failure defaults to False. All existing crews with planning=True are completely unaffected unless they explicitly opt in.

Questions for Maintainers

Does this direction align with your vision for the planning feature?
Any preference on where the evaluator logic lives — separate class vs. inside AgentPlanner?
Should the ReplanningEvaluator be pluggable (i.e. users can pass a custom evaluator)? I'd lean yes for flexibility.
Any preference on the parameter names (replan_on_failure, max_replans)?

Happy to open a draft PR once direction is confirmed. Let me know if you'd like me to adjust the approach.

Describe alternatives you've considered

No response

Additional context

No response

Willingness to Contribute

Yes, I'd be happy to submit a pull request

extent analysis

Fix Plan

To implement the adaptive re-planning feature, follow these steps:

Create a new file replanning_evaluator.py with a ReplanningEvaluator class:

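from dataclasses import dataclass, field

@dataclass
class ReplanDecision:
    should_replan: bool
    reason: str = ""
    affected_steps: list = field(default_factory=list)

class ReplanningEvaluator:
    def evaluate(self, task_result, plan_assumptions) -> ReplanDecision:
        # Placeholder: make a structured LLM call that decides whether the result
        # deviates significantly from the plan, and return a ReplanDecision.
        raise NotImplementedError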

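Add a replan method to the CrewPlanner class in planning_handler.py (the PR implements this as CrewPlanner._handle_crew_replanning()):

class CrewPlanner:
    def replan(self, original_goal, completed_results, deviation_reason):
        """Generate a revised plan for the remaining tasks only."""
        # Placeholder: build a prompt from the goal, the results so far, and the
        # deviation reason, then return revised plan text for each remaining task.
        raise NotImplementedError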

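Modify the Crew class in crew.py to include the replan_on_failure flag, the max_replans cap, and a replan_count counter:

class Crew:
    def __init__(self, agents, tasks, planning=False, replan_on_failure=False, max_replans=3):
        self.agents = agents
        self.tasks = tasks
        self.planning = planning
        self.replan_on_failure = replan_on_failure
        self.max_replans = max_replans  # the PR defaults this to 3
        self.replan_count = 0
        # Placeholders: assumed to be populated by the planning step (not shown).
        self.planner = None
        self.original_goal = None
        self.plan_assumptions = None

    def _execute_tasks(self):
        completed_results = []
        evaluator = ReplanningEvaluator()
        for task in self.tasks:
            task_result = task.execute_sync()
            completed_results.append(task_result)
            # Re-planning is inert unless planning is enabled as well.
            if not (self.planning and self.replan_on_failure):
                continue
            # Skip the extra LLM call once the cap is reached.
            if self.replan_count >= self.max_replans:
                continue
            decision = evaluator.evaluate(task_result, self.plan_assumptions)
            if decision.should_replan:
                self.replan_count += 1
                self.planner.replan(self.original_goal, completed_results, decision.reason)
        return completed_results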

Create new tests in test_replanning_evaluator.py to cover the ReplanningEvaluator class.

Verification

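To verify the fix, build a small crew with planning=True and replan_on_failure=True, mock one task so that it returns a result that clearly deviates from the plan, and run a full kickoff(). Check that the descriptions of the remaining tasks are updated with the revised plan and that re-planning stops once max_replans is reached.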

Extra Tips

Consider making the ReplanningEvaluator pluggable to allow users to pass a custom evaluator.
Use logging to track re-planning events and decisions.
Review the max_replans counter to ensure it prevents infinite loops in case of repeated task failures.
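
The tips above (a pluggable evaluator, logging, and a replan cap) can be combined in a short sketch. The `Evaluator` protocol, `KeywordEvaluator`, and `should_replan` are hypothetical names chosen for illustration; a real evaluator would presumably call an LLM instead of matching keywords.

```python
import logging
from typing import Protocol

logger = logging.getLogger("replanning_sketch")


class Evaluator(Protocol):
    """Any object with this method can be plugged in as an evaluator."""

    def evaluate(self, task_result: str, plan_text: str) -> bool: ...


class KeywordEvaluator:
    """Toy evaluator: flags a deviation when the result contains a trigger phrase."""

    def __init__(self, triggers: tuple[str, ...] = ("no data", "error")) -> None:
        self.triggers = triggers

    def evaluate(self, task_result: str, plan_text: str) -> bool:
        lowered = task_result.lower()
        return any(t in lowered for t in self.triggers)


def should_replan(evaluator: Evaluator, task_result: str, plan_text: str,
                  replan_count: int, max_replans: int) -> bool:
    """Apply the replan cap, delegate to the evaluator, and log the outcome."""
    if replan_count >= max_replans:
        logger.info("Replan cap reached (%d/%d); skipping evaluation", replan_count, max_replans)
        return False
    deviates = evaluator.evaluate(task_result, plan_text)
    logger.info("Evaluator decision: replan=%s (count=%d)", deviates, replan_count)
    return deviates
```

Because `Evaluator` is a structural protocol, a custom evaluator only needs a matching `evaluate` method and does not have to inherit from any base class.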
