fix: prevent stale duplicate planning tasks from blocking phase completion
Three fixes for phases getting stuck when a detail task crashes and is retried:

1. detailPhase mutation (architect.ts): clean up orphaned pending/in_progress detail tasks before creating new ones, preventing duplicates at the source
2. orchestrator recovery: detect and complete stale duplicate planning tasks (same category+phase, one completed, one pending)
3. ensureBranch: catch "already exists" TOCTOU race instead of blocking phase
@@ -149,8 +149,11 @@ When an agent crashes (`agent:crashed` event), the orchestrator automatically re
On server restart, `recoverDispatchQueues()` also recovers:
- Stuck `in_progress` tasks whose agents are dead (status is not `running` or `waiting_for_input`) — reset to `pending` and re-queued
- Erroneously `blocked` tasks whose agents completed successfully (status is `idle` or `stopped`) — marked `completed` so the phase can progress. This handles the legacy case where conflict resolution incorrectly blocked already-completed tasks.
- Stale duplicate planning tasks — if a phase has both a completed and a pending task of the same planning category (e.g. two `detail` tasks from a crash-and-retry), the pending one is marked `completed` with summary "Superseded by retry"
- Fully-completed `in_progress` phases — after task recovery, if all tasks in an `in_progress` phase are completed, triggers `handlePhaseAllTasksDone` to complete/review the phase
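
The stale-duplicate rule in the list above can be sketched as a pure function. This is a minimal illustration, not the orchestrator's actual code: the `PlanningTask` shape, the `PLANNING_CATEGORIES` set, and the name `supersedeStaleDuplicates` are all assumptions made for the example. Only the `detail` category and the "Superseded by retry" summary come from the text.

```typescript
// Hypothetical task shape; the real schema in the repository may differ.
type PlanningTaskStatus = "pending" | "in_progress" | "completed" | "blocked";

interface PlanningTask {
  id: string;
  phaseId: string;
  category: string; // e.g. "detail"
  status: PlanningTaskStatus;
  summary?: string;
}

// Assumed set of planning categories; only "detail" is named in the source.
const PLANNING_CATEGORIES = new Set<string>(["detail"]);

// Key that identifies "same phase, same category".
const groupKey = (t: PlanningTask): string =>
  JSON.stringify([t.phaseId, t.category]);

// Returns a new list in which every pending planning task that has a
// completed sibling (same phase and category) is marked completed.
function supersedeStaleDuplicates(tasks: PlanningTask[]): PlanningTask[] {
  const completedGroups = new Set<string>();
  for (const t of tasks) {
    if (t.status === "completed" && PLANNING_CATEGORIES.has(t.category)) {
      completedGroups.add(groupKey(t));
    }
  }
  return tasks.map((t) =>
    t.status === "pending" &&
    PLANNING_CATEGORIES.has(t.category) &&
    completedGroups.has(groupKey(t))
      ? { ...t, status: "completed" as const, summary: "Superseded by retry" }
      : t,
  );
}
```

Pending tasks without a completed sibling, and tasks outside the planning categories, pass through unchanged, so the sketch cannot complete real outstanding work.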
The `detailPhase` mutation in `architect.ts` also cleans up orphaned pending/in_progress detail tasks before creating new ones, preventing duplicates at the source.
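
A rough sketch of that clean-up-then-create ordering follows. The in-memory `InMemoryTaskStore` and the function `createDetailTask` are hypothetical stand-ins, and the sketch assumes "clean up" means deleting the orphaned rows; the real mutation in `architect.ts` may cancel or update them instead.

```typescript
interface DetailTask {
  id: string;
  phaseId: string;
  category: string;
  status: "pending" | "in_progress" | "completed" | "blocked";
}

// Hypothetical in-memory store standing in for the real task repository.
class InMemoryTaskStore {
  private tasks = new Map<string, DetailTask>();
  private nextId = 1;

  // Returns a snapshot array, so callers may delete while iterating it.
  list(phaseId: string): DetailTask[] {
    return [...this.tasks.values()].filter((t) => t.phaseId === phaseId);
  }

  add(task: DetailTask): void {
    this.tasks.set(task.id, task);
  }

  delete(id: string): void {
    this.tasks.delete(id);
  }

  create(phaseId: string, category: string): DetailTask {
    const task: DetailTask = {
      id: `task-${this.nextId++}`,
      phaseId,
      category,
      status: "pending",
    };
    this.tasks.set(task.id, task);
    return task;
  }
}

// Removes orphaned pending/in_progress detail tasks for the phase, then
// creates the new detail task, so a retry never leaves two live duplicates.
function createDetailTask(store: InMemoryTaskStore, phaseId: string): DetailTask {
  for (const t of store.list(phaseId)) {
    const orphaned = t.status === "pending" || t.status === "in_progress";
    if (t.category === "detail" && orphaned) {
      store.delete(t.id);
    }
  }
  return store.create(phaseId, "detail");
}
```

Completed detail tasks are left alone, so earlier results survive the retry.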
Manual retry via `retryBlockedTask()` resets `retryCount` to 0, giving the task a fresh set of automatic retries.
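
To illustrate why resetting the counter matters, here is a small sketch of the interaction between an automatic retry budget and a manual retry. The `RetryableTask` shape, the `MAX_AUTO_RETRIES` value, and both function names are assumptions for the example; the source states only that `retryBlockedTask()` resets `retryCount` to 0.

```typescript
interface RetryableTask {
  id: string;
  status: "pending" | "blocked";
  retryCount: number;
}

// Assumed budget; the real limit is not stated in the source.
const MAX_AUTO_RETRIES = 3;

// An automatic retry is allowed only while the budget is not exhausted.
function canAutoRetry(task: RetryableTask): boolean {
  return task.retryCount < MAX_AUTO_RETRIES;
}

// Sketch of a manual retry: unblock the task and reset its counter so the
// automatic retry budget starts over.
function manualRetrySketch(task: RetryableTask): RetryableTask {
  if (task.status !== "blocked") {
    return task; // nothing to do for tasks that are not blocked
  }
  return { ...task, status: "pending", retryCount: 0 };
}
```

Without the reset, a task that had already used its budget would be blocked again on its next crash.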
### Coalesced Scheduling