fix: Prevent lost task completions after server restart

Fixes three bugs that caused empty phase diffs when the server restarted
during agent execution:

1. Startup ordering race: reconcileAfterRestart() emitted agent:stopped
   before the orchestrator registered its listeners, so the events were
   lost. Moved reconciliation to after orchestrator.start().

2. Stuck in_progress tasks: recoverDispatchQueues() only re-queued
   pending tasks. Added recovery for in_progress tasks whose agents
   are dead (status is neither running nor waiting_for_input).

3. Branch force-reset destroys work: git branch -f wiped commits when
   a second agent was dispatched for the same task. Now checks if the
   branch has commits beyond baseBranch before resetting.
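
The ordering race in item 1 can be illustrated with a minimal sketch. It
assumes a synchronous event bus that does not buffer events for late
subscribers. EventBus, startOrchestrator, bootOldOrder and bootFixedOrder
are hypothetical stand-ins, not the project's actual code; only
reconcileAfterRestart and the agent:stopped event name come from this
commit message.

```typescript
// Hypothetical minimal event bus; the real project likely has its own emitter.
type Listener = (agentId: string) => void;

class EventBus {
  private listeners = new Map<string, Listener[]>();

  on(event: string, listener: Listener): void {
    const list = this.listeners.get(event) ?? [];
    list.push(listener);
    this.listeners.set(event, list);
  }

  // Delivers synchronously to the listeners registered at this moment.
  // Nothing is buffered for listeners that subscribe later.
  emit(event: string, agentId: string): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(agentId);
    }
  }
}

// Stand-in for orchestrator.start(): registers the agent:stopped listener.
function startOrchestrator(bus: EventBus, handled: string[]): void {
  bus.on("agent:stopped", (agentId) => handled.push(agentId));
}

// Stand-in for reconcileAfterRestart(): announces agents found dead on boot.
function reconcileAfterRestart(bus: EventBus, deadAgentIds: string[]): void {
  for (const id of deadAgentIds) {
    bus.emit("agent:stopped", id);
  }
}

// Old order: reconcile first, so the event fires with no listener attached.
function bootOldOrder(deadAgentIds: string[]): string[] {
  const bus = new EventBus();
  const handled: string[] = [];
  reconcileAfterRestart(bus, deadAgentIds);
  startOrchestrator(bus, handled);
  return handled;
}

// Fixed order: register listeners first, then reconcile.
function bootFixedOrder(deadAgentIds: string[]): string[] {
  const bus = new EventBus();
  const handled: string[] = [];
  startOrchestrator(bus, handled);
  reconcileAfterRestart(bus, deadAgentIds);
  return handled;
}
```

In this sketch the old order handles no agent:stopped events, while the
fixed order handles every one, which is the behavior the reordering in
item 1 is meant to restore.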

Also adds:
- agent:crashed handler with auto-retry (MAX_TASK_RETRIES=3)
- retryCount column on tasks table + migration
- retryCount reset on manual retryBlockedTask()
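
A rough sketch of how the retry counter might behave, under stated
assumptions. Only MAX_TASK_RETRIES=3, retryCount and retryBlockedTask()
come from this commit message. The Task shape, the onAgentCrashed name,
the status values, and the exact policy (re-queue until the counter
reaches the limit, then block) are assumptions made for illustration.

```typescript
const MAX_TASK_RETRIES = 3; // value taken from the commit message

// Assumed status values; the real schema may differ.
type TaskStatus = "pending" | "in_progress" | "blocked";

// Assumed task shape; only retryCount is named in the commit.
interface Task {
  id: string;
  status: TaskStatus;
  retryCount: number;
}

// Assumed policy for the agent:crashed handler: re-queue the task and bump
// retryCount until MAX_TASK_RETRIES is reached, then mark the task blocked.
function onAgentCrashed(task: Task): Task {
  if (task.retryCount >= MAX_TASK_RETRIES) {
    return { ...task, status: "blocked" };
  }
  return { ...task, status: "pending", retryCount: task.retryCount + 1 };
}

// A manual retry re-queues the task and resets the counter, so the task
// gets a fresh budget of automatic retries.
function retryBlockedTask(task: Task): Task {
  return { ...task, status: "pending", retryCount: 0 };
}
```

Resetting the counter on manual retry means a human decision to try again
is not immediately undone by an exhausted automatic budget.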
Lukas May
2026-03-06 12:19:59 +01:00
parent a69527b7d6
commit eac03862e3
9 changed files with 94 additions and 13 deletions

@@ -0,0 +1 @@
ALTER TABLE tasks ADD COLUMN retry_count integer NOT NULL DEFAULT 0;

@@ -239,6 +239,13 @@
"when": 1772409600000,
"tag": "0033_drop_approval_columns",
"breakpoints": true
},
{
"idx": 34,
"version": "6",
"when": 1772496000000,
"tag": "0034_add_task_retry_count",
"breakpoints": true
}
]
}