Commit Graph

6 Commits

Author SHA1 Message Date
Lukas May
fab7706f5c feat: Phase schema refactor, agent lifecycle module, and log chunks
Phase model changes:
- Drop `number` column (ordering now by createdAt + dependency DAG)
- Replace `description` (plain text) with `content` (Tiptap JSON)
- Add `approved` status as dispatch gate
- Add phase dependency management (list, remove, dependents)
- Approval gate in PhaseDispatchManager.queuePhase()
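
One way to read "ordering now by createdAt + dependency DAG" is as a topological order with createdAt as the tie-breaker. The sketch below assumes that reading; the `Phase` shape, the `dependsOn` field, and `orderPhases` are hypothetical names, not taken from the codebase.

```typescript
// Hypothetical phase shape: only the fields needed for ordering.
interface Phase {
  id: string;
  createdAt: number; // epoch millis, assumed
  dependsOn: string[]; // ids of phases that must come first, assumed
}

// Repeatedly pick the earliest-created phase whose dependencies are all placed.
// Dependencies on unknown ids are ignored (an assumption of this sketch).
function orderPhases(phases: Phase[]): Phase[] {
  const known = new Set<string>();
  phases.forEach((p) => known.add(p.id));

  const placed = new Set<string>();
  const ordered: Phase[] = [];
  const remaining = phases.slice().sort((a, b) => a.createdAt - b.createdAt);

  while (remaining.length > 0) {
    const idx = remaining.findIndex((p) =>
      p.dependsOn.every((dep) => placed.has(dep) || !known.has(dep)),
    );
    if (idx === -1) {
      throw new Error("dependency cycle detected");
    }
    const next = remaining.splice(idx, 1)[0];
    placed.add(next.id);
    ordered.push(next);
  }
  return ordered;
}
```

This is quadratic in the number of phases, which is likely acceptable for small phase lists; a queue-based Kahn's algorithm would scale better.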

Agent log chunks:
- New `agent_log_chunks` table for DB-first output persistence
- LogChunkRepository port + DrizzleLogChunkRepository adapter
- FileTailer onRawContent callback streams chunks to DB
- getAgentOutput reads from DB first, falls back to file

Agent lifecycle module (src/agent/lifecycle/):
- SignalManager: atomic signal.json read/write/wait operations
- RetryPolicy: exponential backoff with error-specific strategies
- ErrorAnalyzer: pattern-based error classification
- CleanupStrategy: debug archival vs production cleanup
- AgentLifecycleController: orchestrates retry/recovery flow
- Missing signal recovery with instruction injection
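
As a rough illustration of "exponential backoff with error-specific strategies", the sketch below picks a strategy per error category and doubles the delay on each attempt. The error categories, the numbers, and `nextDelayMs` are assumptions for illustration; the commit does not list the real values or the RetryPolicy API.

```typescript
// Hypothetical error categories; the real ErrorAnalyzer output is not shown in the commit.
type ErrorKind = "rate_limit" | "auth" | "transient";

interface RetryStrategy {
  maxAttempts: number; // total attempts allowed, including the first
  baseDelayMs: number; // delay before the first retry
  maxDelayMs: number; // cap on the backoff delay
}

// Illustrative values only.
const STRATEGIES: Record<ErrorKind, RetryStrategy> = {
  rate_limit: { maxAttempts: 5, baseDelayMs: 2000, maxDelayMs: 60000 },
  auth: { maxAttempts: 1, baseDelayMs: 0, maxDelayMs: 0 }, // retrying will not help
  transient: { maxAttempts: 3, baseDelayMs: 500, maxDelayMs: 10000 },
};

// `attemptsMade` is the number of attempts that have already failed (>= 1).
// Returns the delay before the next attempt, or null when retries are exhausted.
function nextDelayMs(kind: ErrorKind, attemptsMade: number): number | null {
  const strategy = STRATEGIES[kind];
  if (attemptsMade >= strategy.maxAttempts) {
    return null;
  }
  const delay = strategy.baseDelayMs * 2 ** (attemptsMade - 1);
  return Math.min(delay, strategy.maxDelayMs);
}
```

Under these assumed values, a transient error waits 500 ms and then 1000 ms before giving up, while an auth error is never retried.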

Completion detection fixes:
- Read signal.json file instead of parsing stdout as JSON
- Cancellable pollForCompletion with { cancel } handle
- Centralized state cleanup via cleanupAgentState()
- Credential handler consolidation (prepareProcessEnv)

Prompts refactor:
- Split monolithic prompts.ts into per-mode modules
- Add workspace layout section to agent prompts
- Fix markdown-to-tiptap double-serialization

Server/tRPC:
- Subscription heartbeat (30s) and bounded queue (1000 max)
- Phase CRUD: approvePhase, deletePhase, dependency queries
- Page: findByIds, getPageUpdatedAtMap
- Wire new repositories through container and context
2026-02-09 22:33:28 +01:00
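
The "bounded queue (1000 max)" for subscriptions could look roughly like the sketch below. The commit does not say what happens on overflow; dropping the oldest item is an assumption here, and `BoundedQueue` is a hypothetical name.

```typescript
// Minimal bounded FIFO queue. Overflow policy (drop oldest) is an assumption.
class BoundedQueue<T> {
  private readonly items: T[] = [];
  private dropped = 0;
  private readonly maxSize: number;

  constructor(maxSize: number = 1000) {
    this.maxSize = maxSize;
  }

  // Adds an item, evicting the oldest one when the queue is full.
  push(item: T): void {
    if (this.items.length >= this.maxSize) {
      this.items.shift();
      this.dropped++;
    }
    this.items.push(item);
  }

  // Removes and returns the oldest item, or undefined when empty.
  shift(): T | undefined {
    return this.items.shift();
  }

  get size(): number {
    return this.items.length;
  }

  // Number of items evicted so far, useful for logging slow consumers.
  get droppedCount(): number {
    return this.dropped;
  }
}
```

Bounding the queue keeps a slow or disconnected subscriber from growing server memory without limit.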
Lukas May
43e2c8b0ba fix(agent): Eliminate race condition in completion handling
PROBLEM:
- Agents completing with questions were incorrectly marked as "crashed"
- Race condition: polling handler AND crash handler both called handleCompletion()
- Caused database corruption and lost pending questions

SOLUTION:
- Add completion mutex in OutputHandler to prevent concurrent processing
- Remove duplicate completion call from crash handler
- Only one handler executes completion logic per agent
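
A per-agent completion mutex along these lines would cover the three properties listed under TESTING. This is a minimal sketch, not the OutputHandler code: `CompletionMutex`, `runExclusive`, and `isLocked` are hypothetical names, and skipping (rather than queueing) the second caller is an assumption.

```typescript
// Per-agent lock: at most one completion handler runs per agent id.
class CompletionMutex {
  private readonly locked = new Set<string>();

  isLocked(agentId: string): boolean {
    return this.locked.has(agentId);
  }

  // Runs `handler` unless another handler is already processing this agent.
  // Resolves to true if the handler ran, false if it was skipped.
  async runExclusive(
    agentId: string,
    handler: () => Promise<void>,
  ): Promise<boolean> {
    if (this.locked.has(agentId)) {
      return false; // a concurrent handler already owns completion
    }
    this.locked.add(agentId);
    try {
      await handler();
      return true;
    } finally {
      // Release the lock even when the handler throws.
      this.locked.delete(agentId);
    }
  }
}
```

Because the check and the `add` happen synchronously before the first `await`, two handlers started in the same tick cannot both acquire the lock for the same agent.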

TESTING:
- Added mutex-completion.test.ts with 4 test cases
- Verified mutex prevents concurrent access
- Verified lock cleanup on exceptions
- Verified different agents can process concurrently

FIXES: residential-cuckoo and 12+ other agents stuck in crashed state
2026-02-08 15:51:32 +01:00
Lukas May
3b24cf2c9d feat(01.1-03): add event emission to ProcessManager
- ProcessManager accepts optional eventBus parameter
- Emit ProcessSpawned event after successful spawn
- Emit ProcessStopped event on normal exit (code 0)
- Emit ProcessCrashed event on non-zero exit with signal
- Add 4 tests verifying event emission behavior
- Backwards compatible: events only emitted if eventBus provided
2026-01-30 14:03:45 +01:00
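
The event emission described in this commit might be wired roughly as follows. The event names come from the commit message; the payload fields, the `EventBus` interface, and `ProcessEventReporter` are assumptions made for this sketch.

```typescript
// Event names are from the commit; payload fields are assumed.
type ProcessEvent =
  | { type: "ProcessSpawned"; processId: string; pid: number }
  | { type: "ProcessStopped"; processId: string; exitCode: number }
  | {
      type: "ProcessCrashed";
      processId: string;
      exitCode: number | null;
      signal: string | null;
    };

// Hypothetical minimal bus interface.
interface EventBus {
  emit(event: ProcessEvent): void;
}

class ProcessEventReporter {
  private readonly eventBus: EventBus | undefined;

  // The bus is optional so existing callers keep working without events.
  constructor(eventBus?: EventBus) {
    this.eventBus = eventBus;
  }

  onSpawn(processId: string, pid: number): void {
    this.eventBus?.emit({ type: "ProcessSpawned", processId, pid });
  }

  onExit(processId: string, exitCode: number | null, signal: string | null): void {
    if (exitCode === 0) {
      this.eventBus?.emit({ type: "ProcessStopped", processId, exitCode });
    } else {
      // Non-zero exit code, or no exit code because a signal ended the process.
      this.eventBus?.emit({ type: "ProcessCrashed", processId, exitCode, signal });
    }
  }
}
```

When no bus is passed, every call is a no-op, which matches the backwards-compatibility note in the commit.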
Lukas May
b556c10a69 test(01.1-03): add unit tests for ProcessRegistry and ProcessManager
- ProcessRegistry: 15 tests covering register, get, getAll, updateStatus, unregister, getByPid, clear
- ProcessManager: 16 tests covering spawn, stop, stopAll, restart, isRunning
- Mock execa module to avoid spawning real processes
- Test exit handler behavior for both normal exit and crash scenarios
2026-01-30 14:02:19 +01:00
Lukas May
2f3df1d529 feat(01-03): create process manager with spawn/stop
- ProcessManager class with execa for child process spawning
- spawn() starts detached background processes
- stop() graceful shutdown with SIGTERM then SIGKILL after 5s timeout
- stopAll() terminates all managed processes
- restart() stops and respawns with same config
- isRunning() probes actual process state
- Proper promise handling for killed processes
2026-01-30 13:15:31 +01:00
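
The stop() behavior (SIGTERM, then SIGKILL after a 5s timeout) can be sketched independently of execa. The `ManagedProcess` interface and the injected `Schedule` function are hypothetical; in real code the scheduler would wrap `setTimeout`/`clearTimeout` and the process would be the child process handle.

```typescript
// Hypothetical minimal view of a child process.
interface ManagedProcess {
  kill(signal: "SIGTERM" | "SIGKILL"): void;
  onExit(listener: () => void): void;
}

// Schedules `fn` after `ms` milliseconds and returns a cancel function.
// In production this would wrap setTimeout/clearTimeout.
type Schedule = (fn: () => void, ms: number) => () => void;

// Ask the process to terminate, and force-kill it if it is still alive
// once the timeout elapses.
function stopGracefully(
  proc: ManagedProcess,
  schedule: Schedule,
  timeoutMs: number = 5000,
): void {
  let exited = false;

  const cancelForceKill = schedule(() => {
    if (!exited) {
      proc.kill("SIGKILL");
    }
  }, timeoutMs);

  proc.onExit(() => {
    exited = true;
    cancelForceKill(); // no need to escalate once the process is gone
  });

  proc.kill("SIGTERM");
}
```

Injecting the scheduler keeps the escalation logic testable without waiting five real seconds.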
Lukas May
40a66175a2 feat(01-03): create process types and registry
- ProcessInfo interface for tracking process metadata
- SpawnOptions interface for spawn configuration
- ProcessRegistry class with Map-based storage
- CRUD operations: register, unregister, get, getAll, getByPid, clear
- Additional helpers: updateStatus, size getter
2026-01-30 13:13:06 +01:00
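
A Map-based registry with the operations named in this commit might look like the sketch below. The method names follow the commit message; the fields of `ProcessInfo`, the status values, and keying the map by `id` are assumptions.

```typescript
// Assumed status values and fields; the commit only names the interface.
type ProcessStatus = "running" | "stopped" | "crashed";

interface ProcessInfo {
  id: string;
  pid: number;
  command: string;
  status: ProcessStatus;
}

class ProcessRegistry {
  // Keyed by process id (assumed).
  private readonly processes = new Map<string, ProcessInfo>();

  register(info: ProcessInfo): void {
    this.processes.set(info.id, info);
  }

  unregister(id: string): boolean {
    return this.processes.delete(id);
  }

  get(id: string): ProcessInfo | undefined {
    return this.processes.get(id);
  }

  getAll(): ProcessInfo[] {
    const all: ProcessInfo[] = [];
    this.processes.forEach((info) => all.push(info));
    return all;
  }

  // Linear scan; a second Map keyed by pid would avoid it if lookups are hot.
  getByPid(pid: number): ProcessInfo | undefined {
    let found: ProcessInfo | undefined;
    this.processes.forEach((info) => {
      if (found === undefined && info.pid === pid) {
        found = info;
      }
    });
    return found;
  }

  // Returns false when the id is unknown.
  updateStatus(id: string, status: ProcessStatus): boolean {
    const info = this.processes.get(id);
    if (!info) {
      return false;
    }
    info.status = status;
    return true;
  }

  clear(): void {
    this.processes.clear();
  }

  get size(): number {
    return this.processes.size;
  }
}
```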