From c823a6b44b4a89663643fa54ef18bedef89a2aa7 Mon Sep 17 00:00:00 2001
From: Lukas May <lukas.may@carealytix.com>
Date: Sat, 31 Jan 2026 09:06:44 +0100
Subject: [PATCH] docs(08): create phase plan

Phase 08: E2E Scenario Tests
- 2 plans in 1 wave (parallel)
- 08-01: Happy path tests (dispatch, dependencies, merge)
- 08-02: Edge case tests (crash, waiting, conflicts)
- Ready for execution
---
 .planning/ROADMAP.md                          |   7 +-
 .../08-e2e-scenario-tests/08-01-PLAN.md       | 153 ++++++++++++++
 .../08-e2e-scenario-tests/08-02-PLAN.md       | 192 ++++++++++++++++++
 3 files changed, 349 insertions(+), 3 deletions(-)
 create mode 100644 .planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
 create mode 100644 .planning/phases/08-e2e-scenario-tests/08-02-PLAN.md
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 2c5d60c..f132098 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -135,10 +135,11 @@ Plans:
 **Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
 **Depends on**: Phase 7
 **Research**: Unlikely (testing existing functionality)
-**Plans**: TBD
+**Plans**: 2 plans
 
 Plans:
-- [ ] 08-01: TBD (run /gsd:plan-phase 8 to break down)
+- [ ] 08-01: Happy Path E2E Tests
+- [ ] 08-02: Edge Case E2E Tests
 
 #### Phase 9: Extended Scenarios & CI
 
@@ -165,7 +166,7 @@ Phases execute in numeric order: 1 → 1.1 → 2 → 3 → 4 → 5 → 6 → 7 
 | 5. Task Dispatch | v1.0 | 5/5 | Complete | 2026-01-30 |
 | 6. Coordination | v1.0 | 3/3 | Complete | 2026-01-30 |
 | 7. Mock Agent & Test Harness | v1.1 | 2/2 | Complete | 2026-01-31 |
-| 8. E2E Scenario Tests | v1.1 | 0/? | Not started | - |
+| 8. E2E Scenario Tests | v1.1 | 0/2 | Not started | - |
 | 9. Extended Scenarios & CI | v1.1 | 0/? | Not started | - |
 
 ---
diff --git a/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md b/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
new file mode 100644
index 0000000..72f00f2
--- /dev/null
+++ b/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
@@ -0,0 +1,153 @@
+---
+phase: 08-e2e-scenario-tests
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified: [src/test/e2e/happy-path.test.ts]
+autonomous: true
+---
+
+<objective>
+E2E tests proving happy path scenarios work: basic dispatch, dependency ordering, parallel execution, and merge flow.
+
+Purpose: Validate that the core dispatch/coordination flow works end to end with mocked agents and worktrees.
+Output: Comprehensive test file covering all happy path scenarios.
+</objective>
+
+<execution_context>
+@~/.claude/get-shit-done/workflows/execute-plan.md
+@~/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md
+@.planning/phases/07-mock-agent-test-harness/07-02-SUMMARY.md
+
+# Test infrastructure from Phase 7:
+@src/test/harness.ts
+@src/test/fixtures.ts
+@src/test/harness.test.ts
+
+# Types for understanding dispatch/coordination:
+@src/dispatch/types.ts
+@src/coordination/types.ts
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Create E2E happy path test file</name>
+  <files>src/test/e2e/happy-path.test.ts</files>
+  <action>
+Create comprehensive E2E tests for happy path scenarios using the test harness from Phase 7.
+
+Test scenarios to implement:
+
+1. **Single task flow** (using SIMPLE_FIXTURE Task A - no dependencies):
+   - Seed fixture, pre-seed idle agent
+   - Queue task, dispatch, wait for completion
+   - Verify: task:queued, task:dispatched, agent:spawned, agent:stopped events
+   - Verify: task status becomes 'completed' in database
+
+2. **Sequential dependencies** (using SIMPLE_FIXTURE):
+   - Task A has no deps; Tasks B and C both depend on A
+   - Queue all three tasks
+   - First dispatchNext: only Task A dispatchable
+   - Complete Task A
+   - Next dispatchNext: Task B or C now dispatchable
+   - Verify dependency ordering enforced
+
+3. **Parallel dispatch** (using PARALLEL_FIXTURE):
+   - 4 independent tasks across 2 plans
+   - Pre-seed 2 idle agents
+   - Queue all tasks
+   - Two dispatchNext calls: both should succeed (parallel)
+   - Verify both agents assigned different tasks
+
+4. **Full merge flow**:
+   - Dispatch task, wait for completion
+   - Call coordinationManager.queueMerge(taskId)
+   - Call coordinationManager.processMerges('main')
+   - Verify merge:queued, merge:completed events
+   - Verify worktreeManager.merge was called
+
+Pattern from harness.test.ts:
+```typescript
+vi.useFakeTimers();
+const seeded = await harness.seedFixture(FIXTURE);
+// Pre-seed idle agent (required by DispatchManager)
+await harness.agentManager.spawn({ name: 'pool-agent', taskId: 'placeholder', prompt: 'placeholder' });
+await vi.runAllTimersAsync();
+harness.clearEvents();
+// Queue and dispatch...
+```
+
+Create `src/test/e2e/` directory if it doesn't exist. File should:
+- Import from '../index.js' (TestHarness, fixtures)
+- Use vi.useFakeTimers() for agent completion control
+- Clean up with harness.cleanup() in afterEach
+  </action>
+  <verify>npm run test -- src/test/e2e/happy-path.test.ts passes all tests</verify>
+  <done>All 4 happy path scenario tests pass, events verified, database state correct</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Add complex dependency flow test</name>
+  <files>src/test/e2e/happy-path.test.ts</files>
+  <action>
+Add test for complex dependency graph using COMPLEX_FIXTURE:
+
+**COMPLEX_FIXTURE structure:**
+- Phase 1: Plan 1 (Task 1A, 1B), Plan 2 (Task 2A depends on 1A)
+- Phase 2: Plan 3 (Task 3A depends on 1B), Plan 4 (Task 4A depends on 2A and 3A)
+
+**Test: Complex dependency ordering**
+- Seed COMPLEX_FIXTURE
+- Queue all 5 tasks
+- Verify dispatch order respects dependencies:
+  1. First dispatch: Task 1A or Task 1B (both have no deps)
+  2. After 1A completes: Task 2A becomes dispatchable
+  3. After 1B completes: Task 3A becomes dispatchable
+  4. After both 2A and 3A complete: Task 4A becomes dispatchable
+  5. While only one of 2A or 3A is complete, Task 4A must not be dispatchable
+
+Use getNextDispatchable() to check which tasks are ready without actually dispatching.
+
+Verify:
+- Correct event sequence
+- No task dispatched before its dependencies complete
+- Final task (4A) only dispatches after all predecessors
+  </action>
+  <verify>npm run test -- src/test/e2e/happy-path.test.ts passes (including new test)</verify>
+  <done>Complex dependency test passes, proves multi-dependency ordering works</done>
+</task>
+
+</tasks>
+
+<verification>
+Before declaring plan complete:
+- [ ] `npm run test -- src/test/e2e/happy-path.test.ts` passes all tests
+- [ ] Single task flow test exists and passes
+- [ ] Sequential dependencies test exists and passes
+- [ ] Parallel dispatch test exists and passes
+- [ ] Full merge flow test exists and passes
+- [ ] Complex dependency test exists and passes
+- [ ] No flaky tests (run twice to confirm)
+</verification>
+
+<success_criteria>
+
+- All happy path E2E tests pass
+- Tests use TestHarness from Phase 7
+- Event verification confirms correct flow
+- Database state verification confirms persistence
+- Complex dependency ordering proven correct
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/08-e2e-scenario-tests/08-01-SUMMARY.md`
+</output>
diff --git a/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md b/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md
new file mode 100644
index 0000000..0aaadc2
--- /dev/null
+++ b/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md
@@ -0,0 +1,192 @@
+---
+phase: 08-e2e-scenario-tests
+plan: 02
+type: execute
+wave: 1
+depends_on: []
+files_modified: [src/test/e2e/edge-cases.test.ts, src/test/e2e/index.ts]
+autonomous: true
+---
+
+<objective>
+E2E tests proving edge case scenarios work: agent crashes, waiting for input, merge conflicts, and task blocking.
+
+Purpose: Validate that error handling and edge cases in the dispatch/coordination flow work correctly, with proper event emission and state management.
+Output: Comprehensive test file covering all edge case scenarios.
+</objective>
+
+<execution_context>
+@~/.claude/get-shit-done/workflows/execute-plan.md
+@~/.claude/get-shit-done/templates/summary.md
+</execution_context>
+
+<context>
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md
+@.planning/phases/07-mock-agent-test-harness/07-02-SUMMARY.md
+
+# Test infrastructure from Phase 7:
+@src/test/harness.ts
+@src/test/fixtures.ts
+@src/agent/mock-manager.ts
+
+# Types for understanding dispatch/coordination:
+@src/dispatch/types.ts
+@src/coordination/types.ts
+</context>
+
+<tasks>
+
+<task type="auto">
+  <name>Task 1: Create E2E edge case test file with crash and waiting scenarios</name>
+  <files>src/test/e2e/edge-cases.test.ts</files>
+  <action>
+Create E2E tests for edge case scenarios using the test harness from Phase 7.
+
+Test scenarios to implement:
+
+1. **Agent crash during task**:
+   - Seed fixture, queue task
+   - Set crash scenario for agent: `harness.setAgentScenario(agentName, { outcome: 'crash', message: 'Token limit exceeded' })`
+   - Dispatch task, wait for completion
+   - Verify: agent:spawned then agent:crashed events
+   - Verify: task status should NOT be 'completed' (still in_progress or blocked)
+   - Verify: error message captured
+
+2. **Agent waiting for input and resume**:
+   - Seed fixture, queue task
+   - Set waiting scenario: `harness.setAgentScenario(agentName, { outcome: 'waiting_for_input', question: 'Which database?' })`
+   - Dispatch task
+   - Verify: agent:waiting event with question
+   - Resume agent: `harness.agentManager.resume(agentId, 'PostgreSQL')`
+   - Wait for completion
+   - Verify: agent:resumed then agent:stopped events
+   - Verify: task can now be completed
+
+3. **Task blocking**:
+   - Seed fixture, queue task
+   - Call dispatchManager.blockTask(taskId, 'Waiting for user decision')
+   - Verify: task appears in blocked list from getQueueState()
+   - Verify: getNextDispatchable() does not return blocked task
+
+Agent name pattern from dispatch: `agent-${taskId.slice(0, 6)}`
+Use setAgentScenario before dispatch to configure behavior.
+
+Pattern:
+```typescript
+vi.useFakeTimers();
+const seeded = await harness.seedFixture(SIMPLE_FIXTURE);
+const taskAId = seeded.tasks.get('Task A')!;
+
+// Pre-seed required idle agent
+await harness.agentManager.spawn({ name: 'pool-agent', taskId: 'placeholder', prompt: 'placeholder' });
+await vi.runAllTimersAsync();
+
+// Set scenario BEFORE dispatch
+harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, { outcome: 'crash' });
+
+await harness.dispatchManager.queue(taskAId);
+harness.clearEvents();
+await harness.dispatchManager.dispatchNext();
+await vi.runAllTimersAsync();
+
+// Verify crash events
+```
+  </action>
+  <verify>npm run test -- src/test/e2e/edge-cases.test.ts passes all tests</verify>
+  <done>Crash, waiting-for-input, and blocking tests pass with proper event verification</done>
+</task>
+
+<task type="auto">
+  <name>Task 2: Add merge conflict scenario test</name>
+  <files>src/test/e2e/edge-cases.test.ts</files>
+  <action>
+Add test for merge conflict handling:
+
+**Test: Merge conflict triggers handleConflict**
+- Seed fixture, dispatch task, complete task
+- Set up worktree for task
+- Set conflict merge result: `harness.worktreeManager.setMergeResult(worktreeId, { success: false, conflicts: ['src/shared.ts', 'src/types.ts'], message: 'Merge conflict' })`
+- Queue merge and process
+- Verify: merge conflict detected
+- Call handleConflict
+- Verify: merge:conflict event emitted
+- Verify: conflict appears in queue state
+
+**Test: Successful merge after conflict resolution**
+- After conflict, clear the merge result override (e.g. call `setMergeResult` again with a success result)
+- Re-process merges
+- Verify: merge succeeds
+
+MockWorktreeManager methods:
+- `setMergeResult(worktreeId, result)` - configure specific merge behavior
+- Default behavior returns success
+
+CoordinationManager methods:
+- `queueMerge(taskId)` - queue completed task for merge
+- `processMerges(targetBranch)` - process all ready merges
+- `handleConflict(taskId, conflicts)` - handle merge conflict
+- `getQueueState()` - check conflicted tasks
+
+Note: Track the worktreeId from agent spawn. MockAgentManager creates a worktreeId during spawn; access it via `agentInfo.worktreeId`. The CoordinationManager uses this when queueing for merge.
+  </action>
+  <verify>npm run test -- src/test/e2e/edge-cases.test.ts passes (including conflict tests)</verify>
+  <done>Merge conflict and resolution tests pass, proves conflict handling works</done>
+</task>
+
+<task type="auto">
+  <name>Task 3: Add test module index export</name>
+  <files>src/test/e2e/index.ts</files>
+  <action>
+Create an index file for the E2E test module that exports shared test utilities, if any were created.
+
+If no shared utilities were created (tests are self-contained), create minimal index:
+
+```typescript
+/**
+ * E2E Tests for Dispatch/Coordination Flows
+ *
+ * Test files:
+ * - happy-path.test.ts: Normal operation scenarios
+ * - edge-cases.test.ts: Error handling and edge cases
+ *
+ * Uses TestHarness from src/test/ for system wiring.
+ */
+
+// No exports needed - tests are self-contained
+export {};
+```
+
+This documents the test module structure for future reference.
+  </action>
+  <verify>File exists and TypeScript compiles</verify>
+  <done>E2E test module properly organized with index</done>
+</task>
+
+</tasks>
+
+<verification>
+Before declaring plan complete:
+- [ ] `npm run test -- src/test/e2e/edge-cases.test.ts` passes all tests
+- [ ] Agent crash test exists and passes
+- [ ] Agent waiting for input test exists and passes
+- [ ] Task blocking test exists and passes
+- [ ] Merge conflict test exists and passes
+- [ ] Conflict resolution test exists and passes
+- [ ] No flaky tests (run twice to confirm)
+</verification>
+
+<success_criteria>
+
+- All edge case E2E tests pass
+- Tests use TestHarness and MockAgentManager scenarios
+- Event verification confirms correct error handling
+- Conflict handling proven working
+- Agent recovery (waiting → resume) proven working
+</success_criteria>
+
+<output>
+After completion, create `.planning/phases/08-e2e-scenario-tests/08-02-SUMMARY.md`
+</output>