From c823a6b44b4a89663643fa54ef18bedef89a2aa7 Mon Sep 17 00:00:00 2001
From: Lukas May
Date: Sat, 31 Jan 2026 09:06:44 +0100
Subject: [PATCH] docs(08): create phase plan

Phase 08: E2E Scenario Tests
- 2 plans in 1 wave (parallel)
- 08-01: Happy path tests (dispatch, dependencies, merge)
- 08-02: Edge case tests (crash, waiting, conflicts)
- Ready for execution
---
 .planning/ROADMAP.md                          |   7 +-
 .../08-e2e-scenario-tests/08-01-PLAN.md       | 153 ++++++++++++++
 .../08-e2e-scenario-tests/08-02-PLAN.md       | 192 ++++++++++++++++++
 3 files changed, 349 insertions(+), 3 deletions(-)
 create mode 100644 .planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
 create mode 100644 .planning/phases/08-e2e-scenario-tests/08-02-PLAN.md

diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index 2c5d60c..f132098 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -135,10 +135,11 @@ Plans:
 **Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
 **Depends on**: Phase 7
 **Research**: Unlikely (testing existing functionality)
-**Plans**: TBD
+**Plans**: 2 plans
 
 Plans:
-- [ ] 08-01: TBD (run /gsd:plan-phase 8 to break down)
+- [ ] 08-01: Happy Path E2E Tests
+- [ ] 08-02: Edge Case E2E Tests
 
 #### Phase 9: Extended Scenarios & CI
 
@@ -165,7 +166,7 @@ Phases execute in numeric order: 1 → 1.1 → 2 → 3 → 4 → 5 → 6 → 7
 | 5. Task Dispatch | v1.0 | 5/5 | Complete | 2026-01-30 |
 | 6. Coordination | v1.0 | 3/3 | Complete | 2026-01-30 |
 | 7. Mock Agent & Test Harness | v1.1 | 2/2 | Complete | 2026-01-31 |
-| 8. E2E Scenario Tests | v1.1 | 0/? | Not started | - |
+| 8. E2E Scenario Tests | v1.1 | 0/2 | Not started | - |
 | 9. Extended Scenarios & CI | v1.1 | 0/? | Not started | - |
 
 ---
diff --git a/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md b/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
new file mode 100644
index 0000000..72f00f2
--- /dev/null
+++ b/.planning/phases/08-e2e-scenario-tests/08-01-PLAN.md
@@ -0,0 +1,153 @@
+---
+phase: 08-e2e-scenario-tests
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified: [src/test/e2e/happy-path.test.ts]
+autonomous: true
+---
+
+
+E2E tests proving happy path scenarios work: basic dispatch, dependency ordering, parallel execution, and merge flow.
+
+Purpose: Validate the core dispatch/coordination flow works end-to-end with mocked agents and worktrees.
+Output: Comprehensive test file covering all happy path scenarios.
+
+
+
+@~/.claude/get-shit-done/workflows/execute-plan.md
+@~/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md
+@.planning/phases/07-mock-agent-test-harness/07-02-SUMMARY.md
+
+# Test infrastructure from Phase 7:
+@src/test/harness.ts
+@src/test/fixtures.ts
+@src/test/harness.test.ts
+
+# Types for understanding dispatch/coordination:
+@src/dispatch/types.ts
+@src/coordination/types.ts
+
+
+
+
+
+  Task 1: Create E2E happy path test file
+  src/test/e2e/happy-path.test.ts
+
+Create comprehensive E2E tests for happy path scenarios using the test harness from Phase 7.
+
+Test scenarios to implement:
+
+1. **Single task flow** (using SIMPLE_FIXTURE Task A - no dependencies):
+   - Seed fixture, pre-seed idle agent
+   - Queue task, dispatch, wait for completion
+   - Verify: task:queued, task:dispatched, agent:spawned, agent:stopped events
+   - Verify: task status becomes 'completed' in database
+
+2. **Sequential dependencies** (using SIMPLE_FIXTURE):
+   - Task A has no deps, Task B and C depend on A
+   - Queue all three tasks
+   - First dispatchNext: only Task A dispatchable
+   - Complete Task A
+   - Next dispatchNext: Task B or C now dispatchable
+   - Verify dependency ordering enforced
+
+3. **Parallel dispatch** (using PARALLEL_FIXTURE):
+   - 4 independent tasks across 2 plans
+   - Pre-seed 2 idle agents
+   - Queue all tasks
+   - Two dispatchNext calls: both should succeed (parallel)
+   - Verify both agents assigned different tasks
+
+4. **Full merge flow**:
+   - Dispatch task, wait for completion
+   - Call coordinationManager.queueMerge(taskId)
+   - Call coordinationManager.processMerges('main')
+   - Verify merge:queued, merge:completed events
+   - Verify worktreeManager.merge was called
+
+Pattern from harness.test.ts:
+```typescript
+vi.useFakeTimers();
+const seeded = await harness.seedFixture(FIXTURE);
+// Pre-seed idle agent (required by DispatchManager)
+await harness.agentManager.spawn({ name: 'pool-agent', taskId: 'placeholder', prompt: 'placeholder' });
+await vi.runAllTimersAsync();
+harness.clearEvents();
+// Queue and dispatch...
+```
+
+Create `src/test/e2e/` directory if it doesn't exist. File should:
+- Import from '../index.js' (TestHarness, fixtures)
+- Use vi.useFakeTimers() for agent completion control
+- Clean up with harness.cleanup() in afterEach
+
+  npm run test -- src/test/e2e/happy-path.test.ts passes all tests
+  All 4 happy path scenario tests pass, events verified, database state correct
+
+
+
+  Task 2: Add complex dependency flow test
+  src/test/e2e/happy-path.test.ts
+
+Add test for complex dependency graph using COMPLEX_FIXTURE:
+
+**COMPLEX_FIXTURE structure:**
+- Phase 1: Plan 1 (Task 1A, 1B), Plan 2 (Task 2A depends on 1A)
+- Phase 2: Plan 3 (Task 3A depends on 1B), Plan 4 (Task 4A depends on 2A and 3A)
+
+**Test: Complex dependency ordering**
+- Seed COMPLEX_FIXTURE
+- Queue all 5 tasks
+- Verify dispatch order respects dependencies:
+  1. First dispatch: Task 1A or Task 1B (both have no deps)
+  2. After 1A completes: Task 2A becomes dispatchable
+  3. After 1B completes: Task 3A becomes dispatchable
+  4. After both 2A and 3A complete: Task 4A becomes dispatchable
+  5. Task 4A cannot dispatch until BOTH 2A and 3A complete
+
+Use getNextDispatchable() to check which tasks are ready without actually dispatching.
+
+Verify:
+- Correct event sequence
+- No task dispatched before its dependencies complete
+- Final task (4A) only dispatches after all predecessors
+
+  npm run test -- src/test/e2e/happy-path.test.ts passes (including new test)
+  Complex dependency test passes, proves multi-dependency ordering works
+
+
+
+
+
+Before declaring plan complete:
+- [ ] `npm run test -- src/test/e2e/happy-path.test.ts` passes all tests
+- [ ] Single task flow test exists and passes
+- [ ] Sequential dependencies test exists and passes
+- [ ] Parallel dispatch test exists and passes
+- [ ] Full merge flow test exists and passes
+- [ ] Complex dependency test exists and passes
+- [ ] No flaky tests (run twice to confirm)
+
+
+
+
+- All happy path E2E tests pass
+- Tests use TestHarness from Phase 7
+- Event verification confirms correct flow
+- Database state verification confirms persistence
+- Complex dependency ordering proven correct
+
+
+
+After completion, create `.planning/phases/08-e2e-scenario-tests/08-01-SUMMARY.md`
+
diff --git a/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md b/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md
new file mode 100644
index 0000000..0aaadc2
--- /dev/null
+++ b/.planning/phases/08-e2e-scenario-tests/08-02-PLAN.md
@@ -0,0 +1,192 @@
+---
+phase: 08-e2e-scenario-tests
+plan: 02
+type: execute
+wave: 1
+depends_on: []
+files_modified: [src/test/e2e/edge-cases.test.ts]
+autonomous: true
+---
+
+
+E2E tests proving edge case scenarios work: agent crashes, waiting for input, merge conflicts, and task blocking.
+
+Purpose: Validate error handling and edge cases in dispatch/coordination flow work correctly with proper event emission and state management.
+Output: Comprehensive test file covering all edge case scenarios.
+
+
+
+@~/.claude/get-shit-done/workflows/execute-plan.md
+@~/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+@.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md
+@.planning/phases/07-mock-agent-test-harness/07-02-SUMMARY.md
+
+# Test infrastructure from Phase 7:
+@src/test/harness.ts
+@src/test/fixtures.ts
+@src/agent/mock-manager.ts
+
+# Types for understanding dispatch/coordination:
+@src/dispatch/types.ts
+@src/coordination/types.ts
+
+
+
+
+
+  Task 1: Create E2E edge case test file with crash and waiting scenarios
+  src/test/e2e/edge-cases.test.ts
+
+Create E2E tests for edge case scenarios using test harness.
+
+Test scenarios to implement:
+
+1. **Agent crash during task**:
+   - Seed fixture, queue task
+   - Set crash scenario for agent: `harness.setAgentScenario(agentName, { outcome: 'crash', message: 'Token limit exceeded' })`
+   - Dispatch task, wait for completion
+   - Verify: agent:spawned then agent:crashed events
+   - Verify: task status should NOT be 'completed' (still in_progress or blocked)
+   - Verify: error message captured
+
+2. **Agent waiting for input and resume**:
+   - Seed fixture, queue task
+   - Set waiting scenario: `harness.setAgentScenario(agentName, { outcome: 'waiting_for_input', question: 'Which database?' })`
+   - Dispatch task
+   - Verify: agent:waiting event with question
+   - Resume agent: `harness.agentManager.resume(agentId, 'PostgreSQL')`
+   - Wait for completion
+   - Verify: agent:resumed then agent:stopped events
+   - Verify: task can now be completed
+
+3. **Task blocking**:
+   - Seed fixture, queue task
+   - Call dispatchManager.blockTask(taskId, 'Waiting for user decision')
+   - Verify: task appears in blocked list from getQueueState()
+   - Verify: getNextDispatchable() does not return blocked task
+
+Agent name pattern from dispatch: `agent-${taskId.slice(0, 6)}`
+Use setAgentScenario before dispatch to configure behavior.
+
+Pattern:
+```typescript
+vi.useFakeTimers();
+const seeded = await harness.seedFixture(SIMPLE_FIXTURE);
+const taskAId = seeded.tasks.get('Task A')!;
+
+// Pre-seed required idle agent
+await harness.agentManager.spawn({ name: 'pool-agent', taskId: 'placeholder', prompt: 'placeholder' });
+await vi.runAllTimersAsync();
+
+// Set scenario BEFORE dispatch
+harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, { outcome: 'crash' });
+
+await harness.dispatchManager.queue(taskAId);
+harness.clearEvents();
+await harness.dispatchManager.dispatchNext();
+await vi.runAllTimersAsync();
+
+// Verify crash events
+```
+
+  npm run test -- src/test/e2e/edge-cases.test.ts passes all tests
+  Crash, waiting-for-input, and blocking tests pass with proper event verification
+
+
+
+  Task 2: Add merge conflict scenario test
+  src/test/e2e/edge-cases.test.ts
+
+Add test for merge conflict handling:
+
+**Test: Merge conflict triggers handleConflict**
+- Seed fixture, dispatch task, complete task
+- Set up worktree for task
+- Set conflict merge result: `harness.worktreeManager.setMergeResult(worktreeId, { success: false, conflicts: ['src/shared.ts', 'src/types.ts'], message: 'Merge conflict' })`
+- Queue merge and process
+- Verify: merge conflict detected
+- Call handleConflict
+- Verify: merge:conflict event emitted
+- Verify: conflict appears in queue state
+
+**Test: Successful merge after conflict resolution**
+- After conflict, clear the merge result override
+- Re-process merges
+- Verify: merge succeeds
+
+MockWorktreeManager methods:
+- `setMergeResult(worktreeId, result)` - configure specific merge behavior
+- Default behavior returns success
+
+CoordinationManager methods:
+- `queueMerge(taskId)` - queue completed task for merge
+- `processMerges(targetBranch)` - process all ready merges
+- `handleConflict(taskId, conflicts)` - handle merge conflict
+- `getQueueState()` - check conflicted tasks
+
+Note: Need to track worktreeId from agent spawn. MockAgentManager creates worktreeId during spawn - access via `agentInfo.worktreeId`. The CoordinationManager uses this when queueing for merge.
+
+  npm run test -- src/test/e2e/edge-cases.test.ts passes (including conflict tests)
+  Merge conflict and resolution tests pass, proves conflict handling works
+
+
+
+  Task 3: Add test module index export
+  src/test/e2e/index.ts
+
+Create index file for E2E test module that exports test utilities if any were created.
+
+If no shared utilities were created (tests are self-contained), create minimal index:
+
+```typescript
+/**
+ * E2E Tests for Dispatch/Coordination Flows
+ *
+ * Test files:
+ * - happy-path.test.ts: Normal operation scenarios
+ * - edge-cases.test.ts: Error handling and edge cases
+ *
+ * Uses TestHarness from src/test/ for system wiring.
+ */
+
+// No exports needed - tests are self-contained
+export {};
+```
+
+This documents the test module structure for future reference.
+
+  File exists and TypeScript compiles
+  E2E test module properly organized with index
+
+
+
+
+
+Before declaring plan complete:
+- [ ] `npm run test -- src/test/e2e/` passes all tests
+- [ ] Agent crash test exists and passes
+- [ ] Agent waiting for input test exists and passes
+- [ ] Task blocking test exists and passes
+- [ ] Merge conflict test exists and passes
+- [ ] Conflict resolution test exists and passes
+- [ ] No flaky tests (run twice to confirm)
+
+
+
+
+- All edge case E2E tests pass
+- Tests use TestHarness and MockAgentManager scenarios
+- Event verification confirms correct error handling
+- Conflict handling proven working
+- Agent recovery (waiting → resume) proven working
+
+
+
+After completion, create `.planning/phases/08-e2e-scenario-tests/08-02-SUMMARY.md`
+
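Illustrative sketch, not part of the committed plan files: the dispatch order that 08-01 Task 2 expects for COMPLEX_FIXTURE can be reasoned about with a small standalone model. The names `COMPLEX_DEPS` and `readyTasks` are hypothetical and exist only for this example; `readyTasks` is an assumed stand-in for what `getNextDispatchable()` is described as checking, not the DispatchManager implementation, which may differ (for instance in tie-breaking between ready tasks).

```typescript
// Hypothetical model of the COMPLEX_FIXTURE dependency graph from 08-01-PLAN.md.
// Each key is a task; its value lists the tasks that must complete first.
const COMPLEX_DEPS: Record<string, string[]> = {
  'Task 1A': [],
  'Task 1B': [],
  'Task 2A': ['Task 1A'],
  'Task 3A': ['Task 1B'],
  'Task 4A': ['Task 2A', 'Task 3A'],
};

// Assumed stand-in for a readiness check: returns every unfinished task whose
// dependencies are all in `completed`, sorted by name for a stable result.
function readyTasks(deps: Record<string, string[]>, completed: Set<string>): string[] {
  return Object.entries(deps)
    .filter(([task, needs]) => !completed.has(task) && needs.every((d) => completed.has(d)))
    .map(([task]) => task)
    .sort();
}

// Complete tasks one at a time and record what is ready after each step.
const completed = new Set<string>();
const trace: string[][] = [readyTasks(COMPLEX_DEPS, completed)];
for (const task of ['Task 1A', 'Task 1B', 'Task 2A', 'Task 3A']) {
  completed.add(task);
  trace.push(readyTasks(COMPLEX_DEPS, completed));
}

// Task 4A appears only in the final step, after both 2A and 3A are complete.
console.log(trace);
```

An E2E test could assert the same property against the real harness by checking, after each completion, that no task is reported as dispatchable while any of its dependencies is still incomplete.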