docs(09): create phase plan

Phase 09: Extended Scenarios
- 2 plans in 1 wave
- 2 parallel, 0 sequential
- Ready for execution
Lukas May
2026-01-31 15:06:47 +01:00
parent 5152f0baa4
commit 05f6f64fbe
2 changed files with 292 additions and 0 deletions


@@ -0,0 +1,145 @@
---
phase: 09-extended-scenarios
plan: 01
type: execute
wave: 1
depends_on: []
files_modified: [src/test/e2e/extended-scenarios.test.ts]
autonomous: true
---
<objective>
Create E2E tests proving conflict hand-back round-trip and multi-agent parallel completion work correctly.
Purpose: Validate the full conflict resolution cycle (conflict detected -> agent resolves -> merge succeeds) and confirm that multiple agents can work, complete, and merge in parallel.
Output: Extended E2E test file with conflict round-trip and parallel completion scenarios.
</objective>
<execution_context>
@~/.claude/get-shit-done/workflows/execute-plan.md
@~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/09-extended-scenarios/09-CONTEXT.md
@.planning/phases/08-e2e-scenario-tests/08-01-SUMMARY.md
@.planning/phases/08-e2e-scenario-tests/08-02-SUMMARY.md
@src/test/harness.ts
@src/test/fixtures.ts
@src/test/e2e/edge-cases.test.ts
@src/test/e2e/happy-path.test.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Create conflict hand-back round-trip tests</name>
<files>src/test/e2e/extended-scenarios.test.ts</files>
<action>
Create new test file with describe block "Conflict hand-back round-trip". Test the full cycle:
1. **Test: conflict triggers resolution task, agent resolves, merge succeeds**
- Seed SIMPLE_FIXTURE, complete Task A
- Create agent in agentRepository with worktreeId
- Create worktree via MockWorktreeManager
- Set merge conflict result for first merge attempt
- Queue and process merge (should fail with conflict)
- Verify: merge:conflicted event, task marked blocked, resolution task created
- Clear the merge conflict (setMergeResult to success)
- Find the resolution task, dispatch it to an agent
- Complete the resolution task
- Queue original task for re-merge
- Process merge again (should succeed)
- Verify: merge:completed event for original task
2. **Test: conflict resolution preserves original task context**
- Same setup as test 1, with a merge conflict injected on the first merge attempt
- Verify resolution task has correct parentTaskId linking back
- Verify resolution task prompt contains conflict file info
3. **Test: multiple sequential conflicts resolved in order**
- Set up 2 tasks (A and B) both with conflicts
- Process merges, both fail
- Resolve A's conflict, merge A succeeds
- Resolve B's conflict, merge B succeeds
- Verify merge order matches resolution order
Use same patterns as edge-cases.test.ts:
- vi.useFakeTimers() for async control
- Pre-seed idle agent before dispatch
- harness.setAgentScenario for agent behavior
- harness.worktreeManager.setMergeResult for conflict injection
- Manual agentRepository.create for coordination tests
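The conflict-injection flow of test 1 can be sketched with a stand-in for the mock. This is a minimal, self-contained sketch under stated assumptions, not the project's harness: `FakeWorktreeManager`, `processMerge`, `MergeLog`, and the task statuses are hypothetical. Only the `setMergeResult` name and the `merge:conflicted` / `merge:completed` event names come from this plan, and the real signatures in src/test/harness.ts may differ.

```typescript
// Hypothetical merge outcome shape; the real mock may model this differently.
type MergeResult =
  | { kind: "ok" }
  | { kind: "conflict"; conflicts: string[] };

// Hypothetical stand-in for MockWorktreeManager.
class FakeWorktreeManager {
  private results: { [worktreeId: string]: MergeResult } = {};

  // Inject the outcome of subsequent merge attempts for one worktree.
  setMergeResult(worktreeId: string, result: MergeResult): void {
    this.results[worktreeId] = result;
  }

  // Merges succeed unless a conflict was injected.
  merge(worktreeId: string): MergeResult {
    const injected = this.results[worktreeId];
    return injected !== undefined ? injected : { kind: "ok" };
  }
}

type TaskStatus = "completed" | "blocked" | "merged";

interface MergeLog {
  events: string[];
  status: TaskStatus;
  resolutionTasks: { parentTaskId: string; conflictFiles: string[] }[];
}

// Hypothetical coordinator step: attempt one merge and record the outcome.
function processMerge(
  manager: FakeWorktreeManager,
  taskId: string,
  worktreeId: string,
  log: MergeLog
): void {
  const result = manager.merge(worktreeId);
  if (result.kind === "conflict") {
    log.events.push("merge:conflicted");
    log.status = "blocked";
    // The resolution task links back to the task whose merge failed.
    log.resolutionTasks.push({ parentTaskId: taskId, conflictFiles: result.conflicts });
  } else {
    log.events.push("merge:completed");
    log.status = "merged";
  }
}

// Conflict on the first attempt, then clear it and merge again.
function runConflictRoundTrip(): MergeLog {
  const manager = new FakeWorktreeManager();
  const log: MergeLog = { events: [], status: "completed", resolutionTasks: [] };

  manager.setMergeResult("wt-a", { kind: "conflict", conflicts: ["src/app.ts"] });
  processMerge(manager, "task-a", "wt-a", log);

  manager.setMergeResult("wt-a", { kind: "ok" });
  processMerge(manager, "task-a", "wt-a", log);
  return log;
}
```

In the real test, the assertions would target the harness's event bus and repositories instead of a local log object.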
</action>
<verify>`npm test src/test/e2e/extended-scenarios.test.ts -- --run` passes</verify>
<done>3+ conflict round-trip tests passing, proving full cycle works</done>
</task>
<task type="auto">
<name>Task 2: Create multi-agent parallel completion tests</name>
<files>src/test/e2e/extended-scenarios.test.ts</files>
<action>
Add describe block "Multi-agent parallel work" to the test file. Test scenarios:
1. **Test: multiple agents complete tasks in parallel**
- Seed PARALLEL_FIXTURE (4 independent tasks)
- Pre-seed 3 idle agents
- Queue all 4 tasks
- Dispatch 3 tasks in parallel (3 agents working)
- Use vi.runAllTimersAsync() to complete all 3 agents
- Verify: 3 agent:stopped events
- Complete all 3 tasks
- Dispatch remaining task
- Verify: all 4 tasks completed
2. **Test: parallel merges process in correct dependency order**
- Use COMPLEX_FIXTURE (has dependency structure)
- Complete Task 1A and Task 1B (no dependencies)
- Set up worktrees and agents for both
- Queue both for merge
- Process merges - both should succeed (no dependencies between them)
- Verify: merge:completed for both in same batch
- Complete Task 2A (depends on 1A) and Task 3A (depends on 1B)
- Queue and merge - should succeed
- Complete Task 4A (depends on 2A and 3A)
- Queue and merge - should succeed
- Verify: final merge order respects dependency graph
3. **Test: parallel dispatch with mixed outcomes**
- Pre-seed 2 agents
- Dispatch 2 tasks, one set to succeed, one set to crash
- Verify: one agent:stopped, one agent:crashed
- Verify: completed task can merge, crashed task stays in_progress
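The dispatch arithmetic of the first scenario (4 queued tasks, 3 idle agents) can be sketched as follows. This is a sketch under stated assumptions: `DispatchState`, `dispatchAvailable`, and `completeAllRunning` are hypothetical names for illustration and do not reflect the project's dispatch manager.

```typescript
interface Assignment {
  agentId: string;
  taskId: string;
}

// Hypothetical dispatcher state; the real system keeps this in repositories.
interface DispatchState {
  queued: string[];
  idleAgents: string[];
  running: Assignment[];
  completed: string[];
}

// Hand queued tasks to idle agents, one task per agent, until either runs out.
function dispatchAvailable(state: DispatchState): number {
  let dispatched = 0;
  while (state.queued.length > 0 && state.idleAgents.length > 0) {
    const taskId = state.queued.shift() as string;
    const agentId = state.idleAgents.shift() as string;
    state.running.push({ agentId, taskId });
    dispatched += 1;
  }
  return dispatched;
}

// Finish every running assignment and return its agent to the idle pool.
function completeAllRunning(state: DispatchState): number {
  const finished = state.running.splice(0, state.running.length);
  finished.forEach((assignment) => {
    state.completed.push(assignment.taskId);
    state.idleAgents.push(assignment.agentId);
  });
  return finished.length;
}
```

With 4 tasks and 3 agents, the first `dispatchAvailable` call starts 3 tasks and leaves 1 queued; the remaining task is dispatched only after an agent returns to the idle pool.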
Use PARALLEL_FIXTURE for independent tasks, COMPLEX_FIXTURE for dependency scenarios.
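The "final merge order respects dependency graph" check in test 2 can be expressed as a small pure helper. This is a sketch under stated assumptions: `respectsDependencies` and `complexFixtureDeps` are hypothetical names, and the task ids are shorthand for the dependency structure described above, not the fixture's real identifiers.

```typescript
// Dependency edges as described in this plan for COMPLEX_FIXTURE (shorthand ids).
const complexFixtureDeps: { [task: string]: string[] } = {
  "1A": [],
  "1B": [],
  "2A": ["1A"],
  "3A": ["1B"],
  "4A": ["2A", "3A"],
};

// Returns true when every task in `order` appears after all of its dependencies.
function respectsDependencies(
  order: string[],
  deps: { [task: string]: string[] }
): boolean {
  return order.every((task, index) => {
    const required = deps[task] || [];
    return required.every((dep) => {
      const depIndex = order.indexOf(dep);
      // A dependency must be present and must come earlier in the order.
      return depIndex !== -1 && depIndex < index;
    });
  });
}
```

A test could collect task ids from `merge:completed` events in emission order and pass that array to the helper, which accepts any valid topological order (for example, 1B before 1A) and does not pin a single sequence.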
</action>
<verify>`npm test src/test/e2e/extended-scenarios.test.ts -- --run` passes</verify>
<done>3+ parallel work tests passing, proving multi-agent scenarios work</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] `npm test src/test/e2e/extended-scenarios.test.ts -- --run` passes
- [ ] At least 6 new tests (3 conflict + 3 parallel)
- [ ] No flaky tests (run twice to verify)
- [ ] Test file follows existing patterns from edge-cases.test.ts
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- Conflict round-trip fully tested (detect -> resolve -> re-merge)
- Multi-agent parallel scenarios validated
- No regressions in existing E2E tests
</success_criteria>
<output>
After completion, create `.planning/phases/09-extended-scenarios/09-01-SUMMARY.md`
</output>


@@ -0,0 +1,147 @@
---
phase: 09-extended-scenarios
plan: 02
type: execute
wave: 1
depends_on: []
files_modified: [src/test/e2e/recovery-scenarios.test.ts]
autonomous: true
---
<objective>
Create E2E tests proving recovery/resume after interruption and extended agent Q&A scenarios work correctly.
Purpose: Validate that the system can recover state after an interruption and handle complex agent question/answer flows.
Output: Recovery scenarios test file with state persistence and Q&A flow tests.
</objective>
<execution_context>
@~/.claude/get-shit-done/workflows/execute-plan.md
@~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/09-extended-scenarios/09-CONTEXT.md
@.planning/phases/08-e2e-scenario-tests/08-02-SUMMARY.md
@src/test/harness.ts
@src/test/fixtures.ts
@src/test/e2e/edge-cases.test.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Create recovery/resume scenario tests</name>
<files>src/test/e2e/recovery-scenarios.test.ts</files>
<action>
Create new test file with describe block "Recovery after interruption". Test scenarios:
1. **Test: queue state survives harness recreation**
- Seed fixture, queue tasks
- Get queue state (tasks in queue)
- Create NEW harness pointing to SAME database
- Query queue state from new harness
- Verify: queue state matches (tasks still queued)
Implementation note: createTestHarness() creates a fresh in-memory DB, so a second call will not see the first harness's data. For this test, either:
- Extract the DB from the first harness and create the second harness manually, reusing that DB
- Or modify the test to verify DB persistence directly
2. **Test: in-progress task recoverable after agent crash**
- Dispatch task, agent crashes mid-execution
- Verify task status is 'in_progress' (not completed, not lost)
- Queue same task again (should be dispatchable)
- Dispatch to new agent
- Agent completes successfully
- Verify: task completed, merge can proceed
3. **Test: blocked task state persists and can be unblocked**
- Queue task, block it with reason
- Verify task in blocked state in DB
- "Simulate restart" by recreating managers with same DB
- Query blocked tasks
- Unblock task
- Verify: task now dispatchable
4. **Test: merge queue state recoverable**
- Complete task, queue for merge
- Verify merge queue has pending item
- Query merge queue state
- Process merge
- Verify: merge completes correctly
Focus on proving that DATABASE STATE is the source of truth and managers can be recreated without losing work.
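The "recreate managers with same DB" idea can be sketched without the real harness. This is a sketch under stated assumptions: `InMemoryStore`, `StatelessQueueManager`, `TaskRow`, and `simulateRestart` are hypothetical stand-ins for illustration, not the project's database layer or managers.

```typescript
// Hypothetical row shape; not the project's schema.
interface TaskRow {
  id: string;
  status: "queued" | "blocked";
  blockedReason: string | null;
}

// Hypothetical stand-in for the shared database.
class InMemoryStore {
  rows: TaskRow[] = [];
}

// Hypothetical manager that holds no state of its own:
// every read and write goes through the store it was given.
class StatelessQueueManager {
  private store: InMemoryStore;

  constructor(store: InMemoryStore) {
    this.store = store;
  }

  queue(id: string): void {
    this.store.rows.push({ id, status: "queued", blockedReason: null });
  }

  block(id: string, reason: string): void {
    this.store.rows.forEach((row) => {
      if (row.id === id) {
        row.status = "blocked";
        row.blockedReason = reason;
      }
    });
  }

  unblock(id: string): void {
    this.store.rows.forEach((row) => {
      if (row.id === id) {
        row.status = "queued";
        row.blockedReason = null;
      }
    });
  }

  idsWithStatus(status: TaskRow["status"]): string[] {
    return this.store.rows.filter((row) => row.status === status).map((row) => row.id);
  }
}

// Simulated restart: write through one manager, read through a new one.
function simulateRestart() {
  const store = new InMemoryStore();
  const first = new StatelessQueueManager(store);
  first.queue("task-a");
  first.queue("task-b");
  first.block("task-b", "waiting on review");

  // The first manager is discarded; only the store survives.
  const second = new StatelessQueueManager(store);
  const queuedAfterRestart = second.idsWithStatus("queued");
  const blockedAfterRestart = second.idsWithStatus("blocked");
  second.unblock("task-b");
  const queuedAfterUnblock = second.idsWithStatus("queued");
  return { queuedAfterRestart, blockedAfterRestart, queuedAfterUnblock };
}
```

The design point is that the managers cache nothing, so a test that swaps the manager while keeping the store exercises the same property a process restart would.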
</action>
<verify>`npm test src/test/e2e/recovery-scenarios.test.ts -- --run` passes</verify>
<done>4 recovery tests passing, proving state persistence works</done>
</task>
<task type="auto">
<name>Task 2: Create extended agent Q&A scenario tests</name>
<files>src/test/e2e/recovery-scenarios.test.ts</files>
<action>
Add describe block "Agent Q&A extended scenarios" to the test file. Test scenarios:
1. **Test: multiple questions in sequence from same agent**
- Dispatch task with scenario: first asks question, then after resume asks another
- Handle first question (agent:waiting -> resume -> agent:resumed)
- Agent asks second question
- Handle second question
- Agent completes
- Verify: 2 agent:waiting events, 2 agent:resumed events, 1 agent:stopped
Implementation note: MockAgentManager may need a scenario that asks multiple questions. If that is not supported, test a single question but verify that the state machine transitions correctly.
2. **Test: question surfaces as message in message queue**
- Dispatch task with waiting_for_input scenario
- Verify: agent:waiting event includes question
- Check messageRepository for user-directed message
- Verify: message contains the question text
3. **Test: agent resumes with user's answer in context**
- Dispatch task, agent asks question
- Resume with specific answer "PostgreSQL"
- Verify: resume call includes the answer
- Agent completes
- Verify: agent result reflects successful completion
4. **Test: waiting agent blocks task completion**
- Dispatch task, agent enters waiting_for_input
- Attempt to complete task (should not be allowed while agent waiting)
- Resume agent, agent completes
- Now complete task
- Verify: proper state transitions
Use edge-cases.test.ts patterns for waiting/resume flow.
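The waiting/resume state machine these tests exercise can be sketched in isolation. This is a sketch under stated assumptions: `FakeAgent` and `canCompleteTask` are hypothetical and do not reflect MockAgentManager's actual API. Only the `agent:waiting`, `agent:resumed`, and `agent:stopped` event names and the `waiting_for_input` state come from this plan.

```typescript
type AgentState = "running" | "waiting_for_input" | "stopped";

// Hypothetical agent that asks a fixed list of questions, then stops.
class FakeAgent {
  state: AgentState = "running";
  events: string[] = [];
  askedQuestions: string[] = [];
  answers: string[] = [];
  private remainingQuestions: string[];

  constructor(questions: string[]) {
    this.remainingQuestions = questions.slice();
    this.step();
  }

  // Ask the next question, or stop when none remain.
  private step(): void {
    const question = this.remainingQuestions.shift();
    if (question !== undefined) {
      this.askedQuestions.push(question);
      this.state = "waiting_for_input";
      this.events.push("agent:waiting");
    } else {
      this.state = "stopped";
      this.events.push("agent:stopped");
    }
  }

  // Resuming is only legal while the agent waits for input.
  resume(answer: string): void {
    if (this.state !== "waiting_for_input") {
      throw new Error("agent is not waiting for input");
    }
    this.answers.push(answer);
    this.events.push("agent:resumed");
    this.state = "running";
    this.step();
  }
}

// Hypothetical guard: a task may complete only once its agent has stopped.
function canCompleteTask(agent: FakeAgent): boolean {
  return agent.state === "stopped";
}
```

An agent built with two questions produces two `agent:waiting` events, two `agent:resumed` events, and one `agent:stopped` event, which matches the counts expected in test 1; `canCompleteTask` stays false until the last resume.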
</action>
<verify>`npm test src/test/e2e/recovery-scenarios.test.ts -- --run` passes</verify>
<done>4 Q&A tests passing, proving extended question flows work</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] `npm test src/test/e2e/recovery-scenarios.test.ts -- --run` passes
- [ ] At least 8 new tests (4 recovery + 4 Q&A)
- [ ] No flaky tests (run twice to verify)
- [ ] Test patterns consistent with existing E2E tests
</verification>
<success_criteria>
- All tasks completed
- All verification checks pass
- Recovery scenarios prove database is source of truth
- Q&A flow handles multiple questions and state transitions
- No regressions in existing E2E tests
</success_criteria>
<output>
After completion, create `.planning/phases/09-extended-scenarios/09-02-SUMMARY.md`
</output>