docs(07): create phase plan

Phase 07: Mock Agent & Test Harness
- 2 plans in 2 waves
- 1 parallel (07-01), 1 sequential (07-02)
- Ready for execution
Lukas May
2026-01-31 08:15:02 +01:00
parent 95ac68b86b
commit d0e9acf512
2 changed files with 333 additions and 0 deletions

@@ -0,0 +1,136 @@
---
phase: 07-mock-agent-test-harness
plan: 01
type: execute
wave: 1
depends_on: []
files_modified: [src/agent/mock-manager.ts, src/agent/mock-manager.test.ts, src/agent/index.ts]
autonomous: true
---
<objective>
Implement MockAgentManager adapter for test scenarios.
Purpose: Enable E2E testing of dispatch/coordination flows without spawning real Claude agents. The mock adapter simulates configurable agent behaviors (success, crash, waiting_for_input) and emits proper lifecycle events.
Output: MockAgentManager class implementing AgentManager port with scenario configuration.
</objective>
<execution_context>
@~/.claude/get-shit-done/workflows/execute-plan.md
@~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@src/agent/types.ts
@src/agent/manager.ts
@src/events/types.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement MockAgentManager adapter</name>
<files>src/agent/mock-manager.ts</files>
<action>
Create MockAgentManager class implementing AgentManager interface with configurable scenarios.
**Scenario configuration interface:**
```typescript
interface MockAgentScenario {
/** How agent completes: 'success' | 'crash' | 'waiting_for_input' */
outcome: 'success' | 'crash' | 'waiting_for_input';
/** Delay before completion (ms). Default 0 for synchronous tests. */
delay?: number;
/** Result message for success/crash */
message?: string;
/** Files modified (for success) */
filesModified?: string[];
/** Question to surface (for waiting_for_input) */
question?: string;
}
```
**Constructor takes:**
- eventBus?: EventBus (optional, for event emission)
- defaultScenario?: MockAgentScenario (defaults to immediate success)
**Key behaviors:**
- spawn(): Create agent record in internal Map, schedule completion based on scenario
- Use per-agent scenario override via `setScenario(agentName: string, scenario: MockAgentScenario)`
- Emit all lifecycle events: agent:spawned, agent:stopped, agent:crashed, agent:waiting, agent:resumed
- Store a session ID (UUID) per agent so that resume() can be tested
- stop(): Mark agent stopped, emit agent:stopped event
- resume(): Re-run scenario for resumed agent
- getResult(): Return stored result after completion
**DO NOT:**
- Use real execa/subprocess - this is all in-memory simulation
- Block on spawn() - completion happens async via setTimeout (even if delay=0)
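**Illustrative sketch (not the final implementation):**
The spawn, setScenario, and stop behaviors above could look roughly like this. `MockAgentManagerSketch`, the `AgentInfo` shape, the `'idle'` status after success, and the `EventBus.emit` signature are assumptions made for illustration; the real types live in src/agent/types.ts and src/events/types.ts and may differ. resume() and getResult() are left out to keep the sketch short.

```typescript
// Sketch only. AgentInfo, the status values and the EventBus shape are
// assumptions, not the real types from src/agent/types.ts or src/events/types.ts.
interface MockAgentScenario {
  outcome: 'success' | 'crash' | 'waiting_for_input';
  delay?: number;
}

interface AgentInfo {
  id: string;
  name: string;
  status: 'running' | 'idle' | 'crashed' | 'waiting_for_input' | 'stopped';
}

interface EventBus {
  emit(event: { type: string; agentId: string }): void;
}

class MockAgentManagerSketch {
  private readonly agents = new Map<string, AgentInfo>();
  private readonly scenarios = new Map<string, MockAgentScenario>();
  private readonly timers = new Map<string, ReturnType<typeof setTimeout>>();
  private readonly eventBus: EventBus | undefined;
  private readonly defaultScenario: MockAgentScenario;
  private nextId = 1;

  constructor(eventBus?: EventBus, defaultScenario?: MockAgentScenario) {
    this.eventBus = eventBus;
    this.defaultScenario = defaultScenario ?? { outcome: 'success', delay: 0 };
  }

  /** Per-agent override, keyed by agent name. */
  setScenario(agentName: string, scenario: MockAgentScenario): void {
    this.scenarios.set(agentName, scenario);
  }

  async spawn(options: { name: string }): Promise<AgentInfo> {
    const agent: AgentInfo = {
      id: `agent-${this.nextId++}`,
      name: options.name,
      status: 'running',
    };
    this.agents.set(agent.id, agent);
    this.emit('agent:spawned', agent.id);

    const scenario = this.scenarios.get(options.name) ?? this.defaultScenario;
    // Completion is always deferred, even for delay 0, so spawn() never blocks.
    const timer = setTimeout(() => this.complete(agent, scenario), scenario.delay ?? 0);
    this.timers.set(agent.id, timer);
    return { ...agent };
  }

  async stop(agentId: string): Promise<void> {
    const agent = this.agents.get(agentId);
    if (!agent) throw new Error(`Unknown agent: ${agentId}`);
    const timer = this.timers.get(agentId);
    if (timer !== undefined) clearTimeout(timer); // cancel the pending completion
    this.timers.delete(agentId);
    agent.status = 'stopped';
    this.emit('agent:stopped', agentId);
  }

  async get(agentId: string): Promise<AgentInfo | null> {
    const agent = this.agents.get(agentId);
    return agent ? { ...agent } : null;
  }

  /** Applies the scenario outcome once the scheduled timer fires. */
  private complete(agent: AgentInfo, scenario: MockAgentScenario): void {
    this.timers.delete(agent.id);
    if (scenario.outcome === 'success') {
      agent.status = 'idle'; // assumed post-success status
      this.emit('agent:stopped', agent.id);
    } else if (scenario.outcome === 'crash') {
      agent.status = 'crashed';
      this.emit('agent:crashed', agent.id);
    } else {
      agent.status = 'waiting_for_input';
      this.emit('agent:waiting', agent.id);
    }
  }

  private emit(type: string, agentId: string): void {
    this.eventBus?.emit({ type, agentId });
  }
}
```

Because completion always goes through setTimeout, a caller sees status `running` and only `agent:spawned` immediately after `await spawn(...)`, which is what makes event-order assertions and fake-timer tests predictable.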
</action>
<verify>npm run build succeeds, TypeScript compiles without errors</verify>
<done>MockAgentManager implements AgentManager interface with scenario configuration</done>
</task>
<task type="auto">
<name>Task 2: Write comprehensive tests for MockAgentManager</name>
<files>src/agent/mock-manager.test.ts, src/agent/index.ts</files>
<action>
Create test suite for MockAgentManager covering all scenarios.
**Test categories:**
1. spawn() with default scenario (immediate success)
2. spawn() with configured delay
3. spawn() with crash scenario - emits agent:crashed, result.success=false
4. spawn() with waiting_for_input - emits agent:waiting, status='waiting_for_input'
5. resume() after waiting_for_input - emits agent:resumed, continues with scenario
6. stop() cancels the scheduled completion and emits agent:stopped
7. list() returns all agents with correct status
8. get() and getByName() lookups work
9. setScenario() overrides for specific agent names
10. Event emission order verification (spawned before completion events)
**Pattern to follow:**
Use the same createMockEventBus() pattern as dispatch/manager.test.ts.
Use async/await with Vitest's fake timers for delay testing:
```typescript
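import { vi } from 'vitest';

vi.useFakeTimers();
try {
  await manager.spawn({ name: 'test', taskId: 't1', prompt: 'do thing' });
  await vi.advanceTimersByTimeAsync(100);
  // verify completion happened
} finally {
  vi.useRealTimers(); // restore real timers even if an assertion throws
}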
```
**Export MockAgentManager from src/agent/index.ts**
</action>
<verify>npm test passes all MockAgentManager tests, at least 10 test cases</verify>
<done>MockAgentManager has comprehensive test coverage for all scenario types</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npm test` passes all tests including new MockAgentManager tests
- [ ] MockAgentManager implements full AgentManager interface
- [ ] All three outcome scenarios work: success, crash, waiting_for_input
- [ ] Events emitted correctly for each scenario
- [ ] MockAgentManager exported from src/agent/index.ts
</verification>
<success_criteria>
- All tasks completed
- MockAgentManager can simulate any agent lifecycle scenario
- Test suite proves all scenarios work correctly
- No errors or warnings introduced
</success_criteria>
<output>
After completion, create `.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md`
</output>

@@ -0,0 +1,197 @@
---
phase: 07-mock-agent-test-harness
plan: 02
type: execute
wave: 2
depends_on: ["07-01"]
files_modified: [src/test/harness.ts, src/test/fixtures.ts, src/test/index.ts, src/test/harness.test.ts]
autonomous: true
---
<objective>
Create test harness with database fixtures and full system wiring.
Purpose: Provide reusable E2E test setup that wires MockAgentManager with real DispatchManager and CoordinationManager, plus helpers for seeding database hierarchies (initiative → phase → plan → task with dependencies).
Output: Test harness module (src/test/) with fixtures and system factory.
</objective>
<execution_context>
@~/.claude/get-shit-done/workflows/execute-plan.md
@~/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/07-mock-agent-test-harness/07-01-SUMMARY.md
@src/agent/types.ts
@src/dispatch/types.ts
@src/coordination/types.ts
@src/db/repositories/drizzle/test-helpers.ts
</context>
<tasks>
<task type="auto">
<name>Task 1: Create fixture helpers for database seeding</name>
<files>src/test/fixtures.ts</files>
<action>
Create fixture helpers that seed complete task hierarchies.
**Fixture interface:**
```typescript
interface TaskFixture {
id: string;
name: string;
priority?: 'low' | 'medium' | 'high';
dependsOn?: string[]; // names of other tasks in same fixture
}
interface PlanFixture {
name: string;
tasks: TaskFixture[];
}
interface PhaseFixture {
name: string;
plans: PlanFixture[];
}
interface InitiativeFixture {
name: string;
phases: PhaseFixture[];
}
```
**seedFixture(db: DrizzleDatabase, fixture: InitiativeFixture): Promise<SeededFixture>**
- Creates initiative, phases, plans, tasks in correct order
- Resolves task dependencies by name → ID mapping
- Returns SeededFixture with all created IDs:
```typescript
interface SeededFixture {
initiativeId: string;
phases: Map<string, string>; // name → id
plans: Map<string, string>; // name → id
tasks: Map<string, string>; // name → id
}
```
**Convenience fixtures:**
- `SIMPLE_FIXTURE`: 1 initiative → 1 phase → 1 plan → 3 tasks (Task A; Task B and Task C each depend on Task A)
- `PARALLEL_FIXTURE`: 1 initiative → 1 phase → 2 plans (each with 2 independent tasks)
- `COMPLEX_FIXTURE`: 1 initiative → 2 phases → 4 plans with cross-plan dependencies
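**Illustrative sketch of name → ID resolution:**
One way to resolve `dependsOn` names is a two-pass walk: assign every task an ID first, then translate names to IDs, so that forward references and cross-plan dependencies both resolve. This is an in-memory stand-in only; `seedTasks`, `SeededTask`, and the injected `newId` generator are hypothetical names, the fixture types are trimmed to the fields the sketch needs, and the real seedFixture would presumably write through the Drizzle repositories instead.

```typescript
// In-memory stand-in for the dependency-resolution step of seedFixture.
// seedTasks, SeededTask and newId are hypothetical; no database is involved.
interface TaskFixture {
  name: string;
  dependsOn?: string[]; // names of other tasks in the same fixture
}

interface PlanFixture {
  name: string;
  tasks: TaskFixture[];
}

interface SeededTask {
  id: string;
  name: string;
  dependsOn: string[]; // resolved task IDs
}

function seedTasks(
  plans: PlanFixture[],
  newId: () => string,
): { ids: Map<string, string>; rows: SeededTask[] } {
  const allTasks = plans.flatMap((plan) => plan.tasks);

  // Pass 1: give every task an ID so later lookups never miss a forward reference.
  const ids = new Map<string, string>();
  for (const task of allTasks) {
    if (ids.has(task.name)) throw new Error(`Duplicate task name: ${task.name}`);
    ids.set(task.name, newId());
  }

  // Pass 2: translate dependency names into the IDs assigned above.
  const rows = allTasks.map((task) => ({
    id: ids.get(task.name)!,
    name: task.name,
    dependsOn: (task.dependsOn ?? []).map((depName) => {
      const depId = ids.get(depName);
      if (depId === undefined) {
        throw new Error(`Task "${task.name}" depends on unknown task "${depName}"`);
      }
      return depId;
    }),
  }));

  return { ids, rows };
}

// The task layout described for SIMPLE_FIXTURE, reduced to its single plan.
const SIMPLE_PLAN: PlanFixture = {
  name: 'Plan 1',
  tasks: [
    { name: 'Task A' },
    { name: 'Task B', dependsOn: ['Task A'] },
    { name: 'Task C', dependsOn: ['Task A'] },
  ],
};
```

Failing loudly on duplicate or unknown names keeps a typo in a fixture from silently producing a task with no dependencies.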
</action>
<verify>TypeScript compiles, fixtures are valid data structures</verify>
<done>Fixture helpers create complete task hierarchies with dependency resolution</done>
</task>
<task type="auto">
<name>Task 2: Create test harness with full system wiring</name>
<files>src/test/harness.ts, src/test/index.ts</files>
<action>
Create a TestHarness, exposed through the createTestHarness() factory, that wires up the full system for E2E testing.
**TestHarness interface:**
```typescript
interface TestHarness {
// Core components
db: DrizzleDatabase;
eventBus: EventBus & { emittedEvents: DomainEvent[] };
agentManager: MockAgentManager;
dispatchManager: DispatchManager;
coordinationManager: CoordinationManager;
// Repositories
taskRepository: TaskRepository;
messageRepository: MessageRepository;
agentRepository: AgentRepository;
// Helpers
seedFixture(fixture: InitiativeFixture): Promise<SeededFixture>;
setAgentScenario(agentName: string, scenario: MockAgentScenario): void;
getEventsByType(type: string): DomainEvent[];
clearEvents(): void;
}
```
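**Illustrative sketch of event capture:**
The `emittedEvents`, `getEventsByType`, and `clearEvents` members could be backed by a bus that records every event before delivering it. `RecordingEventBus`, its `on` method, and the `DomainEvent` shape below are assumptions for illustration; the harness is expected to wrap the real EventEmitterBus, whose API may differ.

```typescript
// Sketch only: DomainEvent and the on/emit signatures are assumed shapes,
// not the real definitions from src/events/types.ts.
interface DomainEvent {
  type: string;
  payload?: unknown;
}

type Handler = (event: DomainEvent) => void;

class RecordingEventBus {
  /** Every event passed to emit(), in emission order. */
  readonly emittedEvents: DomainEvent[] = [];
  private readonly handlers = new Map<string, Set<Handler>>();

  on(type: string, handler: Handler): void {
    let set = this.handlers.get(type);
    if (!set) {
      set = new Set<Handler>();
      this.handlers.set(type, set);
    }
    set.add(handler);
  }

  emit(event: DomainEvent): void {
    // Record first, so the event is captured even if a handler throws.
    this.emittedEvents.push(event);
    this.handlers.get(event.type)?.forEach((handler) => handler(event));
  }

  getEventsByType(type: string): DomainEvent[] {
    return this.emittedEvents.filter((event) => event.type === type);
  }

  clearEvents(): void {
    // Truncate in place so existing references to emittedEvents stay valid.
    this.emittedEvents.length = 0;
  }
}
```

Calling `clearEvents()` between the arrange and act steps of a test keeps seeding noise out of the assertions.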
**createTestHarness(): TestHarness**
- Creates in-memory SQLite database (createTestDatabase)
- Creates EventEmitterBus (real event bus for event verification)
- Creates MockAgentManager (with eventBus)
- Creates MockWorktreeManager (simple in-memory, creates fake worktrees)
- Creates real DefaultDispatchManager (with mock agent manager)
- Creates real DefaultCoordinationManager (with mock worktree manager)
- Wires all repositories
**MockWorktreeManager (inline in harness.ts):**
- Simple Map<string, WorktreeInfo> for worktree storage
- create(): returns fake worktree with random path
- get(): lookup by ID
- remove(): delete from map
- merge(): returns success (no actual git operations)
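**Illustrative sketch of MockWorktreeManager:**
A minimal in-memory version of the four methods above might look like this. The `WorktreeInfo` fields, the method parameters, and the `MergeResult` shape are assumptions; the real port is defined elsewhere in the codebase and should be used as the source of truth.

```typescript
// Sketch only: WorktreeInfo, MergeResult and the method signatures are assumed.
interface WorktreeInfo {
  id: string;
  branch: string;
  path: string;
}

interface MergeResult {
  success: boolean;
  message: string;
}

class MockWorktreeManagerSketch {
  private readonly worktrees = new Map<string, WorktreeInfo>();

  async create(id: string, branch: string): Promise<WorktreeInfo> {
    // Fake path with a random suffix; nothing is created on disk.
    const suffix = Math.random().toString(36).slice(2, 10);
    const info: WorktreeInfo = { id, branch, path: `/tmp/mock-worktrees/${id}-${suffix}` };
    this.worktrees.set(id, info);
    return info;
  }

  async get(id: string): Promise<WorktreeInfo | null> {
    return this.worktrees.get(id) ?? null;
  }

  async remove(id: string): Promise<void> {
    this.worktrees.delete(id);
  }

  async merge(id: string, targetBranch: string): Promise<MergeResult> {
    // No git operations: the mock always reports a successful merge.
    return { success: true, message: `Merged ${id} into ${targetBranch} (mock)` };
  }
}
```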
</action>
<verify>npm run build succeeds</verify>
<done>TestHarness wires full system with mocks for E2E scenarios</done>
</task>
<task type="auto">
<name>Task 3: Write tests proving harness works</name>
<files>src/test/harness.test.ts</files>
<action>
Write tests that prove the test harness enables E2E scenarios.
**Test cases:**
1. createTestHarness() returns all components
2. seedFixture() creates task hierarchy, returns correct IDs
3. Task dependencies resolved correctly (dependsOn contains actual task IDs)
4. dispatchManager.queue() + dispatchNext() use MockAgentManager
5. Event capture works (getEventsByType returns filtered events)
6. Agent completion triggers expected events
7. Full dispatch → complete → merge flow works end-to-end
**Key verification:**
The test should prove this flow works:
```typescript
const harness = createTestHarness();
const fixture = await harness.seedFixture(SIMPLE_FIXTURE);
await harness.dispatchManager.queue(fixture.tasks.get('Task A')!);
const result = await harness.dispatchManager.dispatchNext();
// Agent completes (mock scenario)
await harness.dispatchManager.completeTask(fixture.tasks.get('Task A')!);
// Verify events
const events = harness.getEventsByType('task:completed');
expect(events.length).toBe(1);
```
</action>
<verify>npm test passes harness tests</verify>
<done>Test harness proven to work for E2E scenarios</done>
</task>
</tasks>
<verification>
Before declaring plan complete:
- [ ] `npm run build` succeeds without errors
- [ ] `npm test` passes all tests
- [ ] createTestHarness() returns fully wired system
- [ ] seedFixture() creates complete hierarchies
- [ ] Task dependencies resolved by name
- [ ] MockWorktreeManager integrated
- [ ] At least one E2E flow test passes
</verification>
<success_criteria>
- All tasks completed
- Test harness enables E2E testing without real Claude agents
- Fixtures seed complex task hierarchies
- Full dispatch → coordination flow works with mocks
</success_criteria>
<output>
After completion, create `.planning/phases/07-mock-agent-test-harness/07-02-SUMMARY.md`
</output>