
Lukas May a2ab4c4a84 docs: Add comprehensive test inventory with coverage gaps and redundancy map

Audited all 44 test files one by one. Documents what each test verifies,
identifies 12 redundant test pairs, 13 coverage gaps (prioritized), fragility
assessment, and mock style inconsistencies.

2026-03-02 12:23:39 +09:00

Test Inventory

Comprehensive catalog of every test in the suite, what it verifies, and where coverage is weak or redundant.

44 test files | 35 unit | 7 E2E | 4 integration (mocked) | 5 integration (real providers)

Last audited: 2026-03-02

Coverage Matrix

Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehensive with edge cases. A dash (—) marks a dimension that does not apply to the module.

Module	Happy Path	Error/Edge	Integration	Race/Concurrency	Notes
Agent Manager	8	6	6	9	Missing: `getResult` with actual result, multi-provider spawn
Agent OutputHandler	6	4	5	8	Missing: `processAgentOutput` with error status, no-result-line case
Agent FileIO	8	7	8	—	Missing: round-trip write→read, `writeInputFiles` content verification
Agent ProcessManager	6	4	3	0	Heavy mocking; missing: `killProcess`, `cleanupWorktree`, error paths
Agent Lifecycle	8	7	3	0	Missing: controller integration, `analyzeError→shouldRetry` chain
DB Repositories	9	7	9	—	task.test.ts thin: missing `findByParentTaskId`, approval, category queries
DB Cascades	10	—	10	—	Excellent: full graph verified
Dispatch Manager	7	5	8	0	"dependency scenario" test is misleading (deps not implemented)
Phase Dispatch	9	7	8	0	Best dependency tests (diamond pattern, blocking propagation)
Coordination/Merge	7	5	7	0	Overlap with ConflictResolutionService
Git Manager	9	7	9	0	Real git repos, no mocks. Excellent.
Git BranchManager	6	5	6	—	Only `remoteBranchExists` tested. Other methods untested.
Process Manager/Registry	8	6	—	0	Registry is pure and clean. Manager has mock fragility.
Logging	8	6	8	0	Real filesystem. Good.
Events Bus	8	3	—	0	Missing: handler-throws, handler ordering
Server/HTTP	8	6	9	0	Real HTTP + real PID files
Shutdown	5	2	—	0	Missing: signal→handler→shutdown chain, error resilience
tRPC Router	3	1	3	0	Only `health` and `status` tested. All domain procedures untested.
Preview	7	5	4	0	Missing: `discoverConfig`, `docker-client`, `health-checker` unit tests
E2E Workflows	8	6	8	0	Good lifecycle coverage. Output parsing never tested E2E.

Test Files — Detailed Inventory

Agent Module (`src/agent/`)

`manager.test.ts` — MultiProviderAgentManager

Test	Verifies
spawn / creates agent record with provided name	`repository.create` called with name
spawn / rejects duplicate agent names	Unique name constraint enforced
spawn / emits AgentSpawned event with name	Event has agent name in payload
spawn / writes diagnostic files for workdir verification	`spawn-diagnostic.json` written with metadata
spawn / uses custom cwd if provided	`child_process.spawn` gets custom cwd
stop / stops running agent and updates status	DB updated to `stopped`
stop / kills detached process if running	Status update after spawn+stop (does NOT verify kill)
stop / throws if agent not found	Error on missing ID
stop / emits AgentStopped event	Event has `user_requested` reason
list / returns all agents with names	`findAll` returns correct data
get / finds agent by id	`findById` returns agent
get / returns null if not found	`findById` returns null
getByName / finds agent by name	`findByName` returns agent
getByName / returns null if not found	`findByName` returns null
resume / resumes agent waiting for input	Spawn called with `--resume` + session args
resume / rejects if not waiting for input	Error on wrong status
resume / rejects if no session	Error on null sessionId
resume / emits AgentResumed event	Event has sessionId
getResult / returns null when no result	Agent result is null
delete / deletes agent and clears active state	`repository.delete` called
delete / emits agent:deleted event	Event has agent name
delete / throws if not found	Error on missing ID
delete / handles missing workdir gracefully	No error when workdir absent

Mocking: 5 `vi.mock()` calls (child_process, git/manager, node:fs, node:fs/promises, file-tailer). Brittle.

Gaps: No test for `getResult` with an actual result. No multi-provider tests. `findByTaskId`/`findBySessionId` are mocked but never exercised.

`output-handler.test.ts` — OutputHandler

Test	Verifies
processAgentOutput / parse questions from Claude output	`waiting_for_input` status + questions stored
processAgentOutput / handle malformed questions	Falls to `crashed` status
processAgentOutput / handle "done" status	No `waiting_for_input` set
getPendingQuestions / retrieve stored questions	JSON round-trip from DB
getPendingQuestions / return null when none	Null for no questions
formatAnswersAsPrompt / format normal answers	String formatting correct
formatAnswersAsPrompt / instruction-enhanced answers	`__instruction__` prepended
formatAnswersAsPrompt / instruction with whitespace	Trimmed correctly
formatAnswersAsPrompt / only instruction, no answers	Instruction + header only
formatAnswersAsPrompt / empty answers	Header with no entries

Gaps: No test for processAgentOutput with "error" status. No test for output with no result line. Uses real provider registry (fragile).

`completion-detection.test.ts` — readSignalCompletion (private method)

Test	Verifies
detects "questions" status	signal.json parsed, returns content
detects "done" status	Same
detects "error" status	Same
returns null when no signal.json	Missing file returns null
returns null for "running" status	Non-terminal status ignored
handles malformed signal.json	Bad JSON returns null

Concern: All 6 tests access the private method via `(outputHandler as any).readSignalCompletion`. Real filesystem (good). Partially redundant with completion-race-condition.test.ts tests 2-3, which test the same scenarios through the public `handleCompletion` method.

`completion-race-condition.test.ts` — handleCompletion mutex

Test	Verifies
prevent concurrent completion via mutex	Only 1 of 2 concurrent calls executes
handle completion with questions status	Final status `waiting_for_input`
handle completion when done	Final status `idle`
clean up lock even on failure	Mutex released after error

Concern: Module-level mock repo mutated per-test without beforeEach reset. Tests 2-3 overlap with completion-detection.test.ts but are strictly better (test public API).

`mutex-completion.test.ts` — handleCompletion mutex (focused)

Test	Verifies
prevent concurrent completion	Same agentId blocked
allow sequential completion	Same agentId allowed after first finishes
clean up on exception	Lock released on error
per-agent mutex isolation	Different agentIds run concurrently

Excellent focused design. 4 complementary scenarios. Uses real timers (50ms delays). Partial overlap with completion-race-condition.test.ts test 1 — both test the mutex concurrent blocking.
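The four scenarios above can be illustrated with a minimal per-agent lock. This is a sketch only: `AgentLocks`, `tryAcquire`, `release`, and `runExclusive` are hypothetical names chosen for illustration, since this inventory does not show how `handleCompletion` implements its mutex.

```typescript
// Hypothetical per-agent lock; names are illustrative, not the project's API.
class AgentLocks {
  private readonly held = new Set<string>();

  // Returns true if the caller now owns the lock for this agent.
  tryAcquire(agentId: string): boolean {
    if (this.held.has(agentId)) return false;
    this.held.add(agentId);
    return true;
  }

  release(agentId: string): void {
    this.held.delete(agentId);
  }

  // Runs `work` unless a completion for the same agent is already in flight.
  // Resolves to false when the call was skipped.
  async runExclusive(agentId: string, work: () => Promise<void>): Promise<boolean> {
    if (!this.tryAcquire(agentId)) return false;
    try {
      await work();
      return true;
    } finally {
      // Released on success and on exception alike.
      this.release(agentId);
    }
  }
}

const locks = new AgentLocks();
const firstA = locks.tryAcquire("agent-a");  // acquired
const secondA = locks.tryAcquire("agent-a"); // blocked: same agent
const firstB = locks.tryAcquire("agent-b");  // acquired: different agent
locks.release("agent-a");
const thirdA = locks.tryAcquire("agent-a");  // acquired again after release
```

Keying the lock by agent ID is what lets different agents complete concurrently while a second completion for the same agent is rejected.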

`file-io.test.ts` — File I/O utilities

Test	Verifies
generateId / non-empty string	Returns truthy
generateId / unique values	100 unique IDs
writeInputFiles / initiative.md	File created
writeInputFiles / phase.md	File created
writeInputFiles / task.md	File created
writeInputFiles / pages	Files in pages/ subdir
writeInputFiles / empty options	No crash
readSummary / with frontmatter	Parsed correctly
readSummary / missing file	Returns null
readSummary / no frontmatter	Body only
readSummary / empty files_modified	Empty array
readPhaseFiles / from phases/ dir	Parsed with frontmatter
readPhaseFiles / missing dir	Returns []
readPhaseFiles / no dependencies	Defaults to []
readTaskFiles / from tasks/ dir	Parsed with frontmatter
readTaskFiles / defaults category/type	`execute`/`auto` defaults
readTaskFiles / missing dir	Returns []
readDecisionFiles / from decisions/ dir	Parsed
readDecisionFiles / missing dir	Returns []
readPageFiles / from pages/ dir	Parsed with pageId
readPageFiles / missing dir	Returns []
readPageFiles / ignores non-.md	Filter works

Real filesystem, no mocks. Good. Gaps: writeInputFiles only checks existsSync, not file content. No write→read round-trip test.
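A write→read round-trip test could take roughly the following shape. This is a sketch under stated assumptions: `writePages`, `readPages`, the `Page` type, and the frontmatter layout are hypothetical stand-ins, not the real `writeInputFiles`/`readPageFiles` signatures, which this inventory does not document.

```typescript
import { mkdtempSync, mkdirSync, writeFileSync, readFileSync, readdirSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical page shape, for illustration only.
interface Page { pageId: string; title: string; body: string }

// Hypothetical writer: one markdown file per page with a one-field frontmatter header.
function writePages(dir: string, pages: Page[]): void {
  const pagesDir = join(dir, "pages");
  mkdirSync(pagesDir, { recursive: true });
  for (const page of pages) {
    writeFileSync(join(pagesDir, `${page.pageId}.md`), `---\ntitle: ${page.title}\n---\n${page.body}`);
  }
}

// Hypothetical reader: ignores non-.md files and derives pageId from the file name.
function readPages(dir: string): Page[] {
  const pagesDir = join(dir, "pages");
  return readdirSync(pagesDir)
    .filter((name) => name.endsWith(".md"))
    .sort()
    .map((name) => {
      const raw = readFileSync(join(pagesDir, name), "utf8");
      const match = /^---\ntitle: (.*)\n---\n([\s\S]*)$/.exec(raw);
      if (!match) throw new Error(`unparseable page file: ${name}`);
      return { pageId: name.slice(0, -3), title: match[1], body: match[2] };
    });
}

// Writes into a fresh temp directory, reads back, and always cleans up.
function roundTrip(pages: Page[]): Page[] {
  const dir = mkdtempSync(join(tmpdir(), "file-io-roundtrip-"));
  try {
    writePages(dir, pages);
    return readPages(dir);
  } finally {
    rmSync(dir, { recursive: true, force: true });
  }
}

const original: Page[] = [
  { pageId: "a", title: "Alpha", body: "hello\nworld" },
  { pageId: "b", title: "Beta", body: "" },
];
const restored = roundTrip(original);
```

Comparing `restored` against `original` checks file content as well as existence, which is the part the current `existsSync` assertions miss.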

`process-manager.test.ts` — ProcessManager

Test	Verifies
getAgentWorkdir	Path computation
createProjectWorktrees	Worktree creation for initiative projects
createProjectWorktrees / throws on failure	Error propagation
createProjectWorktrees / logs details	Completes without error (weak)
createStandaloneWorktree	Path returned
createStandaloneWorktree / throws on failure	Error propagation
spawnDetached / validates cwd exists	existsSync called
spawnDetached / throws if cwd missing	Error thrown
spawnDetached / passes correct cwd	spawn args verified
spawnDetached / logs spawn info	Completes (weak)
spawnDetached / writes prompt file	PROMPT.md written
buildSpawnCommand / native prompt mode	Command+args assembled
buildSpawnCommand / flag prompt mode	Command+args assembled
buildResumeCommand / flag style	Resume args assembled
buildResumeCommand / no resume support	Error thrown

6 `vi.mock()` calls. Most brittle test file. The two "logs comprehensive ..." tests assert nothing meaningful. Missing: `killProcess`, `cleanupWorktree`.

`mock-manager.test.ts` — MockAgentManager (testing the test double)

39 tests covering: default scenario, configured delay, error/questions scenarios, resume, stop, list, get/getByName, scenario overrides, event emission order, name uniqueness, constructor options, agent modes (execute/discuss/plan/detail), structured questions.

Thorough. One duplicate: "should emit stopped event with detail_complete reason" appears twice with "(second test)" suffix.

Agent Lifecycle (`src/agent/lifecycle/`)

error-analyzer.test.ts — 20 tests. Auth failure (4), usage limit (4), timeout (2), missing signal (3), process crash (4), unknown (2), context preservation (2). Pure behavioral classification tests.

instructions.test.ts — 6 tests. Constant content verification + addInstructionToPrompt pure function tests. Appropriate for 28-line module.

retry-policy.test.ts — 12 tests. Config (2), shouldRetry per error type (8), backoff delays (2). Clean decision-function tests. Missing: AgentExhaustedError/AgentFailureError class tests.

signal-manager.test.ts — 18 tests. clearSignal (2), checkSignalExists (2), readSignal (7), waitForSignal (3), validateSignalFile (4). Real filesystem, no mocks. waitForSignal has real-timer flakiness risk.
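The error-analyzer and retry-policy suites are each tested in isolation; a chained check would feed the classifier's output straight into the retry decision. The sketch below uses hypothetical stand-ins: the signatures of `analyzeError` and `shouldRetry`, the matched strings, and which error kinds count as retryable are all assumptions made for illustration.

```typescript
// Error categories named in error-analyzer.test.ts; the string literals are assumed.
type ErrorKind = "auth_failure" | "usage_limit" | "timeout" | "missing_signal" | "process_crash" | "unknown";

interface AnalyzedError { kind: ErrorKind; message: string }

// Hypothetical classifier. This simplified version never produces "missing_signal".
function analyzeError(stderr: string, exitCode: number | null): AnalyzedError {
  const text = stderr.toLowerCase();
  if (text.includes("unauthorized")) return { kind: "auth_failure", message: stderr };
  if (text.includes("usage limit")) return { kind: "usage_limit", message: stderr };
  if (text.includes("timed out")) return { kind: "timeout", message: stderr };
  if (exitCode !== null && exitCode !== 0) return { kind: "process_crash", message: stderr };
  return { kind: "unknown", message: stderr };
}

// Assumed policy for illustration: only transient failures are retried.
const RETRYABLE: ReadonlySet<ErrorKind> = new Set<ErrorKind>(["timeout", "process_crash", "missing_signal"]);

function shouldRetry(error: AnalyzedError, attempt: number, maxAttempts = 3): boolean {
  return attempt < maxAttempts && RETRYABLE.has(error.kind);
}

// The chain under test: raw failure -> classification -> retry decision.
function decide(stderr: string, exitCode: number | null, attempt: number): boolean {
  return shouldRetry(analyzeError(stderr, exitCode), attempt);
}

const retryTimeout = decide("Request timed out", 1, 1);  // transient, attempts left
const retryAuth = decide("401 Unauthorized", 1, 1);      // not retryable under the assumed policy
const retryExhausted = decide("Request timed out", 1, 3); // attempts used up
```

A test written this way fails if the two modules drift apart, for example if the analyzer starts emitting a kind the policy does not recognise.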

DB Repositories (`src/db/repositories/drizzle/`)

All use real in-memory SQLite via createTestDatabase(). No mocks. Integration tests against real SQL.

agent.test.ts — 20 tests. Full CRUD + findByStatus, findByTaskId, findBySessionId. Gap: never tests initiativeId field.

cascade.test.ts — 4 tests. High value. Validates entire FK cascade graph: initiative→phases→tasks→pages→junction rows. Verifies SET NULL on agents/conversations. Before/after assertions prove setup correctness.

initiative.test.ts — 12 tests. Standard CRUD + findByStatus. Simplest repo, clean.

message.test.ts — 22 tests. Most comprehensive repo test. Covers sender/recipient polymorphism, threading (parentMessageId), status lifecycle (pending→read→responded), filtering (findPendingForUser, findRequiringResponse, findReplies).

phase.test.ts — 21 tests. CRUD + 12 dependency tests (createDependency, getDependencies, getDependents, findDependenciesByInitiativeId). Direct-only (not transitive) verified. Cross-initiative scoping verified.

task.test.ts — 12 tests. Thinnest relative to feature surface. Missing: findByParentTaskId, findByInitiativeId, findPendingApproval, category filtering, requiresApproval behavior.

Shared pattern: 8 tests use setTimeout(resolve, 10) for timestamp differentiation (~80ms total delay).

Dispatch (`src/dispatch/`)

manager.test.ts — 13 tests. Queue, getNextDispatchable (priority, queuedAt ordering), completeTask, blockTask, dispatchNext (no tasks, no agents, success), getQueueState, dependency scenario. Real DB + mock event bus + mock agent manager. Gap: "dependency scenario" test admits dependencies aren't implemented — just retests priority ordering.

phase-manager.test.ts — 17 tests. queuePhase (status validation, deps), getNextDispatchablePhase (dependency blocking), dispatchNextPhase, completePhase, blockPhase, diamond dependency scenario. Best dependency tests: full diamond pattern (A→B, A→C, B→D, C→D) with step-by-step verification. Gap: mock DispatchManager never asserted (task-level dispatch not verified).
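The diamond scenario can be stepped through with a small pure function. This is a sketch, not the PhaseDispatchManager logic: `DependencyMap` and `dispatchable` are hypothetical names, and it assumes each arrow points from a prerequisite to the phase that depends on it.

```typescript
// Maps each phase to the phases it directly depends on.
type DependencyMap = Record<string, readonly string[]>;

// Diamond from the text: A→B, A→C, B→D, C→D (assumed: prerequisite → dependent).
const diamond: DependencyMap = { A: [], B: ["A"], C: ["A"], D: ["B", "C"] };

// A phase is dispatchable when it is not yet completed and all of its
// direct dependencies are completed.
function dispatchable(deps: DependencyMap, completed: ReadonlySet<string>): string[] {
  return Object.keys(deps)
    .filter((phase) => !completed.has(phase))
    .filter((phase) => deps[phase].every((dep) => completed.has(dep)))
    .sort();
}

const step0 = dispatchable(diamond, new Set<string>());                // ["A"]
const step1 = dispatchable(diamond, new Set<string>(["A"]));           // ["B", "C"]
const step2 = dispatchable(diamond, new Set<string>(["A", "B"]));      // ["C"]
const step3 = dispatchable(diamond, new Set<string>(["A", "B", "C"])); // ["D"]
```

Step 2 is the interesting one: D stays blocked after B completes because C is still outstanding.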

Coordination (`src/coordination/`)

manager.test.ts — 14 tests. queueMerge, getNextMergeable (priority ordering), processMerges (success path, conflict detection, resolution task creation, message creation), getQueueState, handleConflict, error handling (undefined repos).

conflict-resolution-service.test.ts — 10 tests. handleConflict (resolution task properties, original task blocking, agent message, event emission, optional deps, error cases, multiple conflicts, parentTaskId preservation).

Overlap: 5+ tests in manager.test.ts duplicate conflict-handling logic from conflict-resolution-service.test.ts.

Git (`src/git/`)

manager.test.ts — 22 tests. create, remove, list, get, diff (clean/added/modified/deleted), merge (success/conflict/abort/cleanup), edge cases. Real git repos on filesystem. No mocks. Excellent. The most trustworthy test file in the suite.

simple-git-branch-manager.test.ts — 4 tests. remoteBranchExists only. Real git with bare repo + clones. Gap: only one method tested; other methods untested.

Process (`src/process/`)

manager.test.ts — 18 tests. spawn, stop, stopAll, restart, isRunning (live/dead/stopped), event emission. Uses mocked execa. Good separation of event bus concern.

registry.test.ts — 14 tests. Pure unit tests, no mocks. register, get, getAll, updateStatus, unregister, getByPid, clear, size. Exemplary.

Logging (`src/logging/`)

manager.test.ts — 13 tests. getBaseDir, ensureLogDir, ensureProcessDir, getProcessDir, getLogPath, listLogs (filtering), cleanOldLogs (retainDays). Real filesystem.

writer.test.ts — 17 tests. open, writeStdout/writeStderr (string/Buffer, timestamps, multi-line), close (flush, idempotent), write-after-close, append mode, event emission, backwards compatibility. Real filesystem + real EventBus.

Events (`src/events/`)

bus.test.ts — 10 tests. emit/on, event isolation, once, off, multiple handlers, typed events, timestamp preservation, factory. Pure behavioral tests. Gap: no handler-throws test, no handler ordering test.
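The two missing cases can be pinned down with a characterization test along these lines. `SketchBus` is a hypothetical stand-in, not the project's EventBus, and its "isolate and collect" error policy is an assumption picked for illustration; the inventory does not say whether the real `emit` swallows or propagates handler errors.

```typescript
type Handler = (payload: unknown) => void;

// Hypothetical minimal bus, NOT the project's EventBus.
class SketchBus {
  private readonly handlers = new Map<string, Handler[]>();

  on(event: string, handler: Handler): void {
    const list = this.handlers.get(event) ?? [];
    list.push(handler);
    this.handlers.set(event, list);
  }

  // Calls handlers in registration order. A throwing handler does not stop
  // later handlers; caught errors are returned for inspection.
  emit(event: string, payload: unknown): unknown[] {
    const errors: unknown[] = [];
    for (const handler of this.handlers.get(event) ?? []) {
      try {
        handler(payload);
      } catch (err) {
        errors.push(err);
      }
    }
    return errors;
  }
}

const bus = new SketchBus();
const calls: string[] = [];
bus.on("agent:deleted", () => { calls.push("first"); });
bus.on("agent:deleted", () => { calls.push("second"); throw new Error("handler failed"); });
bus.on("agent:deleted", () => { calls.push("third"); });
const errors = bus.emit("agent:deleted", { name: "demo" });
```

Against the real bus, the same three-handler setup would show which policy is in effect: either the third handler runs, or the error escapes from `emit`.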

Server (`src/server/`)

index.test.ts — 16 tests. Lifecycle (start/stop/double-start/PID conflict/idempotent stop/isRunning), HTTP endpoints (health/status/404/405), PID file management (create/remove/stale cleanup), event emission. Real HTTP server + real fetch. Gap: no port-already-in-use test.

shutdown.test.ts — 6 tests. shutdown() call order (server.stop before processManager.stopAll), signal handler installation (SIGTERM/SIGINT/SIGHUP). Gap: doesn't verify the signal→handler→shutdown chain, and doesn't test error resilience (for example, server.stop() throwing).

tRPC Router (`src/trpc/`)

router.test.ts — 14 tests. health procedure (shape, Zod, uptime calc, null startedAt, processCount), status procedure (shape, Zod, startedAt, pid, uptime), Zod schema validation.

Major gap: Only health and status are tested. All domain procedures untested: initiatives, tasks, phases, agents, pages, proposals, accounts, conversations, previews.

Preview (`src/preview/`)

compose-generator.test.ts — 10 tests. Pure function tests. YAML generation, build config handling, Caddy deps, Caddyfile routing (single/multi/internal/specificity), labels. No mocks. Strong.

config-reader.test.ts — 6 tests. parseCwPreviewConfig only. Minimal/multi-service, validation errors, internal services, build normalization. Gap: discoverConfig (filesystem discovery) is untested — always mocked in manager.test.ts.

manager.test.ts — 20 tests. start (full lifecycle, phaseId, Docker unavailable, project not found, compose-up failure, health-check failure, no healthcheck), stop (cleanup, missing labels), list (all/filtered/skip non-preview/skip incomplete), getStatus (running/failed/stopped/building/null), stopAll (all/error resilience/empty). 6 vi.mock() calls.

port-allocator.test.ts — 4 tests. Base port, skip used ports, gap filling, real TCP port contention. One real TCP server test. Gap: no upper-bound/exhaustion test.
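An exhaustion test needs a defined behavior to assert against. The sketch below assumes the allocator scans upward from the base port and throws when the range is full; `allocatePort` is a hypothetical pure function, the 9100-9200 range is taken from the gap list later in this document, and treating both ends as inclusive is an assumption.

```typescript
const BASE_PORT = 9100;
const MAX_PORT = 9200; // assumed inclusive

// Returns the lowest port in the range that is not in `used`.
// Throwing on exhaustion is an assumed policy, not confirmed behavior.
function allocatePort(used: ReadonlySet<number>): number {
  for (let port = BASE_PORT; port <= MAX_PORT; port++) {
    if (!used.has(port)) return port;
  }
  throw new Error(`no free port in ${BASE_PORT}-${MAX_PORT}`);
}

// Mark every port in the range as taken.
const allTaken = new Set<number>();
for (let port = BASE_PORT; port <= MAX_PORT; port++) allTaken.add(port);

let exhaustionError: Error | null = null;
try {
  allocatePort(allTaken);
} catch (err) {
  exhaustionError = err as Error;
}

const basePort = allocatePort(new Set<number>());
const skipsUsed = allocatePort(new Set<number>([9100, 9101]));
const fillsGap = allocatePort(new Set<number>([9100, 9102]));
```

Keeping the scan separate from the real TCP probe makes the upper bound testable without binding a hundred sockets.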

E2E Tests (`src/test/e2e/`)

All use createTestHarness() with in-memory DB, MockAgentManager, MockWorktreeManager, real dispatch/coordination managers.

happy-path.test.ts — 6 tests. Single task flow, sequential deps, parallel dispatch, full merge flow, complex dependency flow (diamond), fixture dependency DB verification. Gap: merge flow requires manual agent+worktree bridging (harness not fully integrated).

architect-workflow.test.ts — 8 tests. Discuss mode (complete, pause/resume), plan mode (complete, persist phases), plan conflict detection (reject dup, auto-dismiss stale, different initiatives), full discuss→plan→phases workflow. Gap: architect output is never parsed — mock discards structured data.

decompose-workflow.test.ts — 10 tests. Detail mode (complete, pause/resume, multiple questions), detail conflict detection (3 tests), task persistence (create, all types, dependencies), full workflow. Gap: the filename says "decompose" but the describe block says "Detail". Task dependency rows are never verified in the DB.

phase-dispatch.test.ts — 7 tests. Cleanest E2E file. Independent parallel dispatch, sequential deps, diamond dependency (A→B,C→D), approval gate (pending/in_progress rejection), blocked phase (no dispatch, downstream blocking). No fake timers, no agent involvement.

recovery-scenarios.test.ts — 9 tests. Queue state in DB, in-progress task recovery after crash, blocked state persistence, merge queue recovery, agent Q&A (question/resume, structured questions, status transitions, multiple questions).

edge-cases.test.ts — 14 tests. Agent crash (events, task status, error message), agent waiting/resume (3 tests), task blocking (4 tests), merge conflict handling (4 tests).

extended-scenarios.test.ts — 6 tests. Conflict hand-back round-trip (full lifecycle, context preservation, sequential conflicts), multi-agent parallel work (parallel completion, merge ordering, mixed outcomes).

Integration Tests (`src/test/integration/`)

agent-workdir-verification.test.ts — 2 tests. Skipped by default (REAL_WORKDIR_TESTS). Spawns real Claude agent, verifies working directory path via diagnostic files and agent-created files.

crash-race-condition.test.ts — 4 tests. Always runs. Tests signal.json-based completion detection vs crash marking. Tests 2 and 4 overlap (same setup, test 4 has weaker assertions). Accesses private methods via `as any`.

real-claude.test.ts — 3 tests. Skipped by default. Direct Claude CLI contract tests: done/questions/error status outputs validated against Zod schema. ~$0.10.

real-e2e-crash.test.ts — 1 test. Dead test: expect(true).toBe(true). Hardcoded absolute paths. Diagnostic script, not a regression test.

Real Provider Tests (`src/test/integration/real-providers/`)

All skipped by default. ~$0.50 total per full run.

claude-manager.test.ts — 7 tests. Output parsing (text_delta, init/session, result), questions flow, session resume (2 tests), error handling. Resume test accepts crashed as valid (too lenient).

codex-manager.test.ts — 3 tests. Spawn, output parsing, provider config. All accept crashed as valid — pass vacuously when Codex isn't installed.

conversation.test.ts — 1 test. Most ambitious: two real Claude agents doing actual coding while communicating via cw ask/listen/answer. Verifies 2 conversations, file creation, content, and interleaved coding+conversation. ~$0.30, 5-minute timeout.

crash-recovery.test.ts — 4 tests. Server restart simulation (resume streaming, mark dead agent, process output for dead agent), event idempotency (no duplicate events on restart).

schema-retry.test.ts — 9 tests. Valid output (done, questions, multiple questions), retry logic (bad→good, code blocks, surrounding text), mode-specific schemas (discuss, plan, detail). Tests 5-6 have conditional assertions (silently pass on crash).

Redundancy Map

Tests covering the same scenario multiple times. Bold = the better version to keep.

Scenario	Tests
Signal.json completion detection (questions→waiting)	`completion-detection.test.ts` #1, **`completion-race-condition.test.ts` #2**
Signal.json completion detection (done→idle)	`completion-detection.test.ts` #2, **`completion-race-condition.test.ts` #3**
Mutex concurrent blocking	`completion-race-condition.test.ts` #1, **`mutex-completion.test.ts` #1** (more focused)
Mutex cleanup on error	`completion-race-condition.test.ts` #4, **`mutex-completion.test.ts` #3** (more focused)
Agent Q&A resume flow	`edge-cases.test.ts` #5, **`recovery-scenarios.test.ts` #5** (has options)
Agent status transitions (running→waiting→idle)	`edge-cases.test.ts` #6, **`recovery-scenarios.test.ts` #8** (more thorough)
Task blocking (DB status)	`edge-cases.test.ts` #10, `recovery-scenarios.test.ts` #3
Merge conflict → resolution task	`edge-cases.test.ts` #13, **`extended-scenarios.test.ts` #1** (full round-trip)
Conflict resolution task creation	`coordination/manager.test.ts` (5 tests), `conflict-resolution-service.test.ts` (10 tests)
MockAgentManager detail_complete reason	`mock-manager.test.ts` has duplicate test "(second test)"
Crash marking with no signal	**`crash-race-condition.test.ts` #2**, `crash-race-condition.test.ts` #4 (weaker)
`real-e2e-crash.test.ts` #1	Dead: `expect(true).toBe(true)` — contributes nothing

Coverage Gaps (Priority Order)

Critical

tRPC domain procedures entirely untested — initiatives, tasks, phases, agents, pages, proposals, accounts, conversations, previews. Only health/status have tests. This is the single largest gap.
Agent output parsing never tested E2E — Architect helpers (setArchitectPlanComplete, setArchitectDetailComplete) discard structured output. The pipeline from agent output → DB persistence is exercised nowhere.
task.test.ts thin relative to feature surface — Missing: findByParentTaskId, findByInitiativeId, findPendingApproval, approveTask, category filtering, requiresApproval field behavior. These features exist in production but have zero direct unit test coverage.

Important

discoverConfig (preview config discovery) untested — The filesystem walk logic (.cw-preview.yml → docker-compose.yml → Dockerfile) is always mocked away.
docker-client.js and health-checker.js have no unit tests — Always mocked. No verification of the Docker shell-out layer.
Signal handler → shutdown chain untested — shutdown.test.ts verifies listeners are installed and shutdown() works when called directly, but never tests signal → handler → shutdown.
No concurrent dispatch/merge tests — For a multi-agent system, racing dispatchNext() or concurrent merges are never exercised.
processAgentOutput missing error status test — "done" and "questions" tested, "error" not tested.

Minor

EventBus handler-throws behavior untested — Does emit swallow or propagate handler errors?
Port allocator upper-bound/exhaustion untested — What happens when all ports 9100-9200 are taken?
error-analyzer → retry-policy integration untested — Each tested independently but the chain (analyzeError output fed to shouldRetry) is never verified end-to-end.
Git BranchManager incomplete — Only remoteBranchExists tested. Other methods (if any) untested.
ProcessManager.killProcess and cleanupWorktree untested — No unit test for these methods.

Fragility Assessment

High Fragility (refactoring will break tests)

agent/process-manager.test.ts — 6 vi.mock() calls. Any import restructuring breaks tests.
agent/manager.test.ts — 5 vi.mock() calls. Same concern.
preview/manager.test.ts — 6 vi.mock() calls. Same concern.
completion-detection.test.ts — Tests private method `readSignalCompletion` via `as any`.
completion-race-condition.test.ts — Monkey-patches private methods. Module-level mock mutated without reset.
crash-race-condition.test.ts — Accesses private `handleCompletion` via `as any`.

Low Fragility (behavioral tests)

DB repository tests — Real in-memory SQLite, test public API only.
git/manager.test.ts — Real git repos, no mocks.
file-io.test.ts — Real filesystem, pure function tests.
process/registry.test.ts — Pure unit tests, no mocks.
preview/compose-generator.test.ts — Pure function tests.
events/bus.test.ts — Pure behavioral tests.
E2E tests — Test through public harness API.

Mock Style Inconsistencies

The test suite uses 4 different mocking patterns with no shared convention:

Pattern	Files	Concern
`vi.mock()` module-level	manager.test.ts, process-manager.test.ts, preview/manager.test.ts	Import-structure coupled
Factory function (`createMockX()`)	output-handler.test.ts, coordination/*.test.ts, server/*.test.ts	Cleaner but copy-pasted
Inline const object	completion-race-condition.test.ts, mutex-completion.test.ts	Simple but mutated per-test
Hand-rolled class	harness.ts (MockWorktreeManager, CapturingEventBus)	Best for shared infrastructure

createMockEventBus() is independently defined in 5+ files. Should be extracted to shared test utility.
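An extracted helper might look like the sketch below. Only the name `createMockEventBus` comes from the suite; the `DomainEvent` shape, the `emit`/`on` interface, and the `emitted`/`emittedOfType` inspection helpers are assumptions, and the real EventBus interface may differ.

```typescript
// Assumed event shape; the project's typed events are not shown in this inventory.
interface DomainEvent { type: string; payload?: unknown }
type Listener = (event: DomainEvent) => void;

interface MockEventBus {
  emit(event: DomainEvent): void;
  on(type: string, listener: Listener): void;
  // Every event passed to emit, in call order.
  emitted: DomainEvent[];
  emittedOfType(type: string): DomainEvent[];
}

// Returns a fresh bus on every call, so state never leaks between tests.
function createMockEventBus(): MockEventBus {
  const emitted: DomainEvent[] = [];
  const listeners = new Map<string, Listener[]>();
  return {
    emitted,
    emit(event) {
      emitted.push(event);
      for (const listener of listeners.get(event.type) ?? []) listener(event);
    },
    on(type, listener) {
      listeners.set(type, [...(listeners.get(type) ?? []), listener]);
    },
    emittedOfType(type) {
      return emitted.filter((event) => event.type === type);
    },
  };
}

const busA = createMockEventBus();
const busB = createMockEventBus();
let received = 0;
busA.on("agent:deleted", () => { received += 1; });
busA.emit({ type: "agent:deleted", payload: { name: "demo" } });
```

Calling the factory inside `beforeEach` would also address the module-level mutation concern noted for the inline const object pattern.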

Dead / Worthless Tests

Test	File	Reason
`expect(true).toBe(true)`	`real-e2e-crash.test.ts`	Can never fail. Hardcoded paths.
"logs comprehensive spawn information"	`process-manager.test.ts`	Asserts nothing meaningful
"logs comprehensive worktree creation details"	`process-manager.test.ts`	Asserts nothing meaningful
"should demonstrate the race condition exists"	`crash-race-condition.test.ts` #4	Only asserts `updateCalls.length > 0`
"should emit stopped event (second test)"	`mock-manager.test.ts`	Exact duplicate of previous test
