diff --git a/docs/test-inventory.md b/docs/test-inventory.md index 32ba637..5c92a6c 100644 --- a/docs/test-inventory.md +++ b/docs/test-inventory.md @@ -2,7 +2,7 @@ Comprehensive catalog of every test in the suite, what it verifies, and where coverage is weak or redundant. -**44 test files** | **35 unit** | **7 E2E** | **4 integration (mocked)** | **5 integration (real providers)** +**41 test files** | **33 unit** | **7 E2E** | **3 integration (mocked)** | **5 integration (real providers)** Last audited: 2026-03-02 @@ -93,35 +93,7 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehens --- -#### `completion-detection.test.ts` — readSignalCompletion (private method) - -| Test | Verifies | -|------|----------| -| detects "questions" status | signal.json parsed, returns content | -| detects "done" status | Same | -| detects "error" status | Same | -| returns null when no signal.json | Missing file returns null | -| returns null for "running" status | Non-terminal status ignored | -| handles malformed signal.json | Bad JSON returns null | - -**Concern**: All 6 tests access private method via `(outputHandler as any).readSignalCompletion`. Real filesystem (good). **Partially redundant with `completion-race-condition.test.ts` tests 2-3** which test the same scenarios through the public `handleCompletion` method. - ---- - -#### `completion-race-condition.test.ts` — handleCompletion mutex - -| Test | Verifies | -|------|----------| -| prevent concurrent completion via mutex | Only 1 of 2 concurrent calls executes | -| handle completion with questions status | Final status `waiting_for_input` | -| handle completion when done | Final status `idle` | -| clean up lock even on failure | Mutex released after error | - -**Concern**: Module-level mock repo mutated per-test without `beforeEach` reset. Tests 2-3 overlap with `completion-detection.test.ts` but are strictly better (test public API). - ---- - -#### `mutex-completion.test.ts` — handleCompletion mutex (focused) +#### `mutex-completion.test.ts` — handleCompletion mutex | Test | Verifies | |------|----------| @@ -130,7 +102,7 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehens | clean up on exception | Lock released on error | | per-agent mutex isolation | Different agentIds run concurrently | -**Excellent focused design.** 4 complementary scenarios. Uses real timers (50ms delays). **Partial overlap with `completion-race-condition.test.ts` test 1** — both test the mutex concurrent blocking. +**Excellent focused design.** 4 complementary scenarios. Uses real timers (50ms delays). --- @@ -172,28 +144,26 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehens | getAgentWorkdir | Path computation | | createProjectWorktrees | Worktree creation for initiative projects | | createProjectWorktrees / throws on failure | Error propagation | -| createProjectWorktrees / logs details | Completes without error (weak) | | createStandaloneWorktree | Path returned | | createStandaloneWorktree / throws on failure | Error propagation | | spawnDetached / validates cwd exists | existsSync called | | spawnDetached / throws if cwd missing | Error thrown | | spawnDetached / passes correct cwd | spawn args verified | -| spawnDetached / logs spawn info | Completes (weak) | | spawnDetached / writes prompt file | PROMPT.md written | | buildSpawnCommand / native prompt mode | Command+args assembled | | buildSpawnCommand / flag prompt mode | Command+args assembled | | buildResumeCommand / flag style | Resume args assembled | | buildResumeCommand / no resume support | Error thrown | -**6 `vi.mock()` calls.** Most brittle test file. "Logs comprehensive" tests are empty assertions. Missing: `killProcess`, `cleanupWorktree`. +**6 `vi.mock()` calls.** Most brittle test file. Missing: `killProcess`, `cleanupWorktree`. --- #### `mock-manager.test.ts` — MockAgentManager (testing the test double) -39 tests covering: default scenario, configured delay, error/questions scenarios, resume, stop, list, get/getByName, scenario overrides, event emission order, name uniqueness, constructor options, agent modes (execute/discuss/plan/detail), structured questions. +43 tests covering: default scenario, configured delay, error/questions scenarios, resume, stop, list, get/getByName, scenario overrides, event emission order, name uniqueness, constructor options, agent modes (execute/discuss/plan/detail), structured questions. -**Thorough.** One duplicate: "should emit stopped event with detail_complete reason" appears twice with "(second test)" suffix. +**Thorough.** --- @@ -319,7 +289,7 @@ All use `createTestHarness()` with in-memory DB, MockAgentManager, MockWorktreeM **`recovery-scenarios.test.ts`** — 9 tests. Queue state in DB, in-progress task recovery after crash, blocked state persistence, merge queue recovery, agent Q&A (question/resume, structured questions, status transitions, multiple questions). -**`edge-cases.test.ts`** — 14 tests. Agent crash (events, task status, error message), agent waiting/resume (3 tests), task blocking (4 tests), merge conflict handling (4 tests). +**`edge-cases.test.ts`** — 12 tests. Agent crash (events, task status, error message), agent waiting (1 test), task blocking (4 tests), merge conflict handling (4 tests). **`extended-scenarios.test.ts`** — 6 tests. Conflict hand-back round-trip (full lifecycle, context preservation, sequential conflicts), multi-agent parallel work (parallel completion, merge ordering, mixed outcomes). @@ -329,12 +299,10 @@ All use `createTestHarness()` with in-memory DB, MockAgentManager, MockWorktreeM **`agent-workdir-verification.test.ts`** — 2 tests. Skipped by default (`REAL_WORKDIR_TESTS`). Spawns real Claude agent, verifies working directory path via diagnostic files and agent-created files. -**`crash-race-condition.test.ts`** — 4 tests. Always runs. Tests signal.json-based completion detection vs crash marking. Tests 2 and 4 overlap (same setup, test 4 has weaker assertions). Accesses private methods via `as any`. +**`crash-race-condition.test.ts`** — 3 tests. Always runs. Tests signal.json-based completion detection vs crash marking. Covers questions, no-signal crash, and slim-wildebeest scenario. Accesses private methods via `as any`. **`real-claude.test.ts`** — 3 tests. Skipped by default. Direct Claude CLI contract tests: done/questions/error status outputs validated against Zod schema. ~$0.10. -**`real-e2e-crash.test.ts`** — 1 test. **Dead test**: `expect(true).toBe(true)`. Hardcoded absolute paths. Diagnostic script, not a regression test. - --- ### Real Provider Tests (`src/test/integration/real-providers/`) @@ -353,24 +321,15 @@ All skipped by default. ~$0.50 total per full run. --- -## Redundancy Map +## Remaining Redundancy -Tests covering the same scenario multiple times. **Bold** = the better version to keep. +Tests covering the same scenario multiple times. | Scenario | Tests | |----------|-------| -| Signal.json completion detection (questions→waiting) | `completion-detection.test.ts` #1, **`completion-race-condition.test.ts` #2** | -| Signal.json completion detection (done→idle) | `completion-detection.test.ts` #2, **`completion-race-condition.test.ts` #3** | -| Mutex concurrent blocking | `completion-race-condition.test.ts` #1, **`mutex-completion.test.ts` #1** (more focused) | -| Mutex cleanup on error | `completion-race-condition.test.ts` #4, **`mutex-completion.test.ts` #3** (more focused) | -| Agent Q&A resume flow | `edge-cases.test.ts` #5, **`recovery-scenarios.test.ts` #5** (has options) | -| Agent status transitions (running→waiting→idle) | `edge-cases.test.ts` #6, **`recovery-scenarios.test.ts` #8** (more thorough) | -| Task blocking (DB status) | `edge-cases.test.ts` #10, `recovery-scenarios.test.ts` #3 | -| Merge conflict → resolution task | `edge-cases.test.ts` #13, **`extended-scenarios.test.ts` #1** (full round-trip) | +| Task blocking (DB status) | `edge-cases.test.ts`, `recovery-scenarios.test.ts` | +| Merge conflict → resolution task | `edge-cases.test.ts`, `extended-scenarios.test.ts` (full round-trip) | | Conflict resolution task creation | `coordination/manager.test.ts` (5 tests), `conflict-resolution-service.test.ts` (10 tests) | -| MockAgentManager detail_complete reason | `mock-manager.test.ts` has duplicate test "(second test)" | -| Crash marking with no signal | `crash-race-condition.test.ts` #2, `crash-race-condition.test.ts` #4 (weaker) | -| `real-e2e-crash.test.ts` #1 | Dead: `expect(true).toBe(true)` — contributes nothing | --- @@ -417,8 +376,6 @@ Tests covering the same scenario multiple times. **Bold** = the better version t - **`agent/process-manager.test.ts`** — 6 `vi.mock()` calls. Any import restructuring breaks tests. - **`agent/manager.test.ts`** — 5 `vi.mock()` calls. Same concern. - **`preview/manager.test.ts`** — 6 `vi.mock()` calls. Same concern. -- **`completion-detection.test.ts`** — Tests private method `readSignalCompletion` via `as any`. -- **`completion-race-condition.test.ts`** — Monkey-patches private methods. Module-level mock mutated without reset. - **`crash-race-condition.test.ts`** — Accesses private `handleCompletion` via `as any`. ### Low Fragility (behavioral tests) @@ -441,19 +398,7 @@ The test suite uses 4 different mocking patterns with no shared convention: |---------|-------|---------| | `vi.mock()` module-level | manager.test.ts, process-manager.test.ts, preview/manager.test.ts | Import-structure coupled | | Factory function (`createMockX()`) | output-handler.test.ts, coordination/*.test.ts, server/*.test.ts | Cleaner but copy-pasted | -| Inline const object | completion-race-condition.test.ts, mutex-completion.test.ts | Simple but mutated per-test | +| Inline const object | mutex-completion.test.ts | Simple but mutated per-test | | Hand-rolled class | harness.ts (MockWorktreeManager, CapturingEventBus) | Best for shared infrastructure | `createMockEventBus()` is independently defined in 5+ files. Should be extracted to shared test utility. - ---- - -## Dead / Worthless Tests - -| Test | File | Reason | -|------|------|--------| -| `expect(true).toBe(true)` | `real-e2e-crash.test.ts` | Can never fail. Hardcoded paths. | -| "logs comprehensive spawn information" | `process-manager.test.ts` | Asserts nothing meaningful | -| "logs comprehensive worktree creation details" | `process-manager.test.ts` | Asserts nothing meaningful | -| "should demonstrate the race condition exists" | `crash-race-condition.test.ts` #4 | Only asserts `updateCalls.length > 0` | -| "should emit stopped event (second test)" | `mock-manager.test.ts` | Exact duplicate of previous test | diff --git a/src/agent/completion-detection.test.ts b/src/agent/completion-detection.test.ts deleted file mode 100644 index 15240ff..0000000 --- a/src/agent/completion-detection.test.ts +++ /dev/null @@ -1,126 +0,0 @@ -/** - * Test for completion detection via readSignalCompletion - */ - -import { describe, test, expect, beforeEach, afterEach, vi } from 'vitest'; -import { mkdtemp, writeFile, mkdir } from 'node:fs/promises'; -import { join } from 'node:path'; -import { tmpdir } from 'node:os'; -import { rmSync } from 'node:fs'; -import { OutputHandler } from './output-handler.js'; -import type { AgentRepository } from '../db/repositories/agent-repository.js'; - -describe('Completion Detection Fix', () => { - let tempDir: string; - let outputHandler: OutputHandler; - let mockAgentRepo: AgentRepository; - - beforeEach(async () => { - tempDir = await mkdtemp(join(tmpdir(), 'completion-test-')); - - // Mock repositories - mockAgentRepo = { - update: vi.fn(), - findById: vi.fn().mockResolvedValue({ id: 'test-agent', mode: 'refine' }), - } as any; - - outputHandler = new OutputHandler(mockAgentRepo); - }); - - afterEach(() => { - rmSync(tempDir, { recursive: true, force: true }); - }); - - test('detects completion from signal.json with "questions" status', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - const cwDir = join(agentWorkdir, '.cw/output'); - await mkdir(cwDir, { recursive: true }); - - const signalContent = JSON.stringify({ - status: 'questions', - questions: [{ id: 'q1', text: 'Do you want to proceed?' }] - }); - await writeFile(join(cwDir, 'signal.json'), signalContent); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).not.toBeNull(); - expect(JSON.parse(result).status).toBe('questions'); - }); - - test('detects completion from signal.json with "done" status', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - const cwDir = join(agentWorkdir, '.cw/output'); - await mkdir(cwDir, { recursive: true }); - - const signalContent = JSON.stringify({ - status: 'done', - result: 'Task completed successfully' - }); - await writeFile(join(cwDir, 'signal.json'), signalContent); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).not.toBeNull(); - expect(JSON.parse(result).status).toBe('done'); - }); - - test('detects completion from signal.json with "error" status', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - const cwDir = join(agentWorkdir, '.cw/output'); - await mkdir(cwDir, { recursive: true }); - - const signalContent = JSON.stringify({ - status: 'error', - error: 'Something went wrong' - }); - await writeFile(join(cwDir, 'signal.json'), signalContent); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).not.toBeNull(); - expect(JSON.parse(result).status).toBe('error'); - }); - - test('returns null when signal.json does not exist', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).toBeNull(); - }); - - test('returns null for incomplete status', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - const cwDir = join(agentWorkdir, '.cw/output'); - await mkdir(cwDir, { recursive: true }); - - const signalContent = JSON.stringify({ - status: 'running', - progress: 'Still working...' - }); - await writeFile(join(cwDir, 'signal.json'), signalContent); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).toBeNull(); - }); - - test('handles malformed signal.json gracefully', async () => { - const agentWorkdir = join(tempDir, 'test-agent'); - const cwDir = join(agentWorkdir, '.cw/output'); - await mkdir(cwDir, { recursive: true }); - - await writeFile(join(cwDir, 'signal.json'), '{ invalid json }'); - - const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler); - const result = await readSignalCompletion(agentWorkdir); - - expect(result).toBeNull(); - }); -}); diff --git a/src/agent/completion-race-condition.test.ts b/src/agent/completion-race-condition.test.ts deleted file mode 100644 index a1de29e..0000000 --- a/src/agent/completion-race-condition.test.ts +++ /dev/null @@ -1,233 +0,0 @@ -/** - * Test for completion handler race condition fix. - * Verifies that only one completion handler executes per agent. - */ - -import { describe, it, beforeEach, afterEach, expect } from 'vitest'; -import { join } from 'node:path'; -import { mkdtemp, writeFile, mkdir, rm } from 'node:fs/promises'; -import { tmpdir } from 'node:os'; -import { OutputHandler, type ActiveAgent } from './output-handler.js'; -import type { AgentRepository } from '../db/repositories/agent-repository.js'; -import type { EventBus } from '../events/index.js'; - -// Default agent record for mocks -const defaultAgent = { - id: 'test-agent', - name: 'test-agent', - taskId: null, - provider: 'claude', - mode: 'refine' as const, - status: 'running' as const, - worktreeId: 'test-worktree', - outputFilePath: '', - sessionId: null, - result: null, - pendingQuestions: null, - initiativeId: null, - accountId: null, - userDismissedAt: null, - pid: null, - exitCode: null, - createdAt: new Date(), - updatedAt: new Date(), -}; - -// Mock agent repository -const mockAgentRepository: AgentRepository = { - async findById() { - return { ...defaultAgent }; - }, - async update(_id: string, data: any) { - return { ...defaultAgent, ...data }; - }, - async create() { - throw new Error('Not implemented'); - }, - async findAll() { - throw new Error('Not implemented'); - }, - async findByStatus() { - throw new Error('Not implemented'); - }, - async findByTaskId() { - throw new Error('Not implemented'); - }, - async findByName() { - throw new Error('Not implemented'); - }, - async findBySessionId() { - throw new Error('Not implemented'); - }, - async delete() { - throw new Error('Not implemented'); - } -}; - -describe('OutputHandler completion race condition fix', () => { - let outputHandler: OutputHandler; - let testWorkdir: string; - - beforeEach(async () => { - outputHandler = new OutputHandler(mockAgentRepository); - testWorkdir = await mkdtemp(join(tmpdir(), 'cw-completion-test-')); - }); - - afterEach(async () => { - await rm(testWorkdir, { recursive: true, force: true }); - }); - - it('should prevent concurrent completion handling via mutex', async () => { - // Setup agent workdir with completion signal - const outputDir = join(testWorkdir, '.cw', 'output'); - await mkdir(outputDir, { recursive: true }); - await writeFile(join(outputDir, 'signal.json'), JSON.stringify({ - status: 'questions', - questions: [{ id: 'q1', question: 'Test question?' }] - })); - - const agentId = 'test-agent'; - const getAgentWorkdir = (_alias: string) => testWorkdir; - - // Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached - const mockActive = { - outputFilePath: join(testWorkdir, 'output.jsonl'), - streamSessionId: 'session-1' - }; - // Create the output file so readCompleteLines doesn't error - await writeFile(mockActive.outputFilePath, ''); - - let completionCallCount = 0; - let firstHandlerStarted = false; - let secondHandlerBlocked = false; - - // Track calls to the private processSignalAndFiles method - const originalProcessSignalAndFiles = (outputHandler as any).processSignalAndFiles; - (outputHandler as any).processSignalAndFiles = async (...args: any[]) => { - completionCallCount++; - - if (completionCallCount === 1) { - firstHandlerStarted = true; - // Simulate some processing time - await new Promise(resolve => setTimeout(resolve, 50)); - return originalProcessSignalAndFiles.apply(outputHandler, args); - } else { - // This should never be reached due to the mutex - secondHandlerBlocked = true; - return originalProcessSignalAndFiles.apply(outputHandler, args); - } - }; - - // Start two concurrent completion handlers - const completion1 = outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir); - const completion2 = outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir); - - await Promise.all([completion1, completion2]); - - // Verify mutex prevented duplicate processing - expect(firstHandlerStarted, 'First handler should have started').toBe(true); - expect(secondHandlerBlocked, 'Second handler should have been blocked by mutex').toBe(false); - expect(completionCallCount, 'Should only process completion once').toBe(1); - }); - - it('should handle completion when agent has questions status', async () => { - // Setup agent workdir with questions signal - const outputDir = join(testWorkdir, '.cw', 'output'); - await mkdir(outputDir, { recursive: true }); - await writeFile(join(outputDir, 'signal.json'), JSON.stringify({ - status: 'questions', - questions: [{ id: 'q1', question: 'What should I do next?' }] - })); - - const agentId = 'test-agent'; - const getAgentWorkdir = (_alias: string) => testWorkdir; - - // Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached - const mockActive = { - outputFilePath: join(testWorkdir, 'output.jsonl'), - streamSessionId: 'session-1' - }; - await writeFile(mockActive.outputFilePath, ''); - - // Mock the update call to track status changes - let finalStatus: string | undefined; - (mockAgentRepository as any).update = async (id: string, updates: any) => { - if (updates.status) { - finalStatus = updates.status; - } - }; - - await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir); - - // Verify agent was marked as waiting for input, not crashed - expect(finalStatus, 'Agent should be waiting for input').toBe('waiting_for_input'); - }); - - it('should handle completion when agent is done', async () => { - // Setup agent workdir with done signal - const outputDir = join(testWorkdir, '.cw', 'output'); - await mkdir(outputDir, { recursive: true }); - await writeFile(join(outputDir, 'signal.json'), JSON.stringify({ - status: 'done' - })); - - const agentId = 'test-agent'; - const getAgentWorkdir = (_alias: string) => testWorkdir; - - // Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached - const mockActive = { - outputFilePath: join(testWorkdir, 'output.jsonl'), - streamSessionId: 'session-1' - }; - await writeFile(mockActive.outputFilePath, ''); - - // Mock the update call to track status changes - let finalStatus: string | undefined; - (mockAgentRepository as any).update = async (id: string, updates: any) => { - if (updates.status) { - finalStatus = updates.status; - } - }; - - await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir); - - // Verify agent was marked as idle, not crashed - expect(finalStatus, 'Agent should be idle after completion').toBe('idle'); - }); - - it('should clean up completion lock even if processing fails', async () => { - const agentId = 'test-agent'; - const getAgentWorkdir = (_alias: string) => testWorkdir; - - // Force an error during processing - (mockAgentRepository as any).findById = async () => { - throw new Error('Database error'); - }; - - // handleCompletion propagates errors but the finally block still cleans up the lock - await expect(outputHandler.handleCompletion(agentId, undefined, getAgentWorkdir)) - .rejects.toThrow('Database error'); - - // Verify lock was cleaned up by ensuring a second call can proceed - let secondCallStarted = false; - (mockAgentRepository as any).findById = async (id: string) => { - secondCallStarted = true; - return { ...defaultAgent }; - }; - - // Create signal.json so the second call can complete successfully - const outputDir = join(testWorkdir, '.cw', 'output'); - await mkdir(outputDir, { recursive: true }); - await writeFile(join(outputDir, 'signal.json'), JSON.stringify({ - status: 'done' - })); - const mockActive = { - outputFilePath: join(testWorkdir, 'output.jsonl'), - streamSessionId: 'session-1' - }; - await writeFile(mockActive.outputFilePath, ''); - - await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir); - expect(secondCallStarted, 'Second call should proceed after lock cleanup').toBe(true); - }); -}); \ No newline at end of file diff --git a/src/agent/mock-manager.test.ts b/src/agent/mock-manager.test.ts index 19c3eae..317236f 100644 --- a/src/agent/mock-manager.test.ts +++ b/src/agent/mock-manager.test.ts @@ -730,25 +730,6 @@ describe('MockAgentManager', () => { expect(agent?.status).toBe('waiting_for_input'); }); - it('should emit stopped event with detail_complete reason (second test)', async () => { - manager.setScenario('detail-done', { - status: 'done', - delay: 0, - result: 'Detail complete', - }); - - await manager.spawn({ - name: 'detail-done', - taskId: 'plan-1', - prompt: 'test', - mode: 'detail', - }); - await vi.runAllTimersAsync(); - - const stopped = eventBus.emittedEvents.find((e) => e.type === 'agent:stopped') as AgentStoppedEvent | undefined; - expect(stopped?.payload.reason).toBe('detail_complete'); - }); - it('should set result message for detail mode', async () => { manager.setScenario('detailer', { status: 'done', diff --git a/src/agent/process-manager.test.ts b/src/agent/process-manager.test.ts index 5a3c681..ef254bb 100644 --- a/src/agent/process-manager.test.ts +++ b/src/agent/process-manager.test.ts @@ -159,17 +159,7 @@ describe('ProcessManager', () => { .rejects.toThrow('Worktree creation failed:'); }); - it('logs comprehensive worktree creation details', async () => { - const alias = 'test-agent'; - const initiativeId = 'init-123'; - - await processManager.createProjectWorktrees(alias, initiativeId); - - // Verify logging (implementation would need to capture log calls) - // For now, just verify the method completes successfully - expect(mockProjectRepository.findProjectsByInitiativeId).toHaveBeenCalledWith('init-123'); - }); - }); +}); describe('createStandaloneWorktree', () => { beforeEach(() => { @@ -270,28 +260,6 @@ describe('ProcessManager', () => { }); }); - it('logs comprehensive spawn information', () => { - const agentId = 'agent-123'; - const agentName = 'test-agent'; - const command = 'claude'; - const args = ['--json-schema', 'schema.json']; - const cwd = '/test/workspace/agent-workdirs/test-agent'; - const env = { CLAUDE_CONFIG_DIR: '/config' }; - const providerName = 'claude'; - - const result = processManager.spawnDetached(agentId, agentName, command, args, cwd, env, providerName); - - expect(result).toHaveProperty('pid', 12345); - expect(result).toHaveProperty('outputFilePath'); - expect(result).toHaveProperty('tailer'); - - // Verify log directory creation uses agent name, not ID - expect(mockMkdirSync).toHaveBeenCalledWith( - '/test/workspace/.cw/agent-logs/test-agent', - { recursive: true } - ); - }); - it('writes prompt file when provided', () => { const agentId = 'agent-123'; const agentName = 'test-agent'; diff --git a/src/test/e2e/edge-cases.test.ts b/src/test/e2e/edge-cases.test.ts index e35ed85..2ddfcee 100644 --- a/src/test/e2e/edge-cases.test.ts +++ b/src/test/e2e/edge-cases.test.ts @@ -3,7 +3,7 @@ * * Tests edge case scenarios in dispatch/coordination flow: * - Agent crashes during task - * - Agent waiting for input and resume + * - Agent waiting for input * - Task blocking * - Merge conflicts * @@ -20,8 +20,6 @@ import type { AgentSpawnedEvent, AgentCrashedEvent, AgentWaitingEvent, - AgentResumedEvent, - AgentStoppedEvent, TaskBlockedEvent, MergeConflictedEvent, } from '../../events/types.js'; @@ -171,101 +169,6 @@ describe('E2E Edge Cases', () => { expect(waitingPayload.questions[0].question).toBe('Which database should I use?'); }); - it('resumes agent and completes after resume', async () => { - vi.useFakeTimers(); - const seeded = await harness.seedFixture(SIMPLE_FIXTURE); - const taskAId = seeded.tasks.get('Task A')!; - - // Pre-seed required idle agent - await harness.agentManager.spawn({ - name: 'pool-agent', - taskId: 'placeholder', - prompt: 'placeholder', - }); - await harness.advanceTimers(); - - // Set questions scenario - harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, { - status: 'questions', - questions: [{ id: 'q1', question: 'Which database should I use?' }], - }); - - await harness.dispatchManager.queue(taskAId); - harness.clearEvents(); - - const dispatchResult = await harness.dispatchManager.dispatchNext(); - await harness.advanceTimers(); - - // Verify agent is in waiting_for_input status - const agent = await harness.agentManager.get(dispatchResult.agentId!); - expect(agent?.status).toBe('waiting_for_input'); - - // Clear events to check resume events - harness.clearEvents(); - - // Resume agent with answers map - await harness.agentManager.resume(dispatchResult.agentId!, { q1: 'PostgreSQL' }); - await harness.advanceTimers(); - - // Verify: agent:resumed event emitted - const resumedEvents = harness.getEventsByType('agent:resumed'); - expect(resumedEvents.length).toBe(1); - const resumedPayload = (resumedEvents[0] as AgentResumedEvent).payload; - expect(resumedPayload.taskId).toBe(taskAId); - - // Verify: agent:stopped event emitted (after resume completes) - const stoppedEvents = harness.getEventsByType('agent:stopped'); - expect(stoppedEvents.length).toBe(1); - const stoppedPayload = (stoppedEvents[0] as AgentStoppedEvent).payload; - expect(stoppedPayload.taskId).toBe(taskAId); - expect(stoppedPayload.reason).toBe('task_complete'); - }); - - it('agent status transitions correctly through waiting and resume', async () => { - vi.useFakeTimers(); - const seeded = await harness.seedFixture(SIMPLE_FIXTURE); - const taskAId = seeded.tasks.get('Task A')!; - - // Pre-seed required idle agent - await harness.agentManager.spawn({ - name: 'pool-agent', - taskId: 'placeholder', - prompt: 'placeholder', - }); - await harness.advanceTimers(); - - // Set questions scenario - harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, { - status: 'questions', - questions: [{ id: 'q1', question: 'Which database should I use?' }], - }); - - await harness.dispatchManager.queue(taskAId); - const dispatchResult = await harness.dispatchManager.dispatchNext(); - - // Initially running - let agent = await harness.agentManager.get(dispatchResult.agentId!); - expect(agent?.status).toBe('running'); - - await harness.advanceTimers(); - - // After scenario completes: waiting_for_input - agent = await harness.agentManager.get(dispatchResult.agentId!); - expect(agent?.status).toBe('waiting_for_input'); - - // Resume with answers map - await harness.agentManager.resume(dispatchResult.agentId!, { q1: 'PostgreSQL' }); - - // After resume: running again - agent = await harness.agentManager.get(dispatchResult.agentId!); - expect(agent?.status).toBe('running'); - - await harness.advanceTimers(); - - // After completion: idle - agent = await harness.agentManager.get(dispatchResult.agentId!); - expect(agent?.status).toBe('idle'); - }); }); describe('Task blocking', () => { diff --git a/src/test/integration/crash-race-condition.test.ts b/src/test/integration/crash-race-condition.test.ts index 39e82d8..f7ce25f 100644 --- a/src/test/integration/crash-race-condition.test.ts +++ b/src/test/integration/crash-race-condition.test.ts @@ -229,34 +229,4 @@ Initiative page is essentially empty — lacks context, scope, goals, and techni expect(['idle', 'waiting_for_input', 'stopped']).toContain(finalAgentStatus); }); - it('should demonstrate the race condition exists in current code', async () => { - // This test demonstrates what happens without the fix - // We'll create the conditions that trigger the race condition - - // SETUP: No signal.json initially (simulates timing where signal not yet written) - const outputFilePath = join(testDir, 'output.jsonl'); - await writeFile(outputFilePath, ''); // Empty output file - - const mockActive = { - outputFilePath, - streamSessionId: 'session-1' - }; - - const getAgentWorkdir = (agentId: string) => testDir; - - // EXECUTE: This should hit the "no output received" path and mark as crashed - await (outputHandler as any).handleCompletion( - testAgent.id, - mockActive, - getAgentWorkdir - ); - - // DOCUMENT: Show what happens in the race condition scenario - console.log('Race condition test - Final status:', finalAgentStatus); - console.log('Race condition test - Update calls:', updateCalls); - - // This test shows the current behavior - it may pass or fail depending on fix - expect(updateCalls.length).toBeGreaterThan(0); - console.log('Current behavior when no signal exists:', finalAgentStatus); - }); }); \ No newline at end of file diff --git a/src/test/integration/real-e2e-crash.test.ts b/src/test/integration/real-e2e-crash.test.ts deleted file mode 100644 index e7159e4..0000000 --- a/src/test/integration/real-e2e-crash.test.ts +++ /dev/null @@ -1,136 +0,0 @@ -/** - * END-TO-END integration test with REAL Claude Code process. - * This will capture the actual crash-marking behavior with debug logs. - */ - -import { describe, it, expect } from 'vitest'; -import { execSync } from 'node:child_process'; -import { writeFile, mkdir } from 'node:fs/promises'; -import { join } from 'node:path'; -import { randomBytes } from 'node:crypto'; -import { tmpdir } from 'node:os'; - -describe('Real E2E Crash Investigation', () => { - it('should spawn a real agent and capture debug logs', async () => { - if (!process.env.REAL_CLAUDE_TESTS) { - console.log('Skipping - REAL_CLAUDE_TESTS not set'); - return; - } - - console.log('🚀 Starting REAL end-to-end test...'); - - // Create test workspace - const testWorkspace = join(tmpdir(), `e2e-test-${randomBytes(4).toString('hex')}`); - await mkdir(testWorkspace, { recursive: true }); - await mkdir(join(testWorkspace, '.cw/input'), { recursive: true }); - - // Create minimal input that will ask questions - await writeFile(join(testWorkspace, '.cw/input/manifest.json'), JSON.stringify({ - files: ['initiative.md'] - })); - - await writeFile(join(testWorkspace, '.cw/input/initiative.md'), `--- -id: test-e2e -name: E2E Test Initiative -status: active ----`); - - console.log('📁 Created test workspace:', testWorkspace); - - // Change to the workdir where server is running to spawn agent - const originalCwd = process.cwd(); - process.chdir('/Users/lukasmay/development/projects/codewalk-district/workdir'); - - console.log('🎯 Spawning real Claude agent...'); - console.log('📍 Working directory:', process.cwd()); - - const agentName = `e2e-test-${randomBytes(4).toString('hex')}`; - let spawnResult: any; - - try { - // Spawn agent with shorter timeout for testing - use first available initiative - const result = execSync(`cw agent spawn --name ${agentName} --initiative 2-x-DkbSjiiRzlSAsD4ib "Please analyze the current admin UI setup and identify what needs to be replaced with shadcn + tailwind"`, { - encoding: 'utf-8', - timeout: 30000 // 30 seconds max for testing - }); - - console.log('📄 CLI Output:', result); - spawnResult = { success: true, output: result }; - - } catch (error: any) { - spawnResult = { - success: false, - exitCode: error.status, - stdout: error.stdout, - stderr: error.stderr - }; - - console.log('⚠️ CLI execution result:', spawnResult); - - // Timeout is expected for agents that take time to complete - if (error.status === 124 || error.message?.includes('timeout')) { - console.log('⏰ Got timeout (expected) - agent was running'); - } - } - - // Wait a moment for completion detection to run - await new Promise(resolve => setTimeout(resolve, 2000)); - - // Now check the database for what happened - console.log('📊 Checking agent status in database...'); - - const dbPath = '.cw/cw.db'; // From workdir - - try { - const agentData = execSync( - `sqlite3 "${dbPath}" "SELECT json_object('id', id, 'name', name, 'status', status, 'exit_code', exit_code, 'worktree_id', worktree_id) FROM agents WHERE name = '${agentName}' ORDER BY created_at DESC LIMIT 1;"`, - { encoding: 'utf-8' } - ).trim(); - - if (agentData) { - const agent = JSON.parse(agentData); - console.log('🔍 Agent found:', agent); - - // Check for output files in agent workdir - const agentWorkdir = `agent-workdirs/${agentName}`; - const outputDir = join(agentWorkdir, '.cw/output'); - - console.log('📂 Checking output dir:', outputDir); - - try { - const { existsSync } = await import('node:fs'); - const hasSignal = existsSync(join(outputDir, 'signal.json')); - console.log('📄 Has signal.json:', hasSignal); - - if (hasSignal) { - const { readFile } = await import('node:fs/promises'); - const signalContent = await readFile(join(outputDir, 'signal.json'), 'utf-8'); - const signal = JSON.parse(signalContent); - console.log('📋 Signal status:', signal.status); - - if (agent.status === 'crashed' && ['questions', 'done', 'error'].includes(signal.status)) { - console.log('🚨 BUG REPRODUCED: Agent crashed despite valid completion!'); - console.log('🔍 This indicates our completion detection race condition bug is still active'); - } else if (agent.status === 'waiting_for_input' && signal.status === 'questions') { - console.log('✅ CORRECT: Agent properly detected completion and set waiting_for_input'); - } - } - } catch (err) { - console.log('❌ Could not check output files:', err); - } - } else { - console.log('❌ No agent found in database'); - } - - } catch (err) { - console.log('❌ Database query failed:', err); - } finally { - // Restore original working directory - process.chdir(originalCwd); - } - - // The test passes regardless - this is for investigation - expect(true).toBe(true); - - }, 180000); // 3 minutes timeout -}); \ No newline at end of file