test: Remove redundant and dead tests (-743 lines)

Delete 3 files:
- completion-detection.test.ts (private method tests, covered by crash-race-condition)
- completion-race-condition.test.ts (covered by mutex-completion + crash-race-condition)
- real-e2e-crash.test.ts (dead: expect(true).toBe(true), hardcoded paths)

Remove individual tests:
- crash-race-condition.test.ts #4 (weaker duplicate of #2)
- mock-manager.test.ts duplicate "(second test)" for detail_complete
- process-manager.test.ts 2 "logs comprehensive" tests with empty assertions
- edge-cases.test.ts 2 Q&A tests redundant with recovery-scenarios

Update test-inventory.md to reflect removals.
This commit is contained in:
Lukas May
2026-03-02 12:57:27 +09:00
parent a2ab4c4a84
commit 1540039c52
8 changed files with 15 additions and 743 deletions


@@ -2,7 +2,7 @@
Comprehensive catalog of every test in the suite, what it verifies, and where coverage is weak or redundant.
**44 test files** | **35 unit** | **7 E2E** | **4 integration (mocked)** | **5 integration (real providers)**
**41 test files** | **33 unit** | **7 E2E** | **3 integration (mocked)** | **5 integration (real providers)**
Last audited: 2026-03-02
@@ -93,35 +93,7 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehensive
---
#### `completion-detection.test.ts` — readSignalCompletion (private method)
| Test | Verifies |
|------|----------|
| detects "questions" status | signal.json parsed, returns content |
| detects "done" status | Same |
| detects "error" status | Same |
| returns null when no signal.json | Missing file returns null |
| returns null for "running" status | Non-terminal status ignored |
| handles malformed signal.json | Bad JSON returns null |
**Concern**: All 6 tests access private method via `(outputHandler as any).readSignalCompletion`. Real filesystem (good). **Partially redundant with `completion-race-condition.test.ts` tests 2-3** which test the same scenarios through the public `handleCompletion` method.
---
#### `completion-race-condition.test.ts` — handleCompletion mutex
| Test | Verifies |
|------|----------|
| prevent concurrent completion via mutex | Only 1 of 2 concurrent calls executes |
| handle completion with questions status | Final status `waiting_for_input` |
| handle completion when done | Final status `idle` |
| clean up lock even on failure | Mutex released after error |
**Concern**: Module-level mock repo mutated per-test without `beforeEach` reset. Tests 2-3 overlap with `completion-detection.test.ts` but are strictly better (test public API).
---
#### `mutex-completion.test.ts` — handleCompletion mutex (focused)
#### `mutex-completion.test.ts` — handleCompletion mutex
| Test | Verifies |
|------|----------|
@@ -130,7 +102,7 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehensive
| clean up on exception | Lock released on error |
| per-agent mutex isolation | Different agentIds run concurrently |
**Excellent focused design.** 4 complementary scenarios. Uses real timers (50ms delays). **Partial overlap with `completion-race-condition.test.ts` test 1** — both test the mutex concurrent blocking.
**Excellent focused design.** 4 complementary scenarios. Uses real timers (50ms delays).
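The per-agent mutex these scenarios exercise can be sketched as a map of promise chains. This is an illustrative pattern, not the project's implementation: it serializes same-key calls in arrival order, whereas the real handler additionally skips duplicate completions once the first one has run.

```typescript
// Illustrative per-agent mutex: calls with the same key run one at a time,
// in arrival order; calls with different keys run concurrently.
class KeyedMutex {
  private tails = new Map<string, Promise<void>>();

  runExclusive<T>(key: string, fn: () => Promise<T>): Promise<T> {
    const previous = this.tails.get(key) ?? Promise.resolve();
    const current = previous.then(fn); // fn runs only after the previous holder settles
    // Chain on a settled version so one rejection doesn't wedge the queue;
    // the caller still sees the rejection through `current`.
    const tail = current.then(() => undefined, () => undefined);
    this.tails.set(key, tail);
    tail.then(() => {
      if (this.tails.get(key) === tail) this.tails.delete(key); // avoid unbounded growth
    });
    return current;
  }
}
```

Because cleanup happens on the settled chain rather than inside `fn`, the lock is released even when the critical section throws, which is the "clean up on exception" scenario above.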
---
@@ -172,28 +144,26 @@ Subjective 0–10 scores. 0 = no coverage, 5 = happy paths only, 10 = comprehensive
| getAgentWorkdir | Path computation |
| createProjectWorktrees | Worktree creation for initiative projects |
| createProjectWorktrees / throws on failure | Error propagation |
| createProjectWorktrees / logs details | Completes without error (weak) |
| createStandaloneWorktree | Path returned |
| createStandaloneWorktree / throws on failure | Error propagation |
| spawnDetached / validates cwd exists | existsSync called |
| spawnDetached / throws if cwd missing | Error thrown |
| spawnDetached / passes correct cwd | spawn args verified |
| spawnDetached / logs spawn info | Completes (weak) |
| spawnDetached / writes prompt file | PROMPT.md written |
| buildSpawnCommand / native prompt mode | Command+args assembled |
| buildSpawnCommand / flag prompt mode | Command+args assembled |
| buildResumeCommand / flag style | Resume args assembled |
| buildResumeCommand / no resume support | Error thrown |
**6 `vi.mock()` calls.** Most brittle test file. "Logs comprehensive" tests are empty assertions. Missing: `killProcess`, `cleanupWorktree`.
**6 `vi.mock()` calls.** Most brittle test file. Missing: `killProcess`, `cleanupWorktree`.
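The pattern the inventory later recommends, factory mocks injected through the constructor, sidesteps that import-structure coupling. A sketch with a hypothetical repository interface (the real `ProjectRepository` shape may differ):

```typescript
// Hypothetical repository interface, for illustration only.
interface ProjectRepository {
  findProjectsByInitiativeId(id: string): Promise<string[]>;
}

// Factory mock: records calls and returns canned data, no vi.mock() needed.
function createMockProjectRepository(projects: string[] = []) {
  const calls: string[] = [];
  const repo: ProjectRepository = {
    async findProjectsByInitiativeId(id: string) {
      calls.push(id);
      return projects;
    },
  };
  return { repo, calls };
}
```

Code under test receives `repo` through its constructor, so tests survive import restructuring that would break module-level mocks.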
---
#### `mock-manager.test.ts` — MockAgentManager (testing the test double)
39 tests covering: default scenario, configured delay, error/questions scenarios, resume, stop, list, get/getByName, scenario overrides, event emission order, name uniqueness, constructor options, agent modes (execute/discuss/plan/detail), structured questions.
43 tests covering: default scenario, configured delay, error/questions scenarios, resume, stop, list, get/getByName, scenario overrides, event emission order, name uniqueness, constructor options, agent modes (execute/discuss/plan/detail), structured questions.
**Thorough.** One duplicate: "should emit stopped event with detail_complete reason" appears twice with "(second test)" suffix.
**Thorough.**
---
@@ -319,7 +289,7 @@ All use `createTestHarness()` with in-memory DB, MockAgentManager, MockWorktreeManager
**`recovery-scenarios.test.ts`** — 9 tests. Queue state in DB, in-progress task recovery after crash, blocked state persistence, merge queue recovery, agent Q&A (question/resume, structured questions, status transitions, multiple questions).
**`edge-cases.test.ts`** — 14 tests. Agent crash (events, task status, error message), agent waiting/resume (3 tests), task blocking (4 tests), merge conflict handling (4 tests).
**`edge-cases.test.ts`** — 12 tests. Agent crash (events, task status, error message), agent waiting (1 test), task blocking (4 tests), merge conflict handling (4 tests).
**`extended-scenarios.test.ts`** — 6 tests. Conflict hand-back round-trip (full lifecycle, context preservation, sequential conflicts), multi-agent parallel work (parallel completion, merge ordering, mixed outcomes).
@@ -329,12 +299,10 @@ All use `createTestHarness()` with in-memory DB, MockAgentManager, MockWorktreeManager
**`agent-workdir-verification.test.ts`** — 2 tests. Skipped by default (`REAL_WORKDIR_TESTS`). Spawns real Claude agent, verifies working directory path via diagnostic files and agent-created files.
**`crash-race-condition.test.ts`** — 4 tests. Always runs. Tests signal.json-based completion detection vs crash marking. Tests 2 and 4 overlap (same setup, test 4 has weaker assertions). Accesses private methods via `as any`.
**`crash-race-condition.test.ts`** — 3 tests. Always runs. Tests signal.json-based completion detection vs crash marking. Covers questions, no-signal crash, and slim-wildebeest scenario. Accesses private methods via `as any`.
**`real-claude.test.ts`** — 3 tests. Skipped by default. Direct Claude CLI contract tests: done/questions/error status outputs validated against Zod schema. ~$0.10.
**`real-e2e-crash.test.ts`** — 1 test. **Dead test**: `expect(true).toBe(true)`. Hardcoded absolute paths. Diagnostic script, not a regression test.
---
### Real Provider Tests (`src/test/integration/real-providers/`)
@@ -353,24 +321,15 @@ All skipped by default. ~$0.50 total per full run.
---
## Redundancy Map
## Remaining Redundancy
Tests covering the same scenario multiple times. **Bold** = the better version to keep.
Tests covering the same scenario multiple times.
| Scenario | Tests |
|----------|-------|
| Signal.json completion detection (questions→waiting) | `completion-detection.test.ts` #1, **`completion-race-condition.test.ts` #2** |
| Signal.json completion detection (done→idle) | `completion-detection.test.ts` #2, **`completion-race-condition.test.ts` #3** |
| Mutex concurrent blocking | `completion-race-condition.test.ts` #1, **`mutex-completion.test.ts` #1** (more focused) |
| Mutex cleanup on error | `completion-race-condition.test.ts` #4, **`mutex-completion.test.ts` #3** (more focused) |
| Agent Q&A resume flow | `edge-cases.test.ts` #5, **`recovery-scenarios.test.ts` #5** (has options) |
| Agent status transitions (running→waiting→idle) | `edge-cases.test.ts` #6, **`recovery-scenarios.test.ts` #8** (more thorough) |
| Task blocking (DB status) | `edge-cases.test.ts` #10, `recovery-scenarios.test.ts` #3 |
| Merge conflict → resolution task | `edge-cases.test.ts` #13, **`extended-scenarios.test.ts` #1** (full round-trip) |
| Task blocking (DB status) | `edge-cases.test.ts`, `recovery-scenarios.test.ts` |
| Merge conflict → resolution task | `edge-cases.test.ts`, `extended-scenarios.test.ts` (full round-trip) |
| Conflict resolution task creation | `coordination/manager.test.ts` (5 tests), `conflict-resolution-service.test.ts` (10 tests) |
| MockAgentManager detail_complete reason | `mock-manager.test.ts` has duplicate test "(second test)" |
| Crash marking with no signal | `crash-race-condition.test.ts` #2, `crash-race-condition.test.ts` #4 (weaker) |
| `real-e2e-crash.test.ts` #1 | Dead: `expect(true).toBe(true)` — contributes nothing |
---
@@ -417,8 +376,6 @@ Tests covering the same scenario multiple times. **Bold** = the better version to keep.
- **`agent/process-manager.test.ts`** — 6 `vi.mock()` calls. Any import restructuring breaks tests.
- **`agent/manager.test.ts`** — 5 `vi.mock()` calls. Same concern.
- **`preview/manager.test.ts`** — 6 `vi.mock()` calls. Same concern.
- **`completion-detection.test.ts`** — Tests private method `readSignalCompletion` via `as any`.
- **`completion-race-condition.test.ts`** — Monkey-patches private methods. Module-level mock mutated without reset.
- **`crash-race-condition.test.ts`** — Accesses private `handleCompletion` via `as any`.
### Low Fragility (behavioral tests)
@@ -441,19 +398,7 @@ The test suite uses 4 different mocking patterns with no shared convention:
|---------|-------|---------|
| `vi.mock()` module-level | manager.test.ts, process-manager.test.ts, preview/manager.test.ts | Import-structure coupled |
| Factory function (`createMockX()`) | output-handler.test.ts, coordination/*.test.ts, server/*.test.ts | Cleaner but copy-pasted |
| Inline const object | completion-race-condition.test.ts, mutex-completion.test.ts | Simple but mutated per-test |
| Inline const object | mutex-completion.test.ts | Simple but mutated per-test |
| Hand-rolled class | harness.ts (MockWorktreeManager, CapturingEventBus) | Best for shared infrastructure |
`createMockEventBus()` is independently defined in 5+ files. Should be extracted to a shared test utility.
---
## Dead / Worthless Tests
| Test | File | Reason |
|------|------|--------|
| `expect(true).toBe(true)` | `real-e2e-crash.test.ts` | Can never fail. Hardcoded paths. |
| "logs comprehensive spawn information" | `process-manager.test.ts` | Asserts nothing meaningful |
| "logs comprehensive worktree creation details" | `process-manager.test.ts` | Asserts nothing meaningful |
| "should demonstrate the race condition exists" | `crash-race-condition.test.ts` #4 | Only asserts `updateCalls.length > 0` |
| "should emit stopped event (second test)" | `mock-manager.test.ts` | Exact duplicate of previous test |


@@ -1,126 +0,0 @@
/**
* Test for completion detection via readSignalCompletion
*/
import { describe, test, expect, beforeEach, afterEach, vi } from 'vitest';
import { mkdtemp, writeFile, mkdir } from 'node:fs/promises';
import { join } from 'node:path';
import { tmpdir } from 'node:os';
import { rmSync } from 'node:fs';
import { OutputHandler } from './output-handler.js';
import type { AgentRepository } from '../db/repositories/agent-repository.js';
describe('Completion Detection Fix', () => {
let tempDir: string;
let outputHandler: OutputHandler;
let mockAgentRepo: AgentRepository;
beforeEach(async () => {
tempDir = await mkdtemp(join(tmpdir(), 'completion-test-'));
// Mock repositories
mockAgentRepo = {
update: vi.fn(),
findById: vi.fn().mockResolvedValue({ id: 'test-agent', mode: 'refine' }),
} as any;
outputHandler = new OutputHandler(mockAgentRepo);
});
afterEach(() => {
rmSync(tempDir, { recursive: true, force: true });
});
test('detects completion from signal.json with "questions" status', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const cwDir = join(agentWorkdir, '.cw/output');
await mkdir(cwDir, { recursive: true });
const signalContent = JSON.stringify({
status: 'questions',
questions: [{ id: 'q1', text: 'Do you want to proceed?' }]
});
await writeFile(join(cwDir, 'signal.json'), signalContent);
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).not.toBeNull();
expect(JSON.parse(result).status).toBe('questions');
});
test('detects completion from signal.json with "done" status', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const cwDir = join(agentWorkdir, '.cw/output');
await mkdir(cwDir, { recursive: true });
const signalContent = JSON.stringify({
status: 'done',
result: 'Task completed successfully'
});
await writeFile(join(cwDir, 'signal.json'), signalContent);
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).not.toBeNull();
expect(JSON.parse(result).status).toBe('done');
});
test('detects completion from signal.json with "error" status', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const cwDir = join(agentWorkdir, '.cw/output');
await mkdir(cwDir, { recursive: true });
const signalContent = JSON.stringify({
status: 'error',
error: 'Something went wrong'
});
await writeFile(join(cwDir, 'signal.json'), signalContent);
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).not.toBeNull();
expect(JSON.parse(result).status).toBe('error');
});
test('returns null when signal.json does not exist', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).toBeNull();
});
test('returns null for incomplete status', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const cwDir = join(agentWorkdir, '.cw/output');
await mkdir(cwDir, { recursive: true });
const signalContent = JSON.stringify({
status: 'running',
progress: 'Still working...'
});
await writeFile(join(cwDir, 'signal.json'), signalContent);
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).toBeNull();
});
test('handles malformed signal.json gracefully', async () => {
const agentWorkdir = join(tempDir, 'test-agent');
const cwDir = join(agentWorkdir, '.cw/output');
await mkdir(cwDir, { recursive: true });
await writeFile(join(cwDir, 'signal.json'), '{ invalid json }');
const readSignalCompletion = (outputHandler as any).readSignalCompletion.bind(outputHandler);
const result = await readSignalCompletion(agentWorkdir);
expect(result).toBeNull();
});
});


@@ -1,233 +0,0 @@
/**
* Test for completion handler race condition fix.
* Verifies that only one completion handler executes per agent.
*/
import { describe, it, beforeEach, afterEach, expect } from 'vitest';
import { join } from 'node:path';
import { mkdtemp, writeFile, mkdir, rm } from 'node:fs/promises';
import { tmpdir } from 'node:os';
import { OutputHandler, type ActiveAgent } from './output-handler.js';
import type { AgentRepository } from '../db/repositories/agent-repository.js';
import type { EventBus } from '../events/index.js';
// Default agent record for mocks
const defaultAgent = {
id: 'test-agent',
name: 'test-agent',
taskId: null,
provider: 'claude',
mode: 'refine' as const,
status: 'running' as const,
worktreeId: 'test-worktree',
outputFilePath: '',
sessionId: null,
result: null,
pendingQuestions: null,
initiativeId: null,
accountId: null,
userDismissedAt: null,
pid: null,
exitCode: null,
createdAt: new Date(),
updatedAt: new Date(),
};
// Mock agent repository
const mockAgentRepository: AgentRepository = {
async findById() {
return { ...defaultAgent };
},
async update(_id: string, data: any) {
return { ...defaultAgent, ...data };
},
async create() {
throw new Error('Not implemented');
},
async findAll() {
throw new Error('Not implemented');
},
async findByStatus() {
throw new Error('Not implemented');
},
async findByTaskId() {
throw new Error('Not implemented');
},
async findByName() {
throw new Error('Not implemented');
},
async findBySessionId() {
throw new Error('Not implemented');
},
async delete() {
throw new Error('Not implemented');
}
};
describe('OutputHandler completion race condition fix', () => {
let outputHandler: OutputHandler;
let testWorkdir: string;
beforeEach(async () => {
outputHandler = new OutputHandler(mockAgentRepository);
testWorkdir = await mkdtemp(join(tmpdir(), 'cw-completion-test-'));
});
afterEach(async () => {
await rm(testWorkdir, { recursive: true, force: true });
});
it('should prevent concurrent completion handling via mutex', async () => {
// Setup agent workdir with completion signal
const outputDir = join(testWorkdir, '.cw', 'output');
await mkdir(outputDir, { recursive: true });
await writeFile(join(outputDir, 'signal.json'), JSON.stringify({
status: 'questions',
questions: [{ id: 'q1', question: 'Test question?' }]
}));
const agentId = 'test-agent';
const getAgentWorkdir = (_alias: string) => testWorkdir;
// Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached
const mockActive = {
outputFilePath: join(testWorkdir, 'output.jsonl'),
streamSessionId: 'session-1'
};
// Create the output file so readCompleteLines doesn't error
await writeFile(mockActive.outputFilePath, '');
let completionCallCount = 0;
let firstHandlerStarted = false;
let secondHandlerBlocked = false;
// Track calls to the private processSignalAndFiles method
const originalProcessSignalAndFiles = (outputHandler as any).processSignalAndFiles;
(outputHandler as any).processSignalAndFiles = async (...args: any[]) => {
completionCallCount++;
if (completionCallCount === 1) {
firstHandlerStarted = true;
// Simulate some processing time
await new Promise(resolve => setTimeout(resolve, 50));
return originalProcessSignalAndFiles.apply(outputHandler, args);
} else {
// This should never be reached due to the mutex
secondHandlerBlocked = true;
return originalProcessSignalAndFiles.apply(outputHandler, args);
}
};
// Start two concurrent completion handlers
const completion1 = outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir);
const completion2 = outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir);
await Promise.all([completion1, completion2]);
// Verify mutex prevented duplicate processing
expect(firstHandlerStarted, 'First handler should have started').toBe(true);
expect(secondHandlerBlocked, 'Second handler should have been blocked by mutex').toBe(false);
expect(completionCallCount, 'Should only process completion once').toBe(1);
});
it('should handle completion when agent has questions status', async () => {
// Setup agent workdir with questions signal
const outputDir = join(testWorkdir, '.cw', 'output');
await mkdir(outputDir, { recursive: true });
await writeFile(join(outputDir, 'signal.json'), JSON.stringify({
status: 'questions',
questions: [{ id: 'q1', question: 'What should I do next?' }]
}));
const agentId = 'test-agent';
const getAgentWorkdir = (_alias: string) => testWorkdir;
// Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached
const mockActive = {
outputFilePath: join(testWorkdir, 'output.jsonl'),
streamSessionId: 'session-1'
};
await writeFile(mockActive.outputFilePath, '');
// Mock the update call to track status changes
let finalStatus: string | undefined;
(mockAgentRepository as any).update = async (id: string, updates: any) => {
if (updates.status) {
finalStatus = updates.status;
}
};
await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir);
// Verify agent was marked as waiting for input, not crashed
expect(finalStatus, 'Agent should be waiting for input').toBe('waiting_for_input');
});
it('should handle completion when agent is done', async () => {
// Setup agent workdir with done signal
const outputDir = join(testWorkdir, '.cw', 'output');
await mkdir(outputDir, { recursive: true });
await writeFile(join(outputDir, 'signal.json'), JSON.stringify({
status: 'done'
}));
const agentId = 'test-agent';
const getAgentWorkdir = (_alias: string) => testWorkdir;
// Provide a mock ActiveAgent with outputFilePath so the signal.json check path is reached
const mockActive = {
outputFilePath: join(testWorkdir, 'output.jsonl'),
streamSessionId: 'session-1'
};
await writeFile(mockActive.outputFilePath, '');
// Mock the update call to track status changes
let finalStatus: string | undefined;
(mockAgentRepository as any).update = async (id: string, updates: any) => {
if (updates.status) {
finalStatus = updates.status;
}
};
await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir);
// Verify agent was marked as idle, not crashed
expect(finalStatus, 'Agent should be idle after completion').toBe('idle');
});
it('should clean up completion lock even if processing fails', async () => {
const agentId = 'test-agent';
const getAgentWorkdir = (_alias: string) => testWorkdir;
// Force an error during processing
(mockAgentRepository as any).findById = async () => {
throw new Error('Database error');
};
// handleCompletion propagates errors but the finally block still cleans up the lock
await expect(outputHandler.handleCompletion(agentId, undefined, getAgentWorkdir))
.rejects.toThrow('Database error');
// Verify lock was cleaned up by ensuring a second call can proceed
let secondCallStarted = false;
(mockAgentRepository as any).findById = async (id: string) => {
secondCallStarted = true;
return { ...defaultAgent };
};
// Create signal.json so the second call can complete successfully
const outputDir = join(testWorkdir, '.cw', 'output');
await mkdir(outputDir, { recursive: true });
await writeFile(join(outputDir, 'signal.json'), JSON.stringify({
status: 'done'
}));
const mockActive = {
outputFilePath: join(testWorkdir, 'output.jsonl'),
streamSessionId: 'session-1'
};
await writeFile(mockActive.outputFilePath, '');
await outputHandler.handleCompletion(agentId, mockActive as any, getAgentWorkdir);
expect(secondCallStarted, 'Second call should proceed after lock cleanup').toBe(true);
});
});


@@ -730,25 +730,6 @@ describe('MockAgentManager', () => {
expect(agent?.status).toBe('waiting_for_input');
});
it('should emit stopped event with detail_complete reason (second test)', async () => {
manager.setScenario('detail-done', {
status: 'done',
delay: 0,
result: 'Detail complete',
});
await manager.spawn({
name: 'detail-done',
taskId: 'plan-1',
prompt: 'test',
mode: 'detail',
});
await vi.runAllTimersAsync();
const stopped = eventBus.emittedEvents.find((e) => e.type === 'agent:stopped') as AgentStoppedEvent | undefined;
expect(stopped?.payload.reason).toBe('detail_complete');
});
it('should set result message for detail mode', async () => {
manager.setScenario('detailer', {
status: 'done',


@@ -159,17 +159,7 @@ describe('ProcessManager', () => {
.rejects.toThrow('Worktree creation failed:');
});
it('logs comprehensive worktree creation details', async () => {
const alias = 'test-agent';
const initiativeId = 'init-123';
await processManager.createProjectWorktrees(alias, initiativeId);
// Verify logging (implementation would need to capture log calls)
// For now, just verify the method completes successfully
expect(mockProjectRepository.findProjectsByInitiativeId).toHaveBeenCalledWith('init-123');
});
});
});
describe('createStandaloneWorktree', () => {
beforeEach(() => {
@@ -270,28 +260,6 @@ describe('ProcessManager', () => {
});
});
it('logs comprehensive spawn information', () => {
const agentId = 'agent-123';
const agentName = 'test-agent';
const command = 'claude';
const args = ['--json-schema', 'schema.json'];
const cwd = '/test/workspace/agent-workdirs/test-agent';
const env = { CLAUDE_CONFIG_DIR: '/config' };
const providerName = 'claude';
const result = processManager.spawnDetached(agentId, agentName, command, args, cwd, env, providerName);
expect(result).toHaveProperty('pid', 12345);
expect(result).toHaveProperty('outputFilePath');
expect(result).toHaveProperty('tailer');
// Verify log directory creation uses agent name, not ID
expect(mockMkdirSync).toHaveBeenCalledWith(
'/test/workspace/.cw/agent-logs/test-agent',
{ recursive: true }
);
});
it('writes prompt file when provided', () => {
const agentId = 'agent-123';
const agentName = 'test-agent';


@@ -3,7 +3,7 @@
*
* Tests edge case scenarios in dispatch/coordination flow:
* - Agent crashes during task
* - Agent waiting for input and resume
* - Agent waiting for input
* - Task blocking
* - Merge conflicts
*
@@ -20,8 +20,6 @@ import type {
AgentSpawnedEvent,
AgentCrashedEvent,
AgentWaitingEvent,
AgentResumedEvent,
AgentStoppedEvent,
TaskBlockedEvent,
MergeConflictedEvent,
} from '../../events/types.js';
@@ -171,101 +169,6 @@ describe('E2E Edge Cases', () => {
expect(waitingPayload.questions[0].question).toBe('Which database should I use?');
});
it('resumes agent and completes after resume', async () => {
vi.useFakeTimers();
const seeded = await harness.seedFixture(SIMPLE_FIXTURE);
const taskAId = seeded.tasks.get('Task A')!;
// Pre-seed required idle agent
await harness.agentManager.spawn({
name: 'pool-agent',
taskId: 'placeholder',
prompt: 'placeholder',
});
await harness.advanceTimers();
// Set questions scenario
harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, {
status: 'questions',
questions: [{ id: 'q1', question: 'Which database should I use?' }],
});
await harness.dispatchManager.queue(taskAId);
harness.clearEvents();
const dispatchResult = await harness.dispatchManager.dispatchNext();
await harness.advanceTimers();
// Verify agent is in waiting_for_input status
const agent = await harness.agentManager.get(dispatchResult.agentId!);
expect(agent?.status).toBe('waiting_for_input');
// Clear events to check resume events
harness.clearEvents();
// Resume agent with answers map
await harness.agentManager.resume(dispatchResult.agentId!, { q1: 'PostgreSQL' });
await harness.advanceTimers();
// Verify: agent:resumed event emitted
const resumedEvents = harness.getEventsByType('agent:resumed');
expect(resumedEvents.length).toBe(1);
const resumedPayload = (resumedEvents[0] as AgentResumedEvent).payload;
expect(resumedPayload.taskId).toBe(taskAId);
// Verify: agent:stopped event emitted (after resume completes)
const stoppedEvents = harness.getEventsByType('agent:stopped');
expect(stoppedEvents.length).toBe(1);
const stoppedPayload = (stoppedEvents[0] as AgentStoppedEvent).payload;
expect(stoppedPayload.taskId).toBe(taskAId);
expect(stoppedPayload.reason).toBe('task_complete');
});
it('agent status transitions correctly through waiting and resume', async () => {
vi.useFakeTimers();
const seeded = await harness.seedFixture(SIMPLE_FIXTURE);
const taskAId = seeded.tasks.get('Task A')!;
// Pre-seed required idle agent
await harness.agentManager.spawn({
name: 'pool-agent',
taskId: 'placeholder',
prompt: 'placeholder',
});
await harness.advanceTimers();
// Set questions scenario
harness.setAgentScenario(`agent-${taskAId.slice(0, 6)}`, {
status: 'questions',
questions: [{ id: 'q1', question: 'Which database should I use?' }],
});
await harness.dispatchManager.queue(taskAId);
const dispatchResult = await harness.dispatchManager.dispatchNext();
// Initially running
let agent = await harness.agentManager.get(dispatchResult.agentId!);
expect(agent?.status).toBe('running');
await harness.advanceTimers();
// After scenario completes: waiting_for_input
agent = await harness.agentManager.get(dispatchResult.agentId!);
expect(agent?.status).toBe('waiting_for_input');
// Resume with answers map
await harness.agentManager.resume(dispatchResult.agentId!, { q1: 'PostgreSQL' });
// After resume: running again
agent = await harness.agentManager.get(dispatchResult.agentId!);
expect(agent?.status).toBe('running');
await harness.advanceTimers();
// After completion: idle
agent = await harness.agentManager.get(dispatchResult.agentId!);
expect(agent?.status).toBe('idle');
});
});
describe('Task blocking', () => {


@@ -229,34 +229,4 @@ Initiative page is essentially empty — lacks context, scope, goals, and techni
expect(['idle', 'waiting_for_input', 'stopped']).toContain(finalAgentStatus);
});
it('should demonstrate the race condition exists in current code', async () => {
// This test demonstrates what happens without the fix
// We'll create the conditions that trigger the race condition
// SETUP: No signal.json initially (simulates timing where signal not yet written)
const outputFilePath = join(testDir, 'output.jsonl');
await writeFile(outputFilePath, ''); // Empty output file
const mockActive = {
outputFilePath,
streamSessionId: 'session-1'
};
const getAgentWorkdir = (agentId: string) => testDir;
// EXECUTE: This should hit the "no output received" path and mark as crashed
await (outputHandler as any).handleCompletion(
testAgent.id,
mockActive,
getAgentWorkdir
);
// DOCUMENT: Show what happens in the race condition scenario
console.log('Race condition test - Final status:', finalAgentStatus);
console.log('Race condition test - Update calls:', updateCalls);
// This test shows the current behavior - it may pass or fail depending on fix
expect(updateCalls.length).toBeGreaterThan(0);
console.log('Current behavior when no signal exists:', finalAgentStatus);
});
});


@@ -1,136 +0,0 @@
/**
* END-TO-END integration test with REAL Claude Code process.
* This will capture the actual crash-marking behavior with debug logs.
*/
import { describe, it, expect } from 'vitest';
import { execSync } from 'node:child_process';
import { writeFile, mkdir } from 'node:fs/promises';
import { join } from 'node:path';
import { randomBytes } from 'node:crypto';
import { tmpdir } from 'node:os';
describe('Real E2E Crash Investigation', () => {
it('should spawn a real agent and capture debug logs', async () => {
if (!process.env.REAL_CLAUDE_TESTS) {
console.log('Skipping - REAL_CLAUDE_TESTS not set');
return;
}
console.log('🚀 Starting REAL end-to-end test...');
// Create test workspace
const testWorkspace = join(tmpdir(), `e2e-test-${randomBytes(4).toString('hex')}`);
await mkdir(testWorkspace, { recursive: true });
await mkdir(join(testWorkspace, '.cw/input'), { recursive: true });
// Create minimal input that will ask questions
await writeFile(join(testWorkspace, '.cw/input/manifest.json'), JSON.stringify({
files: ['initiative.md']
}));
await writeFile(join(testWorkspace, '.cw/input/initiative.md'), `---
id: test-e2e
name: E2E Test Initiative
status: active
---`);
console.log('📁 Created test workspace:', testWorkspace);
// Change to the workdir where server is running to spawn agent
const originalCwd = process.cwd();
process.chdir('/Users/lukasmay/development/projects/codewalk-district/workdir');
console.log('🎯 Spawning real Claude agent...');
console.log('📍 Working directory:', process.cwd());
const agentName = `e2e-test-${randomBytes(4).toString('hex')}`;
let spawnResult: any;
try {
// Spawn agent with shorter timeout for testing - use first available initiative
const result = execSync(`cw agent spawn --name ${agentName} --initiative 2-x-DkbSjiiRzlSAsD4ib "Please analyze the current admin UI setup and identify what needs to be replaced with shadcn + tailwind"`, {
encoding: 'utf-8',
timeout: 30000 // 30 seconds max for testing
});
console.log('📄 CLI Output:', result);
spawnResult = { success: true, output: result };
} catch (error: any) {
spawnResult = {
success: false,
exitCode: error.status,
stdout: error.stdout,
stderr: error.stderr
};
console.log('⚠️ CLI execution result:', spawnResult);
// Timeout is expected for agents that take time to complete
if (error.status === 124 || error.message?.includes('timeout')) {
console.log('⏰ Got timeout (expected) - agent was running');
}
}
// Wait a moment for completion detection to run
await new Promise(resolve => setTimeout(resolve, 2000));
// Now check the database for what happened
console.log('📊 Checking agent status in database...');
const dbPath = '.cw/cw.db'; // From workdir
try {
const agentData = execSync(
`sqlite3 "${dbPath}" "SELECT json_object('id', id, 'name', name, 'status', status, 'exit_code', exit_code, 'worktree_id', worktree_id) FROM agents WHERE name = '${agentName}' ORDER BY created_at DESC LIMIT 1;"`,
{ encoding: 'utf-8' }
).trim();
if (agentData) {
const agent = JSON.parse(agentData);
console.log('🔍 Agent found:', agent);
// Check for output files in agent workdir
const agentWorkdir = `agent-workdirs/${agentName}`;
const outputDir = join(agentWorkdir, '.cw/output');
console.log('📂 Checking output dir:', outputDir);
try {
const { existsSync } = await import('node:fs');
const hasSignal = existsSync(join(outputDir, 'signal.json'));
console.log('📄 Has signal.json:', hasSignal);
if (hasSignal) {
const { readFile } = await import('node:fs/promises');
const signalContent = await readFile(join(outputDir, 'signal.json'), 'utf-8');
const signal = JSON.parse(signalContent);
console.log('📋 Signal status:', signal.status);
if (agent.status === 'crashed' && ['questions', 'done', 'error'].includes(signal.status)) {
console.log('🚨 BUG REPRODUCED: Agent crashed despite valid completion!');
console.log('🔍 This indicates our completion detection race condition bug is still active');
} else if (agent.status === 'waiting_for_input' && signal.status === 'questions') {
console.log('✅ CORRECT: Agent properly detected completion and set waiting_for_input');
}
}
} catch (err) {
console.log('❌ Could not check output files:', err);
}
} else {
console.log('❌ No agent found in database');
}
} catch (err) {
console.log('❌ Database query failed:', err);
} finally {
// Restore original working directory
process.chdir(originalCwd);
}
// The test passes regardless - this is for investigation
expect(true).toBe(true);
}, 180000); // 3 minutes timeout
});