Codewalkers/docs/agent.md
Lukas May 44d2a3ff08 docs: Update agent.md to reflect prompt overhaul
Remove CODEBASE_VERIFICATION references, document new shared constants
(TEST_INTEGRITY, SESSION_STARTUP, PROGRESS_TRACKING), update mode prompt
descriptions with TDD protocol, Definition of Done checklists, and
mandatory test specifications.
2026-02-18 17:21:57 +09:00


Agent Module

src/agent/ — Agent lifecycle management, output parsing, multi-provider support, and account failover.

File Inventory

| File | Purpose |
| --- | --- |
| types.ts | Core types: AgentInfo, AgentManager interface, SpawnOptions, StreamEvent |
| manager.ts | MultiProviderAgentManager: main orchestrator class |
| process-manager.ts | AgentProcessManager: worktree creation, command building, detached spawn |
| output-handler.ts | OutputHandler: JSONL stream parsing, completion detection, proposal creation, task dedup |
| file-tailer.ts | FileTailer: watches output files, fires parser + raw content callbacks |
| file-io.ts | Input/output file I/O: frontmatter writing, signal.json reading, tiptap conversion |
| markdown-to-tiptap.ts | Markdown to Tiptap JSON conversion using MarkdownManager |
| index.ts | Public exports, ClaudeAgentManager deprecated alias |

Sub-modules

| Directory | Purpose |
| --- | --- |
| providers/ | Provider registry, presets (7 providers), config types |
| providers/parsers/ | Provider-specific output parsers (Claude JSONL, generic line) |
| accounts/ | Account discovery, config dir setup, credential management, usage API |
| credentials/ | AccountCredentialManager: credential injection per account |
| lifecycle/ | LifecycleController: retry policy, signal recovery, missing signal instructions |
| prompts/ | Mode-specific prompt builders (execute, discuss, plan, detail, refine) + shared blocks (test integrity, deviation rules, git workflow, session startup, progress tracking) + inter-agent communication instructions |

Key Flows

Spawning an Agent

  1. tRPC procedure calls agentManager.spawn(options)
  2. Manager generates alias (adjective-animal), creates DB record
  3. AgentProcessManager.createWorktree() — creates git worktree at .cw-worktrees/agent/<alias>/
  4. file-io.writeInputFiles() — writes .cw/input/ with assignment files (initiative, pages, phase, task) and read-only context dirs (context/phases/, context/tasks/)
  5. Provider config builds spawn command via buildSpawnCommand()
  6. spawnDetached() — launches detached child process with file output redirection
  7. FileTailer watches output file, fires onEvent (parsed stream events) and onRawContent (raw JSONL chunks) callbacks
  8. onRawContent → DB insert via createLogChunkCallback() → agent:output event emitted (single emission point)
  9. OutputHandler.handleStreamEvent() processes parsed events (session tracking, result capture — no event emission)
  10. DB record updated with PID, output file path, session ID
  11. agent:spawned event emitted
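
Step 2's alias generation can be sketched as follows. The source only states the adjective-animal shape, so the word lists, the `generateAlias` name, and the `isTaken` lookup are illustrative assumptions rather than the manager's actual code.

```typescript
// Hypothetical word lists; the real lists are not given in this document.
const ADJECTIVES = ["brave", "calm", "eager", "gentle"] as const;
const ANIMALS = ["otter", "falcon", "lynx", "heron"] as const;

// Pick one element using a random source that returns a number in [0, 1).
function pick<T>(items: readonly T[], random: () => number): T {
  const index = Math.min(items.length - 1, Math.floor(random() * items.length));
  return items[index] as T;
}

// Build an "adjective-animal" alias, retrying while the candidate is taken.
// `isTaken` stands in for a lookup against existing agent records.
function generateAlias(
  isTaken: (alias: string) => boolean,
  random: () => number = Math.random,
  maxAttempts = 50,
): string {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const alias = `${pick(ADJECTIVES, random)}-${pick(ANIMALS, random)}`;
    if (!isTaken(alias)) return alias;
  }
  throw new Error("could not generate a unique alias");
}
```

Injecting the random source keeps the function deterministic under test; a caller would normally rely on the `Math.random` default.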

Completion Detection

  1. Polling detects process exit, FileTailer.stop() flushes remaining output
  2. OutputHandler.handleCompletion() triggered
  3. Path resolution: Uses ActiveAgent.agentCwd (recorded at spawn) to locate signal.json. Standalone agents run in a workspace/ subdirectory under agent-workdirs/<alias>/, so the base getAgentWorkdir() path won't contain .cw/output/signal.json. Reconciliation and crash detection paths also probe for the workspace/ subdirectory when .cw/output is missing at the base level.
  4. Primary path: Reads .cw/output/signal.json from agent worktree
  5. Signal contains { status: "done"|"questions"|"error", result?, questions?, error? }
  6. Agent DB status updated accordingly (idle, waiting_for_input, crashed)
  7. For done: proposals created from structured output; agent:stopped emitted
  8. For questions: parsed and stored as pendingQuestions; agent:waiting emitted
  9. Fallback: If signal.json missing, lifecycle controller retries with instruction injection
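
The signal handling in steps 4 to 8 can be sketched as a pure mapping from signal.json content to a DB status and event. The payload types of `result`, `questions`, and `error` are assumptions (the source only names the fields), and treating a malformed file like a missing one is also an assumption.

```typescript
// Shape of .cw/output/signal.json as described above.
// Payload types are assumptions; the source only names the fields.
interface Signal {
  status: "done" | "questions" | "error";
  result?: unknown;
  questions?: unknown[];
  error?: string;
}

type AgentStatus = "idle" | "waiting_for_input" | "crashed";

interface CompletionOutcome {
  dbStatus: AgentStatus;
  // The flow above names no event for "error", so it is left as null here.
  event: "agent:stopped" | "agent:waiting" | null;
}

// Parse raw file content. Returns null for malformed JSON or an unknown
// status, which a caller could treat like a missing signal (step 9).
function parseSignal(raw: string): Signal | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (value === null || typeof value !== "object") return null;
  const status = (value as { status?: unknown }).status;
  if (status === "done" || status === "questions" || status === "error") {
    return value as Signal;
  }
  return null;
}

// Map a signal to the DB status and event listed in steps 6 to 8.
function outcomeFor(signal: Signal): CompletionOutcome {
  switch (signal.status) {
    case "done":
      return { dbStatus: "idle", event: "agent:stopped" };
    case "questions":
      return { dbStatus: "waiting_for_input", event: "agent:waiting" };
    case "error":
      return { dbStatus: "crashed", event: null };
  }
}
```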

Account Failover

  1. On usage-limit error, markAccountExhausted(id, until) called
  2. findNextAvailable(provider) returns least-recently-used non-exhausted account
  3. Agent re-spawned with new account's credentials
  4. agent:account_switched event emitted
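
A minimal in-memory sketch of the selection rule in steps 1 and 2. The `Account` field names and the explicit `accounts` and `now` parameters are assumptions for illustration; the documented signatures are `markAccountExhausted(id, until)` and `findNextAvailable(provider)`.

```typescript
// Hypothetical account record; field names are assumptions.
interface Account {
  id: string;
  provider: string;
  lastUsedAt: number; // epoch ms of last use
  exhaustedUntil: number | null; // epoch ms, or null when not exhausted
}

// Record that an account hit its usage limit until the given time.
function markAccountExhausted(accounts: Account[], id: string, until: number): void {
  for (const account of accounts) {
    if (account.id === id) account.exhaustedUntil = until;
  }
}

// Return the least-recently-used account for the provider that is not
// currently exhausted, or null when none qualifies.
function findNextAvailable(
  accounts: readonly Account[],
  provider: string,
  now: number,
): Account | null {
  let best: Account | null = null;
  for (const account of accounts) {
    if (account.provider !== provider) continue;
    if (account.exhaustedUntil !== null && account.exhaustedUntil > now) continue;
    if (best === null || account.lastUsedAt < best.lastUsedAt) best = account;
  }
  return best;
}
```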

Resume Flow

  1. tRPC resumeAgent called with answers: Record<string, string>
  2. Manager looks up agent's session ID and provider config
  3. buildResumeCommand() creates resume command with session flag
  4. formatAnswersAsPrompt(answers) converts answers to prompt text
  5. New detached process spawned, same worktree, incremented session number
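
Step 4 could look roughly like the sketch below. The source gives only the name and input type of `formatAnswersAsPrompt`; the rendered wording and line format here are assumptions.

```typescript
// Hypothetical rendering: a header line followed by one
// "- <questionId>: <answer>" line per entry.
function formatAnswersAsPrompt(answers: Record<string, string>): string {
  const ids = Object.keys(answers);
  if (ids.length === 0) return "No answers were provided.";
  const lines = ids.map((id) => `- ${id}: ${answers[id]}`);
  return ["Answers to your pending questions:"].concat(lines).join("\n");
}
```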

Provider Configuration

Providers defined in providers/presets.ts:

| Provider | Command | Resume | Prompt Mode |
| --- | --- | --- | --- |
| claude | claude | --resume <id> | native (-p) |
| claude-code | claude | --resume <id> | native |
| codex | codex | none | flag (--prompt) |
| aider | aider | none | flag (--message) |
| cline | cline | none | flag |
| continue | continue | none | flag |
| cursor-agent | cursor | none | flag |

Each provider config specifies: command, args, resumeStyle, promptMode, structuredOutput, sessionId extraction, nonInteractive options.
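
The preset table can be mirrored as typed data, as in the sketch below. The `ProviderPreset` shape and the `resumeArgs` helper are assumptions for illustration; the real config in providers/presets.ts also carries args, structuredOutput, session ID extraction, and nonInteractive options, which are omitted here.

```typescript
// Hypothetical, reduced preset shape covering only the table's columns.
interface ProviderPreset {
  command: string;
  resume: string | null; // resume flag template, or null for "none"
  promptMode: "native" | "flag";
  promptFlag: string | null; // flag shown in the table, when one is listed
}

const PRESETS: Record<string, ProviderPreset> = {
  claude: { command: "claude", resume: "--resume <id>", promptMode: "native", promptFlag: "-p" },
  "claude-code": { command: "claude", resume: "--resume <id>", promptMode: "native", promptFlag: null },
  codex: { command: "codex", resume: null, promptMode: "flag", promptFlag: "--prompt" },
  aider: { command: "aider", resume: null, promptMode: "flag", promptFlag: "--message" },
  cline: { command: "cline", resume: null, promptMode: "flag", promptFlag: null },
  continue: { command: "continue", resume: null, promptMode: "flag", promptFlag: null },
  "cursor-agent": { command: "cursor", resume: null, promptMode: "flag", promptFlag: null },
};

// Expand the resume template for a provider, or return null when the
// provider is unknown or does not support resume.
function resumeArgs(provider: string, sessionId: string): string[] | null {
  if (!Object.prototype.hasOwnProperty.call(PRESETS, provider)) return null;
  const preset = PRESETS[provider];
  if (!preset || preset.resume === null) return null;
  // Split first, then substitute, so a session ID is never re-split.
  return preset.resume.split(" ").map((part) => (part === "<id>" ? sessionId : part));
}
```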

Output Parsing

The OutputHandler processes JSONL streams from Claude CLI:

  • init event → session ID extracted and persisted
  • text_delta events → no-op in handler (output streaming handled by DB log chunks)
  • result event → final result with structured data captured on ActiveAgent
  • Signal file (signal.json) → authoritative completion status

Output event flow: FileTailer.onRawContent() → DB insertChunk() → EventBus.emit('agent:output'). This is the single emission point: neither handleStreamEvent() nor processLine() emits events.

For providers without structured output, the generic line parser accumulates raw text.
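
The event handling above can be sketched as a small JSONL consumer. The event field names (`type`, `sessionId`, `result`) are assumptions, since the source lists only the event kinds; the carry-over buffer shows one common way to handle lines split across file chunks.

```typescript
// State captured while reading the stream.
interface ParseState {
  sessionId: string | null;
  result: unknown;
}

// Handle one complete JSONL line. Field names are assumptions.
function handleLine(line: string, state: ParseState): void {
  const trimmed = line.trim();
  if (trimmed === "") return;
  let parsed: unknown;
  try {
    parsed = JSON.parse(trimmed);
  } catch {
    return; // skip lines that are not valid JSON
  }
  if (parsed === null || typeof parsed !== "object") return;
  const event = parsed as { type?: unknown; sessionId?: unknown; result?: unknown };
  if (event.type === "init" && typeof event.sessionId === "string") {
    state.sessionId = event.sessionId;
  } else if (event.type === "result") {
    state.result = event.result;
  }
  // text_delta and unknown event types are intentionally no-ops here.
}

// Feed a raw chunk; returns the trailing partial line to carry into the
// next call so that a line split across chunks is parsed only once whole.
function feedChunk(chunk: string, carry: string, state: ParseState): string {
  const parts = (carry + chunk).split("\n");
  const rest = parts.pop() ?? "";
  for (const part of parts) handleLine(part, state);
  return rest;
}
```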

Credential Management

AccountCredentialManager in credentials/ handles OAuth token lifecycle:

  • read() — extracts claudeAiOauth from .credentials.json. Only accessToken is required; refreshToken and expiresAt may be null (setup tokens).
  • isExpired() — returns false when expiresAt is null (setup tokens never "expire" from our perspective).
  • ensureValid() — if expired and refreshToken exists, refreshes. If expired with no refreshToken, returns invalid with error.
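
The three rules above can be sketched as pure functions. This is a reduced model: the real `ensureValid()` performs the refresh itself, whereas `nextAction` here only reports which branch applies. Treating `expiresAt` as epoch milliseconds and passing `now` explicitly are assumptions.

```typescript
interface OAuthCredentials {
  accessToken: string;
  refreshToken: string | null;
  expiresAt: number | null; // assumed to be epoch ms; null for setup tokens
}

// Extract claudeAiOauth from .credentials.json content.
// Only accessToken is required; the other fields normalise to null.
function readCredentials(fileContent: string): OAuthCredentials | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(fileContent);
  } catch {
    return null;
  }
  if (parsed === null || typeof parsed !== "object") return null;
  const oauth = (parsed as { claudeAiOauth?: unknown }).claudeAiOauth;
  if (oauth === null || typeof oauth !== "object") return null;
  const o = oauth as { accessToken?: unknown; refreshToken?: unknown; expiresAt?: unknown };
  if (typeof o.accessToken !== "string") return null;
  return {
    accessToken: o.accessToken,
    refreshToken: typeof o.refreshToken === "string" ? o.refreshToken : null,
    expiresAt: typeof o.expiresAt === "number" ? o.expiresAt : null,
  };
}

// A null expiresAt never counts as expired.
function isExpired(c: OAuthCredentials, now: number): boolean {
  return c.expiresAt !== null && c.expiresAt <= now;
}

type CredentialAction = "use" | "refresh" | "invalid";

// Which branch of ensureValid() applies to these credentials.
function nextAction(c: OAuthCredentials, now: number): CredentialAction {
  if (!isExpired(c, now)) return "use";
  return c.refreshToken !== null ? "refresh" : "invalid";
}
```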

Setup Tokens

Setup tokens (from claude setup-token) are long-lived OAuth access tokens with no refresh token or expiry. Register via:

cw account add --token <token> --email user@example.com

Stored as credentials: {"claudeAiOauth":{"accessToken":"<token>"}} and configJson: {"hasCompletedOnboarding":true}.

Auto-Cleanup & Commit Retries

After an agent completes (status → idle), tryAutoCleanup checks if its project worktrees have uncommitted changes:

  1. CleanupManager.getDirtyWorktreePaths() runs git status --porcelain in each project subdirectory (not the parent agent-workdirs/<alias>/ dir)
  2. If all clean → worktrees and logs removed immediately
  3. If dirty → resumeForCommit() resumes the agent's session with a prompt listing the specific dirty subdirectories (e.g. - `my-project/`)
  4. The agent cds into each listed directory and commits
  5. On next completion, cleanup runs again. MAX_COMMIT_RETRIES (1) limits retries — after that the workdir is left in place with a warning

The retry counter is cleaned up on: successful removal, max retries exceeded, or unexpected error. It is not cleaned up when a commit retry is successfully launched (so the counter persists across the retry cycle).
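
The decision in steps 2 to 5 can be sketched as a pure function over the `git status --porcelain` output of each project subdirectory. The `decideCleanup` name, the action shape, and the prompt wording are assumptions; only the three outcomes and the MAX_COMMIT_RETRIES value come from the text above.

```typescript
const MAX_COMMIT_RETRIES = 1;

interface WorktreeStatus {
  dir: string; // project subdirectory name
  porcelain: string; // raw output of `git status --porcelain` in that directory
}

type CleanupAction =
  | { kind: "remove" }
  | { kind: "resume-for-commit"; prompt: string }
  | { kind: "leave-with-warning" };

// `git status --porcelain` prints nothing for a clean working tree.
function isDirty(porcelain: string): boolean {
  return porcelain.trim().length > 0;
}

// Choose the cleanup action from worktree state and retries already used.
function decideCleanup(worktrees: readonly WorktreeStatus[], retriesSoFar: number): CleanupAction {
  const dirty = worktrees.filter((w) => isDirty(w.porcelain)).map((w) => w.dir);
  if (dirty.length === 0) return { kind: "remove" };
  if (retriesSoFar >= MAX_COMMIT_RETRIES) return { kind: "leave-with-warning" };
  // Hypothetical prompt wording; it lists only the dirty subdirectories.
  const list = dirty.map((dir) => `- \`${dir}/\``).join("\n");
  return { kind: "resume-for-commit", prompt: `Commit your changes in:\n${list}` };
}
```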

Log Chunks

Agent output is persisted to agent_log_chunks table and drives all live streaming:

  • onRawContent callback fires for every raw JSONL chunk from FileTailer
  • DB insert → agent:output event emission (single source of truth for UI)
  • No FK to agents — survives agent deletion
  • Session tracking: spawn=1, resume=previousMax+1
  • Read path (getAgentOutput tRPC): concatenates all DB chunks (no file fallback)
  • Live path (onAgentOutput subscription): listens for agent:output events
  • Frontend: initial query loads from DB, subscription accumulates raw JSONL, both parsed via parseAgentOutput()
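
An in-memory sketch of the persistence and emission rules above. The `LogChunkStore` class, its field names, and the `emit` callback signature are assumptions standing in for the agent_log_chunks table and the event bus.

```typescript
interface LogChunk {
  agentId: string;
  sessionNumber: number;
  content: string; // raw JSONL text
}

class LogChunkStore {
  private chunks: LogChunk[] = [];
  // `emit` stands in for the event bus; its signature is an assumption.
  private emit: (event: "agent:output", chunk: LogChunk) => void;

  constructor(emit: (event: "agent:output", chunk: LogChunk) => void) {
    this.emit = emit;
  }

  // Spawn uses session 1; each resume uses the previous maximum plus one.
  nextSessionNumber(agentId: string): number {
    let max = 0;
    for (const chunk of this.chunks) {
      if (chunk.agentId === agentId && chunk.sessionNumber > max) max = chunk.sessionNumber;
    }
    return max + 1;
  }

  // Persist first, then emit: the insert is the only place the event fires.
  insert(chunk: LogChunk): void {
    this.chunks.push(chunk);
    this.emit("agent:output", chunk);
  }

  // Read path: concatenate every stored chunk for the agent, in insertion order.
  readOutput(agentId: string): string {
    let output = "";
    for (const chunk of this.chunks) {
      if (chunk.agentId === agentId) output += chunk.content;
    }
    return output;
  }
}
```

Because reads and live events both originate from the same insert, a client that loads the stored output and then subscribes sees one consistent stream.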

Inter-Agent Communication

Agents can communicate with each other via the conversations table, coordinated through CLI commands.

Prompt Integration

The buildInterAgentCommunication(agentId) function in prompts/shared.ts generates per-agent communication instructions. It is called in manager.ts after agent record creation, so the actual agent ID is injected directly into the prompt (no manifest.json indirection). Its output is appended to the prompt regardless of mode. Instructions include:

  1. Set up a background listener via temp-file redirect: cw listen > $CW_LISTEN_FILE &
  2. Periodically check the temp file for incoming questions between work steps
  3. Answer via cw answer, clear the file, restart the listener
  4. Ask questions to peers via cw ask --from <agentId> --agent-id|--task-id|--phase-id
  5. Kill the listener and clean up the temp file before writing signal.json

Agent Identity

manifest.json includes agentId and agentName fields. The manager passes these from the DB record after agent creation. The agent ID is also injected directly into the prompt's communication instructions.

CLI Commands

cw listen --agent-id <id>

  • Subscribes to the onPendingConversation SSE stream, prints the first pending conversation as JSON, then exits with code 0
  • First yields any existing pending conversations from DB, then listens for conversation:created events
  • Output: { conversationId, fromAgentId, question, phaseId?, taskId? }

cw ask <question> --from <agentId> --agent-id|--task-id|--phase-id <target>

  • Creates conversation, subscribes to onConversationAnswer SSE, prints answer text to stdout when answered
  • Target resolution: --agent-id (direct), --task-id (find agent running task), --phase-id (find agent in phase)

cw answer <answer> --conversation-id <id>

  • Calls answerConversation, prints { conversationId, status: "answered" }
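
Target resolution for cw ask can be sketched over an in-memory list of agents. The `RunningAgent` shape and the first-match rule for tasks and phases are assumptions; the source only states which flag maps to which lookup.

```typescript
// Hypothetical view of a running agent.
interface RunningAgent {
  id: string;
  taskId: string | null;
  phaseId: string | null;
}

// Exactly one of the three flags is expected.
type AskTarget = { agentId: string } | { taskId: string } | { phaseId: string };

// Resolve the target flag to an agent ID, or null when no agent matches.
function resolveTarget(target: AskTarget, agents: readonly RunningAgent[]): string | null {
  if ("agentId" in target) return target.agentId; // --agent-id: direct
  for (const agent of agents) {
    if ("taskId" in target) {
      if (agent.taskId === target.taskId) return agent.id; // --task-id
    } else if (agent.phaseId === target.phaseId) {
      return agent.id; // --phase-id: first match wins (assumption)
    }
  }
  return null;
}
```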

Prompt Architecture

Mode-specific prompts in prompts/ are composed from shared blocks and mode-specific sections.

Shared Blocks (prompts/shared.ts)

| Constant | Content |
| --- | --- |
| SIGNAL_FORMAT | Signal output format (done/questions/error via .cw/output/signal.json) |
| INPUT_FILES | Input file structure (manifest, assignment files, context files) |
| ID_GENERATION | cw id usage for generating entity IDs |
| TEST_INTEGRITY | Non-negotiable test rules: no self-validating tests, no assertion mutation, no skipping, independent tests, full suite runs |
| SESSION_STARTUP | Environment verification sequence: confirm working directory, check git state, establish green test baseline, read assignment |
| PROGRESS_TRACKING | Maintain .cw/output/progress.md after each commit (survives context compaction) |
| DEVIATION_RULES | Decision tree for handling unexpected situations (typo→fix, bug→fix if small, missing dep→coordinate, architectural mismatch→STOP) |
| GIT_WORKFLOW | Worktree-aware git guidance: specific file staging (no git add .), no force-push, check status first |
| CONTEXT_MANAGEMENT | Parallel file reads, cross-reference to progress tracking |
| buildInterAgentCommunication() | Per-agent CLI instructions for cw listen, cw ask, cw answer (compact format with usage pattern summary) |

Mode Prompts

| Mode | File | Key Sections |
| --- | --- | --- |
| execute | execute.ts | Session startup (baseline verification), execution protocol (RED-GREEN-REFACTOR: write failing tests→implement→verify→commit→iterate), test integrity rules, anti-patterns (self-validating tests, test mutation), scope rules (7+ files = overscoping), deviation rules, git workflow, progress tracking, Definition of Done checklist |
| plan | plan.ts | Testing strategy (tests per phase, not trailing phase), dependency graph with wave analysis, file ownership for parallelism, specificity test, Definition of Done checklist |
| detail | detail.ts | Mandatory test specifications (file path, scenarios, run command) for execute tasks, specificity test with good/bad examples, file ownership constraints, task sizing by lines changed, checkpoint guidance, Definition of Done checklist |
| discuss | discuss.ts | Goal-backward analysis (outcome→artifacts→wiring→failure points), question quality examples, decision quality with verification criteria, testability & verification question category, Definition of Done checklist |
| refine | refine.ts | Improvement hierarchy (ambiguity > missing details > contradictions > unverifiable requirements with testable acceptance criteria > missing edge cases as testable scenarios), Definition of Done checklist |

Execute Prompt Dispatch

buildExecutePrompt(taskDescription?) accepts an optional task description that's inlined into the prompt. The dispatch manager (src/dispatch/manager.ts) wraps task.description || task.name in buildExecutePrompt() so execute agents receive full system context (execution protocol, scope rules, anti-patterns) alongside their task description. The workspace layout and inter-agent communication blocks are appended by the agent manager at spawn time.
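
The dispatch wrapping can be sketched as below. The block contents are placeholders, and the section order, the "Task:" heading, and the `promptForTask` helper are assumptions; only the `task.description || task.name` fallback and the optional parameter come from the text above.

```typescript
// Placeholder stand-ins for the shared blocks in prompts/shared.ts.
const EXECUTE_SECTIONS: readonly string[] = [
  "<SESSION_STARTUP>",
  "<execution protocol>",
  "<TEST_INTEGRITY>",
  "<scope rules>",
  "<DEVIATION_RULES>",
  "<GIT_WORKFLOW>",
  "<PROGRESS_TRACKING>",
  "<Definition of Done>",
];

// Compose the execute prompt; the task description is inlined when given.
function buildExecutePrompt(taskDescription?: string): string {
  const sections = EXECUTE_SECTIONS.slice();
  if (taskDescription) sections.push(`Task:\n${taskDescription}`);
  return sections.join("\n\n");
}

// Hypothetical task shape used by the dispatch side.
interface Task {
  name: string;
  description?: string | null;
}

// Mirrors the dispatch rule: fall back to the name when the description
// is empty, null, or missing.
function promptForTask(task: Task): string {
  return buildExecutePrompt(task.description || task.name);
}
```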