Architect agents (discuss, plan, detail, refine) were producing generic analysis disconnected from the actual codebase. They had full tool access in their worktrees but were never instructed to explore the code.

- Add CODEBASE_EXPLORATION shared constant: read project docs, explore structure, check existing patterns, use subagents for parallel exploration
- Inject into all 4 architect prompts after INPUT_FILES
- Strengthen discuss prompt: analysis method references codebase, examples cite specific paths, definition_of_done requires codebase references
- Fix spawnArchitectDiscuss to pass full context (pages/phases/tasks) via gatherInitiativeContext() — was only passing bare initiative metadata
- Update docs/agent.md with new tag ordering and shared block table

# Agent Module

`apps/server/agent/` — Agent lifecycle management, output parsing, multi-provider support, and account failover.

## File Inventory

| File | Purpose |
|------|---------|
| `types.ts` | Core types: `AgentInfo`, `AgentManager` interface, `SpawnOptions`, `StreamEvent` |
| `manager.ts` | `MultiProviderAgentManager` — main orchestrator class |
| `process-manager.ts` | `AgentProcessManager` — worktree creation, command building, detached spawn |
| `output-handler.ts` | `OutputHandler` — JSONL stream parsing, completion detection, proposal creation, task dedup |
| `file-tailer.ts` | `FileTailer` — watches output files, fires parser + raw content callbacks |
| `file-io.ts` | Input/output file I/O: frontmatter writing, signal.json reading, tiptap conversion |
| `markdown-to-tiptap.ts` | Markdown to Tiptap JSON conversion using MarkdownManager |
| `index.ts` | Public exports, `ClaudeAgentManager` deprecated alias |

### Sub-modules

| Directory | Purpose |
|-----------|---------|
| `providers/` | Provider registry, presets (7 providers), config types |
| `providers/parsers/` | Provider-specific output parsers (Claude JSONL, generic line) |
| `accounts/` | Account discovery, config dir setup, credential management, usage API |
| `credentials/` | `AccountCredentialManager` — credential injection per account |
| `lifecycle/` | `LifecycleController` — retry policy, signal recovery, missing signal instructions |
| `prompts/` | Mode-specific prompt builders (execute, discuss, plan, detail, refine) + shared blocks (test integrity, deviation rules, git workflow, session startup, progress tracking) + inter-agent communication instructions |

## Key Flows

### Spawning an Agent

1. **tRPC procedure** calls `agentManager.spawn(options)`
2. Manager generates alias (adjective-animal), creates DB record
3. `AgentProcessManager.createWorktree()` — creates git worktree at `.cw-worktrees/agent/<alias>/`
4. `file-io.writeInputFiles()` — writes `.cw/input/` with assignment files (initiative, pages, phase, task) and read-only context dirs (`context/phases/`, `context/tasks/`)
5. Provider config builds spawn command via `buildSpawnCommand()`
6. `spawnDetached()` — launches detached child process with file output redirection
7. `FileTailer` watches output file, fires `onEvent` (parsed stream events) and `onRawContent` (raw JSONL chunks) callbacks
8. `onRawContent` → DB insert via `createLogChunkCallback()` → `agent:output` event emitted (single emission point)
9. `OutputHandler.handleStreamEvent()` processes parsed events (session tracking, result capture — no event emission)
10. DB record updated with PID, output file path, session ID
11. `agent:spawned` event emitted

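Steps 2 and 3 of the spawn flow can be sketched as follows. This is a minimal illustration, not the manager's actual code: the word lists and the names `generateAlias` and `worktreePath` are hypothetical, and only the adjective-animal alias shape and the `.cw-worktrees/agent/<alias>/` path come from the list above.

```typescript
// Hypothetical word lists; the real lists used by the manager are not shown here.
const ADJECTIVES = ["brave", "calm", "eager"] as const;
const ANIMALS = ["otter", "heron", "lynx"] as const;

/** Builds an adjective-animal alias. `rand` is injectable so callers can make it deterministic. */
function generateAlias(rand: () => number = Math.random): string {
  const pick = <T>(items: readonly T[]): T => items[Math.floor(rand() * items.length)];
  return `${pick(ADJECTIVES)}-${pick(ANIMALS)}`;
}

/** Worktree location described in step 3: `.cw-worktrees/agent/<alias>/`. */
function worktreePath(alias: string): string {
  return `.cw-worktrees/agent/${alias}/`;
}

// A fixed `rand` of 0 always selects the first entry of each list.
const exampleAlias = generateAlias(() => 0);
const examplePath = worktreePath(exampleAlias);
```
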
### Completion Detection

1. Polling detects process exit, `FileTailer.stop()` flushes remaining output
2. `OutputHandler.handleCompletion()` triggered
3. **Path resolution**: Uses `ActiveAgent.agentCwd` (recorded at spawn) to locate signal.json. Standalone agents run in a `workspace/` subdirectory under `agent-workdirs/<alias>/`, so the base `getAgentWorkdir()` path won't contain `.cw/output/signal.json`. Reconciliation and crash detection paths also probe for the `workspace/` subdirectory when `.cw/output` is missing at the base level.
4. **Primary path**: Reads `.cw/output/signal.json` from agent worktree
5. Signal contains `{ status: "done"|"questions"|"error", result?, questions?, error? }`
6. Agent DB status updated accordingly (idle, waiting_for_input, crashed)
7. For `done`: proposals created from structured output; `agent:stopped` emitted
8. For `questions`: parsed and stored as `pendingQuestions`; `agent:waiting` emitted
9. **Fallback**: If signal.json missing, lifecycle controller retries with instruction injection

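The signal shape from step 5 and the status update from step 6 might be modelled like this. It is a sketch under assumptions: `parseSignal` and `statusFor` are hypothetical names, the payload field types are guesses, and the pairing of `error` with `crashed` is inferred from the order in which the statuses are listed.

```typescript
// Signal shape from step 5; the types of result/questions/error are assumptions.
type Signal =
  | { status: "done"; result?: unknown }
  | { status: "questions"; questions?: unknown[] }
  | { status: "error"; error?: string };

type AgentStatus = "idle" | "waiting_for_input" | "crashed";

/** Parses the raw contents of signal.json; returns null when the payload is unusable. */
function parseSignal(raw: string): Signal | null {
  try {
    const value: unknown = JSON.parse(raw);
    if (typeof value !== "object" || value === null) return null;
    const status = (value as { status?: unknown }).status;
    if (status === "done" || status === "questions" || status === "error") {
      return value as Signal;
    }
    return null;
  } catch {
    return null; // malformed JSON is treated like a missing signal
  }
}

/** Maps a signal to the agent DB status (assumed pairing, in listed order). */
function statusFor(signal: Signal): AgentStatus {
  switch (signal.status) {
    case "done":
      return "idle";
    case "questions":
      return "waiting_for_input";
    case "error":
      return "crashed";
  }
}
```

A `null` from `parseSignal` corresponds to the fallback in step 9, where the lifecycle controller would retry.
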
### Account Failover

1. On usage-limit error, `markAccountExhausted(id, until)` called
2. `findNextAvailable(provider)` returns least-recently-used non-exhausted account
3. Agent re-spawned with new account's credentials
4. `agent:account_switched` event emitted

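The selection rule in steps 1 and 2 can be illustrated with an in-memory sketch. The `Account` fields `lastUsedAt` and `exhaustedUntil` are hypothetical, and these functions take the account list and the current time as explicit arguments for self-containment, whereas the real `findNextAvailable(provider)` and `markAccountExhausted(id, until)` presumably read and write the database.

```typescript
interface Account {
  id: string;
  provider: string;
  lastUsedAt: number; // epoch ms (hypothetical field)
  exhaustedUntil: number | null; // epoch ms, null when not exhausted (hypothetical field)
}

/** Returns a copy of the list with the given account marked exhausted until `until`. */
function markAccountExhausted(accounts: Account[], id: string, until: number): Account[] {
  return accounts.map((a) => (a.id === id ? { ...a, exhaustedUntil: until } : a));
}

/** Picks the least-recently-used account for `provider` that is not exhausted at `now`. */
function findNextAvailable(accounts: Account[], provider: string, now: number): Account | null {
  const candidates = accounts
    .filter((a) => a.provider === provider)
    .filter((a) => a.exhaustedUntil === null || a.exhaustedUntil <= now)
    .sort((a, b) => a.lastUsedAt - b.lastUsedAt); // oldest use first
  return candidates[0] ?? null;
}
```
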
### Resume Flow

1. tRPC `resumeAgent` called with `answers: Record<string, string>`
2. Manager looks up agent's session ID and provider config
3. `buildResumeCommand()` creates resume command with session flag
4. `formatAnswersAsPrompt(answers)` converts answers to prompt text
5. New detached process spawned, same worktree, incremented session number

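Step 4 could look roughly like the sketch below. Only the function name and the `Record<string, string>` input come from the list above; the exact prompt wording and the one-line-per-answer layout are assumptions for illustration.

```typescript
/** Turns question-id → answer pairs into prompt text (layout is an assumed format). */
function formatAnswersAsPrompt(answers: Record<string, string>): string {
  const lines = Object.keys(answers).map((questionId) => `- ${questionId}: ${answers[questionId]}`);
  return ["Answers to your questions:", ...lines].join("\n");
}

const resumePrompt = formatAnswersAsPrompt({ q1: "Use PostgreSQL", q2: "Yes" });
```
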
## Provider Configuration

Providers defined in `providers/presets.ts`:

| Provider | Command | Resume | Prompt Mode |
|----------|---------|--------|-------------|
| claude | `claude` | `--resume <id>` | native (`-p`) |
| claude-code | `claude` | `--resume <id>` | native |
| codex | `codex` | none | flag (`--prompt`) |
| aider | `aider` | none | flag (`--message`) |
| cline | `cline` | none | flag |
| continue | `continue` | none | flag |
| cursor-agent | `cursor` | none | flag |

Each provider config specifies: `command`, `args`, `resumeStyle`, `promptMode`, `structuredOutput`, `sessionId` extraction, `nonInteractive` options.

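A reduced sketch of how a preset might drive command building is shown here. The field names `command`, `args`, `resumeStyle`, and `promptMode` come from the sentence above, and the flag values mirror the table; the `resumeFlag` and `promptFlag` fields, the literal values of the style enums, and the argument order are assumptions rather than the contents of `providers/presets.ts`.

```typescript
// Reduced config: structuredOutput, sessionId extraction and nonInteractive are omitted.
interface ProviderConfig {
  command: string;
  args: string[];
  resumeStyle: "flag" | "none"; // assumed literal values
  resumeFlag?: string; // hypothetical field
  promptMode: "native" | "flag";
  promptFlag?: string; // hypothetical field
}

const claudePreset: ProviderConfig = {
  command: "claude", args: [], resumeStyle: "flag", resumeFlag: "--resume",
  promptMode: "native", promptFlag: "-p",
};
const codexPreset: ProviderConfig = {
  command: "codex", args: [], resumeStyle: "none",
  promptMode: "flag", promptFlag: "--prompt",
};

/** Prompt arguments: `<flag> <prompt>` when a flag is configured, else the bare prompt. */
function promptArgs(cfg: ProviderConfig, prompt: string): string[] {
  return cfg.promptFlag ? [cfg.promptFlag, prompt] : [prompt];
}

function buildSpawnCommand(cfg: ProviderConfig, prompt: string): string[] {
  return [cfg.command, ...cfg.args, ...promptArgs(cfg, prompt)];
}

/** Returns null for providers whose Resume column is "none". */
function buildResumeCommand(cfg: ProviderConfig, sessionId: string, prompt: string): string[] | null {
  if (cfg.resumeStyle === "none" || !cfg.resumeFlag) return null;
  return [cfg.command, ...cfg.args, cfg.resumeFlag, sessionId, ...promptArgs(cfg, prompt)];
}
```

Returning the command as an argument array, rather than a joined string, avoids shell quoting issues when the prompt contains spaces or quotes.
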
## Output Parsing

The `OutputHandler` processes JSONL streams from the Claude CLI:

- `init` event → session ID extracted and persisted
- `text_delta` events → no-op in handler (output streaming handled by DB log chunks)
- `result` event → final result with structured data captured on `ActiveAgent`
- Signal file (`signal.json`) → authoritative completion status

**Output event flow**: `FileTailer.onRawContent()` → DB `insertChunk()` → `EventBus.emit('agent:output')`. This is the single emission point — no events from `handleStreamEvent()` or `processLine()`.

For providers without structured output, the generic line parser accumulates raw text.

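The per-event handling above can be sketched as a small line parser plus a handler. The JSON field names (`type`, `sessionId`, `text`, `result`) and the `ActiveAgentState` shape are simplified assumptions and may not match the real Claude CLI output or the actual `StreamEvent` type in `types.ts`.

```typescript
type StreamEvent =
  | { type: "init"; sessionId: string }
  | { type: "text_delta"; text: string }
  | { type: "result"; result: unknown };

// Stand-in for the fields tracked on `ActiveAgent` (assumed shape).
interface ActiveAgentState {
  sessionId?: string;
  result?: unknown;
}

/** Parses one JSONL line; blank, malformed, or unknown lines yield null. */
function parseLine(line: string): StreamEvent | null {
  const trimmed = line.trim();
  if (trimmed === "") return null;
  try {
    const obj = JSON.parse(trimmed) as Record<string, unknown> | null;
    if (obj === null || typeof obj !== "object") return null;
    const { type, sessionId, text, result } = obj;
    if (type === "init" && typeof sessionId === "string") return { type: "init", sessionId };
    if (type === "text_delta" && typeof text === "string") return { type: "text_delta", text };
    if (type === "result") return { type: "result", result };
    return null;
  } catch {
    return null;
  }
}

/** Updates agent state only; it deliberately emits no events. */
function handleStreamEvent(state: ActiveAgentState, event: StreamEvent): void {
  switch (event.type) {
    case "init":
      state.sessionId = event.sessionId;
      break;
    case "text_delta":
      break; // no-op: streaming is driven by DB log chunks
    case "result":
      state.result = event.result;
      break;
  }
}
```
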
## Credential Management

`AccountCredentialManager` in `credentials/` handles OAuth token lifecycle:
- `read()` — extracts `claudeAiOauth` from `.credentials.json`. Only `accessToken` is required; `refreshToken` and `expiresAt` may be null (setup tokens).
- `isExpired()` — returns false when `expiresAt` is null (setup tokens never "expire" from our perspective).
- `ensureValid()` — if expired and `refreshToken` exists, refreshes. If expired with no `refreshToken`, returns invalid with error.

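The expiry rules above reduce to a small decision function. This sketch assumes `expiresAt` is an epoch-millisecond timestamp and uses a hypothetical `nextAction` helper in place of the real, presumably asynchronous, `ensureValid()`.

```typescript
interface OAuthCredentials {
  accessToken: string;
  refreshToken: string | null;
  expiresAt: number | null; // epoch ms (assumed unit); null for setup tokens
}

/** A null expiry never counts as expired, matching the setup-token rule. */
function isExpired(creds: OAuthCredentials, now: number): boolean {
  return creds.expiresAt !== null && creds.expiresAt <= now;
}

type CredentialAction = "use" | "refresh" | "invalid";

/** Decides what `ensureValid()` would do, without performing any network call. */
function nextAction(creds: OAuthCredentials, now: number): CredentialAction {
  if (!isExpired(creds, now)) return "use";
  return creds.refreshToken !== null ? "refresh" : "invalid";
}
```
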
### Setup Tokens

Setup tokens (from `claude setup-token`) are long-lived OAuth access tokens with no refresh token or expiry. Register via:

```sh
cw account add --token <token> --email user@example.com
```

Stored as `credentials: {"claudeAiOauth":{"accessToken":"<token>"}}` and `configJson: {"hasCompletedOnboarding":true}`.

## Auto-Cleanup & Commit Retries

After an agent completes (status → `idle`), `tryAutoCleanup` checks if its project worktrees have uncommitted changes:

1. `CleanupManager.getDirtyWorktreePaths()` runs `git status --porcelain` in each project subdirectory (not the parent `agent-workdirs/<alias>/` dir)
2. If all clean → worktrees and logs removed immediately
3. If dirty → `resumeForCommit()` resumes the agent's session with a prompt listing the specific dirty subdirectories (e.g. `` - `my-project/` ``)
4. The agent `cd`s into each listed directory and commits
5. On next completion, cleanup runs again. `MAX_COMMIT_RETRIES` (1) limits retries — after that the workdir is left in place with a warning

The retry counter is cleaned up on: successful removal, max retries exceeded, or unexpected error. It is **not** cleaned up when a commit retry is successfully launched (so the counter persists across the retry cycle).

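The retry bookkeeping described above can be sketched as a pure decision function. The `decideCleanup` name, the `CleanupAction` values, and the in-memory `Map` are hypothetical; only `MAX_COMMIT_RETRIES = 1` and the rules for when the counter is cleared come from the text.

```typescript
const MAX_COMMIT_RETRIES = 1;

// Per-agent count of commit retries already launched (assumed in-memory storage).
const commitRetries = new Map<string, number>();

type CleanupAction = "remove" | "resume_for_commit" | "leave_with_warning";

/** Decides the next cleanup step from the list of dirty worktree paths. */
function decideCleanup(agentId: string, dirtyPaths: string[]): CleanupAction {
  if (dirtyPaths.length === 0) {
    commitRetries.delete(agentId); // successful removal clears the counter
    return "remove";
  }
  const used = commitRetries.get(agentId) ?? 0;
  if (used >= MAX_COMMIT_RETRIES) {
    commitRetries.delete(agentId); // max retries exceeded clears the counter
    return "leave_with_warning";
  }
  commitRetries.set(agentId, used + 1); // counter persists across the retry cycle
  return "resume_for_commit";
}
```

With a limit of 1, an agent that is still dirty after its single commit retry is left in place on the second pass.
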
## Log Chunks

Agent output is persisted to `agent_log_chunks` table and drives all live streaming:
- `onRawContent` callback fires for every raw JSONL chunk from `FileTailer`
- DB insert → `agent:output` event emission (single source of truth for UI)
- No FK to agents — survives agent deletion
- Session tracking: spawn=1, resume=previousMax+1
- Read path (`getAgentOutput` tRPC): concatenates all DB chunks (no file fallback)
- Live path (`onAgentOutput` subscription): listens for `agent:output` events
- Frontend: initial query loads from DB, subscription accumulates raw JSONL, both parsed via `parseAgentOutput()`

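The session numbering and read path can be illustrated against an in-memory list of chunks. The `LogChunk` shape and both helper names are hypothetical stand-ins for the `agent_log_chunks` table and the `getAgentOutput` procedure; chunks are assumed to be stored in insertion order.

```typescript
interface LogChunk {
  agentId: string;
  sessionNumber: number;
  content: string; // raw JSONL text
}

/** spawn = 1, resume = previous maximum + 1. */
function nextSessionNumber(chunks: LogChunk[], agentId: string): number {
  const previous = chunks.filter((c) => c.agentId === agentId).map((c) => c.sessionNumber);
  return previous.length === 0 ? 1 : Math.max(...previous) + 1;
}

/** Read path: concatenates every stored chunk for the agent, with no file fallback. */
function readAgentOutput(chunks: LogChunk[], agentId: string): string {
  return chunks
    .filter((c) => c.agentId === agentId)
    .map((c) => c.content)
    .join("");
}
```
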
## Inter-Agent Communication

Agents can communicate with each other via the `conversations` table, coordinated through CLI commands.

### Prompt Integration
The `buildInterAgentCommunication(agentId)` function in `prompts/shared.ts` generates per-agent communication instructions. It is called in `manager.ts` after agent record creation — the actual agent ID is injected directly into the prompt (no manifest.json indirection). The block is appended to the prompt regardless of mode. Instructions include:
1. Set up a background listener via temp-file redirect: `cw listen > $CW_LISTEN_FILE &`
2. Periodically check the temp file for incoming questions between work steps
3. Answer via `cw answer`, clear the file, restart the listener
4. Ask questions to peers via `cw ask --from <agentId> --agent-id|--task-id|--phase-id`
5. Kill the listener and clean up the temp file before writing `signal.json`

### Agent Identity
`manifest.json` includes `agentId` and `agentName` fields. The manager passes these from the DB record after agent creation. The agent ID is also injected directly into the prompt's communication instructions.

### CLI Commands

**`cw listen --agent-id <id>`**
- Subscribes to `onPendingConversation` SSE subscription, prints first pending as JSON, exits with code 0
- First yields any existing pending conversations from DB, then listens for `conversation:created` events
- Output: `{ conversationId, fromAgentId, question, phaseId?, taskId? }`

**`cw ask <question> --from <agentId> --agent-id|--task-id|--phase-id <target>`**
- Creates conversation, subscribes to `onConversationAnswer` SSE, prints answer text to stdout when answered
- Target resolution: `--agent-id` (direct), `--task-id` (find agent running task), `--phase-id` (find agent in phase)

**`cw answer <answer> --conversation-id <id>`**
- Calls `answerConversation`, prints `{ conversationId, status: "answered" }`

## Prompt Architecture

Mode-specific prompts in `prompts/` use XML tags as top-level structural delimiters, with markdown formatting inside tags. This separates first-order instructions from second-order content (task descriptions, examples, templates) per Anthropic best practices. The old `apps/server/agent/prompts.ts` (flat markdown) has been deleted.

### XML Tag Structure

All prompts follow a consistent tag ordering:
1. `<role>` — agent identity and mode
2. `<task>` — dynamic task content (execute mode only)
3. `<input_files>` — file format documentation
4. `<codebase_exploration>` — codebase grounding instructions (architect modes only)
5. `<output_format>` — what to produce, file paths, frontmatter
6. `<id_generation>` — ID creation via `cw id`
7. `<signal_format>` — completion signaling
8. `<session_startup>` — startup verification steps
9. Mode-specific tags (see below)
10. Rules/constraints tags
11. `<progress_tracking>` / `<context_management>`
12. `<definition_of_done>` — completion checklist
13. `<workspace>` — workspace layout (appended by manager)
14. `<inter_agent_communication>` — per-agent CLI instructions (appended by manager)

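The first five positions of this ordering, including the mode-dependent tags, can be sketched as below. The `wrap` and `buildPromptSkeleton` helpers and the placeholder bodies are hypothetical; the real builders in `prompts/` compose shared constants such as `INPUT_FILES` and `CODEBASE_EXPLORATION` instead.

```typescript
type Mode = "execute" | "discuss" | "plan" | "detail" | "refine";

// The four architect modes that receive <codebase_exploration>.
const ARCHITECT_MODES: readonly Mode[] = ["discuss", "plan", "detail", "refine"];

/** Wraps a markdown body in a top-level XML tag. */
const wrap = (tag: string, body: string): string => `<${tag}>\n${body}\n</${tag}>`;

/** Assembles tags 1 to 5 in order; bodies are placeholders, not the real prompt text. */
function buildPromptSkeleton(mode: Mode, task?: string): string {
  const blocks: string[] = [wrap("role", `You are an agent in ${mode} mode.`)];
  if (mode === "execute" && task) blocks.push(wrap("task", task));
  blocks.push(wrap("input_files", "(file format documentation)"));
  if (ARCHITECT_MODES.indexOf(mode) !== -1) {
    blocks.push(wrap("codebase_exploration", "(codebase grounding instructions)"));
  }
  blocks.push(wrap("output_format", "(what to produce)"));
  return blocks.join("\n\n");
}
```
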
### Shared Blocks (`prompts/shared.ts`)

| Constant / Function | XML Tag | Content |
|---------------------|---------|---------|
| `SIGNAL_FORMAT` | `<signal_format>` | Done/questions/error via `.cw/output/signal.json` |
| `INPUT_FILES` | `<input_files>` | Manifest, assignment files, context files |
| `ID_GENERATION` | `<id_generation>` | `cw id` usage for generating entity IDs |
| `TEST_INTEGRITY` | `<test_integrity>` | No self-validating tests, no assertion mutation, no skipping, independent tests, full suite runs |
| `SESSION_STARTUP` | `<session_startup>` | Confirm working directory, check git state, establish green test baseline, read assignment |
| `PROGRESS_TRACKING` | `<progress_tracking>` | Maintain `.cw/output/progress.md` after each commit — survives context compaction |
| `DEVIATION_RULES` | `<deviation_rules>` | Typo→fix, bug→fix if small, missing dep→coordinate, architectural mismatch→STOP |
| `GIT_WORKFLOW` | `<git_workflow>` | Specific file staging (no `git add .`), no force-push, check status first |
| `CODEBASE_EXPLORATION` | `<codebase_exploration>` | Architect-mode codebase grounding: read project docs, explore structure, check existing patterns, use subagents for parallel exploration |
| `CONTEXT_MANAGEMENT` | `<context_management>` | Parallel file reads, cross-reference to progress tracking |
| `buildInterAgentCommunication()` | `<inter_agent_communication>` | Per-agent CLI instructions for `cw listen`, `cw ask`, `cw answer` |

### Mode-Specific Tags

| Mode | File | Mode-Specific Tags |
|------|------|--------------------|
| **execute** | `execute.ts` | `<task>`, `<execution_protocol>`, `<anti_patterns>`, `<scope_rules>` |
| **plan** | `plan.ts` | `<phase_design>`, `<dependencies>`, `<file_ownership>`, `<specificity>`, `<existing_context>` |
| **detail** | `detail.ts` | `<task_body_requirements>`, `<file_ownership>`, `<task_sizing>`, `<checkpoint_tasks>`, `<existing_context>` |
| **discuss** | `discuss.ts` | `<analysis_method>`, `<question_quality>`, `<decision_quality>`, `<question_categories>`, `<rules>` |
| **refine** | `refine.ts` | `<improvement_priorities>`, `<rules>` |

Examples within mode-specific tags use `<examples>` > `<example label="good">` / `<example label="bad">` nesting.

### Execute Prompt Dispatch

`buildExecutePrompt(taskDescription?)` accepts an optional task description wrapped in a `<task>` tag. The dispatch manager (`apps/server/dispatch/manager.ts`) wraps `task.description || task.name` in `buildExecutePrompt()` so execute agents receive full system context alongside their task. The `<workspace>` and `<inter_agent_communication>` blocks are appended by the agent manager at spawn time.

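The dispatch behaviour described here might be sketched as follows. The `Task` shape, the `promptForTask` helper, and the placeholder role text are assumptions; only the optional `taskDescription` parameter, the `<task>` wrapper, and the `task.description || task.name` fallback come from the paragraph above.

```typescript
interface Task {
  name: string;
  description?: string | null;
}

/** Builds a reduced execute prompt; the real one includes many more shared blocks. */
function buildExecutePrompt(taskDescription?: string): string {
  const sections = ["<role>\n(execute-mode role text)\n</role>"];
  if (taskDescription) {
    sections.push(`<task>\n${taskDescription}\n</task>`);
  }
  return sections.join("\n\n");
}

/** Mirrors the dispatch manager's fallback: description first, then name. */
function promptForTask(task: Task): string {
  return buildExecutePrompt(task.description || task.name);
}
```

Because the fallback uses `||` rather than `??`, an empty-string description also falls through to the task name.
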