refactor: Add context management to plan/detail prompts, update docs

Add CONTEXT_MANAGEMENT shared block to plan and detail mode prompts so architect agents also benefit from compaction awareness and parallel execution hints. Update index.ts re-exports and agent docs.
2026-02-18 16:43:19 +09:00
parent 4ef9db1501
commit 7354582d69
4 changed files with 130 additions and 6 deletions
--- a/src/agent/prompts/detail.ts
+++ b/src/agent/prompts/detail.ts
@@ -2,7 +2,7 @@
 * Detail mode prompt — break a phase into executable tasks.
 */

-import { ID_GENERATION, INPUT_FILES, SIGNAL_FORMAT } from './shared.js';
+import { CONTEXT_MANAGEMENT, ID_GENERATION, INPUT_FILES, SIGNAL_FORMAT } from './shared.js';

 export function buildDetailPrompt(): string {
  return `You are an Architect agent in the Codewalk multi-agent system operating in DETAIL mode.
@@ -24,6 +24,61 @@ Write one file per task to \`.cw/output/tasks/{id}.md\`:

 ${ID_GENERATION}

+## Specificity Test
+
+Before finalizing each task, ask: **"Could a worker agent execute this without clarifying questions?"**
+
+Every task body MUST include:
+1. **What to create or modify** — if possible, specific file paths (e.g., \`src/db/schema.ts\`, \`src/api/routes/users.ts\`)
+2. **Expected behavior** — what the code should do, with concrete examples or edge cases
+3. **How to verify** — specific test to run, endpoint to hit, or behavior to check
+
+**Bad task:**
+\`\`\`
+Title: Add user validation
+Body: Add validation to the user model. Make sure all fields are validated properly.
+\`\`\`
+
+**Good task:**
+\`\`\`
+Title: Add Zod validation schema for user creation
+Body: Create src/api/validators/user.ts with a Zod schema for CreateUserInput:
+- email: valid email format, lowercase, max 255 chars
+- name: string, 1-100 chars, trimmed
+- password: min 8 chars, must contain uppercase + number
+Export the schema and inferred type. Add unit tests in src/api/validators/user.test.ts
+covering: valid input, missing fields, invalid email, short password.
+Verify: npm test -- src/api/validators/user.test.ts
+\`\`\`
+
+## File Ownership Constraints
+
+Tasks that can run in parallel MUST NOT modify the same files. Include a file list in each task body:
+
+\`\`\`
+Files modified:
+- src/db/schema/users.ts (create)
+- src/db/migrations/001_users.sql (create)
+\`\`\`
+
+If two tasks need to modify the same file, or one task needs functionality that another task creates or modifies, make one depend on the other.
+
+## Task Sizing
+
+- **1-5 files**: Good task size
+- **7+ files**: Too big — split into smaller tasks
+- **1-sentence description**: Too small — merge with related work or add more detail
+- **500+ words**: Probably overspecified — simplify or split
+
+## Checkpoint Tasks
+
+Use checkpoint types for work that requires human judgment:
+- \`checkpoint:human-verify\`: Visual changes, migration results, API contract changes
+- \`checkpoint:decision\`: Architecture choices that affect multiple phases
+- \`checkpoint:human-action\`: External setup (DNS, credentials, third-party config)
+
+~90% of tasks should be \`auto\`. Don't over-checkpoint.
+
 ## Task Design Rules
 - Each task: specific, actionable, completable by one agent
 - Ideally tasks shall be executable in parallel - if they depend on each other, use dependencies to indicate order
@@ -36,5 +91,5 @@ ${ID_GENERATION}
 - If a task in context/tasks/ already covers the same work (even under a different name), do NOT create a duplicate
 - Pages contain requirements — use them to create detailed task descriptions
 - DO NOT create tasks that overlap with existing tasks in other phases
-`;
+${CONTEXT_MANAGEMENT}`;
 }
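
The File Ownership Constraints rule added to the detail prompt above can also be checked mechanically once the task files are written. The following is a minimal sketch under stated assumptions, not part of this commit: the `TaskSpec` shape, the `dependsOn` field, and the `findOwnershipConflicts` helper are hypothetical names, since the real Codewalk task format is not shown in this diff.

```typescript
// Hypothetical task shape (assumption): the real task format is not shown in this diff.
interface TaskSpec {
  id: string;
  files: string[];     // paths from the task's "Files modified" list
  dependsOn: string[]; // ids of tasks that must finish first
}

interface OwnershipConflict {
  a: string;
  b: string;
  files: string[];
}

type TaskIndex = Record<string, TaskSpec | undefined>;

// Collect every task id reachable from `start` by following dependsOn edges.
function ancestors(start: string, byId: TaskIndex): string[] {
  const seen: string[] = [];
  const stack = [...(byId[start]?.dependsOn ?? [])];
  while (stack.length > 0) {
    const id = stack.pop() as string;
    if (seen.indexOf(id) !== -1) continue;
    seen.push(id);
    stack.push(...(byId[id]?.dependsOn ?? []));
  }
  return seen;
}

// Report pairs of tasks that share a file but are not ordered by a
// direct or transitive dependency, i.e. pairs that could run in parallel.
function findOwnershipConflicts(tasks: TaskSpec[]): OwnershipConflict[] {
  const byId: TaskIndex = Object.create(null);
  for (const t of tasks) byId[t.id] = t;

  const conflicts: OwnershipConflict[] = [];
  for (let i = 0; i < tasks.length; i++) {
    for (let j = i + 1; j < tasks.length; j++) {
      const a = tasks[i];
      const b = tasks[j];
      const shared = a.files.filter((f) => b.files.indexOf(f) !== -1);
      if (shared.length === 0) continue;
      const ordered =
        ancestors(a.id, byId).indexOf(b.id) !== -1 ||
        ancestors(b.id, byId).indexOf(a.id) !== -1;
      if (!ordered) conflicts.push({ a: a.id, b: b.id, files: shared });
    }
  }
  return conflicts;
}

// Usage: two independent tasks that both touch the same schema file.
const conflicts = findOwnershipConflicts([
  { id: 'task-a', files: ['src/db/schema/users.ts'], dependsOn: [] },
  { id: 'task-b', files: ['src/db/schema/users.ts'], dependsOn: [] },
]);
console.log(conflicts);
```

In this usage the two tasks are reported as one conflict; adding `'task-a'` to the `dependsOn` list of `task-b` would clear it, which mirrors the "make one depend on the other" instruction in the prompt.
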
--- a/src/agent/prompts/index.ts
+++ b/src/agent/prompts/index.ts
@@ -5,7 +5,7 @@
 * input files, ID generation) are in shared.ts.
 */

-export { SIGNAL_FORMAT, INPUT_FILES, ID_GENERATION, buildInterAgentCommunication } from './shared.js';
+export { SIGNAL_FORMAT, INPUT_FILES, ID_GENERATION, CODEBASE_VERIFICATION, CONTEXT_MANAGEMENT, DEVIATION_RULES, GIT_WORKFLOW, buildInterAgentCommunication } from './shared.js';
 export { buildExecutePrompt } from './execute.js';
 export { buildDiscussPrompt } from './discuss.js';
 export { buildPlanPrompt } from './plan.js';
--- a/src/agent/prompts/plan.ts
+++ b/src/agent/prompts/plan.ts
@@ -2,7 +2,7 @@
 * Plan mode prompt — plan initiative into phases.
 */

-import { ID_GENERATION, INPUT_FILES, SIGNAL_FORMAT } from './shared.js';
+import { CONTEXT_MANAGEMENT, ID_GENERATION, INPUT_FILES, SIGNAL_FORMAT } from './shared.js';

 export function buildPlanPrompt(): string {
  return `You are an Architect agent in the Codewalk multi-agent system operating in PLAN mode.
@@ -26,6 +26,43 @@ ${ID_GENERATION}
 - Size: 2-5 tasks each (not too big, not too small) - if the work is independent enough and the tasks are very similar you can also create more tasks for the phase
 - Clear, action-oriented names (describe what gets built, not how)

+## Dependency Graph
+
+Every plan MUST include an explicit dependency graph in the output frontmatter. For each phase, list:
+- The phases it depends on (by phase ID)
+
+Think in waves: Wave 1 = no dependencies (foundation). Wave 2 = depends only on Wave 1. And so on.
+
+**Good example:**
+\`\`\`
+Wave 1 (parallel): "Database schema", "API skeleton"
+Wave 2 (parallel): "User endpoints" (depends: API skeleton, Database schema), "Auth middleware" (depends: API skeleton)
+Wave 3: "Integration tests" (depends: User endpoints, Auth middleware)
+\`\`\`
+
+**Bad example:**
+\`\`\`
+Phase 1 → Phase 2 → Phase 3 → Phase 4 (fully serial, no parallelism)
+\`\`\`
+
+If your plan is fully serial, reconsider. Most real work has independent tracks.
+
+## File Ownership for Parallelism
+
+Phases that run in parallel MUST NOT modify the same files. If two phases need to change the same file, they must be sequential (one depends on the other).
+
+**Bad**: Phase A "Add user model" and Phase B "Add product model" both modify \`schema.ts\` and \`index.ts\`
+**Good**: Phase A "Add user model" creates \`user-schema.ts\`, Phase B "Add product model" creates \`product-schema.ts\`, Phase C "Wire models into index" depends on both
+
+## Specificity Test
+
+Before finalizing each phase description, ask: **"Could a detail agent break this into tasks without clarifying questions?"**
+
+**Bad**: "Set up the backend" — what backend? What framework? What endpoints?
+**Good**: "Create Express API server with health check endpoint at /api/health, CORS configured for localhost:3000, and error handling middleware that returns JSON error responses"
+
+Reference specific files and directories from the codebase when possible.
+
 ## Existing Context
 - Read context files to see what phases and tasks already exist
 - If phases/tasks already exist, account for them — don't plan work that's already covered
@@ -36,5 +73,6 @@ ${ID_GENERATION}
 - Start with foundation/infrastructure phases
 - Group related work together
 - Make dependencies explicit using phase IDs
-- Each task should be completable in one session`;
+- Each task should be completable in one session
+${CONTEXT_MANAGEMENT}`;
 }
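
The "think in waves" guidance added to the plan prompt corresponds to layering a dependency graph: a phase lands in the first wave in which all of its dependencies have already been placed. Below is a minimal sketch of that layering, assuming a hypothetical `PhaseSpec` shape with a `dependsOn` list; `computeWaves` and `isFullySerial` are illustrative names and not part of this commit. Phase names stand in for phase IDs here for readability.

```typescript
// Hypothetical phase shape (assumption): the real frontmatter format is not shown in this diff.
interface PhaseSpec {
  id: string;
  dependsOn: string[];
}

// Group phases into waves: wave 1 has no dependencies, and each later wave
// depends only on phases placed in earlier waves.
function computeWaves(phases: PhaseSpec[]): string[][] {
  const known = phases.map((p) => p.id);
  for (const p of phases) {
    for (const dep of p.dependsOn) {
      if (known.indexOf(dep) === -1) {
        throw new Error(`Phase "${p.id}" depends on unknown phase "${dep}"`);
      }
    }
  }

  const placed: string[] = [];
  const waves: string[][] = [];
  let remaining = phases.slice();
  while (remaining.length > 0) {
    // A phase is ready once every dependency sits in an earlier wave.
    const ready = remaining.filter((p) =>
      p.dependsOn.every((d) => placed.indexOf(d) !== -1),
    );
    if (ready.length === 0) {
      const stuck = remaining.map((p) => p.id).join(', ');
      throw new Error(`Dependency cycle among: ${stuck}`);
    }
    const wave = ready.map((p) => p.id);
    waves.push(wave);
    placed.push(...wave);
    remaining = remaining.filter((p) => wave.indexOf(p.id) === -1);
  }
  return waves;
}

// A plan with more than one phase and exactly one phase per wave has no parallelism.
function isFullySerial(waves: string[][]): boolean {
  return waves.length > 1 && waves.every((w) => w.length === 1);
}

// Usage: the "good example" from the plan prompt.
const waves = computeWaves([
  { id: 'Database schema', dependsOn: [] },
  { id: 'API skeleton', dependsOn: [] },
  { id: 'User endpoints', dependsOn: ['API skeleton', 'Database schema'] },
  { id: 'Auth middleware', dependsOn: ['API skeleton'] },
  { id: 'Integration tests', dependsOn: ['User endpoints', 'Auth middleware'] },
]);
console.log(waves, isFullySerial(waves));
```

For this input the phases fall into three waves matching the good example in the prompt, and `isFullySerial` returns `false`. A chain such as Phase 1 → Phase 2 → Phase 3 would yield one phase per wave and be flagged as fully serial, the case the prompt asks the architect to reconsider.
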