# Context Engineering

Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.

## Quality Degradation Curve

Claude's output quality follows a predictable curve based on context utilization:

| Context Usage | Quality Level | Behavior |
|---------------|---------------|----------|
| 0-30% | **PEAK** | Thorough, comprehensive, considers edge cases |
| 30-50% | **GOOD** | Confident, solid work, reliable output |
| 50-70% | **DEGRADING** | Efficiency mode begins, shortcuts appear |
| 70%+ | **POOR** | Rushed, minimal, misses requirements |

**Rule: Stay UNDER 50% context for quality work.**

---

## Orchestrator Pattern

Codewalk uses thin orchestration with heavy subagent work:

```
┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
     │   Worker    │    │  Architect  │    │  Verifier   │
     │  (200k ctx) │    │  (200k ctx) │    │  (200k ctx) │
     │  Fresh per  │    │  Fresh per  │    │  Fresh per  │
     │    task     │    │ initiative  │    │    phase    │
     └─────────────┘    └─────────────┘    └─────────────┘
```

**Key insight:** Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.

---

## Context Budgets by Role

### Orchestrator

- **Target:** 30-40% max
- **Strategy:** Route, don't process. Collect results, don't analyze.
- **Reset trigger:** Context exceeds 50%

### Worker

- **Target:** 50% per task
- **Strategy:** Single task per context. Fresh context for each task.
- **Reset trigger:** Task completion (always)

### Architect

- **Target:** 60% per initiative analysis
- **Strategy:** Initiative discussion + planning in single context
- **Reset trigger:** Work plan generated or context exceeds 70%

### Verifier

- **Target:** 40% per phase verification
- **Strategy:** Goal-backward verification, gap identification
- **Reset trigger:** Verification complete

---

## Task Sizing Rules

Tasks are sized to fit context budgets:

| Task Complexity | Context Estimate | Example |
|-----------------|------------------|---------|
| Simple | 10-20% | Add a field to an existing form |
| Medium | 20-35% | Create new API endpoint with validation |
| Complex | 35-50% | Implement auth flow with refresh tokens |
| Too Large | >50% | **SPLIT INTO SUBTASKS** |

**Planning rule:** No single task should require >50% context. If estimation suggests otherwise, decompose before execution.

---

## Plan Sizing

Plans group 2-3 related tasks for sequential execution:

| Plan Size | Target Context | Notes |
|-----------|----------------|-------|
| Minimal (1 task) | 20-30% | Simple independent work |
| Standard (2-3 tasks) | 40-50% | Related work, shared context |
| Maximum | 50% | Never exceed; quality degrades beyond this |

**Why 2-3 tasks?** Related tasks share context, which reduces overhead (file reads, codebase understanding). Beyond 3 tasks, the plan outgrows the 50% budget and the quality benefits are lost.

---

## Wave-Based Parallelization

Compute the dependency graph and assign tasks to waves:

```
Wave 0: Tasks with no dependencies (run in parallel)
  ↓
Wave 1: Tasks depending only on Wave 0 (run in parallel)
  ↓
Wave 2: Tasks depending only on Waves 0-1 (run in parallel)
  ↓
...continue until all tasks assigned
```

**Benefits:**

- Maximum parallelization
- Clear progress tracking
- Natural checkpoints between waves

### Computation Algorithm

```
1. Build dependency graph from task dependencies
2. Find all tasks with no unresolved dependencies → Wave 0
3. Mark Wave 0 as "resolved"
4. Find all tasks whose dependencies are all resolved → Wave 1
5. Repeat until all tasks assigned
```

---

## Context Handoff

When context fills, perform a controlled handoff:

### STATE.md Update

Before handoff, update session state:

```yaml
position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1

decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"

blockers:
  - "Waiting for user to configure OAuth credentials"

next_action: "Continue with task after blocker resolved"
```

### Handoff Content

New session receives:

- STATE.md (current position)
- Relevant SUMMARY.md files (prior work in this phase)
- Current PLAN.md (if executing)
- Task context from initiative

---

## Anti-Patterns

### Context Stuffing

**Wrong:** Loading entire codebase at session start

**Right:** Load files on-demand as tasks require them

### Orchestrator Processing

**Wrong:** Orchestrator reads all code and makes decisions

**Right:** Orchestrator routes to specialized agents who do the work

### Plan Bloat

**Wrong:** 10-task plans to "reduce coordination overhead"

**Right:** 2-3 task plans that fit in 50% context

### No Handoff State

**Wrong:** Agent restarts with no memory of prior work

**Right:** STATE.md preserves position, decisions, blockers

---

## Monitoring

Track context utilization across the system:

| Metric | Threshold | Action |
|--------|-----------|--------|
| Orchestrator context | >50% | Trigger handoff |
| Worker task context | >60% | Flag task as oversized |
| Plan total estimate | >50% | Split plan before execution |
| Average task context | >40% | Review decomposition strategy |

---

## Implementation Notes

### Context Estimation

Estimate context usage before execution:

- File reads: ~1-2% per file (varies by size)
- Code changes: ~0.5% per change
- Tool outputs: ~1% per tool call
- Discussion: ~2-5% per exchange

### Fresh Context Triggers

- Worker: Always fresh per task
- Architect: Fresh per initiative
- Verifier: Fresh per phase
- Orchestrator: Handoff at 50%

### Subagent Spawning

When spawning subagents:

1. Provide focused context (only what's needed)
2. Clear instructions (specific task, expected output)
3. Collect structured results
4. Update state with outcomes
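
### Wave Assignment Sketch

The five-step computation algorithm from the Wave-Based Parallelization section could be implemented roughly as follows. This is a minimal sketch, not Codewalk's actual code: the function name `assign_waves`, the input shape (a dict mapping each task name to the set of task names it depends on), and the example task names are all assumptions made for illustration. The cycle and unknown-dependency checks are additions that the original steps do not mention.

```python
def assign_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Group tasks into waves; wave N holds tasks whose deps are all in earlier waves.

    `dependencies` maps a task name to the set of task names it depends on
    (hypothetical input shape, chosen for this sketch).
    """
    # Fail early on dependencies that are not themselves listed as tasks.
    unknown = {d for deps in dependencies.values() for d in deps} - set(dependencies)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    resolved: set[str] = set()
    remaining = dict(dependencies)
    waves: list[list[str]] = []

    while remaining:
        # Steps 2 and 4: a task is ready once all of its dependencies are resolved.
        ready = sorted(t for t, deps in remaining.items() if deps <= resolved)
        if not ready:
            # Nothing can make progress, so the remaining tasks form a cycle.
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        # Step 3: mark the whole wave as resolved before looking for the next one.
        resolved.update(ready)
        for task in ready:
            del remaining[task]

    return waves


# Example plan with made-up task names.
plan = {
    "schema": set(),
    "config": set(),
    "api": {"schema"},
    "auth": {"schema", "config"},
    "ui": {"api", "auth"},
}
waves = assign_waves(plan)
for index, wave in enumerate(waves):
    print(f"Wave {index}: {', '.join(wave)}")
```

Tasks within a wave are sorted only to make the output deterministic; they have no ordering constraint among themselves and can run in parallel.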