Context Engineering
Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.
Quality Degradation Curve
Claude's output quality follows a predictable curve based on context utilization:
| Context Usage | Quality Level | Behavior |
|---|---|---|
| 0-30% | PEAK | Thorough, comprehensive, considers edge cases |
| 30-50% | GOOD | Confident, solid work, reliable output |
| 50-70% | DEGRADING | Efficiency mode begins, shortcuts appear |
| 70%+ | POOR | Rushed, minimal, misses requirements |
Rule: Stay UNDER 50% context for quality work.
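The bands in the table can be expressed as a small lookup. This is a minimal sketch, not part of Codewalk itself: the function name `quality_level` is hypothetical, and the handling of boundary values (exactly 30%, 50%, or 70% falls into the lower-quality band) is an assumption, since the table leaves the boundaries ambiguous.

```python
def quality_level(context_used: float) -> str:
    """Map a context-utilization fraction (0.0-1.0) to a quality band.

    Assumption: boundary values belong to the lower-quality band, so
    exactly 0.50 counts as DEGRADING, matching "stay UNDER 50%".
    """
    if not 0.0 <= context_used <= 1.0:
        raise ValueError("context_used must be between 0.0 and 1.0")
    if context_used < 0.30:
        return "PEAK"
    if context_used < 0.50:
        return "GOOD"
    if context_used < 0.70:
        return "DEGRADING"
    return "POOR"
```

An agent at 45% utilization is still in the GOOD band; at 50% it has already crossed into DEGRADING.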
Orchestrator Pattern
Codewalk uses thin orchestration with heavy subagent work:
┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
     │   Worker    │    │  Architect  │    │  Verifier   │
     │ (200k ctx)  │    │ (200k ctx)  │    │ (200k ctx)  │
     │ Fresh per   │    │ Fresh per   │    │ Fresh per   │
     │ task        │    │ initiative  │    │ phase       │
     └─────────────┘    └─────────────┘    └─────────────┘
Key insight: Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.
Context Budgets by Role
Orchestrator
- Target: 30-40% max
- Strategy: Route, don't process. Collect results, don't analyze.
- Reset trigger: Context exceeds 50%
Worker
- Target: 50% per task
- Strategy: Single task per context. Fresh context for each task.
- Reset trigger: Task completion (always)
Architect
- Target: 60% per initiative analysis
- Strategy: Initiative discussion + planning in single context
- Reset trigger: Work plan generated or context exceeds 70%
Verifier
- Target: 40% per phase verification
- Strategy: Goal-backward verification, gap identification
- Reset trigger: Verification complete
Task Sizing Rules
Tasks are sized to fit context budgets:
| Task Complexity | Context Estimate | Example |
|---|---|---|
| Simple | 10-20% | Add a field to an existing form |
| Medium | 20-35% | Create new API endpoint with validation |
| Complex | 35-50% | Implement auth flow with refresh tokens |
| Too Large | >50% | SPLIT INTO SUBTASKS |
Planning rule: No single task should require >50% context. If estimation suggests otherwise, decompose before execution.
Plan Sizing
Plans typically group 2-3 related tasks for sequential execution; a single independent task can also form a plan on its own:
| Plan Size | Target Context | Notes |
|---|---|---|
| Minimal (1 task) | 20-30% | Simple independent work |
| Standard (2-3 tasks) | 40-50% | Related work, shared context |
| Maximum | 50% | Never exceed—quality degrades |
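Why 2-3 tasks? Related tasks share context (file reads, understanding of the surrounding code), which reduces overhead. More than 3 tasks pushes a plan past the 50% budget, where quality degrades and the benefit of sharing is lost.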
Wave-Based Parallelization
Compute dependency graph and assign tasks to waves:
Wave 0: Tasks with no dependencies (run in parallel)
↓
Wave 1: Tasks depending only on Wave 0 (run in parallel)
↓
Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
↓
...continue until all tasks assigned
Benefits:
- Maximum parallelization
- Clear progress tracking
- Natural checkpoints between waves
Computation Algorithm
1. Build dependency graph from task dependencies
2. Find all tasks with no unresolved dependencies → Wave 0
3. Mark Wave 0 as "resolved"
4. Find all tasks whose dependencies are all resolved → Wave 1
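5. Mark that wave as resolved and repeat, assigning each new set of ready tasks to the next wave, until all tasks are assigned. If no task is ready while some remain unassigned, the graph contains a cycle and must be fixed before execution.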
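The steps above can be sketched as follows. This is an illustrative implementation under stated assumptions, not Codewalk's actual code: the function name `compute_waves` and the input shape (a mapping from task ID to the set of task IDs it depends on) are hypothetical. Cycle detection is an addition the steps do not mention explicitly, but without it the loop would never terminate on a cyclic graph.

```python
from collections.abc import Mapping


def compute_waves(dependencies: Mapping[str, set[str]]) -> list[list[str]]:
    """Assign each task to the earliest wave in which all its dependencies are resolved.

    `dependencies` maps a task ID to the set of task IDs it depends on
    (hypothetical input shape, chosen for this sketch).
    """
    known = set(dependencies)
    unknown = {dep for deps in dependencies.values() for dep in deps} - known
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    resolved: set[str] = set()
    remaining = set(known)
    waves: list[list[str]] = []
    while remaining:
        # A task is ready once every one of its dependencies is resolved.
        ready = sorted(t for t in remaining if dependencies[t] <= resolved)
        if not ready:
            # Nothing can make progress: the remaining tasks form a cycle.
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        resolved.update(ready)
        remaining.difference_update(ready)
    return waves


tasks = {
    "schema": set(),
    "config": set(),
    "api": {"schema"},
    "ui": {"api", "config"},
}
waves = compute_waves(tasks)
# waves == [["config", "schema"], ["api"], ["ui"]]
```

Tasks within a wave are sorted only to make the output deterministic; they have no ordering constraint among themselves and can run in parallel.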
Context Handoff
When context fills, perform controlled handoff:
STATE.md Update
Before handoff, update session state:
position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1
decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"
blockers:
  - "Waiting for user to configure OAuth credentials"
next_action: "Continue with task after blocker resolved"
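The state fields above could be modeled and serialized as in the sketch below. The class names `Position` and `HandoffState` and the `to_yaml` method are hypothetical; only the field names come from the example. To stay dependency-free, the sketch writes the YAML by hand and uses `json.dumps` purely to quote and escape strings, relying on JSON strings being acceptable as YAML double-quoted scalars.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Position:
    phase: int
    plan: int
    task: str
    wave: int


@dataclass
class HandoffState:
    position: Position
    decisions: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)
    next_action: str = ""

    def to_yaml(self) -> str:
        """Render the state in the same layout as the STATE.md example."""
        quote = json.dumps  # used only to double-quote and escape strings
        lines = [
            "position:",
            f"  phase: {self.position.phase}",
            f"  plan: {self.position.plan}",
            f"  task: {quote(self.position.task)}",
            f"  wave: {self.position.wave}",
        ]
        for key, items in (("decisions", self.decisions), ("blockers", self.blockers)):
            if items:
                lines.append(f"{key}:")
                lines.extend(f"  - {quote(item)}" for item in items)
            else:
                lines.append(f"{key}: []")
        lines.append(f"next_action: {quote(self.next_action)}")
        return "\n".join(lines) + "\n"


state = HandoffState(
    position=Position(phase=2, plan=3, task="Implement refresh token rotation", wave=1),
    decisions=["Using jose library for JWT (not jsonwebtoken)"],
    next_action="Continue with task after blocker resolved",
)
state_text = state.to_yaml()
```

Keeping the state in a typed structure makes it harder to hand off a session with a missing position or an unrecorded blocker.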
Handoff Content
New session receives:
- STATE.md (current position)
- Relevant SUMMARY.md files (prior work in this phase)
- Current PLAN.md (if executing)
- Task context from initiative
Anti-Patterns
Context Stuffing
Wrong: Loading entire codebase at session start
Right: Load files on-demand as tasks require them
Orchestrator Processing
Wrong: Orchestrator reads all code and makes decisions
Right: Orchestrator routes to specialized agents who do the work
Plan Bloat
Wrong: 10-task plans to "reduce coordination overhead"
Right: 2-3 task plans that fit in 50% context
No Handoff State
Wrong: Agent restarts with no memory of prior work
Right: STATE.md preserves position, decisions, blockers
Monitoring
Track context utilization across the system:
| Metric | Threshold | Action |
|---|---|---|
| Orchestrator context | >50% | Trigger handoff |
| Worker task context | >60% | Flag task as oversized |
| Plan total estimate | >50% | Split plan before execution |
| Average task context | >40% | Review decomposition strategy |
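One way to apply the table is to keep the thresholds as data and check reported metrics against them. The sketch below is illustrative only: the metric keys (`orchestrator_context` and so on), the `Threshold` class, and `required_actions` are hypothetical names, and metrics are assumed to be reported as fractions of the context window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    metric: str
    limit: float  # fraction of the context window
    action: str


# Limits and actions are copied from the monitoring table; the metric keys are assumed.
THRESHOLDS = (
    Threshold("orchestrator_context", 0.50, "Trigger handoff"),
    Threshold("worker_task_context", 0.60, "Flag task as oversized"),
    Threshold("plan_total_estimate", 0.50, "Split plan before execution"),
    Threshold("average_task_context", 0.40, "Review decomposition strategy"),
)


def required_actions(metrics: dict[str, float]) -> list[str]:
    """Return the action for every metric that strictly exceeds its threshold.

    Metrics that are not reported are treated as 0.0 and never trigger.
    """
    return [t.action for t in THRESHOLDS if metrics.get(t.metric, 0.0) > t.limit]


actions = required_actions({"orchestrator_context": 0.55, "worker_task_context": 0.60})
# actions == ["Trigger handoff"]; the worker sits exactly at 60%, which does not exceed it
```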
Implementation Notes
Context Estimation
Estimate context usage before execution:
- File reads: ~1-2% per file (varies by size)
- Code changes: ~0.5% per change
- Tool outputs: ~1% per tool call
- Discussion: ~2-5% per exchange
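These per-item costs can be combined into a rough pre-execution estimate. The sketch below is a back-of-the-envelope calculator, not a measurement tool: `TaskWorkload`, `estimate_context`, and `needs_split` are hypothetical names, and where the list gives a range the upper bound is used so that estimates err on the high side (an assumption).

```python
from dataclasses import dataclass

# Costs as fractions of the context window, from the list above.
# Assumption: ranges are resolved to their upper bound.
FILE_READ_COST = 0.02     # ~1-2% per file
CODE_CHANGE_COST = 0.005  # ~0.5% per change
TOOL_CALL_COST = 0.01     # ~1% per tool call
EXCHANGE_COST = 0.05      # ~2-5% per exchange
TASK_BUDGET = 0.50        # no single task should require more than 50%


@dataclass
class TaskWorkload:
    file_reads: int = 0
    code_changes: int = 0
    tool_calls: int = 0
    exchanges: int = 0


def estimate_context(work: TaskWorkload) -> float:
    """Estimate the fraction of the context window a task will consume."""
    return (
        work.file_reads * FILE_READ_COST
        + work.code_changes * CODE_CHANGE_COST
        + work.tool_calls * TOOL_CALL_COST
        + work.exchanges * EXCHANGE_COST
    )


def needs_split(work: TaskWorkload) -> bool:
    """True when the estimate exceeds the 50% per-task budget."""
    return estimate_context(work) > TASK_BUDGET


medium = TaskWorkload(file_reads=8, code_changes=10, tool_calls=12, exchanges=2)
large = TaskWorkload(file_reads=15, code_changes=10, tool_calls=12, exchanges=2)
```

Here `medium` comes to roughly 43% (0.16 + 0.05 + 0.12 + 0.10) and fits the budget, while `large` comes to roughly 57% and should be decomposed before execution.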
Fresh Context Triggers
- Worker: Always fresh per task
- Architect: Fresh per initiative
- Verifier: Fresh per phase
- Orchestrator: Handoff at 50%
Subagent Spawning
When spawning subagents:
- Provide focused context (only what's needed)
- Clear instructions (specific task, expected output)
- Collect structured results
- Update state with outcomes