Codewalkers/docs/archive/context-engineering.md
Lukas May 342b490fe7 feat: Task decomposition for Tailwind/Radix/shadcn foundation setup
2026-02-10 09:48:51 +01:00

# Context Engineering
Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.
## Quality Degradation Curve
Claude's output quality follows a predictable curve based on context utilization:
| Context Usage | Quality Level | Behavior |
|---------------|---------------|----------|
| 0-30% | **PEAK** | Thorough, comprehensive, considers edge cases |
| 30-50% | **GOOD** | Confident, solid work, reliable output |
| 50-70% | **DEGRADING** | Efficiency mode begins, shortcuts appear |
| 70%+ | **POOR** | Rushed, minimal, misses requirements |

**Rule: Stay UNDER 50% context for quality work.**

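
The bands in the table can be expressed as a small lookup. This is a minimal sketch, not part of Codewalk: the function names are hypothetical, and assigning each boundary value (30%, 50%, 70%) to the lower-quality band is an assumption, since the table leaves the boundaries ambiguous.

```python
def quality_level(context_usage: float) -> str:
    """Map a context-utilization fraction (0.0-1.0) to a quality band.

    Assumption: boundary values fall into the lower-quality band,
    so exactly 0.50 already counts as DEGRADING.
    """
    if not 0.0 <= context_usage <= 1.0:
        raise ValueError("context_usage must be between 0.0 and 1.0")
    if context_usage < 0.30:
        return "PEAK"
    if context_usage < 0.50:
        return "GOOD"
    if context_usage < 0.70:
        return "DEGRADING"
    return "POOR"


def within_quality_budget(context_usage: float) -> bool:
    """Apply the rule: stay under 50% context for quality work."""
    return context_usage < 0.50
```

For example, `quality_level(0.42)` returns `"GOOD"` and `within_quality_budget(0.55)` returns `False`.
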
---
## Orchestrator Pattern
Codewalk uses thin orchestration with heavy subagent work:
```
┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
     │   Worker    │    │  Architect  │    │  Verifier   │
     │ (200k ctx)  │    │ (200k ctx)  │    │ (200k ctx)  │
     │ Fresh per   │    │ Fresh per   │    │ Fresh per   │
     │ task        │    │ initiative  │    │ phase       │
     └─────────────┘    └─────────────┘    └─────────────┘
```
**Key insight:** Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.

---
## Context Budgets by Role
### Orchestrator
- **Target:** 30-40% max
- **Strategy:** Route, don't process. Collect results, don't analyze.
- **Reset trigger:** Context exceeds 50%
### Worker
- **Target:** 50% per task
- **Strategy:** Single task per context. Fresh context for each task.
- **Reset trigger:** Task completion (always)
### Architect
- **Target:** 60% per initiative analysis
- **Strategy:** Initiative discussion + planning in single context
- **Reset trigger:** Work plan generated or context exceeds 70%
### Verifier
- **Target:** 40% per phase verification
- **Strategy:** Goal-backward verification, gap identification
- **Reset trigger:** Verification complete
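
The per-role budgets and reset triggers above can be captured as data plus one check. This is a sketch under stated assumptions: `RoleBudget`, `BUDGETS`, `should_reset`, and the event names are hypothetical, not Codewalk identifiers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleBudget:
    target: float                 # upper end of the target context share
    reset_above: Optional[float]  # usage level that forces a reset, if any
    reset_event: Optional[str]    # event that forces a reset, if any


# Values transcribed from the role sections; event names are made up.
BUDGETS = {
    "orchestrator": RoleBudget(0.40, 0.50, None),
    "worker": RoleBudget(0.50, None, "task_complete"),
    "architect": RoleBudget(0.60, 0.70, "work_plan_generated"),
    "verifier": RoleBudget(0.40, None, "verification_complete"),
}


def should_reset(role: str, context_usage: float, event: Optional[str] = None) -> bool:
    """Return True when the role's reset trigger has fired."""
    budget = BUDGETS[role]
    if budget.reset_above is not None and context_usage > budget.reset_above:
        return True
    return event is not None and event == budget.reset_event
```

A worker resets on every task completion regardless of usage, while the orchestrator resets only on the 50% threshold.
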
---
## Task Sizing Rules
Tasks are sized to fit context budgets:
| Task Complexity | Context Estimate | Example |
|-----------------|------------------|---------|
| Simple | 10-20% | Add a field to an existing form |
| Medium | 20-35% | Create new API endpoint with validation |
| Complex | 35-50% | Implement auth flow with refresh tokens |
| Too Large | >50% | **SPLIT INTO SUBTASKS** |

**Planning rule:** No single task should require >50% context. If estimation suggests otherwise, decompose before execution.

---
## Plan Sizing
Plans group up to three related tasks for sequential execution; 2-3 tasks is the standard size:
| Plan Size | Target Context | Notes |
|-----------|----------------|-------|
| Minimal (1 task) | 20-30% | Simple independent work |
| Standard (2-3 tasks) | 40-50% | Related work, shared context |
| Maximum | 50% | Never exceed; quality degrades |

**Why 2-3 tasks?** Related tasks share context (file reads, understanding of the code), which reduces per-task overhead. Beyond 3 tasks, a plan tends to exceed the 50% budget and the quality benefit is lost.

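
One way to apply the sizing limits is a greedy pass over an ordered task list. This is a sketch only: `group_into_plans` is a hypothetical helper, estimates are whole percentages, and it simply sums estimates, so it models neither task relatedness nor the savings from shared context.

```python
from typing import List, Tuple

MAX_TASKS_PER_PLAN = 3   # from the "2-3 tasks" guidance
MAX_PLAN_CONTEXT = 50    # percent; plans and tasks must not exceed this


def group_into_plans(tasks: List[Tuple[str, int]]) -> List[List[str]]:
    """Pack ordered (name, estimated_percent) pairs into plans.

    A new plan starts when adding the next task would exceed either
    the task-count limit or the context limit.
    """
    plans: List[List[str]] = []
    current: List[str] = []
    current_total = 0
    for name, estimate in tasks:
        if estimate > MAX_PLAN_CONTEXT:
            raise ValueError(f"{name!r} needs {estimate}% context; split it into subtasks first")
        plan_full = len(current) == MAX_TASKS_PER_PLAN
        over_budget = current_total + estimate > MAX_PLAN_CONTEXT
        if current and (plan_full or over_budget):
            plans.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += estimate
    if current:
        plans.append(current)
    return plans
```

With estimates of 15, 20, 15, 30, and 25 percent, the first three tasks form one plan at exactly 50%, and the last two each get their own plan because together they would need 55%.
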
---
## Wave-Based Parallelization
Compute the dependency graph and assign each task to a wave:
```
Wave 0: Tasks with no dependencies (run in parallel)
Wave 1: Tasks depending only on Wave 0 (run in parallel)
Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
...continue until all tasks assigned
```
**Benefits:**
- Maximum parallelization
- Clear progress tracking
- Natural checkpoints between waves
### Computation Algorithm
```
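1. Build the dependency graph from task dependencies
2. Find all tasks with no dependencies → Wave 0
3. Mark Wave 0 tasks as "resolved"
4. Find all unassigned tasks whose dependencies are all resolved → next wave
5. Mark that wave as "resolved" and repeat step 4 until all tasks are assigned
6. If step 4 finds no tasks while some remain unassigned, the graph contains a cycle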
```
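
The steps above can be sketched as a layered topological sort. This is an illustrative implementation, not Codewalk's: `compute_waves` and its input shape (task name mapped to a set of prerequisite names) are assumptions, and the unknown-dependency check is an addition for safety.

```python
from typing import Dict, List, Set


def compute_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Assign each task to the earliest wave in which all its dependencies are resolved."""
    unknown = {d for deps in dependencies.values() for d in deps} - set(dependencies)
    if unknown:
        raise ValueError(f"dependencies on unknown tasks: {sorted(unknown)}")

    resolved: Set[str] = set()
    remaining = dict(dependencies)
    waves: List[List[str]] = []
    while remaining:
        # Every task whose dependencies are all resolved joins the next wave.
        wave = sorted(task for task, deps in remaining.items() if deps <= resolved)
        if not wave:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(wave)
        resolved.update(wave)
        for task in wave:
            del remaining[task]
    return waves


if __name__ == "__main__":
    example = {
        "schema": set(),
        "docs": set(),
        "api": {"schema"},
        "ui": {"api"},
        "e2e": {"api", "ui"},
    }
    for index, wave in enumerate(compute_waves(example)):
        print(f"Wave {index}: {', '.join(wave)}")
```

Tasks within a wave are sorted only to make the output deterministic; they have no ordering constraint among themselves and can run in parallel.
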
---
## Context Handoff
When context approaches a role's reset trigger, perform a controlled handoff:
### STATE.md Update
Before handoff, update session state:
```yaml
position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1
decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"
blockers:
  - "Waiting for user to configure OAuth credentials"
next_action: "Continue with task after blocker resolved"
```
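
For agents that write this state programmatically, the shape can be modeled as a typed record. This is a sketch under assumptions: `SessionState` and `to_yaml` are hypothetical names, and the serializer is hand-rolled to stay within the standard library. It relies on JSON string literals being valid YAML double-quoted scalars.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionState:
    phase: int
    plan: int
    task: str
    wave: int
    decisions: List[str] = field(default_factory=list)
    blockers: List[str] = field(default_factory=list)
    next_action: str = ""

    def to_yaml(self) -> str:
        """Render the state in the layout shown above."""

        def block(key: str, items: List[str]) -> List[str]:
            # An empty list must be written as [] or YAML reads it as null.
            if not items:
                return [f"{key}: []"]
            return [f"{key}:"] + [f"  - {json.dumps(item)}" for item in items]

        lines = [
            "position:",
            f"  phase: {self.phase}",
            f"  plan: {self.plan}",
            f"  task: {json.dumps(self.task)}",
            f"  wave: {self.wave}",
            *block("decisions", self.decisions),
            *block("blockers", self.blockers),
            f"next_action: {json.dumps(self.next_action)}",
        ]
        return "\n".join(lines) + "\n"
```

Keeping the fields explicit makes it harder for a handoff to silently drop decisions or blockers.
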
### Handoff Content
New session receives:
- STATE.md (current position)
- Relevant SUMMARY.md files (prior work in this phase)
- Current PLAN.md (if executing)
- Task context from initiative
---
## Anti-Patterns
### Context Stuffing
- **Wrong:** Loading entire codebase at session start
- **Right:** Load files on-demand as tasks require them
### Orchestrator Processing
- **Wrong:** Orchestrator reads all code and makes decisions
- **Right:** Orchestrator routes to specialized agents who do the work
### Plan Bloat
- **Wrong:** 10-task plans to "reduce coordination overhead"
- **Right:** 2-3 task plans that fit in 50% context
### No Handoff State
- **Wrong:** Agent restarts with no memory of prior work
- **Right:** STATE.md preserves position, decisions, blockers

---
## Monitoring
Track context utilization across the system:
| Metric | Threshold | Action |
|--------|-----------|--------|
| Orchestrator context | >50% | Trigger handoff |
| Worker task context | >60% | Flag task as oversized |
| Plan total estimate | >50% | Split plan before execution |
| Average task context | >40% | Review decomposition strategy |
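
The threshold table maps directly onto a lookup. This is a sketch with hypothetical names: the metric keys and `required_actions` are not Codewalk identifiers, and values are percentages.

```python
from typing import Dict, List, Tuple

# metric name -> (threshold in percent, action when exceeded)
THRESHOLDS: Dict[str, Tuple[int, str]] = {
    "orchestrator_context": (50, "Trigger handoff"),
    "worker_task_context": (60, "Flag task as oversized"),
    "plan_total_estimate": (50, "Split plan before execution"),
    "average_task_context": (40, "Review decomposition strategy"),
}


def required_actions(metrics: Dict[str, float]) -> List[str]:
    """Return the action for every metric strictly above its threshold."""
    actions = []
    for name, value in metrics.items():
        threshold, action = THRESHOLDS[name]
        if value > threshold:
            actions.append(action)
    return actions
```

For example, `required_actions({"orchestrator_context": 55, "worker_task_context": 45})` returns `["Trigger handoff"]`.
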
---
## Implementation Notes
### Context Estimation
Estimate context usage before execution:
- File reads: ~1-2% per file (varies by size)
- Code changes: ~0.5% per change
- Tool outputs: ~1% per tool call
- Discussion: ~2-5% per exchange
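
The rough per-activity costs above add up to a range rather than a single number. This is a minimal sketch: `estimate_context` is a hypothetical helper, and it returns a (low, high) pair in percent using the lower and upper figures from the list.

```python
from typing import Tuple


def estimate_context(
    files_read: int = 0,
    code_changes: int = 0,
    tool_calls: int = 0,
    exchanges: int = 0,
) -> Tuple[float, float]:
    """Estimate context usage in percent as a (low, high) range."""
    fixed = code_changes * 0.5 + tool_calls * 1.0
    low = files_read * 1.0 + exchanges * 2.0 + fixed   # 1% per file, 2% per exchange
    high = files_read * 2.0 + exchanges * 5.0 + fixed  # 2% per file, 5% per exchange
    return low, high
```

A task that reads 8 files, makes 6 changes, runs 10 tool calls, and needs 2 exchanges comes out at 25-39%. The upper end stays under the 50% task limit.
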
### Fresh Context Triggers
- Worker: Always fresh per task
- Architect: Fresh per initiative
- Verifier: Fresh per phase
- Orchestrator: Handoff at 50%
### Subagent Spawning
When spawning subagents:
1. Provide focused context (only what's needed)
2. Give clear instructions (specific task, expected output)
3. Collect structured results
4. Update state with outcomes
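
The four steps can be sketched as a request/result pair around a pluggable runner. Everything here is hypothetical (`SubagentRequest`, `SubagentResult`, `spawn`, and the `run_agent` callable stand in for whatever actually launches an agent); the sketch only shows the data that crosses the boundary.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SubagentRequest:
    role: str                      # e.g. "worker", "architect", "verifier"
    task: str                      # the specific task to perform
    expected_output: str           # what the orchestrator wants back
    context_files: List[str] = field(default_factory=list)  # only what is needed


@dataclass
class SubagentResult:
    task: str
    status: str                    # "done" or "blocked"
    summary: str                   # short structured outcome, not raw logs


def spawn(
    request: SubagentRequest,
    run_agent: Callable[[SubagentRequest], SubagentResult],
    state: Dict[str, List[str]],
) -> SubagentResult:
    """Run one subagent and record its outcome in the orchestrator's state."""
    result = run_agent(request)
    bucket = "completed" if result.status == "done" else "blocked"
    state.setdefault(bucket, []).append(result.task)
    return result
```

The orchestrator keeps only the short `summary` and the state update, which is what lets it stay within its 30-40% budget.
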