Context Engineering

Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.

Quality Degradation Curve

Claude's output quality follows a predictable curve based on context utilization:

| Context Usage | Quality Level | Behavior |
|---------------|---------------|----------|
| 0-30% | PEAK | Thorough, comprehensive, considers edge cases |
| 30-50% | GOOD | Confident, solid work, reliable output |
| 50-70% | DEGRADING | Efficiency mode begins, shortcuts appear |
| 70%+ | POOR | Rushed, minimal, misses requirements |

Rule: Stay UNDER 50% context for quality work.
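
The bands above can be expressed as a small lookup. This is a minimal sketch, not part of Codewalk itself: the function names are hypothetical, and because the table's ranges touch at 30%, 50%, and 70%, it assumes each boundary value belongs to the lower-quality band (consistent with "stay UNDER 50%").

```python
def quality_level(context_pct: float) -> str:
    """Map context utilization (0-100) to a quality band from the table.

    Assumption: a boundary value (30, 50, 70) falls into the worse band.
    """
    if not 0 <= context_pct <= 100:
        raise ValueError("context_pct must be between 0 and 100")
    if context_pct < 30:
        return "PEAK"
    if context_pct < 50:
        return "GOOD"
    if context_pct < 70:
        return "DEGRADING"
    return "POOR"


def within_quality_budget(context_pct: float) -> bool:
    """Apply the rule: quality work happens strictly under 50% context."""
    return quality_level(context_pct) in ("PEAK", "GOOD")
```

For example, `quality_level(45)` returns `"GOOD"`, while `within_quality_budget(50)` is `False` because 50% is no longer under the limit.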


Orchestrator Pattern

Codewalk uses thin orchestration with heavy subagent work:

┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
                              │
           ┌──────────────────┼──────────────────┐
           ▼                  ▼                  ▼
    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
    │   Worker    │    │  Architect  │    │  Verifier   │
    │  (200k ctx) │    │  (200k ctx) │    │  (200k ctx) │
    │  Fresh per  │    │  Fresh per  │    │  Fresh per  │
    │    task     │    │  initiative │    │    phase    │
    └─────────────┘    └─────────────┘    └─────────────┘

Key insight: Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.


Context Budgets by Role

Orchestrator

  • Target: 30-40% max
  • Strategy: Route, don't process. Collect results, don't analyze.
  • Reset trigger: Context exceeds 50%

Worker

  • Target: 50% per task
  • Strategy: Single task per context. Fresh context for each task.
  • Reset trigger: Task completion (always)

Architect

  • Target: 60% per initiative analysis
  • Strategy: Initiative discussion + planning in single context
  • Reset trigger: Work plan generated or context exceeds 70%

Verifier

  • Target: 40% per phase verification
  • Strategy: Goal-backward verification, gap identification
  • Reset trigger: Verification complete
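
One way to encode these budgets is a small table keyed by role. The sketch below is illustrative only: `RoleBudget`, `ROLE_BUDGETS`, and `should_reset` are hypothetical names, and the numbers are copied from the lists above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleBudget:
    target_pct: int                  # upper end of the target budget
    reset_above_pct: Optional[int]   # context level that forces a reset, if any
    reset_on_completion: bool        # reset when the unit of work finishes


# Values taken from the role descriptions above.
ROLE_BUDGETS = {
    "orchestrator": RoleBudget(40, 50, False),
    "worker": RoleBudget(50, None, True),       # always reset after a task
    "architect": RoleBudget(60, 70, True),      # reset once the work plan exists
    "verifier": RoleBudget(40, None, True),     # reset when verification completes
}


def should_reset(role: str, context_pct: float, work_done: bool = False) -> bool:
    """Return True when the role's reset trigger has fired."""
    budget = ROLE_BUDGETS[role]
    if budget.reset_on_completion and work_done:
        return True
    return budget.reset_above_pct is not None and context_pct > budget.reset_above_pct
```

With this encoding, `should_reset("orchestrator", 55)` is `True`, while a worker at 30% resets only once `work_done=True`.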

Task Sizing Rules

Tasks are sized to fit context budgets:

| Task Complexity | Context Estimate | Example |
|-----------------|------------------|---------|
| Simple | 10-20% | Add a field to an existing form |
| Medium | 20-35% | Create new API endpoint with validation |
| Complex | 35-50% | Implement auth flow with refresh tokens |
| Too Large | >50% | SPLIT INTO SUBTASKS |

Planning rule: No single task should require >50% context. If estimation suggests otherwise, decompose before execution.
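
The sizing table and planning rule could be applied during planning roughly as follows. This is a sketch under assumptions: the function names are hypothetical, estimates below 10% are treated as Simple, and a boundary value (20 or 35) is assigned to the smaller class since only ">50%" is stated as a strict bound.

```python
def classify_task(estimate_pct: float) -> str:
    """Classify a task by its estimated context usage (0-100)."""
    if estimate_pct < 0:
        raise ValueError("estimate_pct must be non-negative")
    if estimate_pct > 50:
        return "Too Large"
    if estimate_pct > 35:
        return "Complex"
    if estimate_pct > 20:
        return "Medium"
    return "Simple"  # assumption: anything up to 20% counts as Simple


def needs_decomposition(estimate_pct: float) -> bool:
    """Planning rule: tasks estimated above 50% must be split first."""
    return classify_task(estimate_pct) == "Too Large"
```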


Plan Sizing

Plans group up to three related tasks for sequential execution; two or three is the standard size:

| Plan Size | Target Context | Notes |
|-----------|----------------|-------|
| Minimal (1 task) | 20-30% | Simple independent work |
| Standard (2-3 tasks) | 40-50% | Related work, shared context |
| Maximum | 50% | Never exceed; quality degrades |

Why 2-3 tasks? Related tasks share context, so the cost of file reads and building understanding is paid once. Beyond 3 tasks, a plan tends to exceed the 50% budget, and the quality benefit is lost.
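
A simple way to respect both limits (at most 3 tasks, at most 50% context) is a greedy pass over the ordered task list. The sketch below is an assumption about how grouping might be done, not Codewalk's planner: `group_into_plans` is a hypothetical name, estimates are whole percentages, and the savings from shared context are ignored, so the totals are conservative.

```python
from typing import List, Tuple


def group_into_plans(
    tasks: List[Tuple[str, int]],
    max_tasks: int = 3,
    max_context_pct: int = 50,
) -> List[List[str]]:
    """Greedily pack ordered (name, estimate_pct) tasks into plans."""
    plans: List[List[str]] = []
    current: List[str] = []
    current_total = 0
    for name, estimate in tasks:
        if estimate > max_context_pct:
            raise ValueError(f"task {name!r} exceeds {max_context_pct}%; split it first")
        plan_is_full = len(current) >= max_tasks
        over_budget = current_total + estimate > max_context_pct
        if current and (plan_is_full or over_budget):
            plans.append(current)          # close the current plan
            current, current_total = [], 0
        current.append(name)
        current_total += estimate
    if current:
        plans.append(current)
    return plans
```

Given estimates of 15, 20, 10, 30, and 25 for tasks `a` through `e`, this yields `[["a", "b", "c"], ["d"], ["e"]]`: the first plan stops at three tasks, and `d` and `e` together would reach 55%.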


Wave-Based Parallelization

Compute dependency graph and assign tasks to waves:

Wave 0: Tasks with no dependencies (run in parallel)
   ↓
Wave 1: Tasks depending only on Wave 0 (run in parallel)
   ↓
Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
   ↓
...continue until all tasks assigned

Benefits:

  • Maximum parallelization
  • Clear progress tracking
  • Natural checkpoints between waves

Computation Algorithm

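1. Build the dependency graph from task dependencies
2. Find all tasks with no dependencies → Wave 0
3. Mark Wave 0 as "resolved"
4. Find all unassigned tasks whose dependencies are all resolved → Wave 1, then mark that wave as resolved
5. Repeat step 4 for each subsequent wave until all tasks are assigned; if a pass assigns no tasks, the remaining tasks form a dependency cycle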
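
The steps above can be sketched as a short function. This is a minimal illustration, assuming tasks are identified by name and dependencies are given as a mapping from each task to the set of tasks it depends on; `compute_waves` is a hypothetical name. It also rejects dependency cycles and unknown dependencies, which the numbered steps do not cover explicitly.

```python
from typing import Dict, List, Set


def compute_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Assign each task to the earliest wave in which all its dependencies are resolved."""
    known = set(dependencies)
    unknown = {dep for deps in dependencies.values() for dep in deps} - known
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    resolved: Set[str] = set()
    remaining = dict(dependencies)
    waves: List[List[str]] = []
    while remaining:
        # Every unassigned task whose dependencies are all resolved joins this wave.
        wave = sorted(task for task, deps in remaining.items() if deps <= resolved)
        if not wave:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(wave)
        resolved.update(wave)
        for task in wave:
            del remaining[task]
    return waves
```

For example, with `a` and `b` independent, `c` depending on `a`, `d` on both `a` and `b`, and `e` on `c` and `d`, the result is `[["a", "b"], ["c", "d"], ["e"]]`. Tasks within one inner list can run in parallel.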

Context Handoff

When context fills, perform a controlled handoff:

STATE.md Update

Before handoff, update session state:

position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1

decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"

blockers:
  - "Waiting for user to configure OAuth credentials"

next_action: "Continue with task after blocker resolved"

Handoff Content

New session receives:

  • STATE.md (current position)
  • Relevant SUMMARY.md files (prior work in this phase)
  • Current PLAN.md (if executing)
  • Task context from initiative
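
The handoff contents can be modeled as plain data so that a new session receives exactly the four items listed. The sketch below is an assumption about shape only: the class and function names are hypothetical, the fields mirror the STATE.md example, and how the files are actually read or written is left out.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Position:
    phase: int
    plan: int
    task: str
    wave: int


@dataclass
class SessionState:
    """Mirrors the STATE.md fields shown above."""
    position: Position
    decisions: List[str] = field(default_factory=list)
    blockers: List[str] = field(default_factory=list)
    next_action: str = ""


@dataclass
class Handoff:
    state: SessionState           # STATE.md (current position)
    summaries: List[str]          # relevant SUMMARY.md contents for this phase
    task_context: str             # task context from the initiative
    plan: Optional[str] = None    # current PLAN.md, only if a plan is executing


def handoff_payload(handoff: Handoff) -> dict:
    """Build the dictionary passed to the new session."""
    payload = {
        "state": asdict(handoff.state),
        "summaries": list(handoff.summaries),
        "task_context": handoff.task_context,
    }
    if handoff.plan is not None:
        payload["plan"] = handoff.plan
    return payload
```

Keeping the plan optional reflects the "(if executing)" qualifier: a session that hands off between plans sends no PLAN.md.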

Anti-Patterns

Context Stuffing

Wrong: Loading entire codebase at session start
Right: Load files on-demand as tasks require them

Orchestrator Processing

Wrong: Orchestrator reads all code and makes decisions
Right: Orchestrator routes to specialized agents who do the work

Plan Bloat

Wrong: 10-task plans to "reduce coordination overhead"
Right: 2-3 task plans that fit in 50% context

No Handoff State

Wrong: Agent restarts with no memory of prior work
Right: STATE.md preserves position, decisions, blockers


Monitoring

Track context utilization across the system:

| Metric | Threshold | Action |
|--------|-----------|--------|
| Orchestrator context | >50% | Trigger handoff |
| Worker task context | >60% | Flag task as oversized |
| Plan total estimate | >50% | Split plan before execution |
| Average task context | >40% | Review decomposition strategy |
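
These thresholds lend themselves to a table-driven check. The following is a sketch only: the metric keys and `required_actions` are hypothetical names, and how the measurements are collected is outside its scope.

```python
from typing import Dict, List

# (threshold in percent, action) per metric, copied from the table above.
THRESHOLDS = {
    "orchestrator_context": (50, "Trigger handoff"),
    "worker_task_context": (60, "Flag task as oversized"),
    "plan_total_estimate": (50, "Split plan before execution"),
    "average_task_context": (40, "Review decomposition strategy"),
}


def required_actions(metrics: Dict[str, float]) -> List[str]:
    """Return the action for every metric that exceeds its threshold."""
    return [
        action
        for name, (limit, action) in THRESHOLDS.items()
        if metrics.get(name, 0) > limit   # missing metrics are treated as 0
    ]
```

For instance, `required_actions({"orchestrator_context": 55, "worker_task_context": 45})` returns `["Trigger handoff"]`.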

Implementation Notes

Context Estimation

Estimate context usage before execution:

  • File reads: ~1-2% per file (varies by size)
  • Code changes: ~0.5% per change
  • Tool outputs: ~1% per tool call
  • Discussion: ~2-5% per exchange
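
As a worked example of these rates, the sketch below turns activity counts into a low/high estimate in percent. The function name and parameters are hypothetical; the per-item rates are the approximate figures listed above, so the result is a rough planning aid rather than a measurement.

```python
from typing import Tuple


def estimate_context(
    file_reads: int,
    code_changes: int,
    tool_calls: int,
    exchanges: int,
) -> Tuple[float, float]:
    """Return (low, high) estimated context usage in percent."""
    fixed = code_changes * 0.5 + tool_calls * 1.0    # ~0.5% per change, ~1% per tool call
    low = file_reads * 1.0 + exchanges * 2.0 + fixed   # ~1% per file, ~2% per exchange
    high = file_reads * 2.0 + exchanges * 5.0 + fixed  # ~2% per file, ~5% per exchange
    return low, high
```

A task with 6 file reads, 8 code changes, 10 tool calls, and 2 discussion exchanges comes out at `(24.0, 36.0)`, which sits in the Medium to Complex range and fits within the 50% limit.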

Fresh Context Triggers

  • Worker: Always fresh per task
  • Architect: Fresh per initiative
  • Verifier: Fresh per phase
  • Orchestrator: Handoff at 50%

Subagent Spawning

When spawning subagents:

  1. Provide focused context (only what's needed)
  2. Clear instructions (specific task, expected output)
  3. Collect structured results
  4. Update state with outcomes
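
The four steps might look like this in code. It is a sketch under stated assumptions: `SubagentRequest`, `SubagentResult`, and `spawn_subagent` are hypothetical names, and `run_agent` stands in for whatever mechanism actually launches a fresh-context agent, which this document does not specify.

```python
from dataclasses import asdict, dataclass
from typing import Callable, Dict, List


@dataclass
class SubagentRequest:
    role: str                  # e.g. "worker", "architect", "verifier"
    task: str                  # the specific task
    expected_output: str       # what the subagent must return
    context_files: List[str]   # only the files this task needs


@dataclass
class SubagentResult:
    task: str
    status: str                # e.g. "done" or "blocked"
    summary: str


def spawn_subagent(
    request: SubagentRequest,
    run_agent: Callable[[SubagentRequest], SubagentResult],  # hypothetical launcher
    state: Dict[str, list],
) -> SubagentResult:
    """Run one subagent with focused context and record its outcome."""
    if not request.task or not request.expected_output:
        raise ValueError("a subagent needs a specific task and an expected output")
    result = run_agent(request)                                # steps 1-3
    state.setdefault("outcomes", []).append(asdict(result))    # step 4
    return result
```

The orchestrator only stores the structured result; it never reads the files in `context_files` itself, in line with "route, don't process".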