Context Engineering

Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.

Quality Degradation Curve

Claude's output quality follows a predictable curve based on context utilization:

Context Usage	Quality Level	Behavior
0-30%	PEAK	Thorough, comprehensive, considers edge cases
30-50%	GOOD	Confident, solid work, reliable output
50-70%	DEGRADING	Efficiency mode begins, shortcuts appear
70%+	POOR	Rushed, minimal, misses requirements

Rule: Stay UNDER 50% context for quality work.
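The bands in the table can be expressed as a small lookup. This is a minimal sketch, not Codewalk's implementation: the function names are hypothetical, and because the table's bands share their boundary values, the sketch assumes a boundary value (30%, 50%, 70%) falls into the lower-quality band.

```python
def quality_level(context_used: float) -> str:
    """Map a context-utilization fraction (0.0-1.0) to a quality band.

    Assumption: a value exactly on a boundary belongs to the worse band.
    """
    if not 0.0 <= context_used <= 1.0:
        raise ValueError("context_used must be between 0.0 and 1.0")
    if context_used < 0.30:
        return "PEAK"
    if context_used < 0.50:
        return "GOOD"
    if context_used < 0.70:
        return "DEGRADING"
    return "POOR"


def within_quality_budget(context_used: float) -> bool:
    """Apply the rule: stay under 50% context for quality work."""
    return context_used < 0.50
```

For example, `quality_level(0.45)` returns `"GOOD"`, while `within_quality_budget(0.50)` is already `False`.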

Orchestrator Pattern

Codewalk uses thin orchestration with heavy subagent work:

┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
                              │
           ┌──────────────────┼──────────────────┐
           ▼                  ▼                  ▼
    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
    │   Worker    │    │  Architect  │    │  Verifier   │
    │  (200k ctx) │    │  (200k ctx) │    │  (200k ctx) │
    │  Fresh per  │    │  Fresh per  │    │  Fresh per  │
    │    task     │    │  initiative │    │    phase    │
    └─────────────┘    └─────────────┘    └─────────────┘

Key insight: Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.

Context Budgets by Role

Orchestrator

Target: 30-40% max
Strategy: Route, don't process. Collect results, don't analyze.
Reset trigger: Context exceeds 50%

Worker

Target: 50% per task
Strategy: Single task per context. Fresh context for each task.
Reset trigger: Task completion (always)

Architect

Target: 60% per initiative analysis
Strategy: Initiative discussion + planning in single context
Reset trigger: Work plan generated or context exceeds 70%

Verifier

Target: 40% per phase verification
Strategy: Goal-backward verification, gap identification
Reset trigger: Verification complete
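The four role budgets could be captured as data with a single reset check. This is a sketch under assumptions: `RoleBudget`, `ROLE_BUDGETS`, and `needs_reset` are hypothetical names, the Orchestrator target is taken as the upper end of its 30-40% range, and "exceeds" is read as strictly greater than.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleBudget:
    target: float                 # target utilization for the role
    reset_above: Optional[float]  # utilization that forces a reset, if any
    reset_on_completion: bool     # reset when the unit of work finishes


ROLE_BUDGETS = {
    "orchestrator": RoleBudget(0.40, reset_above=0.50, reset_on_completion=False),
    "worker": RoleBudget(0.50, reset_above=None, reset_on_completion=True),
    "architect": RoleBudget(0.60, reset_above=0.70, reset_on_completion=True),
    "verifier": RoleBudget(0.40, reset_above=None, reset_on_completion=True),
}


def needs_reset(role: str, context_used: float, unit_complete: bool) -> bool:
    """Return True when the role's reset trigger has fired."""
    budget = ROLE_BUDGETS[role]
    if budget.reset_on_completion and unit_complete:
        return True
    return budget.reset_above is not None and context_used > budget.reset_above
```

Here `unit_complete` stands for task completion (Worker), work plan generated (Architect), or verification complete (Verifier).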

Task Sizing Rules

Tasks are sized to fit context budgets:

Task Complexity	Context Estimate	Example
Simple	10-20%	Add a field to an existing form
Medium	20-35%	Create new API endpoint with validation
Complex	35-50%	Implement auth flow with refresh tokens
Too Large	>50%	SPLIT INTO SUBTASKS

Planning rule: No single task should require >50% context. If estimation suggests otherwise, decompose before execution.
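A possible way to apply the sizing table during planning is sketched below. The function name is hypothetical, estimates below 10% are assumed to count as Simple, and a value exactly on a band boundary is assumed to fall into the smaller band (so 50% is still Complex, matching the ">50%" row).

```python
def classify_task(context_estimate: float) -> str:
    """Classify a task by its estimated context use (fraction 0.0-1.0)."""
    if context_estimate < 0:
        raise ValueError("context_estimate must be non-negative")
    if context_estimate > 0.50:
        return "too_large"   # must be split into subtasks before execution
    if context_estimate > 0.35:
        return "complex"
    if context_estimate > 0.20:
        return "medium"
    return "simple"


def must_decompose(context_estimate: float) -> bool:
    """Planning rule: no single task should require more than 50% context."""
    return classify_task(context_estimate) == "too_large"
```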

Plan Sizing

Plans group up to three related tasks (typically 2-3) for sequential execution:

Plan Size	Target Context	Notes
Minimal (1 task)	20-30%	Simple independent work
Standard (2-3 tasks)	40-50%	Related work, shared context
Maximum	50%	Never exceed; quality degrades beyond this

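Why 2-3 tasks? Related tasks share context (file reads, understanding of the code), so grouping them reduces overhead. Beyond three tasks the plan tends to exceed the 50% budget, and the quality benefit is lost.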
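One simple way to respect both limits is a greedy pass over tasks that are already ordered so related tasks sit next to each other. This is an illustrative sketch, not the planner Codewalk uses; `group_into_plans` is a hypothetical name, and estimates are given as whole percentages to avoid floating-point rounding.

```python
from __future__ import annotations


def group_into_plans(
    tasks: list[tuple[str, int]],
    max_tasks: int = 3,
    max_percent: int = 50,
) -> list[list[str]]:
    """Group (name, estimated_percent) tasks into plans, in order.

    A new plan starts when adding the next task would exceed either
    the task-count limit or the context budget.
    """
    plans: list[list[str]] = []
    current: list[str] = []
    current_total = 0
    for name, percent in tasks:
        if percent > max_percent:
            raise ValueError(f"task {name!r} exceeds the budget; split it first")
        over_count = len(current) >= max_tasks
        over_budget = current_total + percent > max_percent
        if current and (over_count or over_budget):
            plans.append(current)
            current, current_total = [], 0
        current.append(name)
        current_total += percent
    if current:
        plans.append(current)
    return plans
```

With the estimates `[("a", 15), ("b", 20), ("c", 10), ("d", 30), ("e", 25)]`, the first plan fills up at three tasks (45%), and `d` and `e` end up in separate plans because together they would need 55%.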

Wave-Based Parallelization

Compute the dependency graph and assign tasks to waves:

Wave 0: Tasks with no dependencies (run in parallel)
   ↓
Wave 1: Tasks depending only on Wave 0 (run in parallel)
   ↓
Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
   ↓
...continue until all tasks assigned

Benefits:

Maximum parallelization
Clear progress tracking
Natural checkpoints between waves

Computation Algorithm

1. Build dependency graph from task dependencies
2. Find all tasks with no unresolved dependencies → Wave 0
3. Mark Wave 0 as "resolved"
4. Find all tasks whose dependencies are all resolved → Wave 1
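5. Repeat, marking each new wave as "resolved", until all tasks are assigned. If tasks remain but none is ready, the graph contains a cycle and must be fixed before execution.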
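The steps above amount to a level-by-level topological sort. A minimal sketch follows, assuming tasks are identified by strings and dependencies are given as a mapping from each task to the set of tasks it depends on; `compute_waves` is a hypothetical name, and the cycle and unknown-dependency checks are additions not spelled out in the algorithm.

```python
from __future__ import annotations


def compute_waves(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Assign every task to the earliest wave in which all of its
    dependencies are resolved. Returns waves in execution order."""
    all_deps = {dep for deps in dependencies.values() for dep in deps}
    unknown = all_deps.difference(dependencies)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    resolved: set[str] = set()
    remaining = set(dependencies)
    waves: list[list[str]] = []
    while remaining:
        # Tasks whose dependencies are all resolved form the next wave.
        ready = sorted(t for t in remaining if set(dependencies[t]) <= resolved)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        resolved.update(ready)
        remaining.difference_update(ready)
    return waves
```

Tasks within one returned wave have no dependencies on each other and can run in parallel; each wave is sorted only to make the output deterministic.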

Context Handoff

When context fills, perform a controlled handoff:

STATE.md Update

Before handoff, update session state:

position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1

decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"

blockers:
  - "Waiting for user to configure OAuth credentials"

next_action: "Continue with task after blocker resolved"
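A session could produce this state from a typed structure rather than by hand. The sketch below is an assumption about how that might look: `SessionState` and `render_state` are hypothetical names, the field names mirror the example above, and strings are quoted with `json.dumps`, whose output is also valid as a YAML double-quoted string.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class SessionState:
    phase: int
    plan: int
    task: str
    wave: int
    next_action: str
    decisions: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)


def _render_list(key: str, items: list[str]) -> list[str]:
    """Render a list of strings as a YAML sequence under `key`."""
    if not items:
        return [f"{key}: []"]
    return [f"{key}:"] + [f"  - {json.dumps(item)}" for item in items]


def render_state(state: SessionState) -> str:
    """Render the session state in the layout shown above."""
    lines = [
        "position:",
        f"  phase: {state.phase}",
        f"  plan: {state.plan}",
        f"  task: {json.dumps(state.task)}",
        f"  wave: {state.wave}",
        "",
        *_render_list("decisions", state.decisions),
        "",
        *_render_list("blockers", state.blockers),
        "",
        f"next_action: {json.dumps(state.next_action)}",
    ]
    return "\n".join(lines) + "\n"
```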

Handoff Content

New session receives:

STATE.md (current position)
Relevant SUMMARY.md files (prior work in this phase)
Current PLAN.md (if executing)
Task context from initiative

Anti-Patterns

Context Stuffing

Wrong: Loading entire codebase at session start
Right: Load files on-demand as tasks require them

Orchestrator Processing

Wrong: Orchestrator reads all code and makes decisions
Right: Orchestrator routes to specialized agents who do the work

Plan Bloat

Wrong: 10-task plans to "reduce coordination overhead"
Right: 2-3 task plans that fit in 50% context

No Handoff State

Wrong: Agent restarts with no memory of prior work
Right: STATE.md preserves position, decisions, blockers

Monitoring

Track context utilization across the system:

Metric	Threshold	Action
Orchestrator context	>50%	Trigger handoff
Worker task context	>60%	Flag task as oversized
Plan total estimate	>50%	Split plan before execution
Average task context	>40%	Review decomposition strategy
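The table maps directly onto a threshold check. This is a sketch with hypothetical metric keys and function name; values are fractions of the context window, and a threshold is treated as crossed only when strictly exceeded, matching the ">" in the table.

```python
from __future__ import annotations

# metric key -> (threshold, action); keys are illustrative names
THRESHOLDS: dict[str, tuple[float, str]] = {
    "orchestrator_context": (0.50, "Trigger handoff"),
    "worker_task_context": (0.60, "Flag task as oversized"),
    "plan_total_estimate": (0.50, "Split plan before execution"),
    "average_task_context": (0.40, "Review decomposition strategy"),
}


def required_actions(metrics: dict[str, float]) -> list[str]:
    """Return the actions whose thresholds are exceeded, in table order.

    Metrics missing from the input are skipped.
    """
    actions = []
    for name, (limit, action) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is not None and value > limit:
            actions.append(action)
    return actions
```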

Implementation Notes

Context Estimation

Estimate context usage before execution:

File reads: ~1-2% per file (varies by size)
Code changes: ~0.5% per change
Tool outputs: ~1% per tool call
Discussion: ~2-5% per exchange
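These heuristics can be combined into a rough range estimate. The sketch below is illustrative only: `estimate_context` is a hypothetical name, and it simply multiplies counts by the per-item figures listed above, using the low and high ends where a range is given.

```python
from __future__ import annotations


def estimate_context(
    file_reads: int = 0,
    code_changes: int = 0,
    tool_calls: int = 0,
    exchanges: int = 0,
) -> tuple[float, float]:
    """Return a (low, high) estimate of context use in percent."""
    fixed = code_changes * 0.5 + tool_calls * 1.0
    low = file_reads * 1.0 + exchanges * 2.0 + fixed
    high = file_reads * 2.0 + exchanges * 5.0 + fixed
    return low, high


# 6 file reads, 8 code changes, 10 tool calls, 2 discussion exchanges
low, high = estimate_context(file_reads=6, code_changes=8, tool_calls=10, exchanges=2)
print(f"{low:.0f}-{high:.0f}%")  # prints "24-36%"
```

A task estimated at 24-36% sits in the Medium-to-Complex range of the sizing table and fits within the 50% limit.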

Fresh Context Triggers

Worker: Always fresh per task
Architect: Fresh per initiative
Verifier: Fresh per phase
Orchestrator: Handoff at 50%

Subagent Spawning

When spawning subagents:

Provide focused context (only what's needed)
Give clear instructions (specific task, expected output)
Collect structured results
Update state with outcomes
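The four steps might be wired together as below. Everything here is hypothetical: `SubagentRequest`, `SubagentResult`, and `spawn_subagent` are illustrative names, and `run` stands in for whatever mechanism actually launches a subagent with a fresh context.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class SubagentRequest:
    role: str                 # e.g. "worker", "architect", "verifier"
    task: str                 # the specific task to perform
    expected_output: str      # what the subagent should return
    context_files: list[str]  # focused context: only what the task needs


@dataclass
class SubagentResult:
    task: str
    status: str   # e.g. "done" or "blocked"
    summary: str  # short structured outcome, not raw working output


def spawn_subagent(
    request: SubagentRequest,
    run: Callable[[SubagentRequest], SubagentResult],
    state: dict,
) -> SubagentResult:
    """Run one subagent and record its outcome in the shared state."""
    result = run(request)
    state.setdefault("outcomes", []).append(asdict(result))
    return result
```

The orchestrator only stores the structured result, which keeps its own context small while the heavy work stays in the subagent.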
