Add userDismissedAt field to agents schema
218
docs/context-engineering.md
Normal file
@@ -0,0 +1,218 @@
# Context Engineering

Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.

## Quality Degradation Curve

Claude's output quality follows a predictable curve based on context utilization:

| Context Usage | Quality Level | Behavior |
|---------------|---------------|----------|
| 0-30% | **PEAK** | Thorough, comprehensive, considers edge cases |
| 30-50% | **GOOD** | Confident, solid work, reliable output |
| 50-70% | **DEGRADING** | Efficiency mode begins, shortcuts appear |
| 70%+ | **POOR** | Rushed, minimal, misses requirements |

**Rule: Stay UNDER 50% context for quality work.**
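The bands in the table can be expressed as a small lookup. This is a minimal sketch rather than Codewalk's actual code: the function names are hypothetical, and because the table's ranges share their boundary values (30%, 50%), the sketch assumes each boundary belongs to the higher band.

```python
def quality_level(context_pct: float) -> str:
    """Map context utilization (0-100) to the quality band from the table.

    Assumption: a shared boundary value (30, 50, 70) falls in the higher band.
    """
    if not 0 <= context_pct <= 100:
        raise ValueError("context_pct must be between 0 and 100")
    if context_pct < 30:
        return "PEAK"
    if context_pct < 50:
        return "GOOD"
    if context_pct < 70:
        return "DEGRADING"
    return "POOR"


def within_quality_budget(context_pct: float) -> bool:
    """Apply the rule: stay under 50% context for quality work."""
    return context_pct < 50
```

For example, `quality_level(45)` returns `"GOOD"`, while `within_quality_budget(50)` is `False` because the rule requires staying strictly under 50%.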

---

## Orchestrator Pattern

Codewalk uses thin orchestration with heavy subagent work:

```
┌─────────────────────────────────────────────────────────────┐
│                    Orchestrator (30-40%)                    │
│  - Routes work to specialized agents                        │
│  - Collects results                                         │
│  - Maintains state                                          │
│  - Coordinates across phases                                │
└─────────────────────────────────────────────────────────────┘
                               │
            ┌──────────────────┼──────────────────┐
            ▼                  ▼                  ▼
     ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
     │   Worker    │    │  Architect  │    │  Verifier   │
     │ (200k ctx)  │    │ (200k ctx)  │    │ (200k ctx)  │
     │  Fresh per  │    │  Fresh per  │    │  Fresh per  │
     │    task     │    │ initiative  │    │    phase    │
     └─────────────┘    └─────────────┘    └─────────────┘
```

**Key insight:** Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.
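To make the percentages concrete, the arithmetic below converts between tokens and utilization for a 200k window. It is only an illustrative sketch: the helper names are hypothetical, and the 200k figure is taken from the diagram for subagents (the document does not state the orchestrator's window size).

```python
SUBAGENT_WINDOW_TOKENS = 200_000  # per the diagram: each subagent gets 200k


def utilization_pct(tokens_used: int, window: int = SUBAGENT_WINDOW_TOKENS) -> float:
    """Return context utilization as a percentage of the window."""
    if window <= 0 or tokens_used < 0:
        raise ValueError("window must be positive and tokens_used non-negative")
    return 100.0 * tokens_used / window


def token_budget(target_pct: float, window: int = SUBAGENT_WINDOW_TOKENS) -> int:
    """Return how many tokens a percentage target allows in the window."""
    return int(window * target_pct / 100)
```

Under these assumptions, a 50% target on a fresh subagent corresponds to `token_budget(50)`, that is 100,000 tokens.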

---

## Context Budgets by Role

### Orchestrator
- **Target:** 30-40% max
- **Strategy:** Route, don't process. Collect results, don't analyze.
- **Reset trigger:** Context exceeds 50%

### Worker
- **Target:** 50% per task
- **Strategy:** Single task per context. Fresh context for each task.
- **Reset trigger:** Task completion (always)

### Architect
- **Target:** 60% per initiative analysis
- **Strategy:** Initiative discussion + planning in single context
- **Reset trigger:** Work plan generated or context exceeds 70%

### Verifier
- **Target:** 40% per phase verification
- **Strategy:** Goal-backward verification, gap identification
- **Reset trigger:** Verification complete
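The per-role budgets above could be captured in a small table plus one reset check. This is a sketch under stated assumptions, not the project's implementation: `RoleBudget`, `BUDGETS`, and `should_reset` are hypothetical names, and "work done" stands in for task completion, work plan generation, or verification completion depending on the role.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoleBudget:
    target_pct: float            # upper end of the target utilization
    reset_at_pct: Optional[float]  # utilization that forces a reset, if any
    reset_on_completion: bool    # whether finishing the unit of work resets context


# Values copied from the role lists above.
BUDGETS = {
    "orchestrator": RoleBudget(40, 50, False),
    "worker": RoleBudget(50, None, True),
    "architect": RoleBudget(60, 70, True),
    "verifier": RoleBudget(40, None, True),
}


def should_reset(role: str, context_pct: float, work_done: bool) -> bool:
    """Return True when the role's reset trigger has fired."""
    budget = BUDGETS[role]
    if work_done and budget.reset_on_completion:
        return True
    return budget.reset_at_pct is not None and context_pct > budget.reset_at_pct
```

For instance, `should_reset("architect", 72, work_done=False)` is `True` because the 70% ceiling was crossed, while a worker resets on completion regardless of how little context it used.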

---

## Task Sizing Rules

Tasks are sized to fit context budgets:

| Task Complexity | Context Estimate | Example |
|-----------------|------------------|---------|
| Simple | 10-20% | Add a field to an existing form |
| Medium | 20-35% | Create new API endpoint with validation |
| Complex | 35-50% | Implement auth flow with refresh tokens |
| Too Large | >50% | **SPLIT INTO SUBTASKS** |

**Planning rule:** No single task should require >50% context. If estimation suggests otherwise, decompose before execution.
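A planner could apply the sizing table with a check like the one below. Treat it as a sketch: `classify_task` is a hypothetical helper, estimates under 10% are assumed to count as simple, and shared boundary values (20%, 35%, 50%) are assumed to fall in the lower band so that only estimates strictly above 50% trigger a split.

```python
def classify_task(estimate_pct: float) -> str:
    """Classify a task by its estimated context usage (0-100)."""
    if estimate_pct < 0:
        raise ValueError("estimate_pct must be non-negative")
    if estimate_pct > 50:
        return "too_large"   # planning rule: split into subtasks first
    if estimate_pct > 35:
        return "complex"
    if estimate_pct > 20:
        return "medium"
    return "simple"


def must_split(estimate_pct: float) -> bool:
    """True when the planning rule requires decomposition before execution."""
    return classify_task(estimate_pct) == "too_large"
```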

---

## Plan Sizing

Plans group 2-3 related tasks for sequential execution:

| Plan Size | Target Context | Notes |
|-----------|----------------|-------|
| Minimal (1 task) | 20-30% | Simple independent work |
| Standard (2-3 tasks) | 40-50% | Related work, shared context |
| Maximum | 50% | Never exceed; quality degrades beyond this |

**Why 2-3 tasks?** Shared context reduces overhead (file reads, understanding). Beyond three tasks, a plan tends to exceed the 50% budget and loses those quality benefits.
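One way to respect both limits (at most three tasks, at most 50% context) is a greedy pass over tasks in execution order. The sketch below is an assumption-laden illustration: `group_into_plans` is a hypothetical name, and it sums task estimates directly, which is conservative because it ignores the savings from shared context.

```python
from typing import List, Tuple

Task = Tuple[str, float]  # (task name, estimated context in percent)


def group_into_plans(
    tasks: List[Task], max_tasks: int = 3, max_context_pct: float = 50
) -> List[List[str]]:
    """Greedily pack tasks, in order, into plans that fit both limits."""
    plans: List[List[str]] = []
    current: List[str] = []
    used = 0.0
    for name, estimate in tasks:
        if estimate > max_context_pct:
            raise ValueError(f"task {name!r} exceeds the plan budget; split it first")
        # Close the current plan if adding this task would break either limit.
        if current and (len(current) == max_tasks or used + estimate > max_context_pct):
            plans.append(current)
            current, used = [], 0.0
        current.append(name)
        used += estimate
    if current:
        plans.append(current)
    return plans
```

With estimates of 15, 15, 15, 30, and 25 percent, the first three tasks share a plan and the last two each get their own, since 30 + 25 would exceed the 50% ceiling.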

---

## Wave-Based Parallelization

Compute the dependency graph and assign tasks to waves:

```
Wave 0: Tasks with no dependencies (run in parallel)
↓
Wave 1: Tasks depending only on Wave 0 (run in parallel)
↓
Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
↓
...continue until all tasks assigned
```

**Benefits:**
- Maximum parallelization
- Clear progress tracking
- Natural checkpoints between waves

### Computation Algorithm

```
1. Build dependency graph from task dependencies
2. Find all tasks with no unresolved dependencies → Wave 0
3. Mark Wave 0 as "resolved"
4. Find all tasks whose dependencies are all resolved → Wave 1
5. Mark the new wave as "resolved" and repeat step 4 for the next wave
   until all tasks are assigned (if no task qualifies while tasks remain,
   the graph contains a cycle)
```
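The steps above can be sketched as a layered topological sort. This is a minimal illustration, not the project's implementation: `compute_waves` and the dictionary input shape are assumptions, and the cycle and unknown-dependency checks are additions that the listed steps imply but do not spell out.

```python
from typing import Dict, List, Set


def compute_waves(dependencies: Dict[str, Set[str]]) -> List[List[str]]:
    """Assign tasks to waves.

    `dependencies` maps each task id to the set of task ids it depends on.
    Returns a list of waves; tasks within a wave can run in parallel.
    """
    unknown = {d for deps in dependencies.values() for d in deps} - set(dependencies)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")

    resolved: Set[str] = set()
    remaining = dict(dependencies)
    waves: List[List[str]] = []
    while remaining:
        # A task is ready when every dependency is already resolved.
        ready = sorted(t for t, deps in remaining.items() if deps <= resolved)
        if not ready:
            raise ValueError(f"dependency cycle among: {sorted(remaining)}")
        waves.append(ready)
        resolved.update(ready)
        for task in ready:
            del remaining[task]
    return waves
```

Tasks inside each wave are sorted only to make the output deterministic; the ordering within a wave carries no meaning.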

---

## Context Handoff

When context fills, perform a controlled handoff:

### STATE.md Update
Before handoff, update session state:

```yaml
position:
  phase: 2
  plan: 3
  task: "Implement refresh token rotation"
  wave: 1

decisions:
  - "Using jose library for JWT (not jsonwebtoken)"
  - "Refresh tokens stored in httpOnly cookie, not localStorage"
  - "15min access token, 7day refresh token"

blockers:
  - "Waiting for user to configure OAuth credentials"

next_action: "Continue with task after blocker resolved"
```
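A session could produce that state block from a small data structure before handing off. The sketch below is hypothetical throughout (`SessionState` and `render_state` are illustrative names, not Codewalk APIs) and uses only the standard library: strings are quoted with `json.dumps`, whose output is also a valid YAML double-quoted scalar.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class SessionState:
    phase: int
    plan: int
    task: str
    wave: int
    decisions: List[str] = field(default_factory=list)
    blockers: List[str] = field(default_factory=list)
    next_action: str = ""


def _yaml_list(key: str, items: List[str]) -> List[str]:
    """Render a list of strings as a YAML block sequence (or [] when empty)."""
    if not items:
        return [f"{key}: []"]
    return [f"{key}:"] + [f"  - {json.dumps(item)}" for item in items]


def render_state(state: SessionState) -> str:
    """Render the state in the same shape as the STATE.md example."""
    lines = [
        "position:",
        f"  phase: {state.phase}",
        f"  plan: {state.plan}",
        f"  task: {json.dumps(state.task)}",
        f"  wave: {state.wave}",
        "",
    ]
    lines += _yaml_list("decisions", state.decisions) + [""]
    lines += _yaml_list("blockers", state.blockers) + [""]
    lines.append(f"next_action: {json.dumps(state.next_action)}")
    return "\n".join(lines) + "\n"
```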

### Handoff Content
New session receives:
- STATE.md (current position)
- Relevant SUMMARY.md files (prior work in this phase)
- Current PLAN.md (if executing)
- Task context from initiative

---

## Anti-Patterns

### Context Stuffing
- **Wrong:** Loading entire codebase at session start
- **Right:** Load files on-demand as tasks require them

### Orchestrator Processing
- **Wrong:** Orchestrator reads all code and makes decisions
- **Right:** Orchestrator routes to specialized agents who do the work

### Plan Bloat
- **Wrong:** 10-task plans to "reduce coordination overhead"
- **Right:** 2-3 task plans that fit in 50% context

### No Handoff State
- **Wrong:** Agent restarts with no memory of prior work
- **Right:** STATE.md preserves position, decisions, blockers

---

## Monitoring

Track context utilization across the system:

| Metric | Threshold | Action |
|--------|-----------|--------|
| Orchestrator context | >50% | Trigger handoff |
| Worker task context | >60% | Flag task as oversized |
| Plan total estimate | >50% | Split plan before execution |
| Average task context | >40% | Review decomposition strategy |
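The thresholds in the table lend themselves to a simple lookup that returns the actions currently due. This is a sketch with hypothetical metric keys and function name; how the metrics are actually collected is outside what this document specifies.

```python
from typing import Dict, List, Tuple

# (threshold in percent, action) per metric, copied from the table.
THRESHOLDS: Dict[str, Tuple[float, str]] = {
    "orchestrator_context": (50, "Trigger handoff"),
    "worker_task_context": (60, "Flag task as oversized"),
    "plan_total_estimate": (50, "Split plan before execution"),
    "average_task_context": (40, "Review decomposition strategy"),
}


def actions_for(metrics: Dict[str, float]) -> List[str]:
    """Return the actions whose thresholds are strictly exceeded."""
    actions = []
    for name, value in metrics.items():
        if name not in THRESHOLDS:
            raise KeyError(f"unknown metric: {name}")
        limit, action = THRESHOLDS[name]
        if value > limit:
            actions.append(f"{name}: {action}")
    return actions
```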

---

## Implementation Notes

### Context Estimation
Estimate context usage before execution:
- File reads: ~1-2% per file (varies by size)
- Code changes: ~0.5% per change
- Tool outputs: ~1% per tool call
- Discussion: ~2-5% per exchange
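Summing those per-item figures gives a low and a high estimate for a task. The helper below is a rough sketch with a hypothetical name and signature; it simply multiplies counts by the percentages listed above and treats the ranges as bounds.

```python
from typing import Tuple


def estimate_context_pct(
    file_reads: int, code_changes: int, tool_calls: int, exchanges: int
) -> Tuple[float, float]:
    """Return (low, high) estimated context usage in percent."""
    if min(file_reads, code_changes, tool_calls, exchanges) < 0:
        raise ValueError("counts must be non-negative")
    fixed = code_changes * 0.5 + tool_calls * 1.0   # single-valued rates
    low = file_reads * 1.0 + exchanges * 2.0 + fixed
    high = file_reads * 2.0 + exchanges * 5.0 + fixed
    return low, high
```

As a worked example, a task with 8 file reads, 6 code changes, 10 tool calls, and 2 discussion exchanges comes out at 8 + 3 + 10 + 4 = 25% on the low end and 16 + 3 + 10 + 10 = 39% on the high end, which fits under the 50% ceiling either way.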

### Fresh Context Triggers
- Worker: Always fresh per task
- Architect: Fresh per initiative
- Verifier: Fresh per phase
- Orchestrator: Handoff at 50%

### Subagent Spawning
When spawning subagents:
1. Provide focused context (only what's needed)
2. Give clear instructions (specific task, expected output)
3. Collect structured results
4. Update state with outcomes
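The four steps can be sketched as a thin wrapper around whatever actually runs the subagent. Everything here is hypothetical: `TaskBrief`, `TaskResult`, `spawn_subagent`, and the injected `run` callable are illustrative stand-ins, since this document does not describe Codewalk's spawning API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TaskBrief:
    task: str               # step 2: the specific task
    expected_output: str    # step 2: what the subagent should return
    files: List[str]        # step 1: only the context that is needed


@dataclass
class TaskResult:
    task: str
    status: str             # e.g. "done" or "blocked"
    summary: str


def spawn_subagent(
    brief: TaskBrief,
    run: Callable[[TaskBrief], TaskResult],
    state: Dict[str, List[str]],
) -> TaskResult:
    """Run one subagent in a fresh context and record its outcome."""
    result = run(brief)                                   # step 3: structured result
    state.setdefault(result.status, []).append(           # step 4: update state
        f"{result.task}: {result.summary}"
    )
    return result
```

Passing `run` in as a parameter keeps the orchestrator thin: it never reads the files listed in the brief itself, it only forwards them and records the summary that comes back.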