Add userDismissedAt field to agents schema

2026-02-07 00:33:12 +01:00
parent 111ed0962f
commit 2877484012
224 changed files with 30873 additions and 4672 deletions
--- a/docs/context-engineering.md
+++ b/docs/context-engineering.md
@@ -0,0 +1,218 @@
+# Context Engineering
+
+Context engineering is a first-class concern in Codewalk. Agent output quality degrades predictably as context fills. This document defines the rules that all agents must follow.
+
+## Quality Degradation Curve
+
+Claude's output quality follows a predictable curve based on context utilization:
+
+| Context Usage | Quality Level | Behavior |
+|---------------|---------------|----------|
+| 0-30% | **PEAK** | Thorough, comprehensive, considers edge cases |
+| 30-50% | **GOOD** | Confident, solid work, reliable output |
+| 50-70% | **DEGRADING** | Efficiency mode begins, shortcuts appear |
+| 70%+ | **POOR** | Rushed, minimal, misses requirements |
+
+**Rule: Stay UNDER 50% context for quality work.**
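+
+As a minimal sketch, the table can be read as a lookup from context usage to quality band. The function name `quality_level` and the handling of boundary values (exactly 30%, 50%, and 70% fall into the lower-quality band) are assumptions made for illustration, not part of Codewalk:
+
+```python
+def quality_level(usage_pct: float) -> str:
+    """Map context usage (0-100) to a quality band from the table above.
+
+    Boundary handling is an assumption: a value on a boundary is
+    assigned to the lower-quality band.
+    """
+    if not 0 <= usage_pct <= 100:
+        raise ValueError("usage_pct must be between 0 and 100")
+    if usage_pct < 30:
+        return "PEAK"
+    if usage_pct < 50:
+        return "GOOD"
+    if usage_pct < 70:
+        return "DEGRADING"
+    return "POOR"
+```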
+
+---
+
+## Orchestrator Pattern
+
+Codewalk uses thin orchestration with heavy subagent work:
+
+```
+┌─────────────────────────────────────────────────────────────┐
+│                    Orchestrator (30-40%)                    │
+│  - Routes work to specialized agents                        │
+│  - Collects results                                         │
+│  - Maintains state                                          │
+│  - Coordinates across phases                                │
+└─────────────────────────────────────────────────────────────┘
+                              │
+           ┌──────────────────┼──────────────────┐
+           ▼                  ▼                  ▼
+    ┌─────────────┐    ┌─────────────┐    ┌─────────────┐
+    │   Worker    │    │  Architect  │    │  Verifier   │
+    │  (200k ctx) │    │  (200k ctx) │    │  (200k ctx) │
+    │  Fresh per  │    │  Fresh per  │    │  Fresh per  │
+    │    task     │    │  initiative │    │    phase    │
+    └─────────────┘    └─────────────┘    └─────────────┘
+```
+
+**Key insight:** Each subagent gets a fresh 200k context window. Heavy work happens there, not in the orchestrator.
+
+---
+
+## Context Budgets by Role
+
+### Orchestrator
+- **Target:** 30-40% max
+- **Strategy:** Route, don't process. Collect results, don't analyze.
+- **Reset trigger:** Context exceeds 50%
+
+### Worker
+- **Target:** 50% per task
+- **Strategy:** Single task per context. Fresh context for each task.
+- **Reset trigger:** Task completion (always)
+
+### Architect
+- **Target:** 60% per initiative analysis
+- **Strategy:** Initiative discussion + planning in single context
+- **Reset trigger:** Work plan generated or context exceeds 70%
+
+### Verifier
+- **Target:** 40% per phase verification
+- **Strategy:** Goal-backward verification, gap identification
+- **Reset trigger:** Verification complete
+
+---
+
+## Task Sizing Rules
+
+Tasks are sized to fit context budgets:
+
+| Task Complexity | Context Estimate | Example |
+|-----------------|------------------|---------|
+| Simple | 10-20% | Add a field to an existing form |
+| Medium | 20-35% | Create new API endpoint with validation |
+| Complex | 35-50% | Implement auth flow with refresh tokens |
+| Too Large | >50% | **SPLIT INTO SUBTASKS** |
+
+**Planning rule:** No single task should require >50% context. If estimation suggests otherwise, decompose before execution.
+
+---
+
+## Plan Sizing
+
+Plans group up to three related tasks for sequential execution, typically 2-3:
+
+| Plan Size | Target Context | Notes |
+|-----------|----------------|-------|
+| Minimal (1 task) | 20-30% | Simple independent work |
+| Standard (2-3 tasks) | 40-50% | Related work, shared context |
+| Maximum | 50% | Never exceed; quality degrades |
+
+**Why 2-3 tasks?** Shared context reduces overhead (file reads, understanding). More than 3 tasks tends to push a plan past the 50% budget, where quality degrades.
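+
+One way to apply these limits is a greedy pass that packs tasks, in order, into plans of at most three tasks and at most 50% estimated context. This is a sketch under assumptions: `group_into_plans` is a hypothetical helper, and it treats task estimates as additive, which ignores any savings from shared context:
+
+```python
+from typing import List, Tuple
+
+MAX_TASKS_PER_PLAN = 3
+MAX_PLAN_CONTEXT = 50.0  # percent of the context window
+
+
+def group_into_plans(tasks: List[Tuple[str, float]]) -> List[List[str]]:
+    """Greedily pack (name, estimate_pct) tasks, in order, into plans."""
+    plans: List[List[str]] = []
+    current: List[str] = []
+    used = 0.0
+    for name, pct in tasks:
+        if pct > MAX_PLAN_CONTEXT:
+            # Oversized tasks must be decomposed before planning.
+            raise ValueError(f"{name} exceeds {MAX_PLAN_CONTEXT}% and must be split first")
+        plan_full = len(current) == MAX_TASKS_PER_PLAN
+        over_budget = used + pct > MAX_PLAN_CONTEXT
+        if current and (plan_full or over_budget):
+            plans.append(current)
+            current, used = [], 0.0
+        current.append(name)
+        used += pct
+    if current:
+        plans.append(current)
+    return plans
+```
+
+Because the pass keeps the original task order, related tasks stay together only if they are already adjacent in the input.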
+
+---
+
+## Wave-Based Parallelization
+
+Compute the dependency graph and assign tasks to waves:
+
+```
+Wave 0: Tasks with no dependencies (run in parallel)
+   ↓
+Wave 1: Tasks depending only on Wave 0 (run in parallel)
+   ↓
+Wave 2: Tasks depending only on Wave 0-1 (run in parallel)
+   ↓
+...continue until all tasks assigned
+```
+
+**Benefits:**
+- Maximum parallelization
+- Clear progress tracking
+- Natural checkpoints between waves
+
+### Computation Algorithm
+
+```
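+1. Build dependency graph from task dependencies
+2. Find all tasks with no dependencies → Wave 0
+3. Mark the tasks in the current wave as "resolved"
+4. Find all unassigned tasks whose dependencies are all resolved → next wave
+5. Repeat steps 3-4 until all tasks are assigned
+6. If no task qualifies while tasks remain unassigned, the graph has a cycle: stop and report it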
+```
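+
+The steps above can be sketched in a few lines of Python. The function name `compute_waves` and the input shape (a mapping from each task to the set of tasks it depends on) are assumptions for illustration; the cycle and unknown-dependency checks are additions that the algorithm above does not spell out:
+
+```python
+from typing import Dict, List, Set
+
+
+def compute_waves(deps: Dict[str, Set[str]]) -> List[List[str]]:
+    """Assign tasks to waves; deps maps each task to its dependencies."""
+    unknown = {d for ds in deps.values() for d in ds} - set(deps)
+    if unknown:
+        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
+
+    resolved: Set[str] = set()
+    waves: List[List[str]] = []
+    while len(resolved) < len(deps):
+        # A task is ready once every one of its dependencies is resolved.
+        wave = sorted(
+            task for task, ds in deps.items()
+            if task not in resolved and ds <= resolved
+        )
+        if not wave:
+            # Tasks remain but none is ready: the graph has a cycle.
+            raise ValueError("dependency cycle detected")
+        waves.append(wave)
+        resolved.update(wave)
+    return waves
+```
+
+Tasks within each returned wave have no dependencies on one another, so they can run in parallel; sorting only makes the output deterministic.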
+
+---
+
+## Context Handoff
+
+When context fills, perform a controlled handoff:
+
+### STATE.md Update
+Before handoff, update session state:
+
+```yaml
+position:
+  phase: 2
+  plan: 3
+  task: "Implement refresh token rotation"
+  wave: 1
+
+decisions:
+  - "Using jose library for JWT (not jsonwebtoken)"
+  - "Refresh tokens stored in httpOnly cookie, not localStorage"
+  - "15-minute access token, 7-day refresh token"
+
+blockers:
+  - "Waiting for user to configure OAuth credentials"
+
+next_action: "Continue with task after blocker resolved"
+```
+
+### Handoff Content
+The new session receives:
+- STATE.md (current position)
+- Relevant SUMMARY.md files (prior work in this phase)
+- Current PLAN.md (if executing)
+- Task context from initiative
+
+---
+
+## Anti-Patterns
+
+### Context Stuffing
+- **Wrong:** Loading entire codebase at session start
+- **Right:** Load files on-demand as tasks require them
+
+### Orchestrator Processing
+- **Wrong:** Orchestrator reads all code and makes decisions
+- **Right:** Orchestrator routes to specialized agents who do the work
+
+### Plan Bloat
+- **Wrong:** 10-task plans to "reduce coordination overhead"
+- **Right:** 2-3 task plans that fit in 50% context
+
+### No Handoff State
+- **Wrong:** Agent restarts with no memory of prior work
+- **Right:** STATE.md preserves position, decisions, blockers
+
+---
+
+## Monitoring
+
+Track context utilization across the system:
+
+| Metric | Threshold | Action |
+|--------|-----------|--------|
+| Orchestrator context | >50% | Trigger handoff |
+| Worker task context | >60% | Flag task as oversized |
+| Plan total estimate | >50% | Split plan before execution |
+| Average task context | >40% | Review decomposition strategy |
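+
+The table translates directly into a threshold check. In this sketch the metric keys (`orchestrator_context` and so on) and the `required_actions` helper are hypothetical names; only the thresholds and actions come from the table:
+
+```python
+from typing import Dict, List, Tuple
+
+# metric name -> (threshold in percent, action when exceeded)
+THRESHOLDS: Dict[str, Tuple[float, str]] = {
+    "orchestrator_context": (50.0, "Trigger handoff"),
+    "worker_task_context": (60.0, "Flag task as oversized"),
+    "plan_total_estimate": (50.0, "Split plan before execution"),
+    "average_task_context": (40.0, "Review decomposition strategy"),
+}
+
+
+def required_actions(metrics: Dict[str, float]) -> List[str]:
+    """Return the action for every metric strictly above its threshold."""
+    return [
+        action
+        for name, (limit, action) in THRESHOLDS.items()
+        if metrics.get(name, 0.0) > limit
+    ]
+```
+
+Metrics missing from the input are treated as 0%, so a partial snapshot never triggers an action by accident.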
+
+---
+
+## Implementation Notes
+
+### Context Estimation
+Estimate context usage before execution:
+- File reads: ~1-2% per file (varies by size)
+- Code changes: ~0.5% per change
+- Tool outputs: ~1% per tool call
+- Discussion: ~2-5% per exchange
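+
+These per-item costs can be combined into a rough low/high estimate before a task runs. The helper below is a sketch: `estimate_context` and its parameter names are assumptions, and it simply sums the figures listed above, using the range endpoints where a range is given:
+
+```python
+from typing import Tuple
+
+
+def estimate_context(
+    file_reads: int = 0,
+    code_changes: int = 0,
+    tool_calls: int = 0,
+    exchanges: int = 0,
+) -> Tuple[float, float]:
+    """Return a (low, high) estimate of context usage in percent."""
+    fixed = code_changes * 0.5 + tool_calls * 1.0
+    low = file_reads * 1.0 + exchanges * 2.0 + fixed
+    high = file_reads * 2.0 + exchanges * 5.0 + fixed
+    return low, high
+
+
+# Example: 8 file reads, 6 code changes, 10 tool calls, 2 exchanges.
+low, high = estimate_context(file_reads=8, code_changes=6, tool_calls=10, exchanges=2)
+print(low, high)  # 25.0 39.0
+```
+
+Comparing the high end against the 50% limit gives a conservative check on whether a task needs to be split.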
+
+### Fresh Context Triggers
+- Worker: Always fresh per task
+- Architect: Fresh per initiative
+- Verifier: Fresh per phase
+- Orchestrator: Handoff at 50%
+
+### Subagent Spawning
+When spawning subagents:
+1. Provide focused context (only what's needed)
+2. Give clear instructions (specific task, expected output)
+3. Collect structured results
+4. Update state with outcomes