Add userDismissedAt field to agents schema

2026-02-07 00:33:12 +01:00
parent 111ed0962f
commit 2877484012
224 changed files with 30873 additions and 4672 deletions
--- a/docs/model-profiles.md
+++ b/docs/model-profiles.md
@@ -0,0 +1,267 @@
+# Model Profiles
+
+Different agent roles have different needs. Model selection balances quality, cost, and latency.
+
+## Profile Definitions
+
+| Profile | Use Case | Cost | Quality |
+|---------|----------|------|---------|
+| **quality** | Critical decisions, architecture | Highest | Best |
+| **balanced** | Default for most work | Medium | Good |
+| **budget** | High-volume, low-risk tasks | Lowest | Acceptable |
+
+---
+
+## Agent Model Assignments
+
+| Agent | Quality | Balanced (Default) | Budget |
+|-------|---------|-------------------|--------|
+| **Architect** | Opus | Opus | Sonnet |
+| **Worker** | Opus | Sonnet | Sonnet |
+| **Verifier** | Sonnet | Sonnet | Haiku |
+| **Orchestrator** | Sonnet | Sonnet | Haiku |
+| **Monitor** | Sonnet | Haiku | Haiku |
+| **Researcher** | Opus | Sonnet | Haiku |
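+
+The table above can also be read as a typed lookup. The sketch below is illustrative only: the names `AgentRole`, `MODEL_ASSIGNMENTS`, and `modelFor` are hypothetical and not part of the codebase; only the cell values come from the table.
+
+```typescript
+// Hypothetical names; the values mirror the assignment table above.
+type Profile = "quality" | "balanced" | "budget";
+type Model = "opus" | "sonnet" | "haiku";
+type AgentRole =
+  | "architect"
+  | "worker"
+  | "verifier"
+  | "orchestrator"
+  | "monitor"
+  | "researcher";
+
+const MODEL_ASSIGNMENTS: Record<AgentRole, Record<Profile, Model>> = {
+  architect:    { quality: "opus",   balanced: "opus",   budget: "sonnet" },
+  worker:       { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
+  verifier:     { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
+  orchestrator: { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
+  monitor:      { quality: "sonnet", balanced: "haiku",  budget: "haiku" },
+  researcher:   { quality: "opus",   balanced: "sonnet", budget: "haiku" },
+};
+
+// Look up the model for a role; "balanced" is the documented default profile.
+function modelFor(role: AgentRole, profile: Profile = "balanced"): Model {
+  return MODEL_ASSIGNMENTS[role][profile];
+}
+```
+
+Keeping the table in one typed constant means a missing role or profile fails at compile time rather than at dispatch.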
+
+---
+
+## Rationale
+
+### Architect (Planning) - Opus/Opus/Sonnet
+Planning has the highest impact on outcomes. A bad plan wastes all downstream execution. Invest in quality here.
+
+- **Quality profile:** Complex systems, novel domains, critical decisions
+- **Balanced profile:** Standard feature work, established patterns
+- **Budget profile:** Simple initiatives, well-documented domains
+
+### Worker (Execution) - Opus/Sonnet/Sonnet
+The plan already contains the reasoning, so execution is implementation, not decision-making.
+
+- **Quality profile:** Complex algorithms, security-critical code
+- **Balanced profile:** Standard implementation work
+- **Budget profile:** Simple tasks, boilerplate code
+
+### Verifier (Validation) - Sonnet/Sonnet/Haiku
+Verification is structured checking against defined criteria. It needs less reasoning than planning.
+
+- **Quality profile:** Complex verification, subtle integration issues
+- **Balanced profile:** Standard goal-backward verification
+- **Budget profile:** Simple pass/fail checks
+
+### Orchestrator (Coordination) - Sonnet/Sonnet/Haiku
+The orchestrator routes work; it doesn't do the heavy lifting. It needs reliability, not creativity.
+
+- **Quality profile:** Complex multi-agent coordination
+- **Balanced profile:** Standard workflow management
+- **Budget profile:** Simple task routing
+
+### Monitor (Observation) - Sonnet/Haiku/Haiku
+Monitoring is pattern matching and threshold checking. Minimal reasoning required.
+
+- **Quality profile:** Complex health analysis
+- **Balanced profile:** Standard monitoring
+- **Budget profile:** Simple heartbeat checks
+
+### Researcher (Discovery) - Opus/Sonnet/Haiku
+Research is read-only exploration. High volume, low modification risk.
+
+- **Quality profile:** Deep domain analysis
+- **Balanced profile:** Standard codebase exploration
+- **Budget profile:** Simple file lookups
+
+---
+
+## Profile Selection
+
+### Per-Initiative Override
+
+```yaml
+# In initiative config
+model_profile: quality  # Override default balanced
+```
+
+### Per-Agent Override
+
+```yaml
+# In task assignment
+assigned_to: worker-123
+model_override: opus  # This task needs Opus
+```
+
+### Automatic Escalation
+
+```yaml
+# When to auto-escalate
+escalation_triggers:
+  - condition: "task.retry_count > 2"
+    action: "escalate_model"
+  - condition: "task.complexity == 'high'"
+    action: "use_quality_profile"
+  - condition: "deviation.rule == 4"
+    action: "escalate_model"
+```
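+
+One way the triggers above might be evaluated is sketched below. This is a sketch under assumptions, not the actual implementation: `TaskState`, `escalateModel`, and `triggeredActions` are hypothetical names, and treating `escalate_model` as "move one step up the Haiku, Sonnet, Opus ladder" is an assumption the YAML does not confirm.
+
+```typescript
+// Hypothetical types; field names follow the condition strings in the YAML above.
+type Model = "haiku" | "sonnet" | "opus";
+type EscalationAction = "escalate_model" | "use_quality_profile";
+
+interface TaskState {
+  retryCount: number;
+  complexity: "low" | "medium" | "high";
+  deviationRule?: number; // rule number of the deviation that was raised, if any
+}
+
+// Assumed ordering from cheapest to most capable.
+const MODEL_LADDER: readonly Model[] = ["haiku", "sonnet", "opus"];
+
+// Move one step up the ladder; the top model stays where it is.
+function escalateModel(model: Model): Model {
+  const next = Math.min(MODEL_LADDER.indexOf(model) + 1, MODEL_LADDER.length - 1);
+  return MODEL_LADDER[next];
+}
+
+// Return the actions whose conditions hold, in the order the YAML lists them.
+function triggeredActions(task: TaskState): EscalationAction[] {
+  const actions: EscalationAction[] = [];
+  if (task.retryCount > 2) actions.push("escalate_model");
+  if (task.complexity === "high") actions.push("use_quality_profile");
+  if (task.deviationRule === 4) actions.push("escalate_model");
+  return actions;
+}
+```
+
+Hard-coding the conditions avoids evaluating condition strings at runtime; a real implementation might instead parse them from the config.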
+
+---
+
+## Cost Management
+
+### Estimated Token Usage
+
+| Agent | Avg Tokens/Task | Profile Impact |
+|-------|-----------------|----------------|
+| Architect | 50k-100k | 3x between budget/quality |
+| Worker | 20k-50k | 2x between budget/quality |
+| Verifier | 10k-30k | 1.5x between budget/quality |
+| Orchestrator | 5k-15k | 1.5x between budget/quality |
+
+### Cost Optimization Strategies
+
+1. **Right-size tasks:** Smaller tasks = less token usage
+2. **Use budget for volume:** Monitoring, simple checks
+3. **Reserve quality for impact:** Architecture, security
+4. **Profile per initiative:** Simple features use budget, complex ones use quality
+
+---
+
+## Configuration
+
+### Default Profile
+
+```jsonc
+// .planning/config.json
+{
+  "model_profile": "balanced",
+  "model_overrides": {
+    "architect": null,
+    "worker": null,
+    "verifier": null
+  }
+}
+```
+
+### Quality Profile
+
+```json
+{
+  "model_profile": "quality",
+  "model_overrides": {}
+}
+```
+
+### Budget Profile
+
+```jsonc
+{
+  "model_profile": "budget",
+  "model_overrides": {
+    "architect": "sonnet"  // Keep architect at sonnet minimum
+  }
+}
+```
+
+### Mixed Profile
+
+```jsonc
+{
+  "model_profile": "balanced",
+  "model_overrides": {
+    "architect": "opus",     // Invest in planning
+    "worker": "sonnet",      // Standard execution
+    "verifier": "haiku"      // Budget verification
+  }
+}
+```
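+
+The configs above imply a resolution order when several settings apply to the same task. The sketch below assumes a precedence of task-level override, then the `model_overrides` entry, then the profile's default for the role, with `null` meaning "no override". That precedence and the names `PlanningConfig` and `resolveModel` are assumptions for illustration, not confirmed behavior.
+
+```typescript
+type Profile = "quality" | "balanced" | "budget";
+type Model = "opus" | "sonnet" | "haiku";
+
+// Shape of .planning/config.json as shown above.
+interface PlanningConfig {
+  model_profile: Profile;
+  model_overrides: Partial<Record<string, Model | null>>;
+}
+
+// Assumed precedence: task override > config override > profile default.
+// `profileDefault` is whatever the assignment table yields for this role.
+function resolveModel(
+  role: string,
+  config: PlanningConfig,
+  profileDefault: Model,
+  taskOverride?: Model,
+): Model {
+  // `??` skips both null and undefined, so a null override falls through.
+  return taskOverride ?? config.model_overrides[role] ?? profileDefault;
+}
+
+// Usage with the mixed profile from above.
+const mixed: PlanningConfig = {
+  model_profile: "balanced",
+  model_overrides: { architect: "opus", worker: "sonnet", verifier: "haiku" },
+};
+const verifierModel = resolveModel("verifier", mixed, "sonnet");
+```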
+
+---
+
+## Model Capabilities Reference
+
+### Opus
+- **Strengths:** Complex reasoning, nuanced decisions, novel problems
+- **Best for:** Architecture, complex algorithms, security analysis
+- **Cost:** Highest
+
+### Sonnet
+- **Strengths:** Good balance of reasoning and speed, reliable
+- **Best for:** Standard development, code generation, debugging
+- **Cost:** Medium
+
+### Haiku
+- **Strengths:** Fast, cheap, good for structured tasks
+- **Best for:** Monitoring, simple checks, high-volume operations
+- **Cost:** Lowest
+
+---
+
+## Profile Switching
+
+### CLI Command
+
+```bash
+# Set profile for all future work
+cw config set model_profile quality
+
+# Set profile for specific initiative
+cw initiative config <id> --model-profile budget
+
+# Override for single task
+cw task update <id> --model-override opus
+```
+
+### API
+
+```typescript
+// Set initiative profile
+await initiative.setConfig(id, { modelProfile: 'quality' });
+
+// Override task model
+await task.update(id, { modelOverride: 'opus' });
+```
+
+---
+
+## Monitoring Model Usage
+
+Track model usage for cost analysis:
+
+```sql
+CREATE TABLE model_usage (
+  id INTEGER PRIMARY KEY AUTOINCREMENT,
+  agent_type TEXT NOT NULL,
+  model TEXT NOT NULL,
+  tokens_input INTEGER,
+  tokens_output INTEGER,
+  task_id TEXT,
+  initiative_id TEXT,
+  created_at INTEGER DEFAULT (unixepoch())
+);
+
+-- Usage by agent type
+SELECT agent_type, model, SUM(tokens_input + tokens_output) as total_tokens
+FROM model_usage
+GROUP BY agent_type, model;
+
+-- Cost by initiative
+SELECT initiative_id,
+       SUM(CASE WHEN model = 'opus' THEN (tokens_input + tokens_output) * 0.015
+                WHEN model = 'sonnet' THEN (tokens_input + tokens_output) * 0.003
+                WHEN model = 'haiku' THEN (tokens_input + tokens_output) * 0.0003 END) as estimated_cost
+FROM model_usage
+GROUP BY initiative_id;
+```
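+
+The same per-initiative estimate can be computed in application code. The sketch below assumes rows shaped like `model_usage`; the camelCase field names, `UsageRow`, and `costByInitiative` are hypothetical. The rates are copied from the query above and carry whatever unit that query intends, and NULL token counts are treated as zero, which is an assumption.
+
+```typescript
+type Model = "opus" | "sonnet" | "haiku";
+
+// Hypothetical in-memory shape of a model_usage row.
+interface UsageRow {
+  initiativeId: string;
+  model: Model;
+  tokensInput: number | null;
+  tokensOutput: number | null;
+}
+
+// Rates copied from the SQL example; replace with real pricing before use.
+const RATE: Record<Model, number> = { opus: 0.015, sonnet: 0.003, haiku: 0.0003 };
+
+// Sum (input + output tokens) * rate per initiative.
+function costByInitiative(rows: UsageRow[]): Map<string, number> {
+  const totals = new Map<string, number>();
+  for (const row of rows) {
+    const tokens = (row.tokensInput ?? 0) + (row.tokensOutput ?? 0);
+    const previous = totals.get(row.initiativeId) ?? 0;
+    totals.set(row.initiativeId, previous + tokens * RATE[row.model]);
+  }
+  return totals;
+}
+```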
+
+---
+
+## Recommendations
+
+### Starting Out
+Use **balanced** profile. It provides good quality at reasonable cost.
+
+### High-Stakes Projects
+Use the **quality** profile. The extra model cost is negligible compared to the cost of getting it wrong.
+
+### High-Volume Work
+Use the **budget** profile with the architect overridden to stay at Sonnet or better. Don't skimp on planning.
+
+### Learning the System
+Use **quality** profile initially. See what good output looks like before optimizing for cost.