Add userDismissedAt field to agents schema

Lukas May
2026-02-07 00:33:12 +01:00
parent 111ed0962f
commit 2877484012
224 changed files with 30873 additions and 4672 deletions

267
docs/model-profiles.md Normal file

@@ -0,0 +1,267 @@
# Model Profiles
Different agent roles have different needs. Model selection balances quality, cost, and latency.
## Profile Definitions
| Profile | Use Case | Cost | Quality |
|---------|----------|------|---------|
| **quality** | Critical decisions, architecture | Highest | Best |
| **balanced** | Default for most work | Medium | Good |
| **budget** | High-volume, low-risk tasks | Lowest | Acceptable |
---
## Agent Model Assignments
| Agent | Quality | Balanced (Default) | Budget |
|-------|---------|-------------------|--------|
| **Architect** | Opus | Opus | Sonnet |
| **Worker** | Opus | Sonnet | Sonnet |
| **Verifier** | Sonnet | Sonnet | Haiku |
| **Orchestrator** | Sonnet | Sonnet | Haiku |
| **Monitor** | Sonnet | Haiku | Haiku |
| **Researcher** | Opus | Sonnet | Haiku |
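
The assignment table above can be encoded as a plain lookup. This is a minimal TypeScript sketch, assuming lowercase identifiers for agents, profiles, and models; `MODEL_ASSIGNMENTS` and `modelFor` are hypothetical names for illustration, not part of the documented API. Calling `modelFor("monitor")` with no profile falls back to the balanced column.

```typescript
type Profile = "quality" | "balanced" | "budget";
type Model = "opus" | "sonnet" | "haiku";
type Agent =
  | "architect"
  | "worker"
  | "verifier"
  | "orchestrator"
  | "monitor"
  | "researcher";

// Mirrors the "Agent Model Assignments" table row by row.
const MODEL_ASSIGNMENTS: Record<Agent, Record<Profile, Model>> = {
  architect:    { quality: "opus",   balanced: "opus",   budget: "sonnet" },
  worker:       { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
  verifier:     { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  orchestrator: { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  monitor:      { quality: "sonnet", balanced: "haiku",  budget: "haiku" },
  researcher:   { quality: "opus",   balanced: "sonnet", budget: "haiku" },
};

// Hypothetical helper: "balanced" is the default profile, as in the table header.
function modelFor(agent: Agent, profile: Profile = "balanced"): Model {
  return MODEL_ASSIGNMENTS[agent][profile];
}
```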
---
## Rationale
### Architect (Planning) - Opus/Opus/Sonnet
Planning has the highest impact on outcomes. A bad plan wastes all downstream execution. Invest in quality here.
- **Quality profile:** Complex systems, novel domains, critical decisions
- **Balanced profile:** Standard feature work, established patterns
- **Budget profile:** Simple initiatives, well-documented domains
### Worker (Execution) - Opus/Sonnet/Sonnet
The plan already contains reasoning. Execution is implementation, not decision-making.
- **Quality profile:** Complex algorithms, security-critical code
- **Balanced profile:** Standard implementation work
- **Budget profile:** Simple tasks, boilerplate code
### Verifier (Validation) - Sonnet/Sonnet/Haiku
Verification is structured checking against defined criteria. Less reasoning needed than planning.
- **Quality profile:** Complex verification, subtle integration issues
- **Balanced profile:** Standard goal-backward verification
- **Budget profile:** Simple pass/fail checks
### Orchestrator (Coordination) - Sonnet/Sonnet/Haiku
The orchestrator routes work; it doesn't do the heavy lifting. It needs reliability, not creativity.
- **Quality profile:** Complex multi-agent coordination
- **Balanced profile:** Standard workflow management
- **Budget profile:** Simple task routing
### Monitor (Observation) - Sonnet/Haiku/Haiku
Monitoring is pattern matching and threshold checking. Minimal reasoning required.
- **Quality profile:** Complex health analysis
- **Balanced profile:** Standard monitoring
- **Budget profile:** Simple heartbeat checks
### Researcher (Discovery) - Opus/Sonnet/Haiku
Research is read-only exploration. High volume, low modification risk.
- **Quality profile:** Deep domain analysis
- **Balanced profile:** Standard codebase exploration
- **Budget profile:** Simple file lookups
---
## Profile Selection
### Per-Initiative Override
```yaml
# In initiative config
model_profile: quality # Override default balanced
```
### Per-Task Override
```yaml
# In task assignment
assigned_to: worker-123
model_override: opus # This task needs Opus
```
### Automatic Escalation
```yaml
# When to auto-escalate
escalation_triggers:
  - condition: "task.retry_count > 2"
    action: "escalate_model"
  - condition: "task.complexity == 'high'"
    action: "use_quality_profile"
  - condition: "deviation.rule == 4"
    action: "escalate_model"
```
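
The triggers above could be evaluated along these lines. This is a sketch only: the `TaskState` shape, the `"low"` and `"medium"` complexity values, and the reading of `escalate_model` as "move one tier up the haiku, sonnet, opus ladder" are all assumptions, since the configuration does not define them.

```typescript
type Model = "haiku" | "sonnet" | "opus";
type EscalationAction = "escalate_model" | "use_quality_profile";

// Hypothetical task snapshot; field names mirror the YAML conditions.
interface TaskState {
  retryCount: number;
  complexity: "low" | "medium" | "high"; // only "high" appears in the config
  deviationRule?: number;
}

// Assumed ordering from cheapest to most capable.
const MODEL_LADDER: readonly Model[] = ["haiku", "sonnet", "opus"];

// Returns the action of every trigger whose condition holds, in config order.
function triggeredActions(task: TaskState): EscalationAction[] {
  const actions: EscalationAction[] = [];
  if (task.retryCount > 2) actions.push("escalate_model");
  if (task.complexity === "high") actions.push("use_quality_profile");
  if (task.deviationRule === 4) actions.push("escalate_model");
  return actions;
}

// Assumed meaning of "escalate_model": one step up, capped at the top tier.
function escalateModel(current: Model): Model {
  const next = Math.min(MODEL_LADDER.indexOf(current) + 1, MODEL_LADDER.length - 1);
  return MODEL_LADDER[next];
}
```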
---
## Cost Management
### Estimated Token Usage
| Agent | Avg Tokens/Task | Profile Impact |
|-------|-----------------|----------------|
| Architect | 50k-100k | 3x between budget/quality |
| Worker | 20k-50k | 2x between budget/quality |
| Verifier | 10k-30k | 1.5x between budget/quality |
| Orchestrator | 5k-15k | 1.5x between budget/quality |
### Cost Optimization Strategies
1. **Right-size tasks:** Smaller tasks use fewer tokens
2. **Use budget for volume:** Monitoring, simple checks
3. **Reserve quality for impact:** Architecture, security
4. **Profile per initiative:** Simple features use budget; complex ones use quality
---
## Configuration
### Default Profile
```json
// .planning/config.json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": null,
    "worker": null,
    "verifier": null
  }
}
```
### Quality Profile
```json
{
  "model_profile": "quality",
  "model_overrides": {}
}
```
### Budget Profile
```json
{
  "model_profile": "budget",
  "model_overrides": {
    "architect": "sonnet" // Keep architect at sonnet minimum
  }
}
```
### Mixed Profile
```json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": "opus", // Invest in planning
    "worker": "sonnet", // Standard execution
    "verifier": "haiku" // Budget verification
  }
}
```
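
One way to read these configs is that a non-null entry in `model_overrides` wins over the profile default, while `null` or a missing key falls through to the profile. The document does not state that precedence, so it is an assumption here, and `PlanningConfig` and `effectiveModel` are hypothetical names. The profile lookup is passed in as a function so the sketch does not depend on how the assignment table is stored.

```typescript
type Profile = "quality" | "balanced" | "budget";
type Model = "opus" | "sonnet" | "haiku";

// Shape of .planning/config.json as shown in the examples above.
interface PlanningConfig {
  model_profile: Profile;
  model_overrides: { [agent: string]: Model | null | undefined };
}

// Assumed precedence: explicit override first, then the profile default.
function effectiveModel(
  config: PlanningConfig,
  agent: string,
  profileDefault: (agent: string, profile: Profile) => Model,
): Model {
  const override = config.model_overrides[agent];
  // null and undefined both mean "no override".
  return override ?? profileDefault(agent, config.model_profile);
}

// Usage with the "Mixed Profile" config and a stand-in default lookup.
const mixed: PlanningConfig = {
  model_profile: "balanced",
  model_overrides: { architect: "opus", worker: "sonnet", verifier: "haiku" },
};
const sonnetEverywhere = (): Model => "sonnet"; // placeholder for the real table
const verifierModel = effectiveModel(mixed, "verifier", sonnetEverywhere);
```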
---
## Model Capabilities Reference
### Opus
- **Strengths:** Complex reasoning, nuanced decisions, novel problems
- **Best for:** Architecture, complex algorithms, security analysis
- **Cost:** Highest
### Sonnet
- **Strengths:** Good balance of reasoning and speed, reliable
- **Best for:** Standard development, code generation, debugging
- **Cost:** Medium
### Haiku
- **Strengths:** Fast, cheap, good for structured tasks
- **Best for:** Monitoring, simple checks, high-volume operations
- **Cost:** Lowest
---
## Profile Switching
### CLI Command
```bash
# Set profile for all future work
cw config set model_profile quality
# Set profile for specific initiative
cw initiative config <id> --model-profile budget
# Override for single task
cw task update <id> --model-override opus
```
### API
```typescript
// Set initiative profile
await initiative.setConfig(id, { modelProfile: 'quality' });
// Override task model
await task.update(id, { modelOverride: 'opus' });
```
---
## Monitoring Model Usage
Track model usage for cost analysis:
```sql
CREATE TABLE model_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_type TEXT NOT NULL,
  model TEXT NOT NULL,
  tokens_input INTEGER,
  tokens_output INTEGER,
  task_id TEXT,
  initiative_id TEXT,
  created_at INTEGER DEFAULT (unixepoch())
);

-- Usage by agent type
SELECT agent_type, model, SUM(tokens_input + tokens_output) AS total_tokens
FROM model_usage
GROUP BY agent_type, model;

-- Cost by initiative
-- The rates are placeholders; substitute current pricing and scale them
-- if your rates are quoted per 1K or per 1M tokens.
SELECT initiative_id,
       SUM(CASE WHEN model = 'opus' THEN (tokens_input + tokens_output) * 0.015
                WHEN model = 'sonnet' THEN (tokens_input + tokens_output) * 0.003
                WHEN model = 'haiku' THEN (tokens_input + tokens_output) * 0.0003 END) AS estimated_cost
FROM model_usage
GROUP BY initiative_id;
```
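
The same per-initiative estimate can be computed in application code once the rows are loaded. This is a sketch under assumptions: `UsageRow` is a hypothetical camelCase view of the `model_usage` columns, and the rate multipliers are copied from the query above as placeholders, not as real pricing.

```typescript
type Model = "opus" | "sonnet" | "haiku";

// Hypothetical in-memory view of a model_usage row.
interface UsageRow {
  initiativeId: string;
  model: Model;
  tokensInput: number;
  tokensOutput: number;
}

// Placeholder multipliers taken from the SQL example; replace with real rates.
const RATE: Record<Model, number> = { opus: 0.015, sonnet: 0.003, haiku: 0.0003 };

// Sums (input + output tokens) * rate per initiative.
function costByInitiative(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const cost = (row.tokensInput + row.tokensOutput) * RATE[row.model];
    totals.set(row.initiativeId, (totals.get(row.initiativeId) ?? 0) + cost);
  }
  return totals;
}
```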
---
## Recommendations
### Starting Out
Use the **balanced** profile. It provides good quality at reasonable cost.
### High-Stakes Projects
Use the **quality** profile. The extra cost is negligible compared to the cost of getting it wrong.
### High-Volume Work
Use the **budget** profile and keep the architect on Sonnet, which is already the budget default and can be pinned with an explicit override. Don't skimp on planning.
### Learning the System
Use the **quality** profile initially. See what good output looks like before optimizing for cost.