# Model Profiles

Different agent roles have different needs. Model selection balances quality, cost, and latency.

## Profile Definitions

| Profile | Use Case | Cost | Quality |
|---------|----------|------|---------|
| **quality** | Critical decisions, architecture | Highest | Best |
| **balanced** | Default for most work | Medium | Good |
| **budget** | High-volume, low-risk tasks | Lowest | Acceptable |

---

## Agent Model Assignments

| Agent | Quality | Balanced (Default) | Budget |
|-------|---------|-------------------|--------|
| **Architect** | Opus | Opus | Sonnet |
| **Worker** | Opus | Sonnet | Sonnet |
| **Verifier** | Sonnet | Sonnet | Haiku |
| **Orchestrator** | Sonnet | Sonnet | Haiku |
| **Monitor** | Sonnet | Haiku | Haiku |
| **Researcher** | Opus | Sonnet | Haiku |

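The assignment table can be encoded as a small typed lookup. This is a minimal sketch, not the project's actual implementation: the names `AgentRole`, `Profile`, `MODEL_ASSIGNMENTS`, and `modelFor` are hypothetical, and only the role-to-model mapping comes from the table.

```typescript
// Hypothetical types and names; only the mapping itself is taken from the table.
type Model = "opus" | "sonnet" | "haiku";
type Profile = "quality" | "balanced" | "budget";
type AgentRole =
  | "architect" | "worker" | "verifier"
  | "orchestrator" | "monitor" | "researcher";

const MODEL_ASSIGNMENTS: Record<AgentRole, Record<Profile, Model>> = {
  architect:    { quality: "opus",   balanced: "opus",   budget: "sonnet" },
  worker:       { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
  verifier:     { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  orchestrator: { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  monitor:      { quality: "sonnet", balanced: "haiku",  budget: "haiku" },
  researcher:   { quality: "opus",   balanced: "sonnet", budget: "haiku" },
};

// Look up the model for a role; "balanced" is the documented default profile.
function modelFor(role: AgentRole, profile: Profile = "balanced"): Model {
  return MODEL_ASSIGNMENTS[role][profile];
}
```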

---

## Rationale

### Architect (Planning) - Opus/Opus/Sonnet
Planning has the highest impact on outcomes. A bad plan wastes all downstream execution, so invest in quality here.

- **Quality profile:** Complex systems, novel domains, critical decisions
- **Balanced profile:** Standard feature work, established patterns
- **Budget profile:** Simple initiatives, well-documented domains

### Worker (Execution) - Opus/Sonnet/Sonnet
The plan already contains the reasoning. Execution is implementation, not decision-making.

- **Quality profile:** Complex algorithms, security-critical code
- **Balanced profile:** Standard implementation work
- **Budget profile:** Simple tasks, boilerplate code

### Verifier (Validation) - Sonnet/Sonnet/Haiku
Verification is structured checking against defined criteria. It needs less reasoning than planning.

- **Quality profile:** Complex verification, subtle integration issues
- **Balanced profile:** Standard goal-backward verification
- **Budget profile:** Simple pass/fail checks

### Orchestrator (Coordination) - Sonnet/Sonnet/Haiku
The Orchestrator routes work; it does not do the heavy lifting. It needs reliability, not creativity.

- **Quality profile:** Complex multi-agent coordination
- **Balanced profile:** Standard workflow management
- **Budget profile:** Simple task routing

### Monitor (Observation) - Sonnet/Haiku/Haiku
Monitoring is pattern matching and threshold checking. Minimal reasoning required.

- **Quality profile:** Complex health analysis
- **Balanced profile:** Standard monitoring
- **Budget profile:** Simple heartbeat checks

### Researcher (Discovery) - Opus/Sonnet/Haiku
Research is read-only exploration. High volume, low modification risk.

- **Quality profile:** Deep domain analysis
- **Balanced profile:** Standard codebase exploration
- **Budget profile:** Simple file lookups

---

## Profile Selection

### Per-Initiative Override

```yaml
# In initiative config
model_profile: quality  # Override default balanced
```

### Per-Agent Override

```yaml
# In task assignment
assigned_to: worker-123
model_override: opus  # This task needs Opus
```

### Automatic Escalation

```yaml
# When to auto-escalate
escalation_triggers:
  - condition: "task.retry_count > 2"
    action: "escalate_model"
  - condition: "task.complexity == 'high'"
    action: "use_quality_profile"
  - condition: "deviation.rule == 4"
    action: "escalate_model"
```

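One way to read these triggers: `escalate_model` moves a task up one tier (Haiku to Sonnet to Opus), and `use_quality_profile` switches the task to the quality profile. The document does not define either action, so the sketch below is an assumption, as are the `TaskState` shape and the `nextTier` and `escalationAction` names. The conditions are hard-coded rather than parsed from the YAML strings.

```typescript
type Model = "haiku" | "sonnet" | "opus";
type Action = "escalate_model" | "use_quality_profile";

// Hypothetical task snapshot; field names mirror the trigger conditions.
interface TaskState {
  retryCount: number;
  complexity: "low" | "medium" | "high";
  deviationRule?: number;
}

// Tiers ordered from cheapest to most capable.
const TIERS: Model[] = ["haiku", "sonnet", "opus"];

// Assumed meaning of "escalate_model": one tier up, capped at Opus.
function nextTier(model: Model): Model {
  const i = TIERS.indexOf(model);
  return TIERS[Math.min(i + 1, TIERS.length - 1)];
}

// Returns the first matching action, in the order the triggers are listed, or null.
function escalationAction(task: TaskState): Action | null {
  if (task.retryCount > 2) return "escalate_model";
  if (task.complexity === "high") return "use_quality_profile";
  if (task.deviationRule === 4) return "escalate_model";
  return null;
}
```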

---

## Cost Management

### Estimated Token Usage

| Agent | Avg Tokens/Task | Profile Impact |
|-------|-----------------|----------------|
| Architect | 50k-100k | 3x between budget and quality |
| Worker | 20k-50k | 2x between budget and quality |
| Verifier | 10k-30k | 1.5x between budget and quality |
| Orchestrator | 5k-15k | 1.5x between budget and quality |

### Cost Optimization Strategies

1. **Right-size tasks:** Smaller tasks use fewer tokens
2. **Use budget for volume:** Monitoring, simple checks
3. **Reserve quality for impact:** Architecture, security
4. **Profile per initiative:** Simple features use budget; complex ones use quality

---

## Configuration

### Default Profile

Set the default in `.planning/config.json`. A `null` override leaves the agent on the model assigned by the active profile.

```json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": null,
    "worker": null,
    "verifier": null
  }
}
```

### Quality Profile

```json
{
  "model_profile": "quality",
  "model_overrides": {}
}
```

### Budget Profile

The override keeps the Architect at Sonnet as a minimum:

```json
{
  "model_profile": "budget",
  "model_overrides": {
    "architect": "sonnet"
  }
}
```

### Mixed Profile

Invest in planning with Opus, keep execution on Sonnet, and run verification on Haiku:

```json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": "opus",
    "worker": "sonnet",
    "verifier": "haiku"
  }
}
```

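These configs imply a resolution order: a non-null entry in `model_overrides` wins, otherwise the profile's default applies. A minimal sketch of that lookup follows, assuming this precedence (the document does not state it explicitly). `PlanningConfig`, `resolveModel`, and the `profileDefault` callback are hypothetical names, and the stand-in default exists only to keep the example self-contained.

```typescript
type Model = "opus" | "sonnet" | "haiku";
type Profile = "quality" | "balanced" | "budget";

// Mirrors the shape of the .planning/config.json examples.
interface PlanningConfig {
  model_profile: Profile;
  model_overrides: Record<string, Model | null | undefined>;
}

// Assumed precedence: explicit per-agent override first, then the profile default.
function resolveModel(
  agent: string,
  config: PlanningConfig,
  profileDefault: (agent: string, profile: Profile) => Model,
): Model {
  const override = config.model_overrides[agent];
  return override ?? profileDefault(agent, config.model_profile);
}

// Example: the mixed profile, with a stand-in default that always returns Sonnet.
const mixed: PlanningConfig = {
  model_profile: "balanced",
  model_overrides: { architect: "opus", worker: "sonnet", verifier: "haiku" },
};
const standInDefault = (): Model => "sonnet";

const verifierModel = resolveModel("verifier", mixed, standInDefault); // override applies
const monitorModel = resolveModel("monitor", mixed, standInDefault);   // falls back to default
```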

---

## Model Capabilities Reference

### Opus
- **Strengths:** Complex reasoning, nuanced decisions, novel problems
- **Best for:** Architecture, complex algorithms, security analysis
- **Cost:** Highest

### Sonnet
- **Strengths:** Good balance of reasoning and speed, reliable
- **Best for:** Standard development, code generation, debugging
- **Cost:** Medium

### Haiku
- **Strengths:** Fast, cheap, good for structured tasks
- **Best for:** Monitoring, simple checks, high-volume operations
- **Cost:** Lowest

---

## Profile Switching

### CLI Command

```bash
# Set profile for all future work
cw config set model_profile quality

# Set profile for specific initiative
cw initiative config <id> --model-profile budget

# Override for single task
cw task update <id> --model-override opus
```

### API

```typescript
// Set initiative profile
await initiative.setConfig(id, { modelProfile: 'quality' });

// Override task model
await task.update(id, { modelOverride: 'opus' });
```

---

## Monitoring Model Usage

Track model usage for cost analysis:

```sql
CREATE TABLE model_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_type TEXT NOT NULL,
  model TEXT NOT NULL,
  tokens_input INTEGER,
  tokens_output INTEGER,
  task_id TEXT,
  initiative_id TEXT,
  created_at INTEGER DEFAULT (unixepoch())
);

-- Usage by agent type
SELECT agent_type, model, SUM(tokens_input + tokens_output) AS total_tokens
FROM model_usage
GROUP BY agent_type, model;

-- Cost by initiative
-- Rates are illustrative and assumed to be per 1,000 tokens.
SELECT initiative_id,
  SUM((tokens_input + tokens_output) / 1000.0 *
      CASE model
        WHEN 'opus' THEN 0.015
        WHEN 'sonnet' THEN 0.003
        WHEN 'haiku' THEN 0.0003
      END) AS estimated_cost
FROM model_usage
GROUP BY initiative_id;
```

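The per-initiative estimate can also be computed in application code. This is a minimal sketch, assuming the rates in the query are illustrative and expressed per 1,000 tokens; `UsageRow`, `RATE_PER_1K`, and `costByInitiative` are hypothetical names.

```typescript
type Model = "opus" | "sonnet" | "haiku";

// Hypothetical row shape matching the model_usage columns used here.
interface UsageRow {
  model: Model;
  tokens_input: number;
  tokens_output: number;
  initiative_id: string;
}

// Illustrative rates copied from the query, assumed to be per 1,000 tokens.
const RATE_PER_1K: Record<Model, number> = {
  opus: 0.015,
  sonnet: 0.003,
  haiku: 0.0003,
};

// Sum the estimated cost of all usage rows, grouped by initiative.
function costByInitiative(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const tokens = row.tokens_input + row.tokens_output;
    const cost = (tokens / 1000) * RATE_PER_1K[row.model];
    totals.set(row.initiative_id, (totals.get(row.initiative_id) ?? 0) + cost);
  }
  return totals;
}
```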

---

## Recommendations

### Starting Out
Use the **balanced** profile. It provides good quality at reasonable cost.

### High-Stakes Projects
Use the **quality** profile. The cost difference is negligible compared to getting it right.

### High-Volume Work
Use the **budget** profile with the Architect overridden to Sonnet. Don't skimp on planning.

### Learning the System
Use the **quality** profile initially. See what good output looks like before optimizing for cost.