# Model Profiles

Different agent roles have different needs. Model selection balances quality, cost, and latency.

## Profile Definitions

| Profile | Use Case | Cost | Quality |
|---------|----------|------|---------|
| **quality** | Critical decisions, architecture | Highest | Best |
| **balanced** | Default for most work | Medium | Good |
| **budget** | High-volume, low-risk tasks | Lowest | Acceptable |

---

## Agent Model Assignments

| Agent | Quality | Balanced (Default) | Budget |
|-------|---------|-------------------|--------|
| **Architect** | Opus | Opus | Sonnet |
| **Worker** | Opus | Sonnet | Sonnet |
| **Verifier** | Sonnet | Sonnet | Haiku |
| **Orchestrator** | Sonnet | Sonnet | Haiku |
| **Monitor** | Sonnet | Haiku | Haiku |
| **Researcher** | Opus | Sonnet | Haiku |
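
The assignment table can be read as a two-level lookup from agent role and profile to model. Below is a minimal sketch of that lookup; the type and function names (`Model`, `Profile`, `AgentRole`, `MODEL_ASSIGNMENTS`, `modelFor`) are illustrative assumptions, not part of any documented API.

```typescript
// Hypothetical names throughout; only the table values come from this document.
type Model = "opus" | "sonnet" | "haiku";
type Profile = "quality" | "balanced" | "budget";
type AgentRole =
  | "architect" | "worker" | "verifier"
  | "orchestrator" | "monitor" | "researcher";

// Mirrors the assignment table above, one row per agent.
const MODEL_ASSIGNMENTS: Record<AgentRole, Record<Profile, Model>> = {
  architect:    { quality: "opus",   balanced: "opus",   budget: "sonnet" },
  worker:       { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
  verifier:     { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  orchestrator: { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  monitor:      { quality: "sonnet", balanced: "haiku",  budget: "haiku" },
  researcher:   { quality: "opus",   balanced: "sonnet", budget: "haiku" },
};

// Look up the model for a role; "balanced" is the documented default profile.
function modelFor(role: AgentRole, profile: Profile = "balanced"): Model {
  return MODEL_ASSIGNMENTS[role][profile];
}

console.log(modelFor("worker"));             // sonnet
console.log(modelFor("monitor", "quality")); // sonnet
```

Keeping the table as a single typed constant means the compiler rejects a role or profile that has no assignment.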

---

## Rationale

### Architect (Planning) - Opus/Opus/Sonnet
Planning has the highest impact on outcomes. A bad plan wastes all downstream execution. Invest in quality here.

- **Quality profile:** Complex systems, novel domains, critical decisions
- **Balanced profile:** Standard feature work, established patterns
- **Budget profile:** Simple initiatives, well-documented domains

### Worker (Execution) - Opus/Sonnet/Sonnet
The plan already contains reasoning. Execution is implementation, not decision-making.

- **Quality profile:** Complex algorithms, security-critical code
- **Balanced profile:** Standard implementation work
- **Budget profile:** Simple tasks, boilerplate code

### Verifier (Validation) - Sonnet/Sonnet/Haiku
Verification is structured checking against defined criteria. Less reasoning needed than planning.

- **Quality profile:** Complex verification, subtle integration issues
- **Balanced profile:** Standard goal-backward verification
- **Budget profile:** Simple pass/fail checks

### Orchestrator (Coordination) - Sonnet/Sonnet/Haiku
The orchestrator routes work; it doesn't do the heavy lifting. It needs reliability, not creativity.

- **Quality profile:** Complex multi-agent coordination
- **Balanced profile:** Standard workflow management
- **Budget profile:** Simple task routing

### Monitor (Observation) - Sonnet/Haiku/Haiku
Monitoring is pattern matching and threshold checking. Minimal reasoning required.

- **Quality profile:** Complex health analysis
- **Balanced profile:** Standard monitoring
- **Budget profile:** Simple heartbeat checks

### Researcher (Discovery) - Opus/Sonnet/Haiku
Research is read-only exploration. High volume, low modification risk.

- **Quality profile:** Deep domain analysis
- **Balanced profile:** Standard codebase exploration
- **Budget profile:** Simple file lookups

---

## Profile Selection

### Per-Initiative Override

```yaml
# In initiative config
model_profile: quality  # Override default balanced
```

### Per-Agent Override

```yaml
# In task assignment
assigned_to: worker-123
model_override: opus  # This task needs Opus
```

### Automatic Escalation

```yaml
# When to auto-escalate
escalation_triggers:
  - condition: "task.retry_count > 2"
    action: "escalate_model"
  - condition: "task.complexity == 'high'"
    action: "use_quality_profile"
  - condition: "deviation.rule == 4"
    action: "escalate_model"
```
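
The trigger list above could be evaluated along the following lines. This is only a sketch under stated assumptions: the `TaskState` shape is hypothetical, "escalate_model" is assumed to mean "move up one tier", and triggers are assumed to be checked in listed order with the first match winning. None of these details are specified in this document.

```typescript
// Hypothetical shapes; the real task and deviation objects are not defined here.
type Model = "opus" | "sonnet" | "haiku";
type Action = "escalate_model" | "use_quality_profile" | "none";

interface TaskState {
  retryCount: number;
  complexity: "low" | "medium" | "high";
  deviationRule?: number; // rule number of the deviation that was hit, if any
}

// Assumption: escalation moves one tier up (haiku -> sonnet -> opus).
// Opus is already the top tier, so it stays unchanged.
function escalateModel(current: Model): Model {
  if (current === "haiku") return "sonnet";
  return "opus";
}

// Assumption: triggers are evaluated in the order listed; first match wins.
function escalationAction(task: TaskState): Action {
  if (task.retryCount > 2) return "escalate_model";
  if (task.complexity === "high") return "use_quality_profile";
  if (task.deviationRule === 4) return "escalate_model";
  return "none";
}

const action = escalationAction({ retryCount: 3, complexity: "low" });
console.log(action);                  // escalate_model
console.log(escalateModel("sonnet")); // opus
```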

---

## Cost Management

### Estimated Token Usage

| Agent | Avg Tokens/Task | Profile Impact |
|-------|-----------------|----------------|
| Architect | 50k-100k | 3x between budget/quality |
| Worker | 20k-50k | 2x between budget/quality |
| Verifier | 10k-30k | 1.5x between budget/quality |
| Orchestrator | 5k-15k | 1.5x between budget/quality |

### Cost Optimization Strategies

1. **Right-size tasks:** Smaller tasks use fewer tokens
2. **Use budget for volume:** Monitoring, simple checks
3. **Reserve quality for impact:** Architecture, security
4. **Profile per initiative:** Simple features use budget, complex ones use quality

---

## Configuration

### Default Profile

The default profile is set in `.planning/config.json`:

```json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": null,
    "worker": null,
    "verifier": null
  }
}
```

### Quality Profile

```json
{
  "model_profile": "quality",
  "model_overrides": {}
}
```

### Budget Profile

The architect override keeps planning at Sonnet as a minimum:

```json
{
  "model_profile": "budget",
  "model_overrides": {
    "architect": "sonnet"
  }
}
```

### Mixed Profile

This example invests in planning (Opus architect), keeps standard execution (Sonnet worker), and uses budget verification (Haiku verifier):

```json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": "opus",
    "worker": "sonnet",
    "verifier": "haiku"
  }
}
```
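
How a profile and its overrides combine into a final model can be sketched as follows. The precedence shown (task-level override, then per-agent override in the config, then the profile default) is an assumption, as are the names `PlanningConfig` and `resolveModel`; a `null` override is treated as "no override", matching the default config above.

```typescript
type Model = "opus" | "sonnet" | "haiku";
type Profile = "quality" | "balanced" | "budget";
type Role = "architect" | "worker" | "verifier";

// Shape of the config files shown above (hypothetical type name).
interface PlanningConfig {
  model_profile: Profile;
  model_overrides: Partial<Record<Role, Model | null>>;
}

// Profile defaults for the three roles that appear in the configs.
const DEFAULTS: Record<Role, Record<Profile, Model>> = {
  architect: { quality: "opus",   balanced: "opus",   budget: "sonnet" },
  worker:    { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
  verifier:  { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
};

// Assumed precedence: task override > config override > profile default.
// `??` skips both null and undefined, so a null override falls through.
function resolveModel(role: Role, config: PlanningConfig, taskOverride?: Model): Model {
  return taskOverride
    ?? config.model_overrides[role]
    ?? DEFAULTS[role][config.model_profile];
}

const mixed: PlanningConfig = {
  model_profile: "balanced",
  model_overrides: { architect: "opus", worker: "sonnet", verifier: "haiku" },
};

console.log(resolveModel("verifier", mixed));       // haiku
console.log(resolveModel("worker", mixed, "opus")); // opus
```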

---

## Model Capabilities Reference

### Opus
- **Strengths:** Complex reasoning, nuanced decisions, novel problems
- **Best for:** Architecture, complex algorithms, security analysis
- **Cost:** Highest

### Sonnet
- **Strengths:** Good balance of reasoning and speed, reliable
- **Best for:** Standard development, code generation, debugging
- **Cost:** Medium

### Haiku
- **Strengths:** Fast, cheap, good for structured tasks
- **Best for:** Monitoring, simple checks, high-volume operations
- **Cost:** Lowest

---

## Profile Switching

### CLI Command

```bash
# Set profile for all future work
cw config set model_profile quality

# Set profile for specific initiative
cw initiative config <id> --model-profile budget

# Override for single task
cw task update <id> --model-override opus
```

### API

```typescript
// Set initiative profile
await initiative.setConfig(id, { modelProfile: 'quality' });

// Override task model
await task.update(id, { modelOverride: 'opus' });
```

---

## Monitoring Model Usage

Track model usage for cost analysis:

```sql
CREATE TABLE model_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_type TEXT NOT NULL,
  model TEXT NOT NULL,
  tokens_input INTEGER,
  tokens_output INTEGER,
  task_id TEXT,
  initiative_id TEXT,
  created_at INTEGER DEFAULT (unixepoch())
);

-- Usage by agent type
SELECT agent_type, model, SUM(tokens_input + tokens_output) as total_tokens
FROM model_usage
GROUP BY agent_type, model;

-- Estimated cost by initiative
-- Rates are placeholders, assumed to be USD per 1,000 tokens; substitute current pricing.
SELECT initiative_id,
       SUM((tokens_input + tokens_output) / 1000.0 *
           CASE model WHEN 'opus' THEN 0.015
                      WHEN 'sonnet' THEN 0.003
                      WHEN 'haiku' THEN 0.0003 END) AS estimated_cost
FROM model_usage
GROUP BY initiative_id;
```
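
The same per-initiative estimate can be computed in application code from rows of `model_usage`. The sketch below reuses the rates from the query and assumes they are USD per 1,000 tokens, which this document does not state; `UsageRow` and `costByInitiative` are hypothetical names.

```typescript
type Model = "opus" | "sonnet" | "haiku";

// Subset of the model_usage columns needed for cost estimation.
interface UsageRow {
  initiative_id: string;
  model: Model;
  tokens_input: number;
  tokens_output: number;
}

// Placeholder rates from the SQL above, ASSUMED to be USD per 1,000 tokens.
const RATE_PER_1K: Record<Model, number> = {
  opus: 0.015,
  sonnet: 0.003,
  haiku: 0.0003,
};

// Sum the estimated cost of every usage row, grouped by initiative.
function costByInitiative(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const tokens = row.tokens_input + row.tokens_output;
    const cost = (tokens / 1000) * RATE_PER_1K[row.model];
    totals.set(row.initiative_id, (totals.get(row.initiative_id) ?? 0) + cost);
  }
  return totals;
}

const sample: UsageRow[] = [
  { initiative_id: "init-1", model: "opus",   tokens_input: 1000, tokens_output: 1000 },
  { initiative_id: "init-1", model: "sonnet", tokens_input: 1000, tokens_output: 0 },
  { initiative_id: "init-2", model: "haiku",  tokens_input: 5000, tokens_output: 5000 },
];
const totals = costByInitiative(sample);
```

Input and output tokens are priced at one rate here only because the original query does so; a real estimate would likely need separate rates for each.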

---

## Recommendations

### Starting Out
Use **balanced** profile. It provides good quality at reasonable cost.

### High-Stakes Projects
Use **quality** profile. The extra model cost is negligible compared to the cost of getting it wrong.

### High-Volume Work
Use **budget** profile with the architect pinned to Sonnet via an override, so planning never drops below that tier. Don't skimp on planning.

### Learning the System
Use **quality** profile initially. See what good output looks like before optimizing for cost.