Codewalkers/docs/archive/model-profiles.md

Model Profiles

Different agent roles have different needs. Model selection balances quality, cost, and latency.

Profile Definitions

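| Profile  | Use Case                         | Cost    | Quality    |
|----------|----------------------------------|---------|------------|
| quality  | Critical decisions, architecture | Highest | Best       |
| balanced | Default for most work            | Medium  | Good       |
| budget   | High-volume, low-risk tasks      | Lowest  | Acceptable |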

Agent Model Assignments

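| Agent        | Quality | Balanced (Default) | Budget |
|--------------|---------|--------------------|--------|
| Architect    | Opus    | Opus               | Sonnet |
| Worker       | Opus    | Sonnet             | Sonnet |
| Verifier     | Sonnet  | Sonnet             | Haiku  |
| Orchestrator | Sonnet  | Sonnet             | Haiku  |
| Monitor      | Sonnet  | Haiku              | Haiku  |
| Researcher   | Opus    | Sonnet             | Haiku  |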
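
The assignment table can be expressed as a typed lookup. This is a minimal sketch, not the project's implementation: the names `Model`, `Profile`, `Agent`, `MODEL_ASSIGNMENTS`, and `defaultModelFor` are hypothetical and chosen only for illustration.

```typescript
// Hypothetical types; the values mirror the assignment table above.
type Model = "opus" | "sonnet" | "haiku";
type Profile = "quality" | "balanced" | "budget";
type Agent =
  | "architect"
  | "worker"
  | "verifier"
  | "orchestrator"
  | "monitor"
  | "researcher";

// One row per agent, one column per profile.
const MODEL_ASSIGNMENTS: Record<Agent, Record<Profile, Model>> = {
  architect:    { quality: "opus",   balanced: "opus",   budget: "sonnet" },
  worker:       { quality: "opus",   balanced: "sonnet", budget: "sonnet" },
  verifier:     { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  orchestrator: { quality: "sonnet", balanced: "sonnet", budget: "haiku" },
  monitor:      { quality: "sonnet", balanced: "haiku",  budget: "haiku" },
  researcher:   { quality: "opus",   balanced: "sonnet", budget: "haiku" },
};

// "balanced" is the default profile, as marked in the table header.
function defaultModelFor(agent: Agent, profile: Profile = "balanced"): Model {
  return MODEL_ASSIGNMENTS[agent][profile];
}
```

Typing the table as `Record<Agent, Record<Profile, Model>>` makes the compiler reject a missing agent or profile, which keeps the lookup in step with the table as roles are added.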

Rationale

Architect (Planning) - Opus/Opus/Sonnet

Planning has the highest impact on outcomes. A bad plan wastes all downstream execution. Invest in quality here.

  • Quality profile: Complex systems, novel domains, critical decisions
  • Balanced profile: Standard feature work, established patterns
  • Budget profile: Simple initiatives, well-documented domains

Worker (Execution) - Opus/Sonnet/Sonnet

The plan already contains reasoning. Execution is implementation, not decision-making.

  • Quality profile: Complex algorithms, security-critical code
  • Balanced profile: Standard implementation work
  • Budget profile: Simple tasks, boilerplate code

Verifier (Validation) - Sonnet/Sonnet/Haiku

Verification is structured checking against defined criteria. Less reasoning needed than planning.

  • Quality profile: Complex verification, subtle integration issues
  • Balanced profile: Standard goal-backward verification
  • Budget profile: Simple pass/fail checks

Orchestrator (Coordination) - Sonnet/Sonnet/Haiku

The orchestrator routes work; it doesn't do the heavy lifting. It needs reliability, not creativity.

  • Quality profile: Complex multi-agent coordination
  • Balanced profile: Standard workflow management
  • Budget profile: Simple task routing

Monitor (Observation) - Sonnet/Haiku/Haiku

Monitoring is pattern matching and threshold checking. Minimal reasoning required.

  • Quality profile: Complex health analysis
  • Balanced profile: Standard monitoring
  • Budget profile: Simple heartbeat checks

Researcher (Discovery) - Opus/Sonnet/Haiku

Research is read-only exploration. High volume, low modification risk.

  • Quality profile: Deep domain analysis
  • Balanced profile: Standard codebase exploration
  • Budget profile: Simple file lookups


Profile Selection

Per-Initiative Override

# In initiative config
model_profile: quality  # Override default balanced

Per-Agent Override

# In task assignment
assigned_to: worker-123
model_override: opus  # This task needs Opus

Automatic Escalation

# When to auto-escalate
escalation_triggers:
  - condition: "task.retry_count > 2"
    action: "escalate_model"
  - condition: "task.complexity == 'high'"
    action: "use_quality_profile"
  - condition: "deviation.rule == 4"
    action: "escalate_model"
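
The triggers above could be evaluated roughly as follows. This is a sketch under stated assumptions: the source does not define what `escalate_model` does, so it is assumed here to move one tier up (haiku to sonnet to opus) and to stop at opus. `TaskState`, `escalateModel`, and `triggeredActions` are hypothetical names.

```typescript
type Model = "haiku" | "sonnet" | "opus";
type Action = "escalate_model" | "use_quality_profile";

// Hypothetical shape for the fields the trigger conditions read.
interface TaskState {
  retryCount: number;
  complexity: "low" | "medium" | "high";
  deviationRule?: number; // rule number of the latest deviation, if any
}

// Tiers ordered from cheapest to most capable.
const TIERS: Model[] = ["haiku", "sonnet", "opus"];

// Assumption: escalation moves one tier up and is capped at opus.
function escalateModel(model: Model): Model {
  const next = Math.min(TIERS.indexOf(model) + 1, TIERS.length - 1);
  return TIERS[next];
}

// Returns each action whose condition holds, without duplicates.
function triggeredActions(task: TaskState): Action[] {
  const actions: Action[] = [];
  const add = (action: Action): void => {
    if (actions.indexOf(action) === -1) actions.push(action);
  };
  if (task.retryCount > 2) add("escalate_model");
  if (task.complexity === "high") add("use_quality_profile");
  if (task.deviationRule === 4) add("escalate_model");
  return actions;
}
```

De-duplicating the actions means a task that both exceeds the retry limit and hits deviation rule 4 is escalated once rather than twice; whether the real system stacks escalations is not specified in this document.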

Cost Management

Estimated Token Usage

| Agent        | Avg Tokens/Task | Profile Impact              |
|--------------|-----------------|-----------------------------|
| Architect    | 50k-100k        | 3x between budget/quality   |
| Worker       | 20k-50k         | 2x between budget/quality   |
| Verifier     | 10k-30k         | 1.5x between budget/quality |
| Orchestrator | 5k-15k          | 1.5x between budget/quality |
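
As a worked example, the per-task ranges above can be turned into a rough token budget for an initiative. This is an illustrative sketch only: `TOKENS_PER_TASK` and `estimateTokens` are hypothetical names, and the task counts in the usage note are made up.

```typescript
type Agent = "architect" | "worker" | "verifier" | "orchestrator";

// [low, high] average tokens per task, taken from the table above.
const TOKENS_PER_TASK: Record<Agent, [number, number]> = {
  architect: [50000, 100000],
  worker: [20000, 50000],
  verifier: [10000, 30000],
  orchestrator: [5000, 15000],
};

// Sums the low and high ends of each range, weighted by task count.
function estimateTokens(
  taskCounts: Partial<Record<Agent, number>>
): { min: number; max: number } {
  let min = 0;
  let max = 0;
  for (const agent of Object.keys(TOKENS_PER_TASK) as Agent[]) {
    const count = taskCounts[agent] ?? 0;
    const [low, high] = TOKENS_PER_TASK[agent];
    min += count * low;
    max += count * high;
  }
  return { min, max };
}
```

For a hypothetical initiative with 1 architect task, 10 worker tasks, 10 verifier tasks, and 1 orchestrator task, `estimateTokens` returns a range of 355,000 to 915,000 tokens.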

Cost Optimization Strategies

  1. Right-size tasks: Smaller tasks use fewer tokens
  2. Use budget for volume: Monitoring, simple checks
  3. Reserve quality for impact: Architecture, security
  4. Profile per initiative: Simple features use budget, complex use quality

Configuration

Default Profile

// .planning/config.json
{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": null,
    "worker": null,
    "verifier": null
  }
}

Quality Profile

{
  "model_profile": "quality",
  "model_overrides": {}
}

Budget Profile

{
  "model_profile": "budget",
  "model_overrides": {
    "architect": "sonnet"  // Keep architect at sonnet minimum
  }
}

Mixed Profile

{
  "model_profile": "balanced",
  "model_overrides": {
    "architect": "opus",     // Invest in planning
    "worker": "sonnet",      // Standard execution
    "verifier": "haiku"      // Budget verification
  }
}
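
One way these settings might combine at run time is sketched below. The precedence order (task override, then config override, then profile default) is an assumption; this document does not state it. `PlanningConfig` and `resolveModel` are hypothetical names, and `profileDefault` stands for whatever the active profile assigns to the agent.

```typescript
type Model = "opus" | "sonnet" | "haiku";

// Hypothetical type for the .planning/config.json shape shown above.
interface PlanningConfig {
  model_profile: "quality" | "balanced" | "budget";
  model_overrides: Record<string, Model | null | undefined>;
}

// Assumed precedence: task override > config override > profile default.
function resolveModel(
  agent: string,
  config: PlanningConfig,
  profileDefault: Model,
  taskOverride?: Model
): Model {
  if (taskOverride) return taskOverride;
  const configured = config.model_overrides[agent];
  // A null value or a missing key both mean "no override".
  return configured ?? profileDefault;
}

// The "Mixed Profile" config from above.
const mixed: PlanningConfig = {
  model_profile: "balanced",
  model_overrides: { architect: "opus", worker: "sonnet", verifier: "haiku" },
};
```

With this sketch, `resolveModel("verifier", mixed, "sonnet")` yields `"haiku"` because the config override wins over the balanced default, while an agent with no entry, such as `"monitor"`, falls back to its profile default.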

Model Capabilities Reference

Opus

  • Strengths: Complex reasoning, nuanced decisions, novel problems
  • Best for: Architecture, complex algorithms, security analysis
  • Cost: Highest

Sonnet

  • Strengths: Good balance of reasoning and speed, reliable
  • Best for: Standard development, code generation, debugging
  • Cost: Medium

Haiku

  • Strengths: Fast, cheap, good for structured tasks
  • Best for: Monitoring, simple checks, high-volume operations
  • Cost: Lowest

Profile Switching

CLI Command

# Set profile for all future work
cw config set model_profile quality

# Set profile for specific initiative
cw initiative config <id> --model-profile budget

# Override for single task
cw task update <id> --model-override opus

API

// Set initiative profile
await initiative.setConfig(id, { modelProfile: 'quality' });

// Override task model
await task.update(id, { modelOverride: 'opus' });

Monitoring Model Usage

Track model usage for cost analysis:

CREATE TABLE model_usage (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  agent_type TEXT NOT NULL,
  model TEXT NOT NULL,
  tokens_input INTEGER,
  tokens_output INTEGER,
  task_id TEXT,
  initiative_id TEXT,
  created_at INTEGER DEFAULT (unixepoch())
);

-- Usage by agent type
SELECT agent_type, model, SUM(tokens_input + tokens_output) as total_tokens
FROM model_usage
GROUP BY agent_type, model;

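-- Cost by initiative
-- Rates are illustrative placeholders; adjust them to current pricing and units.
-- model_usage has no "tokens" column, so sum input and output tokens.
SELECT initiative_id,
       SUM(CASE WHEN model = 'opus' THEN (tokens_input + tokens_output) * 0.015
                WHEN model = 'sonnet' THEN (tokens_input + tokens_output) * 0.003
                WHEN model = 'haiku' THEN (tokens_input + tokens_output) * 0.0003 END) as estimated_cost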
FROM model_usage
GROUP BY initiative_id;

Recommendations

Starting Out

Use balanced profile. It provides good quality at reasonable cost.

High-Stakes Projects

Use quality profile. The extra model cost is negligible compared to the cost of getting it wrong.

High-Volume Work

Use budget profile and keep the architect pinned to Sonnet, which is also the budget default. Don't skimp on planning.

Learning the System

Use quality profile initially. See what good output looks like before optimizing for cost.