Codewalkers/.planning/research/FEATURES.md
Lukas May 0ff65b0b02 feat: Rename application from "Codewalk District" to "Codewalkers"
Update all user-facing strings (HTML title, manifest, header logo,
browser title updater), code comments, and documentation references.
Folder name retained as-is.
2026-03-05 12:05:08 +01:00

Feature Research: Codewalkers

Metadata

  • Domain: Multi-agent orchestration / Developer tooling
  • Researched: 2026-01-30
  • Confidence: HIGH
  • Focus: Solo developer orchestrating multiple Claude Code agents

Executive Summary

The multi-agent AI tooling space is maturing rapidly. Tools like Claude Squad, Claude Flow, and par have established baseline expectations. Running multiple agents is no longer a differentiator; that is table stakes now. The differentiator is coordination quality: preventing conflicts before they happen, making review manageable, and keeping the developer in control without drowning in context switches.


Table Stakes (Must Have)

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Git worktree isolation | Prevents agents from stepping on each other. Every serious parallel workflow tool uses this. | Low | Claude Squad, par, incident.io all use this pattern |
| tmux session management | Persistent, detachable sessions. Users expect to close terminal and come back. | Low | Standard pattern; unreliable without delays for send-keys |
| Background execution | "Yolo mode" / auto-accept. Users want agents working while they're away. | Low | Claude Squad pioneered this; now expected |
| Task status visibility | Users need to see what each agent is doing at a glance | Medium | X of Y pattern or spinner per agent |
| Agent lifecycle management | Start, stop, pause, resume agents | Low | Basic CRUD for agent sessions |
| Branch isolation | Each agent works on its own branch | Low | Natural consequence of worktree approach |
| Session persistence | Resume work after disconnect/reboot | Medium | tmux provides this; need to restore state |
| Basic logging | Capture stdout/stderr per agent | Low | Essential for debugging and review |
| CLI interface | Terminal-native; developers live here | Low | No web UI needed for v1 |
| Git diff preview | See changes before merging | Low | Standard git tooling |
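
The worktree-plus-tmux pattern in the table can be sketched roughly as follows. This is a minimal illustration, not how any of the named tools implement it: the `cs-` session prefix, the `agent/<id>` branch naming, the sibling-directory layout, and the 0.5 second delay are all assumptions made for the example. The git and tmux invocations themselves (`git worktree add -b`, `tmux new-session -d -s -c`, `tmux send-keys -t`) are standard.

```python
import re
import subprocess
import time
from pathlib import Path


def session_name(agent_id: str) -> str:
    """tmux treats '.' and ':' specially in target names, so replace them."""
    return "cs-" + re.sub(r"[.:\s]", "-", agent_id)


def worktree_path(repo: Path, agent_id: str) -> Path:
    """Hypothetical layout: one sibling directory per agent, next to the repo."""
    return repo.parent / f"{repo.name}-{agent_id}"


def worktree_cmd(repo: Path, agent_id: str, base: str = "main") -> list[str]:
    """Build `git worktree add` so the agent gets its own directory and branch."""
    return [
        "git", "-C", str(repo), "worktree", "add",
        "-b", f"agent/{agent_id}",            # new branch for this agent
        str(worktree_path(repo, agent_id)),   # where the worktree is checked out
        base,                                 # commit-ish to branch from
    ]


def tmux_cmds(repo: Path, agent_id: str, command: str) -> list[list[str]]:
    """Build the tmux calls: a detached session, then the command typed into it."""
    name = session_name(agent_id)
    return [
        ["tmux", "new-session", "-d", "-s", name, "-c", str(worktree_path(repo, agent_id))],
        ["tmux", "send-keys", "-t", name, command, "Enter"],
    ]


def launch(repo: Path, agent_id: str, command: str, delay: float = 0.5) -> None:
    """Create the worktree, then start the session, pausing between tmux calls."""
    subprocess.run(worktree_cmd(repo, agent_id), check=True)
    for cmd in tmux_cmds(repo, agent_id, command):
        subprocess.run(cmd, check=True)
        time.sleep(delay)  # send-keys can be unreliable if sent immediately
```

Building the argument lists separately from running them keeps the command construction easy to inspect and test without git or tmux installed.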

Differentiators (Competitive Advantage)

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Intelligent task breakdown | Turn "implement auth" into parallelizable subtasks. Most tools leave this to the user. | High | CrewAI/LangGraph do this for agents; nobody does it well for Claude Code orchestration |
| Conflict prevention via file-level coordination | Prevent merge hell by knowing which files each agent is touching | High | Agent-MCP has "file-level locking"; huge pain point |
| Review-first workflow | Bottleneck is review, not generation. Tools that acknowledge this win. | Medium | Most tools assume shipping is the goal; quality-first is rare |
| Dependency-aware task ordering | Don't start task B until task A creates the interface it depends on | High | Graph-based execution like LangGraph |
| Context summarization between agents | Agent B can learn from Agent A's work without reading everything | High | Reduces token waste; improves coherence |
| Token/cost tracking per agent | Know what each agent costs; budget accordingly | Medium | PM2 has resource monitoring; AI tools rarely have cost visibility |
| Merge conflict prediction | Warn before conflicts happen, not after | High | Harmony AI does post-hoc resolution; prediction is better |
| Interactive task tree | Visual hierarchy of tasks, dependencies, status | Medium | ClickUp-style breakdown but terminal-native |
| Automatic PR generation | Merge agent's work, create PR with description | Medium | Tedious step that can be automated |
| Cross-agent communication | Agent A can ask Agent B a question mid-task | Very High | CrewAI has this; complex to implement well |
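
One way to picture file-level coordination is a registry where each agent claims the files it intends to touch before starting. The sketch below is an assumption about how such a registry could look, not a description of Agent-MCP or any other tool; the `FileClaims` class and its all-or-nothing claim policy are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FileClaims:
    """In-memory map of file path -> agent currently holding the claim."""

    owners: dict[str, str] = field(default_factory=dict)

    def claim(self, agent: str, paths: list[str]) -> dict[str, str]:
        """Try to claim all paths for an agent.

        Returns {path: current_owner} for every path held by another agent.
        The claim is all-or-nothing: if anything conflicts, nothing is recorded.
        """
        conflicts = {
            p: self.owners[p]
            for p in paths
            if self.owners.get(p, agent) != agent
        }
        if not conflicts:
            for p in paths:
                self.owners[p] = agent
        return conflicts

    def release(self, agent: str) -> None:
        """Drop every claim held by an agent, for example after its merge."""
        self.owners = {p: a for p, a in self.owners.items() if a != agent}


claims = FileClaims()
claims.claim("agent-a", ["src/auth.py", "src/db.py"])
blocked = claims.claim("agent-b", ["src/db.py", "src/ui.py"])
print(blocked)  # shows which files agent-b would have to wait for
```

A real implementation would need to persist the registry and decide what happens when an agent touches a file it never claimed; both are left open here.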

Anti-Features (Avoid Building)

| Feature | Why Requested | Why Problematic | Alternative |
|---|---|---|---|
| Web dashboard | "Modern" / visual appeal | Adds complexity; developers prefer terminal; maintenance burden | Rich terminal UI (blessed, ink, etc.) |
| Agent-to-agent direct chat | "Collaboration" | Unpredictable behavior; context explosion; debugging nightmare | Structured handoffs via task completion |
| Automatic merge without review | "Faster" | Quality disaster; merge conflicts; broken code | Preview + one-command merge |
| Unlimited parallel agents | "Maximum throughput" | Review bottleneck; resource exhaustion; diminishing returns | Soft cap with warning (3-5 agents) |
| Plugin system (v1) | "Extensibility" | Premature; need to learn what users actually want first | Hardcode common patterns; plugins in v2 |
| Cloud sync | "Work anywhere" | Privacy concerns; complexity; not the core problem | Local-first; git is your sync |
| AI task decomposition (v1) | "Smarter" | Unpredictable; users need to understand their tasks first | Manual breakdown with templates |
| Real-time collaboration | "Team feature" | Solo developer focus; complexity explosion | git push/pull is collaboration |
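
The "soft cap with warning" alternative could be as small as the check below. The function name and the default of 5 (the upper end of the 3-5 range in the table) are assumptions for illustration; the point is that the check warns rather than refuses.

```python
import warnings

SOFT_CAP = 5  # assumed default, taken from the upper end of the 3-5 range


def check_agent_count(running: int, soft_cap: int = SOFT_CAP) -> bool:
    """Return True when within the cap; otherwise warn and return False.

    The caller is never blocked: the user stays free to start more agents.
    """
    if running > soft_cap:
        warnings.warn(
            f"{running} agents running (soft cap {soft_cap}): "
            "review is likely to become the bottleneck",
            stacklevel=2,
        )
        return False
    return True
```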

Feature Dependencies

[Git worktree isolation]
         |
         v
[tmux session management] -----> [Session persistence]
         |
         v
[Agent lifecycle management]
         |
    +----+----+
    |         |
    v         v
[Task status   [Background
 visibility]    execution]
    |
    v
[Basic logging]
    |
    v
[Git diff preview]
    |
    v
[Interactive task tree] -----> [Dependency-aware ordering]
    |
    v
[Conflict prevention] -----> [Merge conflict prediction]
    |
    v
[Automatic PR generation]
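
The diagram above can also be written down as data and ordered mechanically. The sketch below assumes each arrow means "is a prerequisite for" and uses Python's standard `graphlib.TopologicalSorter` to group features into waves that could be built in parallel. The same technique would apply to dependency-aware ordering of agent tasks; the `waves` helper is a hypothetical name.

```python
from graphlib import TopologicalSorter

# feature -> set of features it depends on, transcribed from the diagram
DEPENDS_ON = {
    "tmux session management": {"Git worktree isolation"},
    "Session persistence": {"tmux session management"},
    "Agent lifecycle management": {"tmux session management"},
    "Task status visibility": {"Agent lifecycle management"},
    "Background execution": {"Agent lifecycle management"},
    "Basic logging": {"Task status visibility"},
    "Git diff preview": {"Basic logging"},
    "Interactive task tree": {"Git diff preview"},
    "Dependency-aware ordering": {"Interactive task tree"},
    "Conflict prevention": {"Interactive task tree"},
    "Merge conflict prediction": {"Conflict prevention"},
    "Automatic PR generation": {"Conflict prevention"},
}


def waves(graph: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into waves; everything in a wave has all prerequisites done."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        result.append(ready)
        sorter.done(*ready)  # marking done unlocks the next wave
    return result


for number, wave in enumerate(waves(DEPENDS_ON), start=1):
    print(number, ", ".join(wave))
```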

MVP Definition

v1.0 - "It Works"

Core value: Run multiple Claude Code agents without losing track.

  • Git worktree creation/cleanup per agent
  • tmux session per agent
  • Start/stop/list agents CLI
  • Basic status display (which agents running, which branch)
  • Background execution mode
  • Session resume after terminal close
  • Simple logging (stdout capture)
  • CLI commands: cs new <task> / cs list / cs stop <id> / cs status

Success metric: Can run 3 agents in parallel on different features without conflicts.
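
A possible shape for the v1.0 command surface, sketched with Python's standard `argparse`. Only the four subcommand names and their arguments come from the list above; the help strings and the `build_parser` function are illustrative assumptions, and no behaviour is wired up.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the four v1.0 subcommands: new, list, stop, status."""
    parser = argparse.ArgumentParser(
        prog="cs", description="Run and track parallel coding agents."
    )
    sub = parser.add_subparsers(dest="command", required=True)

    new = sub.add_parser("new", help="start an agent in its own worktree and session")
    new.add_argument("task", help="short description of the task for the agent")

    sub.add_parser("list", help="list known agents")

    stop = sub.add_parser("stop", help="stop a running agent")
    stop.add_argument("id", help="identifier of the agent to stop")

    sub.add_parser("status", help="show which agents are running and on which branch")
    return parser


args = build_parser().parse_args(["new", "implement auth"])
print(args.command, args.task)
```

In a real entry point the parser would read `sys.argv` and dispatch on `args.command`.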

v1.x - "It's Pleasant"

Core value: Workflow feels natural, not janky.

  • Rich terminal UI (status dashboard)
  • Task description per agent
  • Git diff preview per agent
  • One-command merge to main
  • Token/cost tracking per agent
  • PR generation from merged work
  • Task templates for common patterns
  • Configurable agent limits with warnings

Success metric: User can manage 5 parallel agents without context-switch fatigue.
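
Per-agent token/cost tracking could start as a simple accumulator like the one below. How token counts are obtained from an agent session is not covered in this document, so the sketch assumes the caller supplies them. The prices are constructor parameters, and the values in the usage lines are placeholders, not real rates.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class CostTracker:
    """Accumulates token usage per agent and converts it to a cost estimate."""

    usd_per_million_input: float
    usd_per_million_output: float
    # agent -> [input_tokens, output_tokens]
    usage: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0]))

    def record(self, agent: str, input_tokens: int, output_tokens: int) -> None:
        """Add one request's token counts to the agent's running totals."""
        totals = self.usage[agent]
        totals[0] += input_tokens
        totals[1] += output_tokens

    def cost(self, agent: str) -> float:
        """Estimated spend for one agent; 0.0 if nothing was recorded."""
        inp, out = self.usage.get(agent, (0, 0))
        return (
            inp * self.usd_per_million_input + out * self.usd_per_million_output
        ) / 1_000_000


tracker = CostTracker(usd_per_million_input=2.0, usd_per_million_output=10.0)  # placeholder prices
tracker.record("agent-a", input_tokens=500_000, output_tokens=100_000)
print(f"agent-a: ${tracker.cost('agent-a'):.2f}")
```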

v2.0 - "It's Smart"

Core value: The tool helps you work better, not just faster.

  • Intelligent task breakdown suggestions
  • File-level coordination (who's touching what)
  • Dependency-aware task ordering
  • Conflict prediction before merge
  • Cross-agent context summarization
  • Review workflow integration
  • Historical task/cost analytics

Success metric: 50% fewer merge conflicts; review time per agent drops.
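
A first approximation of conflict prediction is to compare the sets of files each agent branch has changed and flag any overlap. The sketch below assumes the file sets have already been collected, for example from `git diff --name-only main...<branch>`. The `predict_conflicts` name is hypothetical. File-level overlap is only a heuristic: two agents can edit the same file without conflicting, and can break each other without sharing a file.

```python
from itertools import combinations


def predict_conflicts(changed: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Return {(branch_a, branch_b): shared_files} for every overlapping pair.

    `changed` maps a branch name to the set of paths it modified relative
    to the base branch.
    """
    overlaps = {}
    for a, b in combinations(sorted(changed), 2):
        shared = changed[a] & changed[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


changed_files = {
    "agent/auth": {"src/auth.py", "src/db.py"},
    "agent/ui": {"src/ui.py", "src/db.py"},
    "agent/docs": {"README.md"},
}
for (a, b), files in predict_conflicts(changed_files).items():
    print(f"{a} and {b} both touch: {', '.join(sorted(files))}")
```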


Feature Prioritization Matrix

| Feature | User Impact | Implementation Effort | Priority |
|---|---|---|---|
| Git worktree isolation | Critical | Low | P0 |
| tmux session management | Critical | Low | P0 |
| Agent lifecycle (start/stop/list) | Critical | Low | P0 |
| Basic status display | High | Low | P0 |
| Background execution | High | Low | P0 |
| Session persistence | High | Medium | P1 |
| Basic logging | High | Low | P1 |
| Git diff preview | Medium | Low | P1 |
| Rich terminal UI | Medium | Medium | P2 |
| Token tracking | Medium | Medium | P2 |
| PR generation | Medium | Medium | P2 |
| File-level coordination | High | High | P2 |
| Conflict prediction | High | High | P3 |
| Task dependency ordering | Medium | High | P3 |
| AI task breakdown | Medium | Very High | P4 |

Competitor Feature Analysis

Claude Squad

Strengths:

  • First-mover; established pattern
  • Clean tmux + worktree architecture
  • Background daemon for auto-accept
  • Supports multiple agent types (Claude, Aider, Codex)

Weaknesses:

  • No task breakdown/planning
  • No conflict prevention
  • No cost tracking
  • Basic status visibility

Opportunity: They solved "run multiple agents." You solve "coordinate multiple agents."

par (Parallel Worktree & Session Manager)

Strengths:

  • Global CLI tool
  • Simple mental model
  • Designed for AI assistants

Weaknesses:

  • Minimal coordination features
  • No agent-specific intelligence
  • Basic session management

Opportunity: par is infrastructure; you're workflow.

Claude Flow

Strengths:

  • Enterprise positioning
  • MCP integration
  • "Swarm intelligence" marketing

Weaknesses:

  • Over-engineered for solo developers
  • Complex configuration
  • Unclear value prop for simple use cases

Opportunity: Simplicity wins for solo developers.

Native Claude Code Subagents

Strengths:

  • Built-in; no extra tools
  • Resumable sessions
  • Tool access control

Weaknesses:

  • Single main context (subagents are delegated)
  • No true parallelism
  • No isolation between subagents

Opportunity: True parallel execution that native subagents can't provide.


Key Insights

  1. Review is the bottleneck: Tools optimize for generation speed. Smart tools optimize for review speed. Two well-reviewed PRs beat five half-reviewed ones.

  2. Conflict prevention > conflict resolution: Everyone's building AI merge conflict resolvers. Nobody's preventing conflicts in the first place. File-level coordination is the gap.

  3. Token costs are invisible: Developers have no idea what parallel agents cost. First tool to make costs visible wins trust.

  4. 3-5 agents is the sweet spot: Research consistently shows diminishing returns beyond 5 parallel agents. The bottleneck is human review capacity.

  5. Graph-based coordination is coming: LangGraph's architecture (nodes + edges + state) is the right mental model for multi-agent coordination. CrewAI's role-based model is simpler but less flexible.

  6. Terminal-native wins: Developers live in the terminal. Web dashboards are a distraction for this use case.

  7. Local-first, always: Privacy matters. No cloud sync needed when git is your collaboration layer.

