# Feature Research: Codewalkers

## Metadata
- **Domain**: Multi-agent orchestration / Developer tooling
- **Researched**: 2026-01-30
- **Confidence**: HIGH
- **Focus**: Solo developer orchestrating multiple Claude Code agents

---

## Executive Summary

The multi-agent AI tooling space is maturing rapidly. Tools like Claude Squad, Claude Flow, and par have established baseline expectations. The *ability* to run multiple agents is no longer a differentiator; it is table stakes. The differentiator is *coordination quality*: preventing conflicts before they happen, making review manageable, and keeping the developer in control without drowning in context switches.

---

## Table Stakes (Must Have)

| Feature | Why Expected | Complexity | Notes |
|---------|--------------|------------|-------|
| **Git worktree isolation** | Prevents agents from stepping on each other. Every serious parallel workflow tool uses this. | Low | Claude Squad, par, incident.io all use this pattern |
| **tmux session management** | Persistent, detachable sessions. Users expect to close the terminal and come back. | Low | Standard pattern; `send-keys` is unreliable unless delays are added |
| **Background execution** | "Yolo mode" / auto-accept. Users want agents working while they're away. | Low | Claude Squad pioneered this; now expected |
| **Task status visibility** | Users need to see what each agent is doing at a glance | Medium | X of Y pattern or spinner per agent |
| **Agent lifecycle management** | Start, stop, pause, resume agents | Low | Basic CRUD for agent sessions |
| **Branch isolation** | Each agent works on its own branch | Low | Natural consequence of worktree approach |
| **Session persistence** | Resume work after disconnect/reboot | Medium | tmux provides this; need to restore state |
| **Basic logging** | Capture stdout/stderr per agent | Low | Essential for debugging and review |
| **CLI interface** | Terminal-native; developers live here | Low | No web UI needed for v1 |
| **Git diff preview** | See changes before merging | Low | Standard git tooling |
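
The first two rows of the table can be sketched as a pair of command builders. This is a minimal sketch, not a definitive implementation: the function names, the `agent/<id>` branch naming, and the sibling-directory layout are assumptions made for illustration. Only the `git worktree add` and `tmux new-session` invocations themselves are standard.

```python
from pathlib import PurePosixPath


def worktree_path(repo: PurePosixPath, agent_id: str) -> PurePosixPath:
    """Place each worktree next to the main checkout (assumed layout)."""
    return repo.parent / f"{repo.name}-{agent_id}"


def worktree_command(repo: PurePosixPath, agent_id: str, base: str = "main") -> list[str]:
    """Build the git command that creates an isolated worktree on a new branch."""
    branch = f"agent/{agent_id}"  # assumed naming convention
    # git worktree add -b <new-branch> <path> <start-point>
    return [
        "git", "-C", str(repo), "worktree", "add",
        "-b", branch, str(worktree_path(repo, agent_id)), base,
    ]


def tmux_command(repo: PurePosixPath, agent_id: str) -> list[str]:
    """Build the tmux command that starts a detached session inside the worktree."""
    # -d: start detached, -s: session name, -c: starting directory
    return [
        "tmux", "new-session", "-d",
        "-s", f"agent-{agent_id}",
        "-c", str(worktree_path(repo, agent_id)),
    ]


if __name__ == "__main__":
    repo = PurePosixPath("/home/dev/myapp")
    print(" ".join(worktree_command(repo, "auth")))
    print(" ".join(tmux_command(repo, "auth")))
```

Returning argument lists rather than running the commands keeps the sketch testable; a real tool would pass them to `subprocess.run` and handle failures such as an already-existing branch.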

---

## Differentiators (Competitive Advantage)

| Feature | Value Proposition | Complexity | Notes |
|---------|-------------------|------------|-------|
| **Intelligent task breakdown** | Turn "implement auth" into parallelizable subtasks. Most tools leave this to the user. | High | CrewAI/LangGraph do this for agents; nobody does it well for Claude Code orchestration |
| **Conflict prevention via file-level coordination** | Prevent merge hell by knowing which files each agent is touching | High | Agent-MCP has "file-level locking"; huge pain point |
| **Review-first workflow** | Bottleneck is review, not generation. Tools that acknowledge this win. | Medium | Most tools assume shipping is the goal; quality-first is rare |
| **Dependency-aware task ordering** | Don't start task B until task A creates the interface it depends on | High | Graph-based execution like LangGraph |
| **Context summarization between agents** | Agent B can learn from Agent A's work without reading everything | High | Reduces token waste; improves coherence |
| **Token/cost tracking per agent** | Know what each agent costs; budget accordingly | Medium | PM2 has resource monitoring; AI tools rarely have cost visibility |
| **Merge conflict prediction** | Warn before conflicts happen, not after | High | Harmony AI does post-hoc resolution; prediction is better |
| **Interactive task tree** | Visual hierarchy of tasks, dependencies, status | Medium | ClickUp-style breakdown but terminal-native |
| **Automatic PR generation** | Merge agent's work, create PR with description | Medium | Tedious step that can be automated |
| **Cross-agent communication** | Agent A can ask Agent B a question mid-task | Very High | CrewAI has this; complex to implement well. Free-form agent-to-agent chat is listed under Anti-Features, so this would need to be structured |
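
File-level coordination can be illustrated with a small claim registry. This is a sketch under stated assumptions: the `FileClaims` class and its all-or-nothing claim policy are hypothetical and do not describe how Agent-MCP or any other tool implements locking.

```python
from dataclasses import dataclass, field


@dataclass
class FileClaims:
    """Tracks which agent has claimed which file path (hypothetical design)."""

    owners: dict[str, str] = field(default_factory=dict)

    def claim(self, agent: str, paths: list[str]) -> list[str]:
        """Try to claim paths for an agent.

        Returns the paths already held by a different agent. The claim is
        all-or-nothing: if any path is contested, nothing is recorded.
        """
        contested = [p for p in paths if self.owners.get(p, agent) != agent]
        if not contested:
            for p in paths:
                self.owners[p] = agent
        return contested

    def release(self, agent: str) -> None:
        """Drop every claim held by an agent, e.g. after its branch is merged."""
        self.owners = {p: a for p, a in self.owners.items() if a != agent}


if __name__ == "__main__":
    claims = FileClaims()
    print(claims.claim("agent-a", ["src/auth.py", "src/db.py"]))
    print(claims.claim("agent-b", ["src/db.py", "src/ui.py"]))
```

In this sketch the second call reports `src/db.py` as contested, which is the point where a tool could warn the developer before the second agent starts work.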

---

## Anti-Features (Avoid Building)

| Feature | Why Requested | Why Problematic | Alternative |
|---------|---------------|-----------------|-------------|
| **Web dashboard** | "Modern" / visual appeal | Adds complexity; developers prefer terminal; maintenance burden | Rich terminal UI (blessed, ink, etc.) |
| **Agent-to-agent direct chat** | "Collaboration" | Unpredictable behavior; context explosion; debugging nightmare | Structured handoffs via task completion |
| **Automatic merge without review** | "Faster" | Quality disaster; merge conflicts; broken code | Preview + one-command merge |
| **Unlimited parallel agents** | "Maximum throughput" | Review bottleneck; resource exhaustion; diminishing returns | Soft cap with warning (3-5 agents) |
| **Plugin system (v1)** | "Extensibility" | Premature; need to learn what users actually want first | Hardcode common patterns; plugins in v2 |
| **Cloud sync** | "Work anywhere" | Privacy concerns; complexity; not the core problem | Local-first; git is your sync |
| **AI task decomposition (v1)** | "Smarter" | Unpredictable; users need to understand their tasks first | Manual breakdown with templates in v1; AI suggestions deferred to v2.0 |
| **Real-time collaboration** | "Team feature" | Solo developer focus; complexity explosion | git push/pull is collaboration |
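
The "soft cap with warning" alternative could look roughly like the following. The function name and the default cap of 5 (the upper end of the 3-5 range in the table) are assumptions for illustration.

```python
from typing import Optional


def check_agent_limit(running: int, soft_cap: int = 5) -> Optional[str]:
    """Return a warning if starting one more agent would exceed the soft cap.

    The cap is soft: the caller decides whether to proceed after warning.
    """
    if running + 1 > soft_cap:
        return (
            f"{running} agents already running (soft cap: {soft_cap}). "
            "Review capacity, not generation, is likely the bottleneck."
        )
    return None


if __name__ == "__main__":
    for count in (2, 5):
        print(count, "->", check_agent_limit(count))
```

Returning a message instead of raising an error keeps the developer in control, which matches the warning-not-blocking intent of the table entry.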

---

## Feature Dependencies

```
[Git worktree isolation]
         |
         v
[tmux session management] -----> [Session persistence]
         |
         v
[Agent lifecycle management]
         |
    +----+----+
    |         |
    v         v
[Task status   [Background
 visibility]    execution]
    |
    v
[Basic logging]
    |
    v
[Git diff preview]
    |
    v
[Interactive task tree] -----> [Dependency-aware ordering]
    |
    v
[Conflict prevention] -----> [Merge conflict prediction]
    |
    v
[Automatic PR generation]
```
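
The diagram above is a directed acyclic graph, so a build order can be derived mechanically. The sketch below transcribes the arrows into a mapping and groups features into "waves" that could be built in parallel, using the standard library's `graphlib` (Python 3.9+). The `build_waves` helper is an illustrative name; the same approach would apply to dependency-aware task ordering between agents.

```python
from graphlib import TopologicalSorter

# Each feature maps to the features it depends on, transcribed from the diagram.
DEPENDS_ON = {
    "tmux session management": {"git worktree isolation"},
    "session persistence": {"tmux session management"},
    "agent lifecycle management": {"tmux session management"},
    "task status visibility": {"agent lifecycle management"},
    "background execution": {"agent lifecycle management"},
    "basic logging": {"task status visibility"},
    "git diff preview": {"basic logging"},
    "interactive task tree": {"git diff preview"},
    "dependency-aware ordering": {"interactive task tree"},
    "conflict prevention": {"interactive task tree"},
    "merge conflict prediction": {"conflict prevention"},
    "automatic PR generation": {"conflict prevention"},
}


def build_waves(depends_on: dict[str, set[str]]) -> list[list[str]]:
    """Group nodes into waves; every node's dependencies sit in earlier waves."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(build_waves(DEPENDS_ON), start=1):
        print(number, ", ".join(wave))
```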

---

## MVP Definition

### v1.0 - "It Works"
Core value: Run multiple Claude Code agents without losing track.

- Git worktree creation/cleanup per agent
- tmux session per agent
- Start/stop/list agents CLI
- Basic status display (which agents running, which branch)
- Background execution mode
- Session resume after terminal close
- Simple logging (stdout capture)
- `cs new <task>` / `cs list` / `cs stop <id>` / `cs status`

**Success metric**: Can run 3 agents in parallel on different features without conflicts.
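
The command surface listed above could be wired up with `argparse` roughly as follows. Only the four subcommand names and their `<task>` / `<id>` arguments come from the list; the help strings and the `build_parser` name are assumptions for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for: cs new <task> / cs list / cs stop <id> / cs status."""
    parser = argparse.ArgumentParser(prog="cs", description="Manage parallel agents")
    sub = parser.add_subparsers(dest="command", required=True)

    new = sub.add_parser("new", help="start an agent on a task")
    new.add_argument("task", help="short description of the task")

    sub.add_parser("list", help="list known agents")

    stop = sub.add_parser("stop", help="stop a running agent")
    stop.add_argument("id", help="agent identifier")

    sub.add_parser("status", help="show which agents are running and on which branch")
    return parser
```

For example, `build_parser().parse_args(["new", "implement auth"])` yields a namespace with `command == "new"` and `task == "implement auth"`, which a dispatcher could then route to the worktree and tmux setup.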

### v1.x - "It's Pleasant"
Core value: Workflow feels natural, not janky.

- Rich terminal UI (status dashboard)
- Task description per agent
- Git diff preview per agent
- One-command merge to main
- Token/cost tracking per agent
- PR generation from merged work
- Task templates for common patterns
- Configurable agent limits with warnings

**Success metric**: User can manage 5 parallel agents without context-switch fatigue.
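
Per-agent token and cost tracking reduces to a small ledger. In this sketch the prices are placeholders, not real pricing, and `Usage` and `CostLedger` are hypothetical names. How token counts are obtained from each agent session is outside the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass

# Placeholder prices in USD per million tokens. These are NOT real prices;
# a real tool would load current pricing from configuration.
PRICE_PER_MTOK = {"input": 1.00, "output": 5.00}


@dataclass
class Usage:
    input_tokens: int = 0
    output_tokens: int = 0

    def cost_usd(self, prices: dict[str, float] = PRICE_PER_MTOK) -> float:
        """Convert token counts to dollars using per-million-token prices."""
        total = (
            self.input_tokens * prices["input"]
            + self.output_tokens * prices["output"]
        )
        return total / 1_000_000


class CostLedger:
    """Accumulates token usage per agent (hypothetical design)."""

    def __init__(self) -> None:
        self.by_agent: dict[str, Usage] = defaultdict(Usage)

    def record(self, agent: str, input_tokens: int, output_tokens: int) -> None:
        usage = self.by_agent[agent]
        usage.input_tokens += input_tokens
        usage.output_tokens += output_tokens

    def report(self) -> dict[str, float]:
        """Return the cost in USD per agent, rounded to four decimals."""
        return {agent: round(u.cost_usd(), 4) for agent, u in self.by_agent.items()}


if __name__ == "__main__":
    ledger = CostLedger()
    ledger.record("auth", 200_000, 50_000)
    ledger.record("db", 0, 10_000)
    print(ledger.report())
```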

### v2.0 - "It's Smart"
Core value: The tool helps you work better, not just faster.

- Intelligent task breakdown suggestions
- File-level coordination (who's touching what)
- Dependency-aware task ordering
- Conflict prediction before merge
- Cross-agent context summarization
- Review workflow integration
- Historical task/cost analytics

**Success metric**: 50% fewer merge conflicts; review time per agent drops.
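
Conflict prediction before merge can start from a coarse heuristic: flag any pair of branches that modified the same file. The sketch assumes the changed-file sets have already been collected, for example with `git diff --name-only main...<branch>`. The `predict_conflicts` name and the branch names are illustrative.

```python
from itertools import combinations


def predict_conflicts(changed: dict[str, set[str]]) -> dict[tuple[str, str], set[str]]:
    """Map each pair of branches to the files that both have modified.

    Pairs with no shared files are omitted from the result.
    """
    overlaps = {}
    for a, b in combinations(sorted(changed), 2):
        shared = changed[a] & changed[b]
        if shared:
            overlaps[(a, b)] = shared
    return overlaps


if __name__ == "__main__":
    changed = {
        "agent/auth": {"src/auth.py", "src/models.py"},
        "agent/db": {"src/models.py", "src/migrations.py"},
        "agent/ui": {"src/ui.py"},
    }
    for pair, files in predict_conflicts(changed).items():
        print(pair, sorted(files))
```

File overlap is only a proxy: two agents can edit the same file without conflicting, and edits to different files can still break each other semantically. It is cheap enough, though, to run every time an agent commits.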

---

## Feature Prioritization Matrix

| Feature | User Impact | Implementation Effort | Priority |
|---------|-------------|----------------------|----------|
| Git worktree isolation | Critical | Low | **P0** |
| tmux session management | Critical | Low | **P0** |
| Agent lifecycle (start/stop/list) | Critical | Low | **P0** |
| Basic status display | High | Low | **P0** |
| Background execution | High | Low | **P0** |
| Session persistence | High | Medium | **P1** |
| Basic logging | High | Low | **P1** |
| Git diff preview | Medium | Low | **P1** |
| Rich terminal UI | Medium | Medium | **P2** |
| Token tracking | Medium | Medium | **P2** |
| PR generation | Medium | Medium | **P2** |
| File-level coordination | High | High | **P2** |
| Conflict prediction | High | High | **P3** |
| Task dependency ordering | Medium | High | **P3** |
| AI task breakdown | Medium | Very High | **P4** |

---

## Competitor Feature Analysis

### Claude Squad
**Strengths**:
- First-mover; established pattern
- Clean tmux + worktree architecture
- Background daemon for auto-accept
- Supports multiple agent types (Claude, Aider, Codex)

**Weaknesses**:
- No task breakdown/planning
- No conflict prevention
- No cost tracking
- Basic status visibility

**Opportunity**: They solved "run multiple agents." You solve "coordinate multiple agents."

### par (Parallel Worktree & Session Manager)
**Strengths**:
- Global CLI tool
- Simple mental model
- Designed for AI assistants

**Weaknesses**:
- Minimal coordination features
- No agent-specific intelligence
- Basic session management

**Opportunity**: par provides infrastructure; you provide workflow.

### Claude Flow
**Strengths**:
- Enterprise positioning
- MCP integration
- "Swarm intelligence" marketing

**Weaknesses**:
- Over-engineered for solo developers
- Complex configuration
- Unclear value prop for simple use cases

**Opportunity**: Simplicity wins for solo developers.

### Native Claude Code Subagents
**Strengths**:
- Built-in; no extra tools
- Resumable sessions
- Tool access control

**Weaknesses**:
- Single main context (subagents are delegated)
- No true parallelism
- No isolation between subagents

**Opportunity**: True parallel execution that native subagents can't provide.

---

## Key Insights

1. **Review is the bottleneck**: Tools optimize for generation speed. Smart tools optimize for review speed. Two well-reviewed PRs beat five half-reviewed ones.

2. **Conflict prevention > conflict resolution**: Most tools focus on resolving merge conflicts with AI after the fact. Few prevent conflicts in the first place; Agent-MCP's file-level locking is a rare exception. File-level coordination is the gap.

3. **Token costs are invisible**: Developers rarely know what parallel agents cost. The first tool to make costs visible wins trust.

4. **3-5 agents is the sweet spot**: The sources reviewed here point to diminishing returns beyond 5 parallel agents. The bottleneck is human review capacity.

5. **Graph-based coordination is coming**: LangGraph's architecture (nodes + edges + state) is the right mental model for multi-agent coordination. CrewAI's role-based model is simpler but less flexible.

6. **Terminal-native wins**: Developers live in the terminal. Web dashboards are a distraction for this use case.

7. **Local-first, always**: Privacy matters. No cloud sync needed when git is your collaboration layer.

---

## Sources

### Multi-Agent Orchestration
- [Top AI Agent Orchestration Frameworks for Developers 2025](https://www.kubiya.ai/blog/ai-agent-orchestration-frameworks)
- [AI Agent Orchestration Frameworks - n8n Blog](https://blog.n8n.io/ai-agent-orchestration-frameworks/)
- [CrewAI - The Leading Multi-Agent Platform](https://www.crewai.com/)
- [LangGraph](https://www.langchain.com/langgraph)
- [LangGraph vs CrewAI - ZenML Blog](https://www.zenml.io/blog/langgraph-vs-crewai)
- [CrewAI vs LangGraph vs AutoGen - DataCamp](https://www.datacamp.com/tutorial/crewai-vs-langgraph-vs-autogen)

### Claude Code Multi-Agent Tools
- [Claude Code Subagents Documentation](https://code.claude.com/docs/en/sub-agents)
- [Claude Squad - GitHub](https://github.com/smtg-ai/claude-squad)
- [Claude Flow - GitHub](https://github.com/ruvnet/claude-flow)
- [par - Parallel Worktree & Session Manager](https://github.com/coplane/par)
- [Advanced Claude Code Techniques - Medium](https://medium.com/@salwan.mohamed/advanced-claude-code-techniques-multi-agent-workflows-and-parallel-development-for-devops-89377460252c)

### Git Worktree Workflows
- [How Git Worktrees Changed My AI Agent Workflow - Nx Blog](https://nx.dev/blog/git-worktrees-ai-agents)
- [Parallel Workflows: Git Worktrees and Multiple AI Agents - Medium](https://medium.com/@dennis.somerville/parallel-workflows-git-worktrees-and-the-art-of-managing-multiple-ai-agents-6fa3dc5eec1d)
- [How we're shipping faster with Claude Code and Git Worktrees - incident.io](https://incident.io/blog/shipping-faster-with-claude-code-and-git-worktrees)
- [gwq - Git Worktree Manager](https://github.com/d-kuro/gwq)

### AI Coding Assistants
- [Aider vs Cursor - Sider](https://sider.ai/blog/ai-tools/ai-aider-vs-cursor-which-ai-coding-assistant-wins-for-2025)
- [Best AI Coding Assistants 2026 - Shakudo](https://www.shakudo.io/blog/best-ai-coding-assistants)
- [Top 9 Cursor Alternatives - Cline](https://cline.bot/blog/top-9-cursor-alternatives-in-2025-best-open-source-ai-dev-tools-for-developers)

### Process Management
- [PM2 vs Supervisord - StackShare](https://stackshare.io/stackups/pm2-vs-supervisord)
- [PM2 - Production Process Manager](https://pm2.keymetrics.io/)
- [Cronicle - Multi-Server Task Scheduler](https://github.com/jhuckaby/Cronicle)

### CLI UX & Monitoring
- [CLI UX Best Practices: Progress Displays - Evil Martians](https://evilmartians.com/chronicles/cli-ux-best-practices-3-patterns-for-improving-progress-displays)
- [WTF - Terminal Dashboard](https://wtfutil.com/)
- [Sampler - Console Dashboards](https://dzone.com/articles/build-beautiful-console-dashboards-with-sampler)

### Conflict Resolution
- [The Role of AI in Merge Conflict Resolution - Graphite](https://graphite.com/guides/ai-code-merge-conflict-resolution)
- [Agent-MCP - Multi-Agent Framework](https://github.com/rinadelph/Agent-MCP)
- [Multi-Agent Parallel Execution - Skywork](https://skywork.ai/blog/agent/multi-agent-parallel-execution-running-multiple-ai-agents-simultaneously/)