chore: complete v1.1 milestone

- Created MILESTONES.md with v1.0 and v1.1 entries
- Evolved PROJECT.md with validated requirements and current state
- Reorganized ROADMAP.md with collapsed v1.1 milestone
- Created milestone archive: milestones/v1.1-ROADMAP.md
- Updated STATE.md for next milestone planning
57
.planning/MILESTONES.md
Normal file
@@ -0,0 +1,57 @@
# Project Milestones: Codewalk District

## v1.1 Test Infrastructure (Shipped: 2026-01-31)

**Delivered:** Complete E2E test coverage with mocked agents proving dispatch and coordination work correctly.

**Phases completed:** 7-9 (8 plans total, including Phase 8.1 inserted)

**Key accomplishments:**

- MockAgentManager adapter with configurable scenarios (success, crash, waiting_for_input)
- TestHarness with full system wiring and database fixtures
- 34 E2E tests covering happy paths, edge cases, conflicts, recovery, and Q&A flows
- Structured agent output schema with Zod validation and --json-schema CLI integration
- Proof that database is source of truth for state recovery

**Stats:**

- 37 files created/modified
- 6,786 lines of TypeScript added
- 4 phases (including 1 inserted), 8 plans
- 1 day from start to ship

**Git range:** `feat(07-01)` → `docs(09-01)`

**What's next:** Production readiness, real agent integration testing

---

## v1.0 Core System (Shipped: 2026-01-30)

**Delivered:** Full multi-agent orchestration system with CLI, database, git worktrees, agent lifecycle, task dispatch, and coordination.

**Phases completed:** 1-6 (27 plans total, including Phase 1.1 inserted)

**Key accomplishments:**

- CLI binary (`cw`) with server mode, process management, graceful shutdown
- Hexagonal architecture with event bus and tRPC
- SQLite database with Drizzle ORM, task hierarchy schema
- Git worktree management for agent isolation
- Agent lifecycle (spawn, stop, resume) with Claude Code CLI integration
- Task dispatch with dependency-ordered work queue
- Coordination manager for merge handling and conflict detection
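
The dependency-ordered work queue listed above can be pictured as a topological ordering of tasks. The following is only a minimal sketch under assumptions: the `Task` shape, the `dependsOn` field, and `orderByDependencies` are hypothetical names chosen for illustration, and the commit does not show how the project actually orders its queue.

```typescript
// Hypothetical task shape; the real schema is not shown in this commit.
interface Task {
  id: string;
  dependsOn: string[];
}

// Kahn's algorithm: emit a task only after all of its dependencies were emitted.
function orderByDependencies(tasks: Task[]): string[] {
  const remaining = new Map<string, number>(); // task id -> unmet dependency count
  const dependents = new Map<string, string[]>(); // task id -> tasks waiting on it
  for (const t of tasks) {
    remaining.set(t.id, t.dependsOn.length);
    for (const dep of t.dependsOn) {
      const list = dependents.get(dep) ?? [];
      list.push(t.id);
      dependents.set(dep, list);
    }
  }

  // Tasks without dependencies are dispatchable immediately.
  const ready = tasks.filter((t) => t.dependsOn.length === 0).map((t) => t.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift()!;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = remaining.get(next)! - 1;
      remaining.set(next, left);
      if (left === 0) ready.push(next);
    }
  }

  // Anything left over is part of a cycle or names a dependency that does not exist.
  if (order.length !== tasks.length) {
    throw new Error("dependency cycle or unknown dependency");
  }
  return order;
}
```

A dispatcher built this way would hand out tasks in the returned order, and a cycle surfaces as an error instead of a silently stalled queue.
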

**Stats:**

- 100+ files created/modified
- ~8,000 lines of TypeScript
- 7 phases (including 1 inserted), 27 plans
- 1 day from start to ship

**Git range:** `feat(01-01)` → `docs(06-03)`

**What's next:** v1.1 Test Infrastructure (completed)

---
@@ -14,15 +14,18 @@ If everything else fails, this must work: spawn agents, assign work, know what's

 ### Validated

-(None yet — ship to validate)
+- ✓ **CLI `cw`** — single binary, server mode via `--server`, commands for tasks/initiatives/agents — v1.0
+- ✓ **Task breakdown system** — initiative → phases → plans → tasks with SQLite backing — v1.0
+- ✓ **Orchestration layer** — spawn Claude Code agents, track running work, dispatch tasks from queue — v1.0
+- ✓ **Worktree management** — isolated git worktrees per agent; automatic setup/teardown — v1.0
+- ✓ **Coordination layer** — merge agent outputs in dependency order, detect conflicts, hand back for resolution — v1.0
+- ✓ **E2E test coverage** — MockAgentManager, TestHarness, 34 E2E tests proving dispatch/coordination works — v1.1

 ### Active

-- [ ] **Task breakdown system** — GSD-style initiative → phases → plans → tasks with SQLite backing
-- [ ] **Orchestration layer** — spawn Claude Code agents, track running work, dispatch tasks from queue
 - [ ] **File system UI (fsui)** — bidirectional sync between SQLite and filesystem; agent messages appear as files, user responds by editing files
-- [ ] **Worktree management** — isolated git worktrees per agent; automatic setup/teardown; agents work in parallel without merge conflicts
-- [ ] **CLI `cw`** — single binary, server mode via `--server`, commands for tasks/initiatives/agents
+- [ ] **Real agent integration tests** — tests with actual Claude Code CLI (not mocked)
+- [ ] **Production hardening** — error handling, logging improvements, graceful degradation

 ### Out of Scope

@@ -32,6 +35,16 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 - Knowledge capture suggestions — future feature to auto-extend CLAUDE.md
 - Multi-user support — solo developer first, stub for future

+## Current State
+
+**Shipped:** v1.1 Test Infrastructure (2026-01-31)
+- Full orchestration system: CLI, database, git worktrees, agent lifecycle, dispatch, coordination
+- 34 E2E tests with MockAgentManager proving all scenarios work
+- Structured agent output schema with Zod validation
+- ~15,000 LOC TypeScript across 130+ files
+
+**Tech stack:** TypeScript, tRPC, SQLite/Drizzle, Vitest, Hexagonal architecture
+
 ## Context

 **Pain point:** Running multiple Claude Code agents in separate terminals. Losing track of what each is doing. Hard to parallelize work. Agents collide on the same files. No central coordination.
@@ -68,4 +81,4 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 | Terminal inbox via fsui, not TUI | Less code, leverage existing editor, bidirectional fs sync already planned | — Pending |

 ---
-*Last updated: 2026-01-30 after initialization*
+*Last updated: 2026-01-31 after v1.1 milestone*
@@ -115,54 +115,31 @@ Plans:

 </details>

-### ✅ v1.1 Test Infrastructure (Shipped 2026-01-31)
+<details>
+<summary>✅ v1.1 Test Infrastructure (Phases 7-9) - SHIPPED 2026-01-31</summary>

 **Milestone Goal:** E2E test coverage with mocked agents proving all dispatch/coordination scenarios work end-to-end

-#### Phase 7: Mock Agent & Test Harness
+**Full details:** [milestones/v1.1-ROADMAP.md](milestones/v1.1-ROADMAP.md)

-**Goal**: Mock agent adapter with configurable scenarios + test harness foundation with DB-seeded fixtures
-**Depends on**: v1.0 complete
-**Research**: Unlikely (internal test patterns, vitest already in codebase)
-**Plans**: TBD
-
-Plans:
+### Phase 7: Mock Agent & Test Harness
 - [x] 07-01: MockAgentManager Adapter
 - [x] 07-02: Test Harness with Database Fixtures

-#### Phase 8: E2E Scenario Tests
-
-**Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
-**Depends on**: Phase 7
-**Research**: Unlikely (testing existing functionality)
-**Plans**: 2 plans
-
-Plans:
+### Phase 8: E2E Scenario Tests
 - [x] 08-01: Happy Path E2E Tests
 - [x] 08-02: Edge Case E2E Tests

-#### Phase 8.1: Agent Output Schema (INSERTED)
-
-**Goal**: Define structured agent output schema (done/question/error discriminated union) and update ClaudeAgentManager to use `--json-schema` flag for validated output parsing
-**Depends on**: Phase 8
-**Research**: Unlikely (Zod schemas, Claude CLI flags documented)
-**Plans**: 2 plans
-
-Plans:
+### Phase 8.1: Agent Output Schema (INSERTED)
 - [x] 08.1-01: Agent Output Schema & ClaudeAgentManager
 - [x] 08.1-02: MockAgentManager Schema Alignment

-#### Phase 9: Extended Scenarios
-
-**Goal**: Extended E2E scenario coverage — conflict hand-back round-trip, multi-agent parallel work, recovery/resume flows
-**Depends on**: Phase 8.1
-**Research**: Unlikely (testing existing functionality)
-**Plans**: 2 plans
-
-Plans:
+### Phase 9: Extended Scenarios
 - [x] 09-01: Conflict & Parallel E2E Tests
 - [x] 09-02: Recovery & Resume E2E Tests
+
+</details>

 ## Progress

 **Execution Order:**
@@ -2,19 +2,18 @@

 ## Project Reference

-See: .planning/PROJECT.md (updated 2026-01-30)
+See: .planning/PROJECT.md (updated 2026-01-31)

 **Core value:** Coordinate multiple Claude Code agents without losing track or stepping on each other.
-**Current focus:** v1.1 Test Infrastructure — E2E test coverage with mocked agents
+**Current focus:** Planning next milestone

 ## Current Position

-Phase: 9 of 9 (Extended Scenarios)
-Plan: 2 of 2 in current phase
-Status: Phase complete - Milestone v1.1 complete
-Last activity: 2026-01-31 — Completed 09-02-PLAN.md
+Milestone: v1.1 complete
+Status: Ready to plan next milestone
+Last activity: 2026-01-31 — v1.1 Test Infrastructure shipped

-Progress: ██████████ 100%
+Progress: ██████████ 100% (v1.0 + v1.1 complete)

 ## Performance Metrics

102
.planning/milestones/v1.1-ROADMAP.md
Normal file
@@ -0,0 +1,102 @@
# Milestone v1.1: Test Infrastructure

**Status:** SHIPPED 2026-01-31
**Phases:** 7-9 (including 8.1 inserted)
**Total Plans:** 8

## Overview

E2E test coverage with mocked agents proving all dispatch/coordination scenarios work end-to-end. MockAgentManager enables testing without real Claude CLI, TestHarness provides full system wiring with database fixtures.

## Phases

### Phase 7: Mock Agent & Test Harness

**Goal**: Mock agent adapter with configurable scenarios + test harness foundation with DB-seeded fixtures
**Depends on**: v1.0 complete
**Plans**: 2 plans

Plans:
- [x] 07-01: MockAgentManager Adapter
- [x] 07-02: Test Harness with Database Fixtures

**Key deliverables:**
- MockAgentManager implementing full AgentManager port
- MockAgentScenario for configurable outcomes (success, crash, waiting)
- TestHarness with full system wiring (Dispatch, Coordination, Mock agents)
- Fixture helpers (SIMPLE_FIXTURE, PARALLEL_FIXTURE, COMPLEX_FIXTURE)
- MockWorktreeManager with configurable merge results
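
To make the idea of a scenario-driven mock adapter concrete, here is a minimal sketch, assuming a much smaller port than the real AgentManager. Every name in it (`AgentPort`, `SketchMockAgentManager`, `setScenario`, the status strings other than the three scenario names) is hypothetical and chosen for illustration; the project's actual interfaces are not shown in this commit.

```typescript
// All names and shapes below are illustrative assumptions, not the project's real API.
type Scenario = "success" | "crash" | "waiting_for_input";
type AgentStatus = "idle" | "crashed" | "waiting_for_input";

// A deliberately tiny stand-in for an agent-manager port.
interface AgentPort {
  spawn(name: string): string;
  status(id: string): AgentStatus | undefined;
}

class SketchMockAgentManager implements AgentPort {
  // State lives in plain Maps, so no database is needed.
  private agents = new Map<string, AgentStatus>();
  private scenarios = new Map<string, Scenario>();
  private nextId = 1;

  // Configure which outcome spawning an agent with this name should produce.
  setScenario(name: string, scenario: Scenario): void {
    this.scenarios.set(name, scenario);
  }

  // Spawning resolves immediately to the configured outcome (default: success).
  spawn(name: string): string {
    const id = `agent-${this.nextId++}`;
    const scenario = this.scenarios.get(name) ?? "success";
    const outcome: AgentStatus =
      scenario === "success" ? "idle" : scenario === "crash" ? "crashed" : "waiting_for_input";
    this.agents.set(id, outcome);
    return id;
  }

  status(id: string): AgentStatus | undefined {
    return this.agents.get(id);
  }
}
```

A test could then call `setScenario("worker", "crash")` before dispatching and assert how the rest of the system reacts to the crashed agent, without starting any real CLI process.
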

### Phase 8: E2E Scenario Tests

**Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
**Depends on**: Phase 7
**Plans**: 2 plans

Plans:
- [x] 08-01: Happy Path E2E Tests
- [x] 08-02: Edge Case E2E Tests

**Key deliverables:**
- 6 happy path tests (single task, parallel dispatch, merge flow, complex dependencies)
- 14 edge case tests (agent crash, merge conflicts, blocked tasks, waiting agents)
- Test patterns for E2E scenarios with fake timers

### Phase 8.1: Agent Output Schema (INSERTED)

**Goal**: Define structured agent output schema (done/question/error discriminated union) and update ClaudeAgentManager to use `--json-schema` flag for validated output parsing
**Depends on**: Phase 8
**Plans**: 2 plans

Plans:
- [x] 08.1-01: Agent Output Schema & ClaudeAgentManager
- [x] 08.1-02: MockAgentManager Schema Alignment

**Key deliverables:**
- Zod schema with discriminated union (done/question/unrecoverable_error)
- JSON schema export for Claude CLI --json-schema flag
- ClaudeAgentManager parsing structured output
- MockAgentManager aligned with schema
- TestHarness convenience methods (setAgentDone, setAgentQuestion, setAgentError)
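
The discriminated union on `status` can be illustrated without any dependencies. The sketch below uses a plain TypeScript union and a hand-written parser in place of Zod, so it is not the project's schema. The three status values come from the deliverables above, while the payload field names (`result`, `question`, `error`) and the function names are assumptions made for this example.

```typescript
// Field names other than `status` are assumptions made for this sketch.
type AgentOutput =
  | { status: "done"; result: string }
  | { status: "question"; question: string }
  | { status: "unrecoverable_error"; error: string };

// Parse raw JSON text and validate it against the union above.
function parseAgentOutput(raw: string): AgentOutput {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("agent output must be a JSON object");
  }
  const record = value as Record<string, unknown>;

  // Helper: read a required string field or fail loudly.
  const text = (key: string): string => {
    const field = record[key];
    if (typeof field !== "string") throw new Error(`missing string field: ${key}`);
    return field;
  };

  // The `status` field selects which variant the rest of the object must match.
  switch (record["status"]) {
    case "done":
      return { status: "done", result: text("result") };
    case "question":
      return { status: "question", question: text("question") };
    case "unrecoverable_error":
      return { status: "unrecoverable_error", error: text("error") };
    default:
      throw new Error(`unknown status: ${String(record["status"])}`);
  }
}

// Consumers narrow on `status`; the compiler checks that every variant is handled.
function describeOutput(output: AgentOutput): string {
  switch (output.status) {
    case "done":
      return `finished: ${output.result}`;
    case "question":
      return `needs input: ${output.question}`;
    case "unrecoverable_error":
      return `failed: ${output.error}`;
  }
}
```

Compared with string matching on free-form output, malformed or unexpected output is rejected at the parsing boundary, and downstream code only ever sees one of the three known variants.
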

### Phase 9: Extended Scenarios

**Goal**: Extended E2E scenario coverage — conflict hand-back round-trip, multi-agent parallel work, recovery/resume flows
**Depends on**: Phase 8.1
**Plans**: 2 plans

Plans:
- [x] 09-01: Conflict & Parallel E2E Tests
- [x] 09-02: Recovery & Resume E2E Tests

**Key deliverables:**
- 6 conflict and parallel tests (conflict cycle, resolution, parallel merges)
- 8 recovery and Q&A tests (state persistence, crash recovery, multi-question flows)
- Proof that database is source of truth for all state

---

## Milestone Summary

**Decimal Phases:**
- Phase 8.1: Agent Output Schema (inserted after Phase 8 for structured agent output)

**Key Decisions:**
- MockAgentManager uses in-memory Map (no database needed)
- TestHarness pre-seeds idle agents before dispatch
- Discriminated union on status field for agent output
- JSON schema passed to Claude CLI via --json-schema flag
- Database is source of truth for recovery scenarios

**Issues Resolved:**
- Agent output parsing was hacky string matching — now structured schema
- No way to test dispatch/coordination without real Claude CLI — MockAgentManager solves this

**Issues Deferred:**
- None

**Technical Debt Incurred:**
- None

---
*For current project status, see .planning/ROADMAP.md*