From aacfe5349266a9ae15f4bae8623e3a6443379edd Mon Sep 17 00:00:00 2001
From: Lukas May
Date: Sat, 31 Jan 2026 16:31:57 +0100
Subject: [PATCH] chore: complete v1.1 milestone

- Created MILESTONES.md with v1.0 and v1.1 entries
- Evolved PROJECT.md with validated requirements and current state
- Reorganized ROADMAP.md with collapsed v1.1 milestone
- Created milestone archive: milestones/v1.1-ROADMAP.md
- Updated STATE.md for next milestone planning
---
 .planning/MILESTONES.md              |  57 +++++++++++++++
 .planning/PROJECT.md                 |  25 +++++--
 .planning/ROADMAP.md                 |  41 +++--------
 .planning/STATE.md                   |  13 ++--
 .planning/milestones/v1.1-ROADMAP.md | 102 +++++++++++++++++++++++++++
 5 files changed, 193 insertions(+), 45 deletions(-)
 create mode 100644 .planning/MILESTONES.md
 create mode 100644 .planning/milestones/v1.1-ROADMAP.md

diff --git a/.planning/MILESTONES.md b/.planning/MILESTONES.md
new file mode 100644
index 0000000..5d213c1
--- /dev/null
+++ b/.planning/MILESTONES.md
@@ -0,0 +1,57 @@
+# Project Milestones: Codewalk District
+
+## v1.1 Test Infrastructure (Shipped: 2026-01-31)
+
+**Delivered:** Complete E2E test coverage with mocked agents proving dispatch and coordination work correctly.
+
+**Phases completed:** 7-9 (8 plans total, including Phase 8.1 inserted)
+
+**Key accomplishments:**
+
+- MockAgentManager adapter with configurable scenarios (success, crash, waiting_for_input)
+- TestHarness with full system wiring and database fixtures
+- 34 E2E tests covering happy paths, edge cases, conflicts, recovery, and Q&A flows
+- Structured agent output schema with Zod validation and --json-schema CLI integration
+- Proof that database is source of truth for state recovery
+
+**Stats:**
+
+- 37 files created/modified
+- 6,786 lines of TypeScript added
+- 4 phases (including 1 inserted), 8 plans
+- 1 day from start to ship
+
+**Git range:** `feat(07-01)` → `docs(09-01)`
+
+**What's next:** Production readiness, real agent integration testing
+
+---
+
+## v1.0 Core System (Shipped: 2026-01-30)
+
+**Delivered:** Full multi-agent orchestration system with CLI, database, git worktrees, agent lifecycle, task dispatch, and coordination.
+
+**Phases completed:** 1-6 (27 plans total, including Phase 1.1 inserted)
+
+**Key accomplishments:**
+
+- CLI binary (`cw`) with server mode, process management, graceful shutdown
+- Hexagonal architecture with event bus and tRPC
+- SQLite database with Drizzle ORM, task hierarchy schema
+- Git worktree management for agent isolation
+- Agent lifecycle (spawn, stop, resume) with Claude Code CLI integration
+- Task dispatch with dependency-ordered work queue
+- Coordination manager for merge handling and conflict detection
+
+**Stats:**
+
+- 100+ files created/modified
+- ~8,000 lines of TypeScript
+- 7 phases (including 1 inserted), 27 plans
+- 1 day from start to ship
+
+**Git range:** `feat(01-01)` → `docs(06-03)`
+
+**What's next:** v1.1 Test Infrastructure (completed)
+
+---
diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
index 6078e64..b693314 100644
--- a/.planning/PROJECT.md
+++ b/.planning/PROJECT.md
@@ -14,15 +14,18 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 
 ### Validated
 
-(None yet — ship to validate)
+- ✓ **CLI `cw`** — single binary, server mode via `--server`, commands for tasks/initiatives/agents — v1.0
+- ✓ **Task breakdown system** — initiative → phases → plans → tasks with SQLite backing — v1.0
+- ✓ **Orchestration layer** — spawn Claude Code agents, track running work, dispatch tasks from queue — v1.0
+- ✓ **Worktree management** — isolated git worktrees per agent; automatic setup/teardown — v1.0
+- ✓ **Coordination layer** — merge agent outputs in dependency order, detect conflicts, hand back for resolution — v1.0
+- ✓ **E2E test coverage** — MockAgentManager, TestHarness, 34 E2E tests proving dispatch/coordination works — v1.1
 
 ### Active
 
-- [ ] **Task breakdown system** — GSD-style initiative → phases → plans → tasks with SQLite backing
-- [ ] **Orchestration layer** — spawn Claude Code agents, track running work, dispatch tasks from queue
 - [ ] **File system UI (fsui)** — bidirectional sync between SQLite and filesystem; agent messages appear as files, user responds by editing files
-- [ ] **Worktree management** — isolated git worktrees per agent; automatic setup/teardown; agents work in parallel without merge conflicts
-- [ ] **CLI `cw`** — single binary, server mode via `--server`, commands for tasks/initiatives/agents
+- [ ] **Real agent integration tests** — tests with actual Claude Code CLI (not mocked)
+- [ ] **Production hardening** — error handling, logging improvements, graceful degradation
 
 ### Out of Scope
 
@@ -32,6 +35,16 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 - Knowledge capture suggestions — future feature to auto-extend CLAUDE.md
 - Multi-user support — solo developer first, stub for future
 
+## Current State
+
+**Shipped:** v1.1 Test Infrastructure (2026-01-31)
+- Full orchestration system: CLI, database, git worktrees, agent lifecycle, dispatch, coordination
+- 34 E2E tests with MockAgentManager proving all scenarios work
+- Structured agent output schema with Zod validation
+- ~15,000 LOC TypeScript across 130+ files
+
+**Tech stack:** TypeScript, tRPC, SQLite/Drizzle, Vitest, Hexagonal architecture
+
 ## Context
 
 **Pain point:** Running multiple Claude Code agents in separate terminals. Losing track of what each is doing. Hard to parallelize work. Agents collide on the same files. No central coordination.
@@ -68,4 +81,4 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 | Terminal inbox via fsui, not TUI | Less code, leverage existing editor, bidirectional fs sync already planned | — Pending |
 
 ---
-*Last updated: 2026-01-30 after initialization*
+*Last updated: 2026-01-31 after v1.1 milestone*
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index cf1e554..bab9e2b 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -115,54 +115,31 @@ Plans:
 
-### ✅ v1.1 Test Infrastructure (Shipped 2026-01-31)
+
+✅ v1.1 Test Infrastructure (Phases 7-9) - SHIPPED 2026-01-31
 
 **Milestone Goal:** E2E test coverage with mocked agents proving all dispatch/coordination scenarios work end-to-end
 
-#### Phase 7: Mock Agent & Test Harness
+**Full details:** [milestones/v1.1-ROADMAP.md](milestones/v1.1-ROADMAP.md)
 
-**Goal**: Mock agent adapter with configurable scenarios + test harness foundation with DB-seeded fixtures
-**Depends on**: v1.0 complete
-**Research**: Unlikely (internal test patterns, vitest already in codebase)
-**Plans**: TBD
-
-Plans:
+### Phase 7: Mock Agent & Test Harness
 - [x] 07-01: MockAgentManager Adapter
 - [x] 07-02: Test Harness with Database Fixtures
 
-#### Phase 8: E2E Scenario Tests
-
-**Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
-**Depends on**: Phase 7
-**Research**: Unlikely (testing existing functionality)
-**Plans**: 2 plans
-
-Plans:
+### Phase 8: E2E Scenario Tests
 - [x] 08-01: Happy Path E2E Tests
 - [x] 08-02: Edge Case E2E Tests
 
-#### Phase 8.1: Agent Output Schema (INSERTED)
-
-**Goal**: Define structured agent output schema (done/question/error discriminated union) and update ClaudeAgentManager to use `--json-schema` flag for validated output parsing
-**Depends on**: Phase 8
-**Research**: Unlikely (Zod schemas, Claude CLI flags documented)
-**Plans**: 2 plans
-
-Plans:
+### Phase 8.1: Agent Output Schema (INSERTED)
 - [x] 08.1-01: Agent Output Schema & ClaudeAgentManager
 - [x] 08.1-02: MockAgentManager Schema Alignment
 
-#### Phase 9: Extended Scenarios
-
-**Goal**: Extended E2E scenario coverage — conflict hand-back round-trip, multi-agent parallel work, recovery/resume flows
-**Depends on**: Phase 8.1
-**Research**: Unlikely (testing existing functionality)
-**Plans**: 2 plans
-
-Plans:
+### Phase 9: Extended Scenarios
 - [x] 09-01: Conflict & Parallel E2E Tests
 - [x] 09-02: Recovery & Resume E2E Tests
 
+
+
 ## Progress
 
 **Execution Order:**
diff --git a/.planning/STATE.md b/.planning/STATE.md
index e366a8b..df5cbfb 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,19 +2,18 @@
 
 ## Project Reference
 
-See: .planning/PROJECT.md (updated 2026-01-30)
+See: .planning/PROJECT.md (updated 2026-01-31)
 
 **Core value:** Coordinate multiple Claude Code agents without losing track or stepping on each other.
-**Current focus:** v1.1 Test Infrastructure — E2E test coverage with mocked agents
+**Current focus:** Planning next milestone
 
 ## Current Position
 
-Phase: 9 of 9 (Extended Scenarios)
-Plan: 2 of 2 in current phase
-Status: Phase complete - Milestone v1.1 complete
-Last activity: 2026-01-31 — Completed 09-02-PLAN.md
+Milestone: v1.1 complete
+Status: Ready to plan next milestone
+Last activity: 2026-01-31 — v1.1 Test Infrastructure shipped
 
-Progress: ██████████ 100%
+Progress: ██████████ 100% (v1.0 + v1.1 complete)
 
 ## Performance Metrics
 
diff --git a/.planning/milestones/v1.1-ROADMAP.md b/.planning/milestones/v1.1-ROADMAP.md
new file mode 100644
index 0000000..ee02329
--- /dev/null
+++ b/.planning/milestones/v1.1-ROADMAP.md
@@ -0,0 +1,102 @@
+# Milestone v1.1: Test Infrastructure
+
+**Status:** SHIPPED 2026-01-31
+**Phases:** 7-9 (including 8.1 inserted)
+**Total Plans:** 8
+
+## Overview
+
+E2E test coverage with mocked agents proving all dispatch/coordination scenarios work end-to-end. MockAgentManager enables testing without real Claude CLI, TestHarness provides full system wiring with database fixtures.
+
+## Phases
+
+### Phase 7: Mock Agent & Test Harness
+
+**Goal**: Mock agent adapter with configurable scenarios + test harness foundation with DB-seeded fixtures
+**Depends on**: v1.0 complete
+**Plans**: 2 plans
+
+Plans:
+- [x] 07-01: MockAgentManager Adapter
+- [x] 07-02: Test Harness with Database Fixtures
+
+**Key deliverables:**
+- MockAgentManager implementing full AgentManager port
+- MockAgentScenario for configurable outcomes (success, crash, waiting)
+- TestHarness with full system wiring (Dispatch, Coordination, Mock agents)
+- Fixture helpers (SIMPLE_FIXTURE, PARALLEL_FIXTURE, COMPLEX_FIXTURE)
+- MockWorktreeManager with configurable merge results
+
+### Phase 8: E2E Scenario Tests
+
+**Goal**: Happy path tests (basic flow, dependencies, merging) + edge case tests (conflicts, interrupts, token limits)
+**Depends on**: Phase 7
+**Plans**: 2 plans
+
+Plans:
+- [x] 08-01: Happy Path E2E Tests
+- [x] 08-02: Edge Case E2E Tests
+
+**Key deliverables:**
+- 6 happy path tests (single task, parallel dispatch, merge flow, complex dependencies)
+- 14 edge case tests (agent crash, merge conflicts, blocked tasks, waiting agents)
+- Test patterns for E2E scenarios with fake timers
+
+### Phase 8.1: Agent Output Schema (INSERTED)
+
+**Goal**: Define structured agent output schema (done/question/error discriminated union) and update ClaudeAgentManager to use `--json-schema` flag for validated output parsing
+**Depends on**: Phase 8
+**Plans**: 2 plans
+
+Plans:
+- [x] 08.1-01: Agent Output Schema & ClaudeAgentManager
+- [x] 08.1-02: MockAgentManager Schema Alignment
+
+**Key deliverables:**
+- Zod schema with discriminated union (done/question/unrecoverable_error)
+- JSON schema export for Claude CLI --json-schema flag
+- ClaudeAgentManager parsing structured output
+- MockAgentManager aligned with schema
+- TestHarness convenience methods (setAgentDone, setAgentQuestion, setAgentError)
+
+### Phase 9: Extended Scenarios
+
+**Goal**: Extended E2E scenario coverage — conflict hand-back round-trip, multi-agent parallel work, recovery/resume flows
+**Depends on**: Phase 8.1
+**Plans**: 2 plans
+
+Plans:
+- [x] 09-01: Conflict & Parallel E2E Tests
+- [x] 09-02: Recovery & Resume E2E Tests
+
+**Key deliverables:**
+- 6 conflict and parallel tests (conflict cycle, resolution, parallel merges)
+- 8 recovery and Q&A tests (state persistence, crash recovery, multi-question flows)
+- Proof that database is source of truth for all state
+
+---
+
+## Milestone Summary
+
+**Decimal Phases:**
+- Phase 8.1: Agent Output Schema (inserted after Phase 8 for structured agent output)
+
+**Key Decisions:**
+- MockAgentManager uses in-memory Map (no database needed)
+- TestHarness pre-seeds idle agents before dispatch
+- Discriminated union on status field for agent output
+- JSON schema passed to Claude CLI via --json-schema flag
+- Database is source of truth for recovery scenarios
+
+**Issues Resolved:**
+- Agent output parsing was hacky string matching — now structured schema
+- No way to test dispatch/coordination without real Claude CLI — MockAgentManager solves this
+
+**Issues Deferred:**
+- None
+
+**Technical Debt Incurred:**
+- None
+
+---
+*For current project status, see .planning/ROADMAP.md*