From 9f149f5f9bebe4b08cd406eab4300f905317dd44 Mon Sep 17 00:00:00 2001
From: Lukas May
Date: Mon, 2 Feb 2026 10:46:41 +0100
Subject: [PATCH] chore: archive v1.2 milestone

- Added v1.2 entry to MILESTONES.md
- Created milestones/v1.2-ROADMAP.md archive
- Updated ROADMAP.md with archive link
- Evolved PROJECT.md with v1.2 validated requirements
- Updated STATE.md for next milestone planning
---
 .planning/MILESTONES.md                       |  27 ++++
 .planning/PROJECT.md                          |  16 ++-
 .planning/ROADMAP.md                          |   2 +
 .planning/STATE.md                            |  16 +--
 .planning/milestones/v1.2-ROADMAP.md          | 118 ++++++++++++++++
 .../11-architect-agent/11-08-SUMMARY.md       | 132 ++++++++++++++++++
 6 files changed, 297 insertions(+), 14 deletions(-)
 create mode 100644 .planning/milestones/v1.2-ROADMAP.md
 create mode 100644 .planning/phases/11-architect-agent/11-08-SUMMARY.md

diff --git a/.planning/MILESTONES.md b/.planning/MILESTONES.md
index 5d213c1..2509d2d 100644
--- a/.planning/MILESTONES.md
+++ b/.planning/MILESTONES.md
@@ -1,5 +1,32 @@
 # Project Milestones: Codewalk District
 
+## v1.2 Architect & Multi-Question (Shipped: 2026-02-02)
+
+**Delivered:** Structured planning workflow with Architect agent modes and efficient multi-question Q&A with batched answers.
+
+**Phases completed:** 10-13 (21 plans total)
+
+**Key accomplishments:**
+
+- Multi-question schema with batched answers for efficient agent Q&A
+- Architect agent with discuss/breakdown/decompose modes for planning
+- Phase-task decomposition workflow generating tasks from plans
+- Real Claude CLI integration tests validating JSON schema handling
+- Fixed structured_output parsing for Claude CLI --json-schema flag
+
+**Stats:**
+
+- ~40 files created/modified
+- ~27,600 lines of TypeScript total
+- 4 phases, 21 plans
+- 2 days from start to ship
+
+**Git range:** `feat(10-01)` → `docs(13-01)`
+
+**What's next:** File system UI (fsui), production hardening
+
+---
+
 ## v1.1 Test Infrastructure (Shipped: 2026-01-31)
 
 **Delivered:** Complete E2E test coverage with mocked agents proving dispatch and coordination work correctly.
diff --git a/.planning/PROJECT.md b/.planning/PROJECT.md
index b693314..d2f272c 100644
--- a/.planning/PROJECT.md
+++ b/.planning/PROJECT.md
@@ -20,11 +20,13 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 - ✓ **Worktree management** — isolated git worktrees per agent; automatic setup/teardown — v1.0
 - ✓ **Coordination layer** — merge agent outputs in dependency order, detect conflicts, hand back for resolution — v1.0
 - ✓ **E2E test coverage** — MockAgentManager, TestHarness, 34 E2E tests proving dispatch/coordination works — v1.1
+- ✓ **Multi-question Q&A** — batched questions with id-based answer correlation, efficient agent pauses — v1.2
+- ✓ **Architect agent modes** — discuss, breakdown, decompose for structured planning workflow — v1.2
+- ✓ **Real CLI validation** — integration tests confirming Claude CLI JSON schema handling — v1.2
 
 ### Active
 
 - [ ] **File system UI (fsui)** — bidirectional sync between SQLite and filesystem; agent messages appear as files, user responds by editing files
-- [ ] **Real agent integration tests** — tests with actual Claude Code CLI (not mocked)
 - [ ] **Production hardening** — error handling, logging improvements, graceful degradation
 
 ### Out of Scope
@@ -37,11 +39,13 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 
 ## Current State
 
-**Shipped:** v1.1 Test Infrastructure (2026-01-31)
+**Shipped:** v1.2 Architect & Multi-Question (2026-02-02)
 - Full orchestration system: CLI, database, git worktrees, agent lifecycle, dispatch, coordination
-- 34 E2E tests with MockAgentManager proving all scenarios work
-- Structured agent output schema with Zod validation
-- ~15,000 LOC TypeScript across 130+ files
+- 40+ E2E tests with MockAgentManager proving all scenarios work
+- Architect agent with discuss/breakdown/decompose modes for planning
+- Multi-question Q&A with batched answers
+- Real Claude CLI integration tests validating schema handling
+- ~27,600 LOC TypeScript across 150+ files
 
 **Tech stack:** TypeScript, tRPC, SQLite/Drizzle, Vitest, Hexagonal architecture
 
@@ -81,4 +85,4 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 | Terminal inbox via fsui, not TUI | Less code, leverage existing editor, bidirectional fs sync already planned | — Pending |
 
 ---
-*Last updated: 2026-01-31 after v1.1 milestone*
+*Last updated: 2026-02-02 after v1.2 milestone*
diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index f11e1dd..31140aa 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -148,6 +148,8 @@ Plans:
 
 **Full details:** [milestones/v1.2-ROADMAP.md](milestones/v1.2-ROADMAP.md)
 
+**Full details:** [milestones/v1.2-ROADMAP.md](milestones/v1.2-ROADMAP.md)
+
 ### Phase 10: Multi-Question Schema
 **Goal**: Extend agent output schema to return multiple questions; resume agent with all answers batched
 **Depends on**: Phase 9 (v1.1 complete)
diff --git a/.planning/STATE.md b/.planning/STATE.md
index b568dae..5428080 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,19 +2,19 @@
 
 ## Project Reference
 
-See: .planning/PROJECT.md (updated 2026-01-31)
+See: .planning/PROJECT.md (updated 2026-02-02)
 
 **Core value:** Coordinate multiple Claude Code agents without losing track or stepping on each other.
-**Current focus:** v1.2 Architect & Multi-Question
+**Current focus:** Planning next milestone
 
 ## Current Position
 
-Phase: 13 of 13 (Real Claude E2E Tests)
-Plan: 1 of 1 in current phase
-Status: Milestone complete
-Last activity: 2026-02-02 — Completed 13-01-PLAN.md
+Phase: v1.2 complete
+Plan: N/A
+Status: Ready to plan next milestone
+Last activity: 2026-02-02 — v1.2 milestone archived
 
-Progress: ██████████ 100%
+Progress: ██████████ 100% (v1.2)
 
 ## Performance Metrics
 
@@ -181,5 +181,5 @@ None.
 ## Session Continuity
 
 Last session: 2026-02-02
-Stopped at: Completed 13-01-PLAN.md (Real Claude CLI Integration Tests)
+Stopped at: Archived v1.2 milestone
 Resume file: None
diff --git a/.planning/milestones/v1.2-ROADMAP.md b/.planning/milestones/v1.2-ROADMAP.md
new file mode 100644
index 0000000..2f461f2
--- /dev/null
+++ b/.planning/milestones/v1.2-ROADMAP.md
@@ -0,0 +1,118 @@
+# Milestone v1.2: Architect & Multi-Question
+
+**Status:** ✅ SHIPPED 2026-02-02
+**Phases:** 10-13
+**Total Plans:** 21
+
+## Overview
+
+Enable structured planning workflow with Architect agent and efficient multi-question Q&A. Agents can now ask multiple questions at once with batched answers, run in discuss/breakdown/decompose modes to generate phases and tasks, and real Claude CLI integration tests validate the schema handling.
+
+## Phases
+
+### Phase 10: Multi-Question Schema
+
+**Goal**: Extend agent output schema to return multiple questions; resume agent with all answers batched
+**Depends on**: Phase 9 (v1.1 complete)
+**Plans**: 4 plans
+
+Plans:
+- [x] 10-01: Schema & Type Updates
+- [x] 10-02: Manager Implementation
+- [x] 10-03: TestHarness & Test Updates
+- [x] 10-04: E2E Test Updates
+
+**Key deliverables:**
+- Questions array schema with id field for answer correlation
+- Batched answers via resume() with Record mapping
+- AgentWaitingEvent with questions array payload
+- Multi-question E2E test validating full flow
+
+### Phase 11: Architect Agent
+
+**Goal**: Agent modes for concept refinement (questioning) and phase breakdown (persisting to ROADMAP.md)
+**Depends on**: Phase 10
+**Plans**: 8 plans
+
+Plans:
+- [x] 11-01: Agent Mode Schema Extension
+- [x] 11-02: Initiative & Phase Repositories
+- [x] 11-03: ClaudeAgentManager Mode Support
+- [x] 11-04: Initiative & Phase tRPC Procedures
+- [x] 11-05: Architect Spawn Procedures
+- [x] 11-06: CLI Commands
+- [x] 11-07: Unit Tests
+- [x] 11-08: E2E Tests
+
+**Key deliverables:**
+- AgentMode type (execute, discuss, breakdown, decompose)
+- Discuss mode outputs decisions array
+- Breakdown mode outputs phases array with dependencies
+- Initiative and Phase repositories with tRPC procedures
+- Agent prompts module for mode-specific prompts
+- Full workflow E2E test (discuss -> breakdown -> phases)
+
+### Phase 12: Phase-Task Decomposition
+
+**Goal**: Agents break phases into individual tasks with ability to ask questions during breakdown
+**Depends on**: Phase 11
+**Plans**: 8 plans
+
+Plans:
+- [x] 12-01: Decompose Mode Schema
+- [x] 12-02: PlanRepository Extensions
+- [x] 12-03: ClaudeAgentManager Decompose Support
+- [x] 12-04: Plan & Task tRPC Procedures
+- [x] 12-05: Decompose Prompts & Spawn Procedure
+- [x] 12-06: CLI Commands
+- [x] 12-07: Unit Tests
+- [x] 12-08: E2E Tests
+
+**Key deliverables:**
+- Decompose mode schema with TaskBreakdown array
+- Task dependencies via integer references
+- PlanRepository with getNextNumber for auto-numbering
+- createTasksFromDecomposition tRPC procedure
+- Full workflow E2E test (initiative -> phase -> plan -> decompose -> tasks)
+
+### Phase 13: Real Claude E2E Tests
+
+**Goal**: Verify multi-question and architect flows with actual Claude CLI; replace with mocks after verification
+**Depends on**: Phase 12
+**Plans**: 1 plan
+
+Plans:
+- [x] 13-01: Real Claude CLI Integration Tests
+
+**Key deliverables:**
+- Integration tests for all agent modes (execute, discuss, breakdown, decompose)
+- Fixed structured_output parsing in ClaudeAgentManager
+- Documentation of Claude CLI response structure with --json-schema flag
+- Validation that MockAgentManager accurately simulates real CLI behavior
+
+---
+
+## Milestone Summary
+
+**Key Decisions:**
+- Status 'questions' (plural) for array-based question payload
+- Each question has id field for matching answers in batched resume
+- AgentMode stored in database with 'execute' default for backwards compatibility
+- Separate handler methods per mode (handleExecuteOutput, handleDiscussOutput, etc.)
+- Use structured_output field (not result) when --json-schema is used
+- Integration tests skipped by default (REAL_CLAUDE_TESTS=1 to enable)
+
+**Issues Resolved:**
+- Single question per pause was inefficient — now batched questions
+- No planning workflow — Architect agent with discuss/breakdown/decompose modes
+- JSON schema validation untested with real CLI — integration tests confirm behavior
+- structured_output parsing incorrect — fixed to read correct field
+
+**Issues Deferred:**
+- None
+
+**Technical Debt Incurred:**
+- None
+
+---
+*For current project status, see .planning/ROADMAP.md*
diff --git a/.planning/phases/11-architect-agent/11-08-SUMMARY.md b/.planning/phases/11-architect-agent/11-08-SUMMARY.md
new file mode 100644
index 0000000..a895ba5
--- /dev/null
+++ b/.planning/phases/11-architect-agent/11-08-SUMMARY.md
@@ -0,0 +1,132 @@
+---
+phase: 11-architect-agent
+plan: 08
+subsystem: test
+tags: [e2e-tests, architect, test-harness, discuss-mode, breakdown-mode]
+
+# Dependency graph
+requires:
+  - phase: 11-05
+    provides: spawnArchitectDiscuss, spawnArchitectBreakdown procedures
+  - phase: 11-06
+    provides: Initiative and architect CLI commands
+  - phase: 11-07
+    provides: Unit tests for modes and repositories
+provides:
+  - TestHarness with tRPC caller and architect scenario helpers
+  - E2E tests for discuss mode completion and Q&A flow
+  - E2E tests for breakdown mode and phase persistence
+  - Full workflow test: discuss -> breakdown -> phases
+affects: [testing-infrastructure, e2e-coverage]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns:
+    - "TestHarness tRPC caller for direct procedure invocation"
+    - "Architect scenario helpers wrapping MockAgentScenario"
+
+key-files:
+  created:
+    - src/test/e2e/architect-workflow.test.ts
+  modified:
+    - src/test/harness.ts
+    - src/test/index.ts
+    - src/agent/mock-manager.test.ts
+
+key-decisions:
+  - "TestHarness wired with tRPC caller and initiative/phase repositories"
+  - "Architect scenario helpers via MockAgentManager (context_complete, breakdown_complete)"
+  - "E2E tests cover full discuss -> breakdown -> phase persistence workflow"
+
+patterns-established:
+  - "TestHarness as integration point for tRPC-based E2E testing"
+  - "Scenario helpers for mode-specific agent behaviors"
+
+# Metrics
+duration: 4min
+completed: 2026-01-31
+---
+
+# Phase 11 Plan 08: TestHarness Helpers & Architect E2E Tests Summary
+
+**Added TestHarness architect mode support and comprehensive E2E tests for the complete architect workflow**
+
+## Performance
+
+- **Duration:** 4 min
+- **Started:** 2026-01-31T19:25:00Z
+- **Completed:** 2026-01-31T19:29:00Z
+- **Tasks:** 3
+- **Files modified:** 4
+
+## Accomplishments
+
+- Enhanced TestHarness with tRPC caller and initiative/phase repositories
+- Added architect-specific scenario helpers (setArchitectDiscussComplete, setArchitectBreakdownComplete)
+- Added convenience helpers (mockAgentManager alias, advanceTimers, getEmittedEvents)
+- Created comprehensive E2E tests for discuss mode (completion, Q&A flow)
+- Created E2E tests for breakdown mode and phase persistence
+- Added full workflow test covering discuss -> breakdown -> phases
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Add TestHarness helpers for architect modes** - `021937c` (feat)
+2. **Task 2: Add E2E test for discuss mode** - `ae130e9` (test)
+3. **Task 3: Add E2E test for breakdown mode and phase persistence** - `47b4623` (test)
+
+## Files Created/Modified
+
+- `src/test/harness.ts` - Added tRPC caller, repositories, architect helpers
+- `src/test/index.ts` - Export TRPCCaller type
+- `src/test/e2e/architect-workflow.test.ts` - New E2E test file (5 tests)
+- `src/agent/mock-manager.test.ts` - Fixed pre-existing test issues
+
+## Tests Added
+
+- **Discuss mode completion** - Spawn architect, complete with decisions
+- **Discuss Q&A flow** - Pause on questions, resume with answers
+- **Breakdown mode completion** - Spawn architect, complete with phases
+- **Phase persistence** - Create and retrieve phases from breakdown
+- **Full workflow** - Discuss -> Breakdown -> Phase persistence
+
+## Decisions Made
+
+1. **TestHarness tRPC caller** - Enables direct procedure invocation in tests
+2. **Architect scenario helpers** - Convenience wrappers for context_complete, breakdown_complete
+3. **Full workflow coverage** - Single test proving entire architect flow works
+
+## Deviations from Plan
+
+Minor fixes to pre-existing test issues (dependencies in PhaseBreakdown, type casting for AgentStoppedEvent).
+
+## Issues Encountered
+
+None
+
+## User Setup Required
+
+None - tests run automatically.
+
+## Phase 11 Completion
+
+This was the final plan in Phase 11 (Architect Agent). Phase 11 is now complete:
+- 11-01: Agent mode and schema updates
+- 11-02: Discuss and breakdown mode output schemas
+- 11-03: Mode-aware agent manager implementation
+- 11-04: Initiative and phase tRPC procedures
+- 11-05: Agent prompts module and architect spawn procedures
+- 11-06: Initiative and architect CLI commands
+- 11-07: Unit tests for modes and repositories
+- 11-08: TestHarness helpers and E2E tests (this plan)
+
+## Next Phase Readiness
+
+- Phase 11 complete - architect workflow fully tested
+- Ready for Phase 12 (if exists) or milestone completion
+
+---
+*Phase: 11-architect-agent*
+*Completed: 2026-01-31*