chore: archive v1.2 milestone
- Added v1.2 entry to MILESTONES.md
- Created milestones/v1.2-ROADMAP.md archive
- Updated ROADMAP.md with archive link
- Evolved PROJECT.md with v1.2 validated requirements
- Updated STATE.md for next milestone planning
@@ -1,5 +1,32 @@
 # Project Milestones: Codewalk District
 
+## v1.2 Architect & Multi-Question (Shipped: 2026-02-02)
+
+**Delivered:** Structured planning workflow with Architect agent modes and efficient multi-question Q&A with batched answers.
+
+**Phases completed:** 10-13 (21 plans total)
+
+**Key accomplishments:**
+
+- Multi-question schema with batched answers for efficient agent Q&A
+- Architect agent with discuss/breakdown/decompose modes for planning
+- Phase-task decomposition workflow generating tasks from plans
+- Real Claude CLI integration tests validating JSON schema handling
+- Fixed structured_output parsing for Claude CLI --json-schema flag
+
+**Stats:**
+
+- ~40 files created/modified
+- ~27,600 lines of TypeScript total
+- 4 phases, 21 plans
+- 2 days from start to ship
+
+**Git range:** `feat(10-01)` → `docs(13-01)`
+
+**What's next:** File system UI (fsui), production hardening
+
+---
+
 ## v1.1 Test Infrastructure (Shipped: 2026-01-31)
 
 **Delivered:** Complete E2E test coverage with mocked agents proving dispatch and coordination work correctly.
@@ -20,11 +20,13 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 - ✓ **Worktree management** — isolated git worktrees per agent; automatic setup/teardown — v1.0
 - ✓ **Coordination layer** — merge agent outputs in dependency order, detect conflicts, hand back for resolution — v1.0
 - ✓ **E2E test coverage** — MockAgentManager, TestHarness, 34 E2E tests proving dispatch/coordination works — v1.1
+- ✓ **Multi-question Q&A** — batched questions with id-based answer correlation, efficient agent pauses — v1.2
+- ✓ **Architect agent modes** — discuss, breakdown, decompose for structured planning workflow — v1.2
+- ✓ **Real CLI validation** — integration tests confirming Claude CLI JSON schema handling — v1.2
 
 ### Active
 
 - [ ] **File system UI (fsui)** — bidirectional sync between SQLite and filesystem; agent messages appear as files, user responds by editing files
-- [ ] **Real agent integration tests** — tests with actual Claude Code CLI (not mocked)
 - [ ] **Production hardening** — error handling, logging improvements, graceful degradation
 
 ### Out of Scope
@@ -37,11 +39,13 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 
 ## Current State
 
-**Shipped:** v1.1 Test Infrastructure (2026-01-31)
+**Shipped:** v1.2 Architect & Multi-Question (2026-02-02)
 - Full orchestration system: CLI, database, git worktrees, agent lifecycle, dispatch, coordination
-- 34 E2E tests with MockAgentManager proving all scenarios work
-- Structured agent output schema with Zod validation
-- ~15,000 LOC TypeScript across 130+ files
+- 40+ E2E tests with MockAgentManager proving all scenarios work
+- Architect agent with discuss/breakdown/decompose modes for planning
+- Multi-question Q&A with batched answers
+- Real Claude CLI integration tests validating schema handling
+- ~27,600 LOC TypeScript across 150+ files
 
 **Tech stack:** TypeScript, tRPC, SQLite/Drizzle, Vitest, Hexagonal architecture
 
@@ -81,4 +85,4 @@ If everything else fails, this must work: spawn agents, assign work, know what's
 | Terminal inbox via fsui, not TUI | Less code, leverage existing editor, bidirectional fs sync already planned | — Pending |
 
 ---
-*Last updated: 2026-01-31 after v1.1 milestone*
+*Last updated: 2026-02-02 after v1.2 milestone*
@@ -148,6 +148,8 @@ Plans:
 
 **Full details:** [milestones/v1.2-ROADMAP.md](milestones/v1.2-ROADMAP.md)
 
+**Full details:** [milestones/v1.2-ROADMAP.md](milestones/v1.2-ROADMAP.md)
+
 ### Phase 10: Multi-Question Schema
 **Goal**: Extend agent output schema to return multiple questions; resume agent with all answers batched
 **Depends on**: Phase 9 (v1.1 complete)
@@ -2,19 +2,19 @@
 
 ## Project Reference
 
-See: .planning/PROJECT.md (updated 2026-01-31)
+See: .planning/PROJECT.md (updated 2026-02-02)
 
 **Core value:** Coordinate multiple Claude Code agents without losing track or stepping on each other.
-**Current focus:** v1.2 Architect & Multi-Question
+**Current focus:** Planning next milestone
 
 ## Current Position
 
-Phase: 13 of 13 (Real Claude E2E Tests)
-Plan: 1 of 1 in current phase
-Status: Milestone complete
-Last activity: 2026-02-02 — Completed 13-01-PLAN.md
+Phase: v1.2 complete
+Plan: N/A
+Status: Ready to plan next milestone
+Last activity: 2026-02-02 — v1.2 milestone archived
 
-Progress: ██████████ 100%
+Progress: ██████████ 100% (v1.2)
 
 ## Performance Metrics
 
@@ -181,5 +181,5 @@ None.
 ## Session Continuity
 
 Last session: 2026-02-02
-Stopped at: Completed 13-01-PLAN.md (Real Claude CLI Integration Tests)
+Stopped at: Archived v1.2 milestone
 Resume file: None
118
.planning/milestones/v1.2-ROADMAP.md
Normal file
@@ -0,0 +1,118 @@
+# Milestone v1.2: Architect & Multi-Question
+
+**Status:** ✅ SHIPPED 2026-02-02
+**Phases:** 10-13
+**Total Plans:** 21
+
+## Overview
+
+Enable structured planning workflow with Architect agent and efficient multi-question Q&A. Agents can now ask multiple questions at once with batched answers, run in discuss/breakdown/decompose modes to generate phases and tasks, and real Claude CLI integration tests validate the schema handling.
+
+## Phases
+
+### Phase 10: Multi-Question Schema
+
+**Goal**: Extend agent output schema to return multiple questions; resume agent with all answers batched
+**Depends on**: Phase 9 (v1.1 complete)
+**Plans**: 4 plans
+
+Plans:
+- [x] 10-01: Schema & Type Updates
+- [x] 10-02: Manager Implementation
+- [x] 10-03: TestHarness & Test Updates
+- [x] 10-04: E2E Test Updates
+
+**Key deliverables:**
+- Questions array schema with id field for answer correlation
+- Batched answers via resume() with Record<string, string> mapping
+- AgentWaitingEvent with questions array payload
+- Multi-question E2E test validating full flow
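
The id-based correlation listed in the Phase 10 deliverables can be illustrated with a minimal sketch. Only the `id` field and the `Record<string, string>` answer mapping come from the roadmap; the `AgentQuestion` shape and both function names are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical question shape: the roadmap only confirms the `id` field.
interface AgentQuestion {
  id: string;
  question: string;
}

// Batched answers keyed by question id, as described for resume().
type BatchedAnswers = Record<string, string>;

// Returns the ids of questions that have no entry in the answer batch.
function findUnansweredIds(
  questions: AgentQuestion[],
  answers: BatchedAnswers,
): string[] {
  return questions
    .filter((q) => !Object.prototype.hasOwnProperty.call(answers, q.id))
    .map((q) => q.id);
}

// Pairs each question with its answer and fails fast if any answer is missing.
function correlateAnswers(
  questions: AgentQuestion[],
  answers: BatchedAnswers,
): Array<AgentQuestion & { answer: string }> {
  const missing = findUnansweredIds(questions, answers);
  if (missing.length > 0) {
    throw new Error(`Missing answers for question ids: ${missing.join(", ")}`);
  }
  return questions.map((q) => ({ ...q, answer: answers[q.id] }));
}
```

Checking the whole batch before resuming is one possible design; it keeps an agent from being resumed with a partial answer set.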
+
+### Phase 11: Architect Agent
+
+**Goal**: Agent modes for concept refinement (questioning) and phase breakdown (persisting to ROADMAP.md)
+**Depends on**: Phase 10
+**Plans**: 8 plans
+
+Plans:
+- [x] 11-01: Agent Mode Schema Extension
+- [x] 11-02: Initiative & Phase Repositories
+- [x] 11-03: ClaudeAgentManager Mode Support
+- [x] 11-04: Initiative & Phase tRPC Procedures
+- [x] 11-05: Architect Spawn Procedures
+- [x] 11-06: CLI Commands
+- [x] 11-07: Unit Tests
+- [x] 11-08: E2E Tests
+
+**Key deliverables:**
+- AgentMode type (execute, discuss, breakdown, decompose)
+- Discuss mode outputs decisions array
+- Breakdown mode outputs phases array with dependencies
+- Initiative and Phase repositories with tRPC procedures
+- Agent prompts module for mode-specific prompts
+- Full workflow E2E test (discuss -> breakdown -> phases)
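
As a rough illustration of the Phase 11 deliverables, the sketch below models the four mode names and a breakdown result whose phases reference each other. The mode names and the `'execute'` default are stated in this roadmap; the `PhaseBreakdown` fields and both helper functions are assumptions made for the example.

```typescript
// Mode names as listed in the deliverables.
const AGENT_MODES = ["execute", "discuss", "breakdown", "decompose"] as const;
type AgentMode = (typeof AGENT_MODES)[number];

// Rows stored before modes existed have no value; fall back to "execute".
function resolveAgentMode(stored: string | null | undefined): AgentMode {
  if (stored == null) return "execute";
  const match = AGENT_MODES.find((mode) => mode === stored);
  if (match === undefined) throw new Error(`Unknown agent mode: ${stored}`);
  return match;
}

// Hypothetical shape for one entry of the breakdown "phases array".
interface PhaseBreakdown {
  number: number;
  name: string;
  dependencies: number[];
}

// Lists dependency numbers that do not match any phase in the same breakdown.
function findDanglingDependencies(phases: PhaseBreakdown[]): number[] {
  const known = new Set<number>(phases.map((p) => p.number));
  const dangling = new Set<number>();
  phases.forEach((phase) => {
    phase.dependencies.forEach((dep) => {
      if (!known.has(dep)) dangling.add(dep);
    });
  });
  return Array.from(dangling).sort((a, b) => a - b);
}
```

A check like `findDanglingDependencies` would run before phases are persisted, so a breakdown that refers to a nonexistent phase is rejected early.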
+
+### Phase 12: Phase-Task Decomposition
+
+**Goal**: Agents break phases into individual tasks with ability to ask questions during breakdown
+**Depends on**: Phase 11
+**Plans**: 8 plans
+
+Plans:
+- [x] 12-01: Decompose Mode Schema
+- [x] 12-02: PlanRepository Extensions
+- [x] 12-03: ClaudeAgentManager Decompose Support
+- [x] 12-04: Plan & Task tRPC Procedures
+- [x] 12-05: Decompose Prompts & Spawn Procedure
+- [x] 12-06: CLI Commands
+- [x] 12-07: Unit Tests
+- [x] 12-08: E2E Tests
+
+**Key deliverables:**
+- Decompose mode schema with TaskBreakdown array
+- Task dependencies via integer references
+- PlanRepository with getNextNumber for auto-numbering
+- createTasksFromDecomposition tRPC procedure
+- Full workflow E2E test (initiative -> phase -> plan -> decompose -> tasks)
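
The two mechanical ideas in the Phase 12 deliverables, integer dependency references and auto-numbering, might look roughly like the sketch below. It is not the project's `PlanRepository` or `createTasksFromDecomposition`: the `TaskBreakdown` fields, `PersistedTask`, `nextNumber`, and `resolveTaskDependencies` are hypothetical stand-ins that work on in-memory arrays.

```typescript
// Hypothetical shape of one decomposed task; dependencies point at other
// tasks in the same batch by their integer number.
interface TaskBreakdown {
  number: number;
  name: string;
  dependencies: number[];
}

// Hypothetical persisted form, where dependencies are stored as task ids.
interface PersistedTask {
  id: string;
  name: string;
  dependsOn: string[];
}

// Auto-numbering stand-in: one more than the highest existing number.
function nextNumber(existing: number[]): number {
  return existing.reduce((max, n) => Math.max(max, n), 0) + 1;
}

// Assigns an id to every task, then rewrites integer references to those ids.
function resolveTaskDependencies(
  tasks: TaskBreakdown[],
  makeId: (taskNumber: number) => string,
): PersistedTask[] {
  const idByNumber = new Map<number, string>();
  tasks.forEach((task) => idByNumber.set(task.number, makeId(task.number)));
  return tasks.map((task) => ({
    id: idByNumber.get(task.number) as string,
    name: task.name,
    dependsOn: task.dependencies.map((dep) => {
      const id = idByNumber.get(dep);
      if (id === undefined) {
        throw new Error(`Task ${task.number} depends on unknown task ${dep}`);
      }
      return id;
    }),
  }));
}
```

Integer references let the agent describe dependencies before any database ids exist; the ids are only assigned at persistence time.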
+
+### Phase 13: Real Claude E2E Tests
+
+**Goal**: Verify multi-question and architect flows with actual Claude CLI; replace with mocks after verification
+**Depends on**: Phase 12
+**Plans**: 1 plan
+
+Plans:
+- [x] 13-01: Real Claude CLI Integration Tests
+
+**Key deliverables:**
+- Integration tests for all agent modes (execute, discuss, breakdown, decompose)
+- Fixed structured_output parsing in ClaudeAgentManager
+- Documentation of Claude CLI response structure with --json-schema flag
+- Validation that MockAgentManager accurately simulates real CLI behavior
+
+---
+
+## Milestone Summary
+
+**Key Decisions:**
+- Status 'questions' (plural) for array-based question payload
+- Each question has id field for matching answers in batched resume
+- AgentMode stored in database with 'execute' default for backwards compatibility
+- Separate handler methods per mode (handleExecuteOutput, handleDiscussOutput, etc.)
+- Use structured_output field (not result) when --json-schema is used
+- Integration tests skipped by default (REAL_CLAUDE_TESTS=1 to enable)
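
Two of the decisions above lend themselves to a short sketch: reading `structured_output` rather than `result`, and gating the real-CLI tests on `REAL_CLAUDE_TESTS=1`. The envelope type below assumes only those two field names, which are the ones named in this roadmap; the rest of the CLI response structure is not shown here and is not modelled. Both function names are hypothetical.

```typescript
// Assumed envelope: only `result` and `structured_output` are named in the
// roadmap, so every other field of the CLI response is left out.
interface CliEnvelope {
  result?: unknown;
  structured_output?: unknown;
}

// Parses CLI stdout and returns the schema-validated payload, never `result`.
function extractStructuredOutput(rawStdout: string): unknown {
  const parsed: unknown = JSON.parse(rawStdout);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("CLI output is not a JSON object");
  }
  const envelope = parsed as CliEnvelope;
  if (envelope.structured_output === undefined) {
    throw new Error("structured_output is missing from the CLI output");
  }
  return envelope.structured_output;
}

// Opt-in gate for tests that call the real CLI; the environment is passed in
// so the check stays a pure function.
function realClaudeTestsEnabled(env: Record<string, string | undefined>): boolean {
  return env.REAL_CLAUDE_TESTS === "1";
}
```

In a test file the gate would typically decide whether a suite is skipped, for example by passing the process environment to `realClaudeTestsEnabled`.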
+
+**Issues Resolved:**
+- Single question per pause was inefficient — now batched questions
+- No planning workflow — Architect agent with discuss/breakdown/decompose modes
+- JSON schema validation untested with real CLI — integration tests confirm behavior
+- structured_output parsing incorrect — fixed to read correct field
+
+**Issues Deferred:**
+- None
+
+**Technical Debt Incurred:**
+- None
+
+---
+*For current project status, see .planning/ROADMAP.md*
132
.planning/phases/11-architect-agent/11-08-SUMMARY.md
Normal file
@@ -0,0 +1,132 @@
+---
+phase: 11-architect-agent
+plan: 08
+subsystem: test
+tags: [e2e-tests, architect, test-harness, discuss-mode, breakdown-mode]
+
+# Dependency graph
+requires:
+  - phase: 11-05
+    provides: spawnArchitectDiscuss, spawnArchitectBreakdown procedures
+  - phase: 11-06
+    provides: Initiative and architect CLI commands
+  - phase: 11-07
+    provides: Unit tests for modes and repositories
+provides:
+  - TestHarness with tRPC caller and architect scenario helpers
+  - E2E tests for discuss mode completion and Q&A flow
+  - E2E tests for breakdown mode and phase persistence
+  - Full workflow test: discuss -> breakdown -> phases
+affects: [testing-infrastructure, e2e-coverage]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns:
+    - "TestHarness tRPC caller for direct procedure invocation"
+    - "Architect scenario helpers wrapping MockAgentScenario"
+
+key-files:
+  created:
+    - src/test/e2e/architect-workflow.test.ts
+  modified:
+    - src/test/harness.ts
+    - src/test/index.ts
+    - src/agent/mock-manager.test.ts
+
+key-decisions:
+  - "TestHarness wired with tRPC caller and initiative/phase repositories"
+  - "Architect scenario helpers via MockAgentManager (context_complete, breakdown_complete)"
+  - "E2E tests cover full discuss -> breakdown -> phase persistence workflow"
+
+patterns-established:
+  - "TestHarness as integration point for tRPC-based E2E testing"
+  - "Scenario helpers for mode-specific agent behaviors"
+
+# Metrics
+duration: 4min
+completed: 2026-01-31
+---
+
+# Phase 11 Plan 08: TestHarness Helpers & Architect E2E Tests Summary
+
+**Added TestHarness architect mode support and comprehensive E2E tests for the complete architect workflow**
+
+## Performance
+
+- **Duration:** 4 min
+- **Started:** 2026-01-31T19:25:00Z
+- **Completed:** 2026-01-31T19:29:00Z
+- **Tasks:** 3
+- **Files modified:** 4
+
+## Accomplishments
+
+- Enhanced TestHarness with tRPC caller and initiative/phase repositories
+- Added architect-specific scenario helpers (setArchitectDiscussComplete, setArchitectBreakdownComplete)
+- Added convenience helpers (mockAgentManager alias, advanceTimers, getEmittedEvents)
+- Created comprehensive E2E tests for discuss mode (completion, Q&A flow)
+- Created E2E tests for breakdown mode and phase persistence
+- Added full workflow test covering discuss -> breakdown -> phases
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Add TestHarness helpers for architect modes** - `021937c` (feat)
+2. **Task 2: Add E2E test for discuss mode** - `ae130e9` (test)
+3. **Task 3: Add E2E test for breakdown mode and phase persistence** - `47b4623` (test)
+
+## Files Created/Modified
+
+- `src/test/harness.ts` - Added tRPC caller, repositories, architect helpers
+- `src/test/index.ts` - Export TRPCCaller type
+- `src/test/e2e/architect-workflow.test.ts` - New E2E test file (5 tests)
+- `src/agent/mock-manager.test.ts` - Fixed pre-existing test issues
+
+## Tests Added
+
+- **Discuss mode completion** - Spawn architect, complete with decisions
+- **Discuss Q&A flow** - Pause on questions, resume with answers
+- **Breakdown mode completion** - Spawn architect, complete with phases
+- **Phase persistence** - Create and retrieve phases from breakdown
+- **Full workflow** - Discuss -> Breakdown -> Phase persistence
+
+## Decisions Made
+
+1. **TestHarness tRPC caller** - Enables direct procedure invocation in tests
+2. **Architect scenario helpers** - Convenience wrappers for context_complete, breakdown_complete
+3. **Full workflow coverage** - Single test proving entire architect flow works
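
The "convenience wrapper" idea in decision 2 can be sketched as follows. This is not the project's TestHarness or MockAgentManager: the helper names and the two status names appear in this summary, but their signatures, the scenario fields, and the `SketchMockAgentManager` class are all assumptions made for illustration.

```typescript
// Hypothetical scenario shapes: only the status names context_complete and
// breakdown_complete appear in the summary; the payload fields are assumed.
type MockAgentScenario =
  | { status: "context_complete"; decisions: string[] }
  | { status: "breakdown_complete"; phases: { number: number; name: string }[] };

// Minimal stand-in for a mock manager: it replays a preset scenario per agent.
class SketchMockAgentManager {
  private readonly scenarios = new Map<string, MockAgentScenario>();

  setScenario(agentName: string, scenario: MockAgentScenario): void {
    this.scenarios.set(agentName, scenario);
  }

  run(agentName: string): MockAgentScenario {
    const scenario = this.scenarios.get(agentName);
    if (scenario === undefined) {
      throw new Error(`No scenario configured for agent "${agentName}"`);
    }
    return scenario;
  }
}

// Wrapper so a test states intent instead of spelling out the scenario object.
function setArchitectDiscussComplete(
  manager: SketchMockAgentManager,
  agentName: string,
  decisions: string[],
): void {
  manager.setScenario(agentName, { status: "context_complete", decisions });
}

// Same idea for breakdown mode.
function setArchitectBreakdownComplete(
  manager: SketchMockAgentManager,
  agentName: string,
  phases: { number: number; name: string }[],
): void {
  manager.setScenario(agentName, { status: "breakdown_complete", phases });
}
```

Wrappers of this kind keep each E2E test focused on the workflow step under test rather than on the exact scenario payload.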
+
+## Deviations from Plan
+
+Minor fixes to pre-existing test issues (dependencies in PhaseBreakdown, type casting for AgentStoppedEvent).
+
+## Issues Encountered
+
+None
+
+## User Setup Required
+
+None - tests run automatically.
+
+## Phase 11 Completion
+
+This was the final plan in Phase 11 (Architect Agent). Phase 11 is now complete:
+- 11-01: Agent mode and schema updates
+- 11-02: Discuss and breakdown mode output schemas
+- 11-03: Mode-aware agent manager implementation
+- 11-04: Initiative and phase tRPC procedures
+- 11-05: Agent prompts module and architect spawn procedures
+- 11-06: Initiative and architect CLI commands
+- 11-07: Unit tests for modes and repositories
+- 11-08: TestHarness helpers and E2E tests (this plan)
+
+## Next Phase Readiness
+
+- Phase 11 complete - architect workflow fully tested
+- Ready for Phase 12 (if exists) or milestone completion
+
+---
+*Phase: 11-architect-agent*
+*Completed: 2026-01-31*