docs(13-01): complete Real Claude CLI Integration Tests plan

Tasks completed: 3/3 - Create real Claude CLI integration test file - Fix ClaudeAgentManager to parse structured_output - Run real Claude tests and document findings SUMMARY: .planning/phases/13-real-claude-e2e-tests/13-01-SUMMARY.md Milestone v1.2 complete (21 plans, 4 phases)
2026-02-02 10:41:47 +01:00
parent accbaca49d
commit 2dc51c74d3
3 changed files with 143 additions and 16 deletions
--- a/.planning/phases/13-real-claude-e2e-tests/13-01-SUMMARY.md
+++ b/.planning/phases/13-real-claude-e2e-tests/13-01-SUMMARY.md
@@ -0,0 +1,119 @@
+---
+phase: 13-real-claude-e2e-tests
+plan: 01
+subsystem: testing
+tags: [claude-cli, json-schema, integration-tests, structured-output]
+
+# Dependency graph
+requires:
+  - phase: 11-architect-agent
+    provides: Agent mode schemas (execute, discuss, breakdown, decompose)
+  - phase: 12-phase-task-decomposition
+    provides: Decompose mode schema
+provides:
+  - Real Claude CLI integration tests for schema validation
+  - Fix for structured_output parsing in ClaudeAgentManager
+  - Documentation of actual Claude CLI response structure
+affects: [agent-manager, mock-agent-manager, future-cli-integration]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns:
+    - Real CLI integration tests skipped by default (env var to enable)
+    - structured_output field for JSON schema responses
+
+key-files:
+  created:
+    - src/test/integration/real-claude.test.ts
+  modified:
+    - src/agent/manager.ts
+
+key-decisions:
+  - "Use structured_output field (not result) when --json-schema is used"
+  - "Integration tests skipped by default (REAL_CLAUDE_TESTS=1 to enable)"
+  - "Test timeout of 120s for real API calls"
+
+patterns-established:
+  - "Real CLI integration tests as validation tool, not CI suite"
+
+# Metrics
+duration: 4min
+completed: 2026-02-02
+---
+
+# Phase 13 Plan 01: Real Claude CLI Integration Tests Summary
+
+**Integration tests for validating JSON schemas with real Claude CLI, discovered result field is empty when using --json-schema (structured_output contains data)**
+
+## Performance
+
+- **Duration:** 4 min
+- **Started:** 2026-02-02T09:36:37Z
+- **Completed:** 2026-02-02T09:40:10Z
+- **Tasks:** 3
+- **Files modified:** 2
+
+## Accomplishments
+
+- Created integration test suite for all agent modes (execute, discuss, breakdown, decompose)
+- Fixed ClaudeAgentManager to correctly read from `structured_output` field
+- Documented actual Claude CLI response structure with `--json-schema` flag
+- Validated MockAgentManager accurately simulates real CLI behavior
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: Create real Claude CLI integration test file** - `3c98dbe` (test)
+2. **Task 2: Fix ClaudeAgentManager to parse structured_output** - `5605547` (fix)
+3. **Task 3: Run real Claude tests and document findings** - `accbaca` (docs)
+
+## Files Created/Modified
+
+- `src/test/integration/real-claude.test.ts` - Integration tests for all agent mode schemas
+- `src/agent/manager.ts` - Added `structured_output` field to ClaudeCliResult, fixed parsing
+
+## Decisions Made
+
+1. **Use `structured_output` field for JSON schema responses** - When using `--json-schema` flag, Claude CLI returns structured data in `structured_output` field, not `result`. The `result` field is empty in this case.
+
+2. **Integration tests skipped by default** - Tests call real Claude API and incur costs (~$0.025 per call). Enable with `REAL_CLAUDE_TESTS=1` environment variable.
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Issues Encountered
+
+None.
+
+## Key Finding: Claude CLI Response Structure
+
+When using `--json-schema` flag:
+```json
+{
+  "type": "result",
+  "subtype": "success",
+  "result": "",                        // EMPTY
+  "structured_output": { ... },        // Actual validated JSON here
+  "session_id": "...",
+  "total_cost_usd": 0.025
+}
+```
+
+This is different from non-schema mode where `result` contains the text response.
+
+## User Setup Required
+
+None - no external service configuration required.
+
+## Next Phase Readiness
+
+- Integration tests in place for schema validation
+- ClaudeAgentManager correctly handles structured_output
+- Ready to use real CLI tests for future schema changes
+
+---
+*Phase: 13-real-claude-e2e-tests*
+*Completed: 2026-02-02*