diff --git a/.planning/phases/13-real-claude-e2e-tests/13-01-PLAN.md b/.planning/phases/13-real-claude-e2e-tests/13-01-PLAN.md
new file mode 100644
index 0000000..811597f
--- /dev/null
+++ b/.planning/phases/13-real-claude-e2e-tests/13-01-PLAN.md
@@ -0,0 +1,178 @@
+---
+phase: 13-real-claude-e2e-tests
+plan: 01
+type: execute
+wave: 1
+depends_on: []
+files_modified: [src/test/integration/real-claude.test.ts, src/agent/manager.ts]
+autonomous: true
+---
+
+
+Create integration tests that validate Claude CLI JSON schema behavior with real Claude calls.
+
+Purpose: Verify that the JSON schemas defined in src/agent/schema.ts work correctly with the actual Claude CLI, confirming MockAgentManager accurately simulates real behavior.
+Output: Integration test file with real Claude CLI tests (skipped by default due to cost/time), documented findings.
+
+
+
+@~/.claude/get-shit-done/workflows/execute-plan.md
+@~/.claude/get-shit-done/templates/summary.md
+
+
+
+@.planning/PROJECT.md
+@.planning/ROADMAP.md
+@.planning/STATE.md
+
+# Key source files
+@src/agent/manager.ts
+@src/agent/schema.ts
+@src/agent/prompts.ts
+@src/test/harness.ts
+
+
+
+
+
+ Task 1: Create real Claude CLI integration test file
+ src/test/integration/real-claude.test.ts
+
+Create integration test file for real Claude CLI validation. Structure:
+
+1. Create the `src/test/integration/` directory if it does not exist
+2. Create test file with `describe.skip` wrapper (tests are expensive, run manually)
+3. Add helper function to call Claude CLI directly using execa:
+ - Takes prompt and JSON schema
+ - Returns parsed structured_output from CLI response
+ - Handles timeout (30s default)
+
+4. Add test cases for each agent mode:
+ - Execute mode: done status with result
+ - Execute mode: questions status with array
+ - Discuss mode: context_complete with decisions
+ - Breakdown mode: breakdown_complete with phases
+ - Decompose mode: decompose_complete with tasks
+
+5. Each test should:
+ - Use minimal prompt that triggers expected output
+ - Verify structured_output field is populated
+ - Verify output matches Zod schema validation
+ - Log cost for documentation
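+
+The helper from step 3 could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the project's implementation: `Runner`, `buildClaudeArgs`, `parseStructuredOutput`, and `callClaude` are hypothetical names, and the `-p` and `--output-format json` arguments are assumptions about the invocation (this plan only confirms `--json-schema`). The subprocess runner is injected so the parsing logic can be checked without a real CLI call; execa's `execa(file, args, options)` call shape is expected to fit, possibly behind a thin wrapper.
+
+```typescript
+// Shape of the subprocess runner (assumed to be satisfiable by execa).
+type Runner = (
+  file: string,
+  args: string[],
+  options: { timeout: number },
+) => Promise<{ stdout: string }>;
+
+// Build the CLI arguments. `-p` and `--output-format json` are assumptions;
+// only `--json-schema` is confirmed by this plan.
+function buildClaudeArgs(prompt: string, schema: object): string[] {
+  return ['-p', prompt, '--output-format', 'json', '--json-schema', JSON.stringify(schema)];
+}
+
+// Pull `structured_output` out of the CLI's JSON response envelope.
+function parseStructuredOutput(stdout: string): unknown {
+  const envelope = JSON.parse(stdout) as { structured_output?: unknown };
+  if (envelope.structured_output === undefined) {
+    throw new Error('Claude CLI response has no structured_output field');
+  }
+  return envelope.structured_output;
+}
+
+// Hypothetical helper: run the CLI with a timeout (30s default) and
+// return the parsed structured output.
+async function callClaude(
+  run: Runner,
+  prompt: string,
+  schema: object,
+  timeoutMs = 30000,
+): Promise<unknown> {
+  const { stdout } = await run('claude', buildClaudeArgs(prompt, schema), { timeout: timeoutMs });
+  return parseStructuredOutput(stdout);
+}
+```
+
+In a test, the value returned by `callClaude` would then be passed to the matching Zod schema from `src/agent/schema.ts` for validation.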
+
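+Use `describe.skip` so the tests don't run in CI. Add a comment explaining how to run them manually (the `REAL_CLAUDE_TESTS` variable only has an effect if the suite checks it before skipping):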
+`REAL_CLAUDE_TESTS=1 npm test -- --grep "Real Claude"`
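+
+One way to make the suite honor that variable is sketched below. It assumes the test runner exposes a global `describe` with a `.skip` variant; `SuiteFn` and `pickDescribe` are hypothetical names introduced only for illustration.
+
+```typescript
+// Minimal shape of a `describe` global that also offers `.skip`.
+interface SuiteFn {
+  (name: string, body: () => void): void;
+  skip: (name: string, body: () => void) => void;
+}
+
+// Hypothetical helper: return `describe` when REAL_CLAUDE_TESTS is "1",
+// otherwise return `describe.skip` so the suite stays skipped by default.
+function pickDescribe(
+  describeFn: SuiteFn,
+  env: Record<string, string | undefined>,
+): (name: string, body: () => void) => void {
+  return env.REAL_CLAUDE_TESTS === '1' ? describeFn : describeFn.skip;
+}
+
+// Usage inside the test file (assumes a global `describe` and Node's `process.env`):
+// const describeReal = pickDescribe(describe, process.env);
+// describeReal('Real Claude CLI', () => { /* expensive tests */ });
+```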
+
+Key insight from validation: when invoked with `--json-schema`, the Claude CLI returns the structured data in the `structured_output` field, not in `result`.
+
+ File exists at src/test/integration/real-claude.test.ts with skipped test suite
+ Integration test file created with all mode tests, skipped by default
+
+
+
+ Task 2: Fix ClaudeAgentManager to parse structured_output
+ src/agent/manager.ts
+
+Update `handleAgentCompletion` to read from the `structured_output` field instead of parsing `result` as JSON.
+
+Current code (line ~190):
+```typescript
+const rawOutput = JSON.parse(cliResult.result);
+```
+
+The Claude CLI with --json-schema returns:
+```json
+{
+ "type": "result",
+ "result": "",
+ "structured_output": { "status": "done", "result": "..." }
+}
+```
+
+Update to:
+```typescript
+// When --json-schema is used, structured output is in structured_output field
+const rawOutput = cliResult.structured_output ?? JSON.parse(cliResult.result);
+```
+
+Also update ClaudeCliResult interface to include structured_output:
+```typescript
+interface ClaudeCliResult {
+ type: 'result';
+ subtype: 'success' | 'error';
+ is_error: boolean;
+ session_id: string;
+ result: string;
+ structured_output?: unknown; // Add this
+ total_cost_usd?: number;
+}
+```
+
+This is backward compatible: if `structured_output` is missing, the code falls back to parsing `result`.
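+
+The fallback can be illustrated in isolation with the sketch below. `CliResultLike` and `extractRawOutput` are hypothetical names, not code from `src/agent/manager.ts`; the explicit error is an addition meant to show what happens when neither field is usable, since `result` is an empty string in the sample response above and `JSON.parse("")` throws.
+
+```typescript
+// Subset of the ClaudeCliResult interface described in this plan.
+interface CliResultLike {
+  result: string;
+  structured_output?: unknown;
+}
+
+// Hypothetical helper mirroring the updated line in handleAgentCompletion.
+function extractRawOutput(cliResult: CliResultLike): unknown {
+  // Same semantics as `??`: fall through only when the field is null or undefined.
+  if (cliResult.structured_output != null) {
+    return cliResult.structured_output;
+  }
+  try {
+    // Fallback path: the JSON payload is embedded in `result` as a string.
+    return JSON.parse(cliResult.result);
+  } catch (err) {
+    // Reached when `result` is empty or otherwise not valid JSON.
+    throw new Error(`No structured_output and result is not valid JSON: ${String(err)}`);
+  }
+}
+```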
+
+ npm run build passes, existing tests still pass
+ ClaudeAgentManager correctly reads structured_output from Claude CLI response
+
+
+
+ Task 3: Run real Claude tests and document findings
+ src/test/integration/real-claude.test.ts
+
+Run the real Claude tests manually and document findings:
+
+1. Enable the tests temporarily by removing `.skip` or setting the `REAL_CLAUDE_TESTS` env var
+2. Run: `npm test -- src/test/integration/real-claude.test.ts`
+3. Capture results:
+ - Which tests pass/fail
+ - Response times
+ - Costs per test
+ - Any unexpected behavior
+
+4. Add findings as comments in test file:
+ ```typescript
+ /**
+ * Real Claude CLI Integration Tests
+ *
+ * Findings from validation run (DATE):
+ * - Execute mode: Works, ~$X.XX, ~Xs
+ * - Multi-question: Works, array format validated
+ * - Discuss mode: Works, decisions array validated
+ * - Breakdown mode: Works, phases array validated
+ * - Decompose mode: Works, tasks array validated
+ *
+ * Total validation cost: $X.XX
+ *
+ * Conclusion: MockAgentManager accurately simulates real CLI behavior.
+ * JSON schemas work correctly with Claude CLI --json-schema flag.
+ */
+ ```
+
+5. Re-add `.skip` to prevent accidental runs in CI
+
+ Tests run successfully when enabled, findings documented in file
+ Real Claude CLI behavior validated, findings documented, tests skipped for CI
+
+
+
+
+
+Before declaring plan complete:
+- [ ] src/test/integration/real-claude.test.ts exists with all mode tests
+- [ ] ClaudeAgentManager reads structured_output field
+- [ ] npm run build passes
+- [ ] npm test passes (integration tests skipped)
+- [ ] Manual run of real tests documents findings
+
+
+
+
+- Integration test file created with real Claude CLI tests
+- Tests are skipped by default (cost/time)
+- ClaudeAgentManager correctly parses structured_output
+- At least one real test run validates expected behavior
+- Findings documented in test file comments
+
+
+