# Goal-Backward Verification

Verification confirms that **goals are achieved**, not merely that **tasks were completed**. Completing the task "create chat component" does not guarantee that the goal "working chat interface" is met.

## Core Principle

**Task completion ≠ Goal achievement**

Tasks are implementation steps. Goals are user outcomes. Verification bridges the gap by checking observable outcomes, not just checklist items.

---

## Verification Levels

### Level 1: Existence Check
Does the artifact exist?

```
✓ File exists at expected path
✓ Component is exported
✓ Route is registered
```

### Level 2: Substance Check
Is the artifact substantive (not a stub)?

```
✓ Function has implementation (not just return null)
✓ Component renders content (not empty div)
✓ API returns meaningful response (not placeholder)
```

### Level 3: Wiring Check
Is the artifact connected to the system?

```
✓ Component is rendered somewhere
✓ API endpoint is called by client
✓ Event handler is attached
✓ Database query is executed
```

**All three levels must pass for verification success.**
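
As an illustration, the three levels can be treated as an ordered gate. The sketch below is a minimal example, not the project's verifier: the `ArtifactFacts` shape, the `verifyArtifact` function, and all field names are hypothetical, and it assumes the facts about an artifact have already been gathered elsewhere.

```typescript
// Hypothetical shape: facts a verifier has already gathered about one artifact.
interface ArtifactFacts {
  path: string;
  exists: boolean;   // Level 1: the file, export, or route is present
  isStub: boolean;   // Level 2: true if the body is only a placeholder
  usedBy: string[];  // Level 3: files that import, render, or call it
}

type Level = "existence" | "substance" | "wiring";

interface LevelResult {
  path: string;
  passed: boolean;
  failedAt?: Level; // first level that failed, if any
}

// Check the levels in order and stop at the first failure: a missing file
// cannot be substantive, and a stub is not worth tracing through the system.
function verifyArtifact(facts: ArtifactFacts): LevelResult {
  if (!facts.exists) {
    return { path: facts.path, passed: false, failedAt: "existence" };
  }
  if (facts.isStub) {
    return { path: facts.path, passed: false, failedAt: "substance" };
  }
  if (facts.usedBy.length === 0) {
    return { path: facts.path, passed: false, failedAt: "wiring" };
  }
  return { path: facts.path, passed: true };
}

// A real, implemented hook that nothing imports still fails verification.
const orphanHook = verifyArtifact({
  path: "src/hooks/useChat.ts",
  exists: true,
  isStub: false,
  usedBy: [],
});
console.log(orphanHook);
```

Here `orphanHook` has `passed: false` and `failedAt: "wiring"`, which is the case a task checklist alone would miss.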

---

## Must-Have Derivation

Before verification, derive what "done" means from the goal:

### 1. Observable Truths (3-7, from the user's perspective)
What can a user observe when the goal is achieved?

```yaml
observable_truths:
  - "User can click 'Send' and message appears in chat"
  - "Messages persist after page refresh"
  - "New messages appear without page reload"
  - "User sees typing indicator when other party types"
```

### 2. Required Artifacts
What files MUST exist?

```yaml
required_artifacts:
  - path: src/components/Chat.tsx
    check: "Exports Chat component"
  - path: src/api/messages.ts
    check: "Exports sendMessage, getMessages"
  - path: src/hooks/useChat.ts
    check: "Exports useChat hook"
```

### 3. Required Wiring
What connections MUST work?

```yaml
required_wiring:
  - from: Chat.tsx
    to: useChat.ts
    check: "Component calls hook"
  - from: useChat.ts
    to: messages.ts
    check: "Hook calls API"
  - from: messages.ts
    to: database
    check: "API persists to DB"
```

### 4. Key Links (Where Stubs Hide)
What integration points commonly fail?

```yaml
key_links:
  - "Form onSubmit → API call (not just console.log)"
  - "WebSocket connection → message handler"
  - "API response → state update → render"
```
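
The four lists above can be given a typed shape and sanity-checked before verification starts. This is a sketch under assumptions: the `MustHaves` interface mirrors the YAML keys shown above, while `validateMustHaves`, `baseName`, and the rule that wiring sources are matched by file name are hypothetical choices made for illustration.

```typescript
interface RequiredArtifact { path: string; check: string; }
interface RequiredWiring { from: string; to: string; check: string; }

// Mirrors the YAML keys used in the examples above.
interface MustHaves {
  observable_truths: string[];
  required_artifacts: RequiredArtifact[];
  required_wiring: RequiredWiring[];
  key_links: string[];
}

// Last path segment, e.g. "src/api/messages.ts" -> "messages.ts".
function baseName(path: string): string {
  const i = path.lastIndexOf("/");
  return i === -1 ? path : path.slice(i + 1);
}

// Returns a list of problems; an empty list means the must-haves look usable.
function validateMustHaves(m: MustHaves): string[] {
  const problems: string[] = [];
  const n = m.observable_truths.length;
  if (n < 3 || n > 7) {
    problems.push(`expected 3-7 observable truths, got ${n}`);
  }
  if (m.required_artifacts.length === 0) {
    problems.push("no required artifacts listed");
  }
  // Assumption: every wiring source should be a listed artifact. Targets are
  // not checked because they may be external (for example "database").
  const names = m.required_artifacts.map((a) => baseName(a.path));
  for (const w of m.required_wiring) {
    if (names.indexOf(w.from) === -1) {
      problems.push(`wiring source is not a required artifact: ${w.from}`);
    }
  }
  return problems;
}
```

Running the check on the chat example above (four truths, three artifacts, three wiring entries) returns an empty list.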

---

## Verification Process

### Phase Verification

After all tasks in a phase complete:

```
1. Load must-haves (from phase goal or PLAN frontmatter)
2. For each observable truth:
   a. Level 1: Does the relevant code exist?
   b. Level 2: Is it substantive?
   c. Level 3: Is it wired?
3. For each required artifact:
   a. Verify file exists
   b. Verify not a stub
   c. Verify it's imported/used
4. For each key link:
   a. Trace the connection
   b. Verify data flows
5. Scan for anti-patterns (see below)
6. Record gaps in structured form for re-planning
```
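
A rough sketch of steps 3, 5, and 6 follows. It is not the verifier agent's actual implementation: the `Probe` interface is a hypothetical stand-in for however the codebase is inspected, the `MISSING` gap type is an assumption added for files that do not exist, and steps 2 and 4 (observable truths and key links) are left out for brevity.

```typescript
interface Gap {
  type: "MISSING" | "STUB" | "MISSING_WIRING" | "ANTI_PATTERN";
  location: string;
  description: string;
  severity: "BLOCKING" | "HIGH";
}

interface Report {
  status: "PASS" | "GAPS_FOUND";
  gaps: Gap[];
}

// Hypothetical: code inspection is injected so the sketch does not depend on
// a particular file system API or parser.
interface Probe {
  exists(path: string): boolean;
  isStub(path: string): boolean;
  isUsed(path: string): boolean;
  antiPatterns(path: string): string[];
}

function verifyPhase(artifactPaths: string[], probe: Probe): Report {
  const gaps: Gap[] = [];
  for (const path of artifactPaths) {
    // Step 3a: the file must exist; nothing else can be checked if it does not.
    if (!probe.exists(path)) {
      gaps.push({ type: "MISSING", location: path, description: "File does not exist", severity: "BLOCKING" });
      continue;
    }
    if (probe.isStub(path)) {
      // Step 3b: the file must not be a stub.
      gaps.push({ type: "STUB", location: path, description: "Placeholder implementation", severity: "BLOCKING" });
    } else if (!probe.isUsed(path)) {
      // Step 3c: the file must be imported or used somewhere.
      gaps.push({ type: "MISSING_WIRING", location: path, description: "Not imported or used", severity: "BLOCKING" });
    }
    // Step 5: anti-patterns are reported even when the three levels pass.
    for (const found of probe.antiPatterns(path)) {
      gaps.push({ type: "ANTI_PATTERN", location: path, description: found, severity: "HIGH" });
    }
  }
  // Step 6: the structured gap list is what re-planning consumes.
  return { status: gaps.length === 0 ? "PASS" : "GAPS_FOUND", gaps };
}
```

Injecting the probe keeps the control flow testable with fakes; a real probe would wrap file reads, a parser, and an import graph.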

### Anti-Pattern Scanning

Check for common incomplete work:

| Pattern | Detection | Meaning |
|---------|-----------|---------|
| `// TODO` | Grep for TODO comments | Work deferred |
| `throw new Error('Not implemented')` | Grep for stub errors | Placeholder code |
| `return null` / `return {}` | AST analysis | Empty implementations |
| `console.log` in handlers | Grep for console.log | Debug code left behind |
| Empty catch blocks | AST analysis | Swallowed errors |
| Hardcoded values | Manual review | Missing configuration |
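
The grep-style rows of the table can be approximated with a line-by-line regex scan. The sketch below is illustrative only: `scanAntiPatterns`, `Finding`, and the regular expressions are assumptions, and the rows that call for AST analysis or manual review are deliberately not covered.

```typescript
interface Finding {
  line: number;    // 1-based line number in the scanned source
  pattern: string; // which rule matched
  meaning: string; // the "Meaning" column from the table
}

interface Rule {
  pattern: string;
  regex: RegExp;
  meaning: string;
}

// Illustrative regexes for the grep-detectable rows. None use the `g` flag,
// so RegExp.prototype.test keeps no state between calls.
const RULES: Rule[] = [
  { pattern: "TODO comment", regex: /\/\/\s*TODO\b/, meaning: "Work deferred" },
  { pattern: "stub error", regex: /throw new Error\((['"])Not implemented\1\)/, meaning: "Placeholder code" },
  { pattern: "console.log", regex: /\bconsole\.log\(/, meaning: "Debug code left behind" },
];

function scanAntiPatterns(source: string): Finding[] {
  const findings: Finding[] = [];
  const lines = source.split("\n");
  lines.forEach((text, index) => {
    for (const rule of RULES) {
      if (rule.regex.test(text)) {
        findings.push({ line: index + 1, pattern: rule.pattern, meaning: rule.meaning });
      }
    }
  });
  return findings;
}

const sample = [
  "export function sendMessage(text: string) {",
  "  // TODO: call the API",
  "  console.log(text);",
  "  throw new Error('Not implemented');",
  "}",
].join("\n");

console.log(scanAntiPatterns(sample));
```

For `sample`, the scan reports one finding each on lines 2, 3, and 4. A regex scan cannot tell whether a `console.log` sits inside a handler, so treat hits as candidates for review rather than confirmed gaps.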

---

## Verification Output

### Pass Case

```yaml
# 2-VERIFICATION.md
phase: 2
status: PASS
verified_at: 2024-01-15T10:30:00Z

observable_truths:
  - truth: "User can send message"
    status: VERIFIED
    evidence: "Chat.tsx:45 calls sendMessage on submit"
  - truth: "Messages persist"
    status: VERIFIED
    evidence: "messages.ts:23 inserts to SQLite"

required_artifacts:
  - path: src/components/Chat.tsx
    status: EXISTS
    check: PASSED
  - path: src/api/messages.ts
    status: EXISTS
    check: PASSED

anti_patterns_found: []

human_verification_needed:
  - "Visual layout matches design"
  - "Real-time updates work under load"
```

### Fail Case (Gaps Found)

```yaml
# 2-VERIFICATION.md
phase: 2
status: GAPS_FOUND
verified_at: 2024-01-15T10:30:00Z

gaps:
  - type: STUB
    location: src/hooks/useChat.ts:34
    description: "sendMessage returns immediately without API call"
    severity: BLOCKING

  - type: MISSING_WIRING
    location: src/components/Chat.tsx
    description: "WebSocket not connected, no real-time updates"
    severity: BLOCKING

  - type: ANTI_PATTERN
    location: src/api/messages.ts:67
    description: "Empty catch block swallows errors"
    severity: HIGH

remediation_plan:
  - "Connect useChat to actual API endpoint"
  - "Initialize WebSocket in Chat component"
  - "Add error handling to API calls"
```

---

## User Acceptance Testing (UAT)

Verification confirms that the code is correct and wired together. UAT confirms that the result works for the user.

### UAT Process

1. Extract testable deliverables from phase goal
2. Walk user through each one:
   - "Can you log in with your email?"
   - "Does the dashboard show your projects?"
   - "Can you create a new project?"
3. Record each result: PASS, FAIL (with a description of the issue), or BLOCKED (when the case cannot be tested)
4. If issues found:
   - Diagnose root cause
   - Create targeted fix plan
5. If all pass: Phase complete

### UAT Output

```yaml
# 2-UAT.md
phase: 2
tested_by: user
tested_at: 2024-01-15T14:00:00Z

test_cases:
  - case: "Login with email"
    result: PASS

  - case: "Dashboard shows projects"
    result: FAIL
    issue: "Shows loading spinner forever"
    diagnosis: "API returns 500, missing auth header"

  - case: "Create new project"
    result: BLOCKED
    reason: "Cannot test, dashboard not loading"

fix_required: true
fix_plan:
  - task: "Add auth header to dashboard API call"
    files: [src/api/projects.ts]
    priority: P0
```
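
The `fix_required` flag and the "all pass" rule can be derived mechanically from the recorded cases. A minimal sketch, assuming the three result values shown above; `summarizeUat` and `UatSummary` are hypothetical names, and treating BLOCKED cases as "not yet complete" is an assumption.

```typescript
type UatResult = "PASS" | "FAIL" | "BLOCKED";

interface UatCase {
  case: string;
  result: UatResult;
  issue?: string;
}

interface UatSummary {
  fix_required: boolean;   // at least one case failed
  phase_complete: boolean; // every case passed
  failed: string[];
  blocked: string[];
}

function summarizeUat(cases: UatCase[]): UatSummary {
  const failed = cases.filter((c) => c.result === "FAIL").map((c) => c.case);
  const blocked = cases.filter((c) => c.result === "BLOCKED").map((c) => c.case);
  // Assumption: a phase with blocked or zero test cases is not complete,
  // because nothing has confirmed that those deliverables work.
  const allPass = cases.length > 0 && failed.length === 0 && blocked.length === 0;
  return {
    fix_required: failed.length > 0,
    phase_complete: allPass,
    failed,
    blocked,
  };
}

const summary = summarizeUat([
  { case: "Login with email", result: "PASS" },
  { case: "Dashboard shows projects", result: "FAIL", issue: "Shows loading spinner forever" },
  { case: "Create new project", result: "BLOCKED" },
]);
console.log(summary);
```

For the cases above, `summary.fix_required` is `true` and `summary.phase_complete` is `false`; the blocked case has to be retested after the fix lands.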

---

## Integration with Task Workflow

### Task Completion Hook
When a task closes:
1. The worker marks the task closed with a reason
2. If all phase tasks are closed, phase verification is triggered
3. The verifier agent runs the goal-backward check
4. If PASS: the phase is marked complete
5. If GAPS_FOUND: remediation tasks are created and the phase stays in_progress
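
The hook above can be sketched as a pure state transition. Everything here is hypothetical scaffolding for illustration: the `Task` and `Phase` shapes, the `onTaskClosed` function, the remediation task id format, and the injected `verify` callback that stands in for the verifier agent.

```typescript
interface Task {
  id: string;
  status: "open" | "closed";
}

interface Phase {
  id: string;
  status: "in_progress" | "complete";
  tasks: Task[];
}

interface VerificationResult {
  status: "PASS" | "GAPS_FOUND";
  remediation: string[]; // one entry per fix that needs its own task
}

function onTaskClosed(
  phase: Phase,
  taskId: string,
  verify: (p: Phase) => VerificationResult,
): Phase {
  // Step 1: mark the task closed (the close reason is omitted in this sketch).
  const tasks: Task[] = phase.tasks.map((t): Task =>
    t.id === taskId ? { id: t.id, status: "closed" } : t,
  );
  // Step 2: verification only triggers once every task in the phase is closed.
  if (!tasks.every((t) => t.status === "closed")) {
    return { id: phase.id, status: "in_progress", tasks };
  }
  // Step 3: run the goal-backward check.
  const result = verify({ id: phase.id, status: phase.status, tasks });
  // Step 4: a passing phase is complete.
  if (result.status === "PASS") {
    return { id: phase.id, status: "complete", tasks };
  }
  // Step 5: gaps become new open tasks and the phase stays in progress.
  const fixes: Task[] = result.remediation.map((_, i): Task => ({
    id: `${phase.id}-fix-${i + 1}`,
    status: "open",
  }));
  return { id: phase.id, status: "in_progress", tasks: tasks.concat(fixes) };
}
```

Because remediation tasks are ordinary open tasks, closing the last of them triggers verification again, so the loop repeats until the check passes.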

### Verification Task Type
Verification itself is a task:

```yaml
type: verification
phase_id: phase-2
status: open
assigned_to: verifier-agent
priority: P0  # Always high priority
```

---

## Checkpoint Types

During execution, agents may need human input. Use precise checkpoint types:

### checkpoint:human-verify (90% of checkpoints)
The agent has completed the work; the user confirms that it works.

```yaml
checkpoint: human-verify
prompt: "Can you log in with email and password?"
expected: "User confirms successful login"
```

### checkpoint:decision (9% of checkpoints)
The user must make an implementation choice.

```yaml
checkpoint: decision
prompt: "OAuth2 or SAML for SSO?"
options:
  - OAuth2: "Simpler, most common"
  - SAML: "Enterprise requirement"
```

### checkpoint:human-action (1% of checkpoints)
Truly unavoidable manual step.

```yaml
checkpoint: human-action
prompt: "Click the email verification link"
reason: "Cannot automate email client interaction"
```
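
One way to keep the three checkpoint types precise in code is a discriminated union keyed on the `checkpoint` field. This is a sketch, not the project's schema: the `Checkpoint` type and `describeCheckpoint` function are hypothetical, and the `options` entries are normalized to `name`/`note` pairs rather than the single-key maps used in the YAML above.

```typescript
type Checkpoint =
  | { checkpoint: "human-verify"; prompt: string; expected: string }
  | { checkpoint: "decision"; prompt: string; options: { name: string; note: string }[] }
  | { checkpoint: "human-action"; prompt: string; reason: string };

// The switch narrows `c` by its `checkpoint` tag, so each branch can only
// read the fields that belong to that checkpoint type.
function describeCheckpoint(c: Checkpoint): string {
  switch (c.checkpoint) {
    case "human-verify":
      return `VERIFY: ${c.prompt} (expected: ${c.expected})`;
    case "decision":
      return `DECIDE: ${c.prompt} [${c.options.map((o) => o.name).join(" | ")}]`;
    case "human-action":
      return `ACTION: ${c.prompt} (reason: ${c.reason})`;
  }
}

const sso: Checkpoint = {
  checkpoint: "decision",
  prompt: "OAuth2 or SAML for SSO?",
  options: [
    { name: "OAuth2", note: "Simpler, most common" },
    { name: "SAML", note: "Enterprise requirement" },
  ],
};
console.log(describeCheckpoint(sso));
```

With this shape, a `decision` checkpoint without `options`, or a `human-action` checkpoint without a `reason`, fails type checking instead of reaching the user half-specified.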

---

## Human Verification Needs

Some verifications require human eyes:

| Category | Examples | Why Human |
|----------|----------|-----------|
| Visual | Layout, spacing, colors | Subjective/design judgment |
| Real-time | WebSocket, live updates | Requires interaction |
| External | OAuth flow, payment | Third-party systems |
| Accessibility | Screen reader, keyboard nav | Requires tooling/expertise |

**Mark these explicitly** in the verification output, for example under `human_verification_needed`. Don't claim PASS for an item whose human verification is still pending.