Goal-Backward Verification
Verification confirms that goals are achieved, not merely that tasks were completed. A completed task "create chat component" does not guarantee the goal "working chat interface" is met.
Core Principle
Task completion ≠ Goal achievement
Tasks are implementation steps. Goals are user outcomes. Verification bridges the gap by checking observable outcomes, not just checklist items.
Verification Levels
Level 1: Existence Check
Does the artifact exist?
✓ File exists at expected path
✓ Component is exported
✓ Route is registered
Level 2: Substance Check
Is the artifact substantive (not a stub)?
✓ Function has implementation (not just return null)
✓ Component renders content (not empty div)
✓ API returns meaningful response (not placeholder)
Level 3: Wiring Check
Is the artifact connected to the system?
✓ Component is rendered somewhere
✓ API endpoint is called by client
✓ Event handler is attached
✓ Database query is executed
All three levels must pass for verification success.
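A minimal sketch of the three levels applied to one artifact, assuming a TypeScript source tree. The names `check_artifact`, `passes`, and `STUB_PATTERNS` are hypothetical, and the regex heuristics are a stand-in for real analysis (for example, a component may legitimately return null).

```python
import re
from pathlib import Path

# Hypothetical stub heuristics; a real verifier would likely use AST analysis.
STUB_PATTERNS = [
    re.compile(r"return\s+null\s*;?\s*$", re.MULTILINE),
    re.compile(r"throw new Error\(['\"]Not implemented['\"]\)"),
]


def check_artifact(root: Path, rel_path: str, export_name: str) -> dict:
    """Run the existence, substance, and wiring checks for one artifact."""
    target = root / rel_path
    result = {"exists": False, "substantive": False, "wired": False}

    # Level 1: the file exists and exports the expected name.
    if not target.is_file():
        return result
    source = target.read_text(encoding="utf-8")
    exported = re.search(rf"export\b.*\b{re.escape(export_name)}\b", source)
    result["exists"] = exported is not None
    if not result["exists"]:
        return result

    # Level 2: no stub marker appears anywhere in the file.
    result["substantive"] = not any(p.search(source) for p in STUB_PATTERNS)

    # Level 3: at least one other .ts/.tsx file mentions the exported name.
    for other in root.rglob("*.ts*"):
        if other == target or not other.is_file():
            continue
        text = other.read_text(encoding="utf-8")
        if re.search(rf"\b{re.escape(export_name)}\b", text):
            result["wired"] = True
            break
    return result


def passes(result: dict) -> bool:
    """Verification succeeds only when all three levels pass."""
    return all(result.values())
```

A name match in another file is only weak evidence of wiring; tracing the actual call is still needed for the key links.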
Must-Have Derivation
Before verification, derive what "done" means from the goal:
1. Observable Truths (3-7 user perspectives)
What can a user observe when the goal is achieved?
observable_truths:
- "User can click 'Send' and message appears in chat"
- "Messages persist after page refresh"
- "New messages appear without page reload"
- "User sees typing indicator when other party types"
2. Required Artifacts
What files MUST exist?
required_artifacts:
  - path: src/components/Chat.tsx
    check: "Exports Chat component"
  - path: src/api/messages.ts
    check: "Exports sendMessage, getMessages"
  - path: src/hooks/useChat.ts
    check: "Exports useChat hook"
3. Required Wiring
What connections MUST work?
required_wiring:
  - from: Chat.tsx
    to: useChat.ts
    check: "Component calls hook"
  - from: useChat.ts
    to: messages.ts
    check: "Hook calls API"
  - from: messages.ts
    to: database
    check: "API persists to DB"
4. Key Links (Where Stubs Hide)
What integration points commonly fail?
key_links:
- "Form onSubmit → API call (not just console.log)"
- "WebSocket connection → message handler"
- "API response → state update → render"
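Once derived, the four must-have lists can be sanity-checked before verification starts. The sketch below assumes they have already been parsed into Python lists and dicts; `MustHaves` and `validate_must_haves` are hypothetical names, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class MustHaves:
    observable_truths: list[str]
    required_artifacts: list[dict]
    required_wiring: list[dict]
    key_links: list[str]


def validate_must_haves(m: MustHaves) -> list[str]:
    """Return a list of problems; an empty list means the must-haves are usable."""
    problems = []
    # The derivation step asks for 3-7 observable truths.
    if not 3 <= len(m.observable_truths) <= 7:
        problems.append(
            f"expected 3-7 observable truths, got {len(m.observable_truths)}"
        )
    # Every artifact needs a path and a check description.
    for i, artifact in enumerate(m.required_artifacts):
        missing = {"path", "check"} - set(artifact)
        if missing:
            problems.append(f"required_artifacts[{i}] missing {sorted(missing)}")
    # Every wiring entry needs both endpoints and a check description.
    for i, wire in enumerate(m.required_wiring):
        missing = {"from", "to", "check"} - set(wire)
        if missing:
            problems.append(f"required_wiring[{i}] missing {sorted(missing)}")
    return problems
```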
Verification Process
Phase Verification
After all tasks in a phase complete:
1. Load must-haves (from phase goal or PLAN frontmatter)
2. For each observable truth:
   a. Level 1: Does the relevant code exist?
   b. Level 2: Is it substantive?
   c. Level 3: Is it wired?
3. For each required artifact:
   a. Verify file exists
   b. Verify not a stub
   c. Verify it's imported/used
4. For each key link:
   a. Trace the connection
   b. Verify data flows
5. Scan for anti-patterns (see below)
6. Structure gaps for re-planning
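The last step, structuring gaps for re-planning, might look like the sketch below. It assumes the per-artifact checks have already run and produced three booleans each. `ArtifactCheck`, `Gap`, `collect_gaps`, and `build_report` are hypothetical names, and the `MISSING` gap type and the severities assigned here are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ArtifactCheck:
    path: str
    exists: bool
    substantive: bool
    wired: bool


@dataclass
class Gap:
    type: str         # e.g. "STUB", "MISSING_WIRING", "ANTI_PATTERN"
    location: str
    description: str
    severity: str     # e.g. "BLOCKING", "HIGH"


def collect_gaps(checks: list[ArtifactCheck]) -> list[Gap]:
    """Report the first failing level for each artifact."""
    gaps = []
    for c in checks:
        if not c.exists:
            # "MISSING" is an assumed type for a failed existence check.
            gaps.append(Gap("MISSING", c.path, "Required artifact not found", "BLOCKING"))
        elif not c.substantive:
            gaps.append(Gap("STUB", c.path, "Artifact is a stub", "BLOCKING"))
        elif not c.wired:
            gaps.append(Gap("MISSING_WIRING", c.path, "Artifact is not used", "BLOCKING"))
    return gaps


def build_report(phase: int, checks: list[ArtifactCheck], anti_patterns: list[Gap]) -> dict:
    """Combine artifact gaps and anti-pattern findings into one report."""
    gaps = collect_gaps(checks) + anti_patterns
    return {
        "phase": phase,
        "status": "GAPS_FOUND" if gaps else "PASS",
        "gaps": gaps,
    }
```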
Anti-Pattern Scanning
Check for common incomplete work:
| Pattern | Detection | Meaning |
|---|---|---|
| `// TODO` | Grep for TODO comments | Work deferred |
| `throw new Error('Not implemented')` | Grep for stub errors | Placeholder code |
| `return null` / `return {}` | AST analysis | Empty implementations |
| `console.log` in handlers | Grep for console.log | Debug code left behind |
| Empty catch blocks | AST analysis | Swallowed errors |
| Hardcoded values | Manual review | Missing configuration |
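The grep-style rows can be approximated with a small line scanner. This is a sketch under simplifying assumptions: `ANTI_PATTERNS` and `scan_source` are hypothetical names, the empty-catch regex only catches single-line cases, and the rows that call for AST analysis or manual review are not reliably covered by regexes.

```python
import re

# Regex approximations of some rows in the table. These are assumptions for
# illustration; multi-line constructs and `return null` need AST analysis.
ANTI_PATTERNS = {
    "TODO": re.compile(r"//\s*TODO"),
    "NOT_IMPLEMENTED": re.compile(
        r"throw new Error\(\s*['\"]Not implemented['\"]\s*\)"
    ),
    "CONSOLE_LOG": re.compile(r"\bconsole\.log\("),
    "EMPTY_CATCH": re.compile(r"catch\s*(\([^)]*\))?\s*\{\s*\}"),
}


def scan_source(path: str, source: str) -> list[dict]:
    """Return one finding per (line, pattern) match, with a path:line location."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in ANTI_PATTERNS.items():
            if pattern.search(line):
                findings.append({"pattern": name, "location": f"{path}:{lineno}"})
    return findings
```

Each finding carries a `path:line` location, matching the `location` format used in the verification output.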
Verification Output
Pass Case
# 2-VERIFICATION.md
phase: 2
status: PASS
verified_at: 2024-01-15T10:30:00Z
observable_truths:
  - truth: "User can send message"
    status: VERIFIED
    evidence: "Chat.tsx:45 calls sendMessage on submit"
  - truth: "Messages persist"
    status: VERIFIED
    evidence: "messages.ts:23 inserts to SQLite"
required_artifacts:
  - path: src/components/Chat.tsx
    status: EXISTS
    check: PASSED
  - path: src/api/messages.ts
    status: EXISTS
    check: PASSED
anti_patterns_found: []
human_verification_needed:
  - "Visual layout matches design"
  - "Real-time updates work under load"
Fail Case (Gaps Found)
# 2-VERIFICATION.md
phase: 2
status: GAPS_FOUND
verified_at: 2024-01-15T10:30:00Z
gaps:
  - type: STUB
    location: src/hooks/useChat.ts:34
    description: "sendMessage returns immediately without API call"
    severity: BLOCKING
  - type: MISSING_WIRING
    location: src/components/Chat.tsx
    description: "WebSocket not connected, no real-time updates"
    severity: BLOCKING
  - type: ANTI_PATTERN
    location: src/api/messages.ts:67
    description: "Empty catch block swallows errors"
    severity: HIGH
remediation_plan:
  - "Connect useChat to actual API endpoint"
  - "Initialize WebSocket in Chat component"
  - "Add error handling to API calls"
User Acceptance Testing (UAT)
Verification confirms that the code is correct and connected. UAT confirms that the user's experience matches the goal.
UAT Process
- Extract testable deliverables from phase goal
- Walk user through each one:
  - "Can you log in with your email?"
  - "Does the dashboard show your projects?"
  - "Can you create a new project?"
- Record result: PASS, FAIL (with a description of the issue), or BLOCKED
- If issues found:
  - Diagnose root cause
  - Create targeted fix plan
- If all pass: Phase complete
UAT Output
# 2-UAT.md
phase: 2
tested_by: user
tested_at: 2024-01-15T14:00:00Z
test_cases:
  - case: "Login with email"
    result: PASS
  - case: "Dashboard shows projects"
    result: FAIL
    issue: "Shows loading spinner forever"
    diagnosis: "API returns 500, missing auth header"
  - case: "Create new project"
    result: BLOCKED
    reason: "Cannot test, dashboard not loading"
fix_required: true
fix_plan:
  - task: "Add auth header to dashboard API call"
    files: [src/api/projects.ts]
    priority: P0
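Deriving the summary fields from recorded test cases could be sketched as follows. `summarize_uat` is a hypothetical helper, and treating any FAIL as `fix_required` is an assumption; the phase counts as complete only when every case passes.

```python
def summarize_uat(test_cases: list[dict]) -> dict:
    """Summarize UAT results recorded as {"case": ..., "result": ...} dicts."""
    failed = [c["case"] for c in test_cases if c["result"] == "FAIL"]
    blocked = [c["case"] for c in test_cases if c["result"] == "BLOCKED"]
    # The phase is complete only if there is at least one case and all pass.
    all_pass = bool(test_cases) and all(c["result"] == "PASS" for c in test_cases)
    return {
        "fix_required": bool(failed),  # assumption: any FAIL needs a fix plan
        "phase_complete": all_pass,
        "failed": failed,
        "blocked": blocked,
    }
```

Blocked cases are reported separately so they can be retested once the failure that blocks them is fixed.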
Integration with Task Workflow
Task Completion Hook
When a task closes:
- Worker marks task closed with reason
- If all phase tasks closed, trigger phase verification
- Verifier agent runs goal-backward check
  - If PASS: Phase marked complete
  - If GAPS_FOUND: Create remediation tasks, phase stays in_progress
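The hook logic above can be sketched as two small functions. The names `on_task_closed` and `apply_verification_result`, the action strings, and the task dict shape are all hypothetical; a real workflow engine would supply its own.

```python
def on_task_closed(phase_tasks: list[dict]) -> str:
    """Decide what to do after a task closes, given all tasks in the phase."""
    if phase_tasks and all(t["status"] == "closed" for t in phase_tasks):
        return "trigger_verification"
    return "wait"


def apply_verification_result(status: str, remediation_plan: list[str]) -> dict:
    """Map a verification status to a phase state and any new tasks."""
    if status == "PASS":
        return {"phase_status": "complete", "new_tasks": []}
    # Gaps found: the phase stays open and each plan item becomes a task.
    new_tasks = [
        {"type": "remediation", "description": item, "status": "open"}
        for item in remediation_plan
    ]
    return {"phase_status": "in_progress", "new_tasks": new_tasks}
```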
Verification Task Type
Verification itself is a task:
type: verification
phase_id: phase-2
status: open
assigned_to: verifier-agent
priority: P0 # Always high priority
Checkpoint Types
During execution, agents may need human input. Use precise checkpoint types:
checkpoint:human-verify (90% of checkpoints)
Agent completed work, user confirms it works.
checkpoint: human-verify
prompt: "Can you log in with email and password?"
expected: "User confirms successful login"
checkpoint:decision (9% of checkpoints)
User must make implementation choice.
checkpoint: decision
prompt: "OAuth2 or SAML for SSO?"
options:
- OAuth2: "Simpler, most common"
- SAML: "Enterprise requirement"
checkpoint:human-action (1% of checkpoints)
Truly unavoidable manual step.
checkpoint: human-action
prompt: "Click the email verification link"
reason: "Cannot automate email client interaction"
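Because each checkpoint type carries different fields, a checkpoint can be validated before it is shown to the user. The sketch below assumes checkpoints are parsed into dicts shaped like the examples above; `REQUIRED_FIELDS` and `validate_checkpoint` are hypothetical names.

```python
# Fields each checkpoint type needs, taken from the three examples above.
REQUIRED_FIELDS = {
    "human-verify": {"prompt", "expected"},
    "decision": {"prompt", "options"},
    "human-action": {"prompt", "reason"},
}


def validate_checkpoint(cp: dict) -> list[str]:
    """Return a list of problems; an empty list means the checkpoint is valid."""
    kind = cp.get("checkpoint")
    if kind not in REQUIRED_FIELDS:
        return [f"unknown checkpoint type: {kind!r}"]
    missing = sorted(REQUIRED_FIELDS[kind] - set(cp))
    return [f"missing field: {name}" for name in missing]
```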
Human Verification Needs
Some verifications require human eyes:
| Category | Examples | Why Human |
|---|---|---|
| Visual | Layout, spacing, colors | Subjective/design judgment |
| Real-time | WebSocket, live updates | Requires interaction |
| External | OAuth flow, payment | Third-party systems |
| Accessibility | Screen reader, keyboard nav | Requires tooling/expertise |
Mark these explicitly in verification output. Don't claim PASS when human verification is pending.