refactor: Rewrite execute prompt with TDD protocol, test integrity rules, and definition-of-done checklist
Replace the weak 7-step execution protocol with an explicit red-green-refactor cycle that requires agents to write failing tests before implementing. Move anti-patterns and scope rules above deviation/git sections so critical constraints get more attention. Add session startup verification, progress tracking, and a mandatory definition-of-done checklist that must pass before signaling completion. Remove dead CODEBASE_VERIFICATION import.
This commit is contained in:
@@ -3,12 +3,14 @@
|
||||
*/
|
||||
|
||||
import {
|
||||
CODEBASE_VERIFICATION,
|
||||
CONTEXT_MANAGEMENT,
|
||||
DEVIATION_RULES,
|
||||
GIT_WORKFLOW,
|
||||
INPUT_FILES,
|
||||
PROGRESS_TRACKING,
|
||||
SESSION_STARTUP,
|
||||
SIGNAL_FORMAT,
|
||||
TEST_INTEGRITY,
|
||||
} from './shared.js';
|
||||
|
||||
export function buildExecutePrompt(taskDescription?: string): string {
|
||||
@@ -23,19 +25,41 @@ Execute the assigned coding task. Read inputs, implement the work, test it, comm
|
||||
${taskSection}
|
||||
${INPUT_FILES}
|
||||
${SIGNAL_FORMAT}
|
||||
${SESSION_STARTUP}
|
||||
|
||||
## Execution Protocol
|
||||
|
||||
Follow these steps in order. Signal done only after committing.
|
||||
Follow these steps in order. Do not skip steps. Signal done only after the final checklist passes.
|
||||
|
||||
1. **Read inputs**: Read \`.cw/input/manifest.json\`, then all listed input files. Understand the full task before touching any code.
|
||||
2. **Orient in the codebase**: Read the files you'll modify. Check imports, types, and patterns used nearby. Run \`git log --oneline -10\` to see recent changes. Read multiple files in parallel when they're independent.
|
||||
3. **Write or update tests**: Before implementing, write or update tests that define the expected behavior. This anchors your implementation and catches regressions early.
|
||||
4. **Implement**: Write the code to make the tests pass. Follow existing patterns. Keep changes focused on exactly what the task requires. Prioritize execution over deliberation — choose one approach and start producing output rather than comparing alternatives.
|
||||
5. **Verify**: Run the full relevant test suite. All tests must pass before committing.
|
||||
6. **Commit**: Stage and commit your changes with a descriptive message.
|
||||
7. **Signal**: Write \`.cw/output/signal.json\` with status "done".
|
||||
${CODEBASE_VERIFICATION}
|
||||
1. **Startup**: Verify your environment per the Session Startup section. If baseline tests fail, signal error immediately.
|
||||
|
||||
2. **Read & orient**: Read \`.cw/input/manifest.json\`, then all listed input files. Understand the full task before touching any code. Read the files you'll modify. Check imports, types, and patterns used nearby. Run \`git log --oneline -10\` to see recent changes. Read multiple files in parallel when they're independent.
|
||||
|
||||
3. **Write failing tests (RED)**: Before writing any implementation, write tests that define the expected behavior. Then run them. They MUST fail. If they pass before you've implemented anything, your tests are wrong — they're testing the existing state, not the new behavior. Rewrite them until you have genuinely failing tests that will only pass when the task is correctly implemented.
|
||||
|
||||
4. **Implement (GREEN)**: Write the minimum code to make the tests pass. Follow existing patterns. Keep changes focused on exactly what the task requires. Prioritize execution over deliberation — choose one approach and start producing output rather than comparing alternatives.
|
||||
|
||||
5. **Verify green**: Run the full relevant test suite — not just your new tests. ALL tests must pass. If a pre-existing test fails, your code broke something. Fix your code, not the test, unless the task explicitly changes the expected behavior.
|
||||
|
||||
6. **Commit**: Stage specific files and commit with a descriptive message. Update progress file.
|
||||
|
||||
7. **Iterate**: If the task has multiple parts, repeat steps 3-6 for each part. Each cycle should produce a commit.
|
||||
|
||||
Not all tasks require tests (e.g., config changes, documentation). If the task genuinely has no testable behavior, skip steps 3 and 5, but state explicitly in your progress file why tests were not applicable.
|
||||
${TEST_INTEGRITY}
|
||||
|
||||
## Anti-Patterns
|
||||
|
||||
IMPORTANT: These are patterns that waste time, ship bugs, or break other agents' work. Avoid them.
|
||||
|
||||
- **Placeholder code**: No \`// TODO: implement later\`, no \`throw new Error('not implemented')\`. Either implement it or signal that you can't.
|
||||
- **Blind imports**: Don't import modules you haven't verified exist. Read the file or check package.json first.
|
||||
- **Ignoring test failures**: If tests fail after your changes, fix them or signal an error. Never commit broken tests.
|
||||
- **Mega-commits**: Commit after each logical unit of work, not one giant commit at the end.
|
||||
- **Silent reinterpretation**: If the task says X, do X. Don't do Y because you think it's better.
|
||||
- **Hard-coded solutions**: Don't write code that only works for specific test cases or examples. Implement the actual logic that solves the problem generally.
|
||||
- **Self-validating tests**: NEVER write test assertions that duplicate your implementation logic. If your code calculates \`a + b\`, your test MUST hardcode the expected result, not recalculate \`a + b\`. Tests that mirror implementation logic prove nothing.
|
||||
- **Test mutation**: If a test fails, investigate why. Fix the code, not the assertion. Changing \`expect(discount).toBe(10)\` to \`expect(discount).toBe(0)\` to make a test pass is shipping a bug.
|
||||
|
||||
## Scope Rules
|
||||
|
||||
@@ -45,14 +69,18 @@ ${CODEBASE_VERIFICATION}
|
||||
- If you need to modify a file that another task explicitly owns, coordinate via \`cw ask\` first — that agent may have in-progress changes you'd overwrite.
|
||||
${DEVIATION_RULES}
|
||||
${GIT_WORKFLOW}
|
||||
${PROGRESS_TRACKING}
|
||||
${CONTEXT_MANAGEMENT}
|
||||
|
||||
## Anti-Patterns
|
||||
## Definition of Done
|
||||
|
||||
- **Placeholder code**: No \`// TODO: implement later\`, no \`throw new Error('not implemented')\`. Either implement it or signal that you can't.
|
||||
- **Blind imports**: Don't import modules you haven't verified exist. Read the file or check package.json first.
|
||||
- **Ignoring test failures**: If tests fail after your changes, fix them or signal an error. Never commit broken tests.
|
||||
- **Mega-commits**: Commit after each logical unit of work, not one giant commit at the end.
|
||||
- **Silent reinterpretation**: If the task says X, do X. Don't do Y because you think it's better.
|
||||
- **Hard-coded solutions**: Don't write code that only works for specific test cases or examples. Implement the actual logic that solves the problem generally.`;
|
||||
Before writing signal.json with status "done", verify ALL of these:
|
||||
|
||||
- [ ] All tests pass (run the full relevant test suite)
|
||||
- [ ] No uncommitted changes (\`git status\` is clean)
|
||||
- [ ] Progress file is updated with final status
|
||||
- [ ] You implemented exactly what the task asked — no more, no less
|
||||
- [ ] You did not modify files outside your task's scope
|
||||
|
||||
If any item fails, fix it before signaling. If you cannot fix it, signal with status "error" explaining what's wrong.`;
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user