Commit Graph

416 Commits

Author SHA1 Message Date
Lukas May
6fa025251e feat: Wire up initiative deletion end-to-end
Add deleteInitiative tRPC procedure, wire Delete button in InitiativeCard
with confirm dialog (Shift+click bypass), remove unused onDelete prop chain.
Fix agents table FK constraints (initiative_id, account_id missing ON DELETE
SET NULL) via table recreation migration. Register conversations migration
in journal. Expand cascade delete tests to cover pages, projects, change
sets, agents (set null), and conversations (set null).
2026-02-18 17:54:53 +09:00
Lukas May
80aa3e42fb Fix StatusBadge crash when status is undefined 2026-02-18 17:44:38 +09:00
Lukas May
8bece70a61 fix: Wire archive button to updateInitiative mutation
The Archive menu item in InitiativeCard had no onClick handler.
Added mutation call with confirmation dialog (shift+click to skip).
2026-02-18 17:44:01 +09:00
Lukas May
e52b9d3332 Remove unused Edit and Duplicate menu items from initiative card 2026-02-18 17:43:21 +09:00
Lukas May
1331fb737d refactor: Wire buildExecutePrompt into dispatch manager
Dispatch manager now wraps task descriptions with buildExecutePrompt()
so agents receive the full execution protocol. Update test to match
prompt wrapping. Add worktree isolation note to workspace layout.
2026-02-18 17:40:03 +09:00
Lukas May
b63a8b605c refactor: Compress refine prompt for conciseness (439→243 words, -45%)
- Tighten items 1-3 arrow notation, compress item 4 to Better/Best
  progressive comparison, shorten item 5 scenario example
- Cut 3 redundant Rules bullets (already stated in Output Files and
  guard paragraphs)
- Collapse 5 DoD checks to 2 non-redundant verification items
- Compress behavioral guard paragraphs
2026-02-18 17:30:57 +09:00
Lukas May
a4d48262c1 refactor: Compress detail prompt for conciseness (775→473 words, -39%)
Drop redundant Specificity Test section (covered by examples and checklist),
remove Task Design Rules (implied by entire prompt), flatten frontmatter
docs, trim good example, tighten sizing/checkpoint/context sections.
2026-02-18 17:30:56 +09:00
Lukas May
c9769b09b7 refactor: Compress plan prompt for conciseness
Cut ~35% of words while preserving all high-value content:
- Merged Testing Strategy into Phase Design (rule + example)
- Eliminated Rules section (redundant with Phase Design, Dependencies)
- Compressed Dependency Graph intro (examples speak for themselves)
- Trimmed File Ownership and Specificity prose
- Reduced Existing Context from 4 to 2 bullets
- Tightened Definition of Done checklist
2026-02-18 17:30:09 +09:00
Lukas May
a4502ebf77 refactor: Compress discuss prompt for conciseness (~30% word reduction)
Cut redundant rules already demonstrated by good/bad examples,
removed default-Claude-behavior instructions, collapsed verbose
sections into single directives.
2026-02-18 17:30:07 +09:00
Lukas May
e73e99cb28 refactor: Compress shared agent prompts for conciseness (1060→699 words, -34%)
Apply aggressive compression: imperative style, remove anti-laziness
emphasis, cut rationale where obvious, eliminate redundant explanations.
All constant names and function signatures preserved.
2026-02-18 17:30:04 +09:00
Lukas May
67f98f4f35 refactor: Compress execute prompt for conciseness (~47% word reduction)
- Cut 5 anti-patterns: placeholder code, blind imports, ignoring test
  failures (all default Claude behavior), plus self-validating tests
  and test mutation (both already covered by TEST_INTEGRITY in shared.ts)
- Compressed execution protocol steps to imperative essentials
- Merged scope rules from 4 bullets to 3
- Trimmed definition of done checklist (removed redundant 5th item)
- Removed anti-laziness language (IMPORTANT, MUST, aggressive emphasis)
2026-02-18 17:30:00 +09:00
Lukas May
44d2a3ff08 docs: Update agent.md to reflect prompt overhaul
Remove CODEBASE_VERIFICATION references, document new shared constants
(TEST_INTEGRITY, SESSION_STARTUP, PROGRESS_TRACKING), update mode prompt
descriptions with TDD protocol, Definition of Done checklists, and
mandatory test specifications.
2026-02-18 17:21:57 +09:00
Lukas May
9ed7e9ad16 refactor: Rewrite execute prompt with TDD protocol, test integrity rules, and definition-of-done checklist
Replace the weak 7-step execution protocol with an explicit red-green-refactor
cycle that requires agents to write failing tests before implementing. Move
anti-patterns and scope rules above deviation/git sections so critical
constraints get more attention. Add session startup verification, progress
tracking, and a mandatory definition-of-done checklist that must pass before
signaling completion. Remove dead CODEBASE_VERIFICATION import.
2026-02-18 17:20:11 +09:00
Lukas May
b5509232f6 refactor: Add testability focus and definition-of-done checklists to discuss/refine prompts
Discuss prompt: add Testability & Verification question category, require
verification criteria for behavioral decisions, add definition-of-done checklist.

Refine prompt: strengthen unverifiable-requirements check to demand testable
acceptance criteria with inputs/outputs, extend missing-edge-cases to frame as
testable scenarios, add definition-of-done checklist.
2026-02-18 17:19:53 +09:00
Lukas May
09a388b490 refactor: Enforce mandatory test specs in detail prompt, add testing strategy to plan prompt
Detail: Replace vague "how to verify" requirement with mandatory test specification
(file path, scenarios, run command) for execute-category tasks. Update good-task
example to demonstrate the new format. Add Definition of Done checklist.

Plan: Add Testing Strategy section requiring tests within each implementation phase
instead of trailing test phases. Add Definition of Done checklist.
2026-02-18 17:19:48 +09:00
Lukas May
298c570bc4 refactor: Overhaul shared prompt constants — remove CODEBASE_VERIFICATION, trim GIT_WORKFLOW/CONTEXT_MANAGEMENT, add TEST_INTEGRITY/SESSION_STARTUP/PROGRESS_TRACKING 2026-02-18 17:18:53 +09:00
Lukas May
c04e6d7778 refactor: Replace file-count task sizing with lines-changed heuristic
Anchor on ~150 lines changed as the sweet spot based on SWE-bench Pro
data (107 lines / 4.1 files = 46% success for best agents). Old rules
used file count as the primary proxy which correlates poorly with task
difficulty compared to lines changed.
2026-02-18 16:54:10 +09:00
Lukas May
7354582d69 refactor: Add context management to plan/detail prompts, update docs
Add CONTEXT_MANAGEMENT shared block to plan and detail mode prompts so
architect agents also benefit from compaction awareness and parallel
execution hints. Update index.ts re-exports and agent docs.
2026-02-18 16:43:19 +09:00
Lukas May
4ef9db1501 refactor: Improve shared agent prompts — add context management, explain git rules, slim inter-agent comms
- Add CONTEXT_MANAGEMENT constant: tells agents to keep working through
  context compaction and parallelize reads
- Add "why" reasoning to each GIT_WORKFLOW rule so agents understand the
  purpose, not just the rule
- Slim buildInterAgentCommunication: replace verbose bash code blocks with
  a brief usage pattern paragraph, condense CLI docs to bullet list
2026-02-18 16:41:55 +09:00
Lukas May
459c09b687 refactor: Overhaul execute prompt with test-first protocol, context management, anti-hardcoding
- Add CONTEXT_MANAGEMENT import and inject into template
- Rewrite execution protocol: test-first (step 3), parallel file reads, execution-over-deliberation
- Add "why" rationale to scope rules (conflict prevention, overwrite risk)
- Add hard-coded solutions anti-pattern, soften imperative tone
- Rename section from "Anti-Patterns (never do these)" to "Anti-Patterns"
2026-02-18 16:41:53 +09:00
Lukas May
58514fef3f docs: Document standalone agent path resolution in completion detection 2026-02-10 16:01:25 +01:00
Lukas May
2aa807a394 fix: Resolve signal.json path mismatch for standalone agents
Standalone agents (no initiative or 0 linked projects) run in a
workspace/ subdirectory, but signal.json lookups used the parent
directory. This caused all standalone agents to be marked "crashed"
despite successful completion.

Track the actual agent cwd at spawn time via ActiveAgent.agentCwd
and probe for the workspace/ subdirectory during reconciliation and
crash detection paths.
2026-02-10 16:00:37 +01:00
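The path probing described in this commit can be illustrated with a minimal sketch. It is not the repository's implementation: `resolveSignalPath`, the injected `exists` predicate, the probing order, and the exact location of signal.json inside each directory are all assumptions made for illustration.

```typescript
// Hypothetical sketch: locate signal.json for an agent, probing workspace/ first.
// `exists` is injected (in Node it could be fs.existsSync) so the sketch has
// no runtime dependencies.
function resolveSignalPath(
  agentDir: string,
  exists: (path: string) => boolean,
): string | null {
  const candidates = [
    `${agentDir}/workspace/signal.json`, // standalone agents run in workspace/
    `${agentDir}/signal.json`, // other agents run in the parent directory
  ];
  for (const candidate of candidates) {
    if (exists(candidate)) return candidate;
  }
  // No signal file found; the caller decides whether that means "crashed".
  return null;
}
```

Probing both locations keeps crash detection independent of how the agent was spawned, which is the failure mode the commit message describes.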
Lukas May
62a542116d feat: Add task deletion with shift+click auto-confirm
- Add deleteTask tRPC mutation (repo already had delete method)
- Add X button to TaskRow, hidden until hover, with confirmation dialog
- Shift+click bypasses confirmation for fast bulk deletion
- Invalidates listInitiativeTasks on success
- Document shift+click pattern in CLAUDE.md as standard for destructive actions
2026-02-10 15:58:24 +01:00
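The Shift+click pattern for destructive actions could look roughly like the sketch below. The names `needsConfirmation` and `runDestructive`, the dialog text, and the injected `confirm` callback are hypothetical; the actual TaskRow code is not shown in this log.

```typescript
// Hypothetical sketch of the Shift+click bypass for destructive actions.
// Only the `shiftKey` flag of the click event matters here.
interface ClickLike {
  shiftKey: boolean;
}

// A confirmation dialog is needed unless Shift was held during the click.
function needsConfirmation(event: ClickLike): boolean {
  return !event.shiftKey;
}

// Runs `action` if the user confirms, or immediately on Shift+click.
// `confirm` is injected so the sketch does not depend on a browser.
function runDestructive(
  event: ClickLike,
  confirm: (message: string) => boolean,
  action: () => void,
): boolean {
  if (needsConfirmation(event) && !confirm("Delete this task?")) {
    return false; // user cancelled, nothing is deleted
  }
  action();
  return true;
}
```

In a browser, `confirm` could be `window.confirm` and `event` the click's MouseEvent, whose `shiftKey` property reports whether Shift was held.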
Lukas May
bfefbc85af feat: Switch cw ask from polling to SSE via onConversationAnswer subscription
- New onConversationAnswer subscription: listens for conversation:answered
  events matching a specific conversation ID, yields the answer text
- cw ask now subscribes via SSE instead of polling getConversation
- Removed --poll-interval and --timeout flags from cw ask
- Updated prompt to reflect SSE-based cw ask (no polling options)
2026-02-10 15:56:54 +01:00
Lukas May
bfc1b422f9 feat: Inject agent ID into prompts, SSE-based cw listen, all flags documented
- INTER_AGENT_COMMUNICATION constant → buildInterAgentCommunication(agentId) function
- Manager injects actual agent ID into prompt after DB record creation
- Agent ID hardcoded in cw listen/ask commands — no manifest.json indirection
- cw listen now uses onPendingConversation SSE subscription instead of polling
- CLI trpc-client upgraded with splitLink for subscription support
- All CLI flags (--agent-id, --from, --timeout, --poll-interval) documented in prompt
- conversation:created/answered added to ALL_EVENT_TYPES
2026-02-10 15:53:01 +01:00
Lukas May
c2d665c24f feat: Make initiative branch and execution mode editable from header
- Execution mode badge toggles between YOLO/REVIEW on click
- Branch badge opens inline editor (input + save/cancel)
- Branch editing locked once any task has left pending status
- Server-side guard rejects branch changes after work has started
- getInitiative returns branchLocked flag
- updateInitiativeConfig now accepts optional branch field
2026-02-10 15:52:40 +01:00
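The branch-locking rule from this commit can be sketched as a pure function plus a guard. The status names other than "pending", and the helper names `isBranchLocked` and `updateBranch`, are assumptions; only the rule itself (lock once any task has left pending) comes from the commit message.

```typescript
// Hypothetical sketch of the branch lock: editing is allowed only while
// every task is still pending.
type TaskStatus = "pending" | "in_progress" | "completed"; // assumed status names

interface TaskLike {
  status: TaskStatus;
}

// True once any task has left the pending state.
function isBranchLocked(tasks: TaskLike[]): boolean {
  return tasks.some((task) => task.status !== "pending");
}

// Server-side style guard: reject a branch change after work has started.
function updateBranch(current: string, next: string, tasks: TaskLike[]): string {
  if (next !== current && isBranchLocked(tasks)) {
    throw new Error("Branch is locked: work has already started");
  }
  return next;
}
```

Computing the flag on the server and returning it to the client, as `getInitiative` is said to do with `branchLocked`, lets the UI disable the editor while the guard still protects against stale clients.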
Lukas May
3ff1f485f1 fix: Prevent agents page from scrolling — lock layout to viewport
Body: height 100vh + overflow hidden instead of min-height 100vh,
so the browser never shows a scrollbar on html/body.
AppLayout: h-screen flex column with shrink-0 header and flex-1
min-h-0 overflow-auto main. Pages like initiatives scroll within
main; agents page uses h-full with internal panel scrollers.
2026-02-10 15:47:55 +01:00
Lukas May
142f67c131 fix: Prevent agents page from scrolling — constrain scroll to panels
Left agent list gets min-h-0 for proper overflow containment in grid.
Right output panel gets overflow-hidden so AgentOutputViewer stays
within the available grid cell height.
2026-02-10 15:32:43 +01:00
Lukas May
9f5421f6bc test: Rewrite conversation integration test with mock server and real tasks
Replace full CoordinationServer with a lightweight mock that serves only
conversation tRPC procedures backed by an in-memory repository. Agents
now have real coding tasks (write spec, ask questions, create summary)
and the two-question flow proves the listen→answer→re-listen cycle works.
2026-02-10 15:27:26 +01:00
Lukas May
f8c5dce588 test: Add PreviewManager integration tests
21 tests covering the full preview lifecycle: start (happy path, phaseId,
Docker unavailable, project not found, compose failure, health check failure,
no healthchecks), stop, list (with filter, missing labels), getStatus
(running/failed/stopped/building/not found), and stopAll (including partial
failure resilience).
2026-02-10 14:02:43 +01:00
Lukas May
9902069d8d test: Add real Claude inter-agent conversation integration test
Two-session test: Agent A listens for questions and answers, Agent B
asks a question and captures the response. Also fixes missing
conversationRepository passthrough in tRPC adapter.
2026-02-10 13:49:04 +01:00
Lukas May
60f06671e4 fix: Include dirty worktree paths in commit prompt and fix retry counter
Three bugs fixed in auto-cleanup commit retry flow:

1. resumeForCommit now calls getDirtyWorktreePaths() to include specific
   project subdirectory names in the prompt, so the agent knows which
   dirs to cd into and commit (instead of running git from the non-repo
   parent dir).

2. Removed finally block in tryAutoCleanup that reset the retry counter
   after every call, making MAX_COMMIT_RETRIES ineffective. Counter is
   now only cleaned up on success, max retries, or error.

3. resumeForCommit returns false early if no worktrees are actually
   dirty, preventing unnecessary commit retries for clean agents.
2026-02-10 13:44:10 +01:00
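The second bug above (a `finally` block resetting the counter) is easiest to see in a small sketch. This is an illustration only: the value of `MAX_COMMIT_RETRIES`, the `Outcome` type, and the injected `attemptCommit` callback are assumptions, and the real `tryAutoCleanup` is likely asynchronous.

```typescript
// Hypothetical sketch: a retry counter that is cleared only on terminal outcomes.
const MAX_COMMIT_RETRIES = 3; // assumed value; the real limit is not stated in the log

type Outcome = "committed" | "retrying" | "gave-up" | "error";

const retryCounts = new Map<string, number>();

// `attemptCommit` stands in for resuming the agent and asking it to commit.
function tryAutoCleanup(agentId: string, attemptCommit: () => boolean): Outcome {
  const attempts = retryCounts.get(agentId) ?? 0;
  if (attempts >= MAX_COMMIT_RETRIES) {
    retryCounts.delete(agentId); // terminal: max retries reached
    return "gave-up";
  }
  try {
    if (attemptCommit()) {
      retryCounts.delete(agentId); // terminal: success
      return "committed";
    }
    // Not terminal: keep the count so the next call sees it.
    // A `finally` that deleted the entry would reset it on every call,
    // so the limit above could never be reached.
    retryCounts.set(agentId, attempts + 1);
    return "retrying";
  } catch {
    retryCounts.delete(agentId); // terminal: error
    return "error";
  }
}
```

With a commit attempt that always fails, three calls return "retrying" and the fourth returns "gave-up", which is the behaviour the fix restores.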
Lukas May
a6371e156a feat: Add inter-agent conversation system (listen, ask, answer)
Enables parallel agents to communicate through a CLI-based conversation
mechanism coordinated via tRPC. Agents can ask questions to peers and
receive answers, with target resolution by agent ID, task ID, or phase ID.
2026-02-10 13:43:30 +01:00
Lukas May
270a5cb21d feat: Add Docker-based preview deployments for phase review
Preview deployments let reviewers spin up the app at a specific branch
in local Docker containers, accessible through a single Caddy reverse
proxy port. Docker is the source of truth — no database table needed.

New module: src/preview/ with config discovery (.cw-preview.yml →
compose → Dockerfile fallback), compose generation, Docker CLI wrapper,
health checking, and port allocation (9100-9200 range).
2026-02-10 13:24:56 +01:00
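Port allocation within the 9100-9200 range might be sketched as below. The function name `allocatePort`, the lowest-free-port strategy, and treating both ends of the range as inclusive are assumptions; the commit only states the range.

```typescript
// Hypothetical sketch of preview port allocation in the 9100-9200 range.
const PORT_MIN = 9100;
const PORT_MAX = 9200; // assumed inclusive

// Returns the lowest port in range that is not in `inUse`,
// or null when the whole range is taken.
function allocatePort(inUse: ReadonlySet<number>): number | null {
  for (let port = PORT_MIN; port <= PORT_MAX; port++) {
    if (!inUse.has(port)) return port;
  }
  return null;
}
```

Since the commit treats Docker as the source of truth, the `inUse` set would presumably be derived from running containers rather than from a database table.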
Lukas May
783a07bfb7 fix: Show actionable error details for account health check failures
Setup tokens from `claude setup-token` can't query the usage API,
resulting in a useless "Usage API request failed" message. Now shows
the actual HTTP status and guides users to complete OAuth setup.
Also distinguishes warning state (yellow) from error state (red)
in the AccountCard UI.
2026-02-10 13:16:03 +01:00
Lukas May
06f443ebc8 refactor: DB-driven agent output events with single emission point
DB log chunk insertion is now the sole trigger for agent:output events.
Eliminates triple emission (FileTailer, handleStreamEvent, output buffer)
in favor of: FileTailer.onRawContent → DB insert → EventBus emit.

- createLogChunkCallback emits agent:output after successful DB insert
- spawnInternal now wires onRawContent callback (fixes session 1 gap)
- Remove eventBus from FileTailer (no longer touches EventBus)
- Remove eventBus from ProcessManager constructor (dead parameter)
- Remove agent:output emission from handleStreamEvent text_delta
- Remove outputBuffers map and all buffer helpers from manager/handler
- Remove getOutputBuffer from AgentManager interface and implementations
- getAgentOutput tRPC: DB-only, no file fallback
- onAgentOutput subscription: no initial buffer yield, events only
- AgentOutputViewer: accumulates raw JSONL chunks, parses uniformly
2026-02-10 11:47:36 +01:00
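The single emission point (raw content, then DB insert, then event) can be sketched as follows. `createLogChunkCallback` is named in the commit, but its signature here is assumed, `MiniEventBus` is a hypothetical stand-in for the real EventBus, and the real code is probably asynchronous.

```typescript
// Hypothetical sketch: the DB insert is the only trigger for agent:output.
type OutputListener = (agentId: string, chunk: string) => void;

class MiniEventBus {
  private listeners: OutputListener[] = [];

  on(listener: OutputListener): void {
    this.listeners.push(listener);
  }

  emit(agentId: string, chunk: string): void {
    for (const listener of this.listeners) listener(agentId, chunk);
  }
}

// Returns the callback handed to the file tailer: persist first, then emit.
// `insertChunk` is assumed to throw when the insert fails.
function createLogChunkCallback(
  agentId: string,
  insertChunk: (agentId: string, chunk: string) => void,
  bus: MiniEventBus,
): (chunk: string) => void {
  return (chunk) => {
    insertChunk(agentId, chunk); // if this throws, nothing is emitted
    bus.emit(agentId, chunk); // one event per stored chunk
  };
}
```

Because subscribers only ever see chunks that were stored, a client that reads history from the DB and then listens for events gets a consistent stream.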
Lukas May
771cd71c1e feat: Validate default branch exists in repo when setting project defaultBranch
registerProject and updateProject now verify via remoteBranchExists that the
specified branch actually exists in the cloned repository before saving.
2026-02-10 11:46:00 +01:00
Lukas May
a8d3f52d09 feat: Re-add initiative branch field and add projects settings page
Allow users to specify a custom branch when creating initiatives
(auto-generated if left blank). Add updateProject tRPC procedure
and /settings/projects page with inline-editable defaultBranch.
2026-02-10 11:19:48 +01:00
Lukas May
fc3039a147 fix(dispatch): Filter planning-category tasks from dispatch pipeline and agent context
Planning tasks (research, discuss, plan, detail, refine) have their own
architect flow and should never enter the dispatch pipeline or clutter
agent context. Three changes:

1. Phase auto-queue skips planning-category tasks
2. Safety net in getNextDispatchable() skips planning tasks
3. gatherInitiativeContext() filters to execution tasks only
2026-02-10 11:18:17 +01:00
Lukas May
d18c3c7e44 fix(web): Dismiss button on refine agent ChangeSetBanner now works
The dismiss mutation only invalidated `listAgents` but the hook reads
from `getActiveRefineAgent`, so the banner stayed visible after dismiss.
Added optimistic cache clearing and invalidation for `getActiveRefineAgent`.
2026-02-10 11:03:20 +01:00
Lukas May
57a5843324 fix(web): Update PipelineTab to use renamed PlanSection component 2026-02-10 10:55:59 +01:00
Lukas May
ca548c1eaa feat: Auto-branch initiative system with per-project default branches
Planning tasks (research, discuss, plan, detail, refine) now run on
the project's defaultBranch instead of hardcoded 'main'. Execution
tasks (execute, verify, merge, review) auto-generate an initiative
branch (cw/<slug>) on first dispatch. Branch configuration removed
from initiative creation — it's now fully automatic.

- Add PLANNING_CATEGORIES/EXECUTION_CATEGORIES to branch-naming
- Dispatch manager splits logic by task category
- ProcessManager uses per-project defaultBranch fallback
- Phase dispatch uses project.defaultBranch for ensureBranch base
- Remove mergeTarget from createInitiative input
- Rename updateInitiativeMergeConfig → updateInitiativeConfig
- Add defaultBranch field to registerProject + UI
- Rename mergeTarget → branch across all frontend components
2026-02-10 10:53:35 +01:00
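The category split described in this commit could be expressed as in the sketch below. `PLANNING_CATEGORIES` and `EXECUTION_CATEGORIES` are named in the commit, but their shape as string tuples, and the helpers `isPlanning` and `branchForTask`, are assumptions for illustration.

```typescript
// Hypothetical sketch of category-based branch selection.
const PLANNING_CATEGORIES = ["research", "discuss", "plan", "detail", "refine"] as const;
const EXECUTION_CATEGORIES = ["execute", "verify", "merge", "review"] as const;

type PlanningCategory = (typeof PLANNING_CATEGORIES)[number];
type ExecutionCategory = (typeof EXECUTION_CATEGORIES)[number];
type TaskCategory = PlanningCategory | ExecutionCategory;

// Type guard: true for categories handled by the architect flow.
function isPlanning(category: TaskCategory): category is PlanningCategory {
  return (PLANNING_CATEGORIES as readonly string[]).indexOf(category) !== -1;
}

// Planning tasks stay on the project's default branch;
// execution tasks use the initiative branch cw/<slug>.
function branchForTask(
  category: TaskCategory,
  defaultBranch: string,
  slug: string,
): string {
  return isPlanning(category) ? defaultBranch : `cw/${slug}`;
}
```

For example, `branchForTask("plan", "develop", "add-login")` yields `develop`, while `branchForTask("execute", "develop", "add-login")` yields `cw/add-login`.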
Lukas May
0407f05332 refactor: Rename agent modes breakdown→plan, decompose→detail
Full rename across the codebase for clarity:
- breakdown (initiative→phases) is now "plan"
- decompose (phase→tasks) is now "detail"

Updates schema enums, TypeScript types, events, prompts, output handler,
tRPC procedures, CLI commands, frontend components, tests, and docs.
Also fixes 0022 migration multi-statement issue (adds statement-breakpoint markers).
2026-02-10 10:51:42 +01:00
Lukas May
f9f8b4c185 refactor(agent): Use agent name instead of ID for log directory paths
Aligns agent-logs directory naming with agent-workdirs so both use the
human-readable agent name, making filesystem correlation trivial.
2026-02-10 10:41:47 +01:00
Lukas May
bf898cb86e feat(agent): Enrich breakdown/decompose agent input with full initiative context
Breakdown and decompose agents now receive all existing phases, tasks,
and pages as read-only context so they can plan with awareness of what
already exists instead of operating in a vacuum.
2026-02-10 10:18:55 +01:00
Lukas May
118f6d0d51 fix(task): Filter out decompose container tasks from phase and initiative task lists
Decompose tasks are parent containers whose children inherit the same phaseId,
causing both the container and its children to appear in task listings.
2026-02-10 10:18:47 +01:00
Lukas May
a98c2d0f6b feat(web): Add stop button to agent detail view header
Allows stopping running agents directly from the output viewer
instead of requiring the card dropdown menu.
2026-02-10 10:13:45 +01:00
Lukas May
4d3bd9ca90 fix(agent): Add refresh token validation before token refresh
Check for refresh token availability before attempting credential refresh.
Setup tokens that expire without a refresh token now return a clear error
instead of attempting an invalid refresh operation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 10:01:35 +01:00
Lukas May
7f8a936c02 chore: ensure working directory state is committed
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 10:00:41 +01:00
Lukas May
4ac03b74ca chore: sync working directory state
Ensure all changes are committed.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-10 09:59:25 +01:00