Commit Graph

153 Commits

Author SHA1 Message Date
Lukas May
e199188670 feat: cw task add CLI command + {AGENT_ID} prompt placeholder
- Add `createTaskForAgent` tRPC mutation: resolves agent → task → phase, creates sibling task
- Add `cw task add <name> --agent-id <id>` CLI command
- Replace `{AGENT_ID}` and `{AGENT_NAME}` placeholders in writeInputFiles() before flushing
- Update docs/agent.md and docs/cli-config.md
2026-03-06 22:22:49 +01:00
Lukas May
6482960c6f feat: errand review & request changes
Add errand.requestChanges procedure that re-spawns an agent in the
existing worktree with user feedback. Replace raw <pre> diff blocks
with syntax-highlighted ErrandDiffView using FileCard components.
Add Output/Changes tabs to the active errand view.
2026-03-06 22:09:01 +01:00
Lukas May
a3a9076411 Merge branch 'cw/radar-screen-performance' into cw-merge-1772829950184 2026-03-06 21:45:50 +01:00
Lukas May
346d62ef8d fix: prevent stale duplicate planning tasks from blocking phase completion
Three fixes for phases getting stuck when a detail task crashes and is retried:

1. detailPhase mutation (architect.ts): clean up orphaned pending/in_progress
   detail tasks before creating new ones, preventing duplicates at the source
2. orchestrator recovery: detect and complete stale duplicate planning tasks
   (same category+phase, one completed, one pending)
3. ensureBranch: catch "already exists" TOCTOU race instead of blocking phase
2026-03-06 21:44:26 +01:00
Lukas May
d97afa84d4 Merge branch 'cw/radar-screen-performance-phase-backfill-script-cw-backfill-metrics-cli-command-docs' into cw-merge-1772829393658 2026-03-06 21:36:33 +01:00
Lukas May
db2196f1d1 feat: add backfill-metrics script and cw backfill-metrics CLI command
Populates the agent_metrics table from existing agent_log_chunks data after
the schema migration. Reads chunks in batches of 500, accumulates per-agent
counts in memory, then upserts with additive ON CONFLICT DO UPDATE to match
the ongoing insertChunk write-path behavior.

- apps/server/scripts/backfill-metrics.ts: core backfillMetrics(db) + CLI wrapper backfillMetricsFromPath(dbPath)
- apps/server/scripts/backfill-metrics.test.ts: 8 tests covering all chunk types, malformed JSON, isolation, empty DB, and re-run double-count behavior
- apps/server/cli/index.ts: new top-level `cw backfill-metrics [--db <path>]` command
- docs/database-migrations.md: Post-migration backfill scripts section documenting when and how to run the script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 21:36:08 +01:00
Lukas May
1fd3a1ae4a fix: make errand.list input optional so frontend query works without args 2026-03-06 21:35:30 +01:00
Lukas May
4a9f38c4e1 perf: replace O(N·chunks) listForRadar read path with O(N·agents) metrics lookup
listForRadar previously called findByAgentIds() and JSON-parsed every chunk to
compute questionsCount, subagentsCount, and compactionsCount. Switch to
findMetricsByAgentIds() which reads the pre-computed agent_metrics table,
eliminating the chunk scan and per-row JSON.parse entirely.

Add two new test cases: agent with no metrics row returns zero counts, and
listForRadar response rows never carry chunk content.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 21:35:29 +01:00
Lukas May
6eb1f8fc2a feat: add agent_metrics write+read path to LogChunkRepository
Wrap insertChunk in a synchronous better-sqlite3 transaction that upserts
agent_metrics counters atomically on every chunk insert. Malformed JSON
skips the upsert but always preserves the chunk row.

Add findMetricsByAgentIds to the interface and Drizzle adapter for
efficient bulk metric reads.

Add 8-test suite covering all write/read paths and edge cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 21:31:41 +01:00
Lukas May
0f53930610 feat: auto-create Integration phase for multi-leaf initiatives
When an initiative has multiple end phases (leaf nodes with no
dependents), queueAllPhases now auto-creates an Integration phase
that depends on all of them. This catches cross-phase incompatibilities
(type mismatches, conflicting exports, broken tests) before review.
2026-03-06 21:31:20 +01:00
Lukas May
276c342a50 feat: add agent_metrics table schema and Drizzle migration
Adds the agentMetrics table to SQLite schema for storing pre-computed
per-agent event counts (questions, subagents, compactions), enabling
listForRadar to fetch one row per agent instead of scanning log chunks.

Also fixes pre-existing Drizzle snapshot chain collision in meta/
(0035/0036 snapshots had wrong prevId due to parallel agent branches)
to unblock drizzle-kit generate.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 21:25:38 +01:00
Lukas May
094b7e6307 fix: wire errand repository through tRPC adapter
ErrandRepository was instantiated in the container but never passed
from TrpcAdapterOptions into createContext(), causing all errand
procedures to throw "Errand repository not available" at runtime.
2026-03-06 21:24:19 +01:00
Lukas May
56efc0bad6 fix: detect hung agent processes via defensive signal.json polling
Claude CLI occasionally hangs after writing signal.json but never exits.
Add an optional signal check to pollForCompletion: after a 60s grace
period, check signal.json every 30s. If a valid completion signal is
found while the process is still alive, SIGTERM it and proceed to
normal completion handling.
2026-03-06 21:23:19 +01:00
Lukas May
388befd7c3 fix: register errand router in appRouter and fix build errors
errandProcedures was defined but never imported/spread into the app
router, causing "No procedure found on path errand.create". Also fixed
nullable projectId TS errors in delete/abandon and added missing
sendUserMessage to test mocks.
2026-03-06 21:17:44 +01:00
Lukas May
5ede391311 Merge branch 'main' into cw/small-change-flow-conflict-1772826399181
# Conflicts:
#	README.md
#	apps/server/execution/orchestrator.ts
#	apps/server/test/unit/headquarters.test.ts
#	apps/server/trpc/router.ts
#	apps/server/trpc/routers/agent.ts
#	apps/server/trpc/routers/headquarters.ts
#	apps/web/src/components/hq/HQSections.test.tsx
#	apps/web/src/components/hq/types.ts
#	apps/web/src/layouts/AppLayout.tsx
#	apps/web/src/routes/hq.tsx
#	apps/web/tsconfig.app.tsbuildinfo
#	docs/dispatch-events.md
#	docs/server-api.md
#	vitest.config.ts
2026-03-06 21:01:36 +01:00
Lukas May
ba6ebe2594 Merge branch 'cw/review-tab-performance' into cw-merge-1772826318787 2026-03-06 20:45:19 +01:00
Lukas May
afdc1c7e00 fix: recover in_progress phases where all tasks are already completed on server restart 2026-03-06 20:41:26 +01:00
Lukas May
d4a28713f6 fix: conflict resolution tasks now get dispatched instead of permanently blocking initiative
- Remove original task blocking in handleConflict (task is already completed by handleAgentStopped)
- Return created conflict task from handleConflict so orchestrator can queue it for dispatch
- Add dedup check to prevent duplicate resolution tasks on crash retries
- Queue conflict resolution task via dispatchManager in mergeTaskIntoPhase
- Add recovery for erroneously blocked tasks in recoverDispatchQueues
- Update tests and docs
2026-03-06 20:37:29 +01:00
Lukas May
f1af9e5d7a chore: resolve merge conflicts for DiffCache test task
Resolves add/add conflict in diff-cache.ts (kept typed PhaseMetaResponse/
FileDiffResponse interfaces from HEAD over unknown-typed singletons from test
branch) and content conflict in phase.ts (kept both phaseMetaCache and
fileDiffCache imports; removed auto-merged duplicate firstClone/headHash/
cacheKey/cached declarations and unreachable empty-projects guard).

Also cleans auto-merged duplicate getHeadCommitHash in orchestrator.test.ts
and simple-git-branch-manager.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 20:33:41 +01:00
Lukas May
f63b1c5eec Merge branch 'cw/radar' into cw-merge-1772825408137 2026-03-06 20:30:08 +01:00
Lukas May
a50ee01626 test: Add DiffCache unit tests and getPhaseReviewDiff cache integration tests
Creates diff-cache.ts module with generic DiffCache<T> class (TTL, prefix
invalidation, env-var configuration) and exports phaseMetaCache / fileDiffCache
singletons. Wires cache into getPhaseReviewDiff via getHeadCommitHash on
BranchManager. Adds 6 unit tests for DiffCache and 5 integration tests
verifying cache hit/miss behaviour, prefix invalidation, and NOT_FOUND guard.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 20:06:28 +01:00
Lukas May
0996073deb feat: add in-memory diff cache with TTL and commit-hash invalidation
Adds DiffCache<T> module, extends BranchManager with getHeadCommitHash,
and wires phase-level caching into getPhaseReviewDiff and getFileDiff.
Cache is invalidated in ExecutionOrchestrator after each task merges into
the phase branch, ensuring stale diffs are never served after new commits.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 19:51:04 +01:00
Lukas May
4890721a92 feat: split getPhaseReviewDiff into metadata + add getFileDiff procedure
Rewrites getPhaseReviewDiff to return file-level metadata (path, status,
additions, deletions) instead of a raw diff string, eliminating 10MB+
payloads for large repos. Adds getFileDiff for on-demand per-file hunk
content with binary detection via numstat. Multi-project initiatives
prefix file paths with the project name to avoid collisions.

Adds integration tests that use real local git repos + in-memory SQLite
to verify both procedures end-to-end (binary files, deleted files,
spaces in paths, error cases).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 19:45:57 +01:00
Lukas May
05eb160749 feat: add diffBranchesStat and diffFileSingle to BranchManager
Adds FileStatEntry type and two new primitives to the BranchManager
port and SimpleGitBranchManager adapter, enabling split diff
procedures in the tRPC layer without returning raw multi-megabyte diffs.

- FileStatEntry captures path, status, additions/deletions, oldPath
  (renames), and optional projectId for multi-project routing
- diffBranchesStat uses --name-status + --numstat, detects binary
  files (shown as - / - in numstat), handles spaces in filenames
- diffFileSingle returns raw unified diff for a single file path
2026-03-06 19:36:36 +01:00
Lukas May
c0096503b2 fix: Re-record cassettes and exclude workdir from test discovery
Re-recorded all 4 cassette files to reflect current prompt templates.
Added `workdir/**` to vitest exclude list to prevent test discovery
in agent worktree directories.
2026-03-06 17:00:46 +01:00
Lukas May
28521e1c20 chore: merge main into cw/small-change-flow
Integrates main branch changes (headquarters dashboard, task retry count,
agent prompt persistence, remote sync improvements) with the initiative's
errand agent feature. Both features coexist in the merged result.

Key resolutions:
- Schema: take main's errands table (nullable projectId, no conflictFiles,
  with errandsRelations); migrate to 0035_faulty_human_fly
- Router: keep both errandProcedures and headquartersProcedures
- Errand prompt: take main's simpler version (no question-asking flow)
- Manager: take main's status check (running|idle only, no waiting_for_input)
- Tests: update to match removed conflictFiles field and undefined vs null
2026-03-06 16:48:12 +01:00
Lukas May
5598e1c10f feat: implement Radar backend tRPC procedures with repository extensions
Add five new tRPC query procedures powering the Radar page's per-agent
behavioral metrics (questions asked, subagent spawns, compaction events,
inter-agent messages) plus the batch repository methods they require.

Repository changes:
- LogChunkRepository: add findByAgentIds() for batch fetching without N+1
- ConversationRepository: add countByFromAgentIds() and findByFromAgentId()
- Drizzle adapters: implement all three new methods using inArray()
- InMemoryConversationRepository (integration test): implement new methods

tRPC procedures added:
- agent.listForRadar: filtered agent list with per-agent metrics computed
  from log chunks (questionsCount, subagentsCount, compactionsCount) and
  conversation counts (messagesCount); supports timeRange/status/mode/initiative filters
- agent.getCompactionEvents: compact system init chunks for one agent (cap 200)
- agent.getSubagentSpawns: Agent tool_use entries with prompt preview (cap 200)
- agent.getQuestionsAsked: AskUserQuestion tool calls with questions array (cap 200)
- conversation.getByFromAgent: conversations by fromAgentId with toAgentName resolved

All 13 new unit tests pass; existing test suite unaffected.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 16:40:18 +01:00
Lukas May
1e16ad82e8 fix: Show conflict resolution agents in HQ dashboard
getHeadquartersDashboard had no section for active conflict agents,
so initiatives with a running conflict-* agent disappeared from all
HQ sections. Add resolvingConflicts array to surface them.
2026-03-06 16:39:48 +01:00
Lukas May
02ca1d568e fix: Disable EventEmitter maxListeners warning for SSE subscriptions
Each SSE client registers a listener per event type (30+ types), so a
few concurrent connections easily exceed the previous limit of 100.
Listeners are properly cleaned up on disconnect via eventBusIterable's
finally block, so this is not a real leak.
2026-03-06 16:39:36 +01:00
Lukas May
0ec1022aba Merge branch 'cw/small-change-flow-task-3yaqNLqG3kk5ku6rH0exF' into cw-merge-1772810964768 2026-03-06 16:29:25 +01:00
Lukas May
813979388b feat: wire conflictFiles through errand.get and add repository tests
- `errand.get` now returns `conflictFiles: string[]` (always an array,
  never null) with defensive JSON.parse error handling
- `errand.get` returns `projectPath: string | null` computed from
  workspaceRoot + getProjectCloneDir so cw errand resolve can locate
  the worktree without a second tRPC call
- Add `apps/server/db/repositories/drizzle/errand.test.ts` covering
  conflictFiles store/retrieve, null for non-conflict errands, and
  findAll including conflictFiles
- Update `errand.test.ts` mock to include getProjectCloneDir and fix
  conflictFiles expectation from null to []

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 16:28:56 +01:00
Lukas May
e86a743c0b feat: Add all 9 cw errand CLI subcommands with tests
Wires errand command group into CLI with start, list, chat, diff,
complete, merge, resolve, abandon, and delete subcommands. All
commands call tRPC procedures via createDefaultTrpcClient(). The
start command validates description length client-side (≤200 chars)
before making any network calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 16:26:15 +01:00
Lukas May
377e8de5e9 feat: Add errand tRPC router with all 9 procedures and comprehensive tests
Implements the errand workflow for small isolated changes that spawn a
dedicated agent in a git worktree:
- errand.create: branch + worktree + DB record + agent spawn
- errand.list / errand.get / errand.diff: read procedures
- errand.complete: transitions active→pending_review, stops agent
- errand.merge: merges branch, handles conflicts with conflictFiles
- errand.delete / errand.abandon: cleanup worktree, branch, agent
- errand.sendMessage: delivers user message directly to running agent

Supporting changes:
- Add 'errand' to AgentMode union and agents.mode enum
- Add sendUserMessage() to AgentManager interface and MockAgentManager
- MockAgentManager now accepts optional agentRepository to persist agents
  to the DB (required for FK constraint satisfaction in tests)
- Add ORDER BY createdAt DESC, id DESC to errand findAll
- Fix dispatch/manager.test.ts missing sendUserMessage mock

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 16:21:01 +01:00
Lukas May
3a328d2b1c feat: Add errands schema, repository, and wire into tRPC context/container
Creates the errands table (with conflictFiles column), errand-repository
port interface, DrizzleErrandRepository adapter, and wires the repository
into TRPCContext, the DI container, _helpers.ts requireErrandRepository guard,
and the test harness. Also fixes pre-existing TS error in controller.test.ts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 15:49:26 +01:00
Lukas May
89c98d38cc feat: Add getHeadquartersDashboard tRPC procedure for HQ action items
Aggregates all user-blocking action items into a single composite query:
- waitingForInput: agents paused on questions (oldest first)
- pendingReviewInitiatives: initiatives awaiting content review
- pendingReviewPhases: phases awaiting diff review
- planningInitiatives: active initiatives with all phases pending and no running agents
- blockedPhases: phases in blocked state with optional last-message snippet

Wired into appRouter and covered by 10 unit tests using in-memory Drizzle DB
and an inline MockAgentManager (no real processes required).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 15:35:54 +01:00
Lukas May
67658fb717 fix: require signal.json for all errand agent exit scenarios
Option A ("ask inline, session stays open") described a path where the
errand agent could ask questions without writing signal.json, which broke
the server's completion detection (checkAgentCompletionResult polls for
done|questions|error status). Remove the Option A/B distinction and make
signal.json with questions status the single mechanism for all user-input
requests, consistent with how other agents handle blocking questions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 15:35:11 +01:00
Lukas May
41c5d292bb fix: allow errand agent to end session with questions and resume
The errand agent can now write { "status": "questions", ... } to
signal.json to pause mid-task and ask the user for clarification.
The session ends cleanly; the user answers via UI or CLI; the system
resumes the agent with their answers via sendUserMessage.

Two changes:
- buildErrandPrompt: adds "Option B" explaining the questions signal
  format and the resume-on-answer lifecycle, alongside the existing
  inline-question approach.
- sendUserMessage: extends allowed statuses from running|idle to also
  include waiting_for_input, so agents paused on a questions signal
  can be resumed when the user replies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:54:53 +01:00
Lukas May
2bef0fa682 Merge branch 'cw/project-management-operations-to-ui' into cw-merge-1772805244951 2026-03-06 14:54:05 +01:00
Lukas May
986cae880d chore: Remove unused realpathSync import 2026-03-06 14:52:42 +01:00
Lukas May
4a7105eb8f fix: Worktree get() matches wrong agent due to ambiguous endsWith lookup
The `get(id)` method on SimpleGitWorktreeManager used `path.endsWith(id)`
to find worktrees. Since all agents working on the same project create
worktrees with the same project name suffix (e.g., "codewalk-district"),
cleanup for one agent could match and delete another agent's worktree.

Fix: match on `basename(worktreesDir)/id` so each manager's lookups are
scoped to its own worktree base directory.
2026-03-06 14:52:28 +01:00
Lukas May
7a4d0d2582 feat: Add Sync All button to projects settings page with tests
- Adds syncAllMutation calling trpc.syncAllProjects once for all cards
- Shows Sync All button in header when ≥1 project exists (hidden otherwise)
- Propagates isPending to ProjectCard.syncAllPending to disable per-card sync
- On success: toasts all-success or failure count; invalidates listProjects + getProjectSyncStatus per project
- On network error: toasts Sync failed with err.message
- Adds unit tests for ProjectSyncManager.syncAllProjects: empty list, all succeed, partial failure, result shape, and failure counting logic

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:39:58 +01:00
Lukas May
8f4fa2a233 fix: disambiguate CONFLICT errors in registerProject by inspecting the SQLite UNIQUE constraint column
Previously a single "that name or URL" message was thrown regardless of which
column violated uniqueness. Now the catch block inspects the error string from
SQLite to emit a name-specific or url-specific message, with a generic fallback
when neither column can be identified.

Adds vitest tests covering all four scenarios: name conflict, url conflict,
unknown column conflict, and non-UNIQUE error passthrough.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:37:21 +01:00
Lukas May
e2c489dc48 feat: teach errand agent how to ask questions interactively
Add a dedicated "Asking questions" section to the errand prompt so the
agent knows it can pause, ask for clarification, and wait for the user
to reply via the UI chat input or `cw errand chat`. Previously the
prompt said "work interactively" with no guidance on the mechanism,
leaving the agent to guess.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:31:03 +01:00
Lukas May
23964374d1 Merge branch 'cw/agent-details' into cw-merge-1772802959182 2026-03-06 14:15:59 +01:00
Lukas May
6f38ae21ed Merge branch 'main' into cw/agent-details-conflict-1772802863659
# Conflicts:
#	docs/server-api.md
2026-03-06 14:15:30 +01:00
Lukas May
0f1c578269 fix: Fail fast when agent worktree creation or branch setup fails
Previously, branch computation errors and ensureBranch failures were
silently swallowed for all tasks, allowing execution agents to spawn
without proper git isolation. This caused alert-pony to commit directly
to main instead of its task branch.

- manager.ts: Verify each project worktree subdirectory exists after
  createProjectWorktrees; throw if any are missing. Convert passive
  cwdVerified log to a hard guard.
- dispatch/manager.ts: Make branch computation and ensureBranch errors
  fatal for execution tasks (execute, verify, merge, review) while
  keeping them non-fatal for planning tasks.
2026-03-06 14:08:59 +01:00
Lukas May
343c6a83a8 feat: Add agent spawn infrastructure for errand mode
Implements three primitives needed before errand tRPC procedures can be wired up:

- agentManager.sendUserMessage(agentId, message): resumes an errand agent with a
  raw user message, bypassing the conversations table and conversationResumeLocks.
  Throws on missing agent, invalid status, or absent sessionId.

- writeErrandManifest(options): writes .cw/input/errand.md (YAML frontmatter),
  .cw/input/manifest.json (errandId/agentId/agentName/mode, no files/contextFiles),
  and .cw/expected-pwd.txt to an agent workdir.

- buildErrandPrompt(description): minimal prompt for errand agents; exported from
  prompts/errand.ts and re-exported from prompts/index.ts.

Also fixes a pre-existing TypeScript error in lifecycle/controller.test.ts (missing
backoffMs property in RetryPolicy mock introduced by a concurrent agent commit).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 14:05:26 +01:00
Lukas May
b419981924 perf: Speed up conflict resolution agents by trimming prompt bloat
Replace SESSION_STARTUP (full test suite run) and CONTEXT_MANAGEMENT
(progress file refs) with a minimal startup block (pwd, git status,
CLAUDE.md). Add skipPromptExtras option to SpawnAgentOptions to skip
inter-agent communication and preview deployment instructions. Conflict
agents now go straight to the resolution protocol — one post-resolution
test run instead of two.
2026-03-06 14:05:23 +01:00
Lukas May
a0574a1ae9 feat: Add subagent usage guidance to refine and plan prompts
Instruct architect agents to leverage subagents for parallel work —
page analysis, codebase verification, dependency mapping, and pattern
discovery — instead of doing everything sequentially.
2026-03-06 13:39:19 +01:00
Lukas May
1629285778 Merge branch 'refs/heads/main' into cw/agent-details-conflict-1772799979862
# Conflicts:
#	apps/server/drizzle/meta/_journal.json
2026-03-06 13:34:28 +01:00