Merge two useEffects in AgentOutputViewer into one to fix race where
agentId reset clears messages after data effect sets them on remount.
Add "commit before signaling" instruction to errand prompts so
Changes tab shows diff after completion.
- Register errandProcedures in appRouter (was defined but never spread)
- Fix nullable projectId guard in errand delete/abandon procedures
- Add sendUserMessage stub to MockAgentManager in headquarters and
radar-procedures tests (AgentManager interface gained this method)
- Add missing qualityReview field to Initiative fixture in file-io test
(schema gained this column from the quality-review phase)
- Cast conflictFiles access in CLI errand resolve command
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the fourth test case from the spec: when shouldRunQualityReview
throws, the orchestrator must not crash, must log a warning (verified
implicitly by the catch block), and must still call scheduleDispatch()
so dispatch continuity is maintained.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add errand.requestChanges procedure that re-spawns an agent in the
existing worktree with user feedback. Replace raw <pre> diff blocks
with syntax-highlighted ErrandDiffView using FileCard components.
Add Output/Changes tabs to the active errand view.
When an agent stops, check whether a quality review should run before
auto-completing the task. If shouldRunQualityReview returns run:true,
delegate to runQualityReview (which transitions task to quality_review
and spawns a review agent) instead of calling completeTask directly.
Falls back to completeTask when agentRepository or agentManager are
not injected, or when the task lacks phaseId/initiativeId context.
- Add agentManager optional param to ExecutionOrchestrator constructor
- Extract tryQualityReview() private method to compute branch names and
repo path before delegating to the quality-review service
- Pass agentManager to ExecutionOrchestrator in container.ts
- Add orchestrator integration tests for the agent:stopped quality hook
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When an execute-mode agent stops with task_complete and the initiative has
qualityReview=true, the orchestrator now spawns a fresh execute-mode agent
to run /simplify on changed .ts/.tsx/.js files before marking the task
completed. The task transitions through quality_review status as a recursion
guard so the review agent's stop event is handled normally.
- Add apps/server/execution/quality-review.ts with three exported functions:
computeQualifyingFiles, shouldRunQualityReview, runQualityReview
- Add apps/server/execution/quality-review.test.ts (28 tests)
- Update ExecutionOrchestrator to accept agentManager, replace
handleAgentStopped with quality-review-aware logic, add getRepoPathForTask
- Update orchestrator.test.ts with 3 quality-review integration tests
- Update container.ts to pass agentManager to ExecutionOrchestrator
- Update docs/dispatch-events.md to reflect new agent:stopped behavior
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds apps/server/execution/quality-review.ts with three exported functions:
- computeQualifyingFiles: diffs task branch vs base, filters out *.gen.ts and dist/ paths
- shouldRunQualityReview: evaluates all six guard conditions (task_complete, execute mode,
in_progress status, initiative membership, qualityReview flag, non-empty changeset)
and returns { run, qualifyingFiles } to avoid recomputing the diff in the orchestrator
- runQualityReview: transitions task to quality_review, spawns execute-mode review agent
on the task branch, logs the review agent ID, and falls back to completed on spawn failure
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds qualityReview: z.boolean().optional() to the updateInitiativeConfig
input schema so the field passes through to the repository layer.
Includes integration tests verifying set-true, set-false, and
omit-preserves-existing round-trip behavior.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds two new fields to the database and propagates them through the
repository layer:
- Task status enum gains 'quality_review' (between in_progress and
completed), enabling a QA gate before tasks are marked complete.
- initiatives.quality_review (INTEGER DEFAULT 0) lets an initiative be
flagged for quality-review workflow without a data migration (existing
rows default to false).
Includes:
- Schema changes in schema.ts
- Migration 0037 (ALTER TABLE initiatives ADD quality_review)
- Snapshot chain repaired: deleted stale 0036 snapshot, fixed 0035
prevId to create a linear chain (0032 → 0034 → 0035), then generated
clean 0037 snapshot
- Repository adapter already uses SELECT * / spread-update pattern so
no adapter code changes were needed
- Initiative and task repository tests extended with qualityReview /
quality_review_status describe blocks (7 new tests)
- docs/database.md updated
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Three fixes for phases getting stuck when a detail task crashes and is retried:
1. detailPhase mutation (architect.ts): clean up orphaned pending/in_progress
detail tasks before creating new ones, preventing duplicates at the source
2. orchestrator recovery: detect and complete stale duplicate planning tasks
(same category+phase, one completed, one pending)
3. ensureBranch: catch "already exists" TOCTOU race instead of blocking phase
Populates the agent_metrics table from existing agent_log_chunks data after
the schema migration. Reads chunks in batches of 500, accumulates per-agent
counts in memory, then upserts with additive ON CONFLICT DO UPDATE to match
the ongoing insertChunk write-path behavior.
- apps/server/scripts/backfill-metrics.ts: core backfillMetrics(db) + CLI wrapper backfillMetricsFromPath(dbPath)
- apps/server/scripts/backfill-metrics.test.ts: 8 tests covering all chunk types, malformed JSON, isolation, empty DB, and re-run double-count behavior
- apps/server/cli/index.ts: new top-level `cw backfill-metrics [--db <path>]` command
- docs/database-migrations.md: Post-migration backfill scripts section documenting when and how to run the script
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
listForRadar previously called findByAgentIds() and JSON-parsed every chunk to
compute questionsCount, subagentsCount, and compactionsCount. Switch to
findMetricsByAgentIds() which reads the pre-computed agent_metrics table,
eliminating the chunk scan and per-row JSON.parse entirely.
Add two new test cases: agent with no metrics row returns zero counts, and
listForRadar response rows never carry chunk content.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap insertChunk in a synchronous better-sqlite3 transaction that upserts
agent_metrics counters atomically on every chunk insert. Malformed JSON
skips the upsert but always preserves the chunk row.
Add findMetricsByAgentIds to the interface and Drizzle adapter for
efficient bulk metric reads.
Add 8-test suite covering all write/read paths and edge cases.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
When an initiative has multiple end phases (leaf nodes with no
dependents), queueAllPhases now auto-creates an Integration phase
that depends on all of them. This catches cross-phase incompatibilities
(type mismatches, conflicting exports, broken tests) before review.
Adds the agentMetrics table to SQLite schema for storing pre-computed
per-agent event counts (questions, subagents, compactions), enabling
listForRadar to fetch one row per agent instead of scanning log chunks.
Also fixes pre-existing Drizzle snapshot chain collision in meta/
(0035/0036 snapshots had wrong prevId due to parallel agent branches)
to unblock drizzle-kit generate.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ErrandRepository was instantiated in the container but never passed
from TrpcAdapterOptions into createContext(), causing all errand
procedures to throw "Errand repository not available" at runtime.
Claude CLI occasionally hangs after writing signal.json but never exits.
Add an optional signal check to pollForCompletion: after a 60s grace
period, check signal.json every 30s. If a valid completion signal is
found while the process is still alive, SIGTERM it and proceed to
normal completion handling.
errandProcedures was defined but never imported/spread into the app
router, causing "No procedure found on path errand.create". Also fixed
nullable projectId TS errors in delete/abandon and added missing
sendUserMessage to test mocks.
- Remove original task blocking in handleConflict (task is already completed by handleAgentStopped)
- Return created conflict task from handleConflict so orchestrator can queue it for dispatch
- Add dedup check to prevent duplicate resolution tasks on crash retries
- Queue conflict resolution task via dispatchManager in mergeTaskIntoPhase
- Add recovery for erroneously blocked tasks in recoverDispatchQueues
- Update tests and docs
Resolves add/add conflict in diff-cache.ts (kept typed PhaseMetaResponse/
FileDiffResponse interfaces from HEAD over unknown-typed singletons from test
branch) and content conflict in phase.ts (kept both phaseMetaCache and
fileDiffCache imports; removed auto-merged duplicate firstClone/headHash/
cacheKey/cached declarations and unreachable empty-projects guard).
Also cleans auto-merged duplicate getHeadCommitHash in orchestrator.test.ts
and simple-git-branch-manager.ts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates diff-cache.ts module with generic DiffCache<T> class (TTL, prefix
invalidation, env-var configuration) and exports phaseMetaCache / fileDiffCache
singletons. Wires cache into getPhaseReviewDiff via getHeadCommitHash on
BranchManager. Adds 6 unit tests for DiffCache and 5 integration tests
verifying cache hit/miss behaviour, prefix invalidation, and NOT_FOUND guard.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds DiffCache<T> module, extends BranchManager with getHeadCommitHash,
and wires phase-level caching into getPhaseReviewDiff and getFileDiff.
Cache is invalidated in ExecutionOrchestrator after each task merges into
the phase branch, ensuring stale diffs are never served after new commits.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Rewrites getPhaseReviewDiff to return file-level metadata (path, status,
additions, deletions) instead of a raw diff string, eliminating 10MB+
payloads for large repos. Adds getFileDiff for on-demand per-file hunk
content with binary detection via numstat. Multi-project initiatives
prefix file paths with the project name to avoid collisions.
Adds integration tests that use real local git repos + in-memory SQLite
to verify both procedures end-to-end (binary files, deleted files,
spaces in paths, error cases).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds FileStatEntry type and two new primitives to the BranchManager
port and SimpleGitBranchManager adapter, enabling split diff
procedures in the tRPC layer without returning raw multi-megabyte diffs.
- FileStatEntry captures path, status, additions/deletions, oldPath
(renames), and optional projectId for multi-project routing
- diffBranchesStat uses --name-status + --numstat, detects binary
files (shown as - / - in numstat), handles spaces in filenames
- diffFileSingle returns raw unified diff for a single file path
Re-recorded all 4 cassette files to reflect current prompt templates.
Added `workdir/**` to vitest exclude list to prevent test discovery
in agent worktree directories.
Integrates main branch changes (headquarters dashboard, task retry count,
agent prompt persistence, remote sync improvements) with the initiative's
errand agent feature. Both features coexist in the merged result.
Key resolutions:
- Schema: take main's errands table (nullable projectId, no conflictFiles,
with errandsRelations); migrate to 0035_faulty_human_fly
- Router: keep both errandProcedures and headquartersProcedures
- Errand prompt: take main's simpler version (no question-asking flow)
- Manager: take main's status check (running|idle only, no waiting_for_input)
- Tests: update to match removed conflictFiles field and undefined vs null
Add five new tRPC query procedures powering the Radar page's per-agent
behavioral metrics (questions asked, subagent spawns, compaction events,
inter-agent messages) plus the batch repository methods they require.
Repository changes:
- LogChunkRepository: add findByAgentIds() for batch fetching without N+1
- ConversationRepository: add countByFromAgentIds() and findByFromAgentId()
- Drizzle adapters: implement all three new methods using inArray()
- InMemoryConversationRepository (integration test): implement new methods
tRPC procedures added:
- agent.listForRadar: filtered agent list with per-agent metrics computed
from log chunks (questionsCount, subagentsCount, compactionsCount) and
conversation counts (messagesCount); supports timeRange/status/mode/initiative filters
- agent.getCompactionEvents: compact system init chunks for one agent (cap 200)
- agent.getSubagentSpawns: Agent tool_use entries with prompt preview (cap 200)
- agent.getQuestionsAsked: AskUserQuestion tool calls with questions array (cap 200)
- conversation.getByFromAgent: conversations by fromAgentId with toAgentName resolved
All 13 new unit tests pass; existing test suite unaffected.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
getHeadquartersDashboard had no section for active conflict agents,
so initiatives with a running conflict-* agent disappeared from all
HQ sections. Add resolvingConflicts array to surface them.
Each SSE client registers a listener per event type (30+ types), so a
few concurrent connections easily exceed the previous limit of 100.
Listeners are properly cleaned up on disconnect via eventBusIterable's
finally block, so this is not a real leak.
- `errand.get` now returns `conflictFiles: string[]` (always an array,
never null) with defensive JSON.parse error handling
- `errand.get` returns `projectPath: string | null` computed from
workspaceRoot + getProjectCloneDir so cw errand resolve can locate
the worktree without a second tRPC call
- Add `apps/server/db/repositories/drizzle/errand.test.ts` covering
conflictFiles store/retrieve, null for non-conflict errands, and
findAll including conflictFiles
- Update `errand.test.ts` mock to include getProjectCloneDir and fix
conflictFiles expectation from null to []
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wires errand command group into CLI with start, list, chat, diff,
complete, merge, resolve, abandon, and delete subcommands. All
commands call tRPC procedures via createDefaultTrpcClient(). The
start command validates description length client-side (≤200 chars)
before making any network calls.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements the errand workflow for small isolated changes that spawn a
dedicated agent in a git worktree:
- errand.create: branch + worktree + DB record + agent spawn
- errand.list / errand.get / errand.diff: read procedures
- errand.complete: transitions active→pending_review, stops agent
- errand.merge: merges branch, handles conflicts with conflictFiles
- errand.delete / errand.abandon: cleanup worktree, branch, agent
- errand.sendMessage: delivers user message directly to running agent
Supporting changes:
- Add 'errand' to AgentMode union and agents.mode enum
- Add sendUserMessage() to AgentManager interface and MockAgentManager
- MockAgentManager now accepts optional agentRepository to persist agents
to the DB (required for FK constraint satisfaction in tests)
- Add ORDER BY createdAt DESC, id DESC to errand findAll
- Fix dispatch/manager.test.ts missing sendUserMessage mock
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Creates the errands table (with conflictFiles column), errand-repository
port interface, DrizzleErrandRepository adapter, and wires the repository
into TRPCContext, the DI container, _helpers.ts requireErrandRepository guard,
and the test harness. Also fixes pre-existing TS error in controller.test.ts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Aggregates all user-blocking action items into a single composite query:
- waitingForInput: agents paused on questions (oldest first)
- pendingReviewInitiatives: initiatives awaiting content review
- pendingReviewPhases: phases awaiting diff review
- planningInitiatives: active initiatives with all phases pending and no running agents
- blockedPhases: phases in blocked state with optional last-message snippet
Wired into appRouter and covered by 10 unit tests using in-memory Drizzle DB
and an inline MockAgentManager (no real processes required).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>