feat: add backfill-metrics script and cw backfill-metrics CLI command

Populates the agent_metrics table from existing agent_log_chunks data after
the schema migration. Reads chunks in batches of 500, accumulates per-agent
counts in memory, then upserts with additive ON CONFLICT DO UPDATE to match
the ongoing insertChunk write-path behavior.

- apps/server/scripts/backfill-metrics.ts: core backfillMetrics(db) + CLI wrapper backfillMetricsFromPath(dbPath)
- apps/server/scripts/backfill-metrics.test.ts: 8 tests covering all chunk types, malformed JSON, isolation, empty DB, and re-run double-count behavior
- apps/server/cli/index.ts: new top-level `cw backfill-metrics [--db <path>]` command
- docs/database-migrations.md: Post-migration backfill scripts section documenting when and how to run the script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Author: Lukas May
Date: 2026-03-06 21:36:08 +01:00
Parent: 6eb1f8fc2a
Commit: db2196f1d1
4 changed files with 301 additions and 0 deletions

@@ -55,3 +55,27 @@ Migrations 00000007 were generated by `drizzle-kit generate`. Migrations 0008
- **Migration files are immutable.** Once committed, never edit them. Make a new migration instead.
- **Keep schema.ts in sync.** The schema file is the source of truth for TypeScript types; migrations are the source of truth for database DDL. Both must reflect the same structure.
- **Test with `npm test`** after generating migrations to verify they work with in-memory databases.
## Post-migration backfill scripts
Some schema additions require a one-time data backfill because SQLite migrations cannot execute Node.js logic (e.g., JSON parsing). In these cases, the migration creates the table structure, and a separate Node.js script populates it from existing data.
### agent_metrics backfill
**When to run:** After deploying the migration that creates the `agent_metrics` table (introduced in the Radar Screen Performance initiative). Run this once per production database after upgrading.
**Command:**
```sh
cw backfill-metrics
# Or with a custom DB path:
cw backfill-metrics --db /path/to/codewalkers.db
```
**What it does:**
- Reads all existing `agent_log_chunks` rows in batches of 500 (ordered by `createdAt ASC`)
- Parses each chunk's `content` JSON to count `AskUserQuestion` tool calls, `Agent` spawns, and compaction events
- Upserts the accumulated counts into `agent_metrics` using additive conflict resolution
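The parse-and-count step described above might be sketched roughly as follows. The chunk JSON shapes, field names, and the `MetricCounts` type are illustrative assumptions, not the actual `agent_log_chunks` schema:

```typescript
// Hypothetical per-chunk counting step (field names are assumptions).
interface MetricCounts {
  askUserQuestions: number;
  agentSpawns: number;
  compactions: number;
}

function countChunk(content: string, acc: MetricCounts): MetricCounts {
  let parsed: any;
  try {
    parsed = JSON.parse(content);
  } catch {
    // Malformed JSON is skipped rather than aborting the backfill.
    return acc;
  }
  if (parsed?.type === "tool_call" && parsed.name === "AskUserQuestion") {
    acc.askUserQuestions += 1;
  } else if (parsed?.type === "tool_call" && parsed.name === "Agent") {
    acc.agentSpawns += 1;
  } else if (parsed?.type === "compaction") {
    acc.compactions += 1;
  }
  return acc;
}
```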
**Idempotency:** The script is *not* idempotent. It uses `ON CONFLICT DO UPDATE` with additive increments, matching the ongoing write-path behavior: running it against an empty `agent_metrics` table is safe, but a second run will double-count every metric. Run it exactly once per database, immediately after applying the migration.
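In raw SQLite terms, additive conflict resolution looks roughly like the statement below. Table and column names here are assumptions for illustration; the actual schema lives in `schema.ts`:

```sql
INSERT INTO agent_metrics (agent_id, ask_user_questions, agent_spawns, compactions)
VALUES (?, ?, ?, ?)
ON CONFLICT (agent_id) DO UPDATE SET
  ask_user_questions = ask_user_questions + excluded.ask_user_questions,
  agent_spawns       = agent_spawns + excluded.agent_spawns,
  compactions        = compactions + excluded.compactions;
```

Because the `DO UPDATE` clause adds the incoming counts to the stored counts instead of replacing them, re-running the backfill applies the same increments again, which is why a second run double-counts.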
**Batch size:** 500 rows per query, so the full `agent_log_chunks` table is never loaded into memory at once. Progress is logged every 1,000 processed chunks.
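The batched read loop can be sketched as below. The `fetchBatch` accessor and `Chunk` type are hypothetical stand-ins for the script's actual database query, shown only to illustrate the pagination and progress-logging pattern:

```typescript
// Hypothetical chunk row shape (the real schema has more columns).
interface Chunk {
  id: number;
  content: string;
}

// Walk agent_log_chunks in fixed-size batches; fetchBatch is a stand-in
// for a LIMIT/OFFSET query ordered by createdAt ASC.
function forEachChunk(
  fetchBatch: (offset: number, limit: number) => Chunk[],
  handle: (chunk: Chunk) => void,
  batchSize = 500,
): number {
  let offset = 0;
  let total = 0;
  for (;;) {
    const batch = fetchBatch(offset, batchSize);
    if (batch.length === 0) break; // no more rows
    for (const chunk of batch) {
      handle(chunk);
      total += 1;
      if (total % 1000 === 0) console.log(`processed ${total} chunks`);
    }
    offset += batch.length;
  }
  return total;
}
```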