Codewalkers/docs/database-migrations.md
Lukas May db2196f1d1 feat: add backfill-metrics script and cw backfill-metrics CLI command
Populates the agent_metrics table from existing agent_log_chunks data after
the schema migration. Reads chunks in batches of 500, accumulates per-agent
counts in memory, then upserts with additive ON CONFLICT DO UPDATE to match
the ongoing insertChunk write-path behavior.

- apps/server/scripts/backfill-metrics.ts: core backfillMetrics(db) + CLI wrapper backfillMetricsFromPath(dbPath)
- apps/server/scripts/backfill-metrics.test.ts: 8 tests covering all chunk types, malformed JSON, isolation, empty DB, and re-run double-count behavior
- apps/server/cli/index.ts: new top-level `cw backfill-metrics [--db <path>]` command
- docs/database-migrations.md: Post-migration backfill scripts section documenting when and how to run the script

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-06 21:36:08 +01:00


Database Migrations

This project uses drizzle-kit for database schema management and migrations.

Overview

  • Schema definition: apps/server/db/schema.ts (drizzle-orm table definitions)
  • Migration output: apps/server/drizzle/ directory (SQL files + meta/_journal.json + meta/NNNN_snapshot.json)
  • Config: drizzle.config.ts
  • Runtime migrator: apps/server/db/ensure-schema.ts (calls drizzle-orm/better-sqlite3/migrator)

How It Works

On every server startup, ensureSchema(db) runs all pending migrations from the apps/server/drizzle/ folder. Drizzle tracks applied migrations in a __drizzle_migrations table so only new migrations are applied. This is safe to call repeatedly.

The migrator discovers migrations via apps/server/drizzle/meta/_journal.json, not by scanning the filesystem.

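Concretely, the journal is a small manifest of applied migrations. The shape below is illustrative only; exact fields vary by drizzle-kit version, so treat this as a sketch rather than the canonical schema:

```json
{
  "version": "7",
  "dialect": "sqlite",
  "entries": [
    { "idx": 0, "version": "6", "when": 1712345678901, "tag": "0000_initial", "breakpoints": true }
  ]
}
```

A migration is only picked up if it has a journal entry, which is why hand-written migrations must update the journal alongside the SQL file.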
Workflow

Making schema changes

  1. Edit apps/server/db/schema.ts with your table/column changes
  2. Generate a migration:
    npx drizzle-kit generate
    
  3. Review the generated SQL in apps/server/drizzle/NNNN_*.sql
  4. Verify multi-statement migrations use --> statement-breakpoint between statements (required by better-sqlite3 which only allows one statement per prepare() call)
  5. Commit the migration file, snapshot, and journal update together

Important: statement breakpoints

better-sqlite3 rejects SQL with multiple statements in a single prepare() call. Drizzle-kit splits on --> statement-breakpoint. If you hand-write or edit a migration with multiple statements, append --> statement-breakpoint to the end of each statement line (before the next statement):

```sql
ALTER TABLE foo ADD COLUMN bar TEXT;--> statement-breakpoint
CREATE INDEX foo_bar_idx ON foo(bar);
```
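To see why the marker matters, here is a minimal sketch of the splitting step. This is a hypothetical helper for illustration, not drizzle's actual internals, but it shows how one file becomes several single-statement prepare() calls:

```typescript
// Split a migration file on drizzle's breakpoint marker so each piece
// can be handed to better-sqlite3's prepare() one statement at a time.
function splitMigration(sql: string): string[] {
  return sql
    .split("--> statement-breakpoint")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}

const migration = `ALTER TABLE foo ADD COLUMN bar TEXT;--> statement-breakpoint
CREATE INDEX foo_bar_idx ON foo(bar);`;

console.log(splitMigration(migration).length); // 2 statements
```

Without the marker, the whole file would reach prepare() as one string and better-sqlite3 would reject it.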

Applying migrations

Migrations are applied automatically on server startup. No manual step needed.

For tests, the same ensureSchema() function is called on in-memory SQLite databases in apps/server/db/repositories/drizzle/test-helpers.ts.

History

Migrations 0000–0007 were generated by drizzle-kit generate. Migrations 0008–0032 were hand-written (during that period the snapshots fell out of sync). A schema-derived snapshot was restored at 0032, so drizzle-kit generate works normally from that point forward.

Rules

  • Use drizzle-kit generate for new migrations. It reads schema.ts, diffs against the last snapshot, and generates both SQL + snapshot automatically.
  • Never use raw CREATE TABLE statements for schema initialization. The migration system handles this.
  • Always commit migration files. They are the source of truth for database evolution.
  • Migration files are immutable. Once committed, never edit them. Make a new migration instead.
  • Keep schema.ts in sync. The schema file is the source of truth for TypeScript types; migrations are the source of truth for database DDL. Both must reflect the same structure.
  • Test with npm test after generating migrations to verify they work with in-memory databases.

Post-migration backfill scripts

Some schema additions require a one-time data backfill because SQLite migrations cannot execute Node.js logic (e.g., JSON parsing). In these cases, the migration creates the table structure, and a separate Node.js script populates it from existing data.

agent_metrics backfill

When to run: After deploying the migration that creates the agent_metrics table (introduced in the Radar Screen Performance initiative). Run this once per production database after upgrading.

Command:

```shell
cw backfill-metrics
# Or with a custom DB path:
cw backfill-metrics --db /path/to/codewalkers.db
```

What it does:

  • Reads all existing agent_log_chunks rows in batches of 500 (ordered by createdAt ASC)
  • Parses each chunk's content JSON to count AskUserQuestion tool calls, Agent spawns, and compaction events
  • Upserts the accumulated counts into agent_metrics using additive conflict resolution
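The per-chunk counting step can be sketched as pure logic. The event shape and field names below are assumptions for illustration; the script's actual JSON structure may differ:

```typescript
// Illustrative counter for one log chunk. Malformed JSON contributes
// nothing, mirroring the behavior exercised by the script's tests.
type ChunkCounts = { questions: number; spawns: number; compactions: number };

function countChunk(contentJson: string): ChunkCounts {
  const counts: ChunkCounts = { questions: 0, spawns: 0, compactions: 0 };
  let events: Array<{ type?: string; name?: string }>;
  try {
    events = JSON.parse(contentJson);
  } catch {
    return counts; // skip unparseable chunks rather than aborting the backfill
  }
  for (const ev of events) {
    if (ev.type === "tool_call" && ev.name === "AskUserQuestion") counts.questions++;
    else if (ev.type === "tool_call" && ev.name === "Agent") counts.spawns++;
    else if (ev.type === "compaction") counts.compactions++;
  }
  return counts;
}
```

Per-agent totals are accumulated in memory across batches before any writes happen, so each agent gets a single upsert regardless of how many chunks it has.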

Idempotency: The script uses ON CONFLICT DO UPDATE with additive increments, matching the ongoing write-path behavior. Running it against an empty agent_metrics table is fully safe. Running it a second time will double-count — only run it once per database, immediately after applying the migration.
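The double-count caveat follows directly from the additive conflict resolution. A minimal sketch of the merge semantics (identifiers are hypothetical, and a Map stands in for the agent_metrics table):

```typescript
// Additive upsert: on conflict the counts are added, not replaced.
// Running the backfill twice therefore adds every chunk's counts twice.
type Metrics = Map<string, number>;

function upsertAdditive(table: Metrics, agentId: string, delta: number): void {
  table.set(agentId, (table.get(agentId) ?? 0) + delta);
}

const table: Metrics = new Map();
upsertAdditive(table, "agent-1", 3); // first backfill run
upsertAdditive(table, "agent-1", 3); // accidental second run
console.log(table.get("agent-1")); // 6 — double-counted
```

Replace-style resolution would make re-runs safe but would clobber counts written by the live insertChunk path in the meantime, which is why the script matches the additive write-path behavior instead.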

Batch size: 500 rows per query, to avoid loading the full agent_log_chunks table into memory. Progress is logged every 1,000 chunks.