# Database Migrations
This project uses [drizzle-kit](https://orm.drizzle.team/kit-docs/overview) for database schema management and migrations.
## Overview
- **Schema definition:** `apps/server/db/schema.ts` (drizzle-orm table definitions)
- **Migration output:** `apps/server/drizzle/` directory (SQL files + `meta/_journal.json` + `meta/NNNN_snapshot.json`)
- **Config:** `drizzle.config.ts`
- **Runtime migrator:** `apps/server/db/ensure-schema.ts` (calls `drizzle-orm/better-sqlite3/migrator`)
## How It Works
On every server startup, `ensureSchema(db)` runs all pending migrations from the `apps/server/drizzle/` folder. Drizzle tracks applied migrations in a `__drizzle_migrations` table so only new migrations are applied. This is safe to call repeatedly.
The migrator discovers migrations via `apps/server/drizzle/meta/_journal.json`, **not** by scanning the filesystem.
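For illustration, a journal entry looks roughly like this (values here are hypothetical; the exact shape is owned by drizzle-kit):
```json
{
  "version": "7",
  "dialect": "sqlite",
  "entries": [
    {
      "idx": 0,
      "version": "6",
      "when": 1717171717171,
      "tag": "0000_initial",
      "breakpoints": true
    }
  ]
}
```
Each entry's `tag` must match an `NNNN_*.sql` file in the folder, which is why the SQL file, snapshot, and journal update are committed together.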
## Workflow
### Making schema changes
1. Edit `apps/server/db/schema.ts` with your table/column changes
2. Generate a migration:
```bash
npx drizzle-kit generate
```
3. Review the generated SQL in `apps/server/drizzle/NNNN_*.sql`
4. Verify multi-statement migrations use `--> statement-breakpoint` between statements (required by better-sqlite3 which only allows one statement per `prepare()` call)
5. Commit the migration file, snapshot, and journal update together
### Important: statement breakpoints
better-sqlite3 rejects SQL with multiple statements in a single `prepare()` call. Drizzle-kit splits on `--> statement-breakpoint`. If you hand-write or edit a migration with multiple statements, append `--> statement-breakpoint` to the end of each statement line (before the next statement):
```sql
ALTER TABLE foo ADD COLUMN bar TEXT;--> statement-breakpoint
CREATE INDEX foo_bar_idx ON foo(bar);
```
### Applying migrations
Migrations are applied automatically on server startup. No manual step needed.
For tests, the same `ensureSchema()` function is called on in-memory SQLite databases in `apps/server/db/repositories/drizzle/test-helpers.ts`.
## History
Migrations 0000 through 0007 were generated by `drizzle-kit generate`. Migrations 0008 through 0032 were hand-written (the snapshots had fallen out of sync). A schema-derived snapshot was restored at 0032, so `drizzle-kit generate` works normally from that point forward.
## Rules
- **Use `drizzle-kit generate`** for new migrations. It reads schema.ts, diffs against the last snapshot, and generates both SQL + snapshot automatically.
- **Never use raw CREATE TABLE statements** for schema initialization. The migration system handles this.
- **Always commit migration files.** They are the source of truth for database evolution.
- **Migration files are immutable.** Once committed, never edit them. Make a new migration instead.
- **Keep schema.ts in sync.** The schema file is the source of truth for TypeScript types; migrations are the source of truth for database DDL. Both must reflect the same structure.
- **Test with `npm test`** after generating migrations to verify they work with in-memory databases.
## Post-migration backfill scripts
Some schema additions require a one-time data backfill because SQLite migrations cannot execute Node.js logic (e.g., JSON parsing). In these cases, the migration creates the table structure, and a separate Node.js script populates it from existing data.
### agent_metrics backfill
**When to run:** After deploying the migration that creates the `agent_metrics` table (introduced in the Radar Screen Performance initiative). Run this once per production database after upgrading.
**Command:**
```sh
cw backfill-metrics
# Or with a custom DB path:
cw backfill-metrics --db /path/to/codewalkers.db
```
**What it does:**
- Reads all existing `agent_log_chunks` rows in batches of 500 (ordered by `createdAt ASC`)
- Parses each chunk's `content` JSON to count `AskUserQuestion` tool calls, `Agent` spawns, and compaction events
- Upserts the accumulated counts into `agent_metrics` using additive conflict resolution
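The upsert has roughly this shape (column names here are assumptions for illustration; the real schema lives in `apps/server/db/schema.ts`):
```sql
INSERT INTO agent_metrics (agent_id, ask_user_question_count, agent_spawn_count, compaction_count)
VALUES (?, ?, ?, ?)
ON CONFLICT (agent_id) DO UPDATE SET
  ask_user_question_count = ask_user_question_count + excluded.ask_user_question_count,
  agent_spawn_count       = agent_spawn_count + excluded.agent_spawn_count,
  compaction_count        = compaction_count + excluded.compaction_count;
```
The additive `+ excluded.*` form is what makes a second run double-count rather than overwrite.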
**Idempotency:** The script uses `ON CONFLICT DO UPDATE` with additive increments, matching the ongoing write-path behavior. Running it against an empty `agent_metrics` table is fully safe. Running it a second time will double-count — only run it once per database, immediately after applying the migration.
**Batch size:** 500 rows per query, to avoid loading the full `agent_log_chunks` table into memory. Progress is logged every 1,000 chunks.
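The batching loop can be sketched as follows (`fetchBatch` is a hypothetical stand-in for the real paged query against `agent_log_chunks`):

```typescript
type LogChunk = { id: number; createdAt: number; content: string };

// Hypothetical sketch: pull rows in fixed-size pages so the whole table
// never has to fit in memory; a short page signals the end of the data.
export function* readInBatches(
  fetchBatch: (offset: number, limit: number) => LogChunk[],
  batchSize = 500,
): Generator<LogChunk> {
  for (let offset = 0; ; offset += batchSize) {
    const rows = fetchBatch(offset, batchSize);
    for (const row of rows) yield row;
    if (rows.length < batchSize) return;
  }
}
```

Because the query is ordered by `createdAt ASC`, each page picks up exactly where the previous one left off.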