# Database Migrations
This project uses [drizzle-kit](https://orm.drizzle.team/kit-docs/overview) for database schema management and migrations.
## Overview
- **Schema definition:** `apps/server/db/schema.ts` (drizzle-orm table definitions)
- **Migration output:** `apps/server/drizzle/` directory (SQL files + `meta/_journal.json` + `meta/NNNN_snapshot.json`)
- **Config:** `drizzle.config.ts`
- **Runtime migrator:** `apps/server/db/ensure-schema.ts` (calls `drizzle-orm/better-sqlite3/migrator`)
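
Based on the paths above, the config would look roughly like the following sketch (the project's actual `drizzle.config.ts` may set additional options):

```typescript
// Sketch of drizzle.config.ts for the layout described above.
// Paths are taken from the bullets in this section; other options
// in the real config may differ.
import { defineConfig } from "drizzle-kit";

export default defineConfig({
  dialect: "sqlite",
  schema: "./apps/server/db/schema.ts", // drizzle-orm table definitions
  out: "./apps/server/drizzle",         // SQL files + meta/ snapshots
});
```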
## How It Works
On every server startup, `ensureSchema(db)` runs all pending migrations from the `apps/server/drizzle/` folder. Drizzle tracks applied migrations in a `__drizzle_migrations` table so only new migrations are applied. This is safe to call repeatedly.
The migrator discovers migrations via `apps/server/drizzle/meta/_journal.json` — **not** by scanning the filesystem.
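
For illustration, a journal entry looks roughly like this (field names follow drizzle-kit's generated output; the values here are made up):

```json
{
  "version": "7",
  "dialect": "sqlite",
  "entries": [
    {
      "idx": 0,
      "version": "6",
      "when": 1700000000000,
      "tag": "0000_initial",
      "breakpoints": true
    }
  ]
}
```

This is why a hand-copied `.sql` file with no matching journal entry is silently ignored by the migrator.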
## Workflow
### Making schema changes
1. Edit `apps/server/db/schema.ts` with your table/column changes
2. Generate a migration:
```bash
npx drizzle-kit generate
```
3. Review the generated SQL in `apps/server/drizzle/NNNN_*.sql`
4. Verify multi-statement migrations use `--> statement-breakpoint` between statements (required by better-sqlite3, which allows only one statement per `prepare()` call)
5. Commit the migration file, snapshot, and journal update together
### Important: statement breakpoints
better-sqlite3 rejects SQL with multiple statements in a single `prepare()` call. Drizzle-kit splits on `--> statement-breakpoint`. If you hand-write or edit a migration with multiple statements, append `--> statement-breakpoint` to the end of each statement line (before the next statement):
```sql
ALTER TABLE foo ADD COLUMN bar TEXT;--> statement-breakpoint
CREATE INDEX foo_bar_idx ON foo(bar);
```
### Applying migrations
Migrations are applied automatically on server startup. No manual step needed.
For tests, the same `ensureSchema()` function is called on in-memory SQLite databases in `apps/server/db/repositories/drizzle/test-helpers.ts`.
## History
Migrations 0000–0007 were generated by `drizzle-kit generate`. Migrations 0008–0032 were hand-written (the snapshots fell out of sync). A schema-derived snapshot was restored at 0032, so `drizzle-kit generate` works normally from that point forward.
## Rules
- **Use `drizzle-kit generate`** for new migrations. It reads schema.ts, diffs against the last snapshot, and generates both SQL + snapshot automatically.
- **Never use raw CREATE TABLE statements** for schema initialization. The migration system handles this.
- **Always commit migration files.** They are the source of truth for database evolution.
- **Migration files are immutable.** Once committed, never edit them. Make a new migration instead.
- **Keep schema.ts in sync.** The schema file is the source of truth for TypeScript types; migrations are the source of truth for database DDL. Both must reflect the same structure.
- **Test with `npm test`** after generating migrations to verify they work with in-memory databases.
## Post-migration backfill scripts
Some schema additions require a one-time data backfill because SQLite migrations cannot execute Node.js logic (e.g., JSON parsing). In these cases, the migration creates the table structure, and a separate Node.js script populates it from existing data.
### agent_metrics backfill
**When to run:** After deploying the migration that creates the `agent_metrics` table (introduced in the Radar Screen Performance initiative). Run this once per production database after upgrading.
**Command:**
```sh
cw backfill-metrics
# Or with a custom DB path:
cw backfill-metrics --db /path/to/codewalkers.db
```
**What it does:**
- Reads all existing `agent_log_chunks` rows in batches of 500 (ordered by `createdAt ASC`)
- Parses each chunk's `content` JSON to count `AskUserQuestion` tool calls, `Agent` spawns, and compaction events
- Upserts the accumulated counts into `agent_metrics` using additive conflict resolution
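
The per-agent accumulation step can be sketched as follows. The chunk `content` shape (a `type` field plus an optional tool `name`) is an assumption for illustration; see `backfill-metrics.ts` for the real parsing logic.

```typescript
// Sketch of the in-memory accumulation over parsed chunks.
// The event shape below is hypothetical, not the actual chunk schema.
type Counts = { questions: number; spawns: number; compactions: number };

function accumulate(
  totals: Map<string, Counts>,
  agentId: string,
  rawContent: string,
): void {
  let event: { type?: string; name?: string };
  try {
    event = JSON.parse(rawContent);
  } catch {
    return; // malformed JSON chunks are skipped, not fatal
  }
  const c = totals.get(agentId) ?? { questions: 0, spawns: 0, compactions: 0 };
  if (event.type === "tool_call" && event.name === "AskUserQuestion") c.questions++;
  if (event.type === "tool_call" && event.name === "Agent") c.spawns++;
  if (event.type === "compaction") c.compactions++;
  totals.set(agentId, c);
}
```

Accumulating in memory and upserting once per agent keeps the write count proportional to the number of agents rather than the number of chunks.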
**Idempotency:** The script uses `ON CONFLICT DO UPDATE` with additive increments, matching the ongoing write-path behavior. Running it against an empty `agent_metrics` table is fully safe. Running it a second time will double-count — only run it once per database, immediately after applying the migration.
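
The additive upsert pattern referred to here looks roughly like this (column and key names are illustrative, not the actual schema):

```sql
INSERT INTO agent_metrics (agent_id, question_count)
VALUES (?, ?)
ON CONFLICT(agent_id) DO UPDATE
  SET question_count = question_count + excluded.question_count;
```

Because the update *adds* `excluded.question_count` rather than replacing the stored value, re-running the script re-adds the same counts — hence the double-count warning above.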
**Batch size:** 500 rows per query, to avoid loading the full `agent_log_chunks` table into memory. Progress is logged every 1,000 chunks.
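
The window arithmetic behind the batching can be sketched as below. It is shown over an in-memory array purely to illustrate the slicing; the real script instead issues one SQL query per window (`LIMIT 500` with an advancing offset) so the full table never resides in memory.

```typescript
// Sketch of 500-row batch windows. In the actual script each window
// corresponds to one paged query against agent_log_chunks, ordered by
// createdAt ASC.
const BATCH_SIZE = 500;

function* batches<T>(rows: T[], size: number = BATCH_SIZE): Generator<T[]> {
  for (let offset = 0; offset < rows.length; offset += size) {
    yield rows.slice(offset, offset + size);
  }
}
```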