# Database Migrations
This project uses [drizzle-kit](https://orm.drizzle.team/kit-docs/overview) for database schema management and migrations.
## Overview
- **Schema definition:** `apps/server/db/schema.ts` (drizzle-orm table definitions)
- **Migration output:** `apps/server/drizzle/` directory (SQL files + `meta/_journal.json` + `meta/NNNN_snapshot.json`)
- **Config:** `drizzle.config.ts`
- **Runtime migrator:** `apps/server/db/ensure-schema.ts` (calls `drizzle-orm/better-sqlite3/migrator`)
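
Based on the paths above, the config would look roughly like the following sketch (the project's actual `drizzle.config.ts` may set additional options):

```typescript
// Sketch of drizzle.config.ts for the layout described above.
// Paths are taken from the bullets in this section; other options
// in the real config may differ.
import { defineConfig } from "drizzle-kit";

export default defineConfig({
  dialect: "sqlite",
  schema: "./apps/server/db/schema.ts", // drizzle-orm table definitions
  out: "./apps/server/drizzle",         // SQL files + meta/ snapshots
});
```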
## How It Works
On every server startup, `ensureSchema(db)` runs all pending migrations from the `apps/server/drizzle/` folder. Drizzle tracks applied migrations in a `__drizzle_migrations` table so only new migrations are applied. This is safe to call repeatedly.
The migrator discovers migrations via `apps/server/drizzle/meta/_journal.json` — **not** by scanning the filesystem.
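
For illustration, a journal entry looks roughly like this (field names follow drizzle-kit's generated output; the values here are made up):

```json
{
  "version": "7",
  "dialect": "sqlite",
  "entries": [
    {
      "idx": 0,
      "version": "6",
      "when": 1700000000000,
      "tag": "0000_initial",
      "breakpoints": true
    }
  ]
}
```

This is why a hand-copied `.sql` file with no matching journal entry is silently ignored by the migrator.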
## Workflow
### Making schema changes
1. Edit `apps/server/db/schema.ts` with your table/column changes
2. Generate a migration:
```bash
npx drizzle-kit generate
```
3. Review the generated SQL in `apps/server/drizzle/NNNN_*.sql`
4. Verify multi-statement migrations use `--> statement-breakpoint` between statements (required by better-sqlite3, which allows only one statement per `prepare()` call)
5. Commit the migration file, snapshot, and journal update together
### Important: statement breakpoints
better-sqlite3 rejects SQL with multiple statements in a single `prepare()` call. Drizzle-kit splits on `--> statement-breakpoint`. If you hand-write or edit a migration with multiple statements, append `--> statement-breakpoint` to the end of each statement line (before the next statement):
```sql
ALTER TABLE foo ADD COLUMN bar TEXT;--> statement-breakpoint
CREATE INDEX foo_bar_idx ON foo(bar);
```
### Applying migrations
Migrations are applied automatically on server startup. No manual step needed.
For tests, the same `ensureSchema()` function is called on in-memory SQLite databases in `apps/server/db/repositories/drizzle/test-helpers.ts`.
## History
Migrations 0000–0007 were generated by `drizzle-kit generate`. Migrations 0008–0032 were hand-written (the snapshots fell out of sync). A schema-derived snapshot was restored at 0032, so `drizzle-kit generate` works normally from that point forward.
## Rules
- **Use `drizzle-kit generate`** for new migrations. It reads schema.ts, diffs against the last snapshot, and generates both SQL + snapshot automatically.
- **Never use raw CREATE TABLE statements** for schema initialization. The migration system handles this.
- **Always commit migration files.** They are the source of truth for database evolution.
- **Migration files are immutable.** Once committed, never edit them. Make a new migration instead.
- **Keep schema.ts in sync.** The schema file is the source of truth for TypeScript types; migrations are the source of truth for database DDL. Both must reflect the same structure.
- **Test with `npm test`** after generating migrations to verify they work with in-memory databases.
## Post-migration backfill scripts
Some schema additions require a one-time data backfill because SQLite migrations cannot execute Node.js logic (e.g., JSON parsing). In these cases, the migration creates the table structure, and a separate Node.js script populates it from existing data.
### agent_metrics backfill
**When to run:** After deploying the migration that creates the `agent_metrics` table (introduced in the Radar Screen Performance initiative). Run this once per production database after upgrading.
**Command:**
```sh
cw backfill-metrics
# Or with a custom DB path:
cw backfill-metrics --db /path/to/codewalkers.db
```
**What it does:**
- Reads all existing `agent_log_chunks` rows in batches of 500 (ordered by `createdAt ASC`)
- Parses each chunk's `content` JSON to count `AskUserQuestion` tool calls, `Agent` spawns, and compaction events
- Upserts the accumulated counts into `agent_metrics` using additive conflict resolution
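
The per-agent accumulation step can be sketched as follows. The chunk `content` shape (a `type` field plus an optional tool `name`) is an assumption for illustration; see `backfill-metrics.ts` for the real parsing logic.

```typescript
// Sketch of the in-memory accumulation over parsed chunks.
// The event shape below is hypothetical, not the actual chunk schema.
type Counts = { questions: number; spawns: number; compactions: number };

function accumulate(
  totals: Map<string, Counts>,
  agentId: string,
  rawContent: string,
): void {
  let event: { type?: string; name?: string };
  try {
    event = JSON.parse(rawContent);
  } catch {
    return; // malformed JSON chunks are skipped, not fatal
  }
  const c = totals.get(agentId) ?? { questions: 0, spawns: 0, compactions: 0 };
  if (event.type === "tool_call" && event.name === "AskUserQuestion") c.questions++;
  if (event.type === "tool_call" && event.name === "Agent") c.spawns++;
  if (event.type === "compaction") c.compactions++;
  totals.set(agentId, c);
}
```

Accumulating in memory and upserting once per agent keeps the write count proportional to the number of agents rather than the number of chunks.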
**Idempotency:** The script uses `ON CONFLICT DO UPDATE` with additive increments, matching the ongoing write-path behavior. Running it against an empty `agent_metrics` table is fully safe. Running it a second time will double-count — only run it once per database, immediately after applying the migration.
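
The additive upsert pattern referred to here looks roughly like this (column and key names are illustrative, not the actual schema):

```sql
INSERT INTO agent_metrics (agent_id, question_count)
VALUES (?, ?)
ON CONFLICT(agent_id) DO UPDATE
  SET question_count = question_count + excluded.question_count;
```

Because the update *adds* `excluded.question_count` rather than replacing the stored value, re-running the script re-adds the same counts — hence the double-count warning above.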
**Batch size:** 500 rows per query, to avoid loading the full `agent_log_chunks` table into memory. Progress is logged every 1,000 chunks.
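
The window arithmetic behind the batching can be sketched as below. It is shown over an in-memory array purely to illustrate the slicing; the real script instead issues one SQL query per window (`LIMIT 500` with an advancing offset) so the full table never resides in memory.

```typescript
// Sketch of 500-row batch windows. In the actual script each window
// corresponds to one paged query against agent_log_chunks, ordered by
// createdAt ASC.
const BATCH_SIZE = 500;

function* batches<T>(rows: T[], size: number = BATCH_SIZE): Generator<T[]> {
  for (let offset = 0; offset < rows.length; offset += size) {
    yield rows.slice(offset, offset + size);
  }
}
```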