Production DB Recovery and Safety Runbook

Last updated: March 2, 2026

Purpose

Use this runbook when production data appears lost, reset, or unexpectedly empty.

Safety Rules (Do Not Bypass)

  1. Never run destructive Prisma commands in production:
  • prisma migrate reset
  • prisma migrate dev
  • prisma db push --accept-data-loss
  2. Never run docker compose down -v (or docker-compose down -v) against production.
  3. Always restore into an isolated database first.
  4. Always take a fresh backup before migration/cutover actions.
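
Rules 1 and 2 can be backed by a small shell guard on the production host. The following is a minimal sketch, not something that exists in the repo: the helper names is_destructive_cmd and safe_run are hypothetical, and the pattern list only covers the commands named above.

```shell
# Hypothetical guard: returns 0 (true) when the command line matches one of
# the destructive commands listed in the safety rules.
is_destructive_cmd() {
  local cmd="$*"
  case "$cmd" in
    *"prisma migrate reset"*|*"prisma migrate dev"*) return 0 ;;
    *"prisma db push"*"--accept-data-loss"*) return 0 ;;
    *"docker compose down"*" -v"*|*"docker-compose down"*" -v"*) return 0 ;;
    *"docker compose down"*" --volumes"*|*"docker-compose down"*" --volumes"*) return 0 ;;
  esac
  return 1
}

# Hypothetical wrapper: refuse destructive commands, run everything else.
safe_run() {
  if is_destructive_cmd "$@"; then
    echo "REFUSED (destructive in production): $*" >&2
    return 1
  fi
  "$@"
}

# Usage:
#   safe_run docker-compose exec -T api npx prisma migrate deploy
```

A guard like this only catches commands routed through the wrapper; it does not replace the rules themselves.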

1) Determine recoverability

1.1 Identify active Postgres container + volume

docker ps --format '{{.Names}}'
docker inspect <postgres-container> --format '{{json .Mounts}}'

Record:

  • container name
  • mounted volume name
  • data mount path
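
The volume name and mount path can also be extracted directly instead of reading the raw JSON. A minimal sketch, assuming the Postgres data lives on a named Docker volume (bind mounts are skipped); the function name list_container_volumes is hypothetical.

```shell
# Print "volume-name<TAB>mount-path" for each named volume of a container.
list_container_volumes() {
  local container="$1"
  docker inspect "$container" \
    --format '{{range .Mounts}}{{if eq .Type "volume"}}{{.Name}}{{"\t"}}{{.Destination}}{{"\n"}}{{end}}{{end}}' \
    | awk 'NF'   # drop blank lines left over from the template
}

# Usage:
#   list_container_volumes <postgres-container>
```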

1.2 Enumerate candidate old volumes

docker volume ls | grep -E 'pgdata|skymoney|postgres'

1.3 Inspect candidate volumes for PostgreSQL cluster files

for v in $(docker volume ls --format '{{.Name}}' | grep -E 'pgdata|skymoney|postgres'); do
  echo "== $v =="
  docker run --rm -v "${v}:/var/lib/postgresql/data:ro" alpine sh -lc \
    "ls -la /var/lib/postgresql/data | head -n 30"
done

Look for:

  • PG_VERSION
  • base/
  • global/
  • pg_wal/
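
The marker check can be scripted so each candidate volume gets a yes/no answer. A minimal sketch, assuming PostgreSQL 10 or newer (older clusters use pg_xlog/ instead of pg_wal/) and that the cluster sits at the root of the volume; both function names are hypothetical.

```shell
# Return 0 when a directory has the markers of a PostgreSQL data directory.
looks_like_pgdata() {
  local dir="$1"
  [ -f "$dir/PG_VERSION" ] && [ -d "$dir/base" ] \
    && [ -d "$dir/global" ] && [ -d "$dir/pg_wal" ]
}

# Same check for a Docker volume, mounted read-only in a throwaway container.
volume_has_pgdata() {
  local v="$1"
  docker run --rm -v "${v}:/data:ro" alpine sh -c \
    'test -f /data/PG_VERSION && test -d /data/base && test -d /data/global && test -d /data/pg_wal'
}

# Usage:
#   volume_has_pgdata skymoney_pgdata && echo "candidate cluster found"
```

Passing this check only means the directory layout is present; it does not prove the cluster starts cleanly or contains the expected data.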

1.4 Check filesystem backups and automation history

ls -la /opt/skymoney/backups
ls -la /var/backups
systemctl list-timers --all | grep -Ei 'backup|postgres|skymoney'
grep -R "backup.sh" /etc/cron* /opt/skymoney 2>/dev/null

Decision:

  • If valid dump/volume exists -> continue to Section 2.
  • If none exists -> mark irrecoverable and continue to Section 3.
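
Whether a dump counts as "valid" can be checked before any restore by asking pg_restore to read its table of contents. A minimal sketch, assuming the .dump files are pg_dump custom-format archives (the source does not state the format) and that file names contain no whitespace; the function names are hypothetical.

```shell
# Return 0 when the file is non-empty and pg_restore can list its contents.
dump_is_readable() {
  local dump="$1"
  [ -s "$dump" ] && pg_restore --list "$dump" >/dev/null 2>&1
}

# Print the newest readable dump in a directory; fail if there is none.
newest_readable_dump() {
  local dir="$1" f
  for f in $(ls -1t "$dir"/*.dump 2>/dev/null); do
    if dump_is_readable "$f"; then
      printf '%s\n' "$f"
      return 0
    fi
  done
  return 1
}

# Usage:
#   newest_readable_dump /opt/skymoney/backups || echo "no usable dump found"
```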

2) Recover from artifact

2.1 Restore into isolated recovery DB

export RECOVERY_DB="skymoney_recovery_$(date +%Y%m%d%H%M)"
export BACKUP_FILE="/opt/skymoney/backups/<chosen-file>.dump"
export DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney"
export RESTORE_DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RECOVERY_DB}"
export RESTORE_DB="$RECOVERY_DB"

cd /opt/skymoney
./scripts/restore.sh

2.2 Validate restored data

psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "User";'
psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "Transaction";'
psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "EmailToken";'
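
The counts above still have to be judged by eye. A minimal sketch of an automated pass/fail check follows, assuming that User and Transaction must never be empty in a valid restore (EmailToken may legitimately be empty); the function names are hypothetical.

```shell
# Print the row count of one table as a bare integer.
table_count() {
  local url="$1" table="$2"
  psql "$url" -At -c "SELECT COUNT(*) FROM \"${table}\";"
}

# Fail when any listed table is empty (or cannot be queried).
require_nonempty_tables() {
  local url="$1" t n
  shift
  for t in "$@"; do
    n="$(table_count "$url" "$t")" || return 1
    echo "${t}: ${n} rows"
    [ "$n" -gt 0 ] 2>/dev/null || return 1
  done
}

# Usage:
#   require_nonempty_tables "$RESTORE_DATABASE_URL" User Transaction
```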

2.3 Cutover (if recovery DB is valid)

  1. Put app in brief maintenance mode (or stop writes).
  2. Take a fresh backup of current prod DB.
  3. Restore/copy recovered data into production DB.
  4. Run:
docker-compose exec -T api npx prisma migrate deploy
  5. Remove maintenance mode.
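
Step 3 does not prescribe a copy method. One possible approach is a dump of the recovery DB followed by a clean restore into production. This is a sketch under assumptions, not the repo's documented procedure: the function name copy_recovery_into_prod is hypothetical, and scripts/restore.sh may be the intended tool instead. It drops and re-creates objects in the target, so run it only after the fresh backup from step 2.

```shell
# Copy a validated recovery database over the target database.
# DESTRUCTIVE for the target: --clean drops existing objects before restoring.
copy_recovery_into_prod() {
  local src_url="$1" dst_url="$2" tmp rc
  tmp="$(mktemp)" || return 1
  if ! pg_dump --format=custom --file="$tmp" "$src_url"; then
    rm -f "$tmp"
    return 1
  fi
  pg_restore --clean --if-exists --no-owner --dbname="$dst_url" "$tmp"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}

# Usage:
#   copy_recovery_into_prod "$RESTORE_DATABASE_URL" "$DATABASE_URL"
```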

2.4 Post-recovery validation

  1. Login flow works.
  2. Dashboard and critical routes return expected results.
  3. Security events/logging continue to emit.
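
Checks 1 and 2 can be partly scripted as HTTP status probes. A minimal sketch; the function name expect_status is hypothetical, and the routes to probe are not listed in this runbook, so the URL in the usage line is a placeholder.

```shell
# Return 0 when the URL answers with the expected HTTP status code.
expect_status() {
  local url="$1" want="$2" got
  got="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url")"
  echo "${url} -> ${got} (want ${want})"
  [ "$got" = "$want" ]
}

# Usage (placeholder URL):
#   expect_status "https://<prod-host>/<route>" 200
```

A status probe confirms the route responds; it does not replace a manual login test.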

3) Re-create admin/operator access (if no recovery)

Note: the current schema has no built-in isAdmin role field. This step only restores an operator user's login credentials; it does not grant any elevated role.

3.1 Create verified operator user

cd /opt/skymoney/api
# Requires BOOTSTRAP_EMAIL and BOOTSTRAP_PASSWORD in the environment.
node -e '
const { PrismaClient } = require("@prisma/client");
const argon2 = require("argon2");
(async () => {
  const email = process.env.BOOTSTRAP_EMAIL;
  const password = process.env.BOOTSTRAP_PASSWORD;
  if (!email || !password) throw new Error("BOOTSTRAP_EMAIL and BOOTSTRAP_PASSWORD are required");
  const prisma = new PrismaClient();
  try {
    // Hash the password, then create the user or update the existing one.
    const passwordHash = await argon2.hash(password);
    await prisma.user.upsert({
      where: { email },
      update: { passwordHash, emailVerified: true },
      create: { email, passwordHash, emailVerified: true },
    });
    console.log("Bootstrap user ready:", email);
  } finally {
    // Release the database connection even if the upsert fails.
    await prisma.$disconnect();
  }
})().catch((e) => { console.error(e); process.exit(1); });
'

3.2 Credential handling

  1. Generate a random password in the terminal and pass it to the script as BOOTSTRAP_PASSWORD.
  2. Store in password manager.
  3. Rotate immediately after first successful login.
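
Step 1 can be done without the password ever being typed. A minimal sketch, assuming openssl or coreutils base64 is installed on the host; the function name gen_password is hypothetical.

```shell
# Print a random 32-byte password, base64-encoded (44 characters).
gen_password() {
  if command -v openssl >/dev/null 2>&1; then
    openssl rand -base64 32
  else
    head -c 32 /dev/urandom | base64
  fi
}

# Usage:
#   export BOOTSTRAP_EMAIL="<operator-email>"
#   export BOOTSTRAP_PASSWORD="$(gen_password)"
```

Unset BOOTSTRAP_PASSWORD once the value is stored in the password manager and the bootstrap script has run.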

4) Backup/restore validation against live VPS

4.1 Run real backup

cd /opt/skymoney
BACKUP_ENFORCE_TARGET_CHECK=1 \
EXPECTED_PROD_DB_HOST=postgres \
EXPECTED_PROD_DB_NAME=skymoney \
BACKUP_DIR=/opt/skymoney/backups \
./scripts/backup.sh

4.2 Verify latest checksum

LATEST_DUMP="$(ls -1t /opt/skymoney/backups/*.dump | head -n 1)"
sha256sum -c "${LATEST_DUMP}.sha256"

4.3 Restore drill into non-prod DB

RESTORE_DB="skymoney_restore_test_$(date +%Y%m%d%H%M)"
RESTORE_DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RESTORE_DB}" \
DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney" \
BACKUP_FILE="$LATEST_DUMP" \
RESTORE_DB="$RESTORE_DB" \
./scripts/restore.sh

4.4 Validate restore drill

psql "postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RESTORE_DB}" -c 'SELECT COUNT(*) FROM "User";'

4.5 Cleanup drill DB

psql "postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney" \
  -c "DROP DATABASE IF EXISTS \"${RESTORE_DB}\";"

5) Deploy hardening controls

  1. docker-compose.yml pins volume name: skymoney_pgdata.
  2. Deploy workflow sets COMPOSE_PROJECT_NAME=skymoney.
  3. Deploy workflow runs scripts/validate-prod-db-target.sh.
  4. Deploy workflow runs pre-migration scripts/backup.sh.
  5. Deploy workflow uses prisma migrate deploy only.
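
The contents of scripts/validate-prod-db-target.sh are not shown in this runbook. As an illustration of the kind of check control 3 implies, here is a minimal sketch that compares the host and database name in a connection URL against EXPECTED_PROD_DB_HOST and EXPECTED_PROD_DB_NAME. The function names are hypothetical, and the parsing assumes a plain postgres://user:pass@host:port/dbname URL with no "/" in the credentials and no IPv6 host.

```shell
# Extract the host from a postgres:// URL.
db_url_host() {
  local rest="${1#*://}"        # drop the scheme
  rest="${rest##*@}"            # drop user:pass@
  rest="${rest%%/*}"            # drop /dbname and anything after it
  printf '%s\n' "${rest%%:*}"   # drop :port
}

# Extract the database name from a postgres:// URL.
db_url_name() {
  local rest="${1#*://}"        # drop the scheme
  rest="${rest#*/}"             # keep what follows the first "/"
  printf '%s\n' "${rest%%\?*}"  # drop any ?query suffix
}

# Return 0 only when the URL points at the expected host and database.
check_prod_target() {
  local url="$1"
  [ "$(db_url_host "$url")" = "$EXPECTED_PROD_DB_HOST" ] \
    && [ "$(db_url_name "$url")" = "$EXPECTED_PROD_DB_NAME" ]
}

# Usage:
#   EXPECTED_PROD_DB_HOST=postgres EXPECTED_PROD_DB_NAME=skymoney \
#     check_prod_target "$DATABASE_URL" || { echo "wrong DB target" >&2; exit 1; }
```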

Quarterly drill requirement

Run a full backup + restore drill every quarter and record evidence in:

  • tests-results-for-OWASP/evidence-log-template.md