fix: adding db recovery practices (bye bye db)
Some checks failed
Security Tests / security-non-db (push) Successful in 18s
Security Tests / security-db (push) Successful in 23s
Deploy / deploy (push) Has been cancelled

2026-03-02 11:16:52 -06:00
parent 301b3f8967
commit d9df9b0fe4
11 changed files with 409 additions and 15 deletions

.env

@@ -26,13 +26,13 @@ SMTP_REQUIRE_TLS=true
SMTP_TLS_REJECT_UNAUTHORIZED=true
SMTP_USER=skymoney-smtp
SMTP_PASS=skymoneysmtp124521
EMAIL_FROM=SkyMoney Budget <no-reply@skymoneybudget.com>
EMAIL_FROM="SkyMoney Budget <no-reply@skymoneybudget.com>"
EMAIL_BOUNCE_FROM=bounces@skymoneybudget.com
EMAIL_REPLY_TO=support@skymoneybudget.com
UPDATE_NOTICE_VERSION=4
UPDATE_NOTICE_TITLE=SkyMoney Update
UPDATE_NOTICE_BODY=You can now set fixed expenses as Estimated Bills for variable amounts (like utilities), apply actual bill amounts each cycle for instant true-up, and auto-adjust surplus/shortfall against available budget.
UPDATE_NOTICE_TITLE="SkyMoney Update"
UPDATE_NOTICE_BODY="You can now set fixed expenses as Estimated Bills for variable amounts (like utilities), apply actual bill amounts each cycle for instant true-up, and auto-adjust surplus/shortfall against available budget."
ALLOW_INSECURE_AUTH_FOR_DEV=false
JWT_ISSUER=skymoney-api
JWT_AUDIENCE=skymoney-web
@@ -42,3 +42,5 @@ AUTH_LOCKOUT_WINDOW_MS=900000
PASSWORD_RESET_TTL_MINUTES=30
PASSWORD_RESET_RATE_LIMIT_PER_MINUTE=5
PASSWORD_RESET_CONFIRM_RATE_LIMIT_PER_MINUTE=10
EXPECTED_PROD_DB_HOST=postgres
EXPECTED_PROD_DB_NAME=skymoney


@@ -19,6 +19,8 @@ DATABASE_URL=postgres://skymoney_app:change-me@postgres:5432/skymoney
BACKUP_DATABASE_URL=postgres://skymoney_app:change-me@127.0.0.1:5432/skymoney
RESTORE_DATABASE_URL=postgres://skymoney_app:change-me@127.0.0.1:5432/skymoney_restore_test
ADMIN_DATABASE_URL=postgres://postgres:change-me@127.0.0.1:5432/postgres
EXPECTED_PROD_DB_HOST=postgres
EXPECTED_PROD_DB_NAME=skymoney
# Auth secrets (min 32 chars)
JWT_SECRET=replace-with-32+-chars


@@ -27,6 +27,8 @@ jobs:
- name: Deploy with Docker Compose
run: |
set -euo pipefail
# Deploy directory
APP_DIR=/opt/skymoney
mkdir -p $APP_DIR
@@ -47,14 +49,29 @@ jobs:
cd $APP_DIR
# Keep a stable compose project name to avoid accidental new resource names
export COMPOSE_PROJECT_NAME=skymoney
# Validate migration target before touching containers
export EXPECTED_PROD_DB_HOST="${EXPECTED_PROD_DB_HOST:-postgres}"
export EXPECTED_PROD_DB_NAME="${EXPECTED_PROD_DB_NAME:-skymoney}"
./scripts/validate-prod-db-target.sh
# Build and start all services
sudo docker-compose up -d --build
sudo -E docker-compose up -d --build
# Wait for database to be ready
sleep 10
# Mandatory pre-migration backup
BACKUP_ENFORCE_TARGET_CHECK=1 \
EXPECTED_PROD_DB_HOST="$EXPECTED_PROD_DB_HOST" \
EXPECTED_PROD_DB_NAME="$EXPECTED_PROD_DB_NAME" \
BACKUP_DIR=/opt/skymoney/backups \
./scripts/backup.sh
# Run Prisma migrations inside the API container
sudo docker-compose exec -T api npx prisma migrate deploy
sudo -E docker-compose exec -T api npx prisma migrate deploy
- name: Reload Nginx
run: sudo systemctl reload nginx
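
The deploy step above waits a fixed `sleep 10` before taking the backup and migrating. A fixed delay can be too short on a slow start and wasteful on a fast one. A minimal sketch of polling for readiness instead is shown here; `wait_for` is a hypothetical helper, not part of this commit, and the Compose service name `postgres` and role `skymoney_app` in the usage comment are assumptions taken from the connection strings in `.env`.

```shell
# Hypothetical helper: retry a command until it succeeds or attempts run out.
# Usage: wait_for <max-attempts> <delay-seconds> <command...>
wait_for() {
  local attempts="$1" delay="$2"
  shift 2
  local i
  for ((i = 1; i <= attempts; i++)); do
    # Success on any attempt ends the wait immediately.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Only pause when another attempt is still coming.
    if ((i < attempts)); then
      sleep "$delay"
    fi
  done
  return 1
}

# Example (assumed service and role names; adjust to the real Compose file):
#   wait_for 30 2 sudo -E docker-compose exec -T postgres \
#     pg_isready -U skymoney_app -d skymoney
```

With `set -euo pipefail` already active in the deploy step, a non-zero return from `wait_for` would stop the deploy before the backup and migration run.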


@@ -1,5 +1,8 @@
{
"css.lint.unknownAtRules": "ignore",
"scss.lint.unknownAtRules": "ignore",
"less.lint.unknownAtRules": "ignore"
"less.lint.unknownAtRules": "ignore",
"cSpell.words": [
"skymoney"
]
}


@@ -76,3 +76,4 @@ services:
volumes:
pgdata:
name: skymoney_pgdata


@@ -0,0 +1,197 @@
# Production DB Recovery and Safety Runbook
Last updated: March 2, 2026
## Purpose
Use this runbook when production data appears lost, reset, or unexpectedly empty.
## Safety Rules (Do Not Bypass)
1. Never run destructive Prisma commands in production:
- `prisma migrate reset`
- `prisma migrate dev`
- `prisma db push --accept-data-loss`
2. Never run `docker compose down -v` (or `docker-compose down -v`) against production.
3. Always restore into an isolated database first.
4. Always take a fresh backup before migration/cutover actions.
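The first two rules can also be enforced mechanically rather than by memory. The sketch below is one possible guard, not something this commit ships: `is_forbidden_prod_command` and `run_prod` are hypothetical names, and the patterns only cover the commands listed above (plus `--volumes`, the long form of `-v`).

```shell
# Hypothetical helper: returns 0 when the command line matches one of the
# destructive commands named in the safety rules, 1 otherwise.
is_forbidden_prod_command() {
  local cmd="$*"
  case "$cmd" in
    *"prisma migrate reset"* | *"prisma migrate dev"*) return 0 ;;
    *"prisma db push"*"--accept-data-loss"*) return 0 ;;
    *"docker compose down"*" -v"* | *"docker-compose down"*" -v"*) return 0 ;;
    *"docker compose down"*" --volumes"* | *"docker-compose down"*" --volumes"*) return 0 ;;
  esac
  return 1
}

# Hypothetical wrapper: refuse forbidden commands, run everything else as-is.
run_prod() {
  if is_forbidden_prod_command "$@"; then
    echo "Refusing to run in production: $*" >&2
    return 1
  fi
  "$@"
}
```

For example, `run_prod docker-compose down -v` would be refused, while `run_prod docker-compose exec -T api npx prisma migrate deploy` would run unchanged. A string match like this is a safety net only; it does not replace the rules.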
## 1) Determine recoverability
### 1.1 Identify active Postgres container + volume
```bash
docker ps --format '{{.Names}}'
docker inspect <postgres-container> --format '{{json .Mounts}}'
```
Record:
- container name
- mounted volume name
- data mount path
### 1.2 Enumerate candidate old volumes
```bash
docker volume ls | grep -E 'pgdata|skymoney|postgres'
```
### 1.3 Inspect candidate volumes for PostgreSQL cluster files
```bash
for v in $(docker volume ls --format '{{.Name}}' | grep -E 'pgdata|skymoney|postgres'); do
  echo "== $v =="
  docker run --rm -v "${v}:/var/lib/postgresql/data" alpine sh -lc \
    "ls -la /var/lib/postgresql/data | head -n 30"
done
```
Look for:
- `PG_VERSION`
- `base/`
- `global/`
- `pg_wal/`
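The marker check above can be scripted so that each candidate gets a yes/no answer instead of a listing to read by eye. This is a minimal sketch under the assumption that the volume contents are reachable as a plain directory (for example inside the throwaway container from 1.3); `looks_like_pg_cluster` is a hypothetical name.

```shell
# Hypothetical helper: succeeds when DIR contains the markers of a
# PostgreSQL data directory (PG_VERSION plus base/, global/, pg_wal/).
looks_like_pg_cluster() {
  local dir="$1"
  [[ -f "$dir/PG_VERSION" ]] || return 1
  local sub
  for sub in base global pg_wal; do
    [[ -d "$dir/$sub" ]] || return 1
  done
  return 0
}
```

A directory that passes this check is only a candidate: it says the files are present, not that the cluster starts cleanly or contains the expected data.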
### 1.4 Check filesystem backups and automation history
```bash
ls -la /opt/skymoney/backups
ls -la /var/backups
systemctl list-timers --all | grep -Ei 'backup|postgres|skymoney'
grep -R "backup.sh" /etc/cron* /opt/skymoney 2>/dev/null
```
Decision:
- If valid dump/volume exists -> continue to Section 2.
- If none exists -> mark irrecoverable and continue to Section 3.
## 2) Recover from artifact
### 2.1 Restore into isolated recovery DB
```bash
export RECOVERY_DB="skymoney_recovery_$(date +%Y%m%d%H%M)"
export BACKUP_FILE="/opt/skymoney/backups/<chosen-file>.dump"
export DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney"
export RESTORE_DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RECOVERY_DB}"
export RESTORE_DB="$RECOVERY_DB"
cd /opt/skymoney
./scripts/restore.sh
```
### 2.2 Validate restored data
```bash
psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "User";'
psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "Transaction";'
psql "$RESTORE_DATABASE_URL" -c 'SELECT COUNT(*) FROM "EmailToken";'
```
### 2.3 Cutover (if recovery DB is valid)
1. Put app in brief maintenance mode (or stop writes).
2. Take a fresh backup of current prod DB.
3. Restore/copy recovered data into production DB.
4. Run:
```bash
docker-compose exec -T api npx prisma migrate deploy
```
5. Remove maintenance mode.
### 2.4 Post-recovery validation
1. Login flow works.
2. Dashboard and critical routes return expected results.
3. Security events/logging continue to emit.
## 3) Re-create admin/operator access (if no recovery)
Note: the current schema has no built-in `isAdmin` role field, so this step only restores login credentials for an operator account.
### 3.1 Create verified operator user
```bash
cd /opt/skymoney/api
node -e '
const { PrismaClient } = require("@prisma/client");
const argon2 = require("argon2");
(async () => {
  const prisma = new PrismaClient();
  const email = process.env.BOOTSTRAP_EMAIL;
  const password = process.env.BOOTSTRAP_PASSWORD;
  if (!email || !password) throw new Error("BOOTSTRAP_EMAIL and BOOTSTRAP_PASSWORD are required");
  const passwordHash = await argon2.hash(password);
  await prisma.user.upsert({
    where: { email },
    update: { passwordHash, emailVerified: true },
    create: { email, passwordHash, emailVerified: true },
  });
  await prisma.$disconnect();
  console.log("Bootstrap user ready:", email);
})().catch((e) => { console.error(e); process.exit(1); });
'
```
### 3.2 Credential handling
1. Generate random password in terminal.
2. Store in password manager.
3. Rotate immediately after first successful login.
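One way to carry out step 1 without typing a password by hand is sketched below. It assumes coreutils `head` and `base64` are available on the host; `generate_bootstrap_password` is a hypothetical name, and the only link to the script in 3.1 is the `BOOTSTRAP_PASSWORD` variable it reads.

```shell
# Assumption: coreutils `head` and `base64` exist on the host.
# 24 random bytes encode to exactly 32 base64 characters, with no padding.
generate_bootstrap_password() {
  head -c 24 /dev/urandom | base64
}

# Export for the bootstrap script in 3.1; avoid echoing the value to the terminal.
BOOTSTRAP_PASSWORD="$(generate_bootstrap_password)"
export BOOTSTRAP_PASSWORD
```

`BOOTSTRAP_EMAIL` still has to be set separately, and the generated value should be copied into the password manager before the shell session ends.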
## 4) Backup/restore validation against live VPS
### 4.1 Run real backup
```bash
cd /opt/skymoney
BACKUP_ENFORCE_TARGET_CHECK=1 \
EXPECTED_PROD_DB_HOST=postgres \
EXPECTED_PROD_DB_NAME=skymoney \
BACKUP_DIR=/opt/skymoney/backups \
./scripts/backup.sh
```
### 4.2 Verify latest checksum
```bash
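LATEST_DUMP="$(ls -1t /opt/skymoney/backups/*.dump | head -n 1)"
# backup.sh records only the dump's basename in the .sha256 file, so verify from the backup directory.
(cd "$(dirname "$LATEST_DUMP")" && sha256sum -c "$(basename "$LATEST_DUMP").sha256")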
```
### 4.3 Restore drill into non-prod DB
```bash
RESTORE_DB="skymoney_restore_test_$(date +%Y%m%d%H%M)"
RESTORE_DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RESTORE_DB}" \
DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney" \
BACKUP_FILE="$LATEST_DUMP" \
RESTORE_DB="$RESTORE_DB" \
./scripts/restore.sh
```
### 4.4 Validate restore drill
```bash
psql "postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RESTORE_DB}" -c 'SELECT COUNT(*) FROM "User";'
```
### 4.5 Cleanup drill DB
```bash
psql "postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney" \
-c "DROP DATABASE IF EXISTS \"${RESTORE_DB}\";"
```
## 5) Deploy hardening controls
1. `docker-compose.yml` pins volume name: `skymoney_pgdata`.
2. Deploy workflow sets `COMPOSE_PROJECT_NAME=skymoney`.
3. Deploy workflow runs `scripts/validate-prod-db-target.sh`.
4. Deploy workflow runs pre-migration `scripts/backup.sh`.
5. Deploy workflow uses `prisma migrate deploy` only.
## Quarterly drill requirement
Run a full backup + restore drill every quarter and record evidence in:
- `tests-results-for-OWASP/evidence-log-template.md`


@@ -1,11 +1,6 @@
#!/usr/bin/env bash
set -euo pipefail
if [[ -z "${DATABASE_URL:-}" && -z "${BACKUP_DATABASE_URL:-}" ]]; then
echo "DATABASE_URL or BACKUP_DATABASE_URL is required."
exit 1
fi
ENV_FILE="${ENV_FILE:-./.env}"
if [[ -f "$ENV_FILE" ]]; then
set -a
@@ -14,6 +9,48 @@ if [[ -f "$ENV_FILE" ]]; then
set +a
fi
if [[ -z "${DATABASE_URL:-}" && -z "${BACKUP_DATABASE_URL:-}" ]]; then
  echo "DATABASE_URL or BACKUP_DATABASE_URL is required."
  exit 1
fi

BACKUP_URL="${BACKUP_DATABASE_URL:-$DATABASE_URL}"

extract_host() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^@/]+@([^/:?]+).*$#\1#' <<< "$url"
}

extract_db() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]+/([^?]+).*$#\1#' <<< "$url"
}

if [[ "${BACKUP_ENFORCE_TARGET_CHECK:-0}" == "1" ]]; then
  if [[ -z "${EXPECTED_PROD_DB_HOST:-}" || -z "${EXPECTED_PROD_DB_NAME:-}" ]]; then
    echo "BACKUP_ENFORCE_TARGET_CHECK=1 requires EXPECTED_PROD_DB_HOST and EXPECTED_PROD_DB_NAME."
    exit 1
  fi

  ACTUAL_HOST="$(extract_host "$BACKUP_URL")"
  ACTUAL_DB="$(extract_db "$BACKUP_URL")"

  if [[ "$ACTUAL_HOST" == "$BACKUP_URL" || "$ACTUAL_DB" == "$BACKUP_URL" ]]; then
    echo "Unable to parse backup database URL."
    exit 1
  fi

  if [[ "$ACTUAL_HOST" != "$EXPECTED_PROD_DB_HOST" ]]; then
    echo "Backup target host mismatch. expected=$EXPECTED_PROD_DB_HOST actual=$ACTUAL_HOST"
    exit 1
  fi

  if [[ "$ACTUAL_DB" != "$EXPECTED_PROD_DB_NAME" ]]; then
    echo "Backup target db mismatch. expected=$EXPECTED_PROD_DB_NAME actual=$ACTUAL_DB"
    exit 1
  fi
fi
OUT_DIR="${BACKUP_DIR:-./backups}"
mkdir -p "$OUT_DIR"
@@ -22,8 +59,12 @@ OUT_FILE="${OUT_DIR}/skymoney_${STAMP}.dump"
OUT_BASENAME="$(basename "$OUT_FILE")"
OUT_DIR_ABS="$(cd "$OUT_DIR" && pwd)"
pg_dump "${BACKUP_DATABASE_URL:-$DATABASE_URL}" -Fc -f "$OUT_FILE"
START_TS="$(date +%s)"
pg_dump "$BACKUP_URL" -Fc -f "$OUT_FILE"
(cd "$OUT_DIR_ABS" && sha256sum "$OUT_BASENAME" > "${OUT_BASENAME}.sha256")
END_TS="$(date +%s)"
RUNTIME_SEC="$((END_TS - START_TS))"
echo "Backup written to: $OUT_FILE"
echo "Checksum written to: ${OUT_FILE}.sha256"
echo "Backup runtime seconds: $RUNTIME_SEC"
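
Because the checksum is generated from inside the backup directory, the `.sha256` file records only the dump's basename. `sha256sum -c` resolves that name relative to the current directory, so verification has to run from the directory that holds the dump. A small sketch of a verifier that does this is shown here; `verify_dump` is a hypothetical helper, not part of this commit.

```shell
# Hypothetical helper: verify DUMP against its sidecar "<dump>.sha256" file.
# The check runs in a subshell so the caller's working directory is unchanged.
verify_dump() {
  local dump="$1"
  local dir base
  dir="$(dirname "$dump")"
  base="$(basename "$dump")"
  (cd "$dir" && sha256sum -c "${base}.sha256")
}
```

It can then be called from any working directory, for instance `verify_dump "$(ls -1t /opt/skymoney/backups/*.dump | head -n 1)"`, and returns non-zero when the dump or its checksum file is missing or does not match.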


@@ -0,0 +1,50 @@
#!/usr/bin/env bash
set -euo pipefail

ENV_FILE="${ENV_FILE:-./.env}"
if [[ -f "$ENV_FILE" ]]; then
  set -a
  # shellcheck source=/dev/null
  . "$ENV_FILE"
  set +a
fi

if [[ -z "${DATABASE_URL:-}" ]]; then
  echo "DATABASE_URL is required."
  exit 1
fi

if [[ -z "${EXPECTED_PROD_DB_HOST:-}" || -z "${EXPECTED_PROD_DB_NAME:-}" ]]; then
  echo "EXPECTED_PROD_DB_HOST and EXPECTED_PROD_DB_NAME are required."
  exit 1
fi

extract_host() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^@/]+@([^/:?]+).*$#\1#' <<< "$url"
}

extract_db() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]+/([^?]+).*$#\1#' <<< "$url"
}

ACTUAL_HOST="$(extract_host "$DATABASE_URL")"
ACTUAL_DB="$(extract_db "$DATABASE_URL")"

if [[ "$ACTUAL_HOST" == "$DATABASE_URL" || "$ACTUAL_DB" == "$DATABASE_URL" ]]; then
  echo "Unable to parse DATABASE_URL."
  exit 1
fi

if [[ "$ACTUAL_HOST" != "$EXPECTED_PROD_DB_HOST" ]]; then
  echo "DATABASE_URL host mismatch. expected=$EXPECTED_PROD_DB_HOST actual=$ACTUAL_HOST"
  exit 1
fi

if [[ "$ACTUAL_DB" != "$EXPECTED_PROD_DB_NAME" ]]; then
  echo "DATABASE_URL db mismatch. expected=$EXPECTED_PROD_DB_NAME actual=$ACTUAL_DB"
  exit 1
fi

echo "DATABASE_URL target check passed (host=$ACTUAL_HOST db=$ACTUAL_DB)."
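
The two `sed` helpers carry the whole target check, so it helps to see what they return. The sketch below copies `extract_host` and `extract_db` verbatim from the script and runs them on sample URLs; the URLs and the `?schema=public` suffix are illustrative values, not taken from a real environment. When a pattern does not match, `sed` prints its input unchanged, which is why the script compares the result against the original URL to detect a parse failure.

```shell
extract_host() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^@/]+@([^/:?]+).*$#\1#' <<< "$url"
}

extract_db() {
  local url="$1"
  sed -E 's#^[a-zA-Z][a-zA-Z0-9+.-]*://[^/]+/([^?]+).*$#\1#' <<< "$url"
}

# Sample URL with credentials, port, and a query string (illustrative only).
url='postgres://skymoney_app:change-me@postgres:5432/skymoney?schema=public'
echo "host=$(extract_host "$url") db=$(extract_db "$url")"
# prints: host=postgres db=skymoney

# No "user:password@" part: the host pattern requires an "@", so sed echoes
# the input back and the script would report an unparseable URL.
bare='postgres://localhost:5432/skymoney'
[[ "$(extract_host "$bare")" == "$bare" ]] && echo "host not parsed"
# prints: host not parsed
```

One consequence of the host pattern is that a URL without credentials, or with an `@` or `/` inside the password, fails the check rather than passing with a wrong host, which is the safer direction for a deploy gate.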


@@ -25,6 +25,7 @@ This directory is the source of truth for SkyMoney OWASP validation work.
- `post-deployment-verification-checklist.md`: Production smoke checks after each deploy.
- `evidence-log-template.md`: Copy/paste template for recording each verification run.
- `residual-risk-backlog.md`: Open non-blocking hardening items tracked release-to-release.
- `../docs/production-db-recovery-runbook.md`: Incident response + recovery + admin bootstrap runbook.
## Current status


@@ -6,12 +6,16 @@
- Environment: `local` | `staging` | `production`
- App/API version (git SHA):
- Operator:
- Incident/reference ticket (if recovery event):
## Environment flags
- `NODE_ENV`:
- `AUTH_DISABLED`:
- `ALLOW_INSECURE_AUTH_FOR_DEV`:
- `COMPOSE_PROJECT_NAME`:
- `EXPECTED_PROD_DB_HOST`:
- `EXPECTED_PROD_DB_NAME`:
## Commands executed
@@ -33,6 +37,30 @@ Output summary:
```
Output summary:
4.
```bash
# command
```
Output summary:
## Recoverability Evidence
- Current Postgres container:
- Mounted volume(s):
- Candidate old volume(s) inspected:
- Recoverable artifact found: `yes` | `no`
- Artifact location:
- Recovery decision:
## Backup/Restore Drill Evidence
- Latest backup file:
- Latest checksum file:
- Checksum verified: `yes` | `no`
- Restore test DB name:
- Restore succeeded: `yes` | `no`
- Row count checks performed:
## Results
- A01 protected route unauthenticated check: `pass` | `fail`
@@ -56,6 +84,8 @@ Output summary:
- New issues observed:
- Regressions observed:
- Follow-up tickets:
- Data recovery status:
- Admin user bootstrap status:
## Residual Risk Review


@@ -18,6 +18,55 @@ echo "$TEST_DATABASE_URL"
Expected:
- single valid URL value
- host/port match the intended test database (for local runs usually `127.0.0.1:5432`)
5. Compose/DB safety preflight:
- `COMPOSE_PROJECT_NAME=skymoney` is set for deploy runtime.
- `docker-compose.yml` volume `pgdata` is pinned to `skymoney_pgdata`.
- `scripts/validate-prod-db-target.sh` passes for current `.env`.
- deploy runbook acknowledges forbidden destructive commands in prod:
- `prisma migrate reset`
- `prisma migrate dev`
- `prisma db push --accept-data-loss`
- `docker compose down -v` / `docker-compose down -v`
## Database recoverability and safety checks
### 0) Capture current container and volume bindings
```bash
docker ps --format '{{.Names}}'
docker inspect <postgres-container> --format '{{json .Mounts}}'
docker volume ls | grep -E 'pgdata|skymoney|postgres'
```
Expected:
- production Postgres uses `skymoney_pgdata`.
- no unexpected new empty volume silently substituted.
### 0.1) Validate latest backup artifact exists and verifies
```bash
ls -lt /opt/skymoney/backups | head
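LATEST_DUMP="$(ls -1t /opt/skymoney/backups/*.dump | head -n 1)"
# backup.sh records only the dump's basename in the .sha256 file, so verify from the backup directory.
(cd "$(dirname "$LATEST_DUMP")" && sha256sum -c "$(basename "$LATEST_DUMP").sha256")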
```
Expected:
- latest dump and checksum exist.
- checksum verification returns `OK`.
### 0.2) Restore drill into isolated test DB (same VPS)
```bash
RESTORE_DB="skymoney_restore_test_$(date +%Y%m%d%H%M)" \
BACKUP_FILE="$LATEST_DUMP" \
RESTORE_DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/${RESTORE_DB}" \
DATABASE_URL="postgres://<admin-user>:<admin-pass>@127.0.0.1:5432/skymoney" \
./scripts/restore.sh
```
Expected:
- restore completes without manual edits.
- key tables readable in restored DB.
## A01 smoke checks
@@ -51,7 +100,7 @@ curl -i -X POST "${API_BASE}/admin/rollover" \
```
Expected:
- HTTP `403`
- HTTP `401` or `403` (must not be publicly callable)
## A09 smoke checks
@@ -98,10 +147,11 @@ Expected:
Note:
- A06/A07 runtime suites require PostgreSQL availability.
- `SECURITY_DB_TESTS=0` runs non-DB security controls only.
- `SECURITY_DB_TESTS=1` includes DB-backed A06/A07 suites.
- `SECURITY_DB_TESTS=1` includes DB-backed A06/A07/forgot-password suites.
## Sign-off
1. Record outputs in `evidence-log-template.md`.
2. Review open residual risks in `residual-risk-backlog.md`.
3. Mark release security check as pass/fail.
3. Record backup + restore drill evidence.
4. Mark release security check as pass/fail.