K3s Borg backup operational + update skill & infra docs

Hoid 2026-02-19 09:14:02 +00:00
parent 83595c17fb
commit e2e9ae55f7
2 changed files with 32 additions and 13 deletions


@@ -151,21 +151,40 @@ kubectl exec -n postgres <primary-pod> -c postgres -- psql -U docfast -d <dbname
- Hetzner FW: `coolify-fw` (ID 10553199)
- Port 6443: 10.0.0.0/16 + 178.115.247.134/32 (CI runner)
## Backup (TO IMPLEMENT)
## Backup ✅ OPERATIONAL
**Current: ❌ NO BACKUPS**
**Borg → Hetzner Storage Box**
- Target: `u149513-sub10@u149513-sub10.your-backup.de:23`
- SSH key: `/root/.ssh/id_ed25519` (k3s-mgr-backup)
- Passphrase: `/root/.borg-passphrase` (on k3s-mgr)
- Key exports: `/root/.borg-key-cluster`, `/root/.borg-key-db`
- Script: `/root/k3s-backup.sh`
- Log: `/var/log/k3s-backup.log`
**Plan: Borg → Hetzner Storage Box**
- Target: `u149513-sub11@u149513-sub11.your-backup.de:23`
- SSH key already configured on k3s-mgr (`/root/.ssh/id_ed25519`, fingerprint `docfast-backup`)
- Per-machine subdir: `./docfast-1/` (existing), `./k3s-cluster/` and `./k3s-db/` (planned)
**Repos:**
| Repo | Contents | Size |
|------|----------|------|
| `./k3s-cluster` | K3s SQLite, manifests, token, all namespace YAML exports, CNPG specs | ~45 MB |
| `./k3s-db` | pg_dump of all databases (docfast, docfast_staging, snapapi, snapapi_staging) + globals | ~30 KB |
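The `./k3s-db` row combines per-database dumps with a globals dump. A minimal sketch of how those dumps could be produced, assuming the `kubectl exec` pattern used earlier in this file and a `postgres` superuser; it only prints the commands, and `dump_commands` is a hypothetical helper, not necessarily what `/root/k3s-backup.sh` does:

```bash
#!/usr/bin/env bash
# Sketch only: prints the dump commands instead of running them.
# PRIMARY_POD is a placeholder; the "postgres" user and the output
# file names are assumptions, not taken from the real backup script.
PRIMARY_POD="${PRIMARY_POD:-<primary-pod>}"

dump_commands() {
  local db
  for db in docfast docfast_staging snapapi snapapi_staging; do
    printf 'kubectl exec -n postgres %s -c postgres -- pg_dump -U postgres %s > %s.sql\n' \
      "$PRIMARY_POD" "$db" "$db"
  done
  # Roles and tablespaces are cluster-wide, so they need a separate globals dump.
  printf 'kubectl exec -n postgres %s -c postgres -- pg_dumpall -U postgres --globals-only > globals.sql\n' \
    "$PRIMARY_POD"
}

dump_commands
```

Printing the commands first makes it easy to review them before piping the output to a shell.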
**What to back up:**
1. **Cluster state:** etcd snapshots + `/var/lib/rancher/k3s/server/manifests/` → daily
2. **Databases:** pg_dump all databases → every 6h
3. **K8s manifests:** export all resources as YAML → daily
**Schedule (cron on k3s-mgr):**
- `0 */6 * * *` — DB backup (pg_dump) every 6 hours
- `0 3 * * *` — Full backup (DB + cluster state + manifests) daily at 03:00 UTC
**Recovery:** Fresh nodes → K3s install → restore etcd or re-apply manifests → restore DB → update DNS → ~15-30 min
**Retention:** 7 daily, 4 weekly (auto-pruned)
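The retention rule maps directly onto `borg prune` keep options. A minimal sketch, assuming the repository URL from the Target line; it only prints the commands, and `prune_cmd` is a hypothetical helper rather than part of `/root/k3s-backup.sh`:

```bash
#!/usr/bin/env bash
# Sketch only: prints (does not run) a prune command per repo.
# Running the printed commands also needs BORG_RSH="ssh -p23" and
# BORG_PASSPHRASE set, as in the verify command.
REPO_BASE="ssh://u149513-sub10@u149513-sub10.your-backup.de/./"

prune_cmd() {
  # $1 = repo name: k3s-cluster or k3s-db
  printf 'borg prune --list --keep-daily 7 --keep-weekly 4 %s%s\n' "$REPO_BASE" "$1"
}

for repo in k3s-cluster k3s-db; do
  prune_cmd "$repo"
done
```

With Borg 1.2 or later, `borg prune` only marks archives as deleted; a following `borg compact` on the same repo is what frees the space.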
**Recovery:**
1. Provision 3 fresh CAX11 nodes
2. Install K3s, restore SQLite DB from Borg (`/var/lib/rancher/k3s/server/db/`)
3. Or: fresh K3s + re-apply manifest YAMLs from Borg
4. Restore databases: `psql -U postgres -d <dbname> < dump.sql`
5. Update DNS to new LB IP
6. Estimated recovery time: ~15-30 minutes
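Steps 2 and 4 above can be sketched as Borg and psql commands. This is an illustration under assumptions: the archive name, the path layout inside the archive, and the dump file name are placeholders, and `extract_cmd` and `restore_db_cmd` are hypothetical helpers that only print what would be run:

```bash
#!/usr/bin/env bash
# Sketch only: prints the restore commands instead of running them.
# <archive> and dump.sql are placeholders; check `borg list` for real names.
# Running the printed commands also needs BORG_RSH="ssh -p23" and BORG_PASSPHRASE.
REPO_BASE="ssh://u149513-sub10@u149513-sub10.your-backup.de/./"

extract_cmd() {
  # $1 = repo (k3s-cluster or k3s-db), $2 = archive name,
  # $3 = optional path inside the archive
  printf 'borg extract %s%s::%s%s\n' "$REPO_BASE" "$1" "$2" "${3:+ $3}"
}

restore_db_cmd() {
  # $1 = database name, $2 = dump file; mirrors step 4
  printf 'psql -U postgres -d %s < %s\n' "$1" "$2"
}

# Borg stores paths without the leading slash and extracts into the
# current directory, so restoring in place starts from /.
echo 'cd /'
extract_cmd k3s-cluster '<archive>' var/lib/rancher/k3s/server/db
extract_cmd k3s-db '<archive>'
restore_db_cmd docfast dump.sql
```

The K3s service should be stopped while the SQLite files under `/var/lib/rancher/k3s/server/db/` are replaced, then started again.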
**Verify backup:**
```bash
ssh k3s-mgr 'export BORG_RSH="ssh -p23"; export BORG_PASSPHRASE=$(cat /root/.borg-passphrase); borg list ssh://u149513-sub10@u149513-sub10.your-backup.de/./k3s-db'
```
## Common Operations