K3s Borg backup operational + update skill & infra docs
parent 83595c17fb
commit e2e9ae55f7
2 changed files with 32 additions and 13 deletions
@@ -151,21 +151,40 @@ kubectl exec -n postgres <primary-pod> -c postgres -- psql -U docfast -d <dbname
 - Hetzner FW: `coolify-fw` (ID 10553199)
 - Port 6443: 10.0.0.0/16 + 178.115.247.134/32 (CI runner)

-## Backup (TO IMPLEMENT)
+## Backup ✅ OPERATIONAL

-**Current: ❌ NO BACKUPS**
+**Borg → Hetzner Storage Box**
+- Target: `u149513-sub10@u149513-sub10.your-backup.de:23`
+- SSH key: `/root/.ssh/id_ed25519` (k3s-mgr-backup)
+- Passphrase: `/root/.borg-passphrase` (on k3s-mgr)
+- Key exports: `/root/.borg-key-cluster`, `/root/.borg-key-db`
+- Script: `/root/k3s-backup.sh`
+- Log: `/var/log/k3s-backup.log`

-**Plan: Borg → Hetzner Storage Box**
-- Target: `u149513-sub11@u149513-sub11.your-backup.de:23`
-- SSH key already configured on k3s-mgr (`/root/.ssh/id_ed25519`, fingerprint `docfast-backup`)
-- Per-machine subdir: `./docfast-1/` (existing), `./k3s-cluster/` and `./k3s-db/` (planned)
+**Repos:**
+| Repo | Contents | Size |
+|------|----------|------|
+| `./k3s-cluster` | K3s SQLite, manifests, token, all namespace YAML exports, CNPG specs | ~45 MB |
+| `./k3s-db` | pg_dump of all databases (docfast, docfast_staging, snapapi, snapapi_staging) + globals | ~30 KB |

-**What to back up:**
-1. **Cluster state:** etcd snapshots + `/var/lib/rancher/k3s/server/manifests/` → daily
-2. **Databases:** pg_dump all databases → every 6h
-3. **K8s manifests:** export all resources as YAML → daily
+**Schedule (cron on k3s-mgr):**
+- `0 */6 * * *` — DB backup (pg_dump) every 6 hours
+- `0 3 * * *` — Full backup (DB + cluster state + manifests) daily at 03:00 UTC
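Review note: the two schedule entries above could be written as crontab lines roughly as sketched here. This is an illustration, not the installed crontab. The `db` and `full` arguments are hypothetical (the diff does not show how `/root/k3s-backup.sh` selects its mode), the log redirect is an assumption since the script may already write to its own log, and cron on k3s-mgr is assumed to run in UTC.

```bash
#!/usr/bin/env bash
# Sketch only: "db" and "full" are hypothetical arguments; the real
# /root/k3s-backup.sh may choose its mode in a different way.
set -euo pipefail

SCRIPT=/root/k3s-backup.sh
LOG=/var/log/k3s-backup.log

# Print the two crontab entries that match the documented schedule.
# The ">> $LOG" redirect is an assumption, not taken from the commit.
render_cron() {
  printf '0 */6 * * * %s db >> %s 2>&1\n' "$SCRIPT" "$LOG"
  printf '0 3 * * * %s full >> %s 2>&1\n' "$SCRIPT" "$LOG"
}

render_cron
```

The function only prints the entries. Piping its output into `crontab -` would replace the current user's whole crontab, so the lines are better merged by hand via `crontab -e`.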

-**Recovery:** Fresh nodes → K3s install → restore etcd or re-apply manifests → restore DB → update DNS → ~15-30 min
+**Retention:** 7 daily, 4 weekly (auto-pruned)
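Review note: a retention of 7 daily and 4 weekly archives maps onto `borg prune` with `--keep-daily 7 --keep-weekly 4`. The sketch below only prints the command for each repo. It assumes retention is applied per repo with identical settings, which the diff does not confirm, and it reuses the repo URL form from the verify command in this commit.

```bash
#!/usr/bin/env bash
# Sketch only: prints the prune commands instead of running them.
# Assumes both repos share the same retention policy.
set -euo pipefail

# Repo base in the same form as the verify command (port 23 comes from BORG_RSH).
REPO_BASE='ssh://u149513-sub10@u149513-sub10.your-backup.de/.'

# Print the borg prune command for one repo name.
prune_cmd() {
  local repo="$1"
  echo borg prune --list --keep-daily 7 --keep-weekly 4 "${REPO_BASE}/${repo}"
}

for repo in k3s-cluster k3s-db; do
  prune_cmd "$repo"
done
```

To run a printed command, `BORG_RSH` and `BORG_PASSPHRASE` need to be exported as in the verify command, and adding `--dry-run` first shows what would be deleted. On Borg 1.2 and later, space is only reclaimed after a subsequent `borg compact`.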
+
+**Recovery:**
+1. Provision 3 fresh CAX11 nodes
+2. Install K3s, restore SQLite DB from Borg (`/var/lib/rancher/k3s/server/db/`)
+3. Or: fresh K3s + re-apply manifest YAMLs from Borg
+4. Restore databases: `psql -U postgres -d <dbname> < dump.sql`
+5. Update DNS to new LB IP
+6. Estimated recovery time: ~15-30 minutes
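Review note: step 4 has to be repeated for each of the four databases listed for `./k3s-db`. The sketch below prints one restore command per database so the plan can be checked before anything is executed. The `<dbname>.sql` file naming and the extraction directory are hypothetical, since the layout inside the Borg archive is not shown in this commit.

```bash
#!/usr/bin/env bash
# Sketch only: prints the restore commands, it does not run psql.
# The <dbname>.sql naming inside the extracted archive is an assumption.
set -euo pipefail

# Database names as listed for the ./k3s-db repo.
DATABASES=(docfast docfast_staging snapapi snapapi_staging)

# Print one psql restore command per database for dumps found in $1.
restore_plan() {
  local dump_dir="$1" db
  for db in "${DATABASES[@]}"; do
    printf 'psql -U postgres -d %s < %s/%s.sql\n' "$db" "$dump_dir" "$db"
  done
}

restore_plan /tmp/restore
```

The dumps would first be pulled out of the repo with `borg extract`, which writes into the current directory. Since the repo also contains globals, roles presumably need to be restored before the per-database dumps so that object ownership resolves.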
+
+**Verify backup:**
+```bash
+ssh k3s-mgr 'export BORG_RSH="ssh -p23"; export BORG_PASSPHRASE=$(cat /root/.borg-passphrase); borg list ssh://u149513-sub10@u149513-sub10.your-backup.de/./k3s-db'
+```

 ## Common Operations