# K3s Infrastructure Documentation

*Last updated: 2026-02-18*

## Cluster Overview

| Component | Details |
|-----------|---------|
| K3s Version | v1.34.4+k3s1 |
| Datacenter | Hetzner nbg1 |
| Server Type | CAX11 (ARM64, 2 vCPU, 4GB RAM) |
| Monthly Cost | €17.06 (3× CAX11 + LB) |
| Private Network | 10.0.0.0/16, ID 11949384 |
| Cluster CIDR | 10.42.0.0/16 |
| Service CIDR | 10.43.0.0/16 |
| Flannel Interface | enp7s0 (private network) |

## Nodes

| Node | Role | Public IP | Private IP | Hetzner ID |
|------|------|-----------|------------|------------|
| k3s-mgr | Control plane (tainted NoSchedule) | 188.34.201.101 | 10.0.1.5 | 121365837 |
| k3s-w1 | Worker | 159.69.23.121 | 10.0.1.6 | 121365839 |
| k3s-w2 | Worker | 46.225.169.60 | 10.0.1.7 | 121365840 |

## Load Balancer

| Field | Value |
|-------|-------|
| Name | k3s-lb |
| Hetzner ID | 5834131 |
| Public IP | 46.225.37.135 |
| Targets | k3s-w1, k3s-w2 (ports 80/443) |
| Health Checks | TCP, 15s interval, 3 retries, 10s timeout |

## Installed Operators & Components

| Component | Version | Notes |
|-----------|---------|-------|
| Traefik | Helm (DaemonSet) | Runs on all workers, handles ingress + TLS termination |
| cert-manager | 1.17.2 | Let's Encrypt ClusterIssuer `letsencrypt-prod` |
| CloudNativePG | 1.25.1 | PostgreSQL operator |

## Database (CNPG)

| Field | Value |
|-------|-------|
| Cluster Name | main-db |
| Namespace | postgres |
| Instances | 2 (primary + replica) |
| PostgreSQL | 17.4 |
| Storage | 10Gi local-path per instance |
| Databases | `docfast` (prod), `docfast_staging` (staging) |
| PgBouncer | `main-db-pooler`, 2 instances, transaction mode |

### Credentials
- `docfast-db-credentials` secret: user=docfast, pass=docfast
- `main-db-superuser` secret: managed by CNPG

## Namespaces

| Namespace | Purpose |
|-----------|---------|
| postgres | CNPG cluster + pooler |
| docfast | Production DocFast (2 replicas) |
| docfast-staging | Staging DocFast (1 replica) |
| cnpg-system | CNPG operator |
| cert-manager | cert-manager |
| kube-system | K3s system (CoreDNS, Traefik, etc.) |

## HA Configuration

All spread constraints are **runtime patches** — may not survive K3s upgrades. Re-apply after updates.

| Component | Replicas | Spread Strategy |
|-----------|----------|-----------------|
| CoreDNS | 3 | `preferredDuringScheduling` podAntiAffinity (mgr + w1 + w2) |
| CNPG Operator | 2 | `topologySpreadConstraints DoNotSchedule` (w1 + w2) |
| PgBouncer Pooler | 2 | `requiredDuringScheduling` podAntiAffinity via Pooler CRD (w1 + w2) |
| DocFast Prod | 2 | `preferredDuringScheduling` podAntiAffinity (w1 + w2) |
| DocFast Staging | 1 | Not HA by design |

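To make re-applying less error-prone, the CoreDNS spread can be kept as a patch file. A minimal sketch, assuming the stock `coredns` Deployment in `kube-system` with the K3s default `k8s-app: kube-dns` labels:

```yaml
# coredns-spread-patch.yaml: sketch of the preferred podAntiAffinity patch.
# The k8s-app: kube-dns selector is the K3s default; verify against live pods.
# Apply with:
#   kubectl -n kube-system patch deployment coredns --patch-file coredns-spread-patch.yaml
spec:
  replicas: 3
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                topologyKey: kubernetes.io/hostname
                labelSelector:
                  matchLabels:
                    k8s-app: kube-dns
```

Preferred (not required) anti-affinity keeps all three replicas schedulable even if a node is down.
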
### Failover Tuning (2026-02-18)
- **Readiness probe**: every 5s, fail after 2 → pod marked unhealthy in ~10s
- **Liveness probe**: every 10s, fail after 3 → restart after ~30s
- **Node tolerations**: pods evicted from a failed node after 10s (default was 300s)
- **Result**: failover window of ~10-15 seconds

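In Deployment terms, the tuning above corresponds to roughly this pod template fragment (a sketch: the container name, port, and `/healthz` path are assumptions; the numbers match the tuning above):

```yaml
# Sketch of the failover tuning as a pod template fragment.
spec:
  tolerations:
    # Evict from unreachable/not-ready nodes after 10s instead of the 300s default.
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
  containers:
    - name: docfast                 # container name is an assumption
      readinessProbe:               # unhealthy after ~10s (2 x 5s)
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 5
        failureThreshold: 2
      livenessProbe:                # restart after ~30s (3 x 10s)
        httpGet: { path: /healthz, port: 8080 }
        periodSeconds: 10
        failureThreshold: 3
```
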
### HA Test Results (2026-02-18)
- ✅ w1 down: 4/4 health checks passed
- ✅ w2 down: 4/4 health checks passed, CNPG promoted replica
- ✅ mgr down: 4/4 health checks passed (workers keep running)

## CI/CD Pipeline

| Field | Value |
|-------|-------|
| Registry | git.cloonar.com (Forgejo container registry) |
| Runner | Agent host (178.115.247.134), x86 → ARM64 cross-compile via QEMU |
| Build time | ~8 min |
| Deployer SA | `docfast:deployer` with namespace-scoped RBAC |

### Workflows
- **deploy.yml**: Push to `main` → build + deploy to `docfast-staging`
- **promote.yml**: Tag `v*` → build + deploy to `docfast` (prod)

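The shape of `deploy.yml` is roughly the following. This is a sketch, not the live workflow: the runner label, image name, and deployment/container names are all assumptions.

```yaml
# Sketch of deploy.yml: push to main -> build ARM64 image -> roll staging.
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: docker            # runner label is an assumption
    steps:
      - uses: actions/checkout@v4
      - name: Build and push (cross-compile for ARM64 via QEMU)
        run: |
          echo "${{ secrets.REGISTRY_TOKEN }}" | docker login git.cloonar.com -u ci --password-stdin
          docker buildx build --platform linux/arm64 \
            -t git.cloonar.com/docfast/docfast:${{ github.sha }} --push .
      - name: Deploy to staging
        run: |
          echo "${{ secrets.KUBECONFIG }}" | base64 -d > kubeconfig
          KUBECONFIG=kubeconfig kubectl -n docfast-staging \
            set image deployment/docfast docfast=git.cloonar.com/docfast/docfast:${{ github.sha }}
```
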
### Secrets Required in Forgejo
- `REGISTRY_TOKEN` — PAT with write:package scope
- `KUBECONFIG` — base64 encoded deployer kubeconfig

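To produce the `KUBECONFIG` secret value, the deployer kubeconfig can be encoded like this (the default path matches the SSH Access section; `-w0` disables line wrapping so the value pastes as a single line):

```shell
# Base64-encode the deployer kubeconfig for the Forgejo KUBECONFIG secret.
# KCFG defaults to the deployer kubeconfig path on k3s-mgr.
KCFG="${KCFG:-/home/deployer/.kube-config.yaml}"
base64 -w0 "$KCFG"
```
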
### Pull Secrets
- `forgejo-registry` imagePullSecret in both `docfast` and `docfast-staging` namespaces

## DNS

| Record | Type | Value |
|--------|------|-------|
| docfast.dev | A | 46.225.37.135 (LB) |
| staging.docfast.dev | A | **NOT SET** — needed for staging TLS |
| docfast.dev | MX | mail.cloonar.com. |

## Firewall

- Name: coolify-fw, Hetzner ID 10553199
- Port 6443 open to: 10.0.0.0/16 (cluster internal) + 178.115.247.134/32 (CI runner)

## SSH Access

Config in `/home/openclaw/.ssh/config`:
- `k3s-mgr`, `k3s-w1`, `k3s-w2` — root access
- `deployer` user on k3s-mgr — limited kubeconfig at `/home/deployer/.kube-config.yaml`
- KUBECONFIG on mgr: `/etc/rancher/k3s/k3s.yaml`

---
## Backup Strategy

### Current State: ✅ OPERATIONAL (since 2026-02-19)

### Design: Borg to Hetzner Storage Box

Target: `u149513-sub11@u149513-sub11.your-backup.de:23` (already set up, SSH key configured)

**1. Cluster State (etcd snapshots)**
- K3s built-in: `--etcd-snapshot-schedule-cron` on k3s-mgr
- Borg repo: `./k3s-cluster/` on Storage Box
- Contents: etcd snapshot + `/var/lib/rancher/k3s/server/manifests/` + all applied YAML manifests
- Schedule: Daily
- Retention: 7 daily, 4 weekly

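The snapshot schedule can be pinned in the K3s config file instead of a CLI flag. A sketch for k3s-mgr; the 03:00 cron time matches the daily backup window noted in Future Improvements, and the retention value is an assumption:

```yaml
# Sketch: /etc/rancher/k3s/config.yaml on k3s-mgr.
# Scheduled snapshots land in /var/lib/rancher/k3s/server/db/snapshots/,
# where the daily Borg run can pick them up.
etcd-snapshot-schedule-cron: "0 3 * * *"
etcd-snapshot-retention: 7      # retention count is an assumption
```
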
**2. Database (pg_dump)**
- CronJob in `postgres` namespace → `pg_dump` both databases
- Push to Borg repo: `./k3s-db/` on Storage Box
- Schedule: Every 6 hours
- Retention: 7 daily, 4 weekly
- DB size: ~8 MB (tiny — Borg dedup makes this basically free)

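The CronJob looks roughly like this. It is a sketch, not the deployed manifest: the secret key name is an assumption, `main-db-rw` is the CNPG read-write service (pg_dump needs a session, so it should bypass the transaction-mode pooler), and shipping the dumps to the Borg repo is elided:

```yaml
# Sketch of the 6-hourly pg_dump CronJob in the postgres namespace.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pg-dump
  namespace: postgres
spec:
  schedule: "0 */6 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: dump
              image: postgres:17.4
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: docfast-db-credentials
                      key: password      # key name is an assumption
              command: ["/bin/sh", "-c"]
              args:
                - >
                  pg_dump -h main-db-rw -U docfast docfast > /tmp/docfast.sql &&
                  pg_dump -h main-db-rw -U docfast docfast_staging > /tmp/docfast_staging.sql
```
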
**3. Kubernetes Manifests**
- Export all namespaced resources as YAML
- Include: deployments, services, ingresses, secrets (encrypted by Borg), configmaps, CNPG cluster spec, pooler spec
- Push to Borg alongside etcd snapshots

**4. Recovery Procedure**
1. Provision 3 fresh CAX11 nodes
2. Install K3s and restore the etcd snapshot (or: fresh K3s install + re-apply manifests from Borg)
3. Restore the CNPG database from the latest pg_dump
4. Update DNS to the new LB IP

Estimated recovery time: ~15-30 minutes

### Future: CNPG Barman/S3 (when needed)
- Hetzner Object Storage (S3-compatible)
- Continuous WAL archiving for point-in-time recovery
- Worth it when DB grows past ~1 GB or revenue justifies €5/mo
- Current DB: 7.6 MB — overkill for now

---
## Future Improvements

### Priority: High
- [x] **Implement Borg backup** — operational since 2026-02-19 (DB every 6h, full daily at 03:00 UTC)
- [ ] **DNS: staging.docfast.dev** → 46.225.37.135 — needed for staging ingress TLS
- [ ] **Persist HA spread constraints** — CoreDNS scale, CNPG operator replicas, pooler anti-affinity are runtime patches. Need infra-as-code (manifests in Git) to survive K3s upgrades/reinstalls
- [x] **Old server decommissioned** (167.235.156.214) — deleted, no longer exists

### Priority: Medium
- [ ] **CNPG backup to S3** — upgrade from pg_dump to continuous WAL archiving when DB grows
- [ ] **Monitoring/alerting** — Prometheus + Grafana stack, or lightweight alternative (VictoriaMetrics)
- [ ] **Resource limits tuning** — current: 100m-1000m CPU, 256Mi-1Gi RAM per pod. Profile actual usage and right-size
- [ ] **Network policies** — restrict pod-to-pod traffic (e.g., only DocFast → PgBouncer, not direct to DB)
- [ ] **Pod Disruption Budgets** — ensure at least 1 pod stays running during voluntary disruptions (upgrades, drains)
- [ ] **Automated K3s upgrades** — system-upgrade-controller for rolling node updates

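The network-policy item could start from something like the following sketch, which admits only DocFast pods to PgBouncer on 5432. All label selectors here are assumptions and must be verified against the live pods before applying; K3s enforces NetworkPolicy via its bundled controller even on Flannel:

```yaml
# Sketch: only pods in the docfast namespace may reach the pooler on 5432.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-docfast-to-pooler
  namespace: postgres
spec:
  podSelector:
    matchLabels:
      cnpg.io/poolerName: main-db-pooler   # label is an assumption
  policyTypes: [Ingress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: docfast
      ports:
        - protocol: TCP
          port: 5432
```
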
### Priority: Low
- [ ] **Multi-project namespaces** — SnapAPI and future products get own namespaces + RBAC
- [ ] **ServiceAccount per CEO agent** — scoped kubectl access for autonomous deployment
- [ ] **Horizontal Pod Autoscaler** — scale DocFast replicas based on CPU/request load
- [ ] **External Secrets Operator** — centralized secret management instead of per-namespace secrets
- [ ] **Loki for log aggregation** — centralized logging instead of `kubectl logs`
- [ ] **Node auto-scaling** — Hetzner Cloud Controller Manager + Cluster Autoscaler