# K3s Infrastructure Documentation

*Last updated: 2026-02-18*

## Cluster Overview

| Component | Details |
|-----------|---------|
| K3s Version | v1.34.4+k3s1 |
| Datacenter | Hetzner nbg1 |
| Server Type | CAX11 (ARM64, 2 vCPU, 4GB RAM) |
| Monthly Cost | €17.06 (3× CAX11 + LB) |
| Private Network | 10.0.0.0/16, ID 11949384 |
| Cluster CIDR | 10.42.0.0/16 |
| Service CIDR | 10.43.0.0/16 |
| Flannel Interface | enp7s0 (private network) |

## Nodes

| Node | Role | Public IP | Private IP | Hetzner ID |
|------|------|-----------|------------|------------|
| k3s-mgr | Control plane (tainted NoSchedule) | 188.34.201.101 | 10.0.1.5 | 121365837 |
| k3s-w1 | Worker | 159.69.23.121 | 10.0.1.6 | 121365839 |
| k3s-w2 | Worker | 46.225.169.60 | 10.0.1.7 | 121365840 |

## Load Balancer

| Field | Value |
|-------|-------|
| Name | k3s-lb |
| Hetzner ID | 5834131 |
| Public IP | 46.225.37.135 |
| Targets | k3s-w1, k3s-w2 (ports 80/443) |
| Health Checks | TCP, 15s interval, 3 retries, 10s timeout |

## Installed Operators & Components

| Component | Version | Notes |
|-----------|---------|-------|
| Traefik | Helm (DaemonSet) | Runs on all workers, handles ingress + TLS termination |
| cert-manager | 1.17.2 | Let's Encrypt ClusterIssuer `letsencrypt-prod` |
| CloudNativePG | 1.25.1 | PostgreSQL operator |

## Database (CNPG)

| Field | Value |
|-------|-------|
| Cluster Name | main-db |
| Namespace | postgres |
| Instances | 2 (primary + replica) |
| PostgreSQL | 17.4 |
| Storage | 10Gi local-path per instance |
| Databases | `docfast` (prod), `docfast_staging` (staging) |
| PgBouncer | `main-db-pooler`, 2 instances, transaction mode |

### Credentials

- `docfast-db-credentials` secret: user=docfast, pass=docfast
- `main-db-superuser` secret: managed by CNPG

## Namespaces

| Namespace | Purpose |
|-----------|---------|
| postgres | CNPG cluster + pooler |
| docfast | Production DocFast (2 replicas) |
| docfast-staging | Staging DocFast (1 replica) |
| cnpg-system | CNPG operator |
| cert-manager | cert-manager |
| kube-system | K3s system (CoreDNS, Traefik, etc.) |

## HA Configuration

All spread constraints are **runtime patches** and may not survive K3s upgrades. Re-apply after updates.

| Component | Replicas | Spread Strategy |
|-----------|----------|-----------------|
| CoreDNS | 3 | `preferredDuringScheduling` podAntiAffinity (mgr + w1 + w2) |
| CNPG Operator | 2 | `topologySpreadConstraints DoNotSchedule` (w1 + w2) |
| PgBouncer Pooler | 2 | `requiredDuringScheduling` podAntiAffinity via Pooler CRD (w1 + w2) |
| DocFast Prod | 2 | `preferredDuringScheduling` podAntiAffinity (w1 + w2) |
| DocFast Staging | 1 | Not HA by design |

### Failover Tuning (2026-02-18)

- **Readiness probe**: every 5s, fail after 2, so a pod is marked unhealthy in ~10s
- **Liveness probe**: every 10s, fail after 3
- **Node tolerations**: pods evicted after 10s (default was 300s)
- **Result**: failover window of ~10-15 seconds

### HA Test Results (2026-02-18)

- ✅ w1 down: 4/4 health checks passed
- ✅ w2 down: 4/4 health checks passed, CNPG promoted the replica
- ✅ mgr down: 4/4 health checks passed (workers keep running)

## CI/CD Pipeline

| Field | Value |
|-------|-------|
| Registry | git.cloonar.com (Forgejo container registry) |
| Runner | Agent host (178.115.247.134), x86 → ARM64 cross-compile via QEMU |
| Build time | ~8 min |
| Deployer SA | `docfast:deployer` with namespace-scoped RBAC |

### Workflows

- **deploy.yml**: push to `main` → build + deploy to `docfast-staging`
- **promote.yml**: tag `v*` → build + deploy to `docfast` (prod)

### Secrets Required in Forgejo

- `REGISTRY_TOKEN`: PAT with write:package scope
- `KUBECONFIG`: base64-encoded deployer kubeconfig

### Pull Secrets

- `forgejo-registry` imagePullSecret in both the `docfast` and `docfast-staging` namespaces

## DNS

| Record | Type | Value |
|--------|------|-------|
| docfast.dev | A | 46.225.37.135 (LB) |
| staging.docfast.dev | A | **NOT SET** (needed for staging TLS) |
| MX | MX | mail.cloonar.com. |
## Firewall

- Name: coolify-fw, Hetzner ID 10553199
- Port 6443 open to: 10.0.0.0/16 (cluster internal) + 178.115.247.134/32 (CI runner)

## SSH Access

Config in `/home/openclaw/.ssh/config`:

- `k3s-mgr`, `k3s-w1`, `k3s-w2`: root access
- `deployer` user on k3s-mgr: limited kubeconfig at `/home/deployer/.kube-config.yaml`
- KUBECONFIG on mgr: `/etc/rancher/k3s/k3s.yaml`

---

## Backup Strategy

### Current State: ✅ OPERATIONAL (since 2026-02-19)

### Plan: Borg to Hetzner Storage Box

Target: `u149513-sub11@u149513-sub11.your-backup.de:23` (already set up, SSH key configured)

**1. Cluster State (etcd snapshots)**

- K3s built-in: `--etcd-snapshot-schedule-cron` on k3s-mgr
- Borg repo: `./k3s-cluster/` on Storage Box
- Contents: etcd snapshot + `/var/lib/rancher/k3s/server/manifests/` + all applied YAML manifests
- Schedule: daily
- Retention: 7 daily, 4 weekly

**2. Database (pg_dump)**

- CronJob in `postgres` namespace → `pg_dump` both databases
- Push to Borg repo: `./k3s-db/` on Storage Box
- Schedule: every 6 hours
- Retention: 7 daily, 4 weekly
- DB size: ~8 MB (tiny; Borg dedup makes this basically free)

**3. Kubernetes Manifests**

- Export all namespaced resources as YAML
- Include: deployments, services, ingresses, secrets (encrypted by Borg), configmaps, CNPG cluster spec, pooler spec
- Push to Borg alongside etcd snapshots

**4. Recovery Procedure**

1. Provision 3 fresh CAX11 nodes
2. Install K3s, restore the etcd snapshot
3. Or: fresh K3s + re-apply manifests from Borg
4. Restore the CNPG database from pg_dump
5. Update DNS to the new LB IP
Estimated recovery time: ~15-30 minutes

### Future: CNPG Barman/S3 (when needed)

- Hetzner Object Storage (S3-compatible)
- Continuous WAL archiving for point-in-time recovery
- Worth it when the DB grows past ~1 GB or revenue justifies €5/mo
- Current DB is 7.6 MB, so this is overkill for now

---

## Future Improvements

### Priority: High

- [x] **Implement Borg backup**: operational since 2026-02-19 (DB every 6h, full backup daily at 03:00 UTC)
- [ ] **DNS: staging.docfast.dev** → 46.225.37.135 (needed for staging ingress TLS)
- [ ] **Persist HA spread constraints**: CoreDNS scale, CNPG operator replicas, and pooler anti-affinity are runtime patches. Needs infra-as-code (manifests in Git) to survive K3s upgrades/reinstalls
- [x] **Old server decommissioned** (167.235.156.214): deleted, no longer exists

### Priority: Medium

- [ ] **CNPG backup to S3**: upgrade from pg_dump to continuous WAL archiving when the DB grows
- [ ] **Monitoring/alerting**: Prometheus + Grafana stack, or a lightweight alternative (VictoriaMetrics)
- [ ] **Resource limits tuning**: current limits are 100m-1000m CPU and 256Mi-1Gi RAM per pod. Profile actual usage and right-size
- [ ] **Network policies**: restrict pod-to-pod traffic (e.g., only DocFast → PgBouncer, not direct to DB)
- [ ] **Pod Disruption Budgets**: ensure at least 1 pod stays running during voluntary disruptions (upgrades, drains)
- [ ] **Automated K3s upgrades**: system-upgrade-controller for rolling node updates

### Priority: Low

- [ ] **Multi-project namespaces**: SnapAPI and future products get their own namespaces + RBAC
- [ ] **ServiceAccount per CEO agent**: scoped kubectl access for autonomous deployment
- [ ] **Horizontal Pod Autoscaler**: scale DocFast replicas based on CPU/request load
- [ ] **External Secrets Operator**: centralized secret management instead of per-namespace secrets
- [ ] **Loki for log aggregation**: centralized logging instead of `kubectl logs`
- [ ] **Node auto-scaling**: Hetzner Cloud Controller Manager + Cluster Autoscaler
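The failover tuning described above (readiness every 5s failing after 2, liveness every 10s failing after 3, 10s eviction tolerations) maps onto a pod spec roughly like the following sketch. The container name, image, port, and health path are placeholders, not the real DocFast spec:

```yaml
# Sketch of the failover tuning as pod spec fields. Names and ports are
# illustrative placeholders; only the probe/toleration numbers come from
# the tuning notes above.
spec:
  tolerations:
    # Evict from unreachable/not-ready nodes after 10s (default is 300s)
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 10
  containers:
    - name: app
      image: example/app:latest
      readinessProbe:
        httpGet:
          path: /health
          port: 8080
        periodSeconds: 5
        failureThreshold: 2   # pod marked unhealthy in ~10s
      livenessProbe:
        httpGet:
          path: /health
          port: 8080
        periodSeconds: 10
        failureThreshold: 3
```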
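The "persist HA spread constraints" item could be addressed by committing the runtime patches as manifests in Git. A sketch of the `preferredDuringScheduling` podAntiAffinity used for DocFast prod; the `app: docfast` label is an assumption:

```yaml
# Sketch: spread the 2 prod replicas across w1 + w2 on a best-effort basis.
# The matchLabels selector is a placeholder for the real pod labels.
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchLabels:
              app: docfast
          topologyKey: kubernetes.io/hostname
```

Using `preferred` rather than `required` means a node outage still lets both replicas schedule onto the surviving worker instead of leaving one pending.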
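The 6-hourly `pg_dump` backup in the Borg plan could look roughly like this CronJob. The image tag, secret key name, and dump destination are assumptions, and the step that pushes the dumps into the `./k3s-db/` Borg repo is omitted:

```yaml
# Sketch of the 6-hourly pg_dump CronJob in the postgres namespace.
# The Borg push is not shown; the real job ships /backup to the Storage Box.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: pg-dump-backup
  namespace: postgres
spec:
  schedule: "0 */6 * * *"   # every 6 hours
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          volumes:
            - name: backup
              emptyDir: {}
          containers:
            - name: pg-dump
              image: postgres:17
              volumeMounts:
                - name: backup
                  mountPath: /backup
              env:
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: docfast-db-credentials
                      key: password   # key name assumed
              command: ["/bin/sh", "-c"]
              args:
                - >
                  pg_dump -h main-db-pooler -U docfast docfast
                  > /backup/docfast.sql &&
                  pg_dump -h main-db-pooler -U docfast docfast_staging
                  > /backup/docfast_staging.sql
```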
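The Pod Disruption Budget item above is a small, self-contained manifest per deployment; the selector label is an assumption:

```yaml
# Sketch: keep at least 1 DocFast prod pod running during voluntary
# disruptions such as node drains and rolling upgrades.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: docfast-pdb
  namespace: docfast
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: docfast   # placeholder, must match the real pod labels
```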
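The network policies item (only DocFast → PgBouncer, no direct DB access) could start from a sketch like this; the pooler pod label is an assumption about how CNPG labels pooler pods:

```yaml
# Sketch: allow ingress to the pooler only from pods in the docfast
# namespace. The podSelector label is assumed, not verified.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-docfast-to-pooler
  namespace: postgres
spec:
  podSelector:
    matchLabels:
      cnpg.io/poolerName: main-db-pooler   # assumed pooler pod label
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: docfast
      ports:
        - port: 5432
          protocol: TCP
```

A matching policy would still be needed for `docfast-staging`, and a separate default-deny policy on the `main-db` pods themselves to block direct connections that bypass the pooler.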