# 2026-02-18 — Daily Log
## DocFast Support Fixes
- FreeScout `needs-reply` had TWO bugs: it assumed threads were in chronological order, but they are reverse chronological (newest = index 0), and it skipped unassigned tickets. Both fixed.
- FreeScout email formatting: `body`/`text` field needs HTML, not plain text. `\n` gets stripped. Fixed: convert paragraphs to `<p>` tags, newlines to `<br>` in the `text` field.
- Support agent can now use light HTML (`<b>`, `<ul><li>`, `<a href="">`) in replies.
- Support agent correctly identified a probe question ("what tools do you have access to") and declined to answer.
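
The formatting fix above (paragraphs to `<p>`, newlines to `<br>`) can be sketched roughly as follows. This is a minimal illustration, not the actual support-agent code: the function name `plain_to_html` is hypothetical, and it escapes all input, so it only suits plain-text drafts rather than replies that already contain light HTML.

```python
import html
import re


def plain_to_html(text: str) -> str:
    """Turn plain text into simple HTML for an HTML-only body field.

    Blank-line-separated blocks become <p> elements; single newlines
    inside a block become <br>. (Hypothetical helper, not FreeScout API.)
    """
    # Normalise Windows/old-Mac line endings before splitting.
    text = text.replace("\r\n", "\n").replace("\r", "\n").strip()
    if not text:
        return ""
    rendered = []
    for para in re.split(r"\n\s*\n", text):
        # Escape &, < and > so plain text cannot inject markup.
        escaped = html.escape(para.strip(), quote=False)
        rendered.append("<p>" + escaped.replace("\n", "<br>") + "</p>")
    return "".join(rendered)
```

For example, `plain_to_html("Hi Anna,\n\nLine one\nLine two")` yields two `<p>` blocks, the second containing one `<br>`.
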
## Calendar CLI Created
- `bin/calendar` — CalDAV client for Nextcloud
- Commands: today, tomorrow, week, next, month, date, range, search
- Uses `expand` in CalDAV query to handle recurring events correctly
- Had to strip `&#13;` (XML-encoded carriage returns) from response
- Credentials in `services.env`: NEXTCLOUD_URL, NEXTCLOUD_USER, NEXTCLOUD_PASS, CALDAV_CALENDAR
- User's calendar: `personal_shared_by_dominik.polakovics@cloonar.com`
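
A rough sketch of the two CalDAV details noted above: asking the server to expand recurring events, and stripping `&#13;` from the response. The function names are hypothetical and this is not the code in `bin/calendar`; the XML follows the `calendar-query` REPORT shape from RFC 4791.

```python
from datetime import datetime, timedelta, timezone


def caldav_expand_query(start: datetime, end: datetime) -> str:
    """Build a calendar-query REPORT body that expands recurrences.

    start/end should be timezone-aware; they are converted to UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    s = start.astimezone(timezone.utc).strftime(fmt)
    e = end.astimezone(timezone.utc).strftime(fmt)
    lines = [
        '<?xml version="1.0" encoding="utf-8"?>',
        '<C:calendar-query xmlns:D="DAV:" xmlns:C="urn:ietf:params:xml:ns:caldav">',
        "  <D:prop>",
        "    <C:calendar-data>",
        # expand makes the server return each occurrence of a recurring event
        f'      <C:expand start="{s}" end="{e}"/>',
        "    </C:calendar-data>",
        "  </D:prop>",
        "  <C:filter>",
        '    <C:comp-filter name="VCALENDAR">',
        '      <C:comp-filter name="VEVENT">',
        f'        <C:time-range start="{s}" end="{e}"/>',
        "      </C:comp-filter>",
        "    </C:comp-filter>",
        "  </C:filter>",
        "</C:calendar-query>",
    ]
    return "\n".join(lines)


def strip_xml_cr(body: str) -> str:
    """Remove XML-encoded carriage returns from a CalDAV response."""
    return body.replace("&#13;", "")


if __name__ == "__main__":
    day = datetime(2026, 2, 18, tzinfo=timezone.utc)
    print(caldav_expand_query(day, day + timedelta(days=1)))
```

The body would be sent as a `REPORT` request with `Depth: 1` to the calendar collection URL.
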
## Product Research & SnapAPI
- Research agent found 7 product ideas, saved to `projects/ideas/product-ideas.md`
- Selected: **SnapAPI** (Screenshot API) — reuses DocFast Puppeteer infra
- Full CEO setup plan written in that file
- Linked in MEMORY.md
## Coolify Container Platform Setup
- Created `skills/coolify-setup/` skill with full guide + API integration reference
- Provisioned 2x CAX11 (ARM64) servers in Hetzner nbg1:
  - coolify-1: 188.34.201.101 (Manager + Worker)
  - coolify-2: 46.225.62.90 (Worker)
- Private network: coolify-net (10.0.0.0/16, ID 11949384)
- Firewall: coolify-fw (SSH, HTTP, HTTPS, 8000)
- Coolify v4.0.0-beta.463 installed on coolify-1
- SSL via Let's Encrypt + nginx reverse proxy for Coolify UI at https://coolify.cloonar.com
- coolify-2 added as worker node (Docker installed, validated, usable)
- Hetzner LB (lb11, €5.39/mo) created: IP 46.225.37.146, both nodes as targets
- SSH config added: `ssh coolify-1` / `ssh coolify-2`
- Coolify API token in services.env as COOLIFY_API_TOKEN
- Hetzner API token in services.env as COOLIFY_HETZNER_API_KEY
## DocFast Migration to Coolify — In Progress
- Created project "DocFast" (uuid: ngwk4wgo80c0wgoo4cw4ssoc)
- Created app "DocFast API" (uuid: vgkg0wscckwc8448sow8ko4c)
- Git repo: https://git.cloonar.com/openclawd/docfast.git, branch: main
- All 15 env vars set including secrets (copied server-to-server)
- Created PostgreSQL DB (uuid: vcgksg88ss4sww00cowgc4k8) — Coolify-managed
- App deployed successfully, with open issues:
  - ⚠️ Fresh DB: needs data migration from old server (167.235.156.214)
  - ⚠️ Proxy conflict: nginx (Coolify UI SSL) vs Traefik (app routing)
  - ⚠️ Health check disabled: Dockerfile needs `curl` added for Coolify health checks
- The build failed twice before succeeding: first on the wrong branch (`master` instead of `main`), then on the health check (no `curl` in the slim image)
## HA Architecture Discussion
- DNS failover: works but 1-5 min delay depending on TTL
- Hetzner LB: instant failover, €5.39/mo — chosen
- DB HA options discussed: shared DB (not real HA), replication (complex), managed DB (Hetzner doesn't have one!), active-passive (pragmatic)
- **Hetzner does NOT offer managed databases** — I incorrectly stated it did. Alternatives: Ubicloud (~€12/mo on Hetzner infra), Aiven, Neon, Supabase.
- 3-node setup (separate mgr) recommended for true HA with DB replication (~€17/mo)
- User still deciding on HA approach
## Wind-down
- User went to sleep at ~01:00 Vienna time (night of Feb 17→18)
- I failed to nudge after 20:12 — got caught up in DocFast work. User called it out at midnight.
- Must be more aggressive with wind-down nudges tonight.
## Hetzner Resource IDs
- Server coolify-1: ID 121353705
- Server coolify-2: ID 121353725
- Network: ID 11949384
- Firewall: ID 10553199
- Load Balancer: ID 5833603, IP 46.225.37.146
- SSH keys: dominik-nb01 (ID 107656266), openclaw-vm (ID 107656268)
## Coolify 3-Node HA Setup Complete
### Infrastructure
- **coolify-mgr** (188.34.201.101, 10.0.1.1) — Coolify UI + etcd
- **coolify-w1** (46.225.62.90, 10.0.1.2) — Apps + etcd + Patroni PRIMARY + PgBouncer
- **coolify-w2** (46.224.208.205, 10.0.1.4) — Apps + etcd + Patroni REPLICA + PgBouncer
- Hetzner server ID for w2: 121361614, Coolify UUID: mwccg08sokosk4wgw40g08ok
### Components
- **etcd 3.5.17** on all 3 nodes (quay.io/coreos/etcd, ARM64 compatible)
- **Patroni + PostgreSQL 16** on workers (custom Docker image `patroni:local`)
- **PgBouncer** (edoburu/pgbouncer) on workers — routes to current primary
- **Watcher** (systemd timer, every 5s) updates PgBouncer config on failover
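
The watcher's job can be sketched as below. This is a Python illustration only, since the real watcher is the shell script `update-primary.sh`. It assumes the Patroni REST API `/cluster` payload lists members with `role`, `state`, `host` and `port` fields, and the helper names are hypothetical.

```python
import json


def find_leader(cluster: dict):
    """Return (host, port) of the running leader, or None if there is none.

    `cluster` is assumed to be the parsed JSON from Patroni's /cluster endpoint.
    """
    for member in cluster.get("members", []):
        if member.get("role") == "leader" and member.get("state") == "running":
            return member["host"], int(member["port"])
    return None


def render_databases_section(dbname: str, host: str, port: int) -> str:
    """Render a PgBouncer [databases] section pointing at the given primary."""
    return f"[databases]\n{dbname} = host={host} port={port} dbname={dbname}\n"


if __name__ == "__main__":
    # Sample payload; member names and addresses follow the log above.
    sample = json.loads("""
    {"members": [
      {"name": "pg1", "role": "replica", "state": "streaming",
       "host": "10.0.1.2", "port": 5432},
      {"name": "pg2", "role": "leader", "state": "running",
       "host": "10.0.1.4", "port": 5432}
    ]}
    """)
    leader = find_leader(sample)
    if leader is not None:
        print(render_databases_section("docfast", *leader))
```

A real watcher would compare the rendered section with the current config and, only when it differs, rewrite the file and reload PgBouncer.
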
### Key Facts
- Docker daemon.json on all nodes: `172.17.0.0/12` pool (fixes 10.0.x conflict with Hetzner private net)
- Infra compose: `/opt/infra/docker-compose.yml` on each node
- Patroni config: `/opt/infra/patroni/patroni.yml`
- PgBouncer config: `/opt/infra/pgbouncer/pgbouncer.ini`
- Watcher script: `/opt/infra/pgbouncer/update-primary.sh`
- Failover log: `/opt/infra/pgbouncer/failover.log`
- `docfast` database created and replicated
- Failover tested: pg1→pg2 promotion + pg1 rejoin as replica ✅
- Switchover tested: pg2→pg1 clean switchover ✅
- Cost: €11.67/mo (3x CAX11)
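
The Docker address-pool note above is about keeping container networks away from the Hetzner private range. A quick way to check a candidate pool against `10.0.0.0/16` is sketched here; the helper name `overlaps` is hypothetical. Note that `172.17.0.0/12` has host bits set, so Python only accepts it with `strict=False` and normalises it to `172.16.0.0/12`.

```python
import ipaddress


def overlaps(pool: str, private_net: str) -> bool:
    """Return True if a Docker address pool collides with a private network."""
    # strict=False tolerates bases with host bits set, e.g. 172.17.0.0/12.
    a = ipaddress.ip_network(pool, strict=False)
    b = ipaddress.ip_network(private_net, strict=False)
    return a.overlaps(b)


if __name__ == "__main__":
    hetzner_net = "10.0.0.0/16"  # coolify-net from the log above
    for pool in ("172.17.0.0/12", "10.0.1.0/24"):
        print(pool, "overlaps" if overlaps(pool, hetzner_net) else "is clear of", hetzner_net)
```
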
### Remaining Steps
- [ ] Migrate DocFast data from 167.235.156.214 to Patroni cluster
- [ ] Deploy DocFast app via Coolify on both workers
- [ ] Set up BorgBackup on new nodes
- [ ] Add docfast user SCRAM hash to PgBouncer userlist
- [ ] Create project-scoped API tokens for CEO agents
## K3s + CloudNativePG Setup Complete
### Architecture
- **k3s-mgr** (188.34.201.101, 10.0.1.5) — K3s control plane, Hetzner ID 121365837
- **k3s-w1** (159.69.23.121, 10.0.1.6) — Worker, Hetzner ID 121365839
- **k3s-w2** (46.225.169.60, 10.0.1.7) — Worker, Hetzner ID 121365840
### Cluster Components
- K3s v1.34.4 (Traefik DaemonSet on workers, servicelb disabled)
- CloudNativePG 1.25.1 (operator in cnpg-system namespace)
- cert-manager 1.17.2 (Let's Encrypt ClusterIssuer)
- PostgreSQL 17.4 (CNPG managed, 2 instances, 1 primary + 1 replica)
- PgBouncer Pooler (CNPG managed, 2 instances, transaction mode)
### Namespaces
- postgres: CNPG cluster + pooler
- docfast: DocFast app deployment
- cnpg-system: CNPG operator
- cert-manager: Certificate management
### DocFast Deployment
- 2 replicas, one per worker
- Image: docker.io/library/docfast:latest (locally built + imported via k3s ctr)
- DB: main-db-pooler.postgres.svc:5432
- Health: /health on port 3100
- 53 API keys migrated from old server
### Key Learnings
- Docker images must be imported with `k3s ctr images import --all-platforms` (not `ctr -n k3s.io`)
- CNPG tolerations field caused infinite restart loop — removed to fix
- DB table ownership must be set to app user after pg_restore with --no-owner
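
The ownership fix after `pg_restore --no-owner` amounts to one `ALTER TABLE ... OWNER TO` per restored table. A small generator for those statements might look like this; the helper names and the example table names are hypothetical, and sequences or views would need the equivalent statements too.

```python
def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def owner_statements(tables: list[str], owner: str, schema: str = "public") -> list[str]:
    """Return ALTER TABLE statements handing each table to `owner`."""
    return [
        f"ALTER TABLE {quote_ident(schema)}.{quote_ident(table)} "
        f"OWNER TO {quote_ident(owner)};"
        for table in tables
    ]


if __name__ == "__main__":
    # Table names below are placeholders, not the real DocFast schema.
    for stmt in owner_statements(["api_keys", "usage_log"], "docfast"):
        print(stmt)
```
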
### Remaining
- [ ] Switch DNS docfast.dev → worker IP (159.69.23.121 or 46.225.169.60)
- [ ] TLS cert will auto-complete after DNS switch
- [ ] Update Stripe webhook endpoint if needed
- [ ] Set up CI/CD pipeline for automated deploys
- [ ] Create CEO namespace RBAC
- [ ] Decommission old server (167.235.156.214)
- [ ] Clean up Docker from workers (only containerd/K3s is needed)