MEMORY.md - Long-Term Memory

Lessons Learned

  • CEO sessions need a 1-hour timeout (runTimeoutSeconds: 3600). The 10-minute default is far too short because CEOs hire sub-agents for long-running tasks. Always set the timeout explicitly.
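
One way to make the explicit timeout hard to forget is to build session parameters through a small helper. This is only a sketch: `runTimeoutSeconds` and its two values come from the note above, while the helper name and the `task` field are hypothetical, since the real spawn call is not recorded here.

```python
DEFAULT_RUN_TIMEOUT_SECONDS = 600   # the 10-minute default mentioned above
CEO_RUN_TIMEOUT_SECONDS = 3600      # 1 hour, as recommended for CEO sessions


def ceo_session_params(task: str, run_timeout_seconds: int = CEO_RUN_TIMEOUT_SECONDS) -> dict:
    """Build parameters for a CEO session with the timeout always present.

    `task` is a hypothetical field; only `runTimeoutSeconds` is from the note.
    """
    if run_timeout_seconds <= DEFAULT_RUN_TIMEOUT_SECONDS:
        raise ValueError("CEO sessions need more than the 10-minute default")
    return {"task": task, "runTimeoutSeconds": run_timeout_seconds}


params = ceo_session_params("launch SnapAPI")
print(params["runTimeoutSeconds"])  # 3600
```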

Product Ideas & Future CEOs

  • projects/ideas/product-ideas.md — All product ideas + SnapAPI CEO setup plan
  • Selected next product: SnapAPI (Screenshot API) — ready to launch when user says go

K3s Infrastructure (2026-02-18)

  • 3-node K3s cluster: k3s-mgr (188.34.201.101), k3s-w1 (159.69.23.121), k3s-w2 (46.225.169.60)
  • Hetzner LB: 46.225.37.135 (ID 5834131)
  • CNPG PostgreSQL 17.4 (2 instances) + PgBouncer pooler in postgres namespace
  • DocFast: docfast (prod, 2 replicas) + docfast-staging (1 replica)
  • CI/CD: push main→staging, tag v*→prod via Forgejo registry
  • Deployer SA with namespace-scoped RBAC (no secret access)
  • Forgejo needs PAT for registry (GITHUB_TOKEN lacks package scope)
  • Old server (167.235.156.214) kept for git push + SMTP relay only
  • Total infra cost: €17.06/mo (3x CAX11 + LB)
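
The CI/CD routing rule in the list above (push to main goes to staging, a v* tag goes to prod) can be sketched as a small function. The function name and the use of full Git ref strings are assumptions for illustration; docfast and docfast-staging are the names from the list, treated here as deploy targets. The actual pipeline definition is not recorded in these notes.

```python
from fnmatch import fnmatchcase
from typing import Optional


def deploy_target(git_ref: str) -> Optional[str]:
    """Map a pushed Git ref to a deploy target, or None if nothing deploys."""
    if git_ref == "refs/heads/main":
        return "docfast-staging"          # push to main -> staging
    if fnmatchcase(git_ref, "refs/tags/v*"):
        return "docfast"                  # tag v* -> prod
    return None                           # other branches and tags are ignored


for ref in ("refs/heads/main", "refs/tags/v0.2.7", "refs/heads/feature-x"):
    print(ref, "->", deploy_target(ref))
```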

K3s HA Hardening (2026-02-18)

  • CoreDNS: 3 replicas with podAntiAffinity (one per node). Previously a single replica and therefore a SPOF: all DNS broke when its node died.
  • CNPG operator: 2 replicas with topologySpreadConstraints (w1 + w2). Previously a single replica, a SPOF that prevented DB failover.
  • PgBouncer pooler: requiredDuringSchedulingIgnoredDuringExecution anti-affinity via the Pooler CRD template (w1 + w2). Previously both instances landed on the same node.
  • DocFast prod: preferredDuringSchedulingIgnoredDuringExecution anti-affinity to spread replicas across workers
  • App v0.2.7: client.release(true) destroys dead pool connections on transient errors
  • HA test PASSED: Shut down either worker → prod stays up, DB failover works, 4/4 health checks pass over 3 minutes
  • Root causes found: CoreDNS (1 replica), CNPG operator (1 replica), PgBouncer (both same node), app dead connections
  • Note: Staging runs 1 replica, so it is not HA by design. The CoreDNS replica count may not persist across K3s upgrades; check after updates.
  • Note: Deployment patches to system components (CoreDNS, CNPG operator) are runtime changes. Document in infra notes so they can be re-applied if needed.
  • Note: The CNPG Pooler CRD supports spec.template.spec.affinity, but the template must also include the containers field (name + image of pgbouncer)
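
To make the scheduling rules above easier to re-apply, the fragments can be generated rather than hand-written. The sketch below builds the standard Kubernetes affinity and topologySpreadConstraints structures as plain dicts. The label selectors (k8s-app=kube-dns for CoreDNS, cnpg.io/poolerName for the pooler), the pooler name, and the image are assumptions or placeholders, and the choice of a required rule for CoreDNS is a guess; check them against the live objects before use.

```python
import json

HOSTNAME_KEY = "kubernetes.io/hostname"  # well-known node label, one value per node


def pod_anti_affinity(match_labels: dict, required: bool) -> dict:
    """Return an `affinity` block that keeps matching pods on separate nodes."""
    term = {
        "labelSelector": {"matchLabels": match_labels},
        "topologyKey": HOSTNAME_KEY,
    }
    if required:
        # Hard rule: the scheduler refuses to co-locate matching pods.
        rule = {"requiredDuringSchedulingIgnoredDuringExecution": [term]}
    else:
        # Soft rule: spread when possible, but still schedule if it is not.
        rule = {
            "preferredDuringSchedulingIgnoredDuringExecution": [
                {"weight": 100, "podAffinityTerm": term}
            ]
        }
    return {"podAntiAffinity": rule}


def topology_spread(match_labels: dict) -> list:
    """Return `topologySpreadConstraints` allowing at most 1 pod of skew per node."""
    return [{
        "maxSkew": 1,
        "topologyKey": HOSTNAME_KEY,
        "whenUnsatisfiable": "DoNotSchedule",
        "labelSelector": {"matchLabels": match_labels},
    }]


# Assumed label: the CoreDNS pods are selected with k8s-app=kube-dns.
coredns_patch = {
    "spec": {
        "replicas": 3,
        "template": {
            "spec": {"affinity": pod_anti_affinity({"k8s-app": "kube-dns"}, required=True)}
        },
    }
}

# Pooler template per the last note: affinity plus the containers field.
# The pooler name and image below are placeholders, not real values.
pooler_template = {
    "spec": {
        "containers": [{"name": "pgbouncer", "image": "<pgbouncer-image>"}],
        "affinity": pod_anti_affinity({"cnpg.io/poolerName": "<pooler-name>"}, required=True),
    }
}

print(json.dumps(coredns_patch, indent=2))
```

The printed JSON has the shape of a merge patch, so it could plausibly be passed to `kubectl patch deployment ... --type merge -p`. Keeping the builders in a script gives a repeatable way to restore the runtime patches if an upgrade resets them.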

Game Save Files

  • memory/d2r.json — Diablo II: Resurrected progress (Necro "Baltasar", Summoner build)
  • memory/bg3.json — Baldur's Gate 3 progress (Act 1, level 3)