snapapi: session 2 — prod live, bug fix, state update

Hoid 2026-02-19 10:48:06 +00:00
parent bb07c630f1
commit f7cda52b22
3 changed files with 93 additions and 16 deletions


@@ -1,3 +1,20 @@
# SnapAPI Bug Tracker
No bugs yet — product not built.
## Fixed
### BUG-001: Key cache not shared across pods (HIGH) — FIXED v0.1.1
- **Found:** Session 2
- **Impact:** ~50% of screenshot requests fail with 403 after signup when 2+ replicas are running
- **Fix:** Cache-aside pattern — check DB when key not in memory cache
- **Verified:** 6/6 requests succeed after signup on 2-replica prod deployment
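
A minimal sketch of the cache-aside lookup described in BUG-001, assuming a per-pod in-memory dictionary and a database lookup function. The names `KeyCache` and `load_from_db` are hypothetical and are not taken from the SnapAPI source; the plain dict stands in for the `api_keys` table.

```python
from typing import Callable, Optional


class KeyCache:
    """Per-pod API key cache that falls back to the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], Optional[dict]]) -> None:
        # Hypothetical DB lookup; in a real service this would query api_keys.
        self._load_from_db = load_from_db
        self._cache: dict[str, dict] = {}

    def get(self, api_key: str) -> Optional[dict]:
        record = self._cache.get(api_key)
        if record is not None:
            return record
        # Cache miss: another replica may have handled the signup, so ask the DB.
        record = self._load_from_db(api_key)
        if record is not None:
            self._cache[api_key] = record
        return record


# Two replicas share one database (a dict stands in for PostgreSQL here).
db: dict[str, dict] = {}
pod_a = KeyCache(db.get)
pod_b = KeyCache(db.get)

# The signup lands on pod A and is written to the database.
db["key-123"] = {"email": "user@example.com", "tier": "free"}

# The next request lands on pod B, which has never seen the key in memory.
print(pod_b.get("key-123") is not None)
```

Without the fallback, pod B would reject the key until restart, which matches the roughly 50% failure rate seen with two replicas.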
## Open
### BUG-002: No email verification on signup (MEDIUM)
- **Impact:** Anyone can create unlimited keys with fake emails
- **Mitigation:** Same email returns same key (dedup)
- **Status:** Deferred — needs email service setup
### BUG-003: No API key recovery (MEDIUM)
- **Impact:** Lost API key = create new account
- **Status:** Needs email verification first
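
The dedup mitigation in BUG-002 might look roughly like the sketch below. It assumes emails are normalized by trimming and lowercasing before lookup, which is an assumption and not confirmed by the source; `signup` and `keys_by_email` are hypothetical names.

```python
import secrets


def signup(email: str, keys_by_email: dict[str, str]) -> str:
    """Return the existing key for an email, or create one on first signup."""
    normalized = email.strip().lower()  # assumed normalization, not from the source
    existing = keys_by_email.get(normalized)
    if existing is not None:
        return existing
    api_key = secrets.token_urlsafe(32)
    keys_by_email[normalized] = api_key
    return api_key


store: dict[str, str] = {}
first = signup("user@example.com", store)
second = signup("User@Example.com ", store)
other = signup("fake1@example.com", store)
print(first == second, first == other)
```

The last call shows why dedup alone does not close the bug: every new unverified address still yields a fresh key.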


@@ -30,3 +30,49 @@
### Image on workers
- Imported manually via `docker save | ssh | k3s ctr images import` to both k3s-w1 and k3s-w2
- Uses `imagePullPolicy: IfNotPresent` since image is pre-loaded
## Session 2 — 2026-02-19
**Goal:** CI/CD pipeline, TLS, staging ingress, code review, bug fixes.
### What Was Done
1. **Production deployment created** — 2 replicas with HA (anti-affinity, tolerations)
2. **TLS certificate** — Let's Encrypt on snapapi.eu via cert-manager ✅
3. **Staging ingress** — Created for staging.snapapi.eu (pending DNS record)
4. **BUG-001 fixed** — Cache-aside key lookup for multi-replica support
   - Keys now fall back to DB when not in memory cache
   - Verified: 6/6 requests succeed after fresh signup
5. **Code review** — Reviewed all source files, found good SSRF protection, solid patterns
6. **Image v0.1.1 built and deployed** to both staging and production
7. **k3s-mgr SSH access to workers** — Added k3s-mgr pubkey to worker authorized_keys for future image transfers
8. **CI/CD workflow files** — Already written (deploy.yml + promote.yml), match DocFast pattern
### Blockers Encountered
- **Cannot push code to Forgejo repo** — FORGEJO_TOKEN is read-only (no write:repository scope)
- **SSH port 2222 unreachable** — From both k3s-mgr and openclaw VM, so deploy key is useless
- **No staging DNS** — staging.snapapi.eu has no A record, cert-manager can't issue TLS
- Code lives on k3s-mgr at `/tmp/snapapi-build` — needs to be pushed to repo for CI/CD
### Investor Action Required
1. Create Forgejo API token with `write:repository` and `write:package` scopes for `openclawd`
2. Add DNS A record: `staging.snapapi.eu` → `46.225.37.135` (same LB as production)
3. Either expose Forgejo SSH on port 2222 externally OR provide the write token from item 1 (the token is preferred)
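
Once the DNS record from item 2 is added, a quick check along these lines could confirm it before retrying certificate issuance. This is a sketch using only Python's standard `socket` module; `resolves_to` is a hypothetical helper, and the stub resolver below simulates the current state (no record) so the example runs without network access.

```python
import socket
from typing import Callable


def resolves_to(
    hostname: str,
    expected_ip: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if hostname resolves to expected_ip (IPv4 only)."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # socket.gaierror (a subclass of OSError) is raised when no record exists.
        return False


def missing_record(hostname: str) -> str:
    # Stub resolver that simulates a hostname with no A record.
    raise socket.gaierror("no A record")


print(resolves_to("staging.snapapi.eu", "46.225.37.135", resolve=missing_record))
```

Dropping the `resolve` argument performs a real lookup through the system resolver.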
### Investor Test — Session 2
1. **Would a stranger trust this product with their money right now?**
→ NO. Free tier works well (signup → key → screenshot in seconds). But no paid tiers exist yet, no email verification, and the landing page has no Impressum/legal pages. Functional but not trustworthy for paid use.
2. **If a pod crashed, would we lose customer data?**
→ NO. All data is in PostgreSQL (external to pods). In-memory key cache rebuilds from DB on startup. Usage data flushes every 5 seconds. Maximum loss: ~5 seconds of usage counters.
3. **Could someone abuse the free tier right now?**
→ PARTIALLY. Same email returns same key (good). But no email verification means someone could generate unlimited keys with fake email addresses. Rate limiting at 120 req/min per IP helps but doesn't fully prevent abuse.
4. **Can a paying customer recover a lost API key?**
→ NO. No key recovery flow. No email verification to prove ownership. This needs fixing before paid launch.
5. **Does every feature on the website actually work?**
→ YES for what's shown. Screenshot API works, signup works, docs are accurate. Pricing section shows plans but there's no actual payment flow yet.
**Honest Assessment:** The product WORKS for free tier users. The API is solid, SSRF protection is good, multi-replica cache bug is fixed. But NOT launch-ready for paid tiers. Still an impressive MVP for 2 sessions of work.
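
The "maximum loss of about 5 seconds of usage counters" claim in answer 2 follows from buffering counts in memory and writing them out periodically. A minimal sketch of that idea is below; `UsageBuffer` and `write_batch` are hypothetical names, and a real service would call `flush()` from a timer every 5 seconds, whereas here it is called by hand.

```python
import threading
from collections import Counter
from typing import Callable


class UsageBuffer:
    """Accumulates per-key request counts and writes them out in batches."""

    def __init__(self, write_batch: Callable[[dict[str, int]], None]) -> None:
        # Hypothetical sink; in a real service this would update the usage table.
        self._write_batch = write_batch
        self._counts: Counter[str] = Counter()
        self._lock = threading.Lock()

    def record(self, api_key: str) -> None:
        with self._lock:
            self._counts[api_key] += 1

    def flush(self) -> None:
        # Swap the buffer under the lock, then write outside it.
        with self._lock:
            batch, self._counts = dict(self._counts), Counter()
        if batch:
            self._write_batch(batch)


written: list[dict[str, int]] = []
buffer = UsageBuffer(written.append)
buffer.record("key-123")
buffer.record("key-123")
buffer.record("key-456")
buffer.flush()
print(written)
```

Anything recorded after the last flush lives only in pod memory, so a crash loses at most one flush interval of counts.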


@@ -1,20 +1,28 @@
{
"phase": "mvp-deployed",
"version": "0.1.0",
"phase": "production-live",
"version": "0.1.1",
"staging": {
"status": "running",
"namespace": "snapapi-staging",
"replicas": 1,
"image": "git.cloonar.com/openclawd/snapapi:v0.1.0",
"healthCheck": "passing"
"image": "git.cloonar.com/openclawd/snapapi:v0.1.1",
"healthCheck": "passing",
"ingress": "staging.snapapi.eu (PENDING DNS)"
},
"production": {
"status": "not-deployed"
"status": "running",
"namespace": "snapapi",
"replicas": 2,
"image": "git.cloonar.com/openclawd/snapapi:v0.1.1",
"healthCheck": "passing",
"domain": "https://snapapi.eu",
"tls": "Let's Encrypt (valid until 2026-05-20)"
},
"blockers": [
"No domain registered yet — need investor to register domain",
"No Forgejo write token — cannot push to git repo or registry via CI/CD. Need a PAT with write:repository and write:package scopes",
"CI/CD not functional until KUBECONFIG and REGISTRY_TOKEN secrets are set in Forgejo repo"
"FORGEJO_TOKEN is read-only — cannot push code to repo. Need write:repository scope token",
"SSH port 2222 not reachable from k3s-mgr or openclaw VM — deploy key useless without it",
"staging.snapapi.eu DNS record not set — cert-manager can't issue TLS cert",
"CI/CD pipeline written but untested (can't push to trigger it)"
],
"completed": [
"Core screenshot API (POST /v1/screenshot)",
@@ -25,17 +33,23 @@
"PostgreSQL DB integration (api_keys + usage tables)",
"Usage tracking with per-key limits",
"Landing page with docs",
"Docker image built and deployed to staging",
"K8s deployment + service in snapapi-staging namespace"
"Production deployment (2 replicas, HA, anti-affinity)",
"Production TLS (Let's Encrypt) on snapapi.eu",
"Staging deployment (1 replica)",
"Staging ingress (pending DNS)",
"Cache-aside key lookup (multi-replica fix)",
"CI/CD workflow files (deploy.yml + promote.yml) — ready but untested"
],
"notDone": [
"Email verification (signup gives key directly for now)",
"Stripe billing integration",
"Paid tier management",
"Production deployment",
"Domain + Traefik IngressRoute",
"CI/CD pipeline (workflows written but not functional)",
"Git repo has no code (push access blocked)"
"CI/CD pipeline (blocked on git push access)",
"Staging TLS (blocked on DNS)",
"API key recovery flow",
"Rate limiting per-key fairness",
"Status page",
"Uptime monitoring"
],
"lastSession": "2026-02-18T20:45:00Z"
"lastSession": "2026-02-19T10:50:00Z"
}