session 137: BUG-106/107 fixed, multi-pod cache consistency
parent 3ec1f57a9b
commit a85cf6685f
5 changed files with 57 additions and 8 deletions
@@ -1,3 +1,19 @@
+## BUG-107: Recover route uses in-memory cache only — recovery fails silently across pods
+- **Date:** 2026-03-06
+- **Severity:** MEDIUM
+- **Issue:** `POST /v1/recover` and `/v1/recover/verify` use `getAllKeys()` (in-memory cache) to find a user's key by email. In a 2-replica setup, if the key was created or email changed on another pod, the cache is stale. Recovery silently returns "recovery_sent" without actually sending an email (because it doesn't find the key), or verify returns "No API key found" despite the key existing in DB.
+- **Impact:** Users may be unable to recover their API key if they hit the "wrong" pod. Silent failure — no error shown.
+- **Fix:** Fall back to DB query when in-memory lookup fails.
+- **Status:** ✅ FIXED — commit b964b98. DB fallback in recover/verify endpoint. 2 TDD tests added (recover-db-fallback.test.ts). 520 tests total.
+
+## BUG-106: downgradeByCustomer only checks in-memory cache — cancellations can silently fail
+- **Date:** 2026-03-06
+- **Severity:** HIGH
+- **Issue:** `downgradeByCustomer()` in `src/services/keys.ts` only checks `keysCache` (in-memory). In 2-replica production, if a Stripe cancellation webhook hits a pod that doesn't have the key cached (pod restart, key created on other pod), the function returns `false` without checking DB. Customer keeps Pro access despite canceling their subscription.
+- **Impact:** Revenue leakage — canceled customers retain Pro tier indefinitely. Silent failure with no error log.
+- **Fix:** Add DB fallback: query `api_keys` table by `stripe_customer_id` when not found in cache, then update tier in DB and hydrate local cache.
+- **Status:** ✅ FIXED — commit b964b98. DB fallback + cache hydration. 2 TDD tests added (keys-downgrade.test.ts). 520 tests total.
+
 ## BUG-105: Go and PHP examples show non-existent SDK code
 - **Date:** 2026-03-05
 - **Severity:** MEDIUM
@@ -1,5 +1,26 @@
 # Session Log
 
+## Session 137 — 2026-03-06 19:00 UTC (Friday Evening)
+- **Production:** v0.5.1 ✅ healthy, 2 replicas, 0 restarts, ~8d uptime
+- **Staging:** v0.5.2 ✅ commit b964b98 (46+ commits ahead of prod)
+- **K8s cluster:** All 3 nodes Ready
+- **Support:** Zero tickets
+- **Completed:**
+1. **Codebase audit — multi-pod cache consistency** — Identified two bugs where in-memory cache-only lookups silently fail in multi-replica deployments.
+2. **BUG-106 fix (TDD): downgradeByCustomer DB fallback** — `downgradeByCustomer()` now queries the DB when cache misses, preventing canceled Stripe customers from retaining Pro access. Cache hydrated on fallback path. 2 TDD tests added.
+3. **BUG-107 fix (TDD): recover route DB fallback** — `POST /v1/recover/verify` now falls back to DB when in-memory cache doesn't contain the email. Prevents silent recovery failures across pods. 2 TDD tests added.
+4. **Infrastructure health check** — All 3 K8s nodes Ready, both prod replicas healthy, DB connected (PostgreSQL 17.4), browser pool 15/15.
+- **Total tests:** 520 (all passing, 0 errors), 38 test files
+- **Open bugs:** ZERO 🎉
+- **CI runner:** Still absent. Managed by Cloonar — needs investor action.
+- **Investor test:**
+1. Would a stranger trust this with money? Yes ✅
+2. Pod crash = data loss? No — CNPG WAL archiving + MinIO ✅
+3. Free tier abuse? No — removed, demo rate-limited ✅
+4. Pro key recovery? Yes — now with DB fallback across pods ✅
+5. Every feature works? Yes ✅
+- **Recommendation:** Staging v0.5.2 production-ready. 46+ commits ahead with 520 tests. Awaiting investor approval for production tag.
+
 ## Session 136 — 2026-03-06 16:00 UTC (Friday Late Afternoon)
 - **Production:** v0.5.1 ✅ healthy, 2 replicas, 0 restarts, ~8d uptime
 - **Staging:** v0.5.2 ✅ commit 4473641 (45+ commits ahead of prod)
@@ -3,7 +3,7 @@
 "phaseLabel": "Build Production-Grade Product",
 "status": "launch-ready",
 "product": "DocFast — HTML/Markdown to PDF API",
-"currentPriority": "Production on v0.5.1. Staging v0.5.2 (45+ commits ahead). npm audit 0 vulns. 516 tests passing (36 files). ZERO open bugs. ZERO unhandled test errors (fixed timer leaks). Ready for production tag when investor approves.",
+"currentPriority": "Production on v0.5.1. Staging v0.5.2 (46+ commits ahead). npm audit 0 vulns. 520 tests passing (38 files). ZERO open bugs. Multi-pod cache consistency fixed (BUG-106, BUG-107). Ready for production tag when investor approves.",
 "ownerDirectives_PRIORITY": "Process these IN ORDER. Do not skip. Remove items marked ✅ DONE/FIXED during housekeeping.",
 "ownerDirectives": [
 "Stripe Product ID for DocFast: prod_TygeG8tQPtEAdE — webhook handler must filter by this product_id to ignore events from other projects on the same Stripe account."
@@ -83,7 +83,7 @@
 "LOW": [],
 "note": "All bugs resolved. BUG-105 fixed 4f6659c. BUG-104 fixed 503e651. BUG-103 (template validation bypass) fixed 47571c8. BUG-102 (sanitized options ignored) fixed ba2e542. BUG-101 (body limits) fixed c03f217. BUG-100 (flush poisoning) fixed d2f819d. BUG-099 (memory leak) fixed 5f776db. BUG-098 (interceptor leak) fixed 024fa00."
 },
-"sessionCount": 136
+"sessionCount": 137
 },
 "blockers": [],
 "startDate": "2026-02-14"
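
Note on the BUG-106 fix: the "DB fallback + cache hydration" described in the bug entry can be illustrated with a minimal sketch. The actual code in `src/services/keys.ts` is not part of this diff, so everything below except the names `downgradeByCustomer` and `keysCache` is an assumption: the `ApiKey` shape, the `KeyStore` interface, the in-memory stand-in for the `api_keys` table, and the tier names `"pro"` and `"free"` are all hypothetical.

```typescript
// Hypothetical tier names; the diff only says the customer should lose Pro access.
type Tier = "pro" | "free";

// Hypothetical row shape for the api_keys table.
interface ApiKey {
  key: string;
  email: string;
  tier: Tier;
  stripeCustomerId?: string;
}

// Assumed database interface; the project's real DB layer is not shown in the commit.
interface KeyStore {
  findByCustomerId(customerId: string): Promise<ApiKey | undefined>;
  updateTier(key: string, tier: Tier): Promise<void>;
}

// In-memory stand-in so the sketch runs without PostgreSQL.
class InMemoryKeyStore implements KeyStore {
  private rows: ApiKey[];

  constructor(rows: ApiKey[]) {
    this.rows = rows;
  }

  async findByCustomerId(customerId: string): Promise<ApiKey | undefined> {
    const row = this.rows.find((r) => r.stripeCustomerId === customerId);
    // Return a copy so the cache and the "database" never share an object.
    return row ? { ...row } : undefined;
  }

  async updateTier(key: string, tier: Tier): Promise<void> {
    const row = this.rows.find((r) => r.key === key);
    if (row) row.tier = tier;
  }
}

// Per-pod cache: every replica holds its own copy, so it can be stale or empty.
const keysCache: ApiKey[] = [];

async function downgradeByCustomer(
  store: KeyStore,
  customerId: string
): Promise<boolean> {
  let entry = keysCache.find((k) => k.stripeCustomerId === customerId);

  if (!entry) {
    // Cache miss: the key may have been created on another pod, so ask the DB
    // instead of returning false right away.
    entry = await store.findByCustomerId(customerId);
    if (!entry) return false; // customer is unknown in the DB as well
    keysCache.push(entry); // hydrate the local cache
  }

  // Persist first, then update the cached copy, so a failed write
  // does not leave the cache claiming a downgrade that never happened.
  await store.updateTier(entry.key, "free");
  entry.tier = "free";
  return true;
}
```

Called from a cancellation webhook handler as `await downgradeByCustomer(store, customerId)`, this version returns `false` only when the customer is missing from both the cache and the store. The same lookup order (cache first, then DB) would apply to the email lookup described for BUG-107.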