Implement self-hosted PowerSync on web-arm (Cloonar fit) #38
Cloonar/nixos#38
Blocked by #37: do not begin implementation until that issue is closed. The artifacts produced there (Supabase role + publication, `reptide-powersync-source-dsn` sops secret, sync rules export, Flutter URL constant location) are inputs to the work described here.
Problem Statement
The Cloonar fit Flutter app currently syncs against PowerSync Cloud, paying a monthly subscription for a service that fits cleanly into the existing `web-arm` fleet host. The goal is to retire the Cloud dependency and run PowerSync Service ourselves, replicating from the same Supabase Postgres source, storing bucket state in `web-arm`'s existing Postgres 14 instance, and serving sync clients at `powersync.reptide.eu`.
Solution
Add a self-contained NixOS module to `web-arm` that runs `journeyapps/powersync-service` as a `virtualisation.oci-containers` (podman) container, matching the existing `collabora.nix` / `rustdesk.nix` / `pa11y.nix` pattern, with:
- A `powersync_storage` Postgres database + role on the existing PG14 instance.
- An nginx vhost on `powersync.reptide.eu` with lego/ACME TLS and WebSocket-aware proxy config.
- The image pinned by tag and `sha256` digest, bumped via a scripted `update.sh` helper following the `turn-runner` cadence.
- A blackbox-exporter probe against the `/probes/liveness` endpoint.
The Flutter SDK keeps using its existing Supabase user JWTs unchanged; PowerSync verifies them via Supabase's published JWKS URI. There is no shared HS256 secret to store and no client code change at the auth layer.
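As a rough illustration of the module shape described above, the container, storage bootstrap, and vhost could be sketched as follows. This is a minimal sketch, not the final module: the container name `powersync` is inferred from the `podman-powersync.service` unit name used elsewhere in this issue, the `<tag>` and `<digest>` placeholders are left unresolved on purpose, and `ensureDBOwnership` assumes a nixpkgs release that provides that option. The mounts, the secret wiring, and the rendered service config are omitted here.

```nix
{ ... }:

{
  virtualisation.oci-containers.containers.powersync = {
    # Placeholders: the real tag and digest are selected at implementation time.
    image = "journeyapps/powersync-service:<tag>@sha256:<digest>";
    # Publish on loopback only, so nginx is the sole ingress path.
    ports = [ "127.0.0.1:8080:8080" ];
  };

  services.postgresql = {
    ensureDatabases = [ "powersync_storage" ];
    ensureUsers = [
      {
        name = "powersync_storage";
        # Assumption: this option exists in the nixpkgs release in use.
        # It makes the role the owner of the same-named database.
        ensureDBOwnership = true;
      }
    ];
    # shared_buffers is bumped in the host's existing settings block,
    # not here, to avoid defining the same option twice.
  };

  services.nginx.virtualHosts."powersync.reptide.eu" = {
    enableACME = true;
    forceSSL = true;
    acmeRoot = null;
    locations."/" = {
      proxyPass = "http://127.0.0.1:8080";
      proxyWebsockets = true;
      # Keep long-lived WebSocket sync connections open past nginx's 60s default.
      extraConfig = ''
        proxy_read_timeout 3600s;
        proxy_send_timeout 3600s;
      '';
    };
  };
}
```

Keeping all three pieces in one file mirrors the colocated style the existing container modules are said to use; whether the real module splits them further is a design choice for the implementer.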
User Stories
- `web-arm` as a podman container under `virtualisation.oci-containers`, so that it slots into the existing systemd-managed container pattern without introducing a new runtime concept.
- `sha256` digest, so that upgrades are deliberate PR-reviewed events and a surprise upstream re-tag can't ship to my fleet.
- `update.sh` helper under `utils/pkgs/powersync-service/` that resolves Docker Hub's `:latest` (or a specified tag) to `<tag>@<sha256>` and rewrites the .nix file, so that bumps follow the same scripted shape as `turn-runner` bumps.
- `powersync_storage` database on the existing PG14 instance with its own `powersync_storage` role, so that backup (via `services.postgresqlBackup`), monitoring, and lifecycle all stay inside the existing Postgres operational story.
- `shared_buffers` raised from 80MB to 512MB on `web-arm`'s PG14 in the same change, so that PowerSync's bucket-heavy write workload doesn't thrash a tiny page cache.
- `8080` bound to `127.0.0.1` only, so that nginx is the sole ingress path and the service isn't reachable on the LAN even if a downstream firewall rule loosens.
- `powersync.reptide.eu` with `enableACME` + `forceSSL`, `proxyWebsockets = true`, and `proxy_read_timeout 3600s`, so that the Flutter SDK's long-lived WebSocket sync connections aren't killed by nginx's default 60s timeout.
- `reptide-powersync-source-dsn` (created by #37), so that the agent never touches secret material directly.
- `client_auth.jwks_uri` pointing at the Supabase project's `/.well-known/jwks.json`, so that Flutter clients keep using their existing Supabase user JWTs unchanged through cutover.
- `https://powersync.reptide.eu/probes/liveness`, so that a wedged or crash-looping container triggers alerts before the Supabase replication slot starts accumulating WAL on the source side.
- `web-arm`'s `configuration.nix` in the same shape as existing `./modules/<service>.nix` imports, so that the host's import block stays consistent.
- `service.yaml` derivable from a single visible source in the repo (rendered by Nix or kept as a static YAML in the module dir), so that the running config is auditable without `podman exec` into the container.
- `git blame`.
Implementation Decisions
Module shape: a single new module under the `web-arm` host (alongside `collabora.nix`, `rustdesk.nix`, etc.) that colocates the oci-containers definition, the rendered service config, the nginx vhost, the postgres role/database bootstrap, the sops secret reference, and the blackbox probe entry. The sync rules YAML lives as a sibling file inside the module directory and is mounted read-only.
Postgres storage bootstrap: a new role `powersync_storage` (no replication, no superuser) and a new database `powersync_storage` owned by that role, both declared via `services.postgresql.ensureUsers` / `ensureDatabases`. PowerSync auto-migrates its schema on first startup, so no manual DDL is needed.
Postgres tuning: bump `shared_buffers` from `"80MB"` to `"512MB"` in the existing `services.postgresql.settings` block on the host. This requires a Postgres restart, which `nixos-rebuild switch` performs automatically; note it in the commit message because it briefly interrupts every PG-using service on the host.
Image pinning: image string is `journeyapps/powersync-service:<tag>@sha256:<digest>` with both components pinned. No `--pull=newer`. The current latest stable tag should be selected at implementation time. Conservative tag policy: don't pick up major or minor bumps automatically; patch versions can be bumped via the helper script in a normal PR.
`update.sh` helper: under `utils/pkgs/powersync-service/` (mirroring `utils/pkgs/<package>/update.sh` from the `turn-runner` and `claude-code` packages, even though this isn't a Nix derivation). It resolves a tag to `<tag>@<sha256>` via Docker Hub's manifest API and rewrites the image string in the module's .nix file. No actual derivation under that path, just the script.
Container config: port `8080` bound to `127.0.0.1` only; sync rules and the rendered service.yaml mounted read-only; environment file from the sops secret containing `PS_SOURCE_DSN`; `extraOptions` consistent with the existing podman containers (no `--pull=newer`).
Service.yaml content (rendered by `pkgs.writeText` with Nix interpolation for the JWKS URI; the DSN is read from env at runtime via `!env PS_SOURCE_DSN`):
- `replication.connections[0].type: postgresql`, URI from env, `slot_name: powersync_selfhost`.
- `storage.type: postgresql`, URI pointing at the local PG via unix socket against the new `powersync_storage` database.
- `sync_rules.path` pointing at the mounted sync-rules.yaml.
- `client_auth.jwks_uri` = `https://<supabase-project>.supabase.co/auth/v1/.well-known/jwks.json`. The project subdomain is not secret (it's in every public JWKS URL), so hard-coding it in the module is acceptable.
- `client_auth.audience: ["authenticated"]`.
- `api.port: 8080`.
Nginx vhost lives in the same new module (mirroring how `collabora.nix` colocates its vhost): `enableACME = true; forceSSL = true; acmeRoot = null;`; `locations."/"` proxying to `http://127.0.0.1:8080`; `proxyWebsockets = true`; `extraConfig` setting `proxy_read_timeout 3600s; proxy_send_timeout 3600s;`. No Authelia ForwardAuth: `powersync.reptide.eu` is not in any Authelia `access_control` rule (those are scoped to `*.cloonar.com`), and the Flutter SDK couldn't satisfy a browser-cookie auth challenge anyway.
sops integration: `sops.secrets.reptide-powersync-source-dsn` references the entry the operator placed in `hosts/web-arm/secrets.yaml` via #37. `restartUnits` includes the podman service so secret rotation triggers a container restart.
blackbox-exporter probe: an additional entry against the new vhost added to the existing `hosts/web-arm/modules/blackbox-exporter.nix`. Probe interval and module match the existing entries.
No firewall changes: `fw` already forwards `443` to `web-arm` for the 30+ existing vhosts.
No reuse module under `utils/modules/`: PowerSync is single-tenant on `web-arm` only. If a second host ever needs PowerSync, extract then.
Testing Decisions
This is NixOS infrastructure config; "tests" mean dry-build + functional verification, not unit tests. There is no prior art for testing OCI-container services in this repo beyond the pre-commit dry-build gate: `collabora.nix`, `rustdesk.nix`, and `pa11y.nix` all rely exclusively on the same workflow.
Dry-build gate (pre-commit hook): `git commit` invokes `scripts/pre-commit` → `scripts/test-configuration web-arm` → `nixos-rebuild dry-build`. The change must build clean before the commit lands. No `--no-verify`.
Functional verification after deploy (operator-driven, on `web-arm`):
- `systemctl status podman-powersync.service` shows `active (running)`, no recent restarts.
- `journalctl -u podman-powersync.service` shows replication start, slot creation, and initial backfill messages without errors.
- `SELECT * FROM pg_replication_slots WHERE slot_name = 'powersync_selfhost';` shows `active = t`.
- `curl -sf https://powersync.reptide.eu/probes/liveness` from anywhere returns 200.
- A build pointed at `powersync.reptide.eu` produces identical data to a build still on PowerSync Cloud (operator-driven, requires #37's prereqs done).
Out of Scope
- `utils/modules/powersync.nix` shared across hosts. PowerSync runs on `web-arm` only; if a second tenant ever needs it, extract then.
Further Notes
- `/grill-with-docs` interview before this issue was created; that conversation walked through and resolved every decision branch (storage DB = Postgres on existing PG14, runtime = podman/oci-containers, image pinning = tag + digest with scripted bumps, auth = Supabase JWKS URI, hostname = `powersync.reptide.eu`, sync rules location, source replication setup, hard-cutover migration plan).
- `docs/adr/0007-self-hosted-powersync-on-web-arm.md`. Not a blocker for landing the PR.
- `chore(web-arm): bump powersync-service to <tag>` matching the existing `chore(web-arm): bump turn-runner to <rev>` cadence.
Agent Brief: update (supersedes the source-connection guidance in the issue body)
This issue was written before two changes landed on `main` that affect the source connection and the container networking. The prereq #37 is closed; the items below replace its DSN guidance and add a hard networking requirement. Everything else in the issue body still stands.
Category: enhancement (unchanged)
What changed since the body was written
1. The replication source must be Supabase's direct endpoint, not the pooler.
PowerSync replicates over the Postgres logical-replication protocol, which Supabase's pooler (Supavisor) cannot carry: it proxies ordinary SQL but rejects the replication handshake (`IDENTIFY_SYSTEM` → syntax error) while plain `SELECT` still succeeds. Confirmed empirically: `IDENTIFY_SYSTEM` returns a row against `db.majxbigjafpzayzboxsf.supabase.co:5432` in replication mode, and syntax-errors through the pooler. This resolves the open risk that #37's closing note handed to this issue.
- The `reptide-powersync-source-dsn` sops secret must hold the direct DSN: `postgresql://powersync_selfhost:<pw>@db.majxbigjafpzayzboxsf.supabase.co:5432/postgres?sslmode=require`.
- Do not include `replication=database` in the stored DSN: PowerSync opens its own replication connection and also uses the URI for normal queries; baking the flag in breaks the latter. (Operator pre-deploy step; the agent never edits secrets.)
2. The direct endpoint is IPv6-only, so the container must use the `v6egress` network.
`db.<project>.supabase.co` resolves to AAAA only. Host outbound IPv6 exists (ADR-0010) but does not reach default-bridge containers. PR #86 / ADR-0011 added an opt-in dual-stack podman network named `v6egress` (NAT66 from a ULA to the host GUA) created by a oneshot unit named `init-v6egress-network.service`. The PowerSync container must join that network and order after that unit, or it lands on the IPv4-only default bridge and silently fails to reach the source. The dry-build will not catch this.
- `extraOptions` with `--network=v6egress`.
- `podman-powersync` unit `after` + `requires` `init-v6egress-network.service`.
- `127.0.0.1:8080` for nginx is independent of the egress network and still works (ingress vs egress are separate paths).
Added acceptance criteria (on top of the issue body's)
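- The source DSN targets `db.majxbigjafpzayzboxsf.supabase.co:5432` (direct), not a `*.pooler.supabase.com` host.
- The stored DSN does not contain `replication=database`.
- The container joins the `v6egress` network and its unit is ordered after `init-v6egress-network.service`.
- The replication slot `powersync_selfhost` is active (`pg_replication_slots.active = t` on Supabase).
Unchanged / still in scope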
Everything else in the issue body: the podman/oci-containers module shape, the `powersync_storage` DB + role, the `shared_buffers` 80MB→512MB bump, the nginx vhost on `powersync.reptide.eu` (WebSocket + 3600s timeouts), image tag+digest pinning with the `update.sh` helper, JWKS auth via the Supabase project, the read-only sync-rules mount, and the blackbox liveness probe.
Out of scope (unchanged)
Supabase-side role/publication/secret material (#37, done), sync-rules authoring, the Flutter cutover release, and PowerSync Cloud teardown.
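The networking and secret requirements from this brief could be wired together roughly as below. This is a sketch under stated assumptions rather than the final module: the container name `powersync` is inferred from the `podman-powersync` unit name, the secret is assumed to decrypt to an env file that defines `PS_SOURCE_DSN`, and the secret is assumed to live in the host's default sops file. The image, port, and volume settings are defined elsewhere and omitted here.

```nix
{ config, ... }:

{
  sops.secrets.reptide-powersync-source-dsn = {
    # Rotating the secret restarts the container so it picks up the new DSN.
    restartUnits = [ "podman-powersync.service" ];
  };

  virtualisation.oci-containers.containers.powersync = {
    # Assumption: the decrypted file is an env file containing PS_SOURCE_DSN,
    # holding the direct (non-pooler) DSN without replication=database.
    environmentFiles = [
      config.sops.secrets.reptide-powersync-source-dsn.path
    ];
    # Join the opt-in dual-stack network so the IPv6-only source is reachable.
    extraOptions = [ "--network=v6egress" ];
  };

  # Start only after the v6egress network exists; without this ordering the
  # container could come up before the network has been created.
  systemd.services.podman-powersync = {
    after = [ "init-v6egress-network.service" ];
    requires = [ "init-v6egress-network.service" ];
  };
}
```

Because a missing network or ordering dependency only fails at runtime, these settings need to be confirmed during functional verification on the host; the dry-build gate will pass either way.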