feat(web-arm): fetch PowerSync sync rules live from Cloonar/fit instead of vendoring #95
Reference
Cloonar/nixos#95
Goal
powersync.reptide.eu (self-hosted PowerSync on web-arm, ADR-0012) currently mounts a vendored copy of the sync rules at hosts/web-arm/modules/powersync/sync-rules.yaml, hand-copied from the app repo on every change. Make the deployed service source its sync rules live from the app repo so app-team changes reach prod with no nixos edit.

Source of truth becomes Cloonar/fit:powersync/sync-rules.yaml on master (ssh://forgejo@git.cloonar.com/Cloonar/fit.git). A merge to fit:master is already human-reviewed, so that review, not a nixos PR, is the gate.

Scope: sync rules only. service.yaml stays nix-rendered (it carries the sops DSN / storage / JWKS); do not change how it is produced.

Design (decided)
Runtime fetch-and-restart, not an eval-time fetchGit. (An unpinned eval-time fetch would couple every web-arm rebuild, and the local pre-commit dry-build, to git.cloonar.com availability and break reproducibility; rejected.)

- powersync-syncrules-fetch.service (oneshot, runs as root) + a .timer every 5 min.
- Fetch Cloonar/fit:powersync/sync-rules.yaml@master over HTTPS via Forgejo's raw API using a read-only token (sops secret reptide-powersync-syncrules-token, already created; see Secret). Expected endpoint (verify against the running Forgejo): GET https://git.cloonar.com/api/v1/repos/Cloonar/fit/raw/powersync/sync-rules.yaml?ref=master, header Authorization: token <T>.
- Validate the fetched file before swapping it in; if the journeyapps/powersync-service image exposes a cheap sync-rules validate/dry-run subcommand, run it too (verify; the liveness step backstops it if absent).
- Snapshot the current file to sync-rules.yaml.prev, write the new file, then systemctl try-restart podman-powersync.service. Use try-restart (no-op when the container is not running, e.g. at boot). PowerSync re-reads rules only on restart; it does not hot-reload.
- After the restart, poll the liveness probe (http://127.0.0.1:8080/probes/liveness) until healthy or an N-second timeout; if it does not go healthy, restore .prev, restart again, and exit non-zero so it pages. Net effect: a reviewed-but-bad-for-this-deploy file self-heals to last-good and still alerts.

File & persistence
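The snapshot, swap, restart, and rollback steps from the Design list all act on the file described in this section. A minimal bash sketch of that sequence follows. The function names, the 60-second default for the "N-second timeout", the skip-when-unchanged check, and the skip-polling-when-the-container-is-not-running check are all assumptions made for illustration, not decisions recorded in this issue.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Paths and probe URL come from the issue text; everything else is hypothetical.
RULES_DIR="${RULES_DIR:-/var/lib/powersync}"
LIVE="$RULES_DIR/sync-rules.yaml"
PREV="$LIVE.prev"
PROBE_URL="${PROBE_URL:-http://127.0.0.1:8080/probes/liveness}"
TIMEOUT_S="${TIMEOUT_S:-60}"   # assumed value for the N-second timeout

restart_container() { systemctl try-restart podman-powersync.service; }
container_running() { systemctl is-active --quiet podman-powersync.service; }

# Poll the liveness probe once per second until it succeeds or the timeout expires.
wait_healthy() {
  local deadline=$(( SECONDS + TIMEOUT_S ))
  while (( SECONDS < deadline )); do
    if curl --fail --silent --max-time 5 --output /dev/null "$PROBE_URL"; then
      return 0
    fi
    sleep 1
  done
  return 1
}

# swap_rules NEW_FILE: snapshot the live file, install NEW_FILE, restart,
# and restore the snapshot if the container does not become healthy.
swap_rules() {
  local new="$1" had_prev=0

  # Assumption: identical content needs no restart.
  if [[ -f "$LIVE" ]] && cmp -s "$new" "$LIVE"; then
    return 0
  fi

  if [[ -f "$LIVE" ]]; then
    cp -f "$LIVE" "$PREV" || return 1
    had_prev=1
  fi

  # Write to a temp name first so the live path never holds a partial file.
  install -m 0644 "$new" "$LIVE.tmp" || return 1
  mv -f "$LIVE.tmp" "$LIVE" || return 1

  # try-restart is a no-op when the container is down (e.g. at boot), so there
  # is nothing to poll in that case (assumption).
  if ! container_running; then
    return 0
  fi

  restart_container
  if wait_healthy; then
    return 0
  fi

  # Roll back to last-good and report failure so the unit ends up "failed".
  if (( had_prev )); then
    cp -f "$PREV" "$LIVE"
    restart_container
  fi
  return 1
}
```

A caller would run `swap_rules /path/to/fetched.yaml` and exit with its status. One thing worth checking on a real host: with podman-powersync.service ordered after the fetch unit, a blocking systemctl try-restart issued from inside that unit may queue behind the unit's own still-running start job.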
- The live file is /var/lib/powersync/sync-rules.yaml. web-arm root is plain ext4 and fully persistent (no impermanence), so the last good rules always survive reboots. No persistence wiring needed beyond creating the dir.
- Delete hosts/web-arm/modules/powersync/sync-rules.yaml. No seed file: Cloonar/fit@master is the sole source of truth; a copy in nixos reintroduces the drift trap.
- Mount service.yaml from the nix store (unchanged), mount sync-rules.yaml from /var/lib/powersync (read-only into the container). The current configDir runCommand copies both; rework so only service.yaml comes from the store.
- Order podman-powersync.service after the fetch unit, and run the fetch oneshot at boot. The fetch unit must exit 0 when a usable file already exists even if the network fetch failed (so a git.cloonar.com blip at boot never blocks the container, which starts from the persisted file). Hard-fail only when no usable file exists at all (truly fresh host); that legitimately blocks startup and should page.

Alerting (reuse existing Pushover)
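The failed-state alert in this section depends on the fetch unit's exit status following the boot rule from File & persistence: exit 0 when a usable file exists even if the fetch failed, and fail hard only when nothing usable is on disk. A minimal bash sketch of that decision, using the endpoint and header stated under Design, follows. `FORGEJO_TOKEN`, the function names, the 30-second curl limit, and the reading of "usable" as "exists and is non-empty" are assumptions for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

LIVE="${LIVE:-/var/lib/powersync/sync-rules.yaml}"
RAW_URL="${RAW_URL:-https://git.cloonar.com/api/v1/repos/Cloonar/fit/raw/powersync/sync-rules.yaml?ref=master}"

# fetch_rules OUT: download the rules into OUT; non-zero on HTTP or network errors.
# FORGEJO_TOKEN is a hypothetical variable name, assumed to come from the
# sops-rendered env file.
fetch_rules() {
  curl --fail --silent --show-error --max-time 30 \
    --header "Authorization: token ${FORGEJO_TOKEN:-}" \
    --output "$1" "$RAW_URL"
}

# Assumption: "usable" means the persisted file exists and is non-empty.
have_usable_file() { [[ -s "$LIVE" ]]; }

# decide TMP: prints "fresh" when TMP holds newly fetched rules, prints "stale"
# when the fetch failed but the persisted file can be used, and returns 1 when
# neither holds (truly fresh host).
decide() {
  local tmp="$1"
  if fetch_rules "$tmp" && [[ -s "$tmp" ]]; then
    echo fresh
    return 0
  fi
  if have_usable_file; then
    echo "fetch failed; keeping persisted $LIVE" >&2
    echo stale
    return 0
  fi
  echo "fetch failed and no usable $LIVE exists" >&2
  return 1
}
```

In this sketch only the "fresh" outcome would go on to validation and the swap; "stale" lets the unit exit 0 so the container starts from the persisted file, and the non-zero return leaves the unit in the failed state that the alert watches.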
- Pushover receivers already exist (cp_dominik_normal / cp_dominik_emergency, see hosts/web-arm/modules/grafana/default.nix); node-exporter's systemd collector exposes node_systemd_unit_state.
- The existing rule in hosts/web-arm/modules/grafana/alerting/service/services_down.nix alerts on state="active" == 0, which is wrong for a oneshot (a healthy oneshot is inactive, not active). Instead alert on node_systemd_unit_state{instance="web-arm:9100", name="powersync-syncrules-fetch.service", state="failed"} == 1, or wire OnFailure= to a oneshot that pushes Pushover. Route to the existing normal receiver.
- blackbox_powersync_liveness probe: no change there.

Secret (already created; do NOT touch secrets files)
reptide-powersync-syncrules-token exists in hosts/web-arm/secrets.yaml: a read-only Forgejo token that can read Cloonar/fit. Wire it via sops.secrets.reptide-powersync-syncrules-token and feed the fetch service (env file or sops.templates, mirroring how reptide-powersync-source-dsn -> powersync.env is done in the same module). Per repo policy, never edit secrets.yaml.

Verify at impl time
- The exact Forgejo raw-file endpoint (the /api/v1/repos/.../raw/...?ref=master form above is the expectation).
- Whether the journeyapps/powersync-service image has a sync-rules validate/dry-run subcommand (adds a pre-swap check; liveness rollback covers it if absent).

ADR (required part of this issue)
Write docs/adr/0015-*.md in the repo's ADR style (see docs/adr/0012-self-hosted-powersync-on-web-arm.md). It must:

- Record that sync rules are sourced from Cloonar/fit@master at runtime, validated before swap, with the vendored copy removed.
- List the rejected alternatives (eval-time fetchGit; an in-nixos structural/table-allowlist validator) and why.

Constraints / conventions
- Branch from origin/main; open a PR (tea pr create); never push to main. Conventional Commits, scope feat(web-arm).
- Do not change system.stateVersion.

Acceptance criteria
- hosts/web-arm/modules/powersync/sync-rules.yaml deleted; nixos holds no sync-rule content.
- Fetch unit + timer retrieve Cloonar/fit:powersync/sync-rules.yaml@master over HTTPS using reptide-powersync-syncrules-token.
- New rules are applied with try-restart.
- .prev snapshot + post-restart liveness rollback; rollback exits non-zero and pages.
- Container mounts sync-rules.yaml from /var/lib/powersync; service.yaml still from the store; container ordered after the fetch unit; boot fetch exits 0 when a usable file exists.
- Alert fires when the fetch unit fails (failed state).
- docs/adr/0015-*.md written, amending ADR-0012.

Relates to ADR-0012; follows the PowerSync work in #37 (operator prerequisites) / #38 (implementation).