feat(logging): fan out grafana-alloy to remaining hosts, delete promtail #125
Reference: Cloonar/nixos!125
Step 2 of #118 (promtail→alloy migration). The nas canary (step 1, #121 / PR #124) is merged and confirmed live in Grafana, so this switches the remaining four hosts off promtail and removes the module.
Changes
- `./utils/modules/promtail` → `./utils/modules/alloy` in the `fw`, `mail`, `web-arm`, and `amzebs-01` `configuration.nix`.
- `utils/modules/promtail/` (`default.nix` + `secrets.yaml`) deleted.
- `utils/modules/promtail/…` creation rule removed from `.sops.yaml`; the `utils/modules/alloy/…` rule already covers the same recipient set.
- `CLAUDE.md` module list updated (promtail → alloy).

Out of scope (intentionally untouched)
- `hosts/web-arm/modules/loki.nix` and the `promtail-nginx-password` secret: that is the server-side Loki nginx basic-auth file. Alloy clients still authenticate to `loki.cloonar.com` as `promtail@cloonar.com`, validated against it, so it stays. Renaming would require re-encrypting secrets and is unrelated to this swap.

Verification
- Pre-commit dry-build: `amzebs-01` `fw` `mail` `nas` `nb` `web-arm` all `OK` (`.sops.yaml` + `utils/` are touched, so the hook builds every host).
- `config.alloy` ships as a static `environment.etc` file, so neither the pre-commit eval nor a build parses it. After deploy, confirm at least one 25.11 host (e.g. `mail` or `amzebs-01`) shows up in Grafana/Loki with the expected labels. The canary only proved grafana-alloy 1.16.0 (26.05); these four run 1.12.2 (25.11). The config uses only GA-stable components, so the version gap is low-risk. Also confirm promtail is no longer running anywhere.

Closes #122
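The post-deploy check described under Verification could be scripted roughly as follows. This is only a sketch under assumptions the PR does not confirm: that hosts are distinguished by a Loki label named `host`, that the label values are the bare host names, that the nginx basic-auth in front of `loki.cloonar.com` also permits read requests, and that the password is supplied in a hypothetical `LOKI_PASSWORD` environment variable. The only Loki endpoint used is the standard label-values route, `/loki/api/v1/label/<name>/values`.

```python
import base64
import json
import os
import urllib.request

# Assumption: the PR does not name the label that identifies a host.
HOST_LABEL = "host"
# The five alloy hosts named in this PR and its review.
EXPECTED_HOSTS = {"fw", "mail", "web-arm", "amzebs-01", "nas"}


def missing_hosts(payload: dict, expected: set[str]) -> set[str]:
    """Return the expected hosts absent from a Loki label-values response."""
    if payload.get("status") != "success":
        raise ValueError(f"unexpected Loki response: {payload!r}")
    return expected - set(payload.get("data") or [])


def fetch_label_values(base_url: str, user: str, password: str) -> dict:
    """GET the values of HOST_LABEL from Loki using HTTP basic auth."""
    url = f"{base_url}/loki/api/v1/label/{HOST_LABEL}/values"
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.load(resp)


def main() -> None:
    payload = fetch_label_values(
        "https://loki.cloonar.com",
        "promtail@cloonar.com",
        os.environ["LOKI_PASSWORD"],  # hypothetical variable name
    )
    gone = missing_hosts(payload, EXPECTED_HOSTS)
    print("missing:", sorted(gone) if gone else "none")
```

Calling `main()` shortly after the deploy would list any host that has not shipped logs within Loki's default lookback window for this endpoint. It does not replace checking the labels in Grafana, and it says nothing about whether promtail is still running.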
Validation: PASS ✅
Verification signal relied on: the repo's commit-time gate. The pre-commit dry-build (eval) is green for all six hosts (this PR touches `.sops.yaml` + `utils/`, so the hook builds every host). Per the repo's gate model I did not re-run it.

Independently verified the one dimension eval/build cannot check: sops decryptability at deploy time. #124 exercised the alloy module only on nas, so this is the first time fw/mail/web-arm/amzebs-01 decrypt `alloy-env`. The age recipients embedded in `utils/modules/alloy/secrets.yaml` are byte-for-byte identical to the old `utils/modules/promtail/secrets.yaml`, and cover all five alloy hosts:
- `&fw`
- `&ldap-server-arm` (legacy anchor name)
- `&web-arm`
- `&amzebs-01`
- `&nas`

Other checks:
- `./utils/modules/promtail`: the only remaining `promtail` strings are historical ADRs (immutable), the server-side Loki nginx basic-auth on web-arm (intentionally out of scope; `promtail@cloonar.com` is the Loki credential username and still matches), and explanatory comments in the alloy module.
- `.sops.yaml`: removing the `utils/modules/promtail/…` creation rule is correct orphan cleanup (the path is deleted); the `utils/modules/alloy/…` rule remains and governs the kept secret.
- No `src`/`*Hash` change, so the eval-only-gate vendorHash caveat does not apply.
- No `system.stateVersion` change; modules imported by explicit path.
- Branch `afk/122`: body carries `Closes #122`, the branch number matches the issue, and #122 is the open step-2 target of #118.

Residual (post-deploy, already documented in the PR): the static `config.alloy` is parsed only at runtime (unchanged from #124, already proven on nas), and these four hosts run alloy 1.12.2 (25.11) vs the canary's 1.16.0 (26.05). The config uses only GA-stable components, so the version gap is low-risk. After deploy, confirm a 25.11 host (mail/amzebs-01) appears in Grafana/Loki with the expected labels and that promtail is no longer running anywhere.

`mergeable ✔`: no conflict resolution needed.
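The recipient comparison mentioned in this review can be reproduced with a few lines of Python. This is a minimal sketch, not the method actually used for the review: it assumes both files use the usual sops YAML layout, in which each age recipient appears on a `recipient: age1…` line, and it compares the two recipient sets rather than the raw bytes. The function names are illustrative.

```python
import re

# Matches lines such as "        - recipient: age1..." in a sops-encrypted YAML file.
RECIPIENT_RE = re.compile(r"^\s*-?\s*recipient:\s*(age1[0-9a-z]+)\s*$", re.MULTILINE)


def age_recipients(sops_yaml_text: str) -> set[str]:
    """Collect the age public keys listed in a sops file's metadata."""
    return set(RECIPIENT_RE.findall(sops_yaml_text))


def recipient_diff(old_text: str, new_text: str) -> tuple[set[str], set[str]]:
    """Return (only_in_old, only_in_new); both empty means the sets match."""
    old, new = age_recipients(old_text), age_recipients(new_text)
    return old - new, new - old
```

Because this PR deletes `utils/modules/promtail/secrets.yaml`, the old file's text would have to come from history (for example via `git show` on the pre-merge revision) before being passed to `recipient_diff` together with the contents of `utils/modules/alloy/secrets.yaml`.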