feat(logging): grafana-alloy shared module + nas canary (promtail→alloy step 1) #121
Reference
Cloonar/nixos#121
Step 1 of #118 (promtail→alloy migration). The eval-gate + sops prerequisites land in the PR "fix(fw,web-arm): pin docker_29 to unblock fleet eval" — merge that first.
Agent Brief
Category: enhancement
Summary: Create a shared `utils/modules/alloy` module (grafana-alloy) that ships the systemd journal to Loki as a drop-in promtail replacement, and switch nas to it as the canary. Verify in Grafana before fanning the rest out (step 2).
Background / why: 26.05 removed promtail; nas (already on 26.05) dropped the import and currently has no central log shipping. grafana-alloy (`services.alloy`, present in both 25.11 and 26.05) is the vendor successor. The design was grilled and settled; see below.
Settled design decisions:
- `utils/modules/alloy/config.alloy`, surfaced via `environment.etc."alloy/config.alloy".source` (keeps hot-reload). It's a committed Alloy-language file, not Nix attrs.
- `alloy convert --source-format=promtail <rendered promtail.yaml>` as a correctness oracle, then commit a hand-cleaned config and diff the two component-by-component to prove parity. (Render the current promtail config to YAML first; it's Nix-generated.)
- `alloy-env` holding `LOKI_PASSWORD=<pw>`, fed via `services.alloy.environmentFile`, referenced as `sys.env("LOKI_PASSWORD")` in the `loki.write` `basic_auth`. Keeps the module's DynamicUser sandbox; systemd reads the env file as root. The `.sops.yaml` rule + `utils/modules/alloy/secrets.yaml` scaffold already exist (from the docker_29 PR); populate the value with `sops utils/modules/alloy/secrets.yaml` if not already done.
- `SupplementaryGroups = [ "systemd-journal" ]`.
Current behavior (must be preserved):
`utils/modules/promtail` scrapes the journal (json, max_age 12h, job=systemd-journal) and pushes to https://loki.cloonar.com/loki/api/v1/push (basic auth `promtail@cloonar.com`). Pipeline: JSON-extract `_TRANSPORT`/`_SYSTEMD_UNIT`/`MESSAGE`/`COREDUMP_*`, default unit → transport, reshape coredumps into a human-readable message, normalize `session-N.scope` → `session.scope`, drop known noise (nscd inotify, rpi undervoltage, `refused connection: IN=`), relabel `__journal__hostname` → `host`.
Desired behavior:
`utils/modules/alloy` reproduces the same labels (host, unit, coredump_unit, job), drops, and coredump/session reshaping, pushing to the same Loki endpoint with the same creds. Only nas imports it in this step; promtail stays on the other four (removed in step 2).
Acceptance:
Depends on: the docker_29 eval-unblock PR. Part of #118.
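For orientation, the settled decisions could be wired together roughly as below. This is a minimal sketch, not the final module: the file name `default.nix` is an assumption, and it presumes sops-nix is already imported fleet-wide and that `services.alloy.environmentFile` is available on the pinned nixpkgs, as the brief states.

```nix
# utils/modules/alloy/default.nix  (file name is an assumption)
{ config, ... }:
{
  # sops-nix secret; the decrypted file is expected to contain one line:
  #   LOKI_PASSWORD=<pw>
  sops.secrets.alloy-env = {
    sopsFile = ./secrets.yaml;
  };

  services.alloy = {
    enable = true;
    # systemd reads this file as root and exports LOKI_PASSWORD to the
    # service, so the DynamicUser sandbox does not need access to the secret.
    environmentFile = config.sops.secrets.alloy-env.path;
  };

  # Committed Alloy-language config, surfaced under /etc/alloy.
  environment.etc."alloy/config.alloy".source = ./config.alloy;
}
```

nas would then pull this in through its `imports` list (for example `../../utils/modules/alloy`, where the relative path is hypothetical and depends on where the host config lives). The sketch deliberately does not set `SupplementaryGroups`; whether the upstream module already grants `systemd-journal` on the pinned nixpkgs is worth confirming before adding an override.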