feat(metrics): evaluate consolidating metrics shipping into grafana-alloy (unify with logs) #120
Cloonar/nixos#120
Deferred follow-up to #118 (promtail→alloy logs migration). Not urgent.
Agent Brief
Category: enhancement
Summary: Evaluate folding the per-host metrics shipping (vmagent-style scraping → VictoriaMetrics) into grafana-alloy, so a single Alloy agent per host handles both journal→Loki logs and metrics→VictoriaMetrics fleet-wide.
Background / why: #118 migrates journal shipping from promtail to grafana-alloy, deliberately scoped to logs only. Alloy is a superset (it can prometheus.scrape and prometheus.remote_write), so once every host already runs Alloy for logs, the metrics path could be expressed in the same agent, leaving one observability agent per host instead of two (Alloy + vmagent). This was explicitly deferred out of #118 to avoid scope creep on a forced, time-sensitive migration: metrics aren't broken, so there's no forcing function.
Current behavior: Metrics are a separate subsystem: utils/modules/victoriametrics (shared, with a monitoredServices option) on fw/nas/amzebs-01, hosts/web-arm/modules/victoriametrics.nix on web-arm, plus per-host exporters (mail dovecot/postfix, fw fwmetrics, nas disk-monitoring). After #118 every host also runs grafana-alloy for logs.
Desired behavior: A spike + decision on whether to re-express the existing scrape targets (monitoredServices + each exporter) and remote_write to VictoriaMetrics as Alloy components, validated against the existing Grafana dashboards and alerting on web-arm. If worthwhile, migrate host-by-host and retire vmagent.
Acceptance: dashboards/alerts unchanged, single agent per host.
Dependencies: Blocked on #118 (Alloy must be fleet-wide first).
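
Illustrative sketch for the spike: roughly what one scrape target plus the remote_write hop could look like as Alloy components. This is not taken from the repo. The component labels ("example_exporter", "victoriametrics"), the exporter address 127.0.0.1:9100, the 30s interval, and the VictoriaMetrics hostname are all placeholder assumptions; the real targets would come from monitoredServices and the per-host exporters, and the real interval and labels would need to match the current setup so dashboards and alerts stay unchanged.

```alloy
// Hypothetical scrape of a single local exporter.
// Address and job name are placeholders, not values from this repo.
prometheus.scrape "example_exporter" {
  targets = [
    {"__address__" = "127.0.0.1:9100", "job" = "example_exporter"},
  ]
  scrape_interval = "30s"
  forward_to      = [prometheus.remote_write.victoriametrics.receiver]
}

// Ship scraped samples to VictoriaMetrics over the Prometheus
// remote_write protocol. The URL host is a placeholder.
prometheus.remote_write "victoriametrics" {
  endpoint {
    url = "https://victoriametrics.example.invalid/api/v1/write"
  }
}
```

One thing the spike should check is that job/instance labels produced this way line up with those the existing Grafana dashboards and alert rules query, since that is the acceptance criterion.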