lab: Automatic AFK runs — per-project toggle + scheduler #64
Labels
No labels
bug
enhancement
in-progress
needs-info
needs-triage
p0
ready-for-agent
ready-for-human
wontfix
No milestone
No project
No assignees
1 participant
Notifications
Due date
No due date set.
Dependencies
No dependencies set.
Reference
Cloonar/nixos#64
Loading…
Add table
Add a link
Reference in a new issue
No description provided.
Delete branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Parent
#61
What to build
The automatic mode. Persist a per-project
autoEnabledflag, surface it as a toggle in the⋯menu, and add a scheduler that launches AFK runs on its own while the toggle is on and ready issues remain — serial, one auto-run per project, under the global cap.Acceptance criteria
autoEnabledinlab's JSON store (survives restart).⋯menu shows an Auto AFK runs toggle rendered as a stateful form-button whose label reflects On/Off — NOT an<input type=checkbox>, because the morph treatscheckedas client-owned and won't sync it from the server. Toggling POSTs to flip + persist the flag and re-renders.labrestart re-adopts live sessions.Blocked by
Agent Brief
Category: enhancement
Summary: Add the per-project automatic AFK mode — a persisted
autoEnabledtoggle in the ⋯ menu plus a server-side scheduler that launches AFK runs on its own, one at a time per project, under the global instance cap, while ready issues remain.Context: Parent #61. Blockers #62 (Slice 1 — manual run) and #63 (Slice 2 — reaping) are merged, so the claim path and the lifecycle/reaping this slice depends on already exist. The design is governed by ADR-0007 ("lab claims the issue and owns the AFK-run lifecycle"); read it first — its Consequences section is the spec for this slice (the ~30–60s scheduler goroutine distinct from the ~4s client poll, the
autoEnabledstore field, one auto-run per project under the cap, restart re-adoption from session names).Current behavior:
AFK runs are manual-only. A user clicks Start AFK run in a project's ⋯ menu;
labpicks the lowest openready-for-agentissue, claims it (label flipready-for-agent → in-progress, rolled back on any failure before a live session), creates an isolated git worktree on anafk/<N>branch, and spawns a seededclaude --remote-controlsession whose name encodes project + slot + issue. One long-lived background worker — the reaper — sweeps live runs on a ~30s ticker: it completes a run when an open/mergedafk/<N>PR appears, and fails it on session death or a ~45-minute budget overrun, freeing the global-cap slot. Nothing launches a run without a human click, and there is no persisted per-project automatic state.Desired behavior:
Each Forgejo-hosted project gains a persisted
autoEnabledflag, toggled from its ⋯ menu. A second server-side worker (the scheduler) runs on its own ~30–60s cadence. On each tick, for every project that isautoEnabled, has aready-for-agentissue available, and has no auto-run already in flight, it launches the next run through the same claim → worktree → spawn path the manual Start uses — providedlabis under its global instance cap. The loop is serial per project (at most one auto run per project at a time), drains the queue one issue per launch, idles when no ready issues remain, launches promptly after a toggle-on if work exists, and at the cap simply launches nothing (never errors). A manually started AFK run is additive — it does not block the auto-loop beyond consuming a cap slot. TogglingautoEnabledoff stops new auto-launches; runs already in flight are left to the reaper.Key interfaces / contracts (durable anchors — explore for the current shapes):
projectStatebehindStore, persisted as one atomic JSON file): add anautoEnabledboolean with an accessor + mutator pair that persists on write, mirroring the existingURL/SetURL(andLastOpenedAt/StampOpened) pattern. Must round-trip across a restart.handleAFKStartHTTP handler): this is currently welded to thehttp.ResponseWriter/*http.Request(it ends ins.fail/s.ok/s.noReady), so the scheduler cannot reuse it as written. Factor the core into a routine callable without an HTTP request — returning whether a run was launched (and any error) — so the handler and the scheduler share one claim path. ADR-0007's race-freedom depends on that claim staying single-flighted under the existingafkMu, so the shared routine (not two copies) must hold that mutex.afkLabel/parseAFKLabel/afkLabelPrefix, round-tripped viacomposeSessionName/parseSessionNameand recognized byparseAFKRun): extend it to distinguish auto from manual runs in a restart-proof way. ⚠️ Hazard:parseAFKLabelcurrently treats the entire suffix after theafk-prefix as the issue number (strconv.Atoi), so a naiveafk-auto-<N>label would fail to parse and the reaper would silently stop reaping auto runs. Whatever encoding you choose must keep both kinds recognized as AFK runs byparseAFKRun/parseAFKLabel(both must still be reaped), and must survivesanitizeLabel(which allows only[A-Za-z0-9._-]).watchAFKRuns) is. Like the reaper, it must no-op (return immediately) when the AFK seams are not wired (tracker/gitnil) rather than spin an idle ticker.liveInstanceCountvsmaxInstancesguard rather than reimplementing it.<input type=checkbox>. The client-side DOM morph deliberately treats form-control state (checked, input.value,<details open>) as client-owned and never repaints it from the server; only server-rendered text is synced. So a checkbox'scheckedwould never reflect the persisted flag after a poll. Model it on the sibling Start AFK run menu item: a<form method="post" data-action>posting to a new endpoint that flips + persistsautoEnabledand re-renders, with the button label reading e.g. "Auto AFK runs: On" / "Auto AFK runs: Off". Gate it to Forgejo-hosted projects exactly like Start AFK run (a direct POST for a non-Forgejo project must also be refused, not just hidden).Acceptance criteria:
autoEnabledboolean survives alabrestart (round-trips through the JSON store like the existing per-project fields).<input type=checkbox>); activating it POSTs to flip + persist the flag and re-renders with the new label. Shown/enabled only on Forgejo-hosted projects, and a direct POST against a non-Forgejo project is refused.autoEnabledForgejo project that is under the global cap, has no auto-run in flight, and has a ready issue, it launches the next run via the shared claim → worktree → spawn path (with the manual path's claim-rollback-on-failure intact).autoEnabled ∧ under-cap ∧ no-auto-in-flight ∧ ready-issue-exists), unit-tested in isolation from tmux/tea/clock — mirroring howclassifyAFKRunis tested today.Out of scope:
consecutiveFailurescounting and the three-strikes auto-pause + UI reset — that is #65. Do not add the failure counter or pause here. Do shape the launch predicate so a later∧ not-pausedterm can be added without restructuring.in-progressand (in #65) increments the failure counter.Verification: This is a Go change in the
labmodule; its tests are not exercised by the repo's pre-commit hook (which is eval-only). Rungo test ./...,go vet ./..., andgo build ./...in thelabmodule locally before opening the PR.