feat(dev): pause a project's auto AFK loop after 3 consecutive failures #78

Merged
dominik.polakovics merged 1 commit from afk/65 into main 2026-06-02 15:24:41 +02:00

Slice 4 of #61 (design locked in ADR-0007): keep a recurring failure from burning an auto-run every scheduler sweep indefinitely. Track consecutive failed AFK runs per project, pause the automatic loop at three, and expose a UI Reset so a human can re-arm it.

What changed

  • store: persist a per-project consecutiveFailures (omitempty, like autoEnabled) with atomic get / increment / reset methods — read-modify-write under the store lock, since the reaper goroutine and the Reset handler write it concurrently.
  • reap chokepoint (reapAFKRun): the single place the run lifecycle writes the counter — reset on a success reap, increment on a death/timeout failure. A user-initiated Stop never reaches the chokepoint, so it stays neutral. The counter is kind-agnostic (manual and auto runs both feed it); only the auto loop is paused by it.
  • scheduler predicate (afkAutoDecision / shouldLaunchAuto): a Paused term populated from consecutiveFailures >= afkPauseThreshold. It gates the auto loop only and is deliberately absent from the shared launchAFKRun claim path, so manual "Start AFK run" still works on a paused project.
  • POST /afk/reset/<project>: zeroes the counter and kicks one sweep to re-arm promptly; mirrors the auto-toggle handler (POST-only, Forgejo-only, shared ok/fail plumbing, no-JS 303 redirect plus fetch/morph #live fragment).
  • ⋯ menu: when paused, the card shows an "Auto paused · N fails" indicator and a Reset form-button, rendered as server text the morph syncs (never client-owned state).
  • afkPauseThreshold: a single tunable constant (default 3).
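
A rough sketch of how the counter, the reap chokepoint, and the `Paused` term could fit together. Only `consecutiveFailures` and `afkPauseThreshold` come from this PR; `failureStore`, `recordReap`, `autoDecision`, `decide`, and `launchAuto` are hypothetical names for illustration, persistence is left out, and the real signatures of `reapAFKRun` and `afkAutoDecision` may well differ.

```go
package main

import (
	"fmt"
	"sync"
)

// afkPauseThreshold mirrors the tunable constant named in the PR (default 3).
const afkPauseThreshold = 3

// failureStore is a hypothetical in-memory stand-in for the real store.
// Every method does its read-modify-write under one lock, so a reaper
// goroutine and a Reset handler can call them concurrently.
type failureStore struct {
	mu       sync.Mutex
	failures map[string]int // project -> consecutive failed AFK runs
}

func newFailureStore() *failureStore {
	return &failureStore{failures: make(map[string]int)}
}

func (s *failureStore) ConsecutiveFailures(project string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.failures[project]
}

func (s *failureStore) IncrementFailures(project string) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.failures[project]++
	return s.failures[project]
}

func (s *failureStore) ResetFailures(project string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	delete(s.failures, project) // a missing key reads back as zero
}

// reapOutcome is an assumed shape for what the reap chokepoint observes.
// A user-initiated Stop never gets here, so it has no outcome value.
type reapOutcome int

const (
	reapSuccess reapOutcome = iota
	reapFailure             // death or timeout
)

// recordReap is the only lifecycle writer of the counter in this sketch.
func recordReap(s *failureStore, project string, outcome reapOutcome) {
	if outcome == reapSuccess {
		s.ResetFailures(project)
		return
	}
	s.IncrementFailures(project)
}

// autoDecision carries the terms the scheduler predicate looks at.
type autoDecision struct {
	AutoEnabled bool
	Paused      bool
}

// launchAuto gates the auto loop only; a manual start would not consult it.
func (d autoDecision) launchAuto() bool {
	return d.AutoEnabled && !d.Paused
}

func decide(s *failureStore, project string, autoEnabled bool) autoDecision {
	return autoDecision{
		AutoEnabled: autoEnabled,
		Paused:      s.ConsecutiveFailures(project) >= afkPauseThreshold,
	}
}

func main() {
	s := newFailureStore()
	for i := 0; i < afkPauseThreshold; i++ {
		recordReap(s, "demo", reapFailure)
	}
	fmt.Println(decide(s, "demo", true).launchAuto()) // false: paused at the threshold
	s.ResetFailures("demo")
	fmt.Println(decide(s, "demo", true).launchAuto()) // true: re-armed after Reset
}
```

Keeping `Paused` out of the manual claim path means a paused project only loses its automatic launches, which matches the behaviour described above.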

Acceptance criteria

  • Persisted per-project consecutiveFailures that round-trips across a store reload.
  • A death/timeout reap increments the counter; a success reap resets it; a user-initiated Stop leaves it unchanged.
  • At the threshold (3) the scheduler launches no further auto runs even while the auto toggle is on; other projects unaffected.
  • Manual "Start AFK run" still works on a paused project.
  • Card shows "Auto paused · N fails" and a Reset control (POST /afk/reset/<project>) that zeroes the counter and re-arms.
  • Reset works both no-JS (form POST → 303) and via the fetch/morph path.
  • The threshold is a single tunable constant.
  • Go unit tests cover the counter transitions (increment, reset, user-Stop neutral, pause at threshold, Reset clears).
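
The no-JS Reset path (form POST → 303) could look roughly like the following. This is a sketch under assumptions: `counters`, `resetHandler`, `statusFor`, and the redirect target `/` are hypothetical, and the Forgejo-only check, the shared ok/fail plumbing, and the fetch/morph `#live` fragment are omitted.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// counters is a hypothetical stand-in for the store's failure counter.
type counters struct {
	mu sync.Mutex
	n  map[string]int
}

func (c *counters) reset(project string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.n, project)
}

func (c *counters) get(project string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n[project]
}

// resetHandler zeroes the counter, asks for one scheduler sweep via kick,
// and answers the plain form POST with a 303 redirect.
func resetHandler(c *counters, kick func()) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost {
			w.Header().Set("Allow", http.MethodPost)
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		project := strings.TrimPrefix(r.URL.Path, "/afk/reset/")
		if project == "" || strings.Contains(project, "/") {
			http.NotFound(w, r)
			return
		}
		c.reset(project)
		kick()
		// Redirect target is an assumption; the real handler may differ.
		http.Redirect(w, r, "/", http.StatusSeeOther)
	}
}

// statusFor drives the handler in-process and returns the response status.
func statusFor(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	c := &counters{n: map[string]int{"demo": 3}}
	h := resetHandler(c, func() {})
	fmt.Println(statusFor(h, http.MethodPost, "/afk/reset/demo"), c.get("demo")) // 303 0
	fmt.Println(statusFor(h, http.MethodGet, "/afk/reset/demo"))                 // 405
}
```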

Verification

  • go build ./..., go vet ./..., gofmt -l ., and go test -race ./... — all green.
  • Pre-commit hook dry-built fw — OK.
  • The fetch/morph path was verified with an ephemeral jsdom harness driving the real inline morph across the paused↔not-paused transition: it converges to the target DOM, keeps the ⋯ menu open, reuses the kept forms/card (no rebuild that would wipe typed input), and adds/removes the Reset control and indicator correctly.

Closes #65

Keep a recurring failure from burning an auto-run every scheduler sweep
indefinitely. Track consecutive failed AFK runs per project, pause the
automatic loop at three, and expose a UI Reset so a human can re-arm it.

- store: persist a per-project consecutiveFailures (omitempty, like
  autoEnabled) with atomic get / increment / reset methods that
  read-modify-write under the store lock, since the reaper goroutine and the
  Reset handler write it concurrently.
- reap chokepoint (reapAFKRun): the single place the run lifecycle writes the
  counter — reset to zero on a success reap, increment on a death/timeout
  failure. A user-initiated Stop never reaches the chokepoint, so it stays
  neutral. The counter is kind-agnostic (manual and auto runs both feed it);
  only the auto loop is paused by it.
- scheduler predicate (afkAutoDecision / shouldLaunchAuto): add a Paused term
  populated from consecutiveFailures >= afkPauseThreshold. It gates the auto
  loop only and is deliberately absent from the shared launchAFKRun claim
  path, so manual "Start AFK run" still works on a paused project.
- POST /afk/reset/<project>: zeroes the counter and kicks one sweep to re-arm
  promptly; mirrors the auto-toggle handler (POST-only, Forgejo-only, shared
  ok/fail plumbing, no-JS 303 redirect plus fetch/morph #live fragment).
- index template: when paused the card's menu shows an
  "Auto paused · N fails" indicator and a Reset form-button, rendered as
  server text the morph syncs (never client-owned state).
- afkPauseThreshold: a single tunable constant (default 3).

Slice 4 of #61; design locked in ADR-0007.

Closes #65

Reference
Cloonar/nixos!78