ISO Bot — Isometric Game Bot Engine

Overview

Screen-reading bot engine for isometric games. First implementation: Diablo II: Resurrected.

Approach: Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification. Engine runs on host, game runs in VM for detection isolation.

Repository

  • Local: /home/openclaw/.openclaw/workspace/projects/iso-bot
  • Remote: ssh://forgejo@git.cloonar.com/openclawd/iso-bot.git

Tech Stack

  • Engine: Go 1.23+
  • Vision: GoCV (OpenCV bindings for Go)
  • Screen capture: Platform-native (Win32 API / X11)
  • Input simulation: Platform-native (SendInput / uinput)
  • API: net/http + gorilla/websocket (REST + WS)
  • Dashboard: SolidJS + TypeScript (planned)
  • Config: YAML
  • Loot filter: Declarative YAML rule engine

Architecture

cmd/iso-bot/          Single binary entry point
pkg/
├── engine/
│   ├── capture/      Screen capture (window, VM, full screen)
│   │   └── backends/ Modular capture implementations (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
│   ├── vision/       Template matching, color detection (GoCV)
│   ├── input/        Mouse (Bézier curves), keyboard, humanization
│   ├── state/        Game state machine with event callbacks
│   ├── safety/       Session timing, breaks, pattern randomization
│   ├── navigation/   A* pathfinding, click-to-move
│   ├── resolution/   Resolution profile system for multi-resolution support
│   └── loot/         Declarative rule-based loot filter
├── plugin/           Game plugin interface (now includes SupportedResolutions)
├── api/              REST + WebSocket API
└── auth/             License/account validation
plugins/d2r/          D2R game plugin (updated for resolution profiles)
config/               YAML configuration files
web/                  SolidJS dashboard (planned)
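
The state package is described above as a game state machine with event callbacks. A minimal sketch of that idea follows. The state names, the `Machine` type, and the transition table are hypothetical and are not taken from the repository.

```go
package main

import "fmt"

// State names are illustrative; the engine's real states are not listed here.
type State string

const (
	StateMenu   State = "menu"
	StateInGame State = "in_game"
	StateDead   State = "dead"
)

// Machine tracks the current state and notifies listeners on every change.
type Machine struct {
	current   State
	allowed   map[State][]State
	listeners []func(from, to State)
}

// NewMachine creates a machine with an initial state and a transition table.
func NewMachine(initial State, allowed map[State][]State) *Machine {
	return &Machine{current: initial, allowed: allowed}
}

// Current returns the state the machine is in.
func (m *Machine) Current() State { return m.current }

// OnTransition registers a callback fired after each successful transition.
func (m *Machine) OnTransition(fn func(from, to State)) {
	m.listeners = append(m.listeners, fn)
}

// Transition moves to the target state if the table allows it.
func (m *Machine) Transition(to State) error {
	for _, next := range m.allowed[m.current] {
		if next != to {
			continue
		}
		from := m.current
		m.current = to
		for _, fn := range m.listeners {
			fn(from, to)
		}
		return nil
	}
	return fmt.Errorf("transition %s -> %s not allowed", m.current, to)
}

func main() {
	m := NewMachine(StateMenu, map[State][]State{
		StateMenu:   {StateInGame},
		StateInGame: {StateDead, StateMenu},
		StateDead:   {StateMenu},
	})
	m.OnTransition(func(from, to State) {
		fmt.Printf("%s -> %s\n", from, to)
	})
	_ = m.Transition(StateInGame)
	if err := m.Transition(StateInGame); err != nil {
		fmt.Println("rejected:", err)
	}
}
```

An explicit transition table means a misdetected screenshot cannot push the machine into an impossible state; the rejected transition surfaces as an error instead.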

Plugin System

All game logic is behind interfaces in pkg/plugin/plugin.go:

  • Plugin — main entry point, returns detector/reader/routines
  • GameDetector — detect state from screenshots
  • ScreenReader — extract items, enemies, text
  • Routine — automated farming sequences (context-aware, cancellable)
  • LootFilter — item pickup rules
  • EngineServices — engine capabilities provided to plugins

Development Conventions

  • Go standard project layout
  • Godoc comments on all exported identifiers
  • gofmt formatting
  • Tests in *_test.go files alongside code
  • Feature branches → main
  • Always commit and push after changes

Development Workflow — Claude Code

For all coding tasks, use Claude Code CLI instead of sub-agents. Claude Code can iterate (edit → build → see errors → fix), unlike one-shot sub-agents.

Wrapper script: /home/openclaw/.openclaw/workspace/bin/claude-code

  • Auto-injects Anthropic API key from OpenClaw's auth config
  • Adds Go to PATH

How to run a coding task:

cd /home/openclaw/.openclaw/workspace/projects/iso-bot
/home/openclaw/.openclaw/workspace/bin/claude-code -p "your task description here" --allowedTools "Edit,Write,Bash" --max-turns 50

Use exec with pty=true for interactive Claude Code sessions.

Project has CLAUDE.md at repo root with full project context.

When to use what:

  • Claude Code → all coding tasks (features, fixes, refactors)
  • Sub-agents (sessions_spawn) → research, analysis, non-coding tasks
  • Direct edits → small single-file changes

Sub-Agent Settings (for non-coding tasks)

When spawning sub-agents for research or analysis, set generous timeouts and estimate them per task.

  • Model: anthropic/claude-sonnet-4-20250514

Git Workflow

cd /home/openclaw/.openclaw/workspace/projects/iso-bot
git add -A && git commit -m "descriptive message"
GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main

Current Status

Done:

  • Go project structure
  • Plugin interface system (pkg/plugin) with SupportedResolutions
  • Engine: capture, vision, input/humanize, state, safety, loot filter
  • Modular capture backend system (pkg/engine/capture/backends/)
  • Resolution profile system (pkg/engine/resolution/)
  • Platform-specific capture backend stubs (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
  • D2R resolution profiles (1080p, 720p)
  • Updated D2R plugin for resolution-based regions
  • API server skeleton (REST endpoints)
  • D2R plugin: config, detector, reader, Mephisto routine
  • Declarative loot filter with YAML rules
  • YAML configuration system (config/d2r.yaml)

Not yet done:

  • Backend implementation (actual Win32/X11/Wayland capture code)
  • GoCV integration for actual vision processing
  • Platform-specific input backends
  • Engine services implementation for EngineServices interface
  • Remaining D2R routines (Pindle, Countess)
  • Web dashboard (SolidJS)
  • Account/license system
  • Multi-instance support
  • Tests

Next Steps (Priority Order)

  1. Implement capture backend internals — Win32 BitBlt/DXGI, X11 XGetImage, Wayland PipeWire
  2. Engine services implementation — wire EngineServices interface to actual capture/resolution system
  3. GoCV integration — make the vision pipeline (template matching, color detection) actually work
  4. Platform input backends — Windows (SendInput) and Linux (uinput)
  5. D2R detector implementation — health orb reading, menu detection using new region system
  6. D2R Mephisto routine — complete implementation
  7. WebSocket real-time status streaming
  8. SolidJS dashboard
  9. Pindle + Countess routines
  10. Account system + licensing

Key Design Decisions

  1. Go over Python — performance for real-time capture+vision at 30+ FPS
  2. Plugin system — engine is game-agnostic, new game = new plugin
  3. VM isolation — engine on host, game in VM, so nothing from the engine runs inside the game's environment
  4. Declarative loot — YAML rules, user-customizable via web UI
  5. Single binary — engine + API in one Go binary, easy distribution
  6. Human-like input — Bézier curves, fatigue, breaks, route randomization

New Features (Latest Update)

Modular Capture Backends

  • Registry system: pkg/engine/capture/backends/registry.go manages available backends
  • Platform support: Windows (Win32), Linux (X11, Wayland), cross-platform (VNC, SPICE, Monitor, File)
  • Build tags: Platform-specific code using //go:build constraints
  • Configurable: Backends selected via YAML config with backend-specific options
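
A backend registry of this kind could look roughly like the sketch below. The `Backend` interface, `Factory` type, and the `blankBackend` stand-in are assumptions for illustration; the actual contents of registry.go are not shown in this document.

```go
package main

import (
	"fmt"
	"image"
	"sort"
)

// Backend is a hypothetical capture backend interface.
type Backend interface {
	Name() string
	Capture() (image.Image, error)
}

// Factory builds a backend from its backend-specific options.
type Factory func(options map[string]string) (Backend, error)

// Registry maps backend names (as they would appear in config) to factories.
type Registry struct {
	factories map[string]Factory
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]Factory)}
}

// Register makes a backend available under the given name.
func (r *Registry) Register(name string, f Factory) {
	r.factories[name] = f
}

// Create instantiates the named backend or reports which names exist.
func (r *Registry) Create(name string, options map[string]string) (Backend, error) {
	f, ok := r.factories[name]
	if !ok {
		return nil, fmt.Errorf("unknown capture backend %q (available: %v)", name, r.Names())
	}
	return f(options)
}

// Names lists registered backends in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.factories))
	for name := range r.factories {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// blankBackend is a stand-in that returns an empty frame of a fixed size.
type blankBackend struct{ width, height int }

func (b blankBackend) Name() string { return "blank" }

func (b blankBackend) Capture() (image.Image, error) {
	return image.NewRGBA(image.Rect(0, 0, b.width, b.height)), nil
}

func main() {
	reg := NewRegistry()
	reg.Register("blank", func(map[string]string) (Backend, error) {
		return blankBackend{width: 1280, height: 720}, nil
	})
	backend, err := reg.Create("blank", nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	frame, _ := backend.Capture()
	fmt.Println(backend.Name(), frame.Bounds().Dx(), frame.Bounds().Dy())
}
```

With build tags, each platform file would call `Register` only when it compiles for that platform, so the registry naturally lists just the backends available in the current binary.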

Resolution Profile System

  • Multi-resolution support: Games can now support multiple resolutions (1080p, 720p, etc.)
  • Named regions: Screen regions are named (e.g., "health_orb") instead of hardcoded rectangles
  • Plugin interface: Added SupportedResolutions() method to Plugin interface
  • Engine services: New methods in EngineServices for resolution-aware region access
  • D2R profiles: Pre-configured for 1920x1080 and 1280x720

Configuration System

  • YAML config: config/d2r.yaml with comprehensive bot settings
  • Flexible capture: Support for window capture, VNC, Wayland, file input, etc.
  • Safety settings: Session limits, break timings, health thresholds
  • Loot configuration: Item pickup rules, rune tiers, gem settings
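
To illustrate how declarative pickup rules might be evaluated once loaded, here is a small sketch. The `Item`, `Rule`, and `Filter` types and the first-match-wins policy are assumptions, not the engine's actual rule schema. The rules are written as Go literals to keep the example dependency-free; in the project they would come from YAML.

```go
package main

import (
	"fmt"
	"strings"
)

// Item is a hypothetical description of something the screen reader found.
type Item struct {
	Name    string
	Type    string // e.g. "rune", "gem", "armor"
	Quality string // e.g. "unique", "set", "normal"
}

// Rule is one declarative entry. Empty fields match anything.
type Rule struct {
	Type         string
	Quality      string
	NameContains string // case-insensitive substring match
	Pickup       bool
}

// Matches reports whether every non-empty field of the rule fits the item.
func (r Rule) Matches(it Item) bool {
	if r.Type != "" && r.Type != it.Type {
		return false
	}
	if r.Quality != "" && r.Quality != it.Quality {
		return false
	}
	if r.NameContains != "" &&
		!strings.Contains(strings.ToLower(it.Name), strings.ToLower(r.NameContains)) {
		return false
	}
	return true
}

// Filter evaluates rules in order.
type Filter struct {
	Rules []Rule
}

// ShouldPickup returns the decision of the first matching rule.
// Items that match no rule are ignored.
func (f Filter) ShouldPickup(it Item) bool {
	for _, r := range f.Rules {
		if r.Matches(it) {
			return r.Pickup
		}
	}
	return false
}

func main() {
	filter := Filter{Rules: []Rule{
		{Type: "gem", NameContains: "chipped", Pickup: false}, // skip low gems
		{Type: "gem", Pickup: true},
		{Type: "rune", Pickup: true},
		{Quality: "unique", Pickup: true},
	}}
	items := []Item{
		{Name: "Chipped Ruby", Type: "gem", Quality: "normal"},
		{Name: "Flawless Ruby", Type: "gem", Quality: "normal"},
		{Name: "Leather Armor", Type: "armor", Quality: "normal"},
	}
	for _, it := range items {
		fmt.Printf("%s: pickup=%v\n", it.Name, filter.ShouldPickup(it))
	}
}
```

Ordering matters under first-match-wins: a specific "skip" rule has to come before the general rule it is an exception to.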

D2R Notes

  • Supported resolutions: 1920x1080 (primary), 1280x720 (secondary)
  • Primary farming character: Sorceress with Teleport
  • Key routines: Mephisto (moat trick), Pindle (fastest), Countess (runes)
  • Resolution profiles replace hardcoded screen regions
  • Regions are now accessed via engine services: Region("health_orb"), Region("mana_orb"), etc.
  • The user plays D2R actively and knows the game well