ISO Bot — Isometric Game Bot Engine

Overview

Screen-reading bot engine for isometric games. First implementation: Diablo II: Resurrected.

Approach: Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification. Engine runs on host, game runs in VM for detection isolation.

Repository

  • Local: /home/openclaw/.openclaw/workspace/projects/iso-bot
  • Remote: ssh://forgejo@git.cloonar.com/openclawd/iso-bot.git

Tech Stack

  • Engine: Go 1.23+
  • Vision: GoCV (OpenCV bindings for Go)
  • Screen capture: Platform-native (Win32 API / X11)
  • Input simulation: Platform-native (SendInput / uinput)
  • API: net/http + gorilla/websocket (REST + WS)
  • Dashboard: SolidJS + TypeScript (planned)
  • Config: YAML
  • Loot filter: Declarative YAML rule engine

Architecture

cmd/iso-bot/          Single binary entry point
pkg/
├── engine/
│   ├── capture/      Screen capture (window, VM, full screen)
│   │   └── backends/ Modular capture implementations (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
│   ├── vision/       Template matching, color detection (GoCV)
│   ├── input/        Mouse (Bézier curves), keyboard, humanization
│   ├── state/        Game state machine with event callbacks
│   ├── safety/       Session timing, breaks, pattern randomization
│   ├── navigation/   A* pathfinding, click-to-move
│   ├── resolution/   Resolution profile system for multi-resolution support
│   └── loot/         Declarative rule-based loot filter
├── plugin/           Game plugin interface (now includes SupportedResolutions)
├── api/              REST + WebSocket API
└── auth/             License/account validation
plugins/d2r/          D2R game plugin (updated for resolution profiles)
config/               YAML configuration files
web/                  SolidJS dashboard (planned)
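
As an illustration of what "state machine with event callbacks" in pkg/engine/state could look like, here is a minimal sketch. The type names (State, Machine), the state constants, and the method names are assumptions made for this example and are not taken from the repository.

```go
package main

import "fmt"

// State is a hypothetical label for a detected game state.
type State string

const (
	StateMenu   State = "menu"
	StateInGame State = "in_game"
	StateDead   State = "dead"
)

// Machine tracks the current state and notifies listeners on each change.
type Machine struct {
	current   State
	listeners []func(from, to State)
}

// NewMachine returns a machine that starts in the given state.
func NewMachine(initial State) *Machine { return &Machine{current: initial} }

// OnChange registers a callback that fires on every state transition.
func (m *Machine) OnChange(fn func(from, to State)) {
	m.listeners = append(m.listeners, fn)
}

// Update records a newly detected state. Callbacks fire only when the
// state actually changed; the return value reports whether it did.
func (m *Machine) Update(next State) bool {
	if next == m.current {
		return false
	}
	prev := m.current
	m.current = next
	for _, fn := range m.listeners {
		fn(prev, next)
	}
	return true
}

// Current returns the most recently recorded state.
func (m *Machine) Current() State { return m.current }

func main() {
	m := NewMachine(StateMenu)
	m.OnChange(func(from, to State) { fmt.Printf("%s -> %s\n", from, to) })
	// Repeated detections of the same state do not trigger callbacks.
	for _, s := range []State{StateMenu, StateInGame, StateInGame, StateDead} {
		m.Update(s)
	}
}
```

Firing callbacks only on transitions keeps per-frame detection cheap for listeners: a detector can call Update on every captured frame without flooding routines with duplicate events.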

Plugin System

All game logic is behind interfaces in pkg/plugin/plugin.go:

  • Plugin — main entry point, returns detector/reader/routines
  • GameDetector — detect state from screenshots
  • ScreenReader — extract items, enemies, text
  • Routine — automated farming sequences (context-aware, cancellable)
  • LootFilter — item pickup rules
  • EngineServices — engine capabilities provided to plugins

Development Conventions

  • Go standard project layout
  • Godoc comments on all exported types
  • gofmt formatting
  • Tests in *_test.go files alongside code
  • Feature branches → main
  • Always commit and push after changes

Sub-Agent Settings

When spawning sub-agents for this project, use generous timeouts:

  • Small tasks (single file, docs): runTimeoutSeconds: 300 (5 min)
  • Medium tasks (new feature, refactor): runTimeoutSeconds: 900 (15 min)
  • Large tasks (multi-file implementation): runTimeoutSeconds: 1800 (30 min)
  • Model: anthropic/claude-sonnet-4-20250514 (good balance of speed and quality for code)

Git Workflow

cd /home/openclaw/.openclaw/workspace/projects/iso-bot
git add -A && git commit -m "descriptive message"
GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main

Current Status

Done:

  • Go project structure
  • Plugin interface system (pkg/plugin) with SupportedResolutions
  • Engine: capture, vision, input/humanize, state, safety, loot filter
  • Modular capture backend system (pkg/engine/capture/backends/)
  • Resolution profile system (pkg/engine/resolution/)
  • Platform-specific capture backend stubs (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
  • D2R resolution profiles (1080p, 720p)
  • Updated D2R plugin for resolution-based regions
  • API server skeleton (REST endpoints)
  • D2R plugin: config, detector, reader, Mephisto routine
  • Declarative loot filter with YAML rules
  • YAML configuration system (config/d2r.yaml)

To do:

  • Backend implementation (actual Win32/X11/Wayland capture code)
  • GoCV integration for actual vision processing
  • Platform-specific input backends
  • Engine services implementation for EngineServices interface
  • Remaining D2R routines (Pindle, Countess)
  • Web dashboard (SolidJS)
  • Account/license system
  • Multi-instance support
  • Tests

Next Steps (Priority Order)

  1. Implement capture backend internals — Win32 BitBlt/DXGI, X11 XGetImage, Wayland PipeWire
  2. Engine services implementation — wire EngineServices interface to actual capture/resolution system
  3. GoCV integration — make vision pipeline actually work
  4. Platform input backends — Windows (SendInput) and Linux (uinput)
  5. D2R detector implementation — health orb reading, menu detection using new region system
  6. D2R Mephisto routine — complete implementation
  7. WebSocket real-time status streaming
  8. SolidJS dashboard
  9. Pindle + Countess routines
  10. Account system + licensing

Key Design Decisions

  1. Go over Python — performance for real-time capture+vision at 30+ FPS
  2. Plugin system — engine is game-agnostic, new game = new plugin
  3. VM isolation — engine on host, game in VM, so no bot code runs inside the guest
  4. Declarative loot — YAML rules, user-customizable via web UI
  5. Single binary — engine + API in one Go binary, easy distribution
  6. Human-like input — Bézier curves, fatigue, breaks, route randomization

New Features (Latest Update)

Modular Capture Backends

  • Registry system: pkg/engine/capture/backends/registry.go manages available backends
  • Platform support: Windows (Win32), Linux (X11, Wayland), cross-platform (VNC, SPICE, Monitor, File)
  • Build tags: Platform-specific code using //go:build constraints
  • Configurable: Backends selected via YAML config with backend-specific options
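
A registry of this kind might look roughly like the sketch below. It is not the contents of registry.go: the Source interface, the Factory signature, the option keys, and the blank stand-in backend are all assumptions chosen to keep the example self-contained.

```go
package main

import (
	"fmt"
	"image"
	"sort"
)

// Source is a hypothetical capture backend: one frame per call.
type Source interface {
	Capture() (image.Image, error)
}

// Factory builds a Source from backend-specific options, for example the
// options map decoded from a YAML config.
type Factory func(opts map[string]any) (Source, error)

// Registry maps backend names to factories. It is not safe for concurrent use.
type Registry struct{ factories map[string]Factory }

func NewRegistry() *Registry { return &Registry{factories: map[string]Factory{}} }

// Register adds a backend and rejects duplicate names.
func (r *Registry) Register(name string, f Factory) error {
	if _, dup := r.factories[name]; dup {
		return fmt.Errorf("capture backend %q already registered", name)
	}
	r.factories[name] = f
	return nil
}

// Names lists registered backends in sorted order.
func (r *Registry) Names() []string {
	names := make([]string, 0, len(r.factories))
	for n := range r.factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Create instantiates the named backend with the given options.
func (r *Registry) Create(name string, opts map[string]any) (Source, error) {
	f, ok := r.factories[name]
	if !ok {
		return nil, fmt.Errorf("unknown capture backend %q (available: %v)", name, r.Names())
	}
	return f(opts)
}

// blankSource is a stand-in backend that returns an empty frame of fixed size.
type blankSource struct{ w, h int }

func (b blankSource) Capture() (image.Image, error) {
	return image.NewRGBA(image.Rect(0, 0, b.w, b.h)), nil
}

func newBlank(opts map[string]any) (Source, error) {
	w, _ := opts["width"].(int)
	h, _ := opts["height"].(int)
	if w <= 0 || h <= 0 {
		return nil, fmt.Errorf("blank backend needs positive width and height")
	}
	return blankSource{w, h}, nil
}

func main() {
	reg := NewRegistry()
	_ = reg.Register("blank", newBlank)
	src, err := reg.Create("blank", map[string]any{"width": 1280, "height": 720})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	frame, _ := src.Capture()
	fmt.Println(frame.Bounds().Dx(), frame.Bounds().Dy())
	_, err = reg.Create("vnc", nil) // not registered in this sketch
	fmt.Println(err)
}
```

With build tags, each platform file would typically call Register from an init function, so that only the backends compiled for the current platform show up in Names.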

Resolution Profile System

  • Multi-resolution support: Games can now support multiple resolutions (1080p, 720p, etc.)
  • Named regions: Screen regions are named (e.g., "health_orb") instead of hardcoded rectangles
  • Plugin interface: Added SupportedResolutions() method to Plugin interface
  • Engine services: New methods in EngineServices for resolution-aware region access
  • D2R profiles: Pre-configured for 1920x1080 and 1280x720

Configuration System

  • YAML config: config/d2r.yaml with comprehensive bot settings
  • Flexible capture: Support for window capture, VNC, Wayland, file input, etc.
  • Safety settings: Session limits, break timings, health thresholds
  • Loot configuration: Item pickup rules, rune tiers, gem settings
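
To show how declarative pickup rules might be evaluated once loaded from YAML, here is a small sketch. The Item and Rule fields, the first-match-wins ordering, and the default of skipping unmatched items are assumptions for illustration; the actual rule schema in config/d2r.yaml is not reproduced here, and YAML parsing is left out to keep the example standard-library only.

```go
package main

import "fmt"

// Item is a hypothetical description of a dropped item as read from the screen.
type Item struct {
	Name    string
	Quality string // e.g. "unique", "set", "rare", "rune"
	Tier    int    // e.g. rune number; 0 when not applicable
}

// Rule mirrors one entry of a rule list. Empty string fields and a zero
// MinTier match anything.
type Rule struct {
	Quality string
	Name    string
	MinTier int
	Pickup  bool
}

// matches reports whether every non-empty condition of the rule holds.
func (r Rule) matches(it Item) bool {
	if r.Quality != "" && r.Quality != it.Quality {
		return false
	}
	if r.Name != "" && r.Name != it.Name {
		return false
	}
	return it.Tier >= r.MinTier
}

// ShouldPickup applies rules top to bottom; the first matching rule
// decides. Items that match no rule are skipped.
func ShouldPickup(rules []Rule, it Item) bool {
	for _, r := range rules {
		if r.matches(it) {
			return r.Pickup
		}
	}
	return false
}

func main() {
	rules := []Rule{
		{Quality: "rune", MinTier: 20, Pickup: true}, // high runes
		{Quality: "rune", Pickup: false},             // all other runes
		{Quality: "unique", Pickup: true},
	}
	items := []Item{
		{Name: "Ist", Quality: "rune", Tier: 24},
		{Name: "El", Quality: "rune", Tier: 1},
		{Name: "Some Unique", Quality: "unique"},
		{Name: "Some Rare", Quality: "rare"},
	}
	for _, it := range items {
		fmt.Printf("%-12s pickup=%v\n", it.Name, ShouldPickup(rules, it))
	}
}
```

First-match ordering keeps the rule file readable: specific rules go at the top, broad fallbacks at the bottom.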

D2R Notes

  • Supported resolutions: 1920x1080 (primary), 1280x720 (secondary)
  • Primary farming character: Sorceress with Teleport
  • Key routines: Mephisto (moat trick), Pindle (fastest), Countess (runes)
  • Resolution profiles replace hardcoded screen regions
  • Regions now accessed via engine services: Region("health_orb"), Region("mana_orb"), etc.
  • The user plays D2R actively and knows the game well