Update iso-bot skill with capture backends and resolution profiles

Hoid 2026-02-14 10:01:16 +00:00
parent 986444b108
commit 4761c3c4e7
2 changed files with 52 additions and 19 deletions

@@ -1 +1 @@
Subproject commit 3b363192f29393abe51cad2293a57a590476faf6 Subproject commit 80ba9b1b906cd73d361136889c38cfb616302929

@@ -15,7 +15,7 @@ Screen-reading bot engine for isometric games. First implementation: Diablo II:
- **Screen capture:** Platform-native (Win32 API / X11) - **Screen capture:** Platform-native (Win32 API / X11)
- **Input simulation:** Platform-native (SendInput / uinput) - **Input simulation:** Platform-native (SendInput / uinput)
- **API:** net/http + gorilla/websocket (REST + WS) - **API:** net/http + gorilla/websocket (REST + WS)
- **Dashboard:** React + TypeScript (planned) - **Dashboard:** SolidJS + TypeScript (planned)
- **Config:** YAML - **Config:** YAML
- **Loot filter:** Declarative YAML rule engine - **Loot filter:** Declarative YAML rule engine
@@ -26,17 +26,20 @@ cmd/iso-bot/ Single binary entry point
pkg/ pkg/
├── engine/ ├── engine/
│ ├── capture/ Screen capture (window, VM, full screen) │ ├── capture/ Screen capture (window, VM, full screen)
│ │ └── backends/ Modular capture implementations (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
│ ├── vision/ Template matching, color detection (GoCV) │ ├── vision/ Template matching, color detection (GoCV)
│ ├── input/ Mouse (Bézier curves), keyboard, humanization │ ├── input/ Mouse (Bézier curves), keyboard, humanization
│ ├── state/ Game state machine with event callbacks │ ├── state/ Game state machine with event callbacks
│ ├── safety/ Session timing, breaks, pattern randomization │ ├── safety/ Session timing, breaks, pattern randomization
│ ├── navigation/ A* pathfinding, click-to-move │ ├── navigation/ A* pathfinding, click-to-move
│ ├── resolution/ Resolution profile system for multi-resolution support
│ └── loot/ Declarative rule-based loot filter │ └── loot/ Declarative rule-based loot filter
├── plugin/ Game plugin interface ├── plugin/ Game plugin interface (now includes SupportedResolutions)
├── api/ REST + WebSocket API ├── api/ REST + WebSocket API
└── auth/ License/account validation └── auth/ License/account validation
plugins/d2r/ D2R game plugin plugins/d2r/ D2R game plugin (updated for resolution profiles)
web/ React dashboard (planned) config/ YAML configuration files
web/ SolidJS dashboard (planned)
``` ```
## Plugin System ## Plugin System
@@ -65,30 +68,38 @@ GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main
## Current Status ## Current Status
- ✅ Go project structure - ✅ Go project structure
- ✅ Plugin interface system (`pkg/plugin`) - ✅ Plugin interface system (`pkg/plugin`) with SupportedResolutions
- ✅ Engine: capture, vision, input/humanize, state, safety, loot filter - ✅ Engine: capture, vision, input/humanize, state, safety, loot filter
- ✅ Modular capture backend system (`pkg/engine/capture/backends/`)
- ✅ Resolution profile system (`pkg/engine/resolution/`)
- ✅ Platform-specific capture backend stubs (Win32, X11, Wayland, VNC, SPICE, Monitor, File)
- ✅ D2R resolution profiles (1080p, 720p)
- ✅ Updated D2R plugin for resolution-based regions
- ✅ API server skeleton (REST endpoints) - ✅ API server skeleton (REST endpoints)
- ✅ D2R plugin: config, detector, reader, Mephisto routine - ✅ D2R plugin: config, detector, reader, Mephisto routine
- ✅ Declarative loot filter with YAML rules - ✅ Declarative loot filter with YAML rules
- ⏳ Platform-specific capture backends (Win32, X11) - ✅ YAML configuration system (`config/d2r.yaml`)
- ⏳ Backend implementation (actual Win32/X11/Wayland capture code)
- ⏳ GoCV integration for actual vision processing - ⏳ GoCV integration for actual vision processing
- ⏳ Platform-specific input backends - ⏳ Platform-specific input backends
- ⏳ Engine services implementation for EngineServices interface
- ⏳ Remaining D2R routines (Pindle, Countess) - ⏳ Remaining D2R routines (Pindle, Countess)
- ⏳ Web dashboard (React) - ⏳ Web dashboard (SolidJS)
- ⏳ Account/license system - ⏳ Account/license system
- ⏳ Multi-instance support - ⏳ Multi-instance support
- ⏳ Tests - ⏳ Tests
## Next Steps (Priority Order) ## Next Steps (Priority Order)
1. GoCV integration — make vision pipeline actually work 1. Implement capture backend internals — Win32 BitBlt/DXGI, X11 XGetImage, Wayland PipeWire
2. Platform capture backends — Windows (BitBlt/DXGI) and Linux (X11) 2. Engine services implementation — wire EngineServices interface to actual capture/resolution system
3. Platform input backends — Windows (SendInput) and Linux (uinput) 3. GoCV integration — make vision pipeline actually work
4. D2R detector implementation — health orb reading, menu detection 4. Platform input backends — Windows (SendInput) and Linux (uinput)
5. D2R Mephisto routine — complete implementation 5. D2R detector implementation — health orb reading, menu detection using new region system
6. WebSocket real-time status streaming 6. D2R Mephisto routine — complete implementation
7. React dashboard 7. WebSocket real-time status streaming
8. Pindle + Countess routines 8. SolidJS dashboard
9. Account system + licensing 9. Pindle + Countess routines
10. Account system + licensing
## Key Design Decisions ## Key Design Decisions
1. **Go over Python** — performance for real-time capture+vision at 30+ FPS 1. **Go over Python** — performance for real-time capture+vision at 30+ FPS
@@ -98,9 +109,31 @@ GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main
5. **Single binary** — engine + API in one Go binary, easy distribution 5. **Single binary** — engine + API in one Go binary, easy distribution
6. **Human-like input** — Bézier curves, fatigue, breaks, route randomization 6. **Human-like input** — Bézier curves, fatigue, breaks, route randomization
## New Features (Latest Update)
### Modular Capture Backends
- **Registry system**: `pkg/engine/capture/backends/registry.go` manages available backends
- **Platform support**: Windows (Win32), Linux (X11, Wayland), cross-platform (VNC, SPICE, Monitor, File)
- **Build tags**: Platform-specific code using `//go:build` constraints
- **Configurable**: Backends selected via YAML config with backend-specific options
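A backend registry of this kind could look roughly like the sketch below. It is not the contents of `registry.go`: the `Backend` interface, the `Factory` type, and the `Register`, `New`, and `Available` functions are hypothetical names chosen for illustration. In a real layout each `Register` call would probably live in an `init()` function inside a file guarded by a `//go:build` constraint, so that only backends valid for the target platform are compiled in.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Backend is a hypothetical capture interface; the real one may differ.
type Backend interface {
	Name() string
}

// Factory builds a Backend from backend-specific options (e.g. from YAML).
type Factory func(opts map[string]string) (Backend, error)

var (
	mu        sync.RWMutex
	factories = map[string]Factory{}
)

// Register adds a factory under the given name.
func Register(name string, f Factory) {
	mu.Lock()
	defer mu.Unlock()
	factories[name] = f
}

// New instantiates the named backend, or fails if it was never registered
// (for example because its file was excluded by a build constraint).
func New(name string, opts map[string]string) (Backend, error) {
	mu.RLock()
	f, ok := factories[name]
	mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("capture backend %q not registered", name)
	}
	return f(opts)
}

// Available lists the registered backend names in sorted order.
func Available() []string {
	mu.RLock()
	defer mu.RUnlock()
	names := make([]string, 0, len(factories))
	for n := range factories {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// fileBackend stands in for a cross-platform backend in this sketch.
type fileBackend struct{ path string }

func (b fileBackend) Name() string { return "file" }

func init() {
	Register("file", func(opts map[string]string) (Backend, error) {
		return fileBackend{path: opts["path"]}, nil
	})
}

func main() {
	b, err := New("file", map[string]string{"path": "frame.png"})
	if err != nil {
		panic(err)
	}
	fmt.Println(b.Name(), Available())
}
```

Looking backends up by name keeps the YAML config decoupled from the platform: a config that names a backend that was not compiled in produces an explicit error instead of a build failure.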
### Resolution Profile System
- **Multi-resolution support**: Games can now support multiple resolutions (1080p, 720p, etc.)
- **Named regions**: Screen regions are named (e.g., "health_orb") instead of hardcoded rectangles
- **Plugin interface**: Added `SupportedResolutions()` method to Plugin interface
- **Engine services**: New methods in EngineServices for resolution-aware region access
- **D2R profiles**: Pre-configured for 1920x1080 and 1280x720
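The named-region idea can be sketched as a lookup keyed by resolution and region name. The types below (`Resolution`, `Profile`, `Profiles`) and the helper `exampleProfiles` are assumptions for illustration, and the rectangle coordinates are placeholders, not the values used by the D2R plugin.

```go
package main

import (
	"fmt"
	"image"
)

// Resolution identifies a profile by frame size in pixels.
type Resolution struct{ Width, Height int }

// Profile maps region names to rectangles for one resolution.
type Profile struct {
	Resolution Resolution
	Regions    map[string]image.Rectangle
}

// Profiles indexes profiles by resolution (hypothetical container type).
type Profiles map[Resolution]Profile

// Region looks up a named region for the given resolution.
func (ps Profiles) Region(res Resolution, name string) (image.Rectangle, error) {
	p, ok := ps[res]
	if !ok {
		return image.Rectangle{}, fmt.Errorf("no profile for %dx%d", res.Width, res.Height)
	}
	r, ok := p.Regions[name]
	if !ok {
		return image.Rectangle{}, fmt.Errorf("region %q not defined for %dx%d",
			name, res.Width, res.Height)
	}
	return r, nil
}

// exampleProfiles returns two profiles with PLACEHOLDER coordinates.
func exampleProfiles() Profiles {
	hd := Resolution{1920, 1080}
	sd := Resolution{1280, 720}
	return Profiles{
		hd: {Resolution: hd, Regions: map[string]image.Rectangle{
			"health_orb": image.Rect(60, 900, 240, 1050),
		}},
		sd: {Resolution: sd, Regions: map[string]image.Rectangle{
			"health_orb": image.Rect(40, 600, 160, 700),
		}},
	}
}

func main() {
	ps := exampleProfiles()
	r, err := ps.Region(Resolution{1280, 720}, "health_orb")
	if err != nil {
		panic(err)
	}
	fmt.Println(r)
}
```

Callers ask for `"health_orb"` without knowing pixel coordinates, so adding a resolution means adding a profile rather than editing detector code.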
### Configuration System
- **YAML config**: `config/d2r.yaml` with comprehensive bot settings
- **Flexible capture**: Support for window capture, VNC, Wayland, file input, etc.
- **Safety settings**: Session limits, break timings, health thresholds
- **Loot configuration**: Item pickup rules, rune tiers, gem settings
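One way such settings might map onto Go types is sketched below. The struct names, field names, and `yaml` keys are guesses for illustration and do not reflect the actual layout of `config/d2r.yaml`. Decoding is left out because it would need a third-party YAML package. The sketch only shows validation of already-loaded values.

```go
package main

import "fmt"

// CaptureConfig selects a capture backend and its options (hypothetical keys).
type CaptureConfig struct {
	Backend string            `yaml:"backend"`
	Options map[string]string `yaml:"options"`
}

// SafetyConfig holds session limits and thresholds (hypothetical keys).
type SafetyConfig struct {
	MaxSessionMinutes int     `yaml:"max_session_minutes"`
	BreakMinutes      int     `yaml:"break_minutes"`
	HealthThreshold   float64 `yaml:"health_threshold"` // fraction of full health
}

// Config is the top-level document in this sketch.
type Config struct {
	Capture CaptureConfig `yaml:"capture"`
	Safety  SafetyConfig  `yaml:"safety"`
}

// Validate returns the first problem found, or nil if the config is usable.
func (c Config) Validate() error {
	if c.Capture.Backend == "" {
		return fmt.Errorf("capture.backend must be set")
	}
	if c.Safety.MaxSessionMinutes <= 0 {
		return fmt.Errorf("safety.max_session_minutes must be positive, got %d",
			c.Safety.MaxSessionMinutes)
	}
	if c.Safety.BreakMinutes < 0 {
		return fmt.Errorf("safety.break_minutes must not be negative, got %d",
			c.Safety.BreakMinutes)
	}
	if c.Safety.HealthThreshold <= 0 || c.Safety.HealthThreshold >= 1 {
		return fmt.Errorf("safety.health_threshold must be between 0 and 1, got %v",
			c.Safety.HealthThreshold)
	}
	return nil
}

func main() {
	cfg := Config{
		Capture: CaptureConfig{Backend: "x11", Options: map[string]string{"window": "D2R"}},
		Safety:  SafetyConfig{MaxSessionMinutes: 90, BreakMinutes: 10, HealthThreshold: 0.4},
	}
	fmt.Println(cfg.Validate())
}
```

Validating once at startup means a mistyped threshold or a missing backend name is reported before the bot begins a session.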
## D2R Notes ## D2R Notes
- Target resolution: 1920x1080 - Supported resolutions: 1920x1080 (primary), 1280x720 (secondary)
- Primary farming character: Sorceress with Teleport - Primary farming character: Sorceress with Teleport
- Key routines: Mephisto (moat trick), Pindle (fastest), Countess (runes) - Key routines: Mephisto (moat trick), Pindle (fastest), Countess (runes)
- Screen regions and HSV colors defined in `plugins/d2r/config.go` - Resolution profiles replace hardcoded screen regions
- Regions now accessed via engine services: `Region("health_orb")`, `Region("mana_orb")`, etc.
- The user plays D2R actively and knows the game well - The user plays D2R actively and knows the game well