Update iso-bot skill for Go stack

Hoid 2026-02-14 09:44:08 +00:00
parent bb25311911
commit 986444b108
2 changed files with 77 additions and 63 deletions

@ -1 +1 @@
Subproject commit e0282a7111bc428438efd38fac178cfdb8c16a40 Subproject commit 3b363192f29393abe51cad2293a57a590476faf6


@ -3,90 +3,104 @@
## Overview ## Overview
Screen-reading bot engine for isometric games. First implementation: Diablo II: Resurrected. Screen-reading bot engine for isometric games. First implementation: Diablo II: Resurrected.
**Approach:** Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification. **Approach:** Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification. Engine runs on host, game runs in VM for detection isolation.
## Repository ## Repository
- **Local:** `/home/openclaw/.openclaw/workspace/projects/iso-bot` - **Local:** `/home/openclaw/.openclaw/workspace/projects/iso-bot`
- **Remote:** `git@git.cloonar.com:openclawd/iso-bot.git` (pending repo creation) - **Remote:** `ssh://forgejo@git.cloonar.com/openclawd/iso-bot.git`
## Tech Stack
- **Engine:** Go 1.23+
- **Vision:** GoCV (OpenCV bindings for Go)
- **Screen capture:** Platform-native (Win32 API / X11)
- **Input simulation:** Platform-native (SendInput / uinput)
- **API:** net/http + gorilla/websocket (REST + WS)
- **Dashboard:** React + TypeScript (planned)
- **Config:** YAML
- **Loot filter:** Declarative YAML rule engine
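
The declarative loot filter named in the stack can be sketched as an ordered rule list where the first matching rule decides. This is only an illustration: the `Item` and `Rule` fields and the `ShouldPickup` function are hypothetical names, not taken from the repository, and loading the rules from YAML is left out because it would need a third-party parser.

```go
package main

import (
	"fmt"
	"strings"
)

// Item is a hypothetical description of something seen on the ground.
type Item struct {
	Name    string
	Quality string // e.g. "unique", "set", "rare", "normal"
}

// Rule is a hypothetical declarative rule; empty fields match anything.
type Rule struct {
	Quality      string // matched case-insensitively
	NameContains string // case-insensitive substring match
	Pickup       bool   // decision applied when the rule matches
}

// matches reports whether every non-empty condition of the rule holds.
func (r Rule) matches(it Item) bool {
	if r.Quality != "" && !strings.EqualFold(r.Quality, it.Quality) {
		return false
	}
	if r.NameContains != "" &&
		!strings.Contains(strings.ToLower(it.Name), strings.ToLower(r.NameContains)) {
		return false
	}
	return true
}

// ShouldPickup walks the rules in order; the first match wins.
// Items that match no rule are ignored.
func ShouldPickup(rules []Rule, it Item) bool {
	for _, r := range rules {
		if r.matches(it) {
			return r.Pickup
		}
	}
	return false
}

func main() {
	rules := []Rule{
		{NameContains: "rune", Pickup: true},
		{Quality: "unique", Pickup: true},
		{Quality: "normal", Pickup: false},
	}
	for _, it := range []Item{
		{Name: "Ber Rune", Quality: "normal"},
		{Name: "Harlequin Crest", Quality: "unique"},
		{Name: "Short Sword", Quality: "normal"},
	} {
		fmt.Printf("%s: pickup=%v\n", it.Name, ShouldPickup(rules, it))
	}
}
```

Ordering the rules and stopping at the first match keeps the rule file easy to reason about: specific exceptions go on top, broad defaults at the bottom.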
## Architecture ## Architecture
``` ```
engine/ # Reusable core — game-agnostic cmd/iso-bot/ Single binary entry point
├── screen/ # Screenshot capture (mss), OCR, template matching pkg/
├── input/ # Mouse (bezier curves), keyboard, humanization ├── engine/
├── vision/ # Object detection, color analysis, UI element finding │ ├── capture/ Screen capture (window, VM, full screen)
├── state/ # State machine, event bus │ ├── vision/ Template matching, color detection (GoCV)
├── navigation/ # Pathfinding (A*), click-to-move │ ├── input/ Mouse (Bézier curves), keyboard, humanization
└── safety/ # Session timing, break scheduling, pattern randomization │ ├── state/ Game state machine with event callbacks
│ ├── safety/ Session timing, breaks, pattern randomization
games/d2r/ # Diablo II: Resurrected implementation │ ├── navigation/ A* pathfinding, click-to-move
├── config.py # Screen regions, colors, timings (1920x1080) │ └── loot/ Declarative rule-based loot filter
├── game.py # Main bot loop & orchestration ├── plugin/ Game plugin interface
├── screens/ # State detection (menu, in-game, inventory) ├── api/ REST + WebSocket API
├── routines/ # Farming routines (Mephisto, Pindle, Countess) └── auth/ License/account validation
└── templates/ # UI template images for matching plugins/d2r/ D2R game plugin
web/ React dashboard (planned)
ui/ # Web dashboard (FastAPI) — planned
config/ # YAML configuration
``` ```
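
The `state/` package is described as a game state machine with event callbacks. A minimal sketch of that idea follows; the `State` values, `Machine` type, and method names are assumptions made for illustration and may not match `pkg/engine/state`. The sketch is not safe for concurrent use.

```go
package main

import "fmt"

// State is a hypothetical label for a detected game state.
type State string

const (
	StateMenu   State = "menu"
	StateInGame State = "in_game"
	StateDead   State = "dead"
)

// Machine holds the current state and the callbacks to run on each change.
type Machine struct {
	current   State
	listeners []func(from, to State)
}

// NewMachine returns a machine starting in the given state.
func NewMachine(initial State) *Machine {
	return &Machine{current: initial}
}

// OnTransition registers a callback invoked after every state change.
func (m *Machine) OnTransition(fn func(from, to State)) {
	m.listeners = append(m.listeners, fn)
}

// Set moves to next and fires the callbacks in registration order.
// It returns false and does nothing when the state is unchanged.
func (m *Machine) Set(next State) bool {
	if next == m.current {
		return false
	}
	prev := m.current
	m.current = next
	for _, fn := range m.listeners {
		fn(prev, next)
	}
	return true
}

// Current returns the state the machine is in right now.
func (m *Machine) Current() State { return m.current }

func main() {
	m := NewMachine(StateMenu)
	m.OnTransition(func(from, to State) {
		fmt.Printf("transition: %s -> %s\n", from, to)
	})
	m.Set(StateInGame)
	m.Set(StateInGame) // same state again, so no callback fires
	m.Set(StateDead)
}
```

Suppressing callbacks for repeated identical states matters for a screen-reading loop, where the detector reports the same state on almost every frame.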
## Key Design Decisions ## Plugin System
1. **Screen reading only** — captures screenshots, analyzes pixels/templates, never touches game memory All game logic is behind interfaces in `pkg/plugin/plugin.go`:
2. **Human-like input** — Bezier mouse curves, randomized delays, micro/long breaks, fatigue simulation - `Plugin` — main entry point, returns detector/reader/routines
3. **Reusable engine** — adding a new game = new `games/<name>/` directory implementing game-specific detection - `GameDetector` — detect state from screenshots
4. **Anti-detection** — session timing, route randomization, behavioral variation, configurable break schedules - `ScreenReader` — extract items, enemies, text
5. **Configuration-driven** — YAML configs for all tunable parameters - `Routine` — automated farming sequences (context-aware, cancellable)
- `LootFilter` — item pickup rules
## Tech Stack - `EngineServices` — engine capabilities provided to plugins
- Python 3.11+
- OpenCV (template matching, color detection)
- pytesseract (OCR)
- mss (fast screenshot capture)
- pyautogui/pynput (input simulation)
- FastAPI (dashboard — planned)
## Development Conventions ## Development Conventions
- Type hints on all functions - Go standard project layout
- Docstrings on all modules, classes, public methods - Godoc comments on all exported types
- `black` formatting - `gofmt` formatting
- Tests in `tests/` - Tests in `*_test.go` files alongside code
- Feature branches → main - Feature branches → main
- Always commit and push after changes
## Git Workflow ## Git Workflow
```bash ```bash
cd /home/openclaw/.openclaw/workspace/projects/iso-bot cd /home/openclaw/.openclaw/workspace/projects/iso-bot
git checkout -b feature/<name>
# ... work ...
git add -A && git commit -m "descriptive message" git add -A && git commit -m "descriptive message"
git checkout main && git merge feature/<name> GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main
git push origin main
``` ```
## Current Status ## Current Status
- ✅ Project structure created - ✅ Go project structure
- ✅ Core engine modules with meaningful stubs - ✅ Plugin interface system (`pkg/plugin`)
- ✅ D2R config (screen regions, colors, timings) - ✅ Engine: capture, vision, input/humanize, state, safety, loot filter
- ✅ D2R farming routines (Mephisto, Pindle, Countess) — stub phase - ✅ API server skeleton (REST endpoints)
- ⏳ Remote git repo (needs write-access token) - ✅ D2R plugin: config, detector, reader, Mephisto routine
- ⏳ Implement actual screen detection logic - ✅ Declarative loot filter with YAML rules
- ⏳ Implement input simulation - ⏳ Platform-specific capture backends (Win32, X11)
- ⏳ Template image collection - ⏳ GoCV integration for actual vision processing
- ⏳ Web dashboard - ⏳ Platform-specific input backends
- ⏳ Testing - ⏳ Remaining D2R routines (Pindle, Countess)
- ⏳ Web dashboard (React)
- ⏳ Account/license system
- ⏳ Multi-instance support
- ⏳ Tests
## Next Steps (Priority Order) ## Next Steps (Priority Order)
1. Get remote repo set up and push 1. GoCV integration — make vision pipeline actually work
2. Implement `engine/screen/capture.py` — verified working on target machine 2. Platform capture backends — Windows (BitBlt/DXGI) and Linux (X11)
3. Implement `engine/input/mouse.py` — Bezier curve mouse movement 3. Platform input backends — Windows (SendInput) and Linux (uinput)
4. Implement `engine/input/keyboard.py` — human-like key presses 4. D2R detector implementation — health orb reading, menu detection
5. D2R menu detection (character select, create game) 5. D2R Mephisto routine — complete implementation
6. D2R basic Mephisto run (teleport → kill → loot → exit) 6. WebSocket real-time status streaming
7. Loot detection and filtering 7. React dashboard
8. Web dashboard for monitoring 8. Pindle + Countess routines
9. Account system + licensing
## Notes ## Key Design Decisions
- Bot runs on a separate machine or VM from the game client 1. **Go over Python** — performance for real-time capture+vision at 30+ FPS
2. **Plugin system** — engine is game-agnostic, new game = new plugin
3. **VM isolation** — engine on host, game in VM, zero detection surface
4. **Declarative loot** — YAML rules, user-customizable via web UI
5. **Single binary** — engine + API in one Go binary, easy distribution
6. **Human-like input** — Bézier curves, fatigue, breaks, route randomization
## D2R Notes
- Target resolution: 1920x1080 - Target resolution: 1920x1080
- Primary farming character: Sorceress with Teleport - Primary farming character: Sorceress with Teleport
- The user plays D2R (Necro "Baltasar", Summoner build) — knows the game well - Key routines: Mephisto (moat trick), Pindle (fastest), Countess (runes)
- Screen regions and HSV colors defined in `plugins/d2r/config.go`
- The user plays D2R actively and knows the game well