diff --git a/projects/iso-bot b/projects/iso-bot
index e0282a7..3b36319 160000
--- a/projects/iso-bot
+++ b/projects/iso-bot
@@ -1 +1 @@
-Subproject commit e0282a7111bc428438efd38fac178cfdb8c16a40
+Subproject commit 3b363192f29393abe51cad2293a57a590476faf6
diff --git a/skills/iso-bot/SKILL.md b/skills/iso-bot/SKILL.md
index d47abdd..cec7406 100644
--- a/skills/iso-bot/SKILL.md
+++ b/skills/iso-bot/SKILL.md
@@ -3,90 +3,104 @@
 ## Overview
 Screen-reading bot engine for isometric games. First implementation: Diablo II: Resurrected.
 
-**Approach:** Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification.
+**Approach:** Screen capture + computer vision + human-like input simulation. No memory injection, no hooking, no client modification. Engine runs on host, game runs in VM for detection isolation.
 
 ## Repository
 - **Local:** `/home/openclaw/.openclaw/workspace/projects/iso-bot`
-- **Remote:** `git@git.cloonar.com:openclawd/iso-bot.git` (pending repo creation)
+- **Remote:** `ssh://forgejo@git.cloonar.com/openclawd/iso-bot.git`
+
+## Tech Stack
+- **Engine:** Go 1.23+
+- **Vision:** GoCV (OpenCV bindings for Go)
+- **Screen capture:** Platform-native (Win32 API / X11)
+- **Input simulation:** Platform-native (SendInput / uinput)
+- **API:** net/http + gorilla/websocket (REST + WS)
+- **Dashboard:** React + TypeScript (planned)
+- **Config:** YAML
+- **Loot filter:** Declarative YAML rule engine
 
 ## Architecture
 ```
-engine/                  # Reusable core — game-agnostic
-├── screen/              # Screenshot capture (mss), OCR, template matching
-├── input/               # Mouse (bezier curves), keyboard, humanization
-├── vision/              # Object detection, color analysis, UI element finding
-├── state/               # State machine, event bus
-├── navigation/          # Pathfinding (A*), click-to-move
-└── safety/              # Session timing, break scheduling, pattern randomization
-
-games/d2r/               # Diablo II: Resurrected implementation
-├── config.py            # Screen regions, colors, timings (1920x1080)
-├── game.py              # Main bot loop & orchestration
-├── screens/             # State detection (menu, in-game, inventory)
-├── routines/            # Farming routines (Mephisto, Pindle, Countess)
-└── templates/           # UI template images for matching
-
-ui/                      # Web dashboard (FastAPI) — planned
-config/                  # YAML configuration
+cmd/iso-bot/              Single binary entry point
+pkg/
+├── engine/
+│   ├── capture/          Screen capture (window, VM, full screen)
+│   ├── vision/           Template matching, color detection (GoCV)
+│   ├── input/            Mouse (Bézier curves), keyboard, humanization
+│   ├── state/            Game state machine with event callbacks
+│   ├── safety/           Session timing, breaks, pattern randomization
+│   ├── navigation/       A* pathfinding, click-to-move
+│   └── loot/             Declarative rule-based loot filter
+├── plugin/               Game plugin interface
+├── api/                  REST + WebSocket API
+└── auth/                 License/account validation
+plugins/d2r/              D2R game plugin
+web/                      React dashboard (planned)
 ```
 
-## Key Design Decisions
-1. **Screen reading only** — captures screenshots, analyzes pixels/templates, never touches game memory
-2. **Human-like input** — Bezier mouse curves, randomized delays, micro/long breaks, fatigue simulation
-3. **Reusable engine** — adding a new game = new `games//` directory implementing game-specific detection
-4. **Anti-detection** — session timing, route randomization, behavioral variation, configurable break schedules
-5. **Configuration-driven** — YAML configs for all tunable parameters
-
-## Tech Stack
-- Python 3.11+
-- OpenCV (template matching, color detection)
-- pytesseract (OCR)
-- mss (fast screenshot capture)
-- pyautogui/pynput (input simulation)
-- FastAPI (dashboard — planned)
+## Plugin System
+All game logic is behind interfaces in `pkg/plugin/plugin.go`:
+- `Plugin` — main entry point, returns detector/reader/routines
+- `GameDetector` — detect state from screenshots
+- `ScreenReader` — extract items, enemies, text
+- `Routine` — automated farming sequences (context-aware, cancellable)
+- `LootFilter` — item pickup rules
+- `EngineServices` — engine capabilities provided to plugins
 
 ## Development Conventions
-- Type hints on all functions
-- Docstrings on all modules, classes, public methods
-- `black` formatting
-- Tests in `tests/`
+- Go standard project layout
+- Type hints / godoc on all exported types
+- `gofmt` formatting
+- Tests in `*_test.go` files alongside code
 - Feature branches → main
+- Always commit and push after changes
 
 ## Git Workflow
 ```bash
 cd /home/openclaw/.openclaw/workspace/projects/iso-bot
-git checkout -b feature/
-# ... work ...
 git add -A && git commit -m "descriptive message"
-git checkout main && git merge feature/
-git push origin main
+GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no" git push origin main
 ```
 
 ## Current Status
-- ✅ Project structure created
-- ✅ Core engine modules with meaningful stubs
-- ✅ D2R config (screen regions, colors, timings)
-- ✅ D2R farming routines (Mephisto, Pindle, Countess) — stub phase
-- ⏳ Remote git repo (needs write-access token)
-- ⏳ Implement actual screen detection logic
-- ⏳ Implement input simulation
-- ⏳ Template image collection
-- ⏳ Web dashboard
-- ⏳ Testing
+- ✅ Go project structure
+- ✅ Plugin interface system (`pkg/plugin`)
+- ✅ Engine: capture, vision, input/humanize, state, safety, loot filter
+- ✅ API server skeleton (REST endpoints)
+- ✅ D2R plugin: config, detector, reader, Mephisto routine
+- ✅ Declarative loot filter with YAML rules
+- ⏳ Platform-specific capture backends (Win32, X11)
+- ⏳ GoCV integration for actual vision processing
+- ⏳ Platform-specific input backends
+- ⏳ Remaining D2R routines (Pindle, Countess)
+- ⏳ Web dashboard (React)
+- ⏳ Account/license system
+- ⏳ Multi-instance support
+- ⏳ Tests
 
 ## Next Steps (Priority Order)
-1. Get remote repo set up and push
-2. Implement `engine/screen/capture.py` — verified working on target machine
-3. Implement `engine/input/mouse.py` — Bezier curve mouse movement
-4. Implement `engine/input/keyboard.py` — human-like key presses
-5. D2R menu detection (character select, create game)
-6. D2R basic Mephisto run (teleport → kill → loot → exit)
-7. Loot detection and filtering
-8. Web dashboard for monitoring
+1. GoCV integration — make vision pipeline actually work
+2. Platform capture backends — Windows (BitBlt/DXGI) and Linux (X11)
+3. Platform input backends — Windows (SendInput) and Linux (uinput)
+4. D2R detector implementation — health orb reading, menu detection
+5. D2R Mephisto routine — complete implementation
+6. WebSocket real-time status streaming
+7. React dashboard
+8. Pindle + Countess routines
+9. Account system + licensing
 
-## Notes
-- Bot runs on a separate machine or VM from the game client
+## Key Design Decisions
+1. **Go over Python** — performance for real-time capture+vision at 30+ FPS
+2. **Plugin system** — engine is game-agnostic, new game = new plugin
+3. **VM isolation** — engine on host, game in VM, zero detection surface
+4. **Declarative loot** — YAML rules, user-customizable via web UI
+5. **Single binary** — engine + API in one Go binary, easy distribution
+6. **Human-like input** — Bézier curves, fatigue, breaks, route randomization
+
+## D2R Notes
 - Target resolution: 1920x1080
 - Primary farming character: Sorceress with Teleport
-- The user plays D2R (Necro "Baltasar", Summoner build) — knows the game well
+- Key routines: Mephisto (moat trick), Pindle (fastest), Couness (runes)
+- Screen regions and HSV colors defined in `plugins/d2r/config.go`
+- The user plays D2R actively and knows the game well