(adaptable to FE6/FE8, just requires different Lua reading files).
First post here! I actually left a comment on a couple different posts over the past couple of days because I've been working on a project for a course for my Master's degree and recently finished it (well, got it to "submit-able" state lol). Anyway… introducing EmblemMind, a shot at creating a self-taught, multi-agent algorithm for FE7. The original intention was to create a dynamic system for FE6-8, but I realized that data-gathering meant specializing in one game. I hope/plan to keep working on this project and improving it, but the repo IS public, so all my work can be duplicated and built on by other creators. I'm also more than willing to give others write access to the repo, feel free to DM me.
– Substance of the Project (Sneak Peek; check out the README) –
Memory Reading & Game State Extraction
The core of EmblemMind is its ability to read the game's memory state directly from the BizHawk emulator (a parsing sketch follows the list below):
Memory Mapping: Uses exact CodeBreaker memory addresses to locate game data structures.
Memory Structure Access:
Character data (0x0202BD50): Stats, position, items, weapon ranks
Enemy units (0x0202CEC0): Same structure as character data
Map terrain (0x0202E3D8): Width, height, and terrain grid
Battle structs (0x0203A3F0): Combat stats and calculations
Turn phase (0x0202BC07): Player (0x00), Neutral (0x40), or Enemy (0x80) phase
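To give a concrete idea of what reading those structures looks like, here's a rough sketch of how the Python side could unpack one unit struct once the Lua script has shipped the raw bytes over. The 72-byte struct size and the field offsets follow the commonly documented GBA Fire Emblem unit layout; treat them as assumptions to verify against the repo, not the project's confirmed values.

```python
import struct

UNIT_STRUCT_SIZE = 0x48  # assumed 72-byte unit struct; verify against the repo
CHARACTER_DATA = 0x0202BD50  # player unit array address from the list above

def parse_unit(raw: bytes, offset: int = 0) -> dict:
    """Unpack one unit struct from a raw EWRAM dump.

    Offsets below are the commonly documented GBA FE layout and are
    assumptions here, not confirmed from the EmblemMind source.
    """
    x, y = struct.unpack_from("<BB", raw, offset + 0x10)
    max_hp, cur_hp = struct.unpack_from("<BB", raw, offset + 0x12)
    pow_, skl, spd, df, res, lck = struct.unpack_from("<6B", raw, offset + 0x14)
    # Five inventory slots: (item id, remaining uses) byte pairs.
    items = [struct.unpack_from("<BB", raw, offset + 0x1E + 2 * i) for i in range(5)]
    return {
        "pos": (x, y),
        "hp": (cur_hp, max_hp),
        "stats": {"pow": pow_, "skl": skl, "spd": spd, "def": df, "res": res, "lck": lck},
        "items": items,
    }

def parse_roster(raw: bytes, count: int) -> list[dict]:
    """Parse `count` consecutive unit structs from one memory dump."""
    return [parse_unit(raw, i * UNIT_STRUCT_SIZE) for i in range(count)]
```

The same parser works for the enemy array at 0x0202CEC0, since the post notes both use the same structure.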
Input Control System
The AI controls the game through the following (see the input sketch after this list):
listen_input.lua: Script running in BizHawk that executes joypad inputs
Keyboard Automation: Alternative input method using the keyboard Python package
Action Coordination: Manages input timing and sequences for complex game actions
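For the keyboard-automation path, a minimal sketch using the keyboard package might look like the following. The key bindings and delays here are assumptions (BizHawk's defaults vary by setup), not the project's actual mapping; check listen_input.lua and the repo config for the real bindings.

```python
import time
import keyboard  # pip install keyboard; may need admin/root privileges

# Assumed key bindings for BizHawk's GBA core; EmblemMind's actual
# mapping may differ -- check listen_input.lua / the repo config.
KEYMAP = {"A": "x", "B": "z", "UP": "up", "DOWN": "down",
          "LEFT": "left", "RIGHT": "right", "START": "enter"}

def tap(button: str, hold: float = 0.05, settle: float = 0.1) -> None:
    """Press a GBA button briefly, then pause so the game registers it."""
    key = KEYMAP[button]
    keyboard.press(key)
    time.sleep(hold)
    keyboard.release(key)
    time.sleep(settle)

def run_sequence(buttons: list[str]) -> None:
    """Execute a scripted input sequence for a complex game action."""
    for b in buttons:
        tap(b)

# Example: move the cursor two tiles right and confirm.
# run_sequence(["RIGHT", "RIGHT", "A"])
```

The timing parameters are the point of the "Action Coordination" layer: inputs sent faster than the game polls them simply get dropped.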
It's an experimental machine learning program that plays FE7 for you without acting stupid like the in-game AI.
I assume it uses machine learning instead of behavior trees because (1) it's easier and (2) it adapts more feasibly to romhacks and their gameplay changes.
This uses a negligible amount of energy compared to big LLMs (though you still need to leave your computer on and running the game to train it), and it learns by taking actions until it reads a win state, like those funny Mario neural-network experiments that used to be popular.
Edit: On the note of romhacks, it might be pertinent to leave room for dynamically reading the unit menu, in case of combat arts and such.
Can you go into more detail about the heuristic function you used? Did you infer it via PyTorch, or did you prime it based on remaining enemy units, etc.?
What do you think the advantages are of using Monte Carlo tree search instead of, e.g., alpha-beta pruning?
Yessirrr! (Sit down, it's gunna be a bit of a longer answer…) Yeah, so the vision is a "multi-agent, game-independent, dynamic system" (or, in less fancy terms, a bot that uses multiple different components to play FE6-8 without requiring game-specific code).

Currently, I'm using a bit of a custom heuristic/forcing function, because I'm endeavoring to force behavior during the training stage. I want the AI to be as near-perfect as possible (in the end), which includes route/pathing optimization as well as damage minimization. However, because of that logic (specifically the "minimize damage taken" part), the AI prioritizes moving out of the way of enemies, even when it understands the goal of the level to be "defeat all enemies" (which, right now, is hardcoded as the goal state; there are more complex ways to get around this, like RAM hacking to detect more complex tiles and actions, like "seize" etc.). For the time being, it needs to learn that attacking enemies is GOOD, but attacking enemies SMARTLY is better. This comes with some custom code for things the agent probably COULD learn with time (stuff like the weapon triangle), but there IS some hardcoded logic to favor advantageous things (like forest terrain, etc.).

The current heuristic is essentially a limited-depth, MCTS-esque search, where the agent plays out possible moves and situations by "probing" the player units (RAM doesn't update until you click a unit AND move the cursor, hence you'd see in the README gif demo that it clicks and temp-moves the players a bunch of times). This allows the AI to score different types of moves, favoring those with weapon triangle bonuses, terrain bonuses, etc.
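To make that scoring idea concrete, here's a rough Python sketch of this kind of hand-tuned move scoring over probed positions. The weights, triangle table, and terrain values are made-up illustrations, not the actual numbers or structure from the repo:

```python
import random

# Hypothetical hand-tuned weights -- illustrative only, not EmblemMind's values.
W_KILL, W_DMG_DEALT, W_DMG_TAKEN, W_TRIANGLE, W_TERRAIN = 50.0, 1.0, 2.0, 5.0, 3.0

# +1 = advantage, -1 = disadvantage (sword > axe > lance > sword).
TRIANGLE = {("sword", "axe"): 1, ("axe", "lance"): 1, ("lance", "sword"): 1,
            ("axe", "sword"): -1, ("lance", "axe"): -1, ("sword", "lance"): -1}
TERRAIN_BONUS = {"plain": 0, "forest": 1, "fort": 2}  # assumed avoid/def proxy

def score_move(atk_dmg, def_dmg, kills, my_weapon, foe_weapon, terrain):
    """Score one probed (unit, tile, target) combination.

    Note the damage-taken penalty outweighs damage dealt, which is exactly
    how a "minimize damage" heuristic can teach an agent to dodge instead
    of engage, as described above.
    """
    s = W_DMG_DEALT * atk_dmg - W_DMG_TAKEN * def_dmg
    s += W_KILL * kills
    s += W_TRIANGLE * TRIANGLE.get((my_weapon, foe_weapon), 0)
    s += W_TERRAIN * TERRAIN_BONUS.get(terrain, 0)
    return s

def pick_move(candidates, epsilon=0.1):
    """Greedy selection over probed moves, with occasional random picks
    so the agent keeps exploring during training."""
    if random.random() < epsilon:
        return random.choice(candidates)
    return max(candidates, key=lambda c: score_move(**c))

# Example probe result for one candidate move:
# pick_move([{"atk_dmg": 12, "def_dmg": 4, "kills": 0,
#             "my_weapon": "sword", "foe_weapon": "axe", "terrain": "forest"}])
```

In the real system the candidate list would come from the cursor-probing loop described above, with one entry per reachable tile and attackable target.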