Fire Emblem GBA - Decompilations

Eebit · September 13, 2024, 4:38am

FE GBA Decomp Portal

What is a decompilation?

Decompilation is a form of reverse engineering. Specifically, it’s the process of turning compiled code in its machine-readable format back into equivalent, human-readable source code that a programmer could have written. Decompilation projects are useful for understanding how games work, and they double as documentation for anyone making modifications.

The GBA ROMs are compiled binary files that are made up of code and data. The CPU executes machine code, which we can represent in a more readable, instruction-by-instruction form called “assembly language” (or colloquially, ASM in the community). Converting machine code into assembly is the first step, and it is called “disassembly”.

From there, it’s our task as reverse engineers to, for a given function, figure out exactly what C code would have been written to produce that assembly. We call this “decompiling” the code, or reversing the process of compiling code to assembly.

Here’s a three-step example of the function GetUnitFogViewRange (FE8U:0x080178A8):

1. Machine code-level hexadecimal representation of a function, as seen from a hex editor:

You’ll see that this binary/hexadecimal representation is not readable by us. It’s not meant to be! It’s still possible to make modifications in this form (such as, “go to this particular offset and edit the number there”), but you have to be very careful and very particular when you’re poking around at this level.

00 B5 02 1C 09 48 43 7B 10 68 51 68 80 6A 89 6A
08 43 08 21 08 40 00 28 00 D0 05 33 10 1C 31 30 
00 78 00 07 00 0F 18 18 02 BC 08 47 F0 BC 02 02
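
To see how these bytes relate to instructions: the GBA is little-endian and THUMB instructions are 16 bits wide, so each pair of bytes forms one instruction, and the last four bytes here form a 32-bit data word. A minimal sketch in C (the helper and array names are made up for illustration and are not part of the decomp):

```c
#include <stddef.h>
#include <stdint.h>

/* The 48 bytes of GetUnitFogViewRange, copied from the hex dump above. */
static const uint8_t kFunctionBytes[48] = {
    0x00, 0xB5, 0x02, 0x1C, 0x09, 0x48, 0x43, 0x7B, 0x10, 0x68, 0x51, 0x68, 0x80, 0x6A, 0x89, 0x6A,
    0x08, 0x43, 0x08, 0x21, 0x08, 0x40, 0x00, 0x28, 0x00, 0xD0, 0x05, 0x33, 0x10, 0x1C, 0x31, 0x30,
    0x00, 0x78, 0x00, 0x07, 0x00, 0x0F, 0x18, 0x18, 0x02, 0xBC, 0x08, 0x47, 0xF0, 0xBC, 0x02, 0x02,
};

/* Read a 16-bit little-endian value (one THUMB instruction). */
static uint16_t ReadU16LE(const uint8_t *bytes, size_t offset)
{
    return (uint16_t)(bytes[offset] | (bytes[offset + 1] << 8));
}

/* Read a 32-bit little-endian value (such as an address stored as data). */
static uint32_t ReadU32LE(const uint8_t *bytes, size_t offset)
{
    return (uint32_t)bytes[offset]
         | ((uint32_t)bytes[offset + 1] << 8)
         | ((uint32_t)bytes[offset + 2] << 16)
         | ((uint32_t)bytes[offset + 3] << 24);
}
```

ReadU16LE(kFunctionBytes, 0) gives 0xB500, which is the encoding of push {lr}, and ReadU32LE(kFunctionBytes, 44) gives 0x0202BCF0, the data word at the end of the function.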

2. Equivalent THUMB assembly code:

This is the THUMB assembly code generated by the compiler, which represents low-level instructions the CPU directly executes. At this stage, we can already recognize certain patterns like function calls and arithmetic operations (especially if you’ve read Tequila’s GBAFE Assembly for Dummies, by Dummies), but it’s still not as readable as high-level C code.

	.globl	GetUnitFogViewRange
	.type	 GetUnitFogViewRange,function
	.thumb_func
GetUnitFogViewRange:
	push	{lr}
	add	r2, r0, #0
	ldr	r0, _080178D4	@ gPlaySt
	ldrb	r3, [r0, #0xd]
	ldr	r0, [r2]
	ldr	r1, [r2, #0x4]
	ldr	r0, [r0, #0x28]
	ldr	r1, [r1, #0x28]
	orr	r0, r0, r1
	mov	r1, #0x8
	and	r0, r0, r1
	cmp	r0, #0
	beq	_80178C4	@cond_branch
	add	r3, r3, #0x5
_80178C4:
	add	r0, r2, #0
	add	r0, r0, #0x31
	ldrb	r0, [r0]
	lsl	r0, r0, #0x1c
	lsr	r0, r0, #0x1c
	add	r0, r3, r0
	pop	{r1}
	bx	r1
	.align	2, 0
_080178D4:
	.word	gPlaySt
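
One pattern worth pointing out in this listing is the lsl/lsr pair by 0x1c (28). Shifting a 32-bit value left and then right by 28 bits throws away everything except the low 4 bits, which is how the compiler reads a 4-bit field out of the byte at offset 0x31. A small sketch of that equivalence in C (the function names are hypothetical):

```c
#include <stdint.h>

/* Mirrors "lsl r0, r0, #0x1c" followed by "lsr r0, r0, #0x1c". */
static uint32_t LowNibbleViaShifts(uint32_t value)
{
    return (value << 28) >> 28;
}

/* The same result written as a mask. */
static uint32_t LowNibbleViaMask(uint32_t value)
{
    return value & 0xF;
}
```

Both functions return the same value for any input, so the two instructions amount to "keep the low 4 bits".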

3. C code (from the Fire Emblem 8 decompilation):

Finally, we get to the high-level C code. This is easier to read and modify - we don’t need to worry about things like what specific byte offset we are reading from, or how many bytes we need to load. Variable names like unit or torchDuration make the code’s purpose clearer. We can see what the function is intended to do, what it accepts as a parameter, and what it returns to the caller.

https://github.com/FireEmblemUniverse/fireemblem8u/blob/12004dfd31c8f860804b811fc110394747dbcd12/src/bmunit.c#L370

int GetUnitFogViewRange(struct Unit * unit)
{
    int result = gPlaySt.chapterVisionRange;

    if (UNIT_CATTRIBUTES(unit) & CA_THIEF)
        result += 5;

    return result + unit->torchDuration;
}
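
The field accesses in the C version correspond to the raw offsets in the assembly. The sketch below models that layout with stand-in structs. The names chapterVisionRange and torchDuration come from the C snippet; the padding arrays, the torchByte name, the pCharacterData/pClassData names, and the use of uint32_t in place of 32-bit GBA pointers are assumptions made so the example compiles on a PC.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in layouts: only the offsets touched by the assembly are modelled. */
struct CharacterData { uint8_t unknown_00[0x28]; uint32_t attributes; }; /* ldr r0, [r0, #0x28] */
struct ClassData     { uint8_t unknown_00[0x28]; uint32_t attributes; }; /* ldr r1, [r1, #0x28] */

struct Unit {
    uint32_t pCharacterData;           /* ldr r0, [r2]       (32-bit GBA pointer) */
    uint32_t pClassData;               /* ldr r1, [r2, #0x4] (32-bit GBA pointer) */
    uint8_t  unknown_08[0x31 - 0x08];
    uint8_t  torchByte;                /* add r0, #0x31; low 4 bits = torchDuration */
};

struct PlaySt {
    uint8_t unknown_00[0x0D];
    uint8_t chapterVisionRange;        /* ldrb r3, [r0, #0xd] */
};

/* Returns 1 when every modelled field sits at the offset the assembly uses. */
static int LayoutMatchesAssembly(void)
{
    return offsetof(struct Unit, pCharacterData) == 0x00
        && offsetof(struct Unit, pClassData) == 0x04
        && offsetof(struct Unit, torchByte) == 0x31
        && offsetof(struct CharacterData, attributes) == 0x28
        && offsetof(struct ClassData, attributes) == 0x28
        && offsetof(struct PlaySt, chapterVisionRange) == 0x0D;
}
```

This is the bookkeeping the C code hides: once the structs are defined, the compiler works out the offsets for you.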

I like to think of decompilation as a bit like doing a puzzle. You can look at the final picture (the ASM code) and piece-by-piece try to add C code that brings the pieces together, until you eventually reach the point where the function matches.

For our purposes, what we are working on is called a matching decompilation. Matching decompilation means we’re trying to write C code that, when recompiled, produces machine code that is byte-for-byte identical to the original game’s.

We know what compiler was used for the GBA Fire Emblem games (my understanding is that FE8 uses a modified gcc 2.95.1; we call it “agbcc”). By compiling with this same compiler, we can check that our C code produces exactly the same output as the original game’s binary.
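
Conceptually, checking for a match is just a byte-for-byte comparison between the original ROM and the rebuilt one. A minimal sketch, assuming both images are already loaded into memory (the function name is hypothetical, and a real project may verify its build differently, for example with a checksum):

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first differing byte, or -1 if the buffers match. */
static long FindFirstMismatch(const uint8_t *original, const uint8_t *rebuilt, size_t length)
{
    for (size_t i = 0; i < length; i++) {
        if (original[i] != rebuilt[i])
            return (long)i;
    }
    return -1;
}
```

A result of -1 means the function (or the whole ROM) matches; any other value tells you where to start looking in the assembly.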

Okay, where does that get us?

One great benefit to the decompilations is that they bring together a lot of community documentation in one central place. We try to use standardized names for functions and data pointers based on shared knowledge from the community’s notes over time.

Standardized names can help developers and modders refer to the same function across different hacks, which makes collaboration smoother and reduces confusion.

The FE8U decomp project outputs a symbols file every time a new change is pushed to the master branch. This shows the addresses of every named symbol that has been identified in the game’s code. This symbols file can be found here: https://fireemblemuniverse.github.io/fireemblem8u/symbols.txt

It’s a lot easier to read C code than ARM/THUMB representations of the same code – there are variable names rather than dealing directly with register usage. We don’t have to worry about byte offsets, pushing/popping registers, or other low-level concerns.

Tooling such as Nat’s lyn (thread) provides wonderful functionality for “auto-hooking” functions. This means if you write code that uses the same name as a function defined in the decompilation, lyn will automatically create a hook point at the original function to “replace” it with your custom code.

In simpler terms, auto-hooking allows you to replace a function in the original game with your own code, without needing to rebuild the whole game.
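
As a rough illustration, a replacement for GetUnitFogViewRange that gives thieves a larger bonus could look like the sketch below. The struct definitions are simplified stand-ins so the example compiles on its own, the UNIT_CATTRIBUTES definition is a guess at how the macro works, CA_THIEF is set to 0x8 based on the mov r1, #0x8 in the assembly earlier in this post, and the bonus of 8 is arbitrary. In a real project you would include the decomp’s headers instead of redefining these.

```c
/* Simplified stand-ins for the decomp's types (assumptions, not the real headers). */
struct CharacterData { unsigned int attributes; };
struct ClassData     { unsigned int attributes; };

struct Unit {
    const struct CharacterData *pCharacterData;
    const struct ClassData *pClassData;
    unsigned int torchDuration : 4;
};

struct PlaySt { unsigned char chapterVisionRange; };
struct PlaySt gPlaySt;

#define CA_THIEF 0x8
#define UNIT_CATTRIBUTES(unit) \
    ((unit)->pCharacterData->attributes | (unit)->pClassData->attributes)

/* Same name as the original function, which is what auto-hooking keys on. */
int GetUnitFogViewRange(struct Unit *unit)
{
    int result = gPlaySt.chapterVisionRange;

    if (UNIT_CATTRIBUTES(unit) & CA_THIEF)
        result += 8; /* custom change: the original adds 5 */

    return result + unit->torchDuration;
}
```

The only line that differs from the original is the bonus, which is the appeal of working in C: the change is one number, not a hand-assembled patch.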

Vesly has written a tutorial for getting up and running quickly on this setup: C Setup for Dummies

In the future, once the code and data are fully decompiled, it will become possible to make modifications to the source code that automatically reflect across the entire codebase. For example, imagine wanting to give units a 6th inventory slot. With the source code fully decompiled, you wouldn’t need to manually update every location in assembly where the unit’s inventory is referenced; changes would propagate automatically.
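
To illustrate the idea, here is a sketch in which the inventory size is a single named constant. The names UNIT_ITEM_COUNT, items, and CountUnitItems are hypothetical and the struct is a stand-in, but the point carries over: changing one definition updates every array and loop that depends on it.

```c
#include <stdint.h>

#define UNIT_ITEM_COUNT 5 /* change to 6 and everything below follows */

struct Unit {
    uint16_t items[UNIT_ITEM_COUNT]; /* 0 marks an empty slot in this sketch */
};

/* Counts the occupied slots, stopping at the first empty one. */
static int CountUnitItems(const struct Unit *unit)
{
    int count = 0;

    while (count < UNIT_ITEM_COUNT && unit->items[count] != 0)
        count++;

    return count;
}
```

In assembly, the same change would mean finding and patching every hard-coded 5 (and every struct offset that shifts as a result) by hand.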

How can I help out?

Currently, the code for Fire Emblem 8 US is very near completion. Fire Emblem 6 is about halfway done in terms of code, and Fire Emblem 7 (J) is in its early days.

There are plenty of ways to contribute, even if you’re not deeply experienced. For instance, helping with documentation (such as renaming existing unnamed functions, or adding comments to improve clarity) can make a big difference, and it’s a great way to get familiar with the project!

Laqieer has created a very helpful FE GBA Function Library which features a map of functions and their addresses across the games and versions. If you’d like to get started, you could help port functions from the more complete decompilations to the other games.

For getting started with function decompilation, you can check out the CONTRIBUTING.md guide in the Fire Emblem 8 repository.

Most of the non-code data (for example, graphics data) for all of the games still needs to be named and extracted from binary data into a processable form.

If you’re interested, the FEU Discord server has a channel called #decomp, which is primarily where collaboration and discussion around the decomp takes place. We’re always happy to help with getting started!

Resources

JesterEmblem_EX · September 14, 2024, 11:53pm

Bumping this so it doesn’t get buried. As someone who’s delved headlong into using the decomp for stuff, the ease with which people can make edits to gameplay mechanics now cannot be overstated.

I was hesitant to stray from ASM after spending several months of trial and error learning it, but now I’ve seen the light.