IntSys pls | Vanilla ASM Goof Thread

If you look through a game’s routines long enough you’re bound to find something wrong. Maybe a register was clobbered. Maybe something was written twice when it didn’t need to be. Maybe something goes horribly wrong and just happens to work out, disaster narrowly avoided. Your average player (and/or your average hacker) will likely never see these goofs, so I figured it’s time to have a thread for them.

Post whatever bugs, compiler goofs, etc. here. If anything, this thread will keep me from pasting random snippets of ASM into the Discord server, and hopefully this thread’ll be a good read.


Let’s get into it. I like to play around with FE5, so these examples will be 65816 assembly. I’d like to explain each of these so that someone with little to no ASM experience could hopefully follow along.

#Block Transfer To/From Anywhere

65816 has two block memory transfer opcodes, MVP and MVN, which are similar to THUMB’s ldmia+stmia for transferring data. The opcodes include the bank (the upper 8 bits of a pointer on the 65816) for both the source and the destination, with the number of bytes to transfer and the lower 16 bits of the source and destination in CPU registers. This poses a bit of a problem, as you can only copy data to/from locations known at compile time (you need to know the banks to write the opcode). To overcome this, FE5 has routines that build a block transfer routine in RAM. When you need to copy data, you fill out the banks of the opcode in RAM and hop to the routine. It’s quite clever in my opinion.
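To make the idea concrete, here is a minimal Python sketch of the four bytes such a RAM routine contains. `build_mvn_stub` is a hypothetical helper name of mine, not something from FE5. The opcode values ($54 for MVN, $60 for RTS) are the standard 65816 encodings, and note that the machine code stores the destination bank before the source bank.

```python
MVN_OPCODE = 0x54  # 65816 block move
RTS_OPCODE = 0x60  # return from subroutine

def build_mvn_stub(src_bank, dst_bank):
    """Return the four bytes of a 'mvn src,dst / rts' stub.

    Hypothetical helper for illustration only. In machine code the
    destination bank comes first, then the source bank.
    """
    return bytes([MVN_OPCODE, dst_bank & 0xFF, src_bank & 0xFF, RTS_OPCODE])

# Example: a stub that copies from bank $7E to bank $7F.
stub = build_mvn_stub(0x7E, 0x7F)
print(stub.hex())  # 547f7e60
```

Since only the two bank bytes change between copies, patching them in RAM is enough to retarget the transfer.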

On startup, the first byte of each MVN/MVP opcode is written to RAM (with the bank bytes left as zero), along with another opcode to return from the routine:

Routine (written for the 64tass assembler)
blockcopy_copier
	phb 
	php 
	phk
	plb                             ; data bank = program bank
	sep #$20                        ; 8-bit A (X stays 16-bit)
	ldx #size(mvn_routine) - 1      ; index of the last byte

-
	lda mvn_routine,x
	sta $04AE,x
	dex 
	bpl -
	ldx #size(mvp_routine)

-
	lda mvp_routine,x
	sta $04B2,x
	dex 
	bpl -
	plp 
	plb 
	rts 

mvn_routine
	mvn #$00,#$00
	rts 

mvp_routine
	mvp #$00,#$00
	rts 

unknown_routine
	phb 

If you don’t know 65816, this might be mumbo jumbo to you, so let’s break it down. We’re copying two routines, mvn_routine and mvp_routine, to RAM addresses $0004AE and $0004B2 respectively. We copy them last byte first, using a loop counter in the X register. The counter has to start at one less than the size of the routine because we’re looping with a BPL opcode, which keeps branching while the counter is non-negative (0 is considered positive). After each byte, we decrement the loop counter.

MVN Transfer breakdown
X    Byte Part
0003 60   rts
0002 00   source bank
0001 00   destination bank
0000 54   mvn
FFFF      end loop

Here’s the issue: when copying the MVP routine, the size wasn’t reduced by one, so the loop copies five bytes instead of four. The extra byte, the first byte of the next routine (a phb opcode), lands in RAM at $0004B6 and accidentally overwrites whatever was there. Man, that’s a huge explanation for such a tiny thing, right? So, what was originally at $0004B6? It’s used exactly once when setting up the sound system, and probably didn’t even need to be used. Lucky us, nothing of value was lost. Even better, the only known routines that use this block memory copier look like this:

MVN Routine user (64tass syntax)
    phb 
    php 

    ; program bank -> data bank

    phk 
    plb 
    phx 
    phy 

    ; get the source, dest banks and
    ; build the mvn opcode

    sep #$20
    lda $04AB ; dest bank
    sta $04AF
    lda $04A8 ; source bank
    sta $04B0
    lda #$54 ; mvn opcode
    sta $04AE
    lda #$60 ; rts opcode
    sta $04B1
    rep #$20

    ; get params

    ldx $04A6 ; source
    ldy $04A9 ; dest
    lda $04AC ; size
    dec a

    ; cool trick, can rts
    ; because $0000-$1FFF
    ; of RAM is mirrored into
    ; banks $00-$3F and $80-$BF

    jsr $04AE
    ply 
    plx 
    plp 
    plb 
    rtl 

These rewrite the entire MVN/MVP routines anyway! $0004B6 was clobbered needlessly!
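To see the stray write concretely, here’s a rough Python model of the startup copier. The byte values are the standard 65816 opcodes; the `rom`/`ram` arrays and the `copy_down` name are stand-ins I made up for illustration, not anything from the game.

```python
PHB, RTS, MVN, MVP = 0x8B, 0x60, 0x54, 0x44

# ROM layout as in the listing: both stubs, then the next routine's first opcode.
rom = bytes([MVN, 0x00, 0x00, RTS,   # mvn_routine
             MVP, 0x00, 0x00, RTS,   # mvp_routine
             PHB])                   # first byte of unknown_routine
MVN_ROM, MVP_ROM, STUB_SIZE = 0, 4, 4

def copy_down(ram, src, dst, start_x):
    """Model of the 'lda src,x / sta dst,x / dex / bpl' loop."""
    x = start_x
    while x >= 0:              # bpl: loop while X is non-negative
        ram[dst + x] = rom[src + x]
        x -= 1                 # dex

ram = bytearray(0x2000)
copy_down(ram, MVN_ROM, 0x04AE, STUB_SIZE - 1)  # correct: size - 1
copy_down(ram, MVP_ROM, 0x04B2, STUB_SIZE)      # vanilla: missing the - 1

print(ram[0x04AE:0x04B6].hex())  # 5400006044000060
print(hex(ram[0x04B6]))          # 0x8b
```

Both stubs land intact; the only difference from a correct copier is the extra phb byte at $04B6.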

There are some other interesting things to consider: The way the routine user loads the parts of the routine as literals and writes them to fixed points in RAM is faster than the startup routine. The startup routine would probably be faster if it actually used MVN/MVP to copy the MVN/MVP routines. And, finally, none of these seem to be called.

Same thing happens in FE4 in the same place, too.



This is a really weird way to structure a loop:

ASM dump
080249AE 4806   LDR r0, [PC, #0x18] # pointer:080249C8 -> 03004E50 (Pointer to the work memory of the operation character )
080249B0 6801   LDR r1, [r0, #0x0] # pointer:03004E50 (Pointer to the work memory of the operation character ) r0=Unit
080249B2 68CA   LDR r2, [r1, #0xC] r1=Unit
080249B4 2040   MOV r0, #0x40
080249B6 4010   AND r0 ,r2
080249B8 2800   CMP r0, #0x0      //moved this turn?
080249BA D12F   BNE #0x8024A1C
    080249BC 2080   MOV r0, #0x80
    080249BE 0100   LSL r0 ,r0 ,#0x4  //800 - ballista
    080249C0 4002   AND r2 ,r0
    080249C2 2A00   CMP r2, #0x0
    080249C4 D004   BEQ #0x80249D0  //if 0 then proceed
        080249C6 E029   B 0x8024A1C  //else exit; ballista attack is separate command
080249C8 4E50 0300   //LDRDATA
        080249CC 2001   MOV r0, #0x1
        080249CE E026   B 0x8024A1E
    080249D0 2600   MOV r6, #0x0
    080249D2 8BCC   LDRH r4, [r1, #0x1E] //r1=Unit 0x1E=1st item 
    080249D4 2C00   CMP r4, #0x0
    080249D6 D021   BEQ #0x8024A1C
        080249D8 1C20   MOV r0 ,r4
        080249DA F7F2 FDC7   BL 0x0801756C   //GetItemAttributes 
        080249DE 2101   MOV r1, #0x1
        080249E0 4001   AND r1 ,r0
        080249E2 2900   CMP r1, #0x0
        080249E4 D00F   BEQ #0x8024A06
            080249E6 4D0F   LDR r5, [PC, #0x3C] # pointer:08024A24 -> 03004E50 (Pointer to the work memory of the operation character )
            080249E8 6828   LDR r0, [r5, #0x0] # pointer:03004E50 (Pointer to the work memory of the operation character ) r5=Unit
            080249EA 1C21   MOV r1 ,r4
            080249EC F7F1 FEB0   BL 0x08016750   //CanUnitUseWeapon 
            080249F0 0600   LSL r0 ,r0 ,#0x18
            080249F2 2800   CMP r0, #0x0
            080249F4 D007   BEQ #0x8024A06
                080249F6 6828   LDR r0, [r5, #0x0] # pointer:03004E50 (Pointer to the work memory of the operation character ) r5=Unit
                080249F8 1C21   MOV r1 ,r4
                080249FA F000 FBDB   BL 0x080251B4   //MakeTargetListForWeapon 
                080249FE F02B F993   BL 0x0804FD28   //GetTargetListSize Gets list size (used to check for empty lists in usability routines) Number of entries in the list
                08024A02 2800   CMP r0, #0x0
                08024A04 D1E2   BNE #0x80249CC
        08024A06 3601   ADD r6, #0x1
        08024A08 2E04   CMP r6, #0x4
        08024A0A DC07   BGT #0x8024A1C
            08024A0C 4805   LDR r0, [PC, #0x14] # pointer:08024A24 -> 03004E50 (Pointer to the work memory of the operation character )
            08024A0E 6800   LDR r0, [r0, #0x0] # pointer:03004E50 (Pointer to the work memory of the operation character ) r0=Unit r0=Unit
            08024A10 0071   LSL r1 ,r6 ,#0x1
            08024A12 301E   ADD r0, #0x1E
            08024A14 1840   ADD r0 ,r0, R1
            08024A16 8804   LDRH r4, [r0, #0x0] r0=Unit
            08024A18 2C00   CMP r4, #0x0
            08024A1A D1DD   BNE #0x80249D8
08024A1C 2003   MOV r0, #0x3
08024A1E BC70   POP {r4,r5,r6}
08024A20 BC02   POP {r1}
08024A22 4708   BX r1
08024A24 4E50 0300   //LDRDATA

That “unreachable” bit at 249CC gets jumped to as a loop break. It’s shoved in right after the ballista check and just sets the return value to true and jumps right back down to the end of the function. I initially thought it was completely unreachable so it looks stupider than it is, but still, branching to write one byte and then branching back…
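For comparison, here’s roughly what the control flow boils down to, as a Python sketch. This is my reading of the dump, not recovered source: the function name is hypothetical, the game routines named in the dump’s comments are passed in as plain callables, and `count_targets` stands in for the MakeTargetListForWeapon + GetTargetListSize pair.

```python
def attack_command_usability(state, items, get_item_attributes,
                             can_unit_use_weapon, count_targets):
    """Hypothetical model of the routine at 080249AE; returns 1 or 3."""
    if state & 0x40:       # "moved this turn?" per the dump's comment
        return 3
    if state & 0x800:      # ballista check
        return 3
    for item in items[:5]:             # slots 0..4
        if item == 0:                  # empty slot ends the search
            break
        if (get_item_attributes(item) & 1
                and can_unit_use_weapon(item)
                and count_targets(item) != 0):
            return 1                   # the "loop break" at 080249CC
    return 3                           # the shared exit at 08024A1C

always = lambda item: True
print(attack_command_usability(0x00, [0x0101, 0, 0, 0, 0],
                               lambda i: 1, always, lambda i: 2))  # 1
print(attack_command_usability(0x40, [0x0101, 0, 0, 0, 0],
                               lambda i: 1, always, lambda i: 2))  # 3
```

Written like this, the early `return 1` is the natural way out of the loop; the compiler just parked that block next to the literal pool instead of at the bottom of the function.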


#65816 Quirks

The SNES has a 16-bit processor with an interesting property: Software can decide whether the CPU’s three general-purpose registers are 8 or 16 bit. It can change these sizes on the fly through the use of the rep and sep opcodes. Much like the stack, the state of how large the registers are must be restored after actions that change them.
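As a quick illustration, here’s a tiny Python model of the relevant status bits, assuming the CPU is in native mode. The helper names are mine; the bit values ($20 for the accumulator, $10 for the index registers) are the standard 65816 ones. X and Y share a single width flag.

```python
M_FLAG = 0x20  # accumulator/memory width: set = 8-bit, clear = 16-bit
X_FLAG = 0x10  # index register width:     set = 8-bit, clear = 16-bit

def sep(p, mask):
    """SEP sets every status bit named in the mask."""
    return (p | mask) & 0xFF

def rep(p, mask):
    """REP clears every status bit named in the mask."""
    return p & ~mask & 0xFF

def a_width(p):
    return 8 if p & M_FLAG else 16

p = 0x00              # start with 16-bit A, X and Y
saved = p             # php: remember the status register
p = sep(p, 0x20)      # sep #$20: A is now 8-bit
print(a_width(p))     # 8
p = saved             # plp: restore the caller's widths
print(a_width(p))     # 16
```

The php/plp pair is the usual way to restore the widths, the same way you would balance any other push and pull.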

There’s a routine in FE5 that forgets this rule and manages to avoid crashing the game, if only by chance.

Code Snippet

...
	beq _A5A6
...
	bra _A5B6

_A5A6
	sep #$20
	lda #$FE
	sta $51D4
	sta $51D6
	sta $51D8
	sta $51DA

_A5B6
	plx 
	lda #$03
	sta $E2
	lda #$FF
	sta $06BD,x
	plb 
	plp 
	rtl 

I’ve trimmed out the parts of this snippet that aren’t needed to demonstrate what’s happening. On one path this routine can take, it encounters a sep #$20 opcode, which sets the accumulator, A, to be 8 bits wide. It then continues executing through _A5B6 as intended. Now, the fun part of being able to change your register sizes is that certain opcodes, such as ones that load literals, also change size to match the register size. On the intended path, the lda #$03 here loads the byte-sized value $03 into A. The other route branches straight to _A5B6 without ever setting A to 8 bits, so the processor reads a two-byte literal instead and decodes the rest of the bytes differently. Here’s a snippet of what the code looks like from that route:

Bad Intsys

_A5B6
	plx
	lda #$8503
	sep #$A9
	sbc $06BD9D,x
	plb
	plp
	rtl

Luckily, all of the pops (plx, plb, plp) are still there, so there aren’t any stack issues, and it returns fine. The end result is that the delay between button reads when exiting an item selection menu is slightly different if the unit has a weapon.

The plp opcode at the end pops the processor’s state back to what it was at the start of the routine, so it returns with the right sizes.
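If you want to check the two decodings yourself, here’s a minimal Python sketch of a decoder for just the opcodes in this snippet. The table and the `disassemble` name are mine and cover nothing beyond these nine opcodes; the byte values are the standard 65816 encodings.

```python
# Bytes of _A5B6 as assembled with an 8-bit accumulator.
CODE = bytes([0xFA, 0xA9, 0x03, 0x85, 0xE2, 0xA9, 0xFF,
              0x9D, 0xBD, 0x06, 0xAB, 0x28, 0x6B])

# opcode -> (mnemonic, operand kind); only what this snippet needs
OPCODES = {
    0xFA: ("plx", None),
    0xA9: ("lda", "imm_a"),    # immediate, size follows the width of A
    0x85: ("sta", "dp"),
    0xE2: ("sep", "imm8"),
    0xFF: ("sbc", "long_x"),
    0x9D: ("sta", "abs_x"),
    0xAB: ("plb", None),
    0x28: ("plp", None),
    0x6B: ("rtl", None),
}
SIZES = {"dp": 1, "imm8": 1, "abs_x": 2, "long_x": 3}

def disassemble(code, a_is_8bit):
    out, pc = [], 0
    while pc < len(code):
        name, kind = OPCODES[code[pc]]
        pc += 1
        if kind is None:
            out.append(name)
            continue
        size = (1 if a_is_8bit else 2) if kind == "imm_a" else SIZES[kind]
        value = int.from_bytes(code[pc:pc + size], "little")
        pc += size
        if name == "sep" and value & 0x20:
            a_is_8bit = True   # sep can switch A to 8-bit mid-stream
        prefix = "#" if kind.startswith("imm") else ""
        suffix = ",x" if kind.endswith("_x") else ""
        out.append(f"{name} {prefix}${value:0{size * 2}X}{suffix}")
    return out

print(disassemble(CODE, a_is_8bit=True))
print(disassemble(CODE, a_is_8bit=False))
```

Starting with a 16-bit A, the decoder produces the lda #$8503 / sep #$A9 / sbc $06BD9D,x sequence from the second listing, then lines back up with the original code at plb.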


Big bump, but I ran across a very weird vanilla FE8 moment while fixing an obscure bug.

Did you know that the animation graphics for Hammerne/Restore are intentionally filled with a bunch of blank space at the bottom so when they’re decompressed they wipe out exactly half of the BG1 VRAM?

9B0AD0

This cuts off the HP/name boxes while the staff is used, unless units are positioned such that the boxes appear at the middle/bottom of the screen.
That can’t have been intentional, or if it was, it’s a very hacky solution.
