[ASM] Notes on Calling Subroutines

I won’t go through long theory here. Instead, I will show what a real compiler actually does, using GCC as the example.

1. Branch Code
============================
(1) near call
test1.c

arm-none-eabi-gcc -fomit-frame-pointer -c test1.c -S -o test1.s

(2) long call
void sub_80077E8(s16 *mempool);
void call_sub_80077E8(s16 *mempool)
{
sub_80077E8(mempool);
}


hint:
thumb → arm: bx pc
arm → thumb: bx 90008b5h

#pragma long_calls
void sub_8007824(s16 *mempool);
#pragma long_calls_off
void call_sub_8007824(s16 *mempool)
{
	sub_8007824(mempool);
}
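
Whether a plain bl is enough depends on how far away the target is. The sketch below is not part of the original listings. It assumes the classic encodings (ARM bl: signed 24-bit word offset taken from the instruction address + 8; Thumb two-halfword bl: signed 22-bit halfword offset taken from the instruction address + 4). The helper names are hypothetical, and the caller address 0x09000000 used in the note is only an illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: can a direct `bl` placed at `from` reach `to`? */

/* ARM state: signed 24-bit word offset, relative to from + 8. */
bool arm_bl_reaches(uint32_t from, uint32_t to)
{
    int64_t off = (int64_t)to - ((int64_t)from + 8);
    return off >= -(1LL << 25) && off <= (1LL << 25) - 4;
}

/* Thumb state (two-halfword bl): signed 22-bit halfword offset,
   relative to from + 4. */
bool thumb_bl_reaches(uint32_t from, uint32_t to)
{
    int64_t off = (int64_t)to - ((int64_t)from + 4);
    return off >= -(1LL << 22) && off <= (1LL << 22) - 2;
}
```

With these assumptions, a Thumb caller at 0x09000000 cannot reach sub_80077E8 with a direct bl, which is the kind of situation where a long call (load the address into a register, then branch) is needed.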

(3) call thumb from arm
(copied from Nintendo’s official SDK)

Hint: why mov lr,pc here?

test2A.s

test2callA.s

arm-none-eabi-gcc -fomit-frame-pointer -s test2A.s test2callA.s
arm-none-eabi-objdump -S a.out

WRONG!
Hint: why?
change test2A.s

RIGHT:

(4) call arm from thumb
test2A.s

test2callA.s

the result:

(5) function pointer
test3.c

arm-none-eabi-gcc -fomit-frame-pointer -mthumb -c test3.c -S

(6) callback
test4.c

arm-none-eabi-gcc -fomit-frame-pointer -mthumb -c test4.c -S
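
For readers without the original files at hand, here is a minimal sketch of a callback passed as a function pointer. It is not the original test3.c or test4.c, and every name in it is made up for illustration. The point to look for in the generated assembly is that the call target sits in a register, so the compiler has to emit an indirect branch instead of a bl to a fixed label.

```c
#include <stddef.h>

/* Hypothetical callback type: takes one int, returns an int. */
typedef int (*callback_t)(int);

static int twice(int x) { return 2 * x; }

/* Applies `cb` to each element and sums the results.
   The call goes through a pointer, so the target is only known at run time. */
int sum_with(callback_t cb, const int *v, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += cb(v[i]);
    return total;
}

/* Small driver so the sketch can be compiled and inspected on its own. */
int demo_sum(void)
{
    static const int v[] = { 1, 2, 3 };
    return sum_with(twice, v, 3);
}
```

Compiling this with the same command line as above and reading the .s output should show how the pointer is loaded and branched through.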


2. Arguments, Return Value and Local Variables
===================
(1) at most 4 parameters
test5.c

arm-none-eabi-gcc -fomit-frame-pointer -mthumb -c test5.c -S


Hint:
a1: r0->[sp,#4]->r3->r0=>r0->[sp,#4]
a2: r1->[sp]->r2->r1=>r1->[sp]

(2) more than 4 parameters
test6.c

arm-none-eabi-gcc -fomit-frame-pointer -mthumb -c test6.c -S


Hint:
a1: 1->r3->[sp,#44]->r0=>r0->[sp,#12]->r2
a4: 4->r3->[sp,#32]->r4->r3=>r3->[sp]->r3
a5: 5->r3->[sp,#28]->r3->[sp]=>[sp,#16]->r3
a7: 7->r3->[sp,#20]->r3->[sp,#8]=>[sp,#24]->r3
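
The traces above follow the usual ARM procedure call rule: the first four word-sized arguments travel in r0 to r3, and the rest are placed on the stack in order. Reading "=>" in the hints as the moment of the call, a5 is at [sp] and a7 at [sp,#8] on the caller's side. The sketch below only restates that rule; the helper name is hypothetical and it assumes every argument is one 32-bit word.

```c
#include <stdio.h>

/* Hypothetical helper: where is the n-th (1-based) word-sized argument
   at the moment of the call, assuming the AAPCS rule described above?
   Writes a string such as "r0" or "[sp,#8]" into buf. */
void arg_location(int n, char *buf, size_t len)
{
    if (n <= 4)
        snprintf(buf, len, "r%d", n - 1);            /* a1..a4 -> r0..r3 */
    else
        snprintf(buf, len, "[sp,#%d]", (n - 5) * 4); /* a5.. -> stack   */
}
```

Calling arg_location(7, buf, sizeof buf) yields "[sp,#8]", which agrees with the a7 line in the hint.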

(3) parameter array
test7.c

test7.s



Hint: Is this right?

(4) pointer parameter
test8.c

test8.s


Hint: the address is passed to the subroutine instead of the value
Is this right?
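
A small sketch of the difference, not taken from test8.c (all names are illustrative): with a value parameter the callee works on a copy, while with a pointer parameter the register holds an address and the callee stores through it.

```c
/* By value: the callee receives a copy, so the caller's variable is untouched. */
void bump_copy(int x)
{
    x += 1;
    (void)x; /* silence "set but not used" warnings */
}

/* By pointer: the callee receives the address and writes through it. */
void bump_in_place(int *x)
{
    *x += 1;
}

int demo_pointer_param(void)
{
    int a = 10;
    bump_copy(a);      /* a is still 10 */
    bump_in_place(&a); /* a becomes 11 */
    return a;
}
```

In the generated assembly, look for the caller computing an sp-relative address for &a and the callee using ldr/str through that register.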

(5) multiple return values
(only in assembly)
test9.s

bios function
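
C cannot express "return three registers" directly, which is why this only shows up in hand-written assembly. As an illustration, and assuming the BIOS function meant here is something like the GBA's Div (SWI 06h), which GBATEK documents as returning the quotient in r0, the remainder in r1 and the absolute quotient in r3, the result registers can be modelled on the host with a struct. The type and function names below are hypothetical.

```c
#include <stdlib.h>

/* Models the three result registers of the GBA BIOS Div call (per GBATEK). */
typedef struct {
    int r0; /* quotient */
    int r1; /* remainder */
    int r3; /* absolute value of the quotient */
} div_result_t;

/* Hypothetical host-side model, not the real BIOS entry point.
   denom must be non-zero. */
div_result_t bios_div_model(int number, int denom)
{
    div_result_t out;
    out.r0 = number / denom;
    out.r1 = number % denom;
    out.r3 = abs(out.r0);
    return out;
}
```

An assembly caller would simply read r0, r1 and r3 after the swi returns; the struct is only a way to name those registers in C.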

(6) return pointer
test10.c

test10.s


Hint: Is this right?

(7) static variables
test11.c

test11.s



Hint: what’s the difference from local variables?
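
One way to see the difference without the original test11.c (this is a separate sketch with made-up names): a static local has a fixed address in .data/.bss and keeps its value between calls, while an ordinary local is re-created in a register or on the stack each time.

```c
/* `calls` has static storage: one copy, preserved across calls. */
int count_static(void)
{
    static int calls = 0;
    return ++calls;
}

/* `calls` has automatic storage: initialised again on every call. */
int count_local(void)
{
    int calls = 0;
    return ++calls;
}
```

In the assembly, count_static should load the variable's address from a literal pool and use ldr/str on it, whereas count_local can usually be reduced to returning a constant.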

(8) global variables
test12.c

test12.s

Hint: what’s the difference from static variables?

(9) static global variables
test13.c

test13.s

Hint: what’s the difference from global variables?

(10) register variables
test14.c

test14.s

Hint: notice the difference between the variable s and i

(11) inline asm
test15.c

test15.s

(12) frame pointer
test5.c

arm-none-eabi-gcc -c test5.c -S
test5.s


Note that bx pc in THUMB only works from a word-aligned address, and then the next two bytes will also be skipped over. According to GBATEK:

For BX/BLX, when Bit 0 of the value in Rs is zero:

  Processor will be switched into ARM mode!
  If so, Bit 1 of Rs must be cleared (32bit word aligned).
  Thus, BX PC (switch to ARM) may be issued from word-aligned address
  only, the destination is PC+4 (ie. the following halfword is skipped).

Basically, reading from the PC register gives you a value that is 2 instructions ahead of the one being executed, because of how the pipeline works. The BX works by actually copying that value back in, so in a roundabout way it jumps 2 instructions = 4 bytes ahead (while also changing the mode). But it has to jump to a word-aligned address because ARM code is word-aligned.
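
The arithmetic can be written out as a tiny model. This is only a sketch of the rule quoted above; the function name is hypothetical, and returning 0 for the misaligned case is just a convention chosen here to flag it.

```c
#include <stdint.h>

/* Hypothetical model of a Thumb `bx pc` located at address `addr`.
   Returns the ARM-state destination, or 0 if `addr` is not word-aligned
   (the case GBATEK says must be avoided). */
uint32_t thumb_bx_pc_target(uint32_t addr)
{
    if (addr & 2u)
        return 0;     /* pc + 4 would have bit 1 set: not a valid ARM address */
    return addr + 4u; /* in Thumb state, pc reads as current instruction + 4 */
}
```

For a bx pc at 0x08000100 the ARM code starts at 0x08000104, so the halfword at 0x08000102 is never executed and is normally filled with a nop.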


This document is super cool! I don’t know what “ldr ip, [pc]” at 0x8130 means. Does pc store a pointer? :thinking:

I am new to this, so the question may seem silly :saluting_face:

Can you explain this part to me? Thanks a lot :grin:

Program counter - Wikipedia
ARM7TDMI Technical Reference Manual r4p1

Thanks, but I still don’t understand the meaning of “ldr ip,[pc]” at 0x8130 :hushed:

For example, if the current pc value is 0x00008130, will ip be set to 0xe12ff11c? But if so, the next instruction “bx ip” at 0x8134 seems to make a weird jump? :thinking:

You are confusing two different concepts: offset and address.

0x8130 here is an offset. The code is loaded into the ROM region 0x8000000~0x9FFFFFF before running, so the address is 0x8008130.

Oh, sorry for being inaccurate. :flushed: Actually I know the memory map of a GBA. I just don’t understand the meaning of “ldr ip,[pc]”.

Does it mean “Load 4 bytes into ip from the address contained in pc”? :thinking: But if so, how can “bx ip” branch to the right place? :hushed:

That is because the entry point address was not set when building. It does no harm here, because the topic is not specific to the GBA. Set the entry point address to 0x8000000 and it will jump to the correct address on a GBA.
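
On the “ldr ip, [pc]” part itself: in ARM state, reading pc gives the address of the current instruction plus 8, so the load does not read the ldr instruction's own address. A minimal sketch of that arithmetic follows (function names are hypothetical). That the word at the resulting address holds the branch destination is an assumption based on the usual layout of such stubs, since the listing is not reproduced here.

```c
#include <stdint.h>

/* In ARM state, reading pc yields the current instruction's address + 8. */
uint32_t arm_pc_value(uint32_t insn_addr)
{
    return insn_addr + 8u;
}

/* Address that `ldr rX, [pc, #imm]` located at `insn_addr` loads from. */
uint32_t arm_ldr_pc_source(uint32_t insn_addr, uint32_t imm)
{
    return arm_pc_value(insn_addr) + imm;
}
```

So the ldr at 0x8130 reads the word at 0x8138, just past the “bx ip” at 0x8134, and “bx ip” then branches to whatever address is stored there.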


Oh, thank you :hushed:

Maybe I should learn something about building :thinking:

Anyway, thanks a lot :grin: