How not to do indirect calls on ARM

The Thumb2 instruction encoding can reduce your code size by up to 30%, so if you’re writing assembler code it is worth using the unified syntax, which allows your code to be encoded as either ARM or Thumb2 seamlessly. Or almost seamlessly – even though the unified syntax is the same for ARM and Thumb2 encodings, there are still some things you have to bear in mind when writing code for Thumb2.

For example, the following code implements an indirect call:

       mov     lr, pc         @ Save return address in lr
       ldr     pc, [sp], #8   @ Load function address

This code will not work correctly on Thumb2. Switching mode from ARM to Thumb happens when an address with the bottom bit set is jumped to; this can be by a branch, an ldr or a mov. However, the pc does not store the Thumb mode bit, so the value copied from the pc into lr by the first instruction above has the bottom bit clear. It does not capture the information that this is a return to a Thumb function, and the callee will return in the wrong mode. A better sequence would be:

       ldr     lr, [sp], #8   @ Load function address
       blx     lr             @ Call function

The blx instruction is only available on ARMv5T and above, however, so if you need portability to older ARM cores you may need to do a bit more work.
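
To make the bottom-bit reasoning concrete, here is a small C++ model of it. This is only a sketch of the rule described above, not of any real hardware interface: the names State, state_after_interworking_branch, lr_from_mov_pc and lr_from_blx are all hypothetical, and the addresses are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model: which instruction set the core is executing.
enum class State { Arm, Thumb };

// An interworking branch selects the instruction set from bit 0 of
// the target address: set means Thumb, clear means ARM.
State state_after_interworking_branch(std::uint32_t target) {
    return (target & 1u) ? State::Thumb : State::Arm;
}

// Models "mov lr, pc": the pc never carries the Thumb bit, so the
// value that lands in lr always has bit 0 clear.
std::uint32_t lr_from_mov_pc(std::uint32_t return_addr) {
    return return_addr & ~1u;
}

// Models "blx": lr receives the return address with bit 0 set when
// the caller is executing Thumb code.
std::uint32_t lr_from_blx(std::uint32_t return_addr, State caller) {
    return (return_addr & ~1u) | (caller == State::Thumb ? 1u : 0u);
}

// A Thumb caller at a made-up address: the mov sequence loses the
// mode on return, while the blx sequence preserves it.
void check_thumb_caller() {
    const std::uint32_t return_addr = 0x8000u;  // illustrative only
    assert(state_after_interworking_branch(lr_from_mov_pc(return_addr))
           == State::Arm);
    assert(state_after_interworking_branch(
               lr_from_blx(return_addr, State::Thumb))
           == State::Thumb);
}
```

In the model, a callee that returns through the lr produced by the mov sequence lands in ARM state even though its caller was Thumb code, which is exactly the failure described above.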

ARM C++ constructors are different

Programmers who have, like me, come to ARM from other architectures may find one or two things surprising.

For example, the following code is quite simple C++:

class A {
public:
    A() {}
    ~A() {}
};

A a;

int main(void)
{
    return 0;
}

But if we compile it and examine it with gdb, there’s something a bit unexpected:

$ gcc -g constructor.cpp -o constructor
$ gdb ./constructor 
GNU gdb (GDB) 7.5.91.20130417-cvs-ubuntu
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/linaro/constructor...done.
(gdb) ptype A::A
type = class A {
  public:
    A(void);
    ~A(int);
} *(A * const)
(gdb) ptype A::~A
type = void *(A * const)
(gdb)

The types of the constructor and destructor are not quite what we might expect. Traditionally a C++ constructor or destructor does not return a value. On ARM, however, things are different – the constructor returns a pointer to class A, and the destructor returns a pointer to void.

Why is this the case? On ARM, constructors and destructors are specified to return the this pointer in order to provide scope for optimizing calls to a chain of constructors or destructors while minimizing the pushing of stack frames (tail call optimization). ARM publishes a very helpful document called the C++ ABI for the ARM Architecture, which details the differences between the ARM ABI and the generic C++ ABI, including this little quirk.
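
To see why returning this helps, the constructors can be modelled as ordinary functions. The sketch below is my own illustration, not compiler output: Base, Derived, base_ctor and derived_ctor are hypothetical names, and whether a real compiler turns the call into a tail call depends on the optimization level.

```cpp
#include <cassert>

// Hypothetical classes standing in for a real hierarchy.
struct Base { int x; };
struct Derived : Base {};

// Models a constructor under the ARM C++ ABI: it initializes the
// object and hands the this pointer back to its caller.
Base* base_ctor(Base* self) {
    self->x = 1;
    return self;
}

// Derived has nothing of its own to initialize, so its constructor
// can simply forward the pointer returned by the base constructor.
// The call is in tail position: there is no need to save "self" on
// the stack just to return it afterwards.
Derived* derived_ctor(Derived* self) {
    return static_cast<Derived*>(base_ctor(self));
}

// The chain returns the original object, fully initialized.
void check_chain() {
    Derived d;
    assert(derived_ctor(&d) == &d);
    assert(d.x == 1);
}
```

If base_ctor returned void instead, derived_ctor would have to keep its own copy of self alive across the call in order to return it, which is the extra stack traffic the ARM ABI is trying to avoid.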