Runtime Loader Exported Symbols Address Size

Chris Johns chrisj at rtems.org
Sat Jun 18 01:12:48 UTC 2022


On 18/6/22 10:08 am, Joel Sherrill wrote:
> On Fri, Jun 17, 2022, 2:28 PM Alex White <alex.white at oarcorp.com
> <mailto:alex.white at oarcorp.com>> wrote:
> 
>     Hi,
> 
>     While testing libdl on AArch64 QEMU, we found a bug where the exported
>     symbol table appears to always use 32-bit values for addresses, but where
>     the exported symbols table is read in `rtems_rtl_symbol_global_add`, the
>     addresses are expected to be of size `sizeof(unsigned long)`.
> 
>     This did not cause a problem on ARM since `sizeof(unsigned long)` is 4, but
>     with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read the
>     address as 64 bits instead of 32.
> 
>     The simple fix is to use an exact-width integer type - like
>     `sizeof(uint32_t)`. But this would not allow for 64-bit architectures to use
>     the full address space. It would also break on 64-bit RISC-V (see below).
>     Perhaps we could have an ifdef to choose a 32 or 64-bit type based on the
>     architecture rather than relying on `sizeof(unsigned long)`?
> 
>     It looks like there is an exception in the rtems-syms tool for 64-bit RISC-V
>     to emit 64-bit addresses rather than 32-bit addresses. Is this the right
>     solution? Should we add another exception for AArch64?
> 
> After having given this some thought, I think the code that is generated and the
> parsing code in rtl-sym.c should be conditionalized on the GCC CPP predefine
> that indicates the size of a void *.

Good idea. The symbol value is held as a `void*` in the symbol table so the key
is the generated asm size in rtems-syms.cpp matches the runtime `void*` size.
That code maybe conditional on the size selecting the correct variable size.

> The code in rtl-sym.c should use a type definition for the type that the sizeof
> is operating on. Using unsigned long is just not good enough. Both cases should
> be fixed width types.

Sure `unsigned long` may not be the best type to use. It was selected because it
happened change size with the arch but I am sure there are types that are better
suited. The type needs to be the size of a full address space pointer.

> There also should be a very good comment above the parsing method in rtl-sym.c
> explaining where the file comes from that it is reading and why there is the
> need for a type which varies by architecture.

Sure, happy to see comments added. Is this something you would like me to handle?

The generation and source of the symbols can be handled in a number of ways. I
have provided a tool that should cover most use cases however I can imagine
more, eg symbol filters for defined exported symbols. I documented things here:

https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols

Does that documentation help cover what you are looking for?

> Based on our current understanding, I think this address is everything and
> leaves sufficient information and organization to cover all the cases in the
> future. Very likely without anyone actively having to fix anything. Running into
> this issue again and again as architectures are added is not a good long-term
> solution.

Agreed. Figuring out stable 64bit addressing in libdl is a good idea and
welcome. I think the main requirement is for the full address range to be supported.

> Chris.. is this a good path to you? 

Yes this sounds good.

> Is there somewhere else in the dynamic loader system with a similar issue?

This is the only external interface of this type where static checking does not
help. The other is ELF which I hope is OK. I am not sure if other places may
have a problem. All care was taken however only using the code with 64bits will
determine this.

The elftools libelf support was ported and tested with only 32bit targets. I am
not sure if there are issues there.

Chris


More information about the devel mailing list