<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Jun 17, 2022 at 8:13 PM Chris Johns <<a href="mailto:chrisj@rtems.org">chrisj@rtems.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 18/6/22 10:08 am, Joel Sherrill wrote:<br>

> On Fri, Jun 17, 2022, 2:28 PM Alex White <<a href="mailto:alex.white@oarcorp.com" target="_blank">alex.white@oarcorp.com</a><br>

> <mailto:<a href="mailto:alex.white@oarcorp.com" target="_blank">alex.white@oarcorp.com</a>>> wrote:<br>

> <br>

>     Hi,<br>

> <br>

>     While testing libdl on AArch64 QEMU, we found a bug where the exported<br>

>     symbol table appears to always use 32-bit values for addresses, but where<br>

>     the exported symbols table is read in `rtems_rtl_symbol_global_add`, the<br>

>     addresses are expected to be of size `sizeof(unsigned long)`.<br>

> <br>

>     This did not cause a problem on ARM since `sizeof(unsigned long)` is 4, but<br>

>     with AArch64 `sizeof(unsigned long)` is 8. So it is trying to read the<br>

>     address as 64 bits instead of 32.<br>

> <br>

>     The simple fix is to use an exact-width integer type - like<br>

>     `sizeof(uint32_t)`. But this would not allow for 64-bit architectures to use<br>

>     the full address space. It would also break on 64-bit RISC-V (see below).<br>

>     Perhaps we could have an ifdef to choose a 32 or 64-bit type based on the<br>

>     architecture rather than relying on `sizeof(unsigned long)`?<br>

> <br>

>     It looks like there is an exception in the rtems-syms tool for 64-bit RISC-V<br>

>     to emit 64-bit addresses rather than 32-bit addresses. Is this the right<br>

>     solution? Should we add another exception for AArch64?<br>

> <br>

> After having given this some thought, I think the code that is generated and the<br>

> parsing code in rtl-sym.c should be conditionalized on the GCC CPP predefine<br>

> that indicates the size of a void *.<br>

<br>

Good idea. The symbol value is held as a `void*` in the symbol table so the key<br>

is the generated asm size in rtems-syms.cpp matches the runtime `void*` size.<br>

That code maybe conditional on the size selecting the correct variable size.<br>

<br>

> The code in rtl-sym.c should use a type definition for the type that the sizeof<br>

> is operating on. Using unsigned long is just not good enough. Both cases should<br>

> be fixed width types.<br>

<br>

Sure `unsigned long` may not be the best type to use. It was selected because it<br>

happened change size with the arch but I am sure there are types that are better<br>

suited. The type needs to be the size of a full address space pointer.<br></blockquote><div><br></div><div>Glad you hear you agree on the basic approach. The cpp predefine is </div><div>__SIZEOF_POINTER__ and  it should be 4 or 8. If both sides use that, then this</div><div>should be permanently fixed.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> There also should be a very good comment above the parsing method in rtl-sym.c<br>

> explaining where the file comes from that it is reading and why there is the<br>

> need for a type which varies by architecture.<br>

<br>

Sure, happy to see comments added. Is this something you would like me to handle?<br>

<br>

The generation and source of the symbols can be handled in a number of ways. I<br>

have provided a tool that should cover most use cases however I can imagine<br>

more, eg symbol filters for defined exported symbols. I documented things here:<br>

<br>

<a href="https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols" rel="noreferrer" target="_blank">https://docs.rtems.org/branches/master/user/exe/loader.html#base-image-symbols</a><br>

<br>

Does that documentation help cover what you are looking for?<br></blockquote><div><br></div><div>I'll let Ryan and Alex answer that since they are fixing this. I just was </div><div>helping Ryan debug the parsing loop and realized it was chopping off the </div><div>first four bytes of every symbol after the first one. </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

> Based on our current understanding, I think this address is everything and<br>

> leaves sufficient information and organization to cover all the cases in the<br>

> future. Very likely without anyone actively having to fix anything. Running into<br>

> this issue again and again as architectures are added is not a good long-term<br>

> solution.<br>

<br>

Agreed. Figuring out stable 64bit addressing in libdl is a good idea and<br>

welcome. I think the main requirement is for the full address range to be supported.<br>

<br>

> Chris.. is this a good path to you? <br>

<br>

Yes this sounds good.<br>

<br>

> Is there somewhere else in the dynamic loader system with a similar issue?<br>

<br>

This is the only external interface of this type where static checking does not<br>

help. The other is ELF which I hope is OK. I am not sure if other places may<br>

have a problem. All care was taken however only using the code with 64bits will<br>

determine this.<br>

<br>

The elftools libelf support was ported and tested with only 32bit targets. I am<br>

not sure if there are issues there.<br></blockquote><div><br></div><div>OK. Ryan and Alex will be fixing this as part of adding aarch64 libdl support. </div><div>Just having you available to give insight is very helpful.</div><div><br></div><div>--joel </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">

<br>

Chris<br>

</blockquote></div></div>