BBB/AM335x: isspace() results in data abort exception in strtol() in strtol.c and strtoul() in strtoul.c

Jarielle Catbagan jcatbagan93 at gmail.com
Mon Jun 22 04:16:57 UTC 2015


On Sun, Jun 21, 2015 at 4:47 PM, Chris Johns <chrisj at rtems.org> wrote:
> On 22/06/2015 2:27 am, Jarielle Catbagan wrote:
>> I was testing the "help", "dm" and "pm" commands on the Umon command
>> line and they resulted in what appears to cause the program to hang.
>> I captured the result and it can be found here
>> https://drive.google.com/file/d/0B_44Dkqbmn75TjQwZjJaczNJYk0/view?usp=sharing.
>>
>> After doing some debugging, I was able to pinpoint the location that
>> results in a halt in program execution.  It hangs when isspace() is
>> executed in strtol() in strtol.c.
>>
>
> Where do the libc functions come from ? What is the linker command line ?
>

isspace() seems to be pulled in just fine from
/usr/arm-none-eabi/include, because removing <ctype.h> from strtol.c
results in an undefined reference to isspace().


The linker command line, which is taken from the BBB Makefile, is

        $(LINK) -e coldstart $(OBJS) monbuilt.o libz.a libg.a \
                $(LIBABIDIR) $(LIBGCC)


where libg.a contains strtol.o and strtoul.o.  libg.a is created in
/main/make/rules.make with

        libg.a: $(GLIBOBJ)
                $(AR) rc libg.a $(GLIBOBJ)


$(GLIBOBJ) is created in /main/make/rules.make as well with

        $(GLIBOBJ):
                $(CC) $(CFLAGS) -o ${@F} $(GLIBDIR)/$(@:%.o=%.c)


where $(CFLAGS) expands to the definition in /main/make/common.make as

        CFLAGS = $(COMMON_CFLAGS) $(CUSTOM_CFLAGS)  \
                $(COMMON_INCLUDE) $(CUSTOM_INCLUDE)


The compiler flags are in $(COMMON_CFLAGS) and $(CUSTOM_CFLAGS), which expand to

        COMMON_CFLAGS = -g -c -Wall -DPLATFORM_$(PLATFORM)=1 \
                -fno-builtin -fno-toplevel-reorder

and

        CUSTOM_CFLAGS = -mcpu=cortex-a8 -O2 -isystem $(ABIDIR)/include \
                -Wno-char-subscripts


So it appears that libc is not being linked in.  I added "-lc" to
COMMON_CFLAGS to link libc in, but the data abort exception still occurs.


>> I was suspecting that perhaps some exception has occurred.  I know that
>> the exception handlers defined by Umon are located in rom_reset.S.  I
>> then realized that the custom exception handlers have no way of being
>> called when an exception occurs because the AM335x does not know the
>> location of these custom exception handlers.  By referring to the
>> AM335x TRM, Section 26.1.3.2, it can be found that when an exception
>> occurs, the default exception handlers are used which are established
>> by the internal ROM code.  The locations of where these default
>> exception handlers are shown in Table 26-3.
>>
>> It is also stated in this section that the addresses of the default
>> exception handlers can be overridden with custom addresses indicating
>> the locations of the custom exception handlers.  The default exception
>> handlers are replaced with Umon's custom exception handlers with the
>> previous submitted patch, which can be found here
>> https://lists.rtems.org/pipermail/umon-devel/2015-June/000072.html.
>>
>> With Umon's exception handlers in place, the base Umon image was
>> rebuilt, and executed again.  This time, when the command such as
>> "help" is executed, an exception does occur, and Umon indicates this.
>> The type of exception is a data abort exception.  When this exception
>> occurs, Umon simply restarts.  I captured this exception occurring and
>> Umon restarting and it can be found here
>> https://drive.google.com/open?id=0B_44Dkqbmn75WnNsWlJlbkpvTHc&authuser=0.
>> In the image shown in the previous link, Umon indicates that the data
>> exception occurs around 0x403082dc.  This is a valid memory address as
>> it is within the 109 KB range of the internal SRAM and that Umon is
>> mapped within.
>>
>> I wanted to find out what exact data was located at this address.
>> strtol() and strtoul() do not hang if the invocation of isspace() in
>> the while conditional of the do/while loop is replaced with while(c ==
>> ' ').  Since "help" uses strtol() and "dm"/"pm" use strtoul(), I
>> modified strtoul() only so that I can use "dm" to poke around memory
>> when the data abort exception occurs as a result of executing "help".
>> This modification is only a temporary mechanism to assist me in
>> debugging.
>
> I think you have two problems here, the first you need to get stable
> working exceptions handlers that show the problem on the console. Once
> this is sorted the reason for the exception can be found and removed.
>
> That is, this is a nice test for the exception handlers and you should
> get them working.
>


I updated Umon's exception handlers to output the LR, CPSR, and SPSR
at the time of the data abort exception.  I captured this output, which
can be found here https://drive.google.com/open?id=0B_44Dkqbmn75dzRXbHVrQ3FLSTQ&authuser=0.
Would something like this be enough for the exception handlers to be
considered stable and working?
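
To make the printed CPSR and SPSR values easier to read, a small helper
along the following lines could decode them.  This is only a sketch: the
function names are hypothetical and are not part of Umon.  It relies only
on the ARMv7-A PSR layout, where bits [4:0] hold the processor mode and
bit 5 is the Thumb state bit.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Umon): name the mode in PSR bits [4:0]. */
const char *psr_mode_name(uint32_t psr)
{
    switch (psr & 0x1Fu) {
    case 0x10: return "usr";
    case 0x11: return "fiq";
    case 0x12: return "irq";
    case 0x13: return "svc";
    case 0x16: return "mon";
    case 0x17: return "abt";
    case 0x1B: return "und";
    case 0x1F: return "sys";
    default:   return "???";   /* reserved or unknown encoding */
    }
}

/* Hypothetical helper: returns 1 if the T bit (bit 5) is set, else 0. */
int psr_is_thumb(uint32_t psr)
{
    return (int)((psr >> 5) & 1u);
}
```

Inside the data abort handler the CPSR should decode as "abt", while the
SPSR shows the mode that was active when the abort was taken.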


The Cortex-A8 TRM, found here
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344k/DDI0344K_cortex_a8_r3p2_trm.pdf,
states in Section 2.15.1 "Exception entry and exit summary" that when
the data abort exception handler is entered, LR contains PC + 8, where
PC is the address of the instruction that caused the data abort
exception.  Referring to the previous image, the value of LR at the
time the data abort exception handler is entered is 0x402f7c08.  This
means that the instruction that caused the exception is at 0x402f7c00.
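
The same arithmetic as a minimal C sketch (the function name is
hypothetical; the only fact used is the LR = PC + 8 rule for data aborts
quoted above):

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the LR value seen on entry to the data abort
 * handler, return the address of the instruction that aborted.
 */
uint32_t data_abort_fault_pc(uint32_t lr_abt)
{
    return lr_abt - 8u;
}
```

Passing in the captured LR of 0x402f7c08 gives 0x402f7c00.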

Looking at the disassembled output of the Umon image, boot.elf, the
instruction at the address 0x402f7c00 is

        ldrb r3, [r3, #1]

The code containing the instruction is as follows:

long
strtol(const char * __restrict nptr, char ** __restrict endptr, int base)
{
402f7bc8: e30435a0 movw r3, #17824 ; 0x45a0
402f7bcc: e3443030 movt r3, #16432 ; 0x4030
402f7bd0: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
402f7bd4: e1a06001 mov r6, r1
402f7bd8: e1a08002 mov r8, r2
402f7bdc: e24dd00c sub sp, sp, #12
402f7be0: e1a07000 mov r7, r0
402f7be4: e5931000 ldr r1, [r3]
        /*
        * Skip white space and pick up leading +/- sign if any.
        * If base is 0, allow 0x for hex and 0 for octal, else
        * assume decimal; if base is already 16, allow 0x.
        */
        s = nptr;
402f7be8: e1a02000 mov r2, r0
402f7bec: ea000000 b 402f7bf4 <strtol+0x2c>
402f7bf0: e1a02005 mov r2, r5
        do {
                c = *s++;
402f7bf4: e1a05002 mov r5, r2
402f7bf8: e4d5c001 ldrb ip, [r5], #1
        } while (isspace((unsigned char)c));
402f7bfc: e081300c add r3, r1, ip
402f7c00: e5d33001 ldrb r3, [r3, #1]
402f7c04: e2033008 and r3, r3, #8
402f7c08: e21330ff ands r3, r3, #255 ; 0xff
402f7c0c: 1afffff7 bne 402f7bf0 <strtol+0x28>

As described in my previous message (quoted below), r1 appears to
contain 0, so after the add at 0x402f7bfc r3 contains 0x31, the ASCII
code for '1'.  When ldrb r3, [r3, #1] is then executed, it tries to
read memory at address 0x32.  This address is within the GPMC
subsystem, which is used for interfacing with external memories like
NAND.  Umon does not use this region, and the GPMC interface has not
been set up at this point, which would explain the data abort.
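
To make the address arithmetic concrete, here is a sketch of what the
loop appears to be doing.  It assumes that isspace() in this toolchain's
<ctype.h> is a macro that indexes a classification table through a
pointer variable, which is what the disassembly suggests: a pointer is
loaded, the character is added to it, a byte at offset 1 is read, and
the result is masked with 8.  All names below are hypothetical.

```c
#include <stdint.h>

#define CT_SPACE 0x08u   /* the mask used by "and r3, r3, #8" */

/*
 * Address that "add r3, r1, ip; ldrb r3, [r3, #1]" would read:
 * table base + character + 1.
 */
uintptr_t ctype_lookup_addr(const unsigned char *table, unsigned char c)
{
    return (uintptr_t)table + (uintptr_t)c + 1u;
}

/*
 * Table-driven isspace() in the style assumed above.  It is only
 * meaningful when 'table' points at a valid 257-entry table.
 */
int table_isspace(const unsigned char *table, unsigned char c)
{
    return (table + 1)[c] & CT_SPACE;
}
```

With a table pointer of 0 and the character '1' (0x31), the computed
address is 0x32, which matches the access described above.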

My question is: why is the generated code accessing this memory
location?  The linker script already specifies that Umon's memory
range is 0x402f0400 - 0x4030b800.


>>
>> Using "dm" to view the contents of memory starting at address
>> 0x403082dc, I then captured the output and it can be found here
>> https://drive.google.com/open?id=0B_44Dkqbmn75LUVrVUZMLXFNanM&authuser=0.
>>
>> Before proceeding further, one thing worth mentioning is the general
>> function invocations leading to isspace().  When the "help" command is
>> executed, showhelp() is eventually invoked in docmd.c.  When Umon
>> attempts to display the output of the "help" command, it invokes
>> printf("%-12s", cptr->name) at the end of showhelp().  Eventually
>> strtol() is invoked which is passed the format string from the printf
>> but only "12".  In the first iteration of the do/while loop in
>> strtol(), isspace is first passed the first character from "12", i.e.
>> '1'.  The string "12" is stored starting at 0x403082dc as shown in the
>> output of the "dm" command shown in the previous image and is the
>> location where Umon indicates where the data exception occurs nearby.
>> Based on my intuition, this is valid data in valid memory and I
>> believe that the exception is not triggered by the memory access of
>> 0x403082dc but by a different memory access.
>>
>> Whenever Umon is built, there are two important images built.
>> boot.bin and boot.elf.  boot.bin is a raw image and is the image that
>> is booted on the BBB.  I wanted to see how the image was assembled,
>> and so I ran "arm-none-eabi-objdump -S boot.elf".  I am focusing on
>                 ^^^^^^^^^^^^^
>
> What tool set is being used ?


I am using arm-none-eabi-gcc 4.9.2 and binutils 2.25.0.


>
> Chris
>
>> how the data exception is triggered and I know that it must be
>> occurring near isspace() in strtol().
>>
>> A subset of the output of arm-none-eabi-objdump showing strtol() and
>> isspace() invoked within, is shown below.
>>
>> ---------------------------------------------------------------------------------
>>
>> 402f7b70 <strtol>:
>>  * Assumes that the upper and lower case
>>  * alphabets and digits are each contiguous.
>>  */
>> long
>> strtol(const char * __restrict nptr, char ** __restrict endptr, int base)
>> {
>> 402f7b70: e30435a0 movw r3, #17824 ; 0x45a0
>> 402f7b74: e3443030 movt r3, #16432 ; 0x4030
>> 402f7b78: e92d4ff0 push {r4, r5, r6, r7, r8, r9, sl, fp, lr}
>> 402f7b7c: e1a06001 mov r6, r1
>> 402f7b80: e1a08002 mov r8, r2
>> 402f7b84: e24dd00c sub sp, sp, #12
>> 402f7b88: e1a07000 mov r7, r0
>> 402f7b8c: e5931000 ldr r1, [r3]
>> /*
>> * Skip white space and pick up leading +/- sign if any.
>> * If base is 0, allow 0x for hex and 0 for octal, else
>> * assume decimal; if base is already 16, allow 0x.
>> */
>>         s = nptr;
>> 402f7b90: e1a02000 mov r2, r0
>> 402f7b94: ea000000 b 402f7b9c <strtol+0x2c>
>> 402f7b98: e1a02005 mov r2, r5
>>         do {
>>                 c = *s++;
>> 402f7b9c: e1a05002 mov r5, r2
>> 402f7ba0: e4d5c001 ldrb ip, [r5], #1
>>         } while (isspace((unsigned char)c));
>> 402f7ba4: e081300c add r3, r1, ip
>> 402f7ba8: e5d33001 ldrb r3, [r3, #1]
>> 402f7bac: e2033008 and r3, r3, #8
>> 402f7bb0: e21330ff ands r3, r3, #255 ; 0xff
>> 402f7bb4: 1afffff7 bne 402f7b98 <strtol+0x28>
>>         if (c == '-') {
>> 402f7bb8: e35c002d cmp ip, #45 ; 0x2d
>>         * If base is 0, allow 0x for hex and 0 for octal, else
>>         * assume decimal; if base is already 16, allow 0x.
>>         */
>>         s = nptr;
>>         do {
>>                 c = *s++;
>> 402f7bbc: e1a0900c mov r9, ip
>>         } while (isspace((unsigned char)c));
>>         if (c == '-') {
>> 402f7bc0: 0a000061 beq 402f7d4c <strtol+0x1dc>
>>                 neg = 1;
>>                 c = *s++;
>>         } else {
>>                 neg = 0;
>>                 if (c == '+')
>> 402f7bc4: e35c002b cmp ip, #43 ; 0x2b
>>         } while (isspace((unsigned char)c));
>>
>> ---------------------------------------------------------------------------------
>>
>> At the very top of the output posted, r3 is set to contain the value
>> 0x403045a0 from the instructions at 0x402f7b70 and 0x402f7b74, where
>> r3 is then dereferenced at 0x402f7b8c to retrieve the value contained
>> at this address and is stored in r1.
>>
>> Looking at the data at this location, there is nothing as shown in the
>> screen capture here
>> https://drive.google.com/open?id=0B_44Dkqbmn75VUlxSGt2cWJnck0&authuser=0.
>>
>> Later on, when a character is obtained from the string "12" with the
>> instruction at 0x402f7ba0 it is added with the value stored previously
>> in r1 and stored back in r3 with the instruction at 0x402f7ba4.
>>
>> If I take into account that r1 contains 0 and that the value of the
>> character obtained, which in the case of the first character would be
>> 0x31 for '1' in ASCII, then the value is still 0x31 after executing
>> the instruction at 0x402f7ba4.
>>
>> This value is then added with 1 and is used as the address of a memory
>> location which is dereferenced and stored in r3 with the instruction
>> at 0x402f7ba8.  Could this instruction, at 0x402f7ba8, be the
>> instruction that can cause the data abort exception?  Looking at the
>> memory map in the AM335x TRM, this is an address (0x31) within the
>> GPMC subsystem.  This subsystem is used for external memory such as
>> NAND.   The interface to NAND has not been set up yet, so it would
>> make sense that this could cause a data abort exception.  I apologize
>> that I am not actually able to verify the exact contents of the
>> registers at the time the data abort exception occurs as I don't have
>> a JTAG debugger yet.
>>
>> Is there a possibility that I could be missing something?  Am I
>> interpreting the disassembled output correctly?
>> _______________________________________________
>> umon-devel mailing list
>> umon-devel at rtems.org
>> http://lists.rtems.org/mailman/listinfo/umon-devel
>>

