Determining the cause of a segfault in RTEMS

Gedare Bloom gedare at rtems.org
Wed Mar 13 14:48:18 UTC 2013


On Wed, Mar 13, 2013 at 8:39 AM, Gedare Bloom <gedare at rtems.org> wrote:
> On Wed, Mar 13, 2013 at 1:59 AM, Mohammed Khoory <mkhoory at eiast.ae> wrote:
>> I've increased the stack size to 32K by defining
>> CONFIGURE_MINIMUM_TASK_STACK_SIZE and CONFIGURE_MINIMUM_STACK_SIZE and
>> CONFIGURE_INIT_TASK_STACK_SIZE .. I also made sure that I was starting new
>> tasks using RTEMS_CONFIGURED_MINIMUM_STACK_SIZE .. I'm still getting the
>> issue however as if nothing has changed. I think this means that there's
>> something wrong in my code, like something somewhere is writing out of
>> bounds, and not from the stack being too small or anything like that... So
>> I'll keep looking
>>
>> I forgot to mention that the arrays in question that involve a lot of
>> copying are around 100-120 chars in size for each task (which is probably
>> nothing compared to the default 2k-4k allocated for the stacks).
>>
> An array of 100 char is 800 bytes. 5 such arrays will overflow 4k
> limit. If you call multiple such functions the stack pressure can grow
> quickly. One easy check is to move all your arrays to global
> variables. Then they will be pre-allocated for you in the .data
> section of your program binary. Of course this won't work if you are
> multitasking or have reentrant functions.
>
Oops! I should not do math before coffee. 100 chars would be 800 bits,
so 100 bytes. I guess you would need a lot of those to overflow your
stack. Unless of course you are writing past the end of your arrays,
in which case all bets are off.

>> Thanks for the replies, it really helped me look in the right direction.
>>
>> Small question: is it normal for RTEMS_CONFIGURED_MINIMUM_STACK_SIZE to be
>> defined as 0? I've noticed this while stepping through the program, and I
>> was expecting it to be 32768. I assume maybe the RTEMS code considers 0 as
>> "check configuration" or something... I just want to make sure.
>>
> See cpukit/rtems/include/rtems.h where it is defined. I don't actually
> see the macro used anywhere in the tree though, so I don't know if it
> has any effect.
>
>>>-----Original Message-----
>>>From: Joel Sherrill [mailto:Joel.Sherrill at OARcorp.com]
>>>Sent: Wednesday, March 13, 2013 11:03 AM
>>>To: Mohammed Khoory
>>>Cc: Chris Johns; rtems-users at rtems.org
>>>Subject: RE: Determining the cause of a segfault in RTEMS
>>>
>>>For architectural reasons, 2k is very likely much too small on any SPARC
>>>BSP. Try increasing the minimum to something like 32k or larger to prove it
>>>is a stack problem.
>>>
>>>If it runs, we can talk about stack checker and usage reports.
>>>
>>>--joel
>>>
>>>Mohammed Khoory <mkhoory at eiast.ae> wrote:
>>>
>>>
>>>> > Normally in general-purpose (not embedded) programming, the most
>>>> > straightforward way to determine the cause of a segfault is to look
>>>> > at its backtrace. However, this approach isn't really helpful in my
>>>> > case.. I'm writing an RTEMS application that has around 4 tasks, and
>>>> > stepping through the program doesn't exactly show context switches.
>>>> > When I get a segfault, the backtrace only shows the following
>>>> >
>>>> > #0  0xcd95a758 in ?? ()
>>>> > #1  0x40000190 in trap_table () at
>>>> > ../../../../../../../../rtems-4.10.2/c/src/lib/libbsp/sparc/leon3/../../sparc/shared/start.S:88
>>>> >
>>>> > Which is extremely unhelpful. Stepping through the program also
>>>> > doesn't really help, because it seems to crash while waiting for
>>>> > events, which makes no sense to me.
>>>> >
>>>>
>>>> The stack appears corrupt because the exception stack frame is a
>>>> different format to the standard stack frame gdb expects and attempts
>>>> to decode. All the data is present, it is just not available via gdb's
>>>> stack frame printing.
>>>
>>>That is very helpful, thanks. I'm doing some string copying on arrays
>>>allocated on the stack, which is what I suspected is causing it, but then I
>>>dismissed it because I knew for sure that I'm not copying anything larger
>>>than what the array can hold. But I guess I should take a better look at the
>>>copying code now as I hadn't considered the fact that embedded targets tend
>>>to have small stacks.
>>>
>>>As Angelo Fraietta mentioned it could be caused by my stack size being too
>>>small.. however I saw that my minimum stack size is configured to be
>>>1024*2, which should be enough for what I'm doing.. but I'll play around
>>>with it a bit more and see how that goes.
>>>
>>>> > Is there any other proper way to figure out what's causing the
>>>> > segfault in RTEMS? I'm thinking maybe using the capture engine might
>>>> > be a good idea because it should tell what task was running last,
>>>> > but I haven't used it yet, I only know what it does.. so I'm not
>>>> > sure if that'll help.
>>>>
>>>> This is architecture and sometimes BSP specific so exact details are
>>>> not easy to give. The best solution is to find the address the exception
>>>> is branching to and then set a break point there. The idea is to get as
>>>> close as possible to the point where the exception happens. More often
>>>> than not this lets you see a decent stack frame in gdb. Have a look at
>>>> start.S and see if it is easy to see a possible entry point.
>>>
>>>The line in start.S that the backtrace refers to only defines an entry for
>>>a table of traps from what I can tell.. in this case it's a DMA access
>>>error, which indicates that something is writing somewhere that it's not
>>>supposed to. But that's the only thing I can figure out from it.
>>>
>>>Generally speaking the start.S file isn't very helpful.. It only contains
>>>code related to starting up the SPARC cpu from what I can tell...
>>>
>>>I thought that having a backtrace like this from segfaults on RTEMS was
>>>normal, which is why I sent the message in the first place :)
>>>
>>>> Which version of RTEMS are you using ?
>>>4.10.2
>>>
>>>> Which BSP are you using ?
>>>LEON3
>>>
>>>_______________________________________________
>>>rtems-users mailing list
>>>rtems-users at rtems.org
>>>http://www.rtems.org/mailman/listinfo/rtems-users


