Determining the cause of a segfault in RTEMS

Mohammed Khoory mkhoory at eiast.ae
Wed Mar 13 05:59:44 UTC 2013


I've increased the stack size to 32K by defining
CONFIGURE_MINIMUM_TASK_STACK_SIZE, CONFIGURE_MINIMUM_STACK_SIZE and
CONFIGURE_INIT_TASK_STACK_SIZE. I also made sure that I was starting new
tasks using RTEMS_CONFIGURED_MINIMUM_STACK_SIZE. I'm still getting the
issue, however, as if nothing has changed. I think this means that there's
something wrong in my code, like something somewhere writing out of
bounds, rather than the stack being too small. So I'll keep looking.
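
A minimal sketch of one way to catch an out-of-bounds write like that: put
guard bytes on both sides of a buffer and check them later. Everything here
(the guarded_* names, the guard length, the fill pattern, the 120-byte
payload) is an assumption made for illustration, not part of RTEMS or of
the application.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define GUARD_LEN   8u     /* guard bytes on each side (arbitrary choice) */
#define GUARD_BYTE  0xA5u  /* fill pattern (arbitrary choice) */
#define PAYLOAD_LEN 120u   /* assumed size of the array being protected */

/* One contiguous region: [front guard][payload][back guard]. */
typedef struct {
    unsigned char bytes[GUARD_LEN + PAYLOAD_LEN + GUARD_LEN];
} guarded_buf;

/* Fill both guards with the pattern and zero the payload. */
static void guarded_init(guarded_buf *g)
{
    assert(g != NULL);
    memset(g->bytes, GUARD_BYTE, sizeof g->bytes);
    memset(g->bytes + GUARD_LEN, 0, PAYLOAD_LEN);
}

/* Pointer to the usable area that the copying code would write into. */
static char *guarded_payload(guarded_buf *g)
{
    assert(g != NULL);
    return (char *)(g->bytes + GUARD_LEN);
}

/* Returns 1 if both guards still hold the pattern, 0 otherwise. */
static int guarded_intact(const guarded_buf *g)
{
    size_t i;

    assert(g != NULL);
    for (i = 0; i < GUARD_LEN; i++) {
        if (g->bytes[i] != GUARD_BYTE)
            return 0;
        if (g->bytes[GUARD_LEN + PAYLOAD_LEN + i] != GUARD_BYTE)
            return 0;
    }
    return 1;
}
```

Calling guarded_intact() after each copy would show which copy first
damages a guard, which narrows the search even when the backtrace is
unusable.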

I forgot to mention that the arrays in question, the ones that involve a
lot of copying, are around 100-120 chars in size for each task (which is
probably nothing compared to the default 2k-4k allocated for the stacks).
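
For copies into fixed-size arrays like these, a bounded copy that always
terminates the destination rules out one class of overrun. This is only a
sketch: copy_bounded is a hypothetical helper name, not an RTEMS or libc
function, and it assumes the source is a NUL-terminated string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most dst_size bytes including the
 * terminating NUL. Returns 1 if the whole string fit, 0 if it was
 * truncated. (Hypothetical helper, for illustration only.)
 */
static int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    size_t src_len;
    size_t n;

    assert(dst != NULL && src != NULL && dst_size > 0);

    src_len = strlen(src);
    /* Leave room for the terminator when the source is too long. */
    n = (src_len < dst_size) ? src_len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return src_len < dst_size;
}
```

Passing sizeof on the destination array (not on a pointer to it) as
dst_size keeps the bound tied to the actual array size, and a return value
of 0 flags a source string that was longer than expected.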

Thanks for the replies, they really helped me look in the right direction.

Small question: is it normal for RTEMS_CONFIGURED_MINIMUM_STACK_SIZE to be
defined as 0? I've noticed this while stepping through the program, and I
was expecting it to be 32768. I assume maybe the RTEMS code considers 0 as
"check configuration" or something... I just want to make sure.

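To make that guess concrete, this is a toy model of the behaviour I'm
assuming, where a requested size of 0 gets raised to whatever minimum was
configured. It is not taken from the RTEMS source; the function name and
its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (an assumption, not RTEMS code): any request smaller than
 * the configured minimum, including the sentinel value 0, is raised to
 * the configured minimum. Larger requests are kept as they are.
 */
static size_t model_effective_stack_size(size_t requested,
                                         size_t configured_minimum)
{
    assert(configured_minimum > 0);

    if (requested < configured_minimum)
        return configured_minimum;
    return requested;
}
```

Under this model, passing 0 with a configured minimum of 32768 would give
a 32768-byte stack, which is what I'm hoping actually happens.
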
>-----Original Message-----
>From: Joel Sherrill [mailto:Joel.Sherrill at OARcorp.com]
>Sent: Wednesday, March 13, 2013 11:03 AM
>To: Mohammed Khoory
>Cc: Chris Johns; rtems-users at rtems.org
>Subject: RE: Determining the cause of a segfault in RTEMS
>
>For architectural reasons, 2k is very likely much too small on any SPARC
>BSP. Try increasing the minimum to something like 32k or larger to prove it
>is a stack problem.
>
>If it runs, we can talk about stack checker and usage reports.
>
>--joel
>
>Mohammed Khoory <mkhoory at eiast.ae> wrote:
>
>
>> > Normally in general-purpose (not embedded) programming, the most
>> > straightforward way to determine the cause of a segfault is to look
>> > at its backtrace. However, this approach isn't really helpful in my
>> > case.. I'm writing an RTEMS application that has around 4 tasks, and
>> > stepping through the program doesn't exactly show context switches.
>> > When I get a segfault, the backtrace only shows the following
>> >
>> > #0  0xcd95a758 in ?? ()
>> > #1  0x40000190 in trap_table () at
>> > ../../../../../../../../rtems-4.10.2/c/src/lib/libbsp/sparc/leon3/../../sparc/shared/start.S:88
>> >
>> > Which is extremely unhelpful. Stepping through the program also
>> > doesn't really help, because it seems to crash while waiting for
>> > events, which makes no sense to me.
>> >
>>
>> The stack appears corrupt because the exception stack frame is a
>> different format to the standard stack frame gdb expects and attempts
>> to decode. All the data is present, it is just not available via gdb's
>> stack frame printing.
>
>That is very helpful, thanks. I'm doing some string copying on arrays
>allocated on the stack, which is what I suspected is causing it, but then I
>dismissed it because I knew for sure that I'm not copying anything larger
>than what the array can hold. But I guess I should take a better look at the
>copying code now, as I hadn't considered the fact that embedded targets tend
>to have small stacks.
>
>As Angelo Fraietta mentioned, it could be caused by my stack size being too
>small. However, I saw that my minimum stack size is configured to be 1024*2,
>which should be enough for what I'm doing. But I'll play around with it a
>bit more and see how that goes.
>
>> > Is there any other proper way to figure out what's causing the
>> > segfault in RTEMS? I'm thinking maybe using the capture engine might
>> > be a good idea because it should tell what task was running last,
>> > but I haven't used it yet, I only know what it does.. so I'm not
>> > sure if that'll help.
>>
>> This is architecture and sometimes BSP specific, so exact details are
>> not easy to give. The best solution is to find the address the exception
>> is branching to and then set a breakpoint there. The idea is to get as
>> close as possible to the point where the exception happens. More often
>> than not this lets you see a decent stack frame in gdb. Have a look at
>> start.S and see if it is easy to see a possible entry point.
>
>The line in start.S that the backtrace refers to only defines an entry for a
>table of traps, from what I can tell. In this case it's a DMA access error,
>which indicates that something is writing somewhere that it's not supposed
>to. But that's the only thing I can figure out from it.
>
>Generally speaking the start.S file isn't very helpful. It only contains
>code related to starting up the SPARC CPU, from what I can tell.
>
>I thought that having a backtrace like this from segfaults on RTEMS was
>normal, which is why I sent the message in the first place :)
>
>> Which version of RTEMS are you using ?
>4.10.2
>
>> Which BSP are you using ?
>LEON3



