Unable to reconfigure the stack size

Gedare Bloom gedare at rtems.org
Mon Nov 21 17:34:23 UTC 2011


Just to add on: note that the "StackStart" (BSP stack) is a small
stack area that is only used during boot-up. Application tasks get
their own stacks, which are typically allocated from the Workspace
during RTEMS initialization. It is possible that the application task
stack is overflowing; if increasing
CONFIGURE_MINIMUM_TASK_STACK_SIZE to a large value (e.g. 32 * minimum)
changes the application's behavior, that is a good indicator.
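
To make that experiment concrete, here is a minimal configuration sketch,
assuming an RTEMS 4.10-style application that is configured through
<rtems/confdefs.h>. The task name, priority, the isis_task entry point and
the assumption that the default minimum is 4096 bytes are illustrative, not
taken from the original application. It is only meant as a diagnostic: if
the behavior changes with the inflated minimum, an overflow becomes more
plausible.

```c
/* Diagnostic configuration sketch for RTEMS 4.10; names and values
 * other than the CONFIGURE_* macros are illustrative assumptions. */
#include <rtems.h>
#include <rtems/stackchk.h>

rtems_task Init(rtems_task_argument ignored);

#define CONFIGURE_APPLICATION_NEEDS_CONSOLE_DRIVER
#define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER
#define CONFIGURE_MAXIMUM_TASKS            2
#define CONFIGURE_RTEMS_INIT_TASKS_TABLE

/* 32 * 4096 assumes a 4096-byte default minimum on this target. */
#define CONFIGURE_MINIMUM_TASK_STACK_SIZE  (32 * 4096)
#define CONFIGURE_STACK_CHECKER_ENABLED

#define CONFIGURE_INIT
#include <rtems/confdefs.h>

/* Hypothetical stand-in for the application task. */
static rtems_task isis_task(rtems_task_argument ignored)
{
  (void) ignored;
  /* The reported sizes should now reflect the configured minimum. */
  rtems_stack_checker_report_usage();
  rtems_task_delete(RTEMS_SELF);
}

rtems_task Init(rtems_task_argument ignored)
{
  rtems_id          id;
  rtems_status_code sc;

  (void) ignored;

  /* RTEMS_CONFIGURED_MINIMUM_STACK_SIZE asks for the configured minimum. */
  sc = rtems_task_create(rtems_build_name('I', 'S', 'I', 'S'), 10,
                         RTEMS_CONFIGURED_MINIMUM_STACK_SIZE,
                         RTEMS_DEFAULT_MODES, RTEMS_DEFAULT_ATTRIBUTES, &id);
  if (sc == RTEMS_SUCCESSFUL)
    sc = rtems_task_start(id, isis_task, 0);

  rtems_task_delete(RTEMS_SELF);
}
```

If the AVAILABLE column of the stack report does not change after this, the
configuration macro is probably not reaching confdefs.h at all.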

On Mon, Nov 21, 2011 at 12:19 PM, Joel Sherrill
<joel.sherrill at oarcorp.com> wrote:
> On 11/21/2011 11:31 AM, Fabricio de Novaes Kucinskis wrote:
>>
>> Hi Joel, and thank you for this sunday answer!
>>
>> So, RTEMS should have changed the stack size when I defined
>> CONFIGURE_MINIMUM_STACK_SIZE and used RTEMS_CONFIGURED_MINIMUM_STACK_SIZE
>> when creating the task, but it didn't.
>
> The constant is CONFIGURE_MINIMUM_TASK_STACK_SIZE.
>
> Using the wrong name is certainly going to have no effect.
>
> Sorry I didn't see that on a Sunday.
>>
>> (I think my other comments - stack checker error, STACK_SIZE and other
>> RTEMS configs - can be ignored for now, for the good of the discussion.)
>
> Yep.
>>
>> I'd like to know whether anyone on the list has been able to change the
>> default stack size for the ERC32/SIS BSPs. If someone has tried it,
>> please report whether it worked or not.
>>
> Try the right constant name.
>
> One of the examples-v2/ticker/low_ticker variations shows how to use this
> and I know it works on sis because that is where my workspace size numbers
> in presentations came from.
>>>>
>>>> You have pulled a lot of thread and looked for a horse (e.g. blown
>>>> stack) which it smells to me like a zebra (e.g. stray write onto stack
>>>> memory).
>>
>> Among all the details, I forgot to add one thing (remember, I was
>> dizzy ;)). Apart from the stack usage of the ISIS task (4036 bytes,
>> almost at the 4 kbyte limit) a few instructions before the error, the
>> local variable that is overwritten in the memcpy operation is inside
>> the .bss area:
>>
>> - Local variable address: 0x02058920
>> - End of .bss section:    0x0205c150
>>
>> Also, the instruction that causes the overwrite seems to be pretty safe:
>>
>> memcpy(&destinationElement, &sourceElement, sizeof(Element)); [not exactly
>> this, but this is what it does]
>
> The instruction is safe .. but maybe not the destination address or contents
> of the source.
>
>> Finally, when I reduce the size of the arrays that are placed inside the
>> .bss area (moving its end far from the RTEMS stack start), I have no error.
>
> Something else got corrupted. :)
>
> Since you know the memcpy is the culprit, is the destination correct given
> the source?
>
> Can you do a backtrace?
>
>> So, do you think this can be a bug in the ERC32 BSP? If so, where should
>> I look? This whole stack configuration seems a little complicated to me.
>
> It isn't a BSP error.  You are probably lucky it is reproducible. :)
>
> --joel
>>
>> Thanks again,
>>
>> Fabrício.
>>
>>
>> On Sun, 20 Nov 2011 10:18:06 -0600, Joel Sherrill wrote
>>>
>>> On 11/20/2011 09:19 AM, Fabricio de Novaes Kucinskis wrote:
>>>>
>>>> Hello everybody,
>>>>
>>>> I have an application that has blown the stack, tried a lot of
>>>> things to fix it, and up to now nothing worked. In fact, nothing that
>>>> I've tried so far had any effect on the stack size.
>>>>
>>>> It's clear to me that I'm missing or misunderstanding something. In
>>>> order to discover what, a detailed description of my problem and of
>>>> what I've tried follows (sorry for the length). My hope is that, by
>>>> describing it in detail, I expose what I'm doing wrong and allow you
>>>> to point it out.
>>>>
>>>> I'm using RTEMS 4.10.0 for the SIS BSP.
>>>>
>>>> I have a task that demands more than the RTEMS default stack size
>>>> for the ERC32 (I'm using SIS to try it, but I think this should not
>>>> be an issue). At some point a local variable is overwritten by a
>>>> memcpy applied to a large array in the .bss area, and the application
>>>> falls into an infinite loop.
>>>
>>> With sis you can use the watch command to find out where the write
>>> comes from.  It may not be a stack overflow but a stray write that
>>> just happens to hit the stack.
>>>>
>>>> The stack report taken immediately before the blow follows:
>>>>
>>>> Stack usage by thread
>>>>      ID      NAME    LOW          HIGH     CURRENT     AVAILABLE     USED
>>>> 0x09010001  IDLE 000205E2D0 - 000205F2DF 000205F0A0      4096        752
>>>> 0x0A010002  ISIS 0002060BB0 - 0002061BBF 00020617C8      4096       4036
>>>> 0xFFFFFFFF  INTR 000205C5D0 - 000205D5CF 0000000000      4080        576
>>>> Memory exception at 2cbe13c (illegal address)
>>>> Unexpected trap ( 9) at address 0x02014228
>>>> data access exception at 0x02CBE13C
>>>>
>>>> Note: the "Memory exception" error only happens when I enable the
>>>> stack checker, but I assume this is expected.
>>>>
>>> Maybe.. maybe not.. :)
>>>>
>>>> "Ok", I thought, "the default stack is not enough, so let's change
>>>> it" - and that's what I've been trying to do for the last couple of
>>>> days, with no success.
>>>>
>>>> The first thing I tried was to change the stack size for the
>>>> application, setting CONFIGURE_MINIMUM_STACK_SIZE to 8 kbytes and
>>>> changing the creation of the task to use
>>>> RTEMS_CONFIGURED_MINIMUM_STACK_SIZE. But as a new stack report has
>>>> shown, it seems to have no effect on the stack size. The same goes
>>>> for the CONFIGURE_EXTRA_TASK_STACKS #define.
>>>>
>>> CONFIGURE_MINIMUM_STACK_SIZE and the change you made to
>>> rtems_task_create for the "ISIS" task should have changed its size to
>>> 8K.
>>>
>>> CONFIGURE_EXTRA_TASK_STACKS just reserves memory in the work space to
>>> account for tasks which are created with greater than minimum.
>>>>
>>>> Starting to be worried, I've tried to change the start address of
>>>> the RTEMS work area by using CONFIGURE_EXECUTIVE_RAM_WORK_AREA, just
>>>> to see what happens. Again, nothing different.
>>>>
>>>> To illustrate, follows the configuration with everything I tried.
>>>> The corresponding stack report is exactly the same as above.
>>>>
>>>> #define CONFIGURE_MINIMUM_STACK_SIZE            (1024 * 8)
>>>> #define CONFIGURE_EXTRA_TASK_STACKS             (1024 * 8)
>>>> #define CONFIGURE_EXECUTIVE_RAM_WORK_AREA       0x02100000
>>>> #define CONFIGURE_STACK_CHECKER_ENABLED
>>>>
>>> Hmmm... I think CONFIGURE_EXECUTIVE_RAM_WORK_AREA may not be honoured
>>> by most BSPs and is definitely NOT supported with the new shared
>>> workspace framework.
>>>
>>> Anyway, the sparc BSPs are definitely overwriting that field in
>>> 4.10 without checking whether it was NULL or not.
>>>>
>>>> Now a little bit desperate, I went into the ERC32 BSP code. There is
>>>> a STACK_SIZE defined in the start.S file, but not used there. The
>>>> same is redefined in bspgetworkarea.c. But the way it is used
>>>> suggests to me that the RTEMS work area must not extend into this
>>>> area:
>>>>
>>> Yes .. unfortunately that is defined in two (or three) places. :(
>>>
>>> But it has nothing to do with task stack size.  It is the size of the
>>> stack that the BSP initialization runs on until the switch to the
>>> first task.
>>>>
>>>> void bsp_get_work_area(...) {
>>>>    /* must be identical to STACK_SIZE in start.S */
>>>>    #define STACK_SIZE (16 * 1024)
>>>>    *work_area_start      = &end;
>>>>    *work_area_size       = (void *)rdb_start - (void *)&end - STACK_SIZE;
>>>>
>>>> Since "end" is a symbol set in linkcmds.base that points to the end
>>>> of the .bss area, and "rdb_start" points to the end of RAM, I assumed
>>>> that RTEMS sets the ERC32 stack pointer to work_area_start +
>>>> work_area_size, but this seems not to be the case.
>>>>
>>>   From c/src/lib/libbsp/sparc/shared/start.S
>>>
>>>          set     (SYM(rdb_start)), %g6   ! End of RAM
>>>          st      %sp, [%g6]
>>>          sub     %sp, 4, %sp             ! stack starts at end of RAM - 4
>>>          andn    %sp, 0x0f, %sp          ! align stack on 16-byte boundary
>>>          mov     %sp, %fp                ! Set frame pointer
>>>          nop
>>>
>>> The starting stack pointer is set to the end of RAM and grows down.
>>> The area from end of ram to "end of ram - STACK_SIZE" is the starting
>>> stack.
>>>
>>>> Digging deeper into the BSP code, I saw that _CPU_Context_switch
>>>> saves the registers, including the stack pointer (not touched by
>>>> RTEMS yet). At the first call to _CPU_Context_restore_heir, the stack
>>>> pointer is "restored" to an address that I couldn't relate to anything:
>>>>
>>>> - work_area_start = 0x205c150 (end of the .bss section)
>>>> - work_area_size = 0x39feb0 (last RAM address - first free RAM
>>>>   address - STACK_SIZE)
>>>> - restored stack pointer at _CPU_Context_restore_heir = 0x20606f0 (???)
>>>>
>>> Tasks stacks are from the RTEMS Workspace.  That stack pointer
>>> (0x20606f0) is within the right address range since it is between
>>> work_area_start and its end.  It is also properly aligned.  I think
>>> that is correct.
>>>
>>> It looks to me that you have a stray write over something on the task
>>> stack.  It could be as simple as someone writing too many bytes into a
>>> buffer on the stack that isn't that large.
>>>
>>> Step out of that and watch the %sp of the task.  At some point, it
>>> must be going bad.  It is doing that because it is being restored from
>>> RAM and that memory must have been written to unintentionally.
>>>
>>> This will sound hard but you need to figure out what address the bad
>>> %sp is coming from and set a watchpoint on accesses to it.  At some
>>> point, the bad value will go in.  Then you have your culprit.
>>>
>>> Then the question is to find the fix.
>>>>
>>>> That's when I, already dizzy, stopped trying and decided to ask the
>>>> list. It must be something (maybe elementary) that I'm doing wrong,
>>>> but I don't know what it could be, nor where to look anymore.
>>>>
>>> You have pulled a lot of thread and looked for a horse (e.g. blown
>>> stack) which it smells to me like a zebra (e.g. stray write onto stack
>>> memory).
>>>
>>> --joel
>>>>
>>>> Thanks for your time and best regards,
>>>>
>>>> Fabrício Kucinskis.