still problem with ARM and Unlimited Task Test

Joel Sherrill joel.sherrill at OARcorp.com
Wed Jun 15 13:44:08 UTC 2011


On 06/15/2011 08:08 AM, Joachim Rahn wrote:
> Hi Joel,
>
> hope I don't bother you too much...
>
> Now I found some time besides my main work to come back to my problem with the
> failing Unlimited Task Test on our ARM board.
>
> Any advice regarding the following would be welcome!!!
> ------------------------------------------------------
>
> In fact our problem has nothing directly to do with the update between RTEMS 4.9.3 and 4.9.5 !
>
> BTW: We now use the shared code in the BSP tree to initialize the workspace (BSP_BOOTCARD_HANDLES_RAM_ALLOCATION = TRUE)
>       and we compile everything with all RTEMS and HEAP debugging on.
>
> We now have a setup with RTEMS 4.9.3 on an AT91SAM9263-EK evaluation board from Atmel which reproduces
> our data abort fault.
>
> The problem seems to be that under some circumstances the routine
>
>          _RTEMS_tasks_Switch_extension(Thread_Control *executing, Thread_Control *heir)
>
> is called with a reference to the executing task (*executing) after
> the routine
>
>          _RTEMS_tasks_Delete_extension(Thread_Control *executing, Thread_Control *deleted)
>
> has already deleted that task, and a subsequent call sequence through
>
>          _RTEMS_tasks_Free
>          _Objects_Free
>          _Objects_Shrink_information
>
> ends up in a call to
>
>          _Heap_Free(Heap_Control *the_heap, void *starting_address)
>
> which frees the memory used by that task's Thread_Control struct and overwrites the
> pointer "executing->task_variables" (which should be NULL by now) with heap bookkeeping data.
> Because "executing->task_variables" is now corrupted, the call to _RTEMS_tasks_Switch_extension
> leads to a data_abort.
>
> GDB output of the relevant call sequences, walked with the gdb up command, looks like:
>
> GDB stack walk: _RTEMS_tasks_Delete_extension (executing=0x23fca500, deleted=0x23fca500)
>                  _User_extensions_Thread_delete (the_thread=0x23fca500)
>                  _Thread_Close (information=0x2002fd98, the_thread=0x23fca500)
>                   rtems_task_delete (id=0)
>                   test_task (my_number=8)
>                  _Thread_Handler ()
>                  _Objects_API_maximum_class (api=536959452)
>
> GDB stack walk: _Heap_Free (the_heap=0x2002fe4c, starting_address=0x23fca0a0)
>                  _Workspace_Free (block=0x23fca0a0)
>                  _Objects_Shrink_information (information=0x2002fd98)
>                  _Objects_Free (information=0x2002fd98, the_object=0x23fca500)
>                  _RTEMS_tasks_Free (the_task=0x23fca500)
>                   rtems_task_delete (id=0)
>                   test_task (my_number=8)
>                  _Thread_Handler ()
>                  _Objects_API_maximum_class (api=536959452)
>
> NOW "executing->task_variables" IS CORRUPTED !!!!!
>
> GDB stack walk: _RTEMS_tasks_Switch_extension (executing=0x23fca500, heir=0x23fac488)
>                  _User_extensions_Thread_switch (executing=0x23fca500, heir=0x23fac488)
>                  _Thread_Dispatch ()
>                  _Thread_Enable_dispatch ()
>                   rtems_task_delete (id=0)
>                   test_task (my_number=8)
>                  _Thread_Handler ()
>                  _Objects_API_maximum_class (api=536959452)
>
>
> By chance, "next_block->prev_size" in the call to _Heap_Free occupies the same location as
> "executing->task_variables" in the affected Thread_Control struct, and therefore _RTEMS_tasks_Switch_extension
> tries to access a bad memory location, which of course leads to a data_abort.
> Maybe under other circumstances one would never stumble upon this?
>
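The overlap described above can be illustrated with a minimal sketch in plain C. This is not RTEMS code: it assumes a boundary-tag heap in which the next block's prev_size field overlays the last word of the block being freed, and mock_tcb, mock_free, and arena are hypothetical names invented for the illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a control block whose last word is the
 * task_variables pointer (stored as an integer for the demo). */
typedef struct {
  uint32_t  other_fields[3];
  uintptr_t task_variables;
} mock_tcb;

/* Simulated heap memory holding exactly one allocated block. */
static unsigned char arena[64];

/* Assumption: freeing a block records its size in the next block's
 * prev_size field, which overlays the last word of the freed block. */
static void mock_free(unsigned char *block, size_t block_size)
{
  uintptr_t prev_size = (uintptr_t) block_size;
  memcpy(block + block_size - sizeof prev_size, &prev_size, sizeof prev_size);
}

/* Read the field back the way a stale "executing" pointer would. */
static uintptr_t read_task_variables(const unsigned char *block)
{
  uintptr_t value;
  memcpy(&value, block + offsetof(mock_tcb, task_variables), sizeof value);
  return value;
}

/* Returns 1 if the field that was cleared before the free is non-zero
 * afterwards, 0 if it survived, -1 if the setup itself failed. */
int demo_stale_pointer(void)
{
  mock_tcb tcb = { { 0, 0, 0 }, 0 };  /* delete extension cleared the field */
  memcpy(arena, &tcb, sizeof tcb);
  if (read_task_variables(arena) != 0) {
    return -1;
  }
  mock_free(arena, sizeof tcb);       /* heap bookkeeping lands on the field */
  return read_task_variables(arena) != 0;
}
```

In this model a later NULL check on task_variables passes the wrong way: the field is non-zero, so the caller would walk a list that does not exist.
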
The memory has indeed been freed and is not supposed to be used.
In fact, executing->task_variables should be NULL.  I see it set
to NULL in _RTEMS_tasks_Delete_extension.  Can you verify that?

I think the extensions should ensure they are not operating on
a deleted task.  The extension pointers and task variable
pointer should be NULL at this point.  Worst case, they can
check the state of executing and, if it has STATES_DORMANT set,
skip any work for executing.
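
A rough sketch of such a guard follows, written against mock types rather than the RTEMS headers. MOCK_STATES_DORMANT, mock_thread, mock_task_variable, and mock_switch_extension are hypothetical names, and the save/restore loops only approximate what a task-variable switch extension does. The guard helps only while the control block itself is still readable.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_STATES_DORMANT 0x1u  /* hypothetical flag value for the demo */

typedef struct mock_task_variable {
  struct mock_task_variable *next;
  void **ptr;   /* address of the shared global */
  void  *tval;  /* value seen while the owning task runs */
  void  *gval;  /* value seen while other tasks run */
} mock_task_variable;

typedef struct {
  uint32_t            current_state;
  mock_task_variable *task_variables;
} mock_thread;

static void mock_switch_extension(mock_thread *executing, mock_thread *heir)
{
  mock_task_variable *tvp;

  /* Guard: a dormant (already deleted) task may carry stale pointers,
   * so do not touch its task variables at all. */
  if ((executing->current_state & MOCK_STATES_DORMANT) == 0) {
    for (tvp = executing->task_variables; tvp != NULL; tvp = tvp->next) {
      tvp->tval = *tvp->ptr;   /* save the outgoing task's value */
      *tvp->ptr = tvp->gval;   /* restore the global value */
    }
  }

  for (tvp = heir->task_variables; tvp != NULL; tvp = tvp->next) {
    tvp->gval = *tvp->ptr;     /* save the global value */
    *tvp->ptr = tvp->tval;     /* install the incoming task's value */
  }
}

/* Returns 1 if the heir's variable was installed and the dormant task's
 * stale entry (whose ptr is NULL and must not be dereferenced) was skipped. */
int demo_guarded_switch(void)
{
  static int shared_a, shared_b;
  void *global = &shared_a;
  mock_task_variable heir_var = { NULL, &global, &shared_b, NULL };
  mock_task_variable stale    = { NULL, NULL, NULL, NULL };
  mock_thread executing = { MOCK_STATES_DORMANT, &stale };
  mock_thread heir      = { 0, &heir_var };

  mock_switch_extension(&executing, &heir);
  return global == (void *) &shared_b && heir_var.gval == (void *) &shared_a;
}
```
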

I checked the 4.9 source for this part of the Classic API extensions.
They are setting things to NULL and the switch extension is checking
it.

FWIW there is an outstanding PR, spotted during SMP work, where the
thread stack is freed and potentially reallocated for some other
purpose before the delete(SELF) task has finished switching out.
I don't think that's happening here but it is worth mentioning.

> BTW: When I change (as a test) the definition of CPU_HEAP_ALIGNMENT in
>       ..../rtems/cpukit/score/cpu/arm/rtems/score/cpu.h
>       from CPU_ALIGNMENT (which is 4) to something larger than CPU_ALIGNMENT,
>       the unlimited test works fine.
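
One possible reading of this observation, offered only as a sketch: a larger heap alignment rounds block sizes up, so block boundaries (and any bookkeeping words next to them) move to different addresses, which could hide the overlap without fixing it. The helpers below are hypothetical and do not reproduce the RTEMS heap's actual size computation; the 8-byte overhead is an assumed value.

```c
#include <stddef.h>

/* Round size up to the next multiple of alignment (a power of two). */
static size_t align_up(size_t size, size_t alignment)
{
  return (size + alignment - 1) & ~(alignment - 1);
}

/* Offset at which the following block would start, for a request of
 * `request` bytes plus a hypothetical per-block `overhead`. */
static size_t next_block_offset(size_t request, size_t overhead,
                                size_t alignment)
{
  return align_up(request + overhead, alignment);
}
```

For a 100-byte request with 8 bytes of assumed overhead, the next block starts at offset 108 with 4-byte alignment but at offset 112 with 8-byte alignment, so a word that coincided with a struct field in one layout no longer does in the other.
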
>
> Any idea or advice ...?
>
> Regards,
> Joachim
>
> On 01.03.2011 17:09, Joel Sherrill wrote:
>> On 03/01/2011 07:48 AM, Joachim Rahn wrote:
>>> Hi all,
>>>
>>> after updating from rtems-4.9.3 to rtems-4.9.5 the "Unlimited Task Test" on my
>>> ARM cpu at91sam9263 fails with a message like...
>>>
>>> [...skip...]
>>> task 19 ending.
>>> task 20 ending.
>>> task 21 ending.
>>> task 7 ending.
>>> task 8 ending.
>>>
>>> INSN_LDR
>>> data_abort at address 0x20018CD8, instruction: 0xE5932000,   spsr = 0x20000013
>>> active thread thread 0x0A010001
>>> Previous sp=0x200629A8 lr=0x200135E0 and actual cpsr=60000097
>>>    0x20038E30 0x20056EA8 0x0000117C 0x200629E0 0x200629C4 0x200135E0
>>>    0x20018CB8 0x20038E30 0x20056EA8 0x20026EC0 0x20026EC0 0x20062A18
>>>    0x200629E4 0x20010100 0x200135AC 0x00000000 0x00000000 0x00000000
>>>    0x00000000 0x20056EA8 0x20038E30 0x60000013 0x600000D3 0x00000000
>>>    0x00000000 0x20062A28 0x20062A1C 0x2000AE48 0x2000FFF8 0x20062A4C
>>>    0x20062A2C 0x2000ADA0 0x2000AE24 0x521C9845 0x20056EA8 0x00000000
>>>    0x00000000 0x2002ACD8 0x20062A64 0x20062A50 0x20000348 0x2000AD28
>>>    0x00000008 0x00000001 0x20062A84 0x20062A68 0x2001C2D4 0x20000310
>>>
>>> [...skip...]
>>>
>>> which commonly means the CPU tried to access unavailable memory.
>>>
>>> After removing the fix for bug 1718, the "Unlimited Task Test" works fine.
>>>
>>> (https://www.rtems.org/bugzilla/show_bug.cgi?id=1718)
>>>
>>> *** rtems-4.9.3: ./cpukit/sapi/include/confdefs.h *** unlimited task test works
>>> [...skip...]
>>>
>>>     #define CONFIGURE_MEMORY_PER_TASK_FOR_POSIX_API \
>>>       _Configure_From_workspace( \
>>>         sizeof (POSIX_API_Control) + \
>>>        (sizeof (void *) * (CONFIGURE_MAXIMUM_POSIX_KEYS)) \
>>>       )
>>>
>>> [...skip...]
>>>
>>> *** rtems-4.9.5: ./cpukit/sapi/include/confdefs.h *** unlimited task test doesn't work
>>> [...skip...]
>>>     #define CONFIGURE_MEMORY_PER_TASK_FOR_POSIX_API \
>>>       _Configure_From_workspace( \
>>>         CONFIGURE_MINIMUM_TASK_STACK_SIZE + \
>>>         sizeof (POSIX_API_Control) + \
>>>        (sizeof (void *) * (CONFIGURE_MAXIMUM_POSIX_KEYS)) \
>>>       )
>>> [...skip...]
>>>
>>> Any hints? Has anyone else observed the same?
>>>
>> That patch wouldn't directly cause that failure.
>> The only thing I can see is that it does change the
>> amount of workspace reserved up front (by a lot).
>>
>> Is this a BSP which is in the RTEMS tree?  I am
>> suspicious that there isn't enough memory for
>> the workspace/heap and the BSP initialization
>> isn't recognizing this.  Eventually the task stacks,
>> heap, etc all collide, there is corruption and you crash.
>>
>> So we would need to know the following:
>>
>> + address of end of BSS
>> + start of memory for heap and length
>> + start of memory for RTEMS workspace and length.
>> + amount of RAM
>>
>> This assumes that the workspace/heap run from the end of
>> BSS to the end of RAM.
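
The check implied by that list can be sketched as follows, assuming the workspace and heap share the region between the end of BSS and the end of RAM. mock_memory_layout and layout_fits are hypothetical names, and the addresses in the usage note are made-up values, not measurements from the board in question.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of the memory map a BSP would report. */
typedef struct {
  uintptr_t bss_end;         /* first free address after BSS */
  uintptr_t ram_end;         /* one past the last byte of RAM */
  uintptr_t workspace_size;  /* bytes reserved for the RTEMS workspace */
  uintptr_t min_heap_size;   /* smallest acceptable C heap, in bytes */
} mock_memory_layout;

/* True if workspace and heap both fit between bss_end and ram_end. */
static bool layout_fits(const mock_memory_layout *m)
{
  uintptr_t available;

  if (m->ram_end < m->bss_end) {
    return false;
  }
  available = m->ram_end - m->bss_end;
  if (m->workspace_size > available) {
    return false;
  }
  return m->min_heap_size <= available - m->workspace_size;
}
```

With an illustrative map of bss_end = 0x20040000 and ram_end = 0x24000000, a 1 MiB workspace plus a 64 KiB heap fits, while a 64 MiB workspace does not.
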
>>> Cheers
>>>
>>> --
>>> Joachim
>>>
>>> ________________________________
>>>
>>> Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
>>>
>>> Member of the Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V.
>>>
>>> Supervisory Board: Chair Prof. Dr. Dr. h.c. mult. Joachim Treusch, Deputy Chair Dr. Beatrix Vierkorn-Rudolph
>>> Management: Prof. Dr. Anke Rita Kaysser-Pyzalla, Prof. Dr. Dr. h.c. Wolfgang Eberhardt, Dr. Ulrich Breuer
>>>
>>> Registered office: Berlin, AG Charlottenburg, 89 HRB 5583
>>>
>>> Postal address:
>>> Hahn-Meitner-Platz 1
>>> D-14109 Berlin
>>>
>>> http://www.helmholtz-berlin.de
>>>
>>


-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at OARcorp.com        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
    Support Available             (256) 722-9985
