still problem with ARM and Unlimited Task Test
Joachim Rahn
Joachim.Rahn at helmholtz-berlin.de
Wed Jun 15 14:31:29 UTC 2011
On 15.06.2011 15:44, Joel Sherrill wrote:
> On 06/15/2011 08:08 AM, Joachim Rahn wrote:
>> Hi Joel,
>>
>> hope I don't bother you too much...
>>
>> Now I found some time beside my main work to come back to my problem with the
>> failing Unlimited Task Test on our ARM board.
>>
>> Any advice regarding the following would be welcome!!!
>> ------------------------------------------------------
>>
>> In fact, our problem is not directly related to the update from RTEMS 4.9.3 to 4.9.5!
>>
>> BTW: We now use the shared code in the BSP tree to initialize the workspace (BSP_BOOTCARD_HANDLES_RAM_ALLOCATION = TRUE)
>> and we compile everything with all RTEMS and heap debugging enabled.
>>
>> We now have a setup with RTEMS 4.9.3 on an AT91SAM9263-EK evaluation board from Atmel which reproduces
>> our data abort fault.
>>
>> The problem seems to be that under some circumstances the routine
>>
>> _RTEMS_tasks_Switch_extension(Thread_Control *executing, Thread_Control *heir)
>>
>> is called with a reference to the executing task (*executing) after
>> the routine
>>
>> _RTEMS_tasks_Delete_extension(Thread_Control *executing, Thread_Control *deleted)
>>
>> has already deleted that task, and a subsequent call sequence through
>>
>> _RTEMS_tasks_Free
>> _Objects_Free
>> _Objects_Shrink_information
>>
>> ends up in a call to
>>
>> _Heap_Free(Heap_Control *the_heap, void *starting_address)
>>
>> which frees the memory used by the Thread_Control struct of that task and overwrites the
>> pointer "executing->task_variables" (which should be NULL at this point) with heap bookkeeping data.
>> Because "executing->task_variables" is now corrupted, the call to _RTEMS_tasks_Switch_extension
>> leads to a data abort.
>>
>> The GDB output for the call sequences concerned (walked with the gdb "up" command) looks like this:
>>
>> GDB stack walk: _RTEMS_tasks_Delete_extension (executing=0x23fca500, deleted=0x23fca500)
>> _User_extensions_Thread_delete (the_thread=0x23fca500)
>> _Thread_Close (information=0x2002fd98, the_thread=0x23fca500)
>> rtems_task_delete (id=0)
>> test_task (my_number=8)
>> _Thread_Handler ()
>> _Objects_API_maximum_class (api=536959452)
>>
>> GDB stack walk: _Heap_Free (the_heap=0x2002fe4c, starting_address=0x23fca0a0)
>> _Workspace_Free (block=0x23fca0a0)
>> _Objects_Shrink_information (information=0x2002fd98)
>> _Objects_Free (information=0x2002fd98, the_object=0x23fca500)
>> _RTEMS_tasks_Free (the_task=0x23fca500)
>> rtems_task_delete (id=0)
>> test_task (my_number=8)
>> _Thread_Handler ()
>> _Objects_API_maximum_class (api=536959452)
>>
>> NOW "executing->task_variables" IS CORRUPTED !!!!!
>>
>> GDB stack walk: _RTEMS_tasks_Switch_extension (executing=0x23fca500, heir=0x23fac488)
>> _User_extensions_Thread_switch (executing=0x23fca500, heir=0x23fac488)
>> _Thread_Dispatch ()
>> _Thread_Enable_dispatch ()
>> rtems_task_delete (id=0)
>> test_task (my_number=8)
>> _Thread_Handler ()
>> _Objects_API_maximum_class (api=536959452)
>>
>>
>> By chance, the location of "next_block->prev_size" in the call to _Heap_Free is the same as that of
>> "executing->task_variables" in the affected Thread_Control struct, and therefore _RTEMS_tasks_Switch_extension
>> tries to access a bad memory location, which of course leads to a data abort.
>> Maybe under other circumstances one would never stumble upon this?
>>
> The memory has indeed been freed and is not supposed to be used.
> In fact, executing->task_variables should be NULL. I see it set
> to NULL in _RTEMS_tasks_Delete_extension. Can you verify that?
>
YES: I've verified it; executing->task_variables is set to NULL by _RTEMS_tasks_Delete_extension!
BUT: after _Heap_Free has been called, executing->task_variables is altered,
because the _Heap_Free routine now expects next_block->prev_size at the former location of
executing->task_variables and sets it to 0xCE0 (3296 decimal).
The following call to _RTEMS_tasks_Switch_extension checks whether executing->task_variables is NULL,
but it is now 0xCE0, i.e. NOT NULL.
<...snip... cpukit/rtems/src/tasks.c >
void _RTEMS_tasks_Switch_extension(
  Thread_Control *executing,
  Thread_Control *heir
)
{
  rtems_task_variable_t *tvp;

  /*
   *  Per Task Variables
   */

  tvp = executing->task_variables;
  while (tvp) {
    tvp->tval = *tvp->ptr;
<...snip...>
Therefore the NULL check does not stop the loop, and the last line of the snippet dereferences the bogus pointer, which results in a data abort...
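
To make the sequence easier to follow, here is a small stand-alone sketch of it. This is not RTEMS code: toy_thread, toy_heap_free and the other names are hypothetical stand-ins, the struct layout is invented, and the bookkeeping write is simply aimed at the task_variables field to mimic the coincidence described above. Only the value 0xCE0 is taken from the debugging session.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for Thread_Control; only the layout matters here. */
typedef struct {
  uint32_t  other_state[4];
  uintptr_t task_variables;   /* pointer kept as an integer for the demo */
} toy_thread;

/* Byte array playing the role of the workspace block holding the thread. */
static unsigned char arena[sizeof(toy_thread)];

static void store_word(size_t offset, uintptr_t value)
{
  assert(offset + sizeof value <= sizeof arena);
  memcpy(arena + offset, &value, sizeof value);
}

static uintptr_t load_word(size_t offset)
{
  uintptr_t value;
  assert(offset + sizeof value <= sizeof arena);
  memcpy(&value, arena + offset, sizeof value);
  return value;
}

/* Step 1: the delete extension clears the per-task variable list. */
static void toy_delete_extension(void)
{
  store_word(offsetof(toy_thread, task_variables), 0);
}

/* Step 2: freeing the block lets the allocator write bookkeeping data
 * (here a size word) somewhere inside the freed memory. */
static void toy_heap_free(size_t bookkeeping_offset, uintptr_t size_word)
{
  store_word(bookkeeping_offset, size_word);
}

/* Step 3: the switch extension reads the field again through the stale
 * reference; a nonzero result means its while (tvp) loop would be entered. */
static uintptr_t toy_switch_extension_reads(void)
{
  return load_word(offsetof(toy_thread, task_variables));
}

/* Runs the three steps; returns the value the switch extension would see. */
static uintptr_t replay_sequence(void)
{
  toy_delete_extension();
  toy_heap_free(offsetof(toy_thread, task_variables), 0xCE0);
  return toy_switch_extension_reads();
}
```

Calling replay_sequence() returns the bookkeeping word instead of 0, which is the situation the switch extension runs into: the field was cleared correctly, but the memory no longer belongs to the task when it is read again.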
> I think the extensions should ensure they are not operating on
> a deleted task. The extensions pointers and task variable
> pointer should be NULL at this point. Worst case, they can
> check the state of executing and if it has STATES_DORMANT set,
> then don't do anything for executing.
>
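
A minimal sketch of the guard suggested above, written against toy types rather than the real RTEMS structures. toy_thread, toy_task_variable, toy_switch_save and the bit value of TOY_STATES_DORMANT are assumptions made for illustration only; the loop body mirrors the single line shown in the tasks.c snippet.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STATES_DORMANT 0x1u  /* hypothetical bit value for the demo */

/* Hypothetical stand-in for rtems_task_variable_t. */
typedef struct toy_task_variable {
  struct toy_task_variable *next;
  void **ptr;   /* address of the global that is swapped per task */
  void  *tval;  /* this task's saved value */
} toy_task_variable;

/* Hypothetical stand-in for Thread_Control. */
typedef struct {
  unsigned           current_state;
  toy_task_variable *task_variables;
} toy_thread;

/* Saves the outgoing task's per-task variables unless the task is dormant.
 * Returns the number of variables saved. */
static int toy_switch_save(toy_thread *executing)
{
  int saved = 0;
  toy_task_variable *tvp;

  assert(executing != NULL);

  if (executing->current_state & TOY_STATES_DORMANT)
    return 0;  /* deleted/dormant task: do not touch its variable list */

  for (tvp = executing->task_variables; tvp != NULL; tvp = tvp->next) {
    tvp->tval = *tvp->ptr;
    saved++;
  }
  return saved;
}
```

Note that such a guard still reads the state from the task's control block, so it can only help as long as that field itself has not been overwritten after the memory was freed.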
> I checked the 4.9 source for this part of the Classic API extensions.
> They are setting things to NULL and the switch extension is checking
> it.
>
> FWIW there is a PR outstanding spotted on SMP work where the
> thread stack is freed and potentially reallocated for some other
> purpose before the delete(SELF) task is finished switching out.
> I don't think that's happening here but it is worth mentioning.
>
>> BTW: When I change (as a test) the definition of CPU_HEAP_ALIGNMENT in
>> ..../rtems/cpukit/score/cpu/arm/rtems/score/cpu.h
>> from CPU_ALIGNMENT (which is 4) to something larger than CPU_ALIGNMENT,
>> the unlimited test works fine.
>>
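
One possible reading of the alignment observation, sketched under an assumption: if the heap rounds block sizes up to a multiple of the heap alignment, then a larger alignment changes where block boundaries and bookkeeping words fall, so the overlap with task_variables may simply stop occurring. That would hide the access to freed memory rather than fix it. The helper align_up is hypothetical and not part of the RTEMS API.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds value up to the next multiple of alignment.
 * Assumes alignment is a nonzero power of two. */
static size_t align_up(size_t value, size_t alignment)
{
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}
```

For example, a 20-byte request stays at 20 bytes with an alignment of 4 but becomes 24 bytes with an alignment of 8, which shifts every following block.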
>> Any idea or advice ...?
>>
>> Regards,
>> Joachim
>>
>> On 01.03.2011 17:09, Joel Sherrill wrote:
>>> On 03/01/2011 07:48 AM, Joachim Rahn wrote:
>>>> Hi all,
>>>>
>>>> after updating from rtems-4.9.3 to rtems-4.9.5 the "Unlimited Task Test" on my
>>>> ARM cpu at91sam9263 fails with a message like...
>>>>
>>>> [...skip...]
>>>> task 19 ending.
>>>> task 20 ending.
>>>> task 21 ending.
>>>> task 7 ending.
>>>> task 8 ending.
>>>>
>>>> INSN_LDR
>>>> data_abort at address 0x20018CD8, instruction: 0xE5932000, spsr = 0x20000013
>>>> active thread thread 0x0A010001
>>>> Previous sp=0x200629A8 lr=0x200135E0 and actual cpsr=60000097
>>>> 0x20038E30 0x20056EA8 0x0000117C 0x200629E0 0x200629C4 0x200135E0
>>>> 0x20018CB8 0x20038E30 0x20056EA8 0x20026EC0 0x20026EC0 0x20062A18
>>>> 0x200629E4 0x20010100 0x200135AC 0x00000000 0x00000000 0x00000000
>>>> 0x00000000 0x20056EA8 0x20038E30 0x60000013 0x600000D3 0x00000000
>>>> 0x00000000 0x20062A28 0x20062A1C 0x2000AE48 0x2000FFF8 0x20062A4C
>>>> 0x20062A2C 0x2000ADA0 0x2000AE24 0x521C9845 0x20056EA8 0x00000000
>>>> 0x00000000 0x2002ACD8 0x20062A64 0x20062A50 0x20000348 0x2000AD28
>>>> 0x00000008 0x00000001 0x20062A84 0x20062A68 0x2001C2D4 0x20000310
>>>>
>>>> [...skip...]
>>>>
>>>> which commonly means the CPU tried to access unavailable memory.
>>>>
>>>> After removing the fix for bug 1718, the "Unlimited Task Test" works fine.
>>>>
>>>> (https://www.rtems.org/bugzilla/show_bug.cgi?id=1718)
>>>>
>>>> *** rtems-4.9.3: ./cpukit/sapi/include/confdefs.h *** unlimited task test works
>>>> [...skip...]
>>>>
>>>> #define CONFIGURE_MEMORY_PER_TASK_FOR_POSIX_API \
>>>> _Configure_From_workspace( \
>>>> sizeof (POSIX_API_Control) + \
>>>> (sizeof (void *) * (CONFIGURE_MAXIMUM_POSIX_KEYS)) \
>>>> )
>>>>
>>>> [...skip...]
>>>>
>>>> *** rtems-4.9.5: ./cpukit/sapi/include/confdefs.h *** unlimited task test doesn't work
>>>> [...skip...]
>>>> #define CONFIGURE_MEMORY_PER_TASK_FOR_POSIX_API \
>>>> _Configure_From_workspace( \
>>>> CONFIGURE_MINIMUM_TASK_STACK_SIZE + \
>>>> sizeof (POSIX_API_Control) + \
>>>> (sizeof (void *) * (CONFIGURE_MAXIMUM_POSIX_KEYS)) \
>>>> )
>>>> [...skip...]
>>>>
>>>> Any hints? Or has anyone observed the same?
>>>>
>>> That patch wouldn't directly cause that failure.
>>> The only thing I can see is that it does change the
>>> amount of workspace reserved up front (by a lot).
>>>
>>> Is this a BSP which is in the RTEMS tree? I am
>>> suspicious that there isn't enough memory for
>>> the workspace/heap and the BSP initialization
>>> isn't recognizing this. Eventually the task stacks,
>>> heap, etc all collide, there is corruption and you crash.
>>>
>>> So we would need to know the following:
>>>
>>> + address of end of BSS
>>> + start of memory for heap and length
>>> + start of memory for RTEMS workspace and length.
>>> + amount of RAM
>>>
>>> Assuming that the workspace/heap are from end of
>>> BSS to the end of RAM.
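
The four values asked for above can be cross-checked mechanically. The following is only a sketch under the stated assumption that the workspace and heap live between the end of BSS and the end of RAM; the type region and the functions inside, overlap and layout_ok are hypothetical names, not BSP or RTEMS API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A memory area described by its start address and length in bytes. */
typedef struct {
  uintptr_t start;
  size_t    length;
} region;

/* True if r lies completely inside the half-open range [lo, hi). */
static bool inside(region r, uintptr_t lo, uintptr_t hi)
{
  return r.start >= lo && r.start <= hi && r.length <= hi - r.start;
}

/* True if the two regions share at least one byte.
 * Only call this after both regions passed inside(), so the sums
 * below cannot wrap around. */
static bool overlap(region a, region b)
{
  return a.start < b.start + b.length && b.start < a.start + a.length;
}

/* Checks that workspace and heap both fit between the end of BSS and
 * the end of RAM and do not collide with each other. */
static bool layout_ok(uintptr_t bss_end, uintptr_t ram_end,
                      region workspace, region heap)
{
  assert(bss_end <= ram_end);
  return inside(workspace, bss_end, ram_end)
      && inside(heap, bss_end, ram_end)
      && !overlap(workspace, heap);
}
```

Feeding in the addresses from the linker map and the BSP start-up code would show quickly whether the regions collide; the check says nothing about task stacks allocated from inside the workspace.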
>>>> Cheers
>>>>
>>>> --
>>>> Joachim
>>>>
>>>> ________________________________
>>>>
>>>> Helmholtz-Zentrum Berlin für Materialien und Energie GmbH
>>>>
>>>> Member of the Hermann von Helmholtz-Gemeinschaft Deutscher Forschungszentren e.V.
>>>>
>>>> Supervisory Board: Chair Prof. Dr. Dr. h.c. mult. Joachim Treusch, Deputy Chair Dr. Beatrix Vierkorn-Rudolph
>>>> Management: Prof. Dr. Anke Rita Kaysser-Pyzalla, Prof. Dr. Dr. h.c. Wolfgang Eberhardt, Dr. Ulrich Breuer
>>>>
>>>> Registered office: Berlin; AG Charlottenburg, 89 HRB 5583
>>>>
>>>> Postal address:
>>>> Hahn-Meitner-Platz 1
>>>> D-14109 Berlin
>>>>
>>>> http://www.helmholtz-berlin.de
>>>>
>>>
>>
>
>