rtems_region_create, and starting_memory address.
Chris Johns
chrisj at rtems.org
Thu Feb 5 02:29:58 UTC 2009
Nick Thomas wrote:
> This is getting more frustrating by the minute.
>
> Whatever I do, it just makes it worse.
>
> Now, I seem to have an infinite loop in _Heap_Allocate.
> It looks like it is calling _Workspace_Allocate, with a size of 96.
>
> But, when I break into the code, it is spinning in a for() loop in
> _Heap_Allocate function.
> Probably trying to find a big enough free block.
>
> From gdb 'info locals'
> I see:
> The_size=100
> Search_count = 5293661
> The_block = (Heap_Block *)0x0
> Ptr = (void *) 0x0
> Stats = (Heap_Statistics * const) 0x7dc428
> Tail = (Heap_Block * const) 0x7dc400
>
>
> Now, 0x7dc400 corresponds with my _Workspace_Area from the .num file.
>
> I just don't have a clue to what is going wrong, is it stack, heap,
> something else???
>
> Please help, RTEMS is driving me mad.
>
This type of bug can be frustrating, but a couple of deep breaths helps. You
are receiving generic and non-specific answers because we are getting snippets
that are light on detail about what is happening. For example, is this custom
hardware and a custom BSP?
I think we can agree you have a memory corruption bug, that is, memory is
being written to that should not be. These bugs can be simple or difficult to
find, so let's start with the simple reasons.
A common memory corruption bug in a real-time environment like RTEMS is stack
corruption. The stack size is fixed when a task is created and does not grow
using the VM tricks you get on a Unix-type system. Check your code for local
variables that are large. This can be an array, a structure that contains an
array, or an alloca() call. The stack overflow may occur in another task, not
the faulting task or interrupt. The overflowing task could be happily
sleeping, unaware of the damage it has left. The context switcher switches
tasks or dispatches an interrupt only to find the task being woken is a mess.
Also check any recursive functions; these may be overflowing the stack.
If you are happy the stack-based local variables are all small, see if you can
increase the stack size of your tasks. If you have the memory, make them
large. If this stops the corruption you will know the source but not the
location. I then use the stack check command in the shell to tell me the stack
usage, and so which task is using far more stack than I thought. Often a look
at, or stepping through, the code in that task shows the problem.
The interrupt stack is a little harder. It can be configured via the
rtems_configuration_table. See:
http://www.rtems.org/onlinedocs/releases/rtemsdocs-4.9.1/share/rtems/html/c_user/c_user00411.html
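As a rough sketch of what I mean by being generous with the stacks while you
hunt the bug (the names, sizes and priority below are only examples, check
them against your own application and RTEMS version):

  /* In the application configuration, before including <rtems/confdefs.h>,
     the interrupt stack can be bumped. The value here is only an example. */
  #define CONFIGURE_INTERRUPT_STACK_SIZE (16 * 1024)

  /* When creating a task, ask for a stack well above the minimum.
     RTEMS_MINIMUM_STACK_SIZE is the CPU dependent floor. */
  rtems_id          task_id;
  rtems_status_code sc;

  sc = rtems_task_create(
    rtems_build_name('W', 'O', 'R', 'K'),
    10,                              /* priority, example only */
    RTEMS_MINIMUM_STACK_SIZE * 8,    /* generous stack while debugging */
    RTEMS_DEFAULT_MODES,
    RTEMS_DEFAULT_ATTRIBUTES,
    &task_id
  );

Check sc as usual. If the bigger stacks make the corruption go away you have
your answer, and you can wind the sizes back once the offending task has been
found.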
One last comment on stack corruption bugs: often the backtrace command in the
debugger does not work because the stack itself is corrupted.
Check any dynamic memory allocation calls. Make sure you are handling the
pointers carefully. Also check any pointer arithmetic to make sure you are not
accessing past the end of the allocated data. A common bug is adding what you
think is a number of bytes to a struct pointer. A simple check that can
sometimes work, if you have the memory, is to increase the size of the memory
allocated at the malloc call by a large amount. For example:
p = malloc(sizeof(struct foo) + 10000);
If the pointer could be underrunning the buffer, allocate more space and move
the pointer up the buffer. Remember to observe alignment constraints.
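To make the struct pointer pitfall concrete, here is a small sketch (struct
foo and the offsets are made up for illustration):

  #include <stdlib.h>

  struct foo {
    int  id;
    char payload[60];
  };

  void example(void)
  {
    struct foo *p = malloc(10 * sizeof(struct foo));
    if (p == NULL)
      return;

    /* Wrong: intending to skip 16 bytes, but pointer arithmetic is in
       units of the pointed-to type, so this really adds
       16 * sizeof(struct foo) bytes and points well past the end. */
    struct foo *bad = p + 16;

    /* If a byte offset really is wanted, go via a char pointer. */
    struct foo *ok = (struct foo *) ((char *) p + 16);

    (void) bad;
    (void) ok;
    free(p);
  }

The same style of mistake hidden inside a memcpy length or a cast is worth
hunting for as well.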
A more difficult memory corruption problem is a memory map error. Here you
need to check the linker map to make sure the memory is laid out as it should
be. This also extends into the workspace configuration: make sure it matches
how the memory is laid out. These sorts of errors usually appear soon after
initialisation.
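If you are using confdefs.h, the workspace is sized from the configuration,
so it is worth checking that against the linker map. A minimal sketch,
assuming the usual confdefs.h style of configuration (the values are examples
only):

  #define CONFIGURE_APPLICATION_NEEDS_CONSOLE_DRIVER
  #define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER
  #define CONFIGURE_RTEMS_INIT_TASKS_TABLE

  #define CONFIGURE_MAXIMUM_TASKS    10
  #define CONFIGURE_MAXIMUM_REGIONS  2

  /* Extra slack in the workspace, in kilobytes, while debugging. */
  #define CONFIGURE_MEMORY_OVERHEAD  64

  #define CONFIGURE_INIT
  #include <rtems/confdefs.h>

Then compare _Workspace_Area in the .num file and the end of RAM in the
linker map to make sure the workspace, the heap and your regions do not
overlap.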
After this, memory corruption bugs start to get harder to find. If you can
find a location that is changing, note it. If your debugger has hardware
watchpoint support, get it to watch that location for write accesses.
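With gdb that is something like:

  (gdb) watch *(unsigned int *) 0x7dc400

using the workspace address from your dump only as an example; watch whatever
location you actually see being trashed. The target then stops on the write
that corrupts it, which usually points straight at the offending code.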
I hope this helps.
Regards
Chris