Stack full or something else?

Mon Nov 15 20:15:44 UTC 2010

Hi Gedare,

Will CONFIGURE_EXTRA_TASK_STACKS also work with the POSIX API?

Best,
JM
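
For the POSIX side of the question, the stack size of an individual thread can also be requested through its attributes, independently of the configuration macros. The following is only a minimal sketch using standard pthread calls; `WORKER_STACK_SIZE`, `worker` and `run_worker` are hypothetical names, and the 64 KiB figure is an assumption, not a value taken from the application discussed here.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stack size for the worker thread; tune for the application. */
#define WORKER_STACK_SIZE (64 * 1024)

/* Hypothetical worker: stands in for a thread with large local arrays. */
static void *worker(void *arg)
{
    double *result = arg;
    double scratch[1024];              /* 8 KiB of locals on the stack */

    for (int i = 0; i < 1024; i++) {
        scratch[i] = i * 0.5;
    }
    *result = scratch[1023];
    return NULL;
}

/*
 * Create the worker with an explicit stack size and wait for it.
 * Stores the stack size recorded in the attributes in *actual.
 * Returns 0 on success or a pthread error number.
 */
static int run_worker(size_t stack_size, double *result, size_t *actual)
{
    pthread_attr_t attr;
    pthread_t thread;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0) {
        return rc;
    }

    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0) {
        rc = pthread_attr_getstacksize(&attr, actual);
    }
    if (rc == 0) {
        rc = pthread_create(&thread, &attr, worker, result);
    }
    pthread_attr_destroy(&attr);
    if (rc != 0) {
        return rc;
    }

    return pthread_join(thread, NULL);
}
```

Calling `run_worker(WORKER_STACK_SIZE, &result, &actual)` from the init thread would create one such thread. On RTEMS the memory for that stack still has to be available to the system, which is why the configuration question above matters.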

On Mon, Nov 15, 2010 at 6:34 PM, Gedare Bloom <gedare at gwmail.gwu.edu> wrote:

> João,
>
> You can configure extra space for all of your tasks like this:
>   #define CONFIGURE_EXTRA_TASK_STACKS (RTEMS_MINIMUM_STACK_SIZE * ??)
> Replace ?? with a multiplier to make it a multiple of the minimum stack
> size, or just put in an arbitrary number of bytes.
>
> Otherwise you can increase the stack_size argument to rtems_task_create for
> a particular task.
>
> You might also try configuring with CONFIGURE_STACK_CHECKER_ENABLED
> defined.  It does some extra checks for stack bounds and might give you a
> tighter window for debugging where the problem is occurring.
>
> -Gedare
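
For reference, the suggestions above could be combined in the application's configuration section roughly as follows. This is only a sketch under stated assumptions: the multiplier of 4 and the CONFIGURE_MAXIMUM_POSIX_THREADS value are placeholders, not values from this thread, while CONFIGURE_MAXIMUM_POSIX_SEMAPHORES is the value quoted further down.

```c
/* Sketch of an RTEMS application configuration; the sizes are assumptions. */
#include <rtems.h>

#define CONFIGURE_APPLICATION_NEEDS_CONSOLE_DRIVER
#define CONFIGURE_APPLICATION_NEEDS_CLOCK_DRIVER

#define CONFIGURE_MAXIMUM_POSIX_THREADS        4   /* assumed value */
#define CONFIGURE_MAXIMUM_POSIX_SEMAPHORES     15  /* value quoted in the thread */

/* Start the application in a POSIX init thread with a larger stack. */
#define CONFIGURE_POSIX_INIT_THREAD_TABLE
#define CONFIGURE_POSIX_INIT_THREAD_STACK_SIZE (RTEMS_MINIMUM_STACK_SIZE * 4)

/* Extra stack memory, expressed as a multiple of the minimum stack size. */
#define CONFIGURE_EXTRA_TASK_STACKS            (RTEMS_MINIMUM_STACK_SIZE * 4)

/* Enable the stack bounds checker suggested above. */
#define CONFIGURE_STACK_CHECKER_ENABLED

#define CONFIGURE_INIT
#include <rtems/confdefs.h>
```

With the stack checker enabled, calling rtems_stack_checker_report_usage() (declared in <rtems/stackchk.h>) at a convenient point should print how much of each stack has been used, which may help show which task is getting close to its limit.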
>
>
>
> On Mon, Nov 15, 2010 at 1:13 PM, João Rasta <freakforever at gmail.com> wrote:
>
>> Hi Joel,
>>
>> Good point. I have a task that is heavy on memory operations (multi-size
>> array operations and so on). It also uses a lot of doubles, so it could
>> well be corrupting its stack. Too bad it is 'silent' when corrupting
>> memory one way or another.
>>
>> To eliminate the stack overflow issue, I guess it is not enough to
>> increase the initial task's stack size. Is there a way to control the
>> stack sizes of the tasks that are created afterwards?
>>
>>
>> Best,
>> JM
>>
>>
>>
>> On Mon, Nov 15, 2010 at 5:43 PM, Joel Sherrill <joel.sherrill at oarcorp.com> wrote:
>>
>>> On 11/15/2010 11:24 AM, João Rasta wrote:
>>>
>>>> After some hard debugging with gdb I found out that this error is
>>>> occurring at _Semaphore_Translate_core_mutex_return_code(), but I still
>>>> don't know why it happens. Here's the disassembly of the code where the
>>>> error is generated:
>>>>
>>>> 0x40040d88 <rtems_semaphore_obtain+192>: ld      [ %l3 ], %g1
>>>> 0x40040d8c <rtems_semaphore_obtain+196>: call    0x400410c8 <_Semaphore_Translate_core_mutex_return_code>
>>>> 0x40040d90 <rtems_semaphore_obtain+200>: ld      [ %g1 + 0x34 ], %o0
>>>> 0x40040e20 <rtems_semaphore_obtain+344>: b       0x40040d8c <rtems_semaphore_obtain+196>
>>>> 0x40040e24 <rtems_semaphore_obtain+348>: ld      [ %l3 ], %g1
>>>> 0x40040ed8 <rtems_semaphore_obtain+528>: b       0x40040d8c <rtems_semaphore_obtain+196>
>>>> 0x40040edc <rtems_semaphore_obtain+532>: ld      [ %l3 ], %g1
>>>>
>>>> And the stack copy:
>>>>
>>>> Thread [3] (Suspended: Signal 'SIGSEGV' received. Description: Segmentation fault.)
>>>>    3 rtems_semaphore_obtain() c:\opt\rtems-4.10-mingw\src\rtems-4.10\cpukit\rtems\src\semobtain.c:90 0x40040edc
>>>>    2 <symbol is not available> 0x00000008
>>>>    1 <symbol is not available> 0x0000000c
>>>>
>>> If this is the backtrace, somehow the stack pointer has gotten
>>> corrupted.
>>>
>>>> The last value I could get from %g1 is 0.
>>>>
>>> If this task is not blowing its stack, then we are left with a couple
>>> of guesses:
>>>
>>> + another task is blowing its stack and corrupting memory
>>> that impacts this task.
>>> + stray write is corrupting something.
>>>
>>> When did you see the %g1 have 0?  What was the last instruction?
>>> Where was it loading from?
>>>
>>>> Here's how I'm configuring semaphores:
>>>>
>>>> #define CONFIGURE_MAXIMUM_POSIX_SEMAPHORES            15
>>>>
>>> This only affects POSIX semaphores (sem_XXX).
>>>
>>>> Any hints on what could be failing? Should I set
>>>> CONFIGURE_MAXIMUM_SEMAPHORES as well?
>>>>
>>> I doubt it since you appear to be completing a semaphore_obtain.
>>> That means (probably) that a semaphore_create worked.
>>>
>>> Do a backtrace in gdb.  I suspect you will find you are coming from a
>>> subsystem like termios.
>>>
>>> -joel
>>>
>>>>
>>>> Best,
>>>> JM
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Fri, Nov 12, 2010 at 7:34 PM, Joel Sherrill <joel.sherrill at oarcorp.com> wrote:
>>>>
>>>>    On 11/12/2010 01:05 PM, João Rasta wrote:
>>>>
>>>>        Hi Daron,
>>>>
>>>>        It is running on a LEON-3. The application exits raising the
>>>>        following exception:
>>>>
>>>>        IU in error mode (tt = 0x07)
>>>>
>>>>        which is a memory access to an unaligned address. The last
>>>>        instruction is this:
>>>>
>>>>        4003bd74  d0006034   ld  [%g1 + 0x34], %o0
>>>>
>>>>    What is the value of g1?  Is it unaligned?
>>>>
>>>>        I don't understand why I have this error. The code where this
>>>>        error is being reported is compiled independently and then put
>>>>        in a library, but it uses the same cross-compiler as the main
>>>>        source code. I don't think I'm doing anything wrong while
>>>>        compiling the library files; I use the same compilation flags.
>>>>
>>>>        What could I be missing that would cause unaligned memory accesses?
>>>>
>>>>
>>>>        Best,
>>>>        JM
>>>>
>>>>
>>>>
>>>>        On Fri, Nov 12, 2010 at 6:43 PM, Daron Chabot <daron.chabot at gmail.com> wrote:
>>>>
>>>>           On Fri, Nov 12, 2010 at 11:50 AM, João Rasta <freakforever at gmail.com> wrote:
>>>>
>>>>               Hi,
>>>>
>>>>               I have an RTEMS POSIX API application which comes to a point
>>>>               where, if a small function (with some doubles passed as
>>>>               arguments) is called, the application exits with an error. At
>>>>               first I thought of increasing the stack space. I did this with
>>>>               CONFIGURE_POSIX_INIT_THREAD_STACK_SIZE but the problem remains
>>>>               even if I remove all the contents of the function.
>>>>
>>>>               1) Am I setting up the stack size correctly? I think I am, but
>>>>               I ask just in case.
>>>>
>>>>               2) Is there any other explanation for why a function call
>>>>               crashes the application besides having a full stack? Again, I
>>>>               erased the function contents.
>>>>
>>>>
>>>>           What architecture is this running on ?
>>>>
>>>>           What is the application exit error (message and/or return code)?
>>>>
>>>>           It looks like all POSIX threads are created as floating-point
>>>>           tasks (FP state saved across context switches), so there
>>>>           "shouldn't" be a problem on that aspect...
>>>>
>>>>
>>>>
>>>>               Best,
>>>>               JM
>>>>
>>>>
>>>>
>>>>               _______________________________________________
>>>>               rtems-users mailing list
>>>>               rtems-users at rtems.org
>>>>               http://www.rtems.org/mailman/listinfo/rtems-users
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>    --
>>>>    Joel Sherrill, Ph.D.             Director of Research & Development
>>>>    joel.sherrill at OARcorp.com        On-Line Applications Research
>>>>    Ask me about RTEMS: a free RTOS  Huntsville AL 35805
>>>>    Support Available                (256) 722-9985
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>>
>>
>>
>>
>