rtems_semaphore_obtain

Jiri Gaisler jiri at gaisler.com
Thu Apr 5 10:56:28 UTC 2007


One idea could be to count the number of active interrupt
stack frames. If the limit of the interrupt stack size is
reached, dispatching would not be done at the end of
the interrupt handler. This would avoid the stack overflow
and system crash. Dispatching would in any case be done
when the previous interrupt frame is unwound ..?
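
Something along those lines, building on the existing _ISR_Nest_level
counter (the limit and the epilogue function below are hypothetical,
just a sketch of the idea, not working RTEMS code):

   /* Sketch of a guarded interrupt epilogue.  MAX_ISR_NEST is an */
   /* assumed, BSP-chosen limit, not an existing RTEMS constant.  */
   #define MAX_ISR_NEST 8

   extern volatile unsigned long _ISR_Nest_level;  /* RTEMS global;
                                     exact type varies by version */

   void isr_epilogue_sketch( void )
   {
     if ( _ISR_Nest_level >= MAX_ISR_NEST ) {
       /* Nested too deeply: skip dispatching now.  The outer
          frame would perform the dispatch when it unwinds. */
       return;
     }
     /* ... the normal check of the dispatch-necessary flag and
        the call to _Thread_Dispatch() would go here ... */
   }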

Jiri.

Joel Sherrill wrote:
> Jiri Gaisler wrote:
>> The investigation we made showed that too fast an interrupt rate
>> can cause a stack overflow and crash the system. Ideally, the
>> code should be changed such that the current interrupt stack
>> frame is fully unwound before switching threads and re-enabling
>> interrupts. This might or might not be difficult to implement,
>> and might have some undesired side-effects.
>>
>> Joel, what is your opinion ..?
>>
>>   
> It might be desirable, but this is NOT a characteristic that is specific to
> the SPARC.  Every port does this the same way.  You eventually have to
> have enough spare CPU time to unwind the original interrupt frame.
> 
> Plus you can't really unwind the stack then.  All task switches in RTEMS
> are task-to-task.  When an interrupt occurs, some registers are not
> saved as part of the regular task-to-task switch.  These registers are
> the core of the interrupt stack frame you need to pop off.  It isn't as
> clear on the SPARC, but look at the same code on the m68k:
> 
>        .global SYM (_ISR_Dispatch)
> SYM (_ISR_Dispatch):
>        movml   d0-d1/a0-a1,a7@-
>        jsr     SYM (_Thread_Dispatch)    <---- task switch occurs here
>        movml   a7@+,d0-d1/a0-a1
>        rte
> 
> On the m68k, d0-d1/a0-a1 are scratch registers which are assumed to
> be used by _Thread_Dispatch and must be saved/restored around the call.
> On the m68k, this means that 4 registers and an interrupt stack frame are
> on the stack until the interrupted task gets switched back.
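> 
> To make the arithmetic concrete (the sizes are illustrative only,
> assuming a 68010-style 8-byte exception frame; this struct is not a
> real RTEMS type, just a picture of the residue):
> 
>        /* What each nesting level that has not yet unwound leaves */
>        /* on the stack, ignoring the ISR's own locals.            */
>        struct isr_residue {
>          unsigned long  d0, d1, a0, a1; /* 16 bytes: saved scratch regs */
>          unsigned short format_vector;  /*  2 bytes: 68010+ frame word  */
>          unsigned long  pc;             /*  4 bytes: return address     */
>          unsigned short sr;             /*  2 bytes: status register    */
>        };                               /* about 24 bytes per level     */
> 
> At roughly 24 bytes per level, about 170 un-unwound interrupts are
> already enough to exhaust a 4 KiB stack.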
> 
> It would require each port to do some pretty fancy magic to avoid this,
> and I don't even know if it is possible.  The SPARC with its register
> windows MIGHT be contorted to do it, but other architectures probably
> can't at all.
> 
> The test is structured so RTEMS thinks it needs to get to the IDLE thread
> but I am not sure it ever gets there (and back).  Does the IDLE task body
> ever run?
> 
> The bottom line is that the CPU must have some processing power left
> for tasks after the interrupts occur and in this case, there simply
> isn't any.
> 
> I did a quick experiment: I set delay to 2000 and let it count down on
> each iteration of the loop.  At 1114, the test died.  At 1234, we were
> still hitting the IDLE thread body on every interrupt.  As the number of
> cycles between ISRs decreased, we kept interrupting one or two
> instructions back from the end of _ISR_Dispatch.  Maybe that area could
> be better protected.
> 
> Jiri... I have attached my version of the test ... maybe there is
> something that could be done to make things a little better, but it
> still takes ~1114 cycles to process each "block/switch to
> IDLE/ISR/unblock/switch from IDLE to task" iteration.  We have to let
> that complete. :)
> 
> --joel
> 
>> Jiri.
>>
>> Joel Sherrill wrote:
>>  
>>> John Pickwick wrote:
>>>    
>>>> Hi,
>>>>
>>>> on that thread, after Jiri's last message, it is unclear to me whether
>>>> there is still a problem for the LEON2 BSP in the official 4.6.5
>>>> release (from the RTEMS site).
>>>>
>>>> Can someone please clarify this point?  Thanks.
>>>>
>>>>         
>>> Jiri ships a version of RTEMS with some patches that may or may not be
>>> in the official tree at any given point in time.  He will have to
>>> comment on what is in his current patch set.
>>>
>>> But so far, we haven't found anything wrong with RTEMS based upon the
>>> test code.  It looks like the code is generating interrupts faster
>>> than they can be processed, eventually blowing the stack.  There must
>>> be enough time between interrupts so that all the cleanup for the
>>> interrupted task eventually gets done.
>>>
>>> --joel
>>>    
>>>> John
>>>>
>>>> ----- Original Message ----- From: "Joel Sherrill"
>>>> <joel.sherrill at oarcorp.com>
>>>> To: "Thomas Doerfler (nt)" <Thomas.Doerfler at imd-systems.de>
>>>> Cc: <rtems-users at rtems.org>
>>>> Sent: Thursday, March 29, 2007 10:22 PM
>>>> Subject: Re: rtems_semaphore_obtain
>>>>
>>>>
>>>>        
>>>>> Thomas Doerfler (nt) wrote:
>>>>>             
>>> Hi,
>>>
>>> Some while ago we had a thread on the rtems mailing list which might
>>> address your problem.  We found out that gcc takes the liberty of
>>> moving some memory accesses that should occur between the irq
>>> disable/enable calls to a location before or after the irq-disabled
>>> section.  Try searching for a keyword like "memory barrier" in the
>>> mailing list archive :-)
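>>>
>>> For example (GCC inline-asm syntax; the macro name is ours, and
>>> whether your RTEMS version's _ISR_Disable already contains such a
>>> barrier depends on the port):
>>>
>>>        /* Full compiler barrier: gcc may not move memory accesses */
>>>        /* across this point.                                      */
>>>        #define COMPILER_BARRIER() __asm__ volatile ( "" : : : "memory" )
>>>
>>>        rtems_interrupt_level level;
>>>
>>>        rtems_interrupt_disable( level );
>>>        COMPILER_BARRIER();
>>>        shared_counter++;           /* must stay inside the section */
>>>        COMPILER_BARRIER();
>>>        rtems_interrupt_enable( level );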
>>>
>>>
>>>          
>>>>>> This was my first guess also, but it fails with the head of the 4.6
>>>>>> and 4.7 branches, as well as when using events for synchronization,
>>>>>> so it is most likely not the barrier problem again.
>>>>>>
>>>>>> --joel
>>>>>>               
>>> wkr,
>>> Thomas.
>>>
>>>
>>> Johan Zandin wrote:
>>>
>>>          
>>>>>>>> Sergei Organov  writes:
>>>>>>>>
>>>>>>>>
>>>>>>>>                      
>>>>>>>>> "Johan Zandin" <johan.zandin at space.se> writes:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>                          
>>>>>>>>>>       _Context_Switch( &executing->Registers, &heir->Registers );
>>>>>>>>>>
>>>>>>>>>>       executing = _Thread_Executing;
>>>>>>>>>>
>>>>>>>>>>       _ISR_Disable( level );         -----+
>>>>>>>>>>                                           |  Region where
>>>>>>>>>>    }                                      |  an interrupt
>>>>>>>>>>                                           |  occurring here
>>>>>>>>>>     _Thread_Dispatch_disable_level = 0;   |  causes problems
>>>>>>>>>>                                           |
>>>>>>>>>>     _ISR_Enable( level );            -----+
>>>>>>>>>>
>>>>>>>>>>                               
>>>>>>>>> But how can an interrupt occur when interrupts are disabled in
>>>>>>>>> this region?!  If _ISR_Disable()/_ISR_Enable() don't work on your
>>>>>>>>> target, you are in serious trouble anyway.
>>>>>>>>>
>>>>>>>>>                           
>>>>>>>> The HW interrupt occurs but is left pending until ISRs are enabled,
>>>>>>>> so the ISR does not execute until somewhere within the
>>>>>>>> _ISR_Enable call
>>>>>>>> (in the first cycle when ISRs are enabled in the CPU again).
>>>>>>>>
>>>>>>>> /Johan
>>>>>>>>
>>>>>>>> -----------------------------------------------------------
>>>>>>>> Johan Zandin                      Software Engineer
>>>>>>>> Saab Space AB                     Phone: +46-31-735 41 47
>>>>>>>> SE-405 15 Gothenburg, Sweden      Fax:   +46-31-735 40 00
>>>>>>>> -----------------------------------------------------------
>>>>>>>>
>>>>>>>>