Joel Sherrill joel.sherrill at oarcorp.com
Thu Apr 5 14:03:27 UTC 2007

Johan Zandin wrote:
> Jiri Gaisler wrote:
>>> The investigation we made showed that a too fast interrupt rate
>>> can cause a stack overflow and crash the system. Ideally, the
>>> code should be changed such that the current interrupt stack
>>> frame is fully unwound before switching threads and re-enabling
>>> interrupts. This might or might not be difficult to implement,
>>> and might have some undesired side-effects.
> Joel Sherrill wrote:
>> It might be desirable but this is NOT a characteristic that is specific to
>> the SPARC.  Every port does this the same way.
> This means that my problem still may be the same as the one starting this
> mailing list thread. (Any news on that one, by the way?)
>> The test is structured so RTEMS thinks it needs to get to the 
>> IDLE thread but I am not sure it ever gets there (and back).  Does the 
>> IDLE task body ever run?
> When interrupted in this sensitive area, I think not.
>> The bottom line is that the CPU must have some processing power left
>> for tasks after the interrupts occur and in this case, there simply 
>> isn't any.
> With the current RTEMS behaviour, the constraint is actually much harder
> than that: It is not sufficient that tasks in general get enough CPU time
> compared to ISRs. In order to avoid this problem, each individual task must
> be allowed to be completely dispatched, before being just
> halfway-dispatched-and-then-interrupted too many times in a row. Note that
> "times in a row" refers to the times when that particular task is
> dispatched at all. Those may be rather far apart if it's a low priority
> task and other tasks are competing hard for the CPU.

> How many is then "too many times"? Well, in the SPARC case, the stack
> pointer of a task seems to be decreased 376 bytes every time it is
> interrupted in the middle of its dispatching. (But once it is completely
> dispatched, all bytes accumulated this way seem to be properly restored.)
> The idle task has a stack size of 4106 bytes, which means that if it's
> interrupted while being dispatched 11 times in a row, the memory area
> before the stack will be overwritten.
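
As a rough check of the arithmetic quoted above, the count can be computed
directly. This is only a sketch: the function name is hypothetical, the 376
and 4106 byte figures are the ones reported in the quoted message, and the
model assumes the whole stack is free when the first interrupted dispatch
happens.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of RTEMS.
 * Returns the number of consecutive interrupted dispatches after which the
 * accumulated frames no longer fit in a stack of stack_bytes, assuming each
 * interrupted dispatch leaves frame_bytes behind and nothing else is on the
 * stack. Returns 0 if frame_bytes is 0 (no growth in this model).
 */
size_t interrupted_dispatches_until_overflow(size_t stack_bytes,
                                             size_t frame_bytes)
{
    if (frame_bytes == 0) {
        return 0;
    }
    /* stack_bytes / frame_bytes frames still fit; the next one does not. */
    return stack_bytes / frame_bytes + 1;
}
```

With the quoted figures, interrupted_dispatches_until_overflow(4106, 376)
gives 11: ten frames use 3760 bytes and still fit, the eleventh needs 4136.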

I don't see a way to eliminate this (completely).  In order to honor the
interrupt and thread scheduling requirements, you have to:

+ get to end of ISR
+ return to a thread->thread switch
   - some context has to be saved, and making those calls uses some stack,
     because the thread is made to look like it arbitrarily called
     _Thread_Dispatch at a random point in its execution.

Generically, RTEMS always returns to the interrupted thread and then
makes that thread switch to another thread.  This is the design chosen
because interrupt and task modes are usually different enough on a CPU
architecture that you need to get back to a task state.

Right now, when you get to _ISR_Dispatch, _Thread_Dispatch_disable_level
is 0.  I can see where if _ISR_Dispatch was executed with
_Thread_Dispatch_disable_level set to 1, you would avoid reentering the
scheduling path until nearly the bottom of _Thread_Dispatch. 
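
The effect of the disable level described above can be illustrated with a
toy model. This is NOT RTEMS source: every toy_* name is hypothetical,
recursion stands in for frames left on the interrupted task's stack, and
the model assumes one interrupt arrives inside every dispatch until the
supply of interrupts runs out.

```c
#include <stdbool.h>

/* Toy scheduler state; all names are hypothetical. */
struct toy_sched {
    int  disable_level;   /* stands in for _Thread_Dispatch_disable_level */
    bool dispatch_needed; /* set by an interrupt that readies a task */
    int  depth;           /* dispatch frames currently stacked */
    int  max_depth;       /* deepest stacking seen so far */
    int  irqs_left;       /* interrupts still to arrive */
};

void toy_isr_dispatch(struct toy_sched *s, int level_in_dispatch);

/* An interrupt that requests a dispatch; on exit it only starts a new
 * dispatch if dispatching is currently enabled (level 0). */
void toy_interrupt(struct toy_sched *s, int level_in_dispatch)
{
    s->irqs_left--;
    s->dispatch_needed = true;
    if (s->disable_level == 0) {
        toy_isr_dispatch(s, level_in_dispatch); /* re-entry: one more frame */
    }
}

/* Models the path from ISR exit through the dispatch loop. */
void toy_isr_dispatch(struct toy_sched *s, int level_in_dispatch)
{
    s->depth++;
    if (s->depth > s->max_depth) {
        s->max_depth = s->depth;
    }
    s->disable_level = level_in_dispatch;
    do {
        s->dispatch_needed = false;
        if (s->irqs_left > 0) {
            toy_interrupt(s, level_in_dispatch); /* hits the window */
        }
    } while (s->dispatch_needed); /* deferred requests handled in place */
    s->disable_level = 0;
    s->depth--;
}

/* Runs the model and reports how many dispatch frames were stacked. */
int toy_max_depth(int irqs, int level_in_dispatch)
{
    struct toy_sched s = { 0, false, 0, 0, irqs };
    toy_isr_dispatch(&s, level_in_dispatch);
    return s.max_depth;
}
```

In this model, toy_max_depth(5, 0) gives 6 (each interrupt re-enters the
dispatch path and stacks another frame), while toy_max_depth(5, 1) gives 1
(the requests are deferred and handled by the loop in the same frame). It
says nothing about the remaining window at the bottom of the real
_Thread_Dispatch.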

It is also possible -- I would have to think on this though -- that the
setting of _Thread_Dispatch_disable_level to 0 could/should be moved to after
the post context switch extension is run.  But that still leaves a window from
the bottom of _Thread_Dispatch until _ISR_Dispatch completes where another
ISR could run and request another dispatch.

I seem to recall that some versions of RTEMS returned to _ISR_Dispatch at
the same interrupt disable level as the interrupt they were leaving.  I 
think this would really address this case because even though you are 
getting another interrupt, in general, you could switch to another task and
need to return to the interrupted thread before it could pop the last
frame off the stack.

I still can't see how you can return from an ISR, save some context for the
interrupted task, and switch to another task without leaving something on
the interrupted task's stack.
> Even in a fairly small system with maybe 20 tasks and 10 different
> interrupt sources, it is very hard to guarantee that this will not
> happen for a low priority task. Depending on the timing between
> interrupts and tasks, it will probably be interrupted in the middle of
> its dispatching from time to time. And someday, old Murphy will make
> sure that this happens too many times in a row...
Possible but scheduling says the next interrupt will likely be on a
higher priority task and that eventually you will get back to the lower
priority task.

Unless your interrupts are so fast that you never really process them 
> Increasing the stack size could lessen this risk, but it is not certain
> that such measures can remove it completely. Imagine for example the
> following, extremely simple software (which is very similar to the
> scenarios we have been running in the simulator for the last week):
>   1 low priority idle task doing nothing.
>   1 high priority active task doing this:
>     while (true)
>     {
>       Start hardware activity X.
>       Wait for semaphore S.
>       Do some other stuff for a long time without releasing the CPU.
>     }
>   1 ISR which
>     a) Is executed whenever hardware activity X is completed.
>     b) Just signals the semaphore S. 
> If you are unlucky enough with the hardware timing, the ISR will _always_
> occur just in the critical part of _Thread_Dispatch, when RTEMS is on its
> way to dispatch the idle task! In the long run, not even a 4 GB stack for
> the idle task will help you then...
But your CPU utilization is near 100%.  The high priority task is repeatedly
consuming nearly all CPU time before the interrupt occurs.  There is just
enough time to partially return to the IDLE thread.
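
The "times in a row" behaviour discussed in this exchange can be sketched as
a small simulation. It is a model under stated assumptions, not a
measurement: the function name is hypothetical, each interrupted dispatch is
assumed to leave a fixed number of bytes on the task's stack, and one
completed dispatch is assumed to restore all of them, as reported in the
quoted message.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of RTEMS.
 * interrupted[i] is true if the i-th dispatch attempt of the task was
 * interrupted in the critical window, false if it completed.
 * Returns the index of the first attempt that overflows the stack,
 * or -1 if the sequence never overflows.
 */
long first_overflowing_attempt(const bool *interrupted, size_t n,
                               size_t stack_bytes, size_t frame_bytes)
{
    size_t used = 0;

    for (size_t i = 0; i < n; i++) {
        if (interrupted[i]) {
            used += frame_bytes;      /* another frame is left behind */
            if (used > stack_bytes) {
                return (long)i;
            }
        } else {
            used = 0;                 /* completed dispatch restores all */
        }
    }
    return -1;
}
```

With the 4106 and 376 byte figures from the quoted message, a sequence of
eleven interrupted attempts overflows at index 10, while the same pattern
with a single completed dispatch in the middle does not overflow. In this
model a larger stack only raises the number of consecutive interrupted
attempts needed; it never removes the overflow if every attempt is
interrupted.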

> Best regards
> /Johan Zandin
> -----------------------------------------------------------
> Johan Zandin                      Software Engineer
> Saab Space AB                     Phone: +46-31-735 41 47
> SE-405 15 Gothenburg, Sweden      Fax:   +46-31-735 40 00
> -----------------------------------------------------------
