rtems_semaphore_obtain

Johan Zandin johan.zandin at space.se
Thu Apr 5 11:56:29 UTC 2007


Jiri Gaisler wrote:
>> The investigation we made showed that a too fast interrupt rate
>> can cause a stack overflow and crash the system. Ideally, the
>> code should be changed such that the current interrupt stack
>> frame is fully unwound before switching threads and re-enabling
>> interrupts. This might or might not be difficult to implement,
>> and might have some undesired side-effects.
   
Joel Sherrill wrote:
>It might be desirable but this is NOT a characteristic that is specific to
>the SPARC.  Every port does this the same way.

This means that my problem may still be the same as the one that started
this mailing list thread. (Any news on that one, by the way?)

>The test is structured so RTEMS thinks it needs to get to the 
>IDLE thread but I am not sure it ever gets there (and back).  Does the 
>IDLE task body ever run?

When interrupted in this sensitive area, I think not.

>The bottom line is that the CPU must have some processing power left
>for tasks after the interrupts occur and in this case, there simply 
>isn't any.

With the current RTEMS behaviour, the constraint is actually much stricter
than that: it is not sufficient that tasks in general get enough CPU time
compared to ISRs. To avoid this problem, each individual task must be
allowed to be completely dispatched before being just
halfway-dispatched-and-then-interrupted too many times in a row. Note that
"times in a row" refers to the times when that particular task is
dispatched at all. Those may be rather far apart if it is a low priority
task and other tasks are competing hard for the CPU.

How many, then, is "too many times"? Well, in the SPARC case, the stack
pointer of a task seems to be decremented by 376 bytes every time it is
interrupted in the middle of its dispatching. (But once it is completely
dispatched, all bytes accumulated this way seem to be properly restored.)
The idle task's stack is 4106 bytes, which means that if it is interrupted
while being dispatched 11 times in a row, the memory area before the stack
will be overwritten.

Even in a fairly small system with maybe 20 tasks and 10 different
interrupt sources, it is very hard to guarantee that this will not happen
for a low priority task. Depending on the timing between interrupts and
tasks, it will probably be interrupted in the middle of its dispatching
from time to time. And someday, old Murphy will make sure that this
happens too many times in a row...

Increasing the stack size could lessen this risk, but it is not certain
that such measures can remove it completely. Imagine for example the
following, extremely simple software (which is very similar to the
scenarios we have been running in the simulator this last week):

  1 low priority idle task doing nothing.

  1 high priority active task doing this:
    while (true)
    {
      Start hardware activity X.
      Wait for semaphore S.
      Do some other stuff for a long time without releasing the CPU.
    }

  1 ISR which
    a) Is executed whenever hardware activity X is completed.
    b) Just signals the semaphore S. 

If you are unlucky enough with the hardware timing, the ISR will _always_
occur just in the critical part of _Thread_Dispatch, when RTEMS is on its
way to dispatch the idle task! In the long run, not even a 4 GB stack for
the idle task will help you then...


Best regards
/Johan Zandin

-----------------------------------------------------------
Johan Zandin                      Software Engineer
Saab Space AB                     Phone: +46-31-735 41 47
SE-405 15 Gothenburg, Sweden      Fax:   +46-31-735 40 00
-----------------------------------------------------------



