RTEMS scheduler bug ?

Catalin Demergian demergian at gmail.com
Fri Mar 29 13:31:51 UTC 2019


Sure, I will build it and try.
Is _ISR_Get_level() CPU-specific? We use it to find out whether
interrupts are enabled, right?
I forgot to mention that I use an STM32H7. Does _ISR_Get_level() work on it?

Also, I thought the scheduler code runs with interrupts enabled. Doesn't
it rely on the tick interrupt itself?
How does that work in RTEMS?
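
To make the question concrete, here is a minimal host-side sketch of the
save/restore pattern behind a disable/enable pair and of a check on the
current level. It is a model only: every model_* name is hypothetical, level 0
is assumed to mean "interrupts enabled", and how _ISR_Get_level() maps onto
the STM32H7's Cortex-M7 core is not verified here. It shows that a nested
disable/enable restores the level it found, so wrapping an append that already
runs with interrupts disabled changes nothing, while a level check on entry
would flag a caller that arrives with interrupts enabled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of the interrupt level; NOT the RTEMS API. */
typedef uint32_t model_isr_level;

/* Assumption: 0 means interrupts enabled, non-zero means masked. */
model_isr_level model_current_level = 0;

/* Number of times the enqueue was entered with interrupts enabled. */
unsigned model_violations = 0;

model_isr_level model_isr_get_level(void)
{
  return model_current_level;
}

/* Mask interrupts and hand back the level that was active before. */
model_isr_level model_isr_disable(void)
{
  model_isr_level previous = model_current_level;
  model_current_level = 1;
  return previous;
}

/* Restore the level saved by the matching model_isr_disable(). */
void model_isr_enable(model_isr_level previous)
{
  model_current_level = previous;
}

/* Stand-in for the ready queue enqueue: record callers that did not
 * disable interrupts first. The chain append itself is left out. */
void model_ready_queue_enqueue(void)
{
  if (model_isr_get_level() == 0) {
    ++model_violations;
  }
}

/* Same operation wrapped in its own disable/enable pair. */
void model_protected_enqueue(void)
{
  model_isr_level level = model_isr_disable();
  model_ready_queue_enqueue();
  model_isr_enable(level);
}
```

In this model, calling model_protected_enqueue() while the level is already
non-zero leaves the level non-zero afterwards, which is the "no effect" case.
Only a direct call to model_ready_queue_enqueue() at level 0 increments
model_violations.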


On Fri, Mar 29, 2019 at 12:04 PM Sebastian Huber <
sebastian.huber at embedded-brains.de> wrote:

> Hello Catalin,
>
> On 29/03/2019 10:56, Catalin Demergian wrote:
> > Hi,
> > Some time ago (Sept/Oct 2018) we had a long discussion in which I
> > suspected a scheduler issue (subject
> > "rtems_message_queue_receive/rtems_event_receive issues").
> >
> > We got to the point where I realized that _Chain_Append_unprotected
> > might fail to add an element to the queue. The task then ends up in
> > an odd state: state=READY, but the task is not in the ready chain.
> > It never gets CPU time again, because a task has to be blocked
> > before it can be unblocked when new data arrives.
> >
> > We were using USB then, but the issue became hot again because we
> > just hit the same problem over serial :)
> > I believe there is a possible chain of events that can make
> > _Chain_Append_unprotected fail; the explanation follows.
> >
> > /*
> >  * @note It does NOT disable interrupts to ensure the atomicity of the
> >  *       append operation.
> >  */
> > RTEMS_INLINE_ROUTINE void _Chain_Append_unprotected(
> >   Chain_Control *the_chain,
> >   Chain_Node    *the_node
> > )
> > {
> >   Chain_Node *tail = _Chain_Tail( the_chain );
> >   Chain_Node *old_last = tail->previous;
> >
> >   the_node->next = tail;
> >   tail->previous = the_node;
> >   old_last->next = the_node;
> >   the_node->previous = old_last;
> > }
> >
> > The lines
> >
> >   tail->previous = the_node;
> >   old_last->next = the_node;
> >
> > are the ones that actually add the element to the ready chain.
> >
> > If a thread executes those lines, but just before it executes
> >
> >   the_node->previous = old_last;
> >
> > another thread comes in to add another node to this chain, that thread
> > will set a different node in tail->previous and old_last->next. As a
> > result, when the interrupted thread continues and executes the last
> > line, it is for nothing, because the initial node will not have been
> > added to the ready chain.
> >
> > If this chain of events occurs (*and after a while it will*), we get
> > starvation for that task.
> >
> > I'm reproducing this issue in a long-duration test. The time before
> > it happens varies from run to run, but it always happens.
> >
> > *What I'm proposing is the following*: call _Chain_Append instead of
> > _Chain_Append_unprotected in the
> > _Scheduler_priority_Ready_queue_enqueue function in
> > schedulerpriorityimpl.h.
> >
> >
> > void _Chain_Append(
> >   Chain_Control *the_chain,
> >   Chain_Node    *node
> > )
> > {
> >   ISR_Level level;
> >
> >   _ISR_Disable( level );
> >     _Chain_Append_unprotected( the_chain, node );
> >   _ISR_Enable( level );
> > }
> >
> >
> > This way the add-element-to-chain operation becomes atomic.
> >
> > I was able to run a long-duration test (8 hrs) successfully in my
> > setup with this fix.
> >
> > What do you think?
> >
>
> The _Scheduler_priority_Ready_queue_enqueue() should only be called with
> interrupts disabled. So, disabling interrupts again should have no
> effect. Could you please try out the attached patch and build the BSP
> with --enable-rtems-debug?
>
> --
> Sebastian Huber, embedded brains GmbH
>
> Address : Dornierstr. 4, D-82178 Puchheim, Germany
> Phone   : +49 89 189 47 41-16
> Fax     : +49 89 189 47 41-09
> E-Mail  : sebastian.huber at embedded-brains.de
> PGP     : Public key available on request.
>
> Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
>
>
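
The effect of two overlapping appends can be reproduced deterministically on a
host by splitting the append into its read and its four stores. This is only a
sketch: the types and function names are hypothetical stand-ins, not the RTEMS
chain API, and the preemption is simulated by call order. In this model the
window that loses a node lies between reading old_last and writing the links;
a preemption after both link stores, before the final the_node->previous
store, leaves the model's chain intact.

```c
#include <assert.h>
#include <stddef.h>

/* Host-side model only; hypothetical names, not the RTEMS chain types. */
typedef struct node {
  struct node *next;
  struct node *previous;
} node;

typedef struct {
  node head; /* sentinel before the first element */
  node tail; /* sentinel after the last element */
} chain;

void chain_init(chain *c)
{
  c->head.next = &c->tail;
  c->head.previous = NULL;
  c->tail.next = NULL;
  c->tail.previous = &c->head;
}

/* First half of an append: sample the current last node. */
node *append_begin(chain *c)
{
  return c->tail.previous;
}

/* Second half: the four stores, in the order of the quoted routine. */
void append_finish(chain *c, node *old_last, node *n)
{
  node *tail = &c->tail;

  assert(old_last != NULL);
  n->next = tail;
  tail->previous = n;
  old_last->next = n;
  n->previous = old_last;
}

/* Returns 1 if n is reachable by walking forward from the head. */
int chain_contains(const chain *c, const node *n)
{
  const node *it;

  for (it = c->head.next; it != &c->tail; it = it->next) {
    if (it == n) {
      return 1;
    }
  }
  return 0;
}

/* Simulates the interleaving; returns how many of the two appended
 * nodes are still reachable afterwards. */
int run_interleaving(void)
{
  chain c;
  node a = { NULL, NULL };
  node b = { NULL, NULL };
  node *stale;

  chain_init(&c);
  stale = append_begin(&c);                /* context 1 reads old_last   */
  append_finish(&c, append_begin(&c), &b); /* context 2 appends B fully  */
  append_finish(&c, stale, &a);            /* context 1 resumes, stale   */

  return chain_contains(&c, &a) + chain_contains(&c, &b);
}
```

Calling run_interleaving() returns 1 in this model: A is reachable from the
head, while B is no longer linked into the forward chain.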