rtems_message_queue_receive / rtems_event_receive issues

Catalin Demergian demergian at gmail.com
Thu Oct 25 09:00:22 UTC 2018


Hi,
First, I would like to conceptually understand how a function as simple
as _Chain_Append_unprotected could fail.
I added a patch like this:

RTEMS_INLINE_ROUTINE void _Chain_Append_unprotected(
  Chain_Control *the_chain,
  Chain_Node    *the_node
)
{
  /* count calls that are in progress at the same time */
  append_cnt++;
  if ( append_cnt > append_cnt_max )
    append_cnt_max = append_cnt;

  Chain_Node *tail = _Chain_Tail( the_chain );
  Chain_Node *old_last = tail->previous;

  the_node->next = tail;
  tail->previous = the_node;
  old_last->next = the_node;
  the_node->previous = old_last;

  append_cnt--;
}

I could see append_cnt_max=2 in the output below, meaning that at some point there
were two calls to this function in progress (I don't know whether they were for the
same chain, as the patch didn't check whether the the_chain parameter was the same).
What scenario could make two calls overlap in time?
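
For illustration, here is a minimal, self-contained sketch of one such interleaving
(not the RTEMS code; the node type, the tail sentinel and the split helpers are
invented for this example). It shows how two appends that overlap, e.g. an interrupt
handler appending to the same chain while a thread is in the middle of its own
append, silently drop a node:

#include <stdio.h>

/* Minimal doubly linked chain with a circular tail sentinel, loosely
 * modelled after the RTEMS chain; all names are invented for this sketch. */
typedef struct node {
  struct node *next;
  struct node *previous;
} node;

static node tail;

static void chain_init(void)
{
  tail.next = &tail;
  tail.previous = &tail;
}

/* The stores of an unprotected append, split in two so the calls in
 * main() can be interleaved by hand. */
static void append_first_half(node *n, node **old_last)
{
  *old_last = tail.previous;
  n->next = &tail;
}

static void append_second_half(node *n, node *old_last)
{
  tail.previous = n;
  old_last->next = n;
  n->previous = old_last;
}

int main(void)
{
  node a, b;
  node *old_last_a, *old_last_b;

  chain_init();

  append_first_half(&a, &old_last_a);   /* thread starts appending a        */
  append_first_half(&b, &old_last_b);   /* "interrupt" appends b ...        */
  append_second_half(&b, old_last_b);   /* ... completely                   */
  append_second_half(&a, old_last_a);   /* thread resumes with a stale
                                           old_last and overwrites b's link */

  /* Walk backwards from the tail sentinel: only a is reachable. */
  for (node *n = tail.previous; n != &tail; n = n->previous)
    printf("node %s is still on the chain\n", n == &a ? "a" : "b");

  return 0;
}

Running this prints only node a; b's own links look fine, but nothing on the chain
points to it anymore, which matches a task that is READY yet missing from the
ready queue.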

I have to say that I even tried temporarily disabling/enabling interrupts with
__asm__ volatile ("cpsid i" : : : "memory") / __asm__ volatile ("cpsie i" : : : "memory"),
but I could still reproduce the problem. Plus, I also reproduced it with append_errs=0
... I hope this doesn't break my theory :D
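
For completeness, this is roughly what such a masking experiment could look like on
ARMv7-M (a sketch only; irq_save/irq_restore are made-up helper names, not RTEMS
APIs). Saving and restoring PRIMASK avoids unconditionally re-enabling interrupts in
a caller that already had them masked:

#include <stdint.h>

/* Read PRIMASK, mask interrupts, return the previous value (ARMv7-M). */
static inline uint32_t irq_save(void)
{
  uint32_t primask;

  __asm__ volatile ("mrs %0, primask" : "=r" (primask));
  __asm__ volatile ("cpsid i" : : : "memory");
  return primask;
}

/* Restore the previous masking state instead of a blind "cpsie i". */
static inline void irq_restore(uint32_t primask)
{
  if ((primask & 1u) == 0u)
    __asm__ volatile ("cpsie i" : : : "memory");
}

/* Usage around the call under test:
 *
 *   uint32_t primask = irq_save();
 *   _Chain_Append_unprotected( ready_chain, node );
 *   irq_restore( primask );
 */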

[/] # i
Instruction count for the last second is 215992979.
CPU load is 99.99%.
intr_cnt=3220
cond1=1
cond2=1
jiffies=1622210
dbg_ready_UI1=287402
dbg_ready_LOGT=374072
dbg_ready_ntwk=1622094084077
dbg_ready_SCtx=1621678898952
dbg_ready_SCrx=1621678862701
dbg_ready_SHLL=1622201084177
dbg_extract_UI1=67127037
dbg_extract_LOGT=67144458
dbg_extract_ntwk=1622094096292
dbg_extract_SCtx=1621678924213
dbg_extract_SCrx=1621678883552
dbg_extract_SHLL=1622200088846
append_errs=0
ready_queue_100_elems=1
append_cnt_max=2
[/] #
[/] #
[/] #
[/] #
[/] # assertion "first != _Chain_Tail( &ready_queues[ index ] )" failed: file "../../cpukit/../../../stm32f7/lib/include/rtems/score/schedulerpriorityimpl.h", line 232, function: _Scheduler_priority_Ready_queue_first

*** PROFILING REPORT BEGIN PMC_APP ***
  <ProfilingReport name="PMC_APP">
    <PerCPUProfilingReport processorIndex="0">
      <MaxThreadDispatchDisabledTime unit="ns">2</MaxThreadDispatchDisabledTime>
      <MeanThreadDispatchDisabledTime unit="ns">1</MeanThreadDispatchDisabledTime>
      <TotalThreadDispatchDisabledTime unit="ns">3369295</TotalThreadDispatchDisabledTime>
      <ThreadDispatchDisabledCount>3369216</ThreadDispatchDisabledCount>
      <MaxInterruptDelay unit="ns">0</MaxInterruptDelay>
      <MaxInterruptTime unit="ns">0</MaxInterruptTime>
      <MeanInterruptTime unit="ns">0</MeanInterruptTime>
      <TotalInterruptTime unit="ns">0</TotalInterruptTime>
      <InterruptCount>0</InterruptCount>
    </PerCPUProfilingReport>
  </ProfilingReport>
*** PROFILING REPORT END PMC_APP ***
Creating /etc/passwd and group with three usable accounts
root/pwd , test/pwd, rtems/NO PASSWORD, chroot/NO PASSWORD

I may have to begin the integration again for 4.11.3 ... is there any chance this
might not reproduce on 4.11.3? Are there any changes in this area?

regards,
Catalin


On Wed, Oct 24, 2018 at 3:00 AM Sebastian Huber <sebastian.huber at embedded-brains.de> wrote:

>
>
> On 24/10/2018 11:10, Catalin Demergian wrote:
> > Hi,
> > I made a debug patch with some changes
> > in _Scheduler_priority_Ready_queue_enqueue
> >
> >   if (tcb->Object.id == 0x0a010005)
> >     cnt_before = _Chain_Node_count_unprotected(ready_chain);
> >   _Chain_Append_unprotected( ready_chain, node );
> >   if (tcb->Object.id == 0x0a010005) {
> >     cnt_after = _Chain_Node_count_unprotected(ready_chain);
> >     if (cnt_after != cnt_before + 1)
> >       append_errs++;
> >   }
> >   _Priority_bit_map_Add( bit_map, &ready_queue->Priority_map );
> >
> > I wanted to know if _Chain_Append_unprotected fails when adding my
> > SCrx task to the ready queue. Since it's a void
> > function, I used _Chain_Node_count_unprotected to test whether the counter
> > after the insert is the count before plus 1.
> >
> > The task command shows 2 ready tasks, but ready_queue_100_elems=1
> > (in ready_queue_100_elems I'm saving what
> > _Chain_Node_count_unprotected returns for ready_queues[100])
> > Also, append_errs=1, which means that at some point there was an
> > insert failure.
> >
> > [/] # task
> >   ID       NAME           PRI  STATE MODES  EVENTS    WAITID  WAITARG
> > NOTES
> >
> ------------------------------------------------------------------------------
> > 0a010001   UI1              1 Wevnt  P:T:nA   NONE   2002b0a4 0x80000000
> > 0a010002   LOGT            99 Wmsg   P:T:nA   NONE   22010001 0x80000000
> > 0a010003   ntwk           100 Wsysev P:T:nA   NONE   2005e43c 0x80000000
> > 0a010004   SCtx           100 Wsysev P:T:nA   NONE   2005ff6c 0x80000000
> > 0a010005   SCrx           100 READY  P:T:nA 08000000 20060fbc 0x80000000
> > 0a010006   SHLL           100 READY  P:T:nA   NONE   00000001 0x80000000
> > [/] #
> > [/] #
> > [/] #
> > [/] # i
> > Instruction count for the last second is 215993000.
> > CPU load is 99.99%.
> > intr_cnt=125916
> > cond1=0
> > cond2=1
> > jiffies=62980862
> > dbg_ready_UI1=288379
> > dbg_ready_LOGT=374942
> > dbg_ready_ntwk=62980798317434
> > dbg_ready_SCtx=62980515317452
> > dbg_ready_SCrx=8947511142891
> > dbg_ready_SHLL=62980853317438
> > dbg_extract_UI1=67129036
> > dbg_extract_LOGT=67147087
> > dbg_extract_ntwk=62980798327798
> > dbg_extract_SCtx=62980515326397
> > dbg_extract_SCrx=8947511124432
> > dbg_extract_SHLL=62980852322225
> > append_errs=1
> > ready_queue_100_elems=1
> > [/] #
> >
> > My theory is that because the SCrx task is in a funny state where
> > current_state=READY, but it's not in the ready queue because
> > of the insert failure, it will never be scheduled again because no
> > matter how many USB interrupts may arrive, unblock will be set to FALSE
> > in _Event_Surrender because the state is READY
> > so _Event_Is_blocking_on_event returns FALSE.
> >
> > Could this theory be valid ?
>
> Yes, this sounds valid. It fits also into the assertion failure reported
> in one of your previous e-mails.
>
> Can you set a break point to the
>
>    append_errs++;
>
> line? A stack trace would be helpful. I would also have a look at the
> chain data structures.
>
> I would add an extra debug state to the thread control block to ensure
> that you don't have a double insert/extract into the ready queue.
>
> --
> Sebastian Huber, embedded brains GmbH
>
> Address : Dornierstr. 4, D-82178 Puchheim, Germany
> Phone   : +49 89 189 47 41-16
> Fax     : +49 89 189 47 41-09
> E-Mail  : sebastian.huber at embedded-brains.de
> PGP     : Public key available on request.
>
> This message is not a business communication within the meaning of the EHUG.
>
>
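
For reference, a minimal sketch of the extra debug state suggested above. It assumes
a hypothetical bool on_ready_queue member has been added to Thread_Control and that
the helpers are pasted next to the scheduler code where Thread_Control is visible;
none of these names exist in RTEMS as-is:

#include <stdbool.h>

/* Hypothetical counters, printable from the shell like the other debug vars. */
static unsigned ready_double_insert_errs;
static unsigned ready_double_extract_errs;

/* Call right before _Chain_Append_unprotected() on the ready queue. */
static inline void debug_mark_enqueued( Thread_Control *tcb )
{
  if ( tcb->on_ready_queue )
    ready_double_insert_errs++;   /* inserted while already on the queue */
  tcb->on_ready_queue = true;
}

/* Call right before _Chain_Extract_unprotected() on the ready queue. */
static inline void debug_mark_extracted( Thread_Control *tcb )
{
  if ( !tcb->on_ready_queue )
    ready_double_extract_errs++;  /* extracted while not on the queue */
  tcb->on_ready_queue = false;
}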