<div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr">Hi,<div>First, I would like to conceptually understand how a function as simple as _Chain_Append_unprotected could fail.</div><div>I added a patch like this</div><div><div>RTEMS_INLINE_ROUTINE void _Chain_Append_unprotected(</div><div>  Chain_Control *the_chain,</div><div>  Chain_Node    *the_node</div><div>)</div><div>{</div><div>  append_cnt++;</div><div>  if(append_cnt > append_cnt_max)</div><div><span style="white-space:pre">     </span>  append_cnt_max = append_cnt;</div><div>  Chain_Node *tail = _Chain_Tail( the_chain );</div><div>  Chain_Node *old_last = tail->previous;</div><div><br></div><div>  the_node->next = tail;</div><div>  tail->previous = the_node;</div><div>  old_last->next = the_node;</div><div>  the_node->previous = old_last;</div><div>  append_cnt--;</div><div>}</div></div><div><br></div><div>I could see append_cnt_max=2 in the output bellow, meaning at some point there were two function calls</div><div>in progress (don't know if it was for the same chain, as the patch dind't look if it's the_chain parameter is the same)</div><div>How would be possible as a scenario to have two calls in the same time ?</div><div><br></div><div>I have to say that I even tried to temporary dis/en interrupts with __asm__ volatile ("cpsid i" : : : "memory") /__asm__ volatile ("cpsie i" : : : "memory")</div><div>, but I could still reproduce. plus I also reproduced it with append_errs=0 .. I hope this doesn't break my theory :D</div><div><br></div><div><div>[/] # i</div><div>Instruction count for the last second is 215992979.</div><div>CPU load is 99.99%.</div><div>intr_cnt=3220</div><div>cond1=1</div><div>cond2=1</div><div>jiffies=1622210</div><div>dbg_ready_UI1=287402</div><div>dbg_ready_LOGT=374072</div><div>dbg_ready_ntwk=1622094084077</div><div>dbg_ready_SCtx=1621678898952</div><div>dbg_ready_SCrx=1621678862701</div><div>dbg_ready_SHLL=1622201084177</div><div>dbg_extract_UI1=67127037</div><div>dbg_extract_LOGT=67144458</div><div>dbg_extract_ntwk=1622094096292</div><div>dbg_extract_SCtx=1621678924213</div><div>dbg_extract_SCrx=1621678883552</div><div>dbg_extract_SHLL=1622200088846</div><div>append_errs=0</div><div>ready_queue_100_elems=1</div><div>append_cnt_max=2</div><div>[/] #</div><div>[/] #</div><div>[/] #</div><div>[/] #</div><div>[/] # assertion "first != _Chain_Tail( &ready_queues[ index ] )" failed: file "../../cpukit/../../../stm32f7/lib/include/rtems/score/schedulerpriorityimpl.h", line 232, function: _Scheduler_priority_Ready_queue_first</div><div><br></div><div>*** PROFILING REPORT BEGIN PMC_APP ***</div><div>  <ProfilingReport name="PMC_APP"></div><div>    <PerCPUProfilingReport processorIndex="0"></div><div>      <MaxThreadDispatchDisabledTime unit="ns">2</MaxThreadDispatchDisabledTime></div><div>      <MeanThreadDispatchDisabledTime unit="ns">1</MeanThreadDispatchDisabledTime></div><div>      <TotalThreadDispatchDisabledTime unit="ns">3369295</TotalThreadDispatchDisabledTime></div><div>      <ThreadDispatchDisabledCount>3369216</ThreadDispatchDisabledCount></div><div>      <MaxInterruptDelay unit="ns">0</MaxInterruptDelay></div><div>      <MaxInterruptTime unit="ns">0</MaxInterruptTime></div><div>      <MeanInterruptTime unit="ns">0</MeanInterruptTime></div><div>      <TotalInterruptTime unit="ns">0</TotalInterruptTime></div><div>      <InterruptCount>0</InterruptCount></div><div>    </PerCPUProfilingReport></div><div>  </ProfilingReport></div><div>*** PROFILING REPORT END PMC_APP ***</div><div>Creating /etc/passwd and group with three usable accounts</div><div>root/pwd , test/pwd, rtems/NO PASSWORD, chroot/NO PASSWORD</div><div><br></div></div><div>I may have to begin the integration again for 4.11.3 ... are there any chances this might not reproduce in 4.11.3 ?</div><div>are there any changes in this area ?</div><div><br></div><div>regards,</div><div>Catalin</div><div><br></div></div></div></div></div></div></div></div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Oct 24, 2018 at 3:00 AM Sebastian Huber <<a href="mailto:sebastian.huber@embedded-brains.de">sebastian.huber@embedded-brains.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

<br>

On 24/10/2018 11:10, Catalin Demergian wrote:<br>

> Hi,<br>

> I made a debug patch with some changes <br>

> in _Scheduler_priority_Ready_queue_enqueue<br>

><br>

>   if(tcb->Object.id == 0x0a010005)<br>

> cnt_before = _Chain_Node_count_unprotected(ready_chain);<br>

>   _Chain_Append_unprotected( ready_chain, node );<br>

>   if(tcb->Object.id == 0x0a010005) {<br>

> cnt_after = _Chain_Node_count_unprotected(ready_chain);<br>

> if(cnt_after != cnt_before + 1)<br>

> append_errs++;<br>

>   }<br>

>   _Priority_bit_map_Add( bit_map, &ready_queue->Priority_map );<br>

><br>

> I wanted to know if _Chain_Append_unprotected fails when adding my <br>

> SCrx task in the ready queue. Since it's a void<br>

> function, I used _Chain_Node_count_unprotected to test if the counter <br>

> after the insert is the count before + 1.<br>

><br>

> The task command shows 2 ready tasks, but ready_queue_100_elems=1<br>

> (in ready_queue_100_elems I'm saving what <br>

> _Chain_Node_count_unprotected returns for ready_queues[100])<br>

> Also, append_errs=1, which means that at some point there was an <br>

> insert failure.<br>

><br>

> [/] # task<br>

>   ID       NAME           PRI  STATE MODES  EVENTS    WAITID  WAITARG  <br>

> NOTES<br>

> ------------------------------------------------------------------------------<br>

> 0a010001   UI1              1 Wevnt  P:T:nA   NONE   2002b0a4 0x80000000<br>

> 0a010002   LOGT            99 Wmsg   P:T:nA   NONE   22010001 0x80000000<br>

> 0a010003   ntwk           100 Wsysev P:T:nA   NONE   2005e43c 0x80000000<br>

> 0a010004   SCtx           100 Wsysev P:T:nA   NONE   2005ff6c 0x80000000<br>

> 0a010005   SCrx           100 READY  P:T:nA 08000000 20060fbc 0x80000000<br>

> 0a010006   SHLL           100 READY  P:T:nA   NONE   00000001 0x80000000<br>

> [/] #<br>

> [/] #<br>

> [/] #<br>

> [/] # i<br>

> Instruction count for the last second is 215993000.<br>

> CPU load is 99.99%.<br>

> intr_cnt=125916<br>

> cond1=0<br>

> cond2=1<br>

> jiffies=62980862<br>

> dbg_ready_UI1=288379<br>

> dbg_ready_LOGT=374942<br>

> dbg_ready_ntwk=62980798317434<br>

> dbg_ready_SCtx=62980515317452<br>

> dbg_ready_SCrx=8947511142891<br>

> dbg_ready_SHLL=62980853317438<br>

> dbg_extract_UI1=67129036<br>

> dbg_extract_LOGT=67147087<br>

> dbg_extract_ntwk=62980798327798<br>

> dbg_extract_SCtx=62980515326397<br>

> dbg_extract_SCrx=8947511124432<br>

> dbg_extract_SHLL=62980852322225<br>

> append_errs=1<br>

> ready_queue_100_elems=1<br>

> [/] #<br>

><br>

> My theory is that because the SCrx task is in a funny state where <br>

> current_state=READY, but it's not in the ready queue because<br>

> of the insert failure, it will never be scheduled again because no <br>

> matter how many USB interrupts may arrive, unblock will be set to FALSE<br>

> in _Event_Surrender because the state is READY <br>

> so _Event_Is_blocking_on_event returns FALSE.<br>

><br>

> Could this theory be valid ?<br>

<br>

Yes, this sounds valid. It fits also into the assertion failure reported <br>

in one of your previous e-mails.<br>

<br>

Can you set a break point to the<br>

<br>

   append_errs++;<br>

<br>

line? A stack trace would be helpful. I would also have a look at the <br>

chain data structures.<br>

<br>

I would add an extra debug state to the thread control block to ensure <br>

that you don't have a double insert/extract into the ready queue.<br>

<br>

-- <br>

Sebastian Huber, embedded brains GmbH<br>

<br>

Address : Dornierstr. 4, D-82178 Puchheim, Germany<br>

Phone   : +49 89 189 47 41-16<br>

Fax     : +49 89 189 47 41-09<br>

E-Mail  : <a href="mailto:sebastian.huber@embedded-brains.de" target="_blank">sebastian.huber@embedded-brains.de</a><br>

PGP     : Public key available on request.<br>

<br>

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.<br>

<br>

</blockquote></div>