Event not timing out with a wait of 1 tick (and sometimes 2)

Ian Caddy ianc at goanna.iinet.net.au
Fri Dec 16 03:46:09 UTC 2005


Hi,

I have been following this thread with interest.  I was looking at the 
watchdoginsert.c code (in 4.6.5) today and I can see what is causing 
this issue, thanks to Steven's debugging effort.

The only way to protect against this is to ensure that a new watchdog 
inserted at the head of the watchdog list does not have a zero 
delta_interval.  Such a watchdog is effectively inserted after the tick, 
so it should still have a count of 1 to go, which will be decremented at 
the next clock ISR.
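
To see why a zero delta at the head is harmful, here is a minimal sketch 
of a delta-chain tick.  It is not the RTEMS source: toy_watchdog and 
toy_tickle are hypothetical names, and it assumes the tick decrements the 
head's delta before testing it for zero, which is what the fault described 
in this thread suggests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an entry on a delta chain; not the RTEMS type. */
struct toy_watchdog {
    uint32_t delta_interval;      /* ticks remaining relative to the previous entry */
    struct toy_watchdog *next;
};

/* One clock tick: decrement the head first, then unlink every entry whose
 * delta has reached zero.  Returns how many entries expired on this tick. */
static unsigned toy_tickle(struct toy_watchdog **head)
{
    unsigned fired = 0;

    assert(head != NULL);
    if (*head == NULL)
        return 0;

    (*head)->delta_interval--;    /* a delta of 0 wraps to UINT32_MAX here */

    while (*head != NULL && (*head)->delta_interval == 0) {
        *head = (*head)->next;    /* real code would also run the handler */
        fired++;
    }
    return fired;
}
```

With a head delta of 1 the entry expires on the next tick as expected.  
With a head delta of 0 the decrement wraps, nothing expires, and the entry 
would need another 2^32-1 ticks to time out.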

The simple solution would be to check after the for loop if the "after" 
watchdog is still active, and if not, then make sure the delta_interval 
is greater than zero, in watchdoginsert.c:

    } /* End of for loop */

    _Watchdog_Activate(the_watchdog);

+   /* Check that the after watchdog is still active
+    * (not removed in the ISR_Flash via a clock isr) */
+   if(after->state != WATCHDOG_ACTIVE)
+   {
+      /* Ensure we have a delta_interval of greater than 0
+       * since we haven't really started our timeout yet */
+      if(delta_interval == 0)
+         delta_interval++;
+   }

    the_watchdog->delta_interval = delta_interval;

So, in conclusion, I think Steven's initial change did not lose any 
watchdog counts and in fact gives the same result as this code.  With 
this code, though, we know where the problem comes from, rather than 
having a solution to an unknown problem.
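
As a rough way to check the guard in isolation, the decision can be 
written as a small pure function.  This is only a sketch: 
toy_final_delta and its after_still_active parameter are hypothetical 
and stand in for the test on after->state in the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not part of RTEMS.  Takes the delta computed while
 * walking the chain and whether the "after" entry survived the ISR flash,
 * and returns the delta that should be stored in the new watchdog. */
static uint32_t toy_final_delta(uint32_t delta_interval, bool after_still_active)
{
    /* If the "after" entry was removed by a clock ISR during the flash,
     * the new watchdog lands at the head and its timeout has not really
     * started, so it must have at least one tick still to go. */
    if (!after_still_active && delta_interval == 0)
        return 1;

    return delta_interval;
}
```

Only the case of a zero delta with an inactive "after" entry is changed; 
every other combination keeps the delta that the walk computed.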

I hope this is correct and makes sense to everyone.

regards,

Ian Caddy



Steven Johnson wrote:
> Hi Joel,
> 
>>
>> Since you are hacking on the source anyway, you could add a variable 
>> which is set to note that you are at the flash and clear it when 
>> interrupts are redisabled.  Then when the 0 at the head of the delta 
>> chain occurs there is a bread crumb.  Several places could be marked 
>> this way.
>>
>> When you get the fault, check _ISR_Nest_level.
> 
> 
> 
> We did the following test:
> 
> The ISR_Flash macro was modified to set "_ISR_In_Flash" on entry and 
> clear it on exit.
> 
> We added the following code to the watchdog tickle:
> 
> if (_ISR_Nest_level > 1)
>    _ISR_Nest_Count++;
> 
> if (prev_in_flash)
> {
>    if (!the_watchdog->delta_interval)
>    {
>        the_watchdog->delta_interval = 1;
>    }
>    prev_in_flash = 0;
> }
> 
> 
> if (_ISR_In_Flash)
>    prev_in_flash = 1;
> 
> The results are these:
> 
> 1. A breakpoint on the _ISR_Nest_Count++ never triggered.
> 2. A breakpoint on "prev_in_flash = 1" triggered every so often.
> 3. A breakpoint on "the_watchdog->delta_interval = 1;" also triggered.
> 
> So this seems to prove that interrupts are being enabled after the 
> delta_interval is calculated as zero, but before the watchdog is added 
> to the chain.  An ISR occurs (but not a nested one) that removes all the 
> entries from the head of the watchdog chain.  When the ISR returns, the 
> watchdog with the zero delta is added, but instead of going after the 
> entries its delta was calculated against, it goes at the head, because 
> all of those entries have been removed.  This causes the next watchdog 
> tickle (following the one that occurred during the flash) to take the 
> head with 0 in it, decrement it to 2^32-1, and exhibit our fault.
> 
> I don't use simulators, but I imagine you could force this behaviour 
> for verification by careful control of a simulator.
> 
> To also answer Jennifer,
> 
>> On most of the PowerPC boards using new interrupt processing, interrupts
>> are turned back on in C_dispatch_irq_handler allowing nested 
>> interrupts. If this is occurring in your bsp, you may want to stub out 
>> the sections
>> that do this and see if the problem goes away.
>>
> We don't actually use either the old or new PowerPC interrupt processing 
> in RTEMS.  I did my own interrupt processing (loosely based on the old 
> PowerPC interrupt processing) for my target years ago, and have stuck 
> with it because it works fine for me and I've tuned it to my liking for 
> the MPC860 IRQs.  That was why I was pretty sure we shouldn't have 
> nested IRQs.  The above test indicates that we aren't getting nested 
> IRQs in this instance, as _ISR_Nest_Count is never incremented.
> 
> Hope that helps.
> 
> Steven Johnson
> 

-- 
Ian Caddy
Goanna Technologies Pty Ltd
+61 8 9221 1860



