Event not timing out with a wait of 1 tick (and sometimes 2)
Steven Johnson
sjohnson at sakuraindustries.com
Fri Dec 23 19:36:39 UTC 2005
Hi,
We have finally been able to test Ian's fix, sorry for the delay.
Following is a brief description of what we did, because it didnt work
for us, but we also had to change the code a little to do the test which
may have broken it if our understanding isnt correct.
We assume that Ian meant delta_interval not delta_count as there is no
delta_count variable (at least not where the compiler could see).
The compiler got tricky and read a bit from the condition register and
added it to r8 to perform the "if zero do a ++" code. So we changed it
to "if zero then set to 1". (so we could put a breakpoint on it)
We also had code in watchdog_tickle.c, that went, "if
the_watchdog->delta_interval == 0 then the_watchdog->delta_interval =
1". A breakpoint on that "the_watchdog->delta_interval = 1" in
watchdog_tickle.c got hit several times, but the one in watchdog_insert
did not get hit.
Best Regards, and Merry Christmas,
Steven Johnson
Ian Caddy wrote:
> Hi,
>
> I have been following this thread with interest. I was looking at the
> watchdoginsert.c code (in 4.6.5) today and I can see what is causing
> this issue, thanks to Steven's debugging effort.
>
> The only way to protect against this is to ensure that if you are
> inserting the new watchdog at the head of the watchdog list, we need
> to ensure that it does not have a zero delta_interval, this watchdog
> is effectively inserted after the tick and so should have a count of 1
> still to go which will be decremented at the next clock isr.
>
> The simple solution would be to check after the for loop if the
> "after" watchdog is still active, and if not, then make sure the
> delta_interval is greater than zero, in watchdoginsert.c:
>
> } /* End of for loop */
>
> _Watchdog_Activate(the_watchdog);
>
> + /* Check that the after watchdog is still active
> + * (not removed in the ISR_Flash via a clock isr) */
> + if(after->state != WATCHDOG_ACTIVE)
> + {
> + /* Ensure we have a delta_count of greater than 0
> + * since we haven't really started our timeout yet */
> + if(delta_count == 0)
> + detla_count++;
> + }
>
> the_watchdog->delta_interval = delta_interval;
>
> So, in my conclusion, I think Steven's initial change did not loose
> any watchdog counts and in fact effectively provides the same result
> as this code, but at least with this code, we know where it is coming
> from rather than a solution to an unknown problem.
>
> I hope this is correct and makes sense to everyone.
>
> regards,
>
> Ian Caddy
>
>
>
> Steven Johnson wrote:
>
>> Hi Joel,
>>
>>>
>>> Since you are hacking on the source anyway, you could add a variable
>>> which is set to note that you are at the flash and clear it when
>>> interrupts are redisabled. Then when the 0 at the head of the delta
>>> chain occurs there is a bread crumb. Several places could be marked
>>> this way.
>>>
>>> When you get the fault, check _ISR_Nest_level and check it.
>>
>>
>>
>>
>> We did the following test:
>>
>> The ISR_Flash macro was modified to set "_ISR_In_Flash" on entry and
>> clear it on exit.
>>
>> We added the following code to the watchdog tickle:
>>
>> if (_ISR_Nest_level > 1)
>> _ISR_Nest_Count++;
>>
>> if (prev_in_flash)
>> {
>> if (!the_watchdog->delta_interval)
>> {
>> the_watchdog->delta_interval = 1;
>> }
>> prev_in_flash = 0;
>> }
>>
>>
>> if (_ISR_In_Flash)
>> prev_in_flash = 1;
>>
>> The results are these:
>>
>> 1. A breakpoint on the the _ISR_Nest_Count++ never triggered.
>> 2. A breakpoint on "prev_in_flash = 1" triggered every so often.
>> 3. A breakpoint on "the_watchdog->delta_interval = 1;" also triggered
>>
>> So this seems to prove that what is happening is that interrupts are
>> being enabled after the delta_interval is being set to zero, but
>> before it is added to the watchdog chain. An ISR occurs (but not a
>> nested one) that removes all the entries from the head of the
>> watchdog chain. When the ISR returns, the zero calculated delta is
>> added, but instead of being after the entries in the chain it is
>> calculated on it is added at the head because all of the entries it
>> was calculated relative to have been removed. Which causes the next
>> watchdog tickle (following the one that occured during the flash) to
>> remove the head with 0 in it, decrement to 2^32-1, and exhibit our
>> fault.
>>
>> I dont use simulators, but i imagine you could force this behaviour
>> by careful control of the simulator to exhibit these characteristics
>> for verification.
>>
>> To also answer Jennifer,
>>
>>> On most of the PowerPC boards using new interrupt processing,
>>> interrupts
>>> are turned back on in C_dispatch_irq_handler allowing nested
>>> interrupts. If this is occuring in your bsp, you may want to stub
>>> out the sections
>>> that do this and see if the problem goes away.
>>>
>> We dont actually use either the old or new PowerPC interrupt
>> processing in RTEMS, i did my own interrupt processing (loosely based
>> on the old powerpc interrupt processing) for my target years ago, and
>> have stuck with it cause its works fine for me and ive tuned it to my
>> liking for the mpc860 irq's. Which was why i was pretty sure we
>> shouldnt have nested IRQ's. The above test indicates that we arent
>> getting nested IRQ's in this instance, as _ISR_Nest_Count is never
>> incremented.
>>
>> Hope that helps.
>>
>> Steven Johnson
>>
>
More information about the users
mailing list