Scheduler bug?
Joel Sherrill
joel.sherrill at OARcorp.com
Sun May 10 13:09:41 UTC 2009
Leon Pollak wrote:
> Hello, all.
>
>
> My customer reports that a very old unit, running an RTEMS version from
> 2003, sometimes resets. They changed the CPU clock to a faster one, and
> now the resets occur.
>
>
> Although I can hardly believe it myself, my investigation shows that I
> have very probably hit a bug in the RTEMS scheduler (see below). The
> question is whether this bug has been fixed in a newer version. That is,
> would an upgrade to 4.9 help?
>
>
> As the end customer (the customer of my customer) is a military
> organization, they will be really upset if I say "upgrade the
> RTOS"...:-) without being sure...
>
>
>
So they are OK with changing the hardware clock on a tested unit and
invalidating all testing but not upgrading the software. Any change on
a validated system is a change.
> So, please, advise...:-)
We will have to use the collective RTEMS memory on this one. I recall
a bug that does sound like this.
2005-08-17 Andrew Sinclair <Andrew.Sinclair at elprotech.com>
PR 807/rtems
* rtems/src/timerfireafter.c, rtems/src/timerserverfireafter.c,
score/src/watchdoginsert.c: Tighten critical section checks on an ISR
using the same timer being inserted by a lower priority ISR or
interrupt task.
Does this sound like it? It was fixed in 4.6.4 (not 4.6.2)
http://www.rtems.org/cgi-bin/cvsweb.cgi/rtems/cpukit/score/src/watchdoginsert.c
is where I found the ChangeLog entry.
This only impacted 3 files, so it is no more of a change than increasing
the clock frequency. How is it OK to (*&% with the hardware and not with
the software? Change is (*^ change.
--joel
> Many thanks in advance.
>
>
> =============================================================================
>
>
> The problem description:
> -------------------------
> The application has only 2 tasks:
>
>
> - WD - priority 50, task to reset the HW watchdog: sleeps for 10 ticks,
> sets the reset line high, sleeps for 10 ticks, sets the reset line
> low;
>
>
> - MB - priority 100, task waits for an event from an interrupt with a
> 1(!!!) tick timeout. If the timeout occurs, it refreshes some variables
> and waits for the event again. Interrupt frequency is about 50Hz (20ms).
>
>
>
> The WD task stops working rather quickly (30-50s) when tick=2ms.
> Stepping in the debugger into the rtems_clock_tick() routine and on into
> the _Watchdog_Tickle() routine shows that
> the_watchdog->Node.delta_interval=0xFFFFFeXX for the WD task.
>
>
> This obviously causes the HW watchdog to reset the system very soon...:-)
>
>
> - Increasing the MB task's event wait timeout to even 2 ticks seems to
> eliminate the problem.
> - Masking the incoming interrupts seems to eliminate the problem.
> - Increasing the tick significantly reduces the probability of the problem.
>
>
> Other changes seem not to influence the situation.
> --
> Leon
>
>
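For readers unfamiliar with delta chains, here is a minimal sketch of why a
value like 0xFFFFFeXX looks like an unsigned underflow. It is a simplified,
hypothetical model, not the RTEMS source: the names delta_chain,
chain_insert and adjust_successor are made up for illustration, and it does
not claim to reproduce the actual race behind PR 807.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ENTRIES 8

/* Hypothetical, simplified model of a delta chain: entry i fires
 * delta[i] ticks after entry i-1. This is not the RTEMS source. */
typedef struct {
    uint32_t delta[MAX_ENTRIES];
    int      count;
} delta_chain;

/* Insert a timeout of `ticks` from now; returns the slot it landed in. */
int chain_insert(delta_chain *c, uint32_t ticks)
{
    int i = 0;

    assert(c->count < MAX_ENTRIES);

    /* Walk forward, consuming the deltas of entries that fire earlier. */
    while (i < c->count && ticks >= c->delta[i]) {
        ticks -= c->delta[i];
        i++;
    }

    /* Make room, then store the remaining delta. */
    for (int j = c->count; j > i; j--)
        c->delta[j] = c->delta[j - 1];
    c->delta[i] = ticks;
    c->count++;

    /* The successor now fires relative to the new entry. */
    if (i + 1 < c->count)
        c->delta[i + 1] -= ticks;

    return i;
}

/* The successor adjustment in isolation, with no guard. If
 * `successor_delta` is smaller than `ticks`, the unsigned subtraction
 * wraps around instead of going negative. */
uint32_t adjust_successor(uint32_t successor_delta, uint32_t ticks)
{
    return (uint32_t)(successor_delta - ticks);
}
```

In this model, inserting a 10 tick timeout (like the WD sleep) and then a
1 tick timeout (like the MB wait) leaves the chain holding deltas 1 and 9.
If the adjustment is ever applied to a stale value, for example
adjust_successor(0, 300), the result is 0xFFFFFED4, which has the same
0xFFFFFeXX shape reported above. Whether that is what happens on the unit
in question is an assumption, not something established here.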