watchdog question

Chris Johns chrisj at rtems.org
Wed Jan 14 01:40:53 UTC 2015


Hi,

I have moved this to the user list as this is not about RTEMS development.

On 14/01/2015 2:40 am, Daniel Gutson wrote:
> Hi,
>
>     We are thinking about a "supervisor" watchdog, which runs in a high
> priority task, and
> has the following characteristics:
>
> a) tasks that "want" to be supervised are registered in the supervisor watchdog
> b) each supervised task is in one of the following modes:
>         - automatic supervision
>         - manual supervision
>         - sleeping
> c) in "automatic supervision" mode, the supervisor watchdog keeps
> track of the program counter of the task.
> When the PC is the same after N cycles, the watchdog performs a
> predefined action (e.g. reset).
> d) supervised tasks in "manual supervision" have to kick the watchdog
> explicitly (e.g. by invoking a function of the API).
> e) the watchdog leaves alone the tasks in sleeping mode.
>
> The idea of the "automatic supervision" mode is to avoid polluting the
> task code by spreading calls to the kick function, which is
> especially difficult when having to estimate the "distance" between
> these function calls.
> The idea of the "manual supervision" mode, which is rather
> traditional, is for when the task executes tight inner loops.
> In this scheme, tasks should be in automatic mode as much as possible
> and switch to manual just in small bounded
> places of the code.
>
> Before entering into the discussion of the implementation, I'd like
> feedback about the general idea please.
>

I have not done anything clever that attempts to monitor a task's
program counter. Apart from the possible complexity, I would be
concerned about the processing time required by a high priority task.

In systems such as air traffic control voice switching I have opted for
a simple check-in type system. A module of code registers as an
important service. When the important module runs it checks out with a
time-out, and it must check in before the time-out expires. Typically
the watchdog hardware needs to be hit more often than the time a module
may check out for, so I have a low priority interrupt that decrements a
counter and, while the counter is not zero, hits the watchdog hardware.
The watchdog supervisor reloads this counter as long as none of the
checked-out users has timed out.
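
The check-in scheme described above could be sketched roughly as
follows. This is an illustration only, not the code from those systems:
every name (wd_register, wd_check_out, wd_check_in, wd_supervise,
wd_tick_isr, KICK_RELOAD) is hypothetical, no RTEMS API is used, time is
a simulated tick count, and a counter stands in for the write to the
real watchdog hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SERVICES 8   /* assumed upper bound on registered services */
#define KICK_RELOAD  4u  /* assumed: hardware hits granted per supervisor pass */

typedef struct {
    bool     registered;
    bool     checked_out;
    uint32_t deadline;   /* tick at which a checked-out service has timed out */
} service_t;

static service_t         services[MAX_SERVICES];
static volatile uint32_t kick_counter; /* shared by supervisor and tick handler */
static volatile uint32_t now;          /* simulated tick count */
static uint32_t          hw_kicks;     /* stand-in for hitting the hardware */

/* Register an important service; returns its id, or -1 if the table is full. */
int wd_register(void)
{
    for (int i = 0; i < MAX_SERVICES; i++) {
        if (!services[i].registered) {
            services[i].registered  = true;
            services[i].checked_out = false;
            return i;
        }
    }
    return -1;
}

/* The service starts work and promises to check in within timeout_ticks. */
void wd_check_out(int id, uint32_t timeout_ticks)
{
    assert(id >= 0 && id < MAX_SERVICES && services[id].registered);
    services[id].deadline    = now + timeout_ticks;
    services[id].checked_out = true;
}

/* The service finished its work in time. */
void wd_check_in(int id)
{
    assert(id >= 0 && id < MAX_SERVICES && services[id].registered);
    services[id].checked_out = false;
}

/* Supervisor pass: reload the kick counter only if no checked-out service
 * has passed its deadline. Returns false when a service has timed out. */
bool wd_supervise(void)
{
    for (int i = 0; i < MAX_SERVICES; i++) {
        /* The signed difference keeps the comparison valid across wrap-around. */
        if (services[i].registered && services[i].checked_out &&
            (int32_t)(now - services[i].deadline) >= 0) {
            return false;
        }
    }
    kick_counter = KICK_RELOAD;
    return true;
}

/* Low priority periodic interrupt: hit the hardware while the counter is
 * not zero. Once the supervisor stops reloading it, the hits stop too. */
void wd_tick_isr(void)
{
    now++;
    if (kick_counter != 0) {
        kick_counter--;
        hw_kicks++; /* a real system would write to the watchdog hardware here */
    }
}
```

A service would call wd_register() once, then bracket each unit of work
with wd_check_out(id, timeout) and wd_check_in(id). In this sketch the
supervisor has to run at least once every KICK_RELOAD ticks to keep the
hardware fed, and a real system would also need to protect kick_counter
against concurrent access from the interrupt.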

Chris

