New watchdog handler implementation

Sebastian Huber sebastian.huber at embedded-brains.de
Sun Feb 7 06:37:28 UTC 2016


----- Joel Sherrill <joel at rtems.org> wrote:
> On Feb 4, 2016 7:32 AM, "Sebastian Huber"
> <sebastian.huber at embedded-brains.de> wrote:
> >
> > Hello,
> >
> > for proper timer support on SMP systems there is some work to do. I
> > added a ticket for this:
> >
> > https://devel.rtems.org/ticket/2554
> >
> > With respect to the data structure selection, I think that red-black
> > trees are a good choice.
> 
> I had reached this conclusion a while back.
> 
> I was thinking a good first step would be to replace the watchdog Delta
> chains with RB Trees using either TOD or uptime. Get away from tick units.

My current approach is to use ticks for relative timeouts and struct timespec for absolute timeouts.  A watchdog header backed by a red-black tree is quite small (one pointer for the root, one for the minimum), so the size overhead of replicating it is negligible.  This also keeps the conversion overhead for the Classic and POSIX APIs small.
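To make this concrete, a minimal sketch of such a header could look as
follows.  The type and field names below are illustrative assumptions,
not the actual score implementation:

#include <stdint.h>

/* Illustrative red-black tree node; the real implementation would use
 * the score RBTree support instead. */
typedef struct Watchdog_Node {
  struct Watchdog_Node *parent;
  struct Watchdog_Node *left;
  struct Watchdog_Node *right;
  int                   color;
  uint64_t              expire;  /* absolute expiration time in ticks */
  void                ( *routine )( struct Watchdog_Node * );
} Watchdog_Node;

/* The header is intentionally small: one pointer to the tree root and
 * one to the cached minimum (the watchdog which fires next), so
 * replicating it per processor is cheap. */
typedef struct {
  Watchdog_Node *root;
  Watchdog_Node *first;
} Watchdog_Header;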

> 
> If you do this work first, then others can pitch in on POSIX API changes
> and enhancements while you work through SMP improvements.
> 
> I say this knowing I was investigating missing POSIX capabilities, and
> some, like clock monotonic, relate to this. So you could get help on
> the top side while you work on the bottom side.
> 
> > An open issue is which load balancing and work distribution we want
> > to add.  Currently we have one watchdog header for the tick-based
> > timers (relative timeouts) and one for the seconds-based timers
> > (absolute timeouts).
> 
> I would like to see both change to real units of time and be precise,
> so you can schedule watchdogs for a more precise length of time in
> either relative or absolute/calendar time.
> 
> POSIX absolute-time timeouts are currently converted implicitly to
> relative timeouts. If the seconds watchdog set kept a more precise
> absolute time, it would have been used, but seconds are not of
> sufficient granularity to implement this.

Yes, this should be fixed. I think it will help eliminate some failures in the libstdc++ testsuite.
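The problematic pattern is essentially the following conversion, shown
here as a simplified sketch using the score timespec helpers; this is
not the actual code path:

#include <stdint.h>
#include <time.h>
#include <rtems/score/timespec.h>

/* An absolute CLOCK_REALTIME timeout is turned into a relative tick
 * count once, at enqueue time.  If the wall clock is adjusted
 * afterwards, the expiration no longer matches the requested absolute
 * time, i.e. the timeout degenerates into a monotonic one. */
static uint32_t absolute_to_relative_ticks( const struct timespec *abstime )
{
  struct timespec now;
  struct timespec delta;

  clock_gettime( CLOCK_REALTIME, &now );
  _Timespec_Subtract( &now, abstime, &delta );  /* delta = *abstime - now */

  return _Timespec_To_ticks( &delta );
}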

> And this means that POSIX absolute timeouts are actually treated like
> monotonic timeouts.
> 
> So IMO the user visible side of this needs to be proper support for POSIX
> absolute time timeouts, clock monotonic, and at least the condition
> variable attribute to specify the associated clock. There is also some
> clock monotonic support under POSIX timers.
> 
> Have you built a list of all the POSIX API points this touches? I have a
> good start on this because I have been reviewing POSIX conformance.

No, I didn't compile a list.
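One obvious entry on such a list would be the condition variable clock
attribute mentioned above.  For reference, the user-visible side is
standard POSIX; error checking is omitted in this sketch:

#include <pthread.h>
#include <time.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond;

/* Create a condition variable whose timed waits are measured against
 * CLOCK_MONOTONIC instead of the default CLOCK_REALTIME, so the
 * timeout is immune to wall-clock adjustments. */
static void wait_one_second( void )
{
  pthread_condattr_t attr;
  struct timespec    abstime;

  pthread_condattr_init( &attr );
  pthread_condattr_setclock( &attr, CLOCK_MONOTONIC );
  pthread_cond_init( &cond, &attr );
  pthread_condattr_destroy( &attr );

  clock_gettime( CLOCK_MONOTONIC, &abstime );
  abstime.tv_sec += 1;

  pthread_mutex_lock( &mutex );
  pthread_cond_timedwait( &cond, &mutex, &abstime );
  pthread_mutex_unlock( &mutex );
}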

> 
> > The timer routines execute in the context of the clock tick
> > interrupt of processor zero.  This does not scale with the processor
> > count and couples different scheduler instances via this central
> > infrastructure.
> >
> > A simple enhancement would be to provide one watchdog header for
> >
> > A) each scheduler instance, or
> >
> > B) each processor.
> >
> > I am in favour of option B), since
> >
> > * the storage space for a watchdog header is quite small,
> >
> > * access to the watchdog header is easy via the _Per_CPU_Information
> >   and requires no dynamic memory, and
> >
> > * schedulers supporting thread processor affinities could use a
> >   local watchdog header.
> >
> > Each watchdog header uses a dedicated lock.  Threads would use the
> > watchdog header of the current processor.  Access to the thread
> > watchdog control is protected by a dedicated lock in the thread
> > control block.
> 
> If a thread schedules a watchdog and is moved to another processor or
> scheduler, what's the impact?

If we use per-processor watchdog headers, then it doesn't matter. It's only important when a watchdog fires, not where.  We could even assign watchdogs to per-processor headers at random. However, for cache efficiency it is beneficial to use the processor of the executing thread.
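A rough sketch of the insert path under this scheme; all names below
are hypothetical stand-ins for the proposed per-CPU members, not
existing score interfaces:

/* Insert a watchdog into the header of the current processor.  The
 * per-CPU watchdog header and its dedicated lock are the proposed
 * additions; _Per_CPU_Get() is assumed to return the per-CPU control
 * of the executing processor. */
static void _Watchdog_Insert_on_current_cpu( Watchdog_Node *the_watchdog )
{
  Per_CPU_Control *cpu = _Per_CPU_Get();

  /* The processor of the executing thread is chosen purely for cache
   * efficiency; correctness does not depend on this choice. */
  _Watchdog_Lock_acquire( &cpu->Watchdog.Lock );
  _Watchdog_Header_insert( &cpu->Watchdog.Header, the_watchdog );
  _Watchdog_Lock_release( &cpu->Watchdog.Lock );
}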

> 
> The association of a watchdog timer with the current CPU seems like a
> good solution. It achieves a balancing similar to timer wheels but
> based on a system characteristic that is naturally balanced in SMP
> systems. Threads are spread across cores in a controlled manner by a
> combination of design and scheduler. Seems like a good way to put a
> big-O bound on the watchdogs per set instance.
> 
> > Which watchdog header should be used in the timer objects?  Possible
> > are
> >
> > C) the watchdog header is determined at timer creation time, e.g.
> >    via the processor of the executing thread, or
> >
> > D) selectable via a new function, e.g. rtems_timer_set_processor().
> >
> > I suggest implementing C) and D).  Is D) possible via the POSIX API?
> 
> POSIX has no approved APIs that are SMP related. They also have no concept
> of interrupts.  So there is no way to do this via POSIX. Purely an
> implementation detail.
> 
> I don't know that setting a preferred processor is necessary, but I
> can see where it could help WCET analysis. However, the design-time
> complexity of using it seems high.
> 
> Bottom line, I think it might have some use to developers but we will have
> to teach people how using it can benefit them.
> 
> But in general, that's true for all SMP features. Choosing what to use on a
> real system implementation is hard.
> 
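If option D) above is added, usage could look like the sketch below.
rtems_timer_set_processor() is the proposed directive and does not
exist yet; the other calls are existing Classic API directives (error
checking omitted):

#include <rtems.h>

static rtems_timer_service_routine timer_routine( rtems_id id, void *arg )
{
  (void) id;
  (void) arg;
  /* Executes in the clock interrupt context of the chosen processor. */
}

/* Pin the timer's watchdog to processor 1, then fire it after one
 * second. */
static void start_pinned_timer( rtems_id timer_id )
{
  rtems_timer_set_processor( timer_id, 1 );
  rtems_timer_fire_after(
    timer_id,
    rtems_clock_get_ticks_per_second(),
    timer_routine,
    NULL
  );
}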
> > With the current clock drivers we must execute the clock interrupt
> > service routine on each processor all the time.  This is not very
> > time and power efficient.  The clock interrupt service on processor
> > zero should execute all the time, since otherwise the
> > _Watchdog_Ticks_since_boot variable no longer increments and this
> > may confuse existing applications.  The clock interrupt service on
> > other processors should only execute in case their dedicated
> > watchdog set is non-empty.  Since this set changes dynamically, we
> > must regularly request the clock driver to disable/enable the clock
> > interrupt service execution on a particular processor.  I suggest
> > adding a global clock device handler table, which is initialized to
> > a default and optionally replaced by a clock driver providing the
> > following handlers:
> >
> > typedef struct {
> >   /* Enable the clock interrupt service on the given processor */
> >   void (*enable_on_processor)(uint32_t processor);
> >
> >   /* Disable the clock interrupt service on the given processor */
> >   void (*disable_on_processor)(uint32_t processor);
> > } Clock_Device;
> >
> > extern Clock_Device *_Clock_Device;
> >
> > This could be enhanced in the future to offer tickless operation for
> > systems sensitive to power consumption.
> >
> > Use of a global _Watchdog_Ticks_since_boot maintained by processor
> > zero is a bit problematic.  The clock interrupt service is triggered
> > on different processors simultaneously.  However, the clock
> > interrupt service on processor zero may be delayed for various
> > reasons, e.g. a high-priority nested interrupt.  Thus the clock
> > interrupt service on other processors could observe a stale
> > _Watchdog_Ticks_since_boot value.  So each watchdog header must
> > maintain its own ticks value.  Some care must be taken to get the
> > start and stop tick values required for the timer objects.
> 
> You mentioned it in passing, but what series of events triggers
> updates on all the watchdog sets? I think you said one tick cycles
> over all sets.
> 
> If this is the case, then one set per scheduler reduces the execution
> overhead of checking all sets for updates and logically ties watchdog
> sets and thread sets together, which also seems consistent from a
> system design perspective.
> 
> I don't have a strong case for going per scheduler or per CPU, but per
> scheduler seems logical and allows the watchdog to follow the thread.
> When a thread moves schedulers, would the watchdog move, or only on
> the next activation?

I think per-processor headers are easier to implement.  I already have a simple prototype implementation and it shows quite good results.  The bottleneck in the system moves away from the watchdog to the scheduler.  The maximum measured thread dispatch disabled time dropped from 3 ms to 100 us.
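For illustration, the tick service of such a prototype could look
roughly like the sketch below.  It builds on the watchdog header sketch
earlier in this message, additionally assumes a per-header ticks
counter as argued above, and uses the proposed Clock_Device handlers;
all other names are hypothetical and locking is omitted for brevity:

/* Clock interrupt service on one processor: fire the expired watchdogs
 * of the local header and ask the clock driver to stop the tick on
 * this processor once its set is empty. */
static void _Clock_Tick_on_processor( uint32_t processor )
{
  Watchdog_Header *header = &_Per_CPU_Watchdog_headers[ processor ];

  ++header->ticks;  /* each header maintains its own ticks value */

  while ( header->first != NULL && header->first->expire <= header->ticks ) {
    Watchdog_Node *the_watchdog = header->first;

    /* Extraction is assumed to update the cached minimum. */
    _Watchdog_Header_extract( header, the_watchdog );
    ( *the_watchdog->routine )( the_watchdog );
  }

  if ( header->first == NULL && processor != 0 ) {
    /* Processor zero keeps ticking so that _Watchdog_Ticks_since_boot
     * continues to increment. */
    ( *_Clock_Device->disable_on_processor )( processor );
  }
}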

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

This message is not a business letter within the meaning of the EHUG.
