<p dir="ltr"><br>

On Feb 4, 2016 7:32 AM, "Sebastian Huber" <<a href="mailto:sebastian.huber@embedded-brains.de">sebastian.huber@embedded-brains.de</a>> wrote:<br>

><br>

> Hello,<br>

><br>

> for a proper timer support on SMP systems there is some work to do. I added a<br>

> ticket for this:<br>

><br>

> <a href="https://devel.rtems.org/ticket/2554">https://devel.rtems.org/ticket/2554</a><br>

><br>

> With respect to the data structure selection I think that the red-black trees<br>

> are a good choice.</p>

<p dir="ltr">I had reached this conclusion a while back. </p>

<p dir="ltr">I think a good first step would be to replace the watchdog delta chains with red-black trees keyed by either TOD or uptime. That gets us away from tick units.</p>
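<p dir="ltr">As a rough sketch of that first step (not RTEMS code): timers are keyed by an absolute uptime in nanoseconds instead of a tick delta, so an insert only compares keys and never has to adjust its neighbours. The names timer_node, timer_set, timer_set_insert and timer_set_first are made up for illustration, and a plain unbalanced binary search tree stands in for the red-black tree to keep the example short.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer node keyed by absolute uptime in nanoseconds. */
typedef struct timer_node {
  uint64_t expire_ns;
  struct timer_node *left;
  struct timer_node *right;
} timer_node;

/* Hypothetical timer set; a real one would be a balanced red-black tree. */
typedef struct {
  timer_node *root;
} timer_set;

/* Insert a caller-provided node; no dynamic memory is needed. */
static void timer_set_insert(timer_set *set, timer_node *node, uint64_t expire_ns)
{
  timer_node **link = &set->root;

  assert(node != NULL);
  node->expire_ns = expire_ns;
  node->left = NULL;
  node->right = NULL;

  while (*link != NULL) {
    /* Equal keys go right, so timers with the same expiration keep FIFO order. */
    link = expire_ns < (*link)->expire_ns ? &(*link)->left : &(*link)->right;
  }

  *link = node;
}

/* The leftmost node is the next timer to expire. */
static timer_node *timer_set_first(const timer_set *set)
{
  timer_node *node = set->root;

  while (node != NULL && node->left != NULL) {
    node = node->left;
  }

  return node;
}
```

<p dir="ltr">The clock tick (or a one-shot timer) would then only look at timer_set_first() and compare its key against the current uptime.</p>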

<p dir="ltr">If you do this work first, then others can pitch in on POSIX API changes and enhancements while you work through SMP improvements.</p>

<p dir="ltr">I say this because I have been investigating missing POSIX capabilities, and some of them, such as CLOCK_MONOTONIC, relate to this. So you could get help on the top side while you work on the bottom side.</p>

<p dir="ltr">> An open issue is which load balancing and work distribution do we want to add.<br>

> Currently we have one watchdog header of the ticks based timers (relative<br>

> timeouts) and one for the seconds based timers (absolute timeouts).</p>

<p dir="ltr">I would like to see both change to real units of time and be precise, so that a watchdog can be scheduled for a precise length of time, whether the timeout is relative or absolute/calendar time.</p>

<p dir="ltr">POSIX absolute-time timeouts are currently converted implicitly to relative timeouts. If the seconds watchdog set kept a more precise absolute time, it could have been used instead, but seconds granularity is not sufficient to implement this.</p>

<p dir="ltr">And this means that POSIX absolute timeouts are actually treated like monotonic timeouts. </p>
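<p dir="ltr">To make the problem concrete, here is a minimal sketch of such a conversion. It is not the actual RTEMS code; abs_to_relative_ticks and timespec_to_ns are hypothetical helpers. Once the deadline has been turned into a tick count, a later change of the wall clock no longer affects when the timeout fires, which is exactly the monotonic behaviour described above.</p>

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Hypothetical helper: flatten a timespec into nanoseconds. */
static int64_t timespec_to_ns(const struct timespec *ts)
{
  return (int64_t) ts->tv_sec * NS_PER_SEC + ts->tv_nsec;
}

/*
 * Hypothetical helper: convert an absolute deadline into a relative tick
 * count, rounding up so the timeout never fires early.  A deadline in the
 * past yields zero ticks.
 */
static uint64_t abs_to_relative_ticks(
  const struct timespec *deadline,
  const struct timespec *now,
  int64_t ns_per_tick
)
{
  int64_t delta;

  assert(ns_per_tick > 0);
  delta = timespec_to_ns(deadline) - timespec_to_ns(now);

  if (delta <= 0) {
    return 0;
  }

  return (uint64_t) ((delta + ns_per_tick - 1) / ns_per_tick);
}
```

<p dir="ltr">With a 10 ms tick, a deadline 500 ms ahead becomes 50 ticks. If the calendar time is then set forward by an hour, the thread still waits 50 ticks.</p>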

<p dir="ltr">So IMO the user visible side of this needs to be proper support for POSIX absolute time timeouts, clock monotonic, and at least the condition variable attribute to specify the associated clock. There is also some clock monotonic support under POSIX timers.</p>
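<p dir="ltr">For reference, this is roughly what the application side looks like with the standard POSIX calls: pthread_condattr_setclock() selects the clock, and the deadline passed to pthread_cond_timedwait() is then an absolute time on that clock. The wrapper wait_with_monotonic_deadline is only an illustrative name, and the sketch assumes a POSIX system that supports CLOCK_MONOTONIC for condition variables.</p>

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/*
 * Illustrative wrapper: wait on a condition variable that nobody signals,
 * using an absolute CLOCK_MONOTONIC deadline.  Returns the result of
 * pthread_cond_timedwait(), which is expected to be ETIMEDOUT.
 */
static int wait_with_monotonic_deadline(long timeout_ms)
{
  pthread_condattr_t attr;
  pthread_cond_t cond;
  pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
  struct timespec deadline;
  int rc;

  rc = pthread_condattr_init(&attr);
  assert(rc == 0);
  /* Tie the condition variable to the monotonic clock. */
  rc = pthread_condattr_setclock(&attr, CLOCK_MONOTONIC);
  assert(rc == 0);
  rc = pthread_cond_init(&cond, &attr);
  assert(rc == 0);
  pthread_condattr_destroy(&attr);

  /* Build an absolute deadline on the same clock. */
  clock_gettime(CLOCK_MONOTONIC, &deadline);
  deadline.tv_sec += timeout_ms / 1000;
  deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
  if (deadline.tv_nsec >= 1000000000L) {
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }

  pthread_mutex_lock(&mutex);
  do {
    /* Loop to absorb spurious wakeups; only the timeout ends the wait. */
    rc = pthread_cond_timedwait(&cond, &mutex, &deadline);
  } while (rc == 0);
  pthread_mutex_unlock(&mutex);

  pthread_cond_destroy(&cond);
  pthread_mutex_destroy(&mutex);
  return rc;
}
```

<p dir="ltr">Without the attribute, the deadline is interpreted against CLOCK_REALTIME, and that is the case where a true absolute-time watchdog set matters.</p>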

<p dir="ltr">Have you built a list of all the POSIX API points this touches? I have a good start on this because I have been reviewing POSIX conformance.</p>

<p dir="ltr">> The timer<br>

> routines execute in the context of the clock tick interrupt of processor zero.<br>

> This is not scalable in the processor count.  It connects different scheduler<br>

> instances via this central infrastructure.<br>

><br>

> A simple enhancement would be to provide one watchdog header for<br>

><br>

> A) each scheduler instance, or<br>

><br>

> B) each processor.<br>

><br>

> I am in favour of option B), since<br>

><br>

> * the storage space for a watchdog header is quite small,<br>

><br>

> * access to the watchdog header is easy via the _Per_CPU_Information and<br>

> requires no dynamic memory, and<br>

><br>

> * schedulers supporting thread processor affinities could use a local watchdog<br>

> header.<br>

><br>

> Each watchdog header uses a dedicated lock.  Threads would use the watchdog<br>

> header of the current processor.  Access to the thread watchdog control is<br>

> protected by a dedicated lock in the thread control block.</p>

<p dir="ltr">If a thread schedules a watchdog and is then moved to another processor or scheduler, what is the impact?</p>

<p dir="ltr">The association of a watchdog timer with the current CPU seems like a good solution. It achieves a balancing similar to timer wheels but based on a system characteristic that is naturally balanced in SMP systems. Threads are spread across cores in a controlled manner by a combination of design and scheduler. This seems like a good way to put a big-O bound on the watchdogs per set instance.</p>
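<p dir="ltr">Here is a small sketch of how I picture option B and the migration question. Everything in it is an assumption for illustration: CPU_COUNT, the watchdog and watchdog_header types, and a spin lock built on C11 atomic_flag in place of the real SMP lock. A sorted list stands in for the tree. The point is that each watchdog records which header it was inserted on, so it can still be removed correctly after the thread has moved to another processor.</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define CPU_COUNT 4u /* hypothetical processor count */

typedef struct watchdog {
  uint64_t expire;
  struct watchdog *next;
  uint32_t cpu; /* header this watchdog was inserted on */
} watchdog;

typedef struct {
  atomic_flag lock; /* dedicated lock per header */
  watchdog *first;  /* earliest expiration first */
  size_t count;
} watchdog_header;

static watchdog_header per_cpu_header[CPU_COUNT];

static void watchdog_headers_init(void)
{
  for (uint32_t cpu = 0; cpu < CPU_COUNT; ++cpu) {
    atomic_flag_clear(&per_cpu_header[cpu].lock);
    per_cpu_header[cpu].first = NULL;
    per_cpu_header[cpu].count = 0;
  }
}

static void header_lock(watchdog_header *h)
{
  while (atomic_flag_test_and_set_explicit(&h->lock, memory_order_acquire)) {
    /* spin until the holder releases the lock */
  }
}

static void header_unlock(watchdog_header *h)
{
  atomic_flag_clear_explicit(&h->lock, memory_order_release);
}

/* Insert on the header of the processor the caller currently runs on. */
static void watchdog_insert(uint32_t current_cpu, watchdog *w, uint64_t expire)
{
  watchdog_header *h;
  watchdog **link;

  assert(current_cpu < CPU_COUNT);
  h = &per_cpu_header[current_cpu];
  w->expire = expire;
  w->cpu = current_cpu;

  header_lock(h);
  link = &h->first;
  while (*link != NULL && (*link)->expire <= expire) {
    link = &(*link)->next;
  }
  w->next = *link;
  *link = w;
  ++h->count;
  header_unlock(h);
}

/* Remove via the recorded header, independent of where the caller runs now. */
static void watchdog_remove(watchdog *w)
{
  watchdog_header *h = &per_cpu_header[w->cpu];

  header_lock(h);
  for (watchdog **link = &h->first; *link != NULL; link = &(*link)->next) {
    if (*link == w) {
      *link = w->next;
      --h->count;
      break;
    }
  }
  header_unlock(h);
}
```

<p dir="ltr">In this sketch the watchdog stays on the original header until it fires or is removed, and only the next activation would land on the new processor. Whether that is the behaviour we want is the open question.</p>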

<p dir="ltr">> Which watchdog header should be used in the timer objects?  Possible are<br>

><br>

> C) the watchdog header is determined at timer creation time, e.g. processor of<br>

> executing thread, or<br>

><br>

> D) selectable via new function, e.g. rtems_timer_set_processor().<br>

><br>

> I suggest to implement C) and D).  Is D) possible via POSIX API?</p>

<p dir="ltr">POSIX has no approved APIs that are SMP related. They also have no concept of interrupts.  So there is no way to do this via POSIX. Purely an implementation detail.</p>

<p dir="ltr">I don't know that setting a preferred processor is necessary, but I can see where it could be used to help WCET analysis. Using it seems to add a lot of complexity that has to be considered at design time, though.</p>

<p dir="ltr">Bottom line, I think it might have some use to developers but we will have to teach people how using it can benefit them.</p>

<p dir="ltr">But in general, that's true for all SMP features. Choosing what to use on a real system implementation is hard.</p>

<p dir="ltr">> With the current clock drivers we must execute the clock interrupt service<br>

> routine on each processor all the time.  This is not very time and power<br>

> efficient.  The clock interrupt service on processor zero should execute all<br>

> the time, since otherwise the _Watchdog_Ticks_since_boot variable no longer<br>

> increments and this may confuse existing applications.  The clock interrupt<br>

> service on other processors should only execute in case their dedicated<br>

> watchdog set is not-empty.  Since this set changes dynamically, we must<br>

> regularly request the clock driver to disable/enable the clock interrupt<br>

> service execution on a particular processor.  I suggest to add a global clock<br>

> device handler table, which is initialized to a default and optionally replaced<br>

> by the clock driver providing the following handlers:<br>

><br>

> typedef struct {<br>

>   void (*enable_on_processor)(uint32_t processor);<br>

>   void (*disable_on_processor)(uint32_t processor);<br>

> } Clock_Device;<br>

><br>

> extern Clock_Device *_Clock_Device;<br>

><br>

> This could be enhanced to offer a tickless operation for systems sensitive to<br>

> power-consumption in the future.<br>

><br>

> Use of a global _Watchdog_Ticks_since_boot maintained by processor zero is a<br>

> bit problematic.  The clock interrupt service is triggered on different<br>

> processors simultaneously.  However, the clock interrupt service on processor<br>

> zero may be delayed due to various reasons, e.g. high priority nested<br>

> interrupt.  Thus the clock interrupt service on other processors would observe<br>

> a not up to date _Watchdog_Ticks_since_boot value.  So each watchdog header<br>

> must maintain its separate ticks value.  Some care must be taken to get the<br>

> start and stop tick values required for the timer objects.</p>
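<p dir="ltr">To check my understanding of the handler table, here is a sketch of how the watchdog code might drive it. The Clock_Device type is taken from your mail; everything else (clock_device, watchdog_set, set_add, set_remove, the demo_ handlers and CPU_COUNT) is hypothetical, and the per-set locking is left out. The handlers are only called when a set on a processor other than zero goes from empty to non-empty or back.</p>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CPU_COUNT 4u /* hypothetical processor count */

typedef struct {
  void (*enable_on_processor)(uint32_t processor);
  void (*disable_on_processor)(uint32_t processor);
} Clock_Device;

/* Default handlers do nothing, so clock drivers without support still work. */
static void clock_nop(uint32_t processor)
{
  (void) processor;
}

static Clock_Device default_clock_device = { clock_nop, clock_nop };

/* Stand-in for the global table pointer; a clock driver may replace it. */
static Clock_Device *clock_device = &default_clock_device;

/* Hypothetical per-processor watchdog set, reduced to its element count. */
typedef struct {
  size_t active;
} watchdog_set;

static watchdog_set sets[CPU_COUNT];

/* Processor zero keeps ticking regardless, so it is never toggled. */
static void set_add(uint32_t cpu)
{
  assert(cpu < CPU_COUNT);
  if (sets[cpu].active++ == 0 && cpu != 0) {
    clock_device->enable_on_processor(cpu);
  }
}

static void set_remove(uint32_t cpu)
{
  assert(cpu < CPU_COUNT && sets[cpu].active > 0);
  if (--sets[cpu].active == 0 && cpu != 0) {
    clock_device->disable_on_processor(cpu);
  }
}

/* Hypothetical clock driver that just counts the requests it receives. */
static unsigned enable_calls[CPU_COUNT];
static unsigned disable_calls[CPU_COUNT];

static void demo_enable(uint32_t processor)
{
  ++enable_calls[processor];
}

static void demo_disable(uint32_t processor)
{
  ++disable_calls[processor];
}

static Clock_Device demo_clock_device = { demo_enable, demo_disable };
```

<p dir="ltr">A driver would install its table with clock_device = &amp;demo_clock_device during initialization. Two inserts followed by two removes on processor 2 then produce exactly one enable and one disable request.</p>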

<p dir="ltr">You mentioned it in passing, but what series of events triggers updates on all the watchdog sets? I think you said one tick cycles over all sets.</p>

<p dir="ltr">If this is the case, one set per scheduler reduces the execution overhead of checking all sets for updates and logically ties watchdog sets and thread sets together, which also seems consistent from a system design perspective.</p>

<p dir="ltr">I don't have a strong case for per scheduler versus per CPU, but per scheduler seems logical and allows the watchdog to follow the thread. When a thread moves between schedulers, would the watchdog move? Or only on the next activation?</p>

<p dir="ltr">> -- <br>

> Sebastian Huber, embedded brains GmbH</p>