nanosleep.c remarks

Mon Aug 1 05:18:47 UTC 2016

Hello Pavel,

On 30/07/16 19:40, Pavel Pisa wrote:
> Hello Gedare and Sebastian,
>
> as clock_nanosleep is in place now, I am trying to
> analyze the consequences and I have some questions.
>
> The first one: why is _Nanosleep_Pseudo_queue required
> there? nanosleep is a critical function for realtime and
> it is quite possible that many threads on several CPUs
> use it concurrently.

In this case the application has a design problem. Currently,
nanosleep() is just an expensive wrapper for rtems_task_wake_after()
with the addition of signal support. Why use this function at all?

> But _Thread_queue_Enqueue calls
> _Thread_queue_Acquire( the_thread_queue, &queue_context.Lock_context );
> unconditionally. This leads to _SMP_ticket_lock_Acquire on SMP.
> So this means that all calls are serialized and contend
> for cache lines. But I do not understand why a queue,
> normally used for wakeup request distribution, is required
> for nanosleep. The original code only selected the appropriate
> state and activated the scheduler.
>
> Original nanosleep
>
>    /*
>     *  Block for the desired amount of time
>     */
>    _Thread_Disable_dispatch();
>      executing = _Thread_Executing;
>      _Thread_Set_state(
>        executing,
>        STATES_DELAYING | STATES_INTERRUPTIBLE_BY_SIGNAL
>      );

Please have a look at the commit history and relevant tickets, e.g.

https://devel.rtems.org/ticket/2130

This code is broken with signals.

>      _Watchdog_Initialize(
>        &executing->Timer,
>        _Thread_Delay_ended,
>        0,
>        executing
>      );
>      _Watchdog_Insert_ticks( &executing->Timer, ticks );
>    _Thread_Enable_dispatch();
>
> Actual nanosleep
>
>    /*
>     *  Block for the desired amount of time
>     */
>    _Thread_queue_Enqueue(
>      &_Nanosleep_Pseudo_queue,
>      &_Thread_queue_Operations_FIFO,
>      executing,
>      STATES_DELAYING | STATES_INTERRUPTIBLE_BY_SIGNAL,
>      ticks,
>      discipline,
>      1
>    );
>
> But if the simple _Thread_Set_state approach is not supported, then
> how is it that it is still used in rtems_task_wake_after?

The rtems_task_wake_after() is not interruptible by signals.

>
> rtems_status_code rtems_task_wake_after(
>    rtems_interval ticks
> )
> {
>    /*
>     * It is critical to obtain the executing thread after thread dispatching is
>     * disabled on SMP configurations.
>     */
>    Thread_Control  *executing;
>    Per_CPU_Control *cpu_self;
>
>    cpu_self = _Thread_Dispatch_disable();
>      executing = _Thread_Executing;
>
>      if ( ticks == 0 ) {
>        _Thread_Yield( executing );
>      } else {
>        _Thread_Set_state( executing, STATES_DELAYING );
>        _Thread_Wait_flags_set( executing, THREAD_WAIT_STATE_BLOCKED );
>        _Thread_Timer_insert_relative(
>          executing,
>          cpu_self,
>          _Thread_Timeout,
>          ticks
>        );
>      }
>    _Thread_Dispatch_enable( cpu_self );
>    return RTEMS_SUCCESSFUL;
> }
>
> Then the time stuff.
>
> nanosleep_helper() does not distinguish between CLOCK_REALTIME
> and CLOCK_MONOTONIC when it computes the remaining time (rmtp).
> But the intention of this field is that if you call
> nanosleep/clock_nanosleep again with the same parameters and rmtp
> used as the time to wait (in case TIMER_ABSTIME is not set) then
> the final wake time should be +/- the same as if there had been
> no interruption. If we consider the POSIX required behavior/difference
> between CLOCK_REALTIME and CLOCK_MONOTONIC and the possibility
> to adjust the realtime clock then it would not work as expected.
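
The restart pattern described above can be sketched in plain POSIX C as
follows. This is only an illustration of how an application typically
uses the remaining time, not code from the RTEMS sources; the name
sleep_fully() is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/*
 * Hypothetical helper (not an RTEMS or POSIX function): sleeps for the
 * full relative interval and resumes with the remaining time whenever
 * nanosleep() is interrupted by a signal.  Returns 0 on success or -1
 * with errno set for any error other than EINTR.
 */
static int sleep_fully( struct timespec req )
{
  struct timespec rem;

  while ( nanosleep( &req, &rem ) == -1 ) {
    if ( errno != EINTR ) {
      return -1;
    }

    /* Continue with the time that was still left at the interruption. */
    req = rem;
  }

  return 0;
}
```

The final wake time of such a loop is only close to the uninterrupted
one if the remaining time is computed against the same clock that the
sleep itself uses.
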
>
> By the way, _Timespec_From_ticks works the expected way only for
> the first 1.19 hours after boot if used for absolute time (not used
> that way in nanosleep).
> For relative time, if nanosleep is used for a delay longer
> than 4294 seconds then the rmtp result is complete garbage.
>
> void _Timespec_From_ticks(
>    uint32_t         ticks,
>    struct timespec *time
> )
> {
>    uint32_t    usecs;
>
>    usecs = ticks * rtems_configuration_get_microseconds_per_tick();
>
>    time->tv_sec  = usecs / TOD_MICROSECONDS_PER_SECOND;
>    time->tv_nsec = (usecs % TOD_MICROSECONDS_PER_SECOND) *
>                      TOD_NANOSECONDS_PER_MICROSECOND;
> }

This function is probably superfluous now due to the timekeeping via the 
FreeBSD timecounters. We tried to keep the existing behaviour during the 
introduction of the FreeBSD timecounters. However, some parts of the 
POSIX implementation in RTEMS do not conform to POSIX. This must be
fixed step by step (it is not on my current TODO list).
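
For illustration, the 32-bit wrap-around in the quoted function can be
shown with a small stand-alone sketch. The helper names below are
hypothetical and not part of RTEMS; the microseconds-per-tick value is
passed in as a parameter instead of being read from the configuration.

```c
#include <stdint.h>
#include <time.h>

/* The 32-bit product used by the quoted function; wraps at 2^32 usecs. */
static uint32_t usecs_32( uint32_t ticks, uint32_t usecs_per_tick )
{
  return ticks * usecs_per_tick;
}

/*
 * Hypothetical replacement (an assumption, not RTEMS code): the same
 * conversion with a 64-bit intermediate so that the microsecond count
 * cannot wrap for any 32-bit tick count.
 */
static void timespec_from_ticks_64(
  uint32_t         ticks,
  uint32_t         usecs_per_tick,
  struct timespec *time
)
{
  uint64_t usecs = (uint64_t) ticks * usecs_per_tick;

  time->tv_sec  = (time_t) ( usecs / 1000000 );
  time->tv_nsec = (long) ( usecs % 1000000 ) * 1000;
}
```

With a 10 ms tick, 500000 ticks are 5000 seconds. usecs_32() wraps to
705032704 microseconds, while the 64-bit variant yields tv_sec = 5000.
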

>
> If we consider that the crystal oscillator is not perfect, then the
> value of rtems_configuration_get_microseconds_per_tick has to be
> tuned at runtime. The problem is how to avoid shifting the time when
> the scale is changed at ticks != 0. This means
> using y = a * x + b there, and each time a is changed from a1 to a2,
> changing b such that a2 * x + b2 = a1 * x + b1,
> to ensure tick to usec monotonicity for the conversion of
> monotonic time from ticks to timespec.
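
The rebasing rule above can be written down as a short sketch. This is
only an illustration under assumed names (tick_scale, scale_to_usecs,
scale_set_rate); nothing like it exists in the RTEMS sources, and a
real implementation would probably use a fixed-point slope.

```c
#include <stdint.h>

/* Hypothetical tick-to-microsecond scale: usecs = a * ticks + b. */
typedef struct {
  int64_t a; /* slope: microseconds per tick */
  int64_t b; /* offset in microseconds */
} tick_scale;

static int64_t scale_to_usecs( const tick_scale *s, uint64_t ticks )
{
  return s->a * (int64_t) ticks + s->b;
}

/*
 * Changes the slope to a2 at tick count x and picks the new offset b2
 * so that a2 * x + b2 == a1 * x + b1, i.e. the converted time does not
 * jump at the moment of the change.
 */
static void scale_set_rate( tick_scale *s, uint64_t x, int64_t a2 )
{
  s->b = scale_to_usecs( s, x ) - a2 * (int64_t) x;
  s->a = a2;
}
```

Changing the rate from 1000 to 1001 microseconds per tick at tick 5000
keeps the converted value at 5000000 for that tick and only affects
the ticks that follow.
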
>
> Another problem is that for a higher frequency tick or timing
> source the value of rtems_configuration_get_microseconds_per_tick
> is small, so the relative precision is insufficient.
>
> For clock_nanosleep we get to _TOD_Absolute_timeout_to_ticks
> which calls for CLOCK_MONOTONIC in
>
> I have mostly lost track in the call chain there.
> bintime2timespec is provided by NewLib as part of BSD time
> framework introduction
>
> https://devel.rtems.org/ticket/2271
> https://www.daemon-systems.org/man/timecounter.9.html
>
> Structure struct timecounter seems to be almost sane from
> the documentation. But u_int64_t tc_frequency without
> shifting right requires unnecessarily wide multiplication
> or even worse division and relative resolution can be
> low for some cases.

The tc_frequency is not used in the hot paths, please have a look at 
_Timecounter_Windup().

>
> I am trying to study the code
>
> static inline void _TOD_Get_zero_based_uptime_as_timespec(
>    struct timespec *time
> )
> {
>    _Timecounter_Nanouptime( time );
>    --time->tv_sec;
> }
>
> where seconds decrement seems suspicious to me.

For FreeBSD compatibility the uptime starts with one second. For RTEMS 
compatibility the uptime starts at zero.

>
> There seem to be data structures for precise time computation
> and synchronization (sys/timeffc.h, etc.) but I am not sure
> if some of them are used.

The use of ntp_update_second() is currently disabled. This code could be 
easily ported from FreeBSD to RTEMS if someone is interested.

>
> The general rule for POSIX systems is that CLOCK_MONOTONIC and
> CLOCK_REALTIME scaling is done in sync; the base and
> step corrections are applied to CLOCK_REALTIME only.
> But there seem to be two relatively independent paths
> in the actual sources.
>
> Another strict requirement for nanosleep is that it has
> to suspend the task for at least the specified time. But I am
> not sure if there is such a round-up in the actual code.

Yes, some test cases of the libstdc++ testsuite rely on this.
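
A minimal sketch of such a round-up, assuming a tick-based timer and a
hypothetical helper name (this is not the RTEMS implementation): the
interval is rounded up to whole ticks and one more tick is added,
because the current tick has already partially elapsed when the sleep
starts.

```c
#include <stdint.h>
#include <time.h>

/*
 * Hypothetical helper: number of ticks that guarantees a delay of at
 * least *req.  A zero interval returns 0 and is left to the caller
 * (for example to yield the processor).
 */
static uint64_t ticks_at_least(
  const struct timespec *req,
  uint32_t               nsecs_per_tick
)
{
  uint64_t nsecs = (uint64_t) req->tv_sec * 1000000000ULL
    + (uint64_t) req->tv_nsec;
  uint64_t ticks;

  if ( nsecs == 0 ) {
    return 0;
  }

  /* Round up to whole ticks. */
  ticks = ( nsecs + nsecs_per_tick - 1 ) / nsecs_per_tick;

  /* One extra tick covers the already running tick period. */
  return ticks + 1;
}
```

With a 10 ms tick, a request of exactly 10 ms yields 2 ticks and a
request of 10 ms plus 1 ns yields 3 ticks, so the wake-up is never
earlier than requested.
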

> This is critical if a user builds his/her own timer
> queue: a premature wakeup leads to repeated redundant nanosleep
> calls and (in the case of rounding down) it can even result in a busy
> loop for the last tick cycle, for example.
>
> Generally there seem to be many multiplications, divisions
> etc. at least in the clock_nanosleep path.
>
> I have not gained the full picture yet. But my feeling is
> that there are at least some problematic things,
> which I have tried to analyze.
>
> But generally, it is great that clock_nanosleep
> is supported, as are some other POSIX timed IPC
> variants.
>
> Best wishes,
>
>                 Pavel

-- 
Sebastian Huber, embedded brains GmbH
