[rtems commit] kern_tc: unify timecounter to bintime delta conversion

Sebastian Huber sebh at rtems.org
Mon Feb 21 13:14:59 UTC 2022


Module:    rtems
Branch:    master
Commit:    870cf9695f688c06743dd7fe8a7113e0cb5e3607
Changeset: http://git.rtems.org/rtems/commit/?id=870cf9695f688c06743dd7fe8a7113e0cb5e3607

Author:    Andriy Gapon <avg at FreeBSD.org>
Date:      Tue Nov 30 15:23:23 2021 +0200

kern_tc: unify timecounter to bintime delta conversion

There are two places where we convert from a timecounter delta to
a bintime delta: tc_windup and bintime_off.
Both functions use the same calculations when the timecounter delta is
small.  But for a large delta (greater than approximately an equivalent
of 1 second) the calculations were different.  Both functions use
approximate calculations based on th_scale that avoid division.  Both
produce values slightly greater than a true value, calculated with
division by tc_frequency, would be.  tc_windup is slightly more
accurate, so its result is closer to the true value and, thus, smaller
than bintime_off result.

As a consequence there can be a jump back in time when time hands are
switched after a long period of time (a large delta).  Just before the
switch the time would be calculated with a large delta from
th_offset_count in bintime_off.  tc_windup does the switch using its own
calculations of a new th_offset using the large delta.  As explained
earlier, the new th_offset may end up being less than the previously
produced binuptime.  So, for a period of time new binuptime values may
be "back in time" comparing to values just before the switch.

Such a jump must never happen.  All the code assumes that the uptime is
monotonically nondecreasing and some code works incorrectly when that
assumption is broken.  For example, we have observed sleepq_timeout()
ignoring a timeout when the sbinuptime value obtained by the callout
code was greater than the expiration value, but the sbinuptime obtained
in sleepq_timeout() was less than it.  In that case the target thread
would never get woken up.

The unified calculations should ensure the monotonic property of the
uptime.

The problem is quite rare as normally tc_windup should be called HZ
times per second (typically 1000 or 100).  But it may happen in VMs on
very busy hypervisors where a VM's virtual CPU may not get an execution
time slot for a second or more.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	Panzura LLC

---

 cpukit/score/src/kern_tc.c | 42 +++++++++++++++++++++---------------------
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/cpukit/score/src/kern_tc.c b/cpukit/score/src/kern_tc.c
index c0f67df..b757a2e 100644
--- a/cpukit/score/src/kern_tc.c
+++ b/cpukit/score/src/kern_tc.c
@@ -389,6 +389,23 @@ tc_delta(struct timehands *th)
 	    tc->tc_counter_mask);
 }
 
+static __inline void
+bintime_add_tc_delta(struct bintime *bt, uint64_t scale,
+    uint64_t large_delta, uint64_t delta)
+{
+	uint64_t x;
+
+	if (__predict_false(delta >= large_delta)) {
+		/* Avoid overflow for scale * delta. */
+		x = (scale >> 32) * delta;
+		bt->sec += x >> 32;
+		bintime_addx(bt, x << 32);
+		bintime_addx(bt, (scale & 0xffffffff) * delta);
+	} else {
+		bintime_addx(bt, scale * delta);
+	}
+}
+
 /*
  * Functions for reading the time.  We have to loop until we are sure that
  * the timehands that we operated on was not updated under our feet.  See
@@ -400,7 +417,7 @@ bintime_off(struct bintime *bt, u_int off)
 {
 	struct timehands *th;
 	struct bintime *btp;
-	uint64_t scale, x;
+	uint64_t scale;
 #ifndef __rtems__
 	u_int delta, gen, large_delta;
 #else /* __rtems__ */
@@ -423,15 +440,7 @@ bintime_off(struct bintime *bt, u_int off)
 	} while (gen != th->th_generation);
 #endif
 
-	if (__predict_false(delta >= large_delta)) {
-		/* Avoid overflow for scale * delta. */
-		x = (scale >> 32) * delta;
-		bt->sec += x >> 32;
-		bintime_addx(bt, x << 32);
-		bintime_addx(bt, (scale & 0xffffffff) * delta);
-	} else {
-		bintime_addx(bt, scale * delta);
-	}
+	bintime_add_tc_delta(bt, scale, large_delta, delta);
 }
 #define	GETTHBINTIME(dst, member)					\
 do {									\
@@ -1684,17 +1693,8 @@ _Timecounter_Windup(struct bintime *new_boottimebin,
 #endif
 	th->th_offset_count += delta;
 	th->th_offset_count &= th->th_counter->tc_counter_mask;
-	while (delta > th->th_counter->tc_frequency) {
-		/* Eat complete unadjusted seconds. */
-		delta -= th->th_counter->tc_frequency;
-		th->th_offset.sec++;
-	}
-	if ((delta > th->th_counter->tc_frequency / 2) &&
-	    (th->th_scale * delta < ((uint64_t)1 << 63))) {
-		/* The product th_scale * delta just barely overflows. */
-		th->th_offset.sec++;
-	}
-	bintime_addx(&th->th_offset, th->th_scale * delta);
+	bintime_add_tc_delta(&th->th_offset, th->th_scale,
+	    th->th_large_delta, delta);
 
 #ifndef __rtems__
 	/*



More information about the vc mailing list