RTEMS | cpukit: Create DHRL Library for DRAM Latency Mitigation (!1193)

Thu May 21 14:21:28 UTC 2026

Wayne Thornton commented on a discussion on cpukit/dhrl/dhrl.c: https://gitlab.rtems.org/rtems/rtos/rtems/-/merge_requests/1193#note_150685

 > +/* Determine optimal interleave bit for given memory region */
 > +static void dhrl_calibrate_interleave( struct dhrl_control *ctx )
 > +{
 > +  if ( ctx->active_config.interleave_bit != 0 ) {
 > +    return;
 > +  }
 > +
 > +  uint8_t  best_bit = 6;
 > +  uint64_t lowest_latency = UINT64_MAX;
 > +
 > +  uint8_t max_safe_bit = 0;
 > +  size_t  safe_size = ctx->active_config.memory_region_size / 2;
 > +  while ( safe_size > 1 ) {
 > +    max_safe_bit++;
 > +    safe_size >>= 1;
 > +  }

I looked into using `_Bitfield_Find_first_bit`, but the complication is that `safe_size` is a `size_t`, which can be 64-bit depending on the target architecture. My understanding is that the standard RTEMS bitfield routines are traditionally tailored for 32-bit words.

That said, GCC provides `__builtin_clzll()`. By casting the value to `unsigned long long` before the call, we can handle 64-bit sizes on any architecture. On targets with a count-leading-zeros instruction (like `clz` on ARM or `bsr` on x86), it also typically compiles down to one or a few instructions instead of a software loop; targets without such an instruction fall back to a compiler support routine.

I know it relies on a GCC extension, but since the core RTEMS score headers already rely heavily on built-ins for `atomic` and bitwise operations, it seems like the safest and most performant approach here.
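
As a rough sketch of what I have in mind (not the final patch): the helper name `dhrl_max_safe_bit` and the reference loop `dhrl_max_safe_bit_loop` are hypothetical names used only for illustration. The sketch assumes GCC or a compatible compiler, and it guards the `half <= 1` case explicitly because `__builtin_clzll( 0 )` is undefined.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Reference: the loop from the quoted diff, i.e. floor(log2(size / 2)). */
static inline uint8_t dhrl_max_safe_bit_loop( size_t region_size )
{
  uint8_t max_safe_bit = 0;
  size_t  safe_size = region_size / 2;

  while ( safe_size > 1 ) {
    max_safe_bit++;
    safe_size >>= 1;
  }

  return max_safe_bit;
}

/* Hypothetical replacement using the GCC built-in. */
static inline uint8_t dhrl_max_safe_bit( size_t region_size )
{
  /* Widen first so the built-in sees the full value on every target. */
  unsigned long long half = (unsigned long long) region_size / 2;

  /*
   * The loop yields 0 for half == 0 and half == 1. Handling both here
   * also avoids calling __builtin_clzll() with 0, which is undefined.
   */
  if ( half <= 1 ) {
    return 0;
  }

  /* Index of the highest set bit: (width - 1) - leading zero count. */
  return (uint8_t) ( ( sizeof( half ) * CHAR_BIT - 1 )
    - (size_t) __builtin_clzll( half ) );
}
```

Both functions should agree for every input, so the loop can stay around as a test oracle if we want one.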
