RTEMS | cpukit: Create DHRL Library for DRAM Latency Mitigation (!1193)
Wayne Thornton (@wmthornton-dev)
gitlab at rtems.org
Thu May 21 14:21:28 UTC 2026
Wayne Thornton commented on a discussion on cpukit/dhrl/dhrl.c: https://gitlab.rtems.org/rtems/rtos/rtems/-/merge_requests/1193#note_150685
> +/* Determine optimal interleave bit for given memory region */
> +static void dhrl_calibrate_interleave( struct dhrl_control *ctx )
> +{
> + if ( ctx->active_config.interleave_bit != 0 ) {
> + return;
> + }
> +
> + uint8_t best_bit = 6;
> + uint64_t lowest_latency = UINT64_MAX;
> +
> + uint8_t max_safe_bit = 0;
> + size_t safe_size = ctx->active_config.memory_region_size / 2;
> + while ( safe_size > 1 ) {
> + max_safe_bit++;
> + safe_size >>= 1;
> + }
I looked into using `_Bitfield_Find_first_bit`, but the complication is that `safe_size` is a `size_t`, which can be 64-bit depending on the target architecture. My understanding is that the standard RTEMS bitfield routines are traditionally tailored for 32-bit words.
That said, GCC provides `__builtin_clzll()`. Casting `safe_size` to `unsigned long long` before the call handles 64-bit sizes on every target, because `unsigned long long` is at least 64 bits wide. On targets with a count-leading-zeros instruction (such as `clz` on ARM or `bsr` on x86), it typically compiles to that instruction instead of a software loop. The argument must be nonzero, since the result of `__builtin_clzll( 0 )` is undefined.
I know it relies on a GCC extension, but since the core RTEMS score headers already rely heavily on built-ins for `atomic` and bitwise operations, it seems like the safest and most performant approach here.
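A minimal sketch of what the replacement could look like, assuming a GCC-compatible compiler. The helper names `dhrl_floor_log2_loop` and `dhrl_floor_log2_clz` are hypothetical and not part of the patch. The first mirrors the loop quoted above, and the second computes the same value with the built-in:

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: mirrors the loop from the quoted patch. */
static uint8_t dhrl_floor_log2_loop( size_t value )
{
  uint8_t bit = 0;

  while ( value > 1 ) {
    bit++;
    value >>= 1;
  }

  return bit;
}

/* Hypothetical helper: same result using the GCC built-in. */
static uint8_t dhrl_floor_log2_clz( size_t value )
{
  /*
   * __builtin_clzll( 0 ) is undefined, so handle 0 and 1 explicitly.
   * The loop above yields 0 for both inputs.
   */
  if ( value <= 1 ) {
    return 0;
  }

  /* Index of the highest set bit: (width - 1) - leading zeros. */
  return (uint8_t) ( ( sizeof( unsigned long long ) * CHAR_BIT - 1 )
    - (size_t) __builtin_clzll( (unsigned long long) value ) );
}
```

With a helper like this, the loop in `dhrl_calibrate_interleave()` would reduce to a single assignment of `dhrl_floor_log2_clz( ctx->active_config.memory_region_size / 2 )` to `max_safe_bit`.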