Some problems with the libbsd update

Wed Aug 22 07:50:57 UTC 2018

Hello,

I am currently working on an update of libbsd to the latest FreeBSD head 
(see also https://devel.rtems.org/ticket/3472). It was quite a smooth 
process until May 2018. FreeBSD seems to receive a significant amount of 
funding to perform better on NUMA systems. They started to use lock-free 
data structures in the kernel and included the Concurrency Kit in the 
base system:

https://github.com/freebsd/freebsd/tree/master/sys/contrib/ck

The weak point of lock-free data structures is memory reclamation. 
FreeBSD introduced an epoch-based memory reclamation API:

https://github.com/freebsd/freebsd/blob/master/share/man/man9/epoch.9

It is now used for basic synchronization in the network stack and is 
hard to avoid. The Concurrency Kit and the epoch memory reclamation API 
are interesting features for RTEMS as well. The FreeBSD implementation 
needs a thread pinning feature which is hard to implement in RTEMS. It 
turned out that thread pinning is only used as an optimization, see also:

https://lists.freebsd.org/pipermail/freebsd-hackers/2018-August/053165.html

Supporting everything in RTEMS is a lot of work, so I have to make some 
trade-offs. The implementation of this API must be as efficient as 
possible since it is used in the critical paths of the network stack. I 
will try to use a single global epoch and thread-specific records, as 
suggested by Matthew Macy, to avoid the need for per-processor data 
structures and thread pinning. One key issue is that epoch records 
must not be destroyed:

https://www.mankier.com/3/ck_epoch_register
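
To illustrate the bookkeeping behind a single global epoch with 
thread-specific records, here is a minimal sketch. All names with a 
sketch_ prefix are hypothetical. This is neither the Concurrency Kit API 
nor the FreeBSD epoch API, and it leaves out the atomic operations and 
memory barriers a real implementation needs. It only shows when the 
global epoch may advance and when a retired object may be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_THREADS 4

/* Hypothetical thread-specific epoch record; never destroyed, only reused. */
struct sketch_epoch_record {
  unsigned epoch; /* global epoch observed when the section was entered */
  bool active;    /* true while the thread is inside a read-side section */
};

/* The single global epoch and one record per thread (assumption: fixed count). */
static unsigned sketch_global_epoch = 1;
static struct sketch_epoch_record sketch_records[SKETCH_MAX_THREADS];

/* Enter a read-side section: remember the global epoch we observed. */
static void sketch_epoch_enter(struct sketch_epoch_record *r)
{
  r->epoch = sketch_global_epoch;
  r->active = true;
}

/* Leave the read-side section. */
static void sketch_epoch_exit(struct sketch_epoch_record *r)
{
  assert(r->active);
  r->active = false;
}

/*
 * Advance the global epoch only if every active reader has already
 * observed the current one.  Returns true if the epoch was advanced.
 */
static bool sketch_epoch_try_advance(void)
{
  for (size_t i = 0; i < SKETCH_MAX_THREADS; ++i) {
    const struct sketch_epoch_record *r = &sketch_records[i];

    if (r->active && r->epoch != sketch_global_epoch) {
      return false;
    }
  }

  ++sketch_global_epoch;
  return true;
}

/*
 * An object retired while the global epoch was retired_epoch can be
 * freed once the global epoch has advanced twice since then.
 */
static bool sketch_epoch_safe_to_free(unsigned retired_epoch)
{
  return sketch_global_epoch - retired_epoch >= 2;
}
```

A reader which is still active in an old epoch blocks the second 
advance, so an object retired in that epoch stays allocated until the 
reader leaves its section.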

The consequence of this is that unlimited thread objects may lead to 
undefined behaviour with this implementation approach. Also, thread-local 
storage cannot be used since it is reinitialized once a thread is 
restarted or reused. The epoch record must be included in the 
Thread_Control and must not be touched by _Thread_Initialize(). This 
means I have to move the API and its implementation along with the 
Concurrency Kit to RTEMS.
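
A minimal sketch of an epoch record which survives thread 
re-initialization could look like this. The structures and functions 
are hypothetical stand-ins, not the real Thread_Control or 
_Thread_Initialize(). The sketch assumes that the thread control block 
lives in zero-initialized storage which is never freed, and that the 
record is the last member so that the initialization can simply stop 
before it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical epoch record; registered once and then kept forever. */
struct sketch_epoch_record {
  unsigned epoch;
  int registered;
};

/* Hypothetical stand-in for a thread control block. */
struct sketch_thread_control {
  int priority;
  void *entry_argument;
  /* Must stay last: the initialization clears everything before it. */
  struct sketch_epoch_record epoch_record;
};

/* Number of records ever registered (for illustration only). */
static unsigned sketch_registered_records;

/*
 * Stand-in for the thread initialization.  It clears only the part of
 * the control block in front of the epoch record, so the record keeps
 * its state when the thread object is reused.
 */
static void sketch_thread_initialize(
  struct sketch_thread_control *t,
  int priority
)
{
  assert(t != NULL);
  memset(t, 0, offsetof(struct sketch_thread_control, epoch_record));
  t->priority = priority;
}

/* Get the epoch record of a thread; register it on first use only. */
static struct sketch_epoch_record *sketch_thread_epoch_record(
  struct sketch_thread_control *t
)
{
  if (!t->epoch_record.registered) {
    t->epoch_record.registered = 1;
    ++sketch_registered_records;
  }

  return &t->epoch_record;
}
```

With this layout a reused thread object gets the same record again and 
no second registration takes place.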

Alternatively, I could try to implement the thread pinning feature. I am 
not sure whether this is possible at all. It will definitely not work 
well together with mutex obtain timeouts.

Adding support for general-purpose per-processor data structures would 
be quite easy. We just have to collect all per-processor data in a 
linker section and duplicate the section content for each secondary 
processor. Then we can use _Per_CPU_Information[] to get a pointer to 
the corresponding memory area.
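
The idea can be sketched in portable C as follows. A structure stands in 
for the linker section, and a plain pointer array stands in for a 
pointer stored in each _Per_CPU_Information[] entry. All names with a 
sketch_ prefix, the processor count, and the example data items are 
assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_CPU_COUNT 4

/* Stand-in for the linker section which collects all per-processor data. */
struct sketch_per_cpu_section {
  unsigned packets_received;
  int default_budget;
};

/* Initial section content; also used as the copy of the boot processor. */
static struct sketch_per_cpu_section sketch_section_template = { 0, 42 };

/* Memory for the duplicated section content of the secondary processors. */
static struct sketch_per_cpu_section
  sketch_secondary_copies[SKETCH_CPU_COUNT - 1];

/* Stand-in for a pointer kept in each per-processor information entry. */
static void *sketch_per_cpu_area[SKETCH_CPU_COUNT];

/* Duplicate the section content for each secondary processor. */
static void sketch_per_cpu_data_initialize(void)
{
  sketch_per_cpu_area[0] = &sketch_section_template;

  for (size_t i = 1; i < SKETCH_CPU_COUNT; ++i) {
    memcpy(
      &sketch_secondary_copies[i - 1],
      &sketch_section_template,
      sizeof(sketch_section_template)
    );
    sketch_per_cpu_area[i] = &sketch_secondary_copies[i - 1];
  }
}

/*
 * Translate the address of an item in the template section into the
 * address of the same item in the memory area of the given processor.
 */
static void *sketch_per_cpu_item(unsigned cpu, const void *item_in_template)
{
  size_t offset = (size_t)
    ((const char *) item_in_template -
     (const char *) &sketch_section_template);

  assert(cpu < SKETCH_CPU_COUNT);
  assert(offset < sizeof(sketch_section_template));

  return (char *) sketch_per_cpu_area[cpu] + offset;
}
```

Each processor then works on its own copy, for example 
sketch_per_cpu_item(2, &sketch_section_template.packets_received) 
yields the counter of processor 2.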

-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

This message is not a business communication within the meaning of the EHUG.