Self-contained one purpose objects

Thu Jul 23 13:05:05 UTC 2015

Hello Sebastian,

On Thursday 23 of July 2015 13:31:23 Sebastian Huber wrote:
> Hello Pavel,
>
> thanks for your comments.
>
> On 23/07/15 12:40, Pavel Pisa wrote:
> > Hello Sebastian,
...
> > I fully understand your motivation and for small footprint system
> > the direct pointers use is most efficient option.
> > But in the area of smallest footprint systems there are many
> > alternatives to RTEMS - MBED, Nuttx etc.
>
> my goal is to get it smaller compared to what we have now. I don't want
> the smallest system on the market. This would be only a side-effect, the
> main purpose of the self-contained objects is performance and an easier
> configuration. I think the conditional compilation in <rtems/confdefs.h>
> has reached a problematic complexity.

Yes, I understand. I am only trying to keep the complete picture in mind.

> > So my suggestion is to take all these use cases into consideration.
> > It should not be taken as the hard requirement, if really means
> > unacceptable overhead for common use cases but should be considered.
>
> I don't want to change the existing APIs. This object identifier
> infrastructure is fine, but it was designed for a specific purpose, e.g.
> to enable a platform that supports asymmetric multiprocessing (the RTEMS
> MPCI support). With SMP we see now its limitations. A complex SMP
> application like the FreeBSD network stack uses hundreds of locks and
> the protection area of the locks are quite small. The lock/unlock
> sequence of an uncontested mutex is absolutely performance critical.

Understood, and when the mutex is fast, even the probability of
contention on mutexes drops radically.

> > My feeling is that locking case with contention/wait should be
> > implemented the way that it allows future privilege separation
> > of scheduler/system core from applications as well as memory
> > context separation or use of hypervisors calls for wait.
> >
> > So I suggest to consider architecture similar to Linux FUTEX.
> >
> > http://www.akkadia.org/drepper/futex.pdf
> >
> > and for mutex implementation use this. May it be, add even
> > in the mutex structure field for identifier/RTEMS object ID.
> > At least for debug build, it would be great, if there is
> > in TCB (or in the case of kernel/user separation) well known
> > TLS variable which would hold pointer to the specific thread
> > taken mutex chain.
>
> For the optimized OpenMP support I use the Linux futex barrier
> implementation of libgomp and added two futex calls for RTEMS (see
> attached file of first e-mail). The performance is really good. For the
> mutex and semaphore objects, however, I don't use the futex approach of
> libgomp. Futexes have excellent properties for average case systems,
> e.g. they provide for example random fairness. RTEMS is supposed to be a
> real-time operating system. So, here random fairness is not enough,
> instead we need FIFO fairness.

RT properties are critical for RTEMS, and I would even prefer to
define the default mutex behavior to be equivalent to prioinherit
(ideally compatible with an EDF policy). The only other usable
type that is justified in some situations is priority ceiling or
Stack Resource Policy (SRP).
The Linux kernel internally uses only prioinherit mutexes in the
RT preempt variant. I would consider the inclusion of a mutex without
prioinherit in any library a potential bug.

So I think that the mutexes used by STDLIB and OpenMP, and in all
other places, should use prioinherit. prioinherit can be combined
with FUTEXes:

https://www.kernel.org/doc/Documentation/pi-futex.txt

The problem is the combination with RWlocks.
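
To make the FUTEX part concrete, here is a minimal sketch of the
user-space fast path as I understand it from pi-futex.txt: the lock
word goes from 0 to the owner thread ID by one atomic operation and
the kernel is entered only on contention. The names pi_mutex_sketch,
pi_mutex_lock, pi_mutex_unlock and pi_mutex_slow_lock are only my
illustration, not an existing RTEMS or Linux API, and the slow path
is a placeholder which only counts its invocations.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical lock word: 0 means unlocked, otherwise the owner thread ID. */
typedef struct {
  _Atomic uint32_t owner;
} pi_mutex_sketch;

/* Placeholder for the kernel slow path (a FUTEX_LOCK_PI style call).
   It only counts invocations so that the sketch stays self-contained. */
static unsigned slow_path_calls;

static void pi_mutex_slow_lock(pi_mutex_sketch *m, uint32_t tid)
{
  (void) m;
  (void) tid;
  ++slow_path_calls;
}

/* Returns true if the mutex was acquired on the uncontested fast path.
   Returns false if the (placeholder) slow path had to be entered. */
static bool pi_mutex_lock(pi_mutex_sketch *m, uint32_t tid)
{
  uint32_t expected = 0;

  assert(tid != 0); /* 0 is reserved for the unlocked state */

  if (atomic_compare_exchange_strong_explicit(&m->owner, &expected, tid,
                                              memory_order_acquire,
                                              memory_order_relaxed)) {
    return true; /* no syscall, no queue operation */
  }

  pi_mutex_slow_lock(m, tid);
  return false;
}

/* Returns true if the mutex was released on the fast path. A false result
   means the word no longer holds plain tid (for example waiters are marked)
   and a real implementation would have to ask the kernel to hand over. */
static bool pi_mutex_unlock(pi_mutex_sketch *m, uint32_t tid)
{
  uint32_t expected = tid;

  return atomic_compare_exchange_strong_explicit(&m->owner, &expected, 0,
                                                 memory_order_release,
                                                 memory_order_relaxed);
}
```

The uncontested lock/unlock pair is then two atomic operations on
memory owned by the application, which is what keeps it cheap.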

> > But management of this is not so easy
> > if mutexes can be released in the different order than locked.
>
> We should not allow this. Such lock order reversals are bad.

I agree that this limitation makes applications saner
and the kernel simpler. But I fear that it goes against the POSIX
standard and that there can be applications which do not follow
that scheme. Personally, I have no problem with the ordering requirement.
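
To illustrate the difference, a rough sketch of the per-thread chain
of held mutexes follows. With strict LIFO release the chain is
a simple stack, while release in arbitrary order needs a walk over
the chain. All names here (held_mutex, held_chain, chain_push,
chain_pop_lifo, chain_remove_any) are hypothetical and only for
illustration. In a real system the chain head would live in the TCB
or in a well-known TLS variable, here I use plain C11 _Thread_local.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mutex with an intrusive link for the owner's chain. */
typedef struct held_mutex {
  struct held_mutex *next_held; /* next older mutex held by the same thread */
  int id;                       /* optional identifier, e.g. for debugging */
} held_mutex;

/* Chain of mutexes held by the current thread, most recently taken first. */
static _Thread_local held_mutex *held_chain;

/* Record a freshly acquired mutex at the head of the chain. */
static void chain_push(held_mutex *m)
{
  assert(m != NULL);
  m->next_held = held_chain;
  held_chain = m;
}

/* Strict LIFO release: only the most recently taken mutex may go, O(1). */
static bool chain_pop_lifo(held_mutex *m)
{
  if (held_chain != m) {
    return false; /* lock order reversal detected */
  }

  held_chain = m->next_held;
  m->next_held = NULL;
  return true;
}

/* Release in arbitrary order: search the chain and unlink, O(n). */
static bool chain_remove_any(held_mutex *m)
{
  for (held_mutex **pp = &held_chain; *pp != NULL; pp = &(*pp)->next_held) {
    if (*pp == m) {
      *pp = m->next_held;
      m->next_held = NULL;
      return true;
    }
  }

  return false; /* mutex is not held by this thread */
}
```

A debug build could use chain_pop_lifo() to report order reversals,
while a POSIX compatible build would have to fall back to
chain_remove_any().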

> > So it is not simple single locked list. But mapping which
> > mutexes are held by given thread is required even for priority
> > inheritance. This structure has to be kept in user manipulated
> > data only if we do not want overhead of the syscall in the future.
> > But all that is manageable and has been solved for FUTEX base
> > OS API.
...
> For the network stack, OpenMP and SMP in general its not a question of
> faster. Its a question of by far too slow or good enough. We should
> decide if we want to use self-contained objects for the Newlib internal
> locks and the C11/C++11 thread support in GCC.

I think that using the same construct everywhere would be nice
(also for testing, analysis etc.). On the other hand, the network
stack can be considered a kernel-side part. But if we consider
microkernels, then it is not.

Best wishes,

              Pavel