POSIX Mutex Performance
Joel Sherrill
joel at OARcorp.com
Thu Mar 25 15:18:02 UTC 2004
Hi,
I have been thinking about this one. The biggest
thing is the first one. Other ideas and comments
follow.
+ We mentioned this problem earlier but now I think I
can add some meat. RTEMS and Linux mutexes are
different and have different feature sets. RTEMS supports
the priority inheritance and priority ceiling protocols;
I see no hint of either in the Linux pthread mutex code.
The default mutex type for RTEMS and Linux is also different.
Linux picks a "fast" type which performs no error checking
and will allow you to deadlock. RTEMS always error checks
and has attributes on the base mutex controlling whether
nesting is legal or an error. Linux will let you DEADLOCK
in the default case!!
RTEMS is a real-time operating system and wants to
make the application's execution predictable. This
means providing tools to detect deadlock, avoid priority
inversion, have application limits on resource usage, etc.
Linux wasn't designed to meet those goals and the feature
set in this area shows that.
+ The Classic API Mutexes (rtems_semaphore*) are a bit more
optimized. For sure, the POSIX API goes through a wrapper
function which could technically be avoided to save a few
instructions.
+ One person on the list noticed that your Linux times varied
fairly significantly between two reports, and I did not
see an explanation. You might want to double-check the
timing mechanism using something similar to the procedure
in the tmck test.
+ RTEMS and glibc pthreads have different design approaches,
which impact the create/destroy times. RTEMS uses an
opaque ID, so even if the user mangles or dereferences it,
a bad ID won't harm the associated OS memory. The Linux
pthread code returns direct pointers to the user. The pointer
approach is a bit faster but not as safe/robust. The other
thing to note with this approach is that RTEMS explicitly
initializes every field and reuses a user-configured,
finite set of objects. These have to be reinitialized on
every use, so create and destroy are going to be more expensive.
+ In looking at the current glibc source, I don't see how there
can be that much difference in the number of instructions
actually executed. From what I can tell, they do not inline
anything into the application and when you get to lock,
they make actual subroutine calls. I could be misreading this.
A mutex is a very well understood OS object, and assuming
that everyone implemented it well, the differences are
going to be in the default behavior selected, implementation
structure/overhead, safety checks, and the like.
--
Joel Sherrill, Ph.D. Director of Research & Development
joel at OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985