Self-contained POSIX synchronization objects for RTEMS 4.12?

Sebastian Huber sebastian.huber at embedded-brains.de
Tue Sep 19 17:43:17 UTC 2017



----- On 19 Sep 2017 at 16:40, Gedare Bloom gedare at rtems.org wrote:

> On Tue, Sep 19, 2017 at 10:09 AM, Joel Sherrill <joel at rtems.org> wrote:
>>
>>
>> On Tue, Sep 19, 2017 at 8:16 AM, Sebastian Huber
>> <sebastian.huber at embedded-brains.de> wrote:
>>>
>>> Hello,
>>>
>>> we have to make some trade-offs in the implementation with respect to the
>>> error checking. The operations get a pointer to the synchronization object,
>>> e.g.
>>>
>>> int sem_post(sem_t *sem);
>>>
>>> int pthread_mutex_lock(pthread_mutex_t *mutex);
>>>
>>> Do we want to check for NULL pointers?
>>
>>
>> Is newlib consistently using the non-NULL declaration on the argument?

No, I haven't added this yet.  The GCC nonnull attribute seems to enable dangerous optimizations, so it was disabled on FreeBSD.

https://github.com/freebsd/freebsd/commit/7cf8787338d19546498331daa53d0d0a50a56178#diff-be62a312cb99ea95d7cdebd37c77b35d
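
To illustrate the concern, here is a minimal sketch, assuming GCC or Clang attribute syntax. The type demo_sem and both functions are hypothetical and are not Newlib or RTEMS code. With the attribute in place, the compiler is allowed to treat the NULL check as dead code and drop it when optimizing; it may also warn about the comparison.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a synchronization object, not an RTEMS type. */
typedef struct {
  unsigned int count;
} demo_sem;

/*
 * The attribute promises that "sem" is never NULL.  The optimizer may rely
 * on that promise and remove the defensive check below, so a NULL argument
 * can end up being dereferenced.
 */
__attribute__((__nonnull__(1)))
int demo_sem_post_nonnull(demo_sem *sem)
{
  if (sem == NULL) { /* may be optimized away */
    return EINVAL;
  }

  ++sem->count;
  return 0;
}

/* Without the attribute the check is always kept. */
int demo_sem_post_checked(demo_sem *sem)
{
  if (sem == NULL) {
    return EINVAL;
  }

  ++sem->count;
  return 0;
}
```

Only demo_sem_post_checked() can be relied on to return EINVAL for a NULL argument; for the attributed variant the result depends on compiler and optimization level.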


>>
>> Newlib/Cygwin folks have been pretty insistent that they do not want to
>> check for NULL on the shared methods.
>>
>> Personally I hate to delete NULL argument pointer checks. Would a
>> long-term compromise be to move them to argument checking macros
>> that are enabled with --enable-rtems-debug?
>>
>>>
>>>
>>> Do we want to check for other obviously invalid pointer values, e.g.
>>> SEM_FAILED?
>>
>>
>> IMO Yes

With the move to self-contained objects, the storage space management moves from the system to the user.  This makes it hard for the system to validate things.  I am not sure we should check for error conditions that shouldn't be present in production code, so making these checks RTEMS_DEBUG dependent is worth considering.  Maybe we need an RTEMS_ROBUST option focusing on user-introduced errors.  RTEMS_DEBUG itself enables a lot of internal consistency checks.
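
A rough sketch of how such optional argument checks could look. Everything here is hypothetical: DEMO_ROBUST stands in for the RTEMS_ROBUST idea and is not an existing RTEMS option, and demo_sem and DEMO_SEM_FAILED are placeholders for sem_t and SEM_FAILED.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical build option; remove this line to compile the checks away. */
#define DEMO_ROBUST 1

/* Hypothetical stand-in for sem_t. */
typedef struct {
  unsigned int count;
} demo_sem;

/* Hypothetical stand-in for SEM_FAILED: a unique, never-used address. */
static demo_sem demo_sem_failed_object;
#define DEMO_SEM_FAILED (&demo_sem_failed_object)

#if defined(DEMO_ROBUST)
/* Reject obviously invalid pointers supplied by the user. */
#define DEMO_VALIDATE_SEM(sem) \
  do { \
    if ((sem) == NULL || (sem) == DEMO_SEM_FAILED) { \
      return EINVAL; \
    } \
  } while (0)
#else
/* Production build: no check, correct callers pay nothing. */
#define DEMO_VALIDATE_SEM(sem) do { (void) (sem); } while (0)
#endif

int demo_sem_post(demo_sem *sem)
{
  DEMO_VALIDATE_SEM(sem);
  ++sem->count;
  return 0;
}
```

With DEMO_ROBUST defined, demo_sem_post(NULL) returns EINVAL; without it, the same call is undefined behaviour, as with the unchecked glibc approach.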

>>
>>>
>>> Do we want to check if the object has been initialized?
>>
>>
>> Yes. We want to have predictable behavior.
>>
>>>
>>>
>>>
>>>
>>> glibc uses no checks at all.
>>
>>
>> Performance over correctness? That doesn't seem like a good trade.
>>
> It's actually performance assuming correctness. 

Yes, it's not about correctness.  It's about different implementations of undefined behaviour.  Correct applications suffer from these error checks.

> In modern multicore
> software the lock acquire is a serious critical path. I don't know
> that RTEMS is quite to the point it matters so much, but eventually
> even some targets will prefer the optimized fast path of lock acquire.
> It's up to the caller to check that previous calls to create/init the
> object succeeded, and the main complication then is if the
> synchronization object is used after it is destroyed.
> 
> If possible we should provide control over the trade-off, e.g., with a
> debug flag for some checks.
> 
>>>
>>>
>>> FreeBSD checks that the object has been initialized. For this purpose it
>>> embeds a magic value field in the object structure. The drawback is that if
>>> we also do this, the objects are not zero-initialized and thus cannot reside
>>> in the BSS section.
>>
>>
>> This is an impossible trade. Some systems have big Flash and small RAM.
>> Others are the opposite.
>>
>> I would rather follow the FreeBSD model and know the object is initialized.
>>
> zero-initialized should also be check-able?

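The RAM size doesn't matter.  It's about the ROM size: an object with a non-zero initializer ends up in .data, and its initial image has to be stored in ROM.  For a magic value check, we could use some magic auto-initialization, e.g.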

if (obj->magic != ok)
   check if completely zero, then set magic and ok, otherwise error
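
Spelled out in C, the idea could look roughly like this. It is a sketch under assumptions: demo_sem, its fields, and DEMO_MAGIC are made up for illustration, and the first-use path is not thread-safe as written (a real implementation would need an atomic update or a lock).

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Arbitrary illustrative value, not taken from FreeBSD or RTEMS. */
#define DEMO_MAGIC 0x53454d41u

/* Hypothetical self-contained object with a magic field. */
typedef struct {
  unsigned int magic;
  unsigned int count;
  void *wait_queue_head;
} demo_sem;

/* Returns true if every byte of the object is zero. */
static bool demo_is_all_zero(const demo_sem *sem)
{
  const unsigned char *p = (const unsigned char *) sem;
  size_t i;

  for (i = 0; i < sizeof(*sem); ++i) {
    if (p[i] != 0) {
      return false;
    }
  }

  return true;
}

int demo_sem_post(demo_sem *sem)
{
  if (sem->magic != DEMO_MAGIC) {
    /* Neither initialized nor a pristine zeroed object: report an error. */
    if (!demo_is_all_zero(sem)) {
      return EINVAL;
    }

    /* Zeroed (e.g. BSS) object: auto-initialize on first use. */
    sem->magic = DEMO_MAGIC;
  }

  ++sem->count;
  return 0;
}
```

An object with static storage duration and no initializer stays in the BSS section and is accepted on first use, while an object with garbage contents is rejected with EINVAL.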

> 
>>>
>>>
>>> Destruction of synchronization objects in use is undefined behaviour
>>> according to POSIX. Do we want to flush waiting threads during destruction?
>>> This is a complex operation.
>>
>>
>> We have over 20 years of defined behavior in this case. I think
>> we flush and return the same error we always did. Otherwise,
>> we get a deadlock.
>>
> Is this true for all current synchronization objects we have? Where in
> the code does this happen?
> 
> If the destroyed object is easily identifiable, then waiting threads
> can be denied the acquire without having to flush at time of object
> destruction. (What about threads in the critical section of non-mutex
> locks, e.g., counting semaphores?)

Applications with objects that are in use during destruction are bound to have use-after-free problems.  The flush doesn't really help here.  It should be a fatal error or assert.

The object life-time safety provided by uni-processor RTEMS with system-managed objects can no longer be guaranteed with self-contained objects or on SMP.  You need a global lock for this.

> 
>>>
>>>
>>> What do you think?
>>>
>>>
>>> --
>>> Sebastian Huber, embedded brains GmbH
>>>
>>> Address : Dornierstr. 4, D-82178 Puchheim, Germany
>>> Phone   : +49 89 189 47 41-16
>>> Fax     : +49 89 189 47 41-09
>>> E-Mail  : sebastian.huber at embedded-brains.de
>>> PGP     : Public key available on request.
>>>
>>> This message is not a business communication within the meaning of the EHUG.
>>>
>>> _______________________________________________
>>> devel mailing list
>>> devel at rtems.org
>>> http://lists.rtems.org/mailman/listinfo/devel
>>
>>
>>


