Synchronization problem in message queue mechanism

Joel Sherrill joel.sherrill at
Thu Aug 22 13:23:10 UTC 2013

On 8/22/2013 7:56 AM, Sebastian Huber wrote:
> Hello,
> there was a PR related to message queues:
> It was fixed in 4.10.2, but not in 4.10.1.  So this may explain why it takes
> longer in 4.10.2 to get into trouble.
> I remember that there was a similar problem with a NULL pointer access in the
> RTEMS events.
> If I compare the functions _Event_Timeout() and _Thread_queue_Process_timeout()
> I am a bit surprised that _Thread_queue_Process_timeout() doesn't use
> _ISR_Disable/Enable() to protect the access to the_thread_queue->sync_state.
> At first glance this looks like a major bug.
The assumption is that _Thread_queue_Process_timeout() is called from the
clock tick ISR, but thinking about it, that doesn't prevent a nested
interrupt from occurring.

Does this system nest interrupts?
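
To make the concern concrete, here is a minimal host-side sketch of a
timeout handler that reads and updates the sync state inside an
interrupt-disabled section, and that also checks the queue pointer before
using it. It is a simplified model, not the RTEMS source: every name
prefixed with model_ or MODEL_, the SYNC_ states, and the nest-level
counter standing in for _ISR_Disable/Enable() are assumptions made for
illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the blocking-operation sync states. */
typedef enum {
  SYNC_SYNCHRONIZED,
  SYNC_NOTHING_HAPPENED,
  SYNC_TIMEOUT,
  SYNC_SATISFIED
} model_sync_state_t;

enum { MODEL_STATUS_OK = 0, MODEL_STATUS_TIMEOUT = 1 };

typedef struct {
  model_sync_state_t sync_state;
} model_queue_t;

typedef struct {
  model_queue_t *queue;     /* models the_thread->Wait.queue */
  int            return_code;
  bool           extracted; /* set when the thread is taken off the queue */
} model_thread_t;

/* Models the interrupt mask: greater than zero means "interrupts disabled". */
static int model_isr_nest_level;

static unsigned model_isr_disable(void)
{
  return (unsigned) model_isr_nest_level++;
}

static void model_isr_enable(unsigned level)
{
  model_isr_nest_level = (int) level;
}

/* Returns false if the thread was no longer waiting on any queue. */
static bool model_process_timeout(model_thread_t *the_thread,
                                  const model_thread_t *executing)
{
  bool handled = false;
  unsigned level = model_isr_disable();   /* begin critical section */
  model_queue_t *q = the_thread->queue;

  if (q != NULL) {
    if (q->sync_state != SYNC_SYNCHRONIZED && the_thread == executing) {
      /* The thread is in the middle of blocking: only record the timeout. */
      if (q->sync_state != SYNC_SATISFIED) {
        the_thread->return_code = MODEL_STATUS_TIMEOUT;
        q->sync_state = SYNC_TIMEOUT;
      }
    } else {
      /* The thread is fully blocked: time it out and take it off the queue. */
      the_thread->return_code = MODEL_STATUS_TIMEOUT;
      the_thread->queue = NULL;
      the_thread->extracted = true;
    }
    handled = true;
  }

  model_isr_enable(level);                /* end critical section */
  return handled;
}
```

In this model a nested interrupt cannot observe or change sync_state between
the test and the update, because both happen with interrupts masked. Whether
that is the right fix for the real code depends on the answer to the
question above.
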
> I added a test case for the RTEMS event problem:
> It should be possible to use this as a template to reproduce your message queue
> problem.
> On 2013-08-22 14:14, Cezar Antohe wrote:
>> Hello guys,
>> We have been using RTEMS 4.10.1 version in a clinical care med unit, and we
>> believe there may be a synchronization problem in the message queue mechanisms.
>> We've observed that sometimes, the values from the currently running thread TCB
>> table are not valid anymore.
>> Let me give you 2 examples:
>> 1. In function "rtems_message_queue_receive" there is a call to
>> "_Message_queue_Translate_core_message_queue_return_code" with input
>> parameter "_Thread_Executing->Wait.return_code".
>> This parameter gets corrupted after some hours of unit operation. Looking
>> into the code for "_Message_queue_Translate_core_message_queue_return_code",
>> the input should be a value less than 6; however, return_code holds 13, which
>> is out of bounds for the array and invalid.
>> 2. Another bad situation happens in the "_Thread_queue_Timeout" function, when
>> calling "_Thread_queue_Process_timeout" - the input parameter
>> "Thread_Control *the_thread" has its Wait.queue set to NULL. No check on that
>> queue pointer is made in the "_Thread_queue_Process_timeout" function, which
>> then tries to access a NULL pointer.
>> We are no experts in RTEMS functionality and we haven't modified anything in
>> the current RTEMS code. However, we've noticed that the problem seems to appear
>> when a thread consumes the messages from the queue and sets the queue to NULL;
>> another thread then calls queue insertion and wakes the first thread, but its
>> queue remains NULL.
>> We are running tests with patches for RTEMS version 4.10.2. The problem still
>> exists but is diminished, meaning it appears after a longer operating time
>> of the infusing unit.
>> Any help / idea / fast debug RTEMS method would be very much appreciated.
>> Thank you very much,
>> Cezar Antohe
>> _______________________________________________
>> rtems-users mailing list
>> rtems-users at
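
For the first example in the report (a return code of 13 indexing past the
translation table), a defensive lookup along these lines may help catch the
corruption at the point of use while the root cause is being tracked down.
This is only a sketch: the table size of 6 is taken from the report, and the
model_ names, the table contents, and the error value are placeholders rather
than the RTEMS definitions.

```c
#include <stddef.h>

/* Assumption: the report says valid core return codes are less than 6. */
#define MODEL_CORE_STATUS_COUNT 6u

/* Placeholder value returned when the core code is out of range. */
#define MODEL_STATUS_INTERNAL_ERROR (-1)

/* Placeholder translation table; the real table holds status codes. */
static const int model_status_table[MODEL_CORE_STATUS_COUNT] = {
  0, 101, 102, 103, 104, 105
};

/* Translate a core return code, rejecting values outside the table. */
static int model_translate_return_code(unsigned core_code)
{
  if (core_code >= MODEL_CORE_STATUS_COUNT) {
    return MODEL_STATUS_INTERNAL_ERROR;
  }
  return model_status_table[core_code];
}
```

A check like this does not fix the underlying race; it only turns a silent
out-of-bounds read into a detectable error that can be logged or trapped in
a debugger.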

Joel Sherrill, Ph.D.             Director of Research & Development
joel.sherrill at        On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
Support Available                (256) 722-9985
