Synchronization problem in message queue mechanism
antohecezar at yahoo.com
Thu Aug 22 13:35:24 UTC 2013
Thank you very much for the response. Indeed, we are using rtems_message_queue_send in an interrupt routine (and also some finite timeouts for receive).
Yes, we also have nested interrupts: T2 can be interrupted by T1.
We will test with your template to see if we can reproduce the problem.
I will let you know how it develops.
Thank you very much,
From: Sebastian Huber <sebastian.huber at embedded-brains.de>
To: rtems-users at rtems.org
Sent: Thursday, August 22, 2013 3:56 PM
Subject: Re: Synchronization problem in message queue mechanism
There was a PR related to message queues:
It was fixed in 4.10.2, but not in 4.10.1. So this may explain why it takes
longer in 4.10.2 to get into trouble.
I remember that there was a similar problem with a NULL pointer access in the
If I compare the functions _Event_Timeout() and _Thread_queue_Process_timeout(),
I am a bit surprised that _Thread_queue_Process_timeout() doesn't use
_ISR_Disable/Enable() to protect the access to the_thread_queue->sync_state.
At first glance this looks like a major bug.
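
The concern above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the RTEMS implementation: the type model_thread_queue, the state names, and the functions model_isr_disable(), model_isr_enable() and model_process_timeout() are all hypothetical stand-ins, and interrupt masking is modelled with a nesting counter rather than a real CPU flag. It shows the idea of doing the read and the update of sync_state inside one critical section, so that a nested interrupt cannot change the state between the test and the write.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the thread queue synchronization states. */
typedef enum {
  MODEL_SYNCHRONIZED,      /* the thread is fully enqueued and blocked */
  MODEL_NOTHING_HAPPENED,  /* the thread is still in the middle of blocking */
  MODEL_TIMEOUT,
  MODEL_SATISFIED
} model_sync_state;

typedef struct {
  model_sync_state sync_state;
  int extract_count;       /* how often a waiter was taken off the queue */
} model_thread_queue;

/* Model of interrupt masking: a nesting depth instead of a CPU flag. */
static int model_isr_depth;

static int model_isr_disable(void)
{
  return model_isr_depth++;   /* return the previous level */
}

static void model_isr_enable(int level)
{
  model_isr_depth = level;    /* restore the previous level */
}

/*
 * Timeout handler sketch: the test of sync_state and the update that
 * follows from it happen inside one critical section.
 * Returns 1 if the waiter was extracted, 0 otherwise.
 */
int model_process_timeout(model_thread_queue *q)
{
  int extracted = 0;
  int level = model_isr_disable();

  assert(model_isr_depth > 0);  /* we are inside the critical section */

  if (q->sync_state != MODEL_SYNCHRONIZED) {
    /* The thread has not finished blocking: only record the timeout,
       unless the request was already satisfied. */
    if (q->sync_state != MODEL_SATISFIED) {
      q->sync_state = MODEL_TIMEOUT;
    }
  } else {
    /* The thread is really blocked: take it off the queue. */
    q->extract_count++;
    extracted = 1;
  }

  model_isr_enable(level);
  return extracted;
}
```

In this model a queue in the "nothing happened" state is only marked as timed out, while a synchronized queue has its waiter extracted. Whether the real function needs exactly this shape is an open question here.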
I added a test case for the RTEMS event problem:
It should be possible to use this as a template to reproduce your message queue
On 2013-08-22 14:14, Cezar Antohe wrote:
> Hello guys,
> We have been using RTEMS version 4.10.1 in a clinical care medical unit, and we
> believe there may be a synchronization problem in the message queue mechanisms.
> We've observed that sometimes the values in the TCB of the currently running
> thread are no longer valid.
> Let me give you 2 examples:
> 1. In function "rtems_message_queue_receive" there is a call to
> "_Message_queue_Translate_core_message_queue_return_code" with input
> parameter "_Thread_Executing->Wait.return_code".
> This parameter gets corrupted after the unit has been running for some hours.
> Looking into the code for "_Message_queue_Translate_core_message_queue_return_code",
> the input should be less than 6; however, return_code holds 13, which is out
> of bounds for the array and invalid.
> 2. Another bad situation happens in the "_Thread_queue_Timeout" function when
> calling "_Thread_queue_Process_timeout": the input parameter
> "Thread_Control *the_thread" has its Wait.queue set to NULL. No check on that
> queue pointer is made in "_Thread_queue_Process_timeout", which then tries to
> access a NULL pointer.
> We are no experts in RTEMS functionality and we haven't modified anything in
> the current RTEMS code. However, we've noticed that the problem seems to appear
> when a thread consumes the messages from the queue and sets the queue to NULL,
> then another thread calls queue insertion and wakes the first thread, but the
> first thread's queue remains NULL.
> We are running tests with patches for RTEMS version 4.10.2. The problem still
> exists, but it is diminished, meaning it appears after a longer running time
> of the infusing unit.
> Any help / idea / fast debug RTEMS method would be very much appreciated.
> Thank you very much,
> Cezar Antohe
> rtems-users mailing list
> rtems-users at rtems.org
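
The two symptoms in the quoted report (a return code of 13 used as a table index, and a NULL Wait.queue reaching the timeout path) could be caught earlier with defensive checks. The following is a minimal sketch, assuming a six-entry translation table as the report suggests. All names (model_translate_return_code, model_handle_timeout, MODEL_INTERNAL_ERROR and the struct types) are hypothetical, and the table values are placeholders, not real rtems_status_code values. Checks like these only detect the corruption. They do not remove the race that causes it.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from the report: valid core return codes are 0..5. */
#define MODEL_STATUS_COUNT 6u
#define MODEL_INTERNAL_ERROR (-1)

/* Placeholder translation table; the real status values differ. */
static const int model_translate_table[MODEL_STATUS_COUNT] = {
  0, 1, 2, 3, 4, 5
};

/* Translate a core return code, rejecting out-of-range input instead of
   indexing past the end of the table. */
int model_translate_return_code(uint32_t core_code)
{
  if (core_code >= MODEL_STATUS_COUNT) {
    return MODEL_INTERNAL_ERROR;
  }
  return model_translate_table[core_code];
}

/* Hypothetical stand-ins for the wait information kept in a thread's TCB. */
typedef struct {
  int timeout_status;
} model_wait_queue;

typedef struct {
  model_wait_queue *queue;   /* NULL when the thread waits on no queue */
  int return_code;
} model_wait_info;

/* Timeout path sketch: returns 1 if the timeout was applied, 0 if the
   thread was no longer waiting on any queue. */
int model_handle_timeout(model_wait_info *wait)
{
  if (wait == NULL || wait->queue == NULL) {
    return 0;                /* nothing to time out; do not dereference */
  }
  wait->return_code = wait->queue->timeout_status;
  return 1;
}
```

With these checks a corrupted code such as 13 maps to the error sentinel, and a thread whose queue pointer is already NULL is skipped and its return code left untouched.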
Sebastian Huber, embedded brains GmbH
Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone : +49 89 189 47 41-16
Fax : +49 89 189 47 41-09
E-Mail : sebastian.huber at embedded-brains.de
PGP : Public key available on request.
This message is not a business communication within the meaning of the EHUG.