Synchronization problem in message queue mechanism
antohecezar at yahoo.com
Thu Aug 22 12:14:45 UTC 2013
We have been using RTEMS version 4.10.1 in a clinical care med unit, and we believe there may be a synchronization problem in the message queue mechanism.
We've observed that the values in the currently running thread's TCB are sometimes no longer valid.
Let me give you 2 examples:
1. In function "rtems_message_queue_receive" there is a call to "_Message_queue_Translate_core_message_queue_return_code" with input parameter "_Thread_Executing->Wait.return_code".
This parameter becomes corrupted after the unit has been running for some hours. Looking at the code for "_Message_queue_Translate_core_message_queue_return_code", the input value should be less than 6; however, return_code holds 13, which indexes past the end of the array and is invalid.
2. Another failure occurs in the "_Thread_queue_Timeout" function when it calls "_Thread_queue_Process_timeout": the input parameter "Thread_Control *the_thread" has its Wait.queue set to NULL. "_Thread_queue_Process_timeout" makes no check on that queue pointer and tries to access a NULL pointer.
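To make the two symptoms concrete, here is a minimal standalone sketch of the kind of sanity checks we have in mind. It is not RTEMS code: every name prefixed with demo_ is hypothetical, the structures are simplified stand-ins for the real TCB fields, and the limit of 6 is only our reading of the translation function described above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real RTEMS definitions. */
#define DEMO_RETURN_CODE_LIMIT 6u /* assumption: valid codes are 0..5 */

struct demo_queue {
  int placeholder;
};

struct demo_wait_info {
  struct demo_queue *queue;       /* models the_thread->Wait.queue */
  unsigned int       return_code; /* models Wait.return_code */
};

enum demo_wait_fault {
  DEMO_WAIT_OK = 0,
  DEMO_WAIT_BAD_RETURN_CODE, /* symptom 1: code would index past the table */
  DEMO_WAIT_NULL_QUEUE       /* symptom 2: queue pointer is NULL */
};

/* Check to run before the return code is used as an array index. */
enum demo_wait_fault demo_check_return_code(const struct demo_wait_info *wait)
{
  if (wait->return_code >= DEMO_RETURN_CODE_LIMIT) {
    return DEMO_WAIT_BAD_RETURN_CODE;
  }
  return DEMO_WAIT_OK;
}

/* Check to run before timeout processing dereferences the queue pointer. */
enum demo_wait_fault demo_check_wait_queue(const struct demo_wait_info *wait)
{
  if (wait->queue == NULL) {
    return DEMO_WAIT_NULL_QUEUE;
  }
  return DEMO_WAIT_OK;
}
```

Checks like these would not fix the underlying race; they would only let us trap the first moment the wait information goes bad (for example with a breakpoint on the fault path) instead of crashing later.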
We are not experts in RTEMS internals, and we haven't modified anything in the current RTEMS code. However, we've noticed that the problem seems to appear in the following sequence: a thread consumes the messages from the queue and sets the queue to NULL; another thread calls queue insertion and wakes the first thread; the first thread's queue nevertheless remains NULL.
We are running tests with the patches for RTEMS version 4.10.2. The problem still exists, but it is diminished, meaning it appears only after the infusing unit has been running for a longer time.
Any help, ideas, or a quick method for debugging RTEMS would be very much appreciated.
Thank you very much,