Synchronization problem in message queue mechanism

Cezar Antohe antohecezar at yahoo.com
Thu Aug 22 13:35:24 UTC 2013


Hi,

Thank you very much for the response. Indeed, we are using rtems_message_queue_send in an interrupt routine, and we also use finite timeouts for receive (roughly the pattern sketched below).
Yes, we also have nested interrupts: T2 can be interrupted by T1.
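
For reference, this is a minimal sketch of the usage pattern, not our actual code; the names, the message type, and the 100 tick timeout are made up for illustration:

#include <rtems.h>

/* Illustrative names only. */
static rtems_id msg_queue_id;

typedef struct {
  uint32_t sample;
} sensor_msg_t;

/* Called from interrupt context: the send must never block here. */
static void sensor_isr_handler(void *arg)
{
  sensor_msg_t msg = { .sample = 42 };

  (void) rtems_message_queue_send(msg_queue_id, &msg, sizeof(msg));
}

/* Consumer task: receive with a finite timeout given in clock ticks. */
static rtems_task consumer_task(rtems_task_argument arg)
{
  sensor_msg_t msg;
  size_t size;
  rtems_status_code sc;

  while (1) {
    sc = rtems_message_queue_receive(
      msg_queue_id, &msg, &size, RTEMS_WAIT, 100
    );

    if (sc == RTEMS_TIMEOUT) {
      continue;  /* timeout path: this is where we suspect the race */
    }

    /* ... process msg ... */
  }
}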

We will test with your template to see if we can reproduce the problem.
I will let you know how it develops.

Thank you very much,
Cezar Antohe


________________________________
 From: Sebastian Huber <sebastian.huber at embedded-brains.de>
To: rtems-users at rtems.org 
Sent: Thursday, August 22, 2013 3:56 PM
Subject: Re: Synchronization problem in message queue mechanism
 

Hello,

there was a PR related to message queues:

https://www.rtems.org/bugzilla/show_bug.cgi?id=1961

It was fixed in 4.10.2, but not in 4.10.1.  This may explain why it takes 
longer for 4.10.2 to run into trouble.

I remember that there was a similar problem with a NULL pointer access in the 
RTEMS events.

If I compare the functions _Event_Timeout() and _Thread_queue_Process_timeout(), 
I am a bit surprised that _Thread_queue_Process_timeout() doesn't use 
_ISR_Disable()/_ISR_Enable() to protect the access to the_thread_queue->sync_state. 
At first glance this looks like a major bug.
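
To make the concern concrete, the kind of guard I would expect is along these lines.  This is a simplified sketch from memory, not the actual sources, and the state names are only approximations of what the score uses; the point is that the test and update of sync_state happen with interrupts disabled, so a timeout firing from the clock tick interrupt cannot interleave with the blocking thread's own state transition:

ISR_Level level;

_ISR_Disable( level );
if ( the_thread_queue->sync_state == THREAD_BLOCKING_OPERATION_NOTHING_HAPPENED ) {
  /* The blocking thread is still inside its critical section: only
   * record the timeout and let it clean up on its own.
   */
  the_thread_queue->sync_state = THREAD_BLOCKING_OPERATION_TIMEOUT;
  _ISR_Enable( level );
} else {
  _ISR_Enable( level );

  /* Otherwise the timeout can be processed as usual. */
}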

I added a test case for the RTEMS event problem:

http://git.rtems.org/rtems/commit/?id=57f125d02595661b72d66f27b6f71c9b9579f516

It should be possible to use this as a template to reproduce your message queue 
problem.
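
The basic shape would be something like the rough sketch below (not the actual test).  It assumes queue_id and timer_id were created elsewhere with rtems_message_queue_create() and rtems_timer_create(); the idea is that a classic API timer routine runs from the clock tick interrupt and sends into the queue while the task blocks with a one tick timeout, so the timeout expiry and the send keep racing against each other:

#include <rtems.h>

static rtems_id queue_id;
static rtems_id timer_id;

/* Timer service routines of classic API timers run from the clock tick
 * interrupt, so this send happens in interrupt context.
 */
static rtems_timer_service_routine send_from_tick(rtems_id timer, void *arg)
{
  uint32_t msg = 0xdeadbeef;

  (void) rtems_message_queue_send(queue_id, &msg, sizeof(msg));
  (void) rtems_timer_fire_after(timer, 1, send_from_tick, NULL);
}

static void try_to_hit_the_window(void)
{
  uint32_t buf;
  size_t size;
  int i;

  (void) rtems_timer_fire_after(timer_id, 1, send_from_tick, NULL);

  /* Block with a one tick timeout, over and over again. */
  for (i = 0; i < 100000; ++i) {
    (void) rtems_message_queue_receive(queue_id, &buf, &size, RTEMS_WAIT, 1);
  }
}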

On 2013-08-22 14:14, Cezar Antohe wrote:
>
> Hello guys,
>
> We have been using RTEMS version 4.10.1 in a clinical care medical unit, and we
> believe there may be a synchronization problem in the message queue mechanism.
> We have observed that sometimes the values in the TCB of the currently running
> thread are no longer valid.
> Let me give you two examples:
>
> 1. In the function "rtems_message_queue_receive" there is a call to
> "_Message_queue_Translate_core_message_queue_return_code" with the input
> parameter "_Thread_Executing->Wait.return_code".
> This parameter gets corrupted after some hours of operation. Looking at the
> code of "_Message_queue_Translate_core_message_queue_return_code", the input
> should be less than 6; however, return_code is 13, which is out of bounds for
> the translation array and therefore invalid.
>
> 2. Another bad situation happens in the "_Thread_queue_Timeout" function when
> it calls "_Thread_queue_Process_timeout": the input parameter
> "Thread_Control *the_thread" has a NULL Wait.queue. No check on that queue
> pointer is made in "_Thread_queue_Process_timeout", which then dereferences
> a NULL pointer.
>
> We are not experts in RTEMS internals and we have not modified the RTEMS code.
> However, we have noticed that the problem seems to appear when a thread consumes
> the messages from the queue and sets its queue pointer to NULL, then another
> thread inserts into the queue and wakes the first thread, yet its queue pointer
> remains NULL.
>
> We are running tests with patches from RTEMS version 4.10.2. The problem still
> exists, but it is diminished, meaning it appears only after a longer period of
> operation of the infusion unit.
>
> Any help, idea, or quick RTEMS debugging method would be very much appreciated.
>
> Thank you very much,
>
> Cezar Antohe
>
>
>


-- 
Sebastian Huber, embedded brains GmbH

Address : Dornierstr. 4, D-82178 Puchheim, Germany
Phone   : +49 89 189 47 41-16
Fax     : +49 89 189 47 41-09
E-Mail  : sebastian.huber at embedded-brains.de
PGP     : Public key available on request.

Diese Nachricht ist keine geschäftliche Mitteilung im Sinne des EHUG.
_______________________________________________
rtems-users mailing list
rtems-users at rtems.org
http://www.rtems.org/mailman/listinfo/rtems-users