Events sent to a task entering rtems_event_receive() may be lost
Mick Davis
mickd at microsol.iinet.net.au
Tue Jun 1 05:48:00 UTC 2004
Bug report; GNATS is down
Originator
Mick Davis
Organization
Microsol (Aust)
Confidential
no
Synopsis
Events sent to a task entering rtems_event_receive() may be lost
Severity
serious
Priority
medium
Category
rtems
Release
4.5.0 and 4.6.1 (see description)
Environment
Motorola ColdFire 5307 (m68k) target, Cygwin tools.
Description
Although we are using RTEMS 4.5.0, we have replaced the files event*.c
with those from 4.6.1 to address previously
reported problems with lost events.
The problem here is that events may be lost when sent to a task which enters
rtems_event_receive() with options set to wait
with a timeout and to return on receipt of any event.
The events are sent from an interrupt source such as a timer service
routine. If more than one set of events is sent to the
task before it returns, the first event set may be overwritten.
Debug logging in both the task which receives the events and the timer
service routine which sends them showed the following sequence:
1) Task entered rtems_event_receive (RTEMS_ALL_EVENTS, RTEMS_EVENT_ANY |
RTEMS_WAIT, TEST_INTERVAL, &rx_event)
2) Timer service routine sent event 1
3) Timer service routine sent event 2
4) Timer service routine sent event 3
5) Task returned successfully from rtems_event_receive() with events 2 and 3
6) Event 1 was never received with successive calls to rtems_event_receive()
A review of the event code led to the following analysis and successful fix.
The task enters the function rtems_event_receive() and then _Event_Seize()
in eventseize.c.
For the case of no pending events, the task does not immediately return from
this function. At line 91 of eventseize.c, the
function enables interrupts in order to begin a timeout to wait for sent
events. At the time the function enables interrupts,
the variable _Event_Sync_state has been set to EVENT_SYNC_NOTHING_HAPPENED
but the call to set the thread state to
STATES_WAITING_FOR_EVENT has not been made (Line 103).
If the task is now interrupted with a call from the timer service routine to
rtems_event_send(), execution continues in
eventsend.c.
The new events are set in the task's pending events variable, and the
service routine enters _Event_Surrender() in
eventsurrender.c.
Since the task is waiting for the new events, execution enters the block
if ( !_Event_sets_Is_empty( seized_events ) ) {
However, since the task has not yet set the waiting for event flag, the code
does not enter the block
if ( _States_Is_waiting_for_event( the_thread->current_state ) )
and proceeds to the switch statement for the variable _Event_Sync_state,
entering the case for EVENT_SYNC_NOTHING_HAPPENED.
Since the thread is executing, and the option to return on any event is set,
the Wait.return_argument is set to the value of
seized_events and these events are cleared from those which are pending. The
timer service routine then returns.
At this point, if the task resumes execution, the sent events will be
received correctly. However, if a second timer service
routine is pending, then execution will reenter _Event_Surrender(). The same
path may be followed, and since the second set of
new events will also satisfy the options used in the call to
rtems_event_receive(), they will also be written to the
Wait.return_argument and cleared from the pending events, which will
overwrite the value of Wait.return_argument set by the
first timer service routine. This means the events set by the first timer
service routine are not returned to the task, and
are lost.
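
The overwrite described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions: the names model_task, model_send and model_two_sends_in_window are hypothetical, the event values 0x1 and 0x2 are arbitrary, and the logic is a simplification of the EVENT_SYNC_NOTHING_HAPPENED path as described in this report, not the RTEMS source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model; none of these names come from RTEMS. */
typedef uint32_t event_set;

typedef struct {
    event_set pending;        /* models the task's pending events        */
    event_set wait_condition; /* models the_thread->Wait.count           */
    event_set return_value;   /* models the value behind Wait.return_argument */
    bool      wait_any;       /* models the RTEMS_EVENT_ANY option       */
} model_task;

/* Models rtems_event_send() from an interrupt, followed by the path through
 * _Event_Surrender() taken while the task has not yet been marked as
 * waiting for an event. */
void model_send(model_task *t, event_set ev)
{
    event_set seized;

    t->pending |= ev;
    seized = t->pending & t->wait_condition;

    if (seized != 0 && (t->wait_any || seized == t->wait_condition)) {
        t->pending &= ~seized;     /* events are cleared from pending ...  */
        t->return_value = seized;  /* ... and overwrite any earlier result */
    }
}

/* Two timer service routines fire inside the window. */
model_task model_two_sends_in_window(void)
{
    model_task t = { 0, 0xFFFFFFFFu, 0, true }; /* waiting for any event */

    model_send(&t, 0x1u); /* first service routine  */
    model_send(&t, 0x2u); /* second service routine */
    return t;
}
```

In this model, after both sends the return value holds only 0x2 and nothing is left pending, so the event 0x1 is no longer recorded anywhere.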
How-To-Repeat
Sample code is attached.
The problem may be demonstrated with a number of tasks which use
rtems_timer_fire_after() to run a service routine which
sends an event to the same task. The delay used to fire the timer should be
the same as the delay used to time out the call to
rtems_event_receive(), which is quickly re-entered. In this way, events are
being sent to the task as it enters
rtems_event_receive().
Fix
Diff file to eventsurrender.c is attached.
The fix I propose is similar to that used in PR584, which closed another
window. After writing the seized_events to
the_thread->Wait.return_argument at line 99 of eventsurrender.c, this
variable may be protected by clearing
the_thread->Wait.count. That is, in eventsurrender.c:
98a99
> (rtems_event_set) the_thread->Wait.count = 0;
Successive calls to rtems_event_send() will post events to the task's
pending_events, but these events will not be included
in the task's return because in _Event_Surrender() the event_condition will
be 0, and no event can satisfy the wait
condition.
This fix was found to return the events previously lost when using the test
code attached.
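
The effect of clearing the wait condition can be sketched with a small host-side model. The names fixed_task, fixed_send, fixed_two_sends_in_window and fixed_next_receive are hypothetical, the event values are arbitrary, and the logic is a simplification of the behaviour described in this report rather than the RTEMS source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical host-side model; none of these names come from RTEMS. */
typedef uint32_t event_set;

typedef struct {
    event_set pending;        /* models the task's pending events        */
    event_set wait_condition; /* models the_thread->Wait.count           */
    event_set return_value;   /* models the value behind Wait.return_argument */
    bool      wait_any;       /* models the RTEMS_EVENT_ANY option       */
} fixed_task;

/* Same surrender path as described in the report, with the proposed fix:
 * once a result has been written, the wait condition is cleared. */
void fixed_send(fixed_task *t, event_set ev)
{
    event_set seized;

    t->pending |= ev;
    seized = t->pending & t->wait_condition;

    if (seized != 0 && (t->wait_any || seized == t->wait_condition)) {
        t->pending &= ~seized;
        t->return_value = seized;
        t->wait_condition = 0;  /* proposed fix: later sends cannot match */
    }
}

/* Two timer service routines fire inside the window. */
fixed_task fixed_two_sends_in_window(void)
{
    fixed_task t = { 0, 0xFFFFFFFFu, 0, true }; /* waiting for any event */

    fixed_send(&t, 0x1u); /* first service routine  */
    fixed_send(&t, 0x2u); /* second service routine */
    return t;
}

/* Models a later receive call that finds events already pending. */
event_set fixed_next_receive(fixed_task *t, event_set wanted)
{
    event_set seized = t->pending & wanted;

    t->pending &= ~seized;
    return seized;
}
```

In this model the first receive returns 0x1, the event 0x2 stays pending, and a following call to fixed_next_receive() picks it up, so neither event is dropped.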
Mick Davis
Microsol (Aust) Pty Ltd
mickd AT microsol.iinet.net.au