Events sent to a task entering rtems_event_receive() may be lost

Mick Davis mickd at microsol.iinet.net.au
Tue Jun 1 05:48:00 UTC 2004


Bug report; GNATS is down

Originator
    Mick Davis

Organization
    Microsol (Aust)

Confidential
    no

Synopsis
    Events sent to a task entering rtems_event_receive() may be lost

Severity
    serious

Priority
    medium

Category
    rtems

Release
    4.5.0 and 4.6.1 see description

Environment
    Motorola coldfire 5307 (m68k) target, cygwin tools.

Description
    Although we are using RTEMS 4.5.0, we have replaced the files event*.c
with those from 4.6.1 to address previously reported problems with lost
events.

The problem here is that events may be lost when sent to a task which enters
rtems_event_receive() with options set to wait with a timeout and to return
on receipt of any event.

The events are sent from an interrupt source such as a timer service
routine. If more than one set of events is sent to the task before it
returns, the first event set may be overwritten.

Using a debug logging function for the task which receives the event and the
timer service routine which sends the event, the following sequence was
observed:

1) Task entered rtems_event_receive (RTEMS_ALL_EVENTS, RTEMS_EVENT_ANY |
RTEMS_WAIT, TEST_INTERVAL, &rx_event)
2) Timer service routine sent event 1
3) Timer service routine sent event 2
4) Timer service routine sent event 3
5) Task returned successfully from rtems_event_receive() with events 2 and 3
6) Event 1 was never received with successive calls to rtems_event_receive()

A review of the event code led to the following analysis and successful fix.

The task enters the function rtems_event_receive() and then _Event_Seize()
in eventseize.c.

For the case of no pending events, the task does not immediately return from
this function. At line 91 of eventseize.c, the function enables interrupts
in order to begin a timeout to wait for sent events. At the time the function
enables interrupts, the variable _Event_Sync_state has been set to
EVENT_SYNC_NOTHING_HAPPENED but the call to set the thread state to
STATES_WAITING_FOR_EVENT has not been made (line 103).

If the task is now interrupted with a call from the timer service routine to
rtems_event_send(), execution continues in eventsend.c.

The new events are set in the task's pending events variable, and the
service routine enters _Event_Surrender() in eventsurrender.c.

Since the task is waiting for the new events, execution enters the block

if ( !_Event_sets_Is_empty( seized_events ) ) {

However, since the task has not yet set the waiting for event flag, the code
does not enter the block

if ( _States_Is_waiting_for_event( the_thread->current_state ) )

and proceeds to the switch statement for the variable _Event_Sync_state,
entering the case for EVENT_SYNC_NOTHING_HAPPENED.

Since the thread is executing, and the option to return on any event is set,
the Wait.return_argument is set to the value of seized_events and these
events are cleared from those which are pending. The timer service routine
then returns.

At this point, if the task resumes execution, the sent events will be
received correctly. However, if a second timer service routine is pending,
then execution will reenter _Event_Surrender(). The same path may be
followed, and since the second set of new events will also satisfy the
options used in the call to rtems_event_receive(), they will also be written
to the Wait.return_argument and cleared from the pending events, which
overwrites the value of Wait.return_argument set by the first timer service
routine. This means the events set by the first timer service routine are
not returned to the task, and are lost.
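
The overwrite described above can be illustrated with a small stand-alone C
model. This is only a sketch under stated assumptions, not RTEMS source:
struct race_model and the two functions are hypothetical names, the three
fields stand in for the task's pending events, Wait.count and the value
behind Wait.return_argument, and only the window described in this report
(sync state EVENT_SYNC_NOTHING_HAPPENED, any-event option) is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t event_set;

/* Hypothetical stand-in for the per-task fields named in the report. */
struct race_model {
    event_set pending_events;   /* events posted but not yet delivered   */
    event_set wait_count;       /* models Wait.count (event condition)   */
    event_set return_argument;  /* models the value the task will return */
};

/* Models one send from a service routine that lands inside the window,
 * with the any-event option in effect. */
void race_send_in_window(struct race_model *t, event_set sent)
{
    event_set seized;

    assert(t != NULL);
    t->pending_events |= sent;
    seized = t->pending_events & t->wait_count;
    if (seized != 0) {
        t->return_argument = seized;    /* overwrites any earlier value */
        t->pending_events &= ~seized;   /* seized events leave pending  */
    }
}

/* Two sends arrive before the task resumes. Returns every event the task
 * can still observe: the return value plus whatever is still pending. */
event_set race_two_sends(void)
{
    /* All bits set stands in for waiting on all events. */
    struct race_model t = { 0, 0xFFFFFFFFu, 0 };

    race_send_in_window(&t, 0x1u);  /* first event set  */
    race_send_in_window(&t, 0x2u);  /* second event set */
    return t.return_argument | t.pending_events;
}
```

In this model race_two_sends() yields only the second event bit: the first
is neither returned nor left pending, which matches the loss described above.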


How-To-Repeat
    Sample code is attached.

The problem may be demonstrated with a number of tasks which use
rtems_timer_fire_after() to run a service routine which sends an event to
the same task. The delay used to fire the timer should be the same as the
delay used to time out the call to rtems_event_receive(), which is quickly
re-entered. In this way, events are being sent to the task as it enters
rtems_event_receive().


Fix
    Diff file to eventsurrender.c is attached.

The fix I propose is similar to that used in PR584, which closed another
window. After writing the seized_events to the_thread->Wait.return_argument
at line 99 of eventsurrender.c, this variable may be protected by clearing
the_thread->Wait.count. That is, in eventsurrender.c:

98a99
>           (rtems_event_set) the_thread->Wait.count = 0;

Successive calls to rtems_event_send() will post events to the task's
pending_events, but these events will not be included in the task's return
because in _Event_Surrender() the event_condition will be 0, and no event
can satisfy the wait condition.

This fix was found to return the events previously lost when using the test
code attached.
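
The effect of clearing Wait.count can be sketched with a stand-alone C
model. As a hedge: this is not RTEMS code, struct fix_model and both
functions are hypothetical names, and the fields merely stand in for the
task's pending events, Wait.count and the value behind Wait.return_argument
on the path discussed in this report.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t event_set;

/* Hypothetical stand-in for the per-task fields named in the report. */
struct fix_model {
    event_set pending_events;   /* events posted but not yet delivered   */
    event_set wait_count;       /* models Wait.count (event condition)   */
    event_set return_argument;  /* models the value the task will return */
};

/* Models one send landing inside the window, with the proposed change:
 * once a value has been written for the task, the condition is cleared. */
void fix_send_in_window(struct fix_model *t, event_set sent)
{
    event_set seized;

    assert(t != NULL);
    t->pending_events |= sent;
    seized = t->pending_events & t->wait_count;
    if (seized != 0) {
        t->return_argument = seized;
        t->pending_events &= ~seized;
        t->wait_count = 0;  /* proposed fix: later sends cannot match */
    }
}

/* Two sends arrive before the task resumes. Reports what the current
 * receive call returns and what remains pending for the next call. */
void fix_two_sends(event_set *first_receive, event_set *still_pending)
{
    /* All bits set stands in for waiting on all events. */
    struct fix_model t = { 0, 0xFFFFFFFFu, 0 };

    assert(first_receive != NULL && still_pending != NULL);
    fix_send_in_window(&t, 0x1u);  /* first event set  */
    fix_send_in_window(&t, 0x2u);  /* second event set */
    *first_receive = t.return_argument;
    *still_pending = t.pending_events;
}
```

In the model the first event set is returned intact and the second stays
pending, so a following receive call can pick it up instead of losing it.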


Mick Davis
Microsol (Aust) Pty Ltd
mickd AT microsol.iinet.net.au




