IMPORTANT WAS Re: FW: RTEMS event send/receive - events apparently lost.
Joel Sherrill
joel.sherrill at OARcorp.com
Thu Mar 29 15:46:54 UTC 2001
I don't know how this one slipped through. Zoltan's memory is correct.
We discussed this privately a long time ago. Somehow the patch never
got applied.
If you are using a snapshot prior to rtems-19990528 or release
4.0 or earlier, the same modification will have to be made to the
_Event_Surrender routine in the file event.c. As of rtems-19990528,
the file event.c was split. [If you are using code old enough to
precede the split of event.c, you probably should update for
general purposes anyway. :)]
Given that the network stack does use events, this patch should be
applied if you are using the network stack.
--joel
Nick.SIMON at syntegra.bt.co.uk wrote:
>
> Zoltan Kocsi kindly sent me a fix for the problem I had encountered with
> RTEMS events, where one was missed if two event sources fired close
> together in time. Attached is a patch file for 4.5.0, and here FYI is
> Zoltan's email:
>
> -- Nick Simon
>
> -----Original Message-----
> From: Zoltan Kocsi [mailto:zoltan at bendor.com.au]
> Sent: 22 March 2001 03:29
> To: Nick.SIMON at syntegra.bt.co.uk
> Subject: RTEMS event send/receive - events apparently lost.
>
> Hi,
>
> I sent a bug report and a fix relating to lost events
> about a year ago (4.0.0). People in the know (including Joel)
> said that it was indeed a bug and that my solution was OK.
> I don't know whether the fix was actually applied in 4.5.0.
>
> Nevertheless, here's my original mail; it might be relevant:
>
> ---------------------------------
> It seems to me that there's a mutual-exclusivity bug in RTEMS events
> and I also think I know where and what and how to fix it.
>
> The symptom
> ===========
>
> I have a task which listens to 2 events. One is sent from
> another task, the other from an interrupt routine.
>
> It seems that if the interrupt routine calls rtems_event_send() when
> the signalling task is also calling rtems_event_send(), then the
> task's signal gets lost. That is ( DEBUG( c ) is a very fast,
> uninterruptible log function that stores a single character in a
> buffer):
>
> task1()
> {
>     rtems_event_set event;
>
>     for ( ;; ) {
>         DEBUG( '?' );
>         /* wait for event 1 or event 2 (mask 3), whichever arrives first */
>         rtems_event_receive( 3, RTEMS_WAIT | RTEMS_EVENT_ANY,
>                              RTEMS_NO_TIMEOUT, &event );
>         DEBUG( event+'0' );
>     }
> }
>
> task2()
> {
>     for ( ;; ) {
>         sleep( 1 );
>         DEBUG( '[' );
>         rtems_event_send( task1_id, 2 );   /* task2() sends event 2 */
>         DEBUG( ']' );
>     }
> }
>
> interrupt()
> {
>     /* <all sorts of interruptish things> */
>     DEBUG( '*' );
>     rtems_event_send( task1_id, 1 );       /* the interrupt sends event 1 */
> }
>
> Thus, in the debug log ?[]2 represents task1() going to wait, task2()
> signalling, and task1() waking up with event 2. Similarly,
> ?*1 means task1() waiting, the interrupt signalling, and task1() waking up.
>
> I have a log showing this:
>
> ?[]2?*1?*1?*1?[]2?*1?[*]1?*1?*1
>
> As the log shows, one event is lost: see the ?[*]1?*1 sequence.
>
> Analysis
> ========
>
> ?[*] means that the following happened:
>
> - task1() goes to wait
> - task2() wakes up and goes to send event 2
> - before it finishes sending, an interrupt comes and sends signal 1
> - task2()'s event_send returns, task2() goes to sleep.
>
> At this moment, the system could be in one of the following 3 states:
>
> 1) The interrupt came when task2() had already sent the signal to
> task1() but task1() had not woken up yet.
> In this case, when task1() wakes up, it should receive both
> signals. It would look like ?[*]3 in the log.
>
> 2) The interrupt came when task1() had already been woken up by task2().
> In this case task1() received event 2 and has a pending event 1,
> which it will receive immediately the next time it goes to wait.
> In the log it would be ?[*]2?1
>
> 3) The interrupt came before task2() had a chance to send its event
> and task1() is woken up by the interrupt. In this case task1()
> receives event 1 and has a pending event 2. In the log it would be
> ?[*]1?2
>
> However, the log shows ?[*]1?*1. This means that task1() went to
> sleep, both task2() and the interrupt sent an event, task1() woke up,
> received one event and had *no* pending events (for if it had had, the
> next time around it would have woken up immediately, with nothing
> between its ? and 1 or 2 in the log).
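
The reasoning above can be turned into a small log checker. This is only a sketch, not part of the original report: the function name count_lost_events() is made up for illustration, and it assumes the numbering used in the log (the interrupt sends event 1, task2() sends event 2). It tracks which events have been sent but not yet received, and flags a lost event whenever task1() goes back to wait ('?') with something pending but does not wake up immediately.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define EV_INTERRUPT 1u  /* '*' : event sent by the interrupt routine */
#define EV_TASK2     2u  /* ']' : event sent by task2(), once its send returns */

/* Hypothetical helper: count lost events in a DEBUG() log string. */
int count_lost_events(const char *log)
{
    unsigned pending = 0;  /* events sent but not yet seen by task1() */
    int lost = 0;

    assert(log != NULL);
    for (size_t i = 0; log[i] != '\0'; i++) {
        char c = log[i];

        if (c == '*') {
            pending |= EV_INTERRUPT;
        } else if (c == ']') {
            pending |= EV_TASK2;
        } else if (isdigit((unsigned char)c)) {
            /* task1() woke up and reported the events it received */
            pending &= ~(unsigned)(c - '0');
        } else if (c == '?') {
            /* With an event pending, the very next entry must be a wakeup. */
            char next = log[i + 1];
            if (pending != 0 && next != '\0' && !isdigit((unsigned char)next)) {
                lost++;
                pending = 0;  /* the event is gone; do not report it twice */
            }
        }
    }
    return lost;
}
```

Run on the log shown earlier, this reports one lost event, while the three expected sequences ?[*]3, ?[*]2?1 and ?[*]1?2 report none.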
>
> Cause
> =====
>
> Looking into the event_send() routine offers an explanation.
> The following is happening, IMHO:
>
> rtems_event_send() does this:
>
> _Event_sets_Post( event_in, &api->pending_events );
> _Event_Surrender( the_thread );
> _Thread_Enable_dispatch();
>
> task2 calls event_send( 2 ).
>
> _Event_sets_Post() sets task1()'s pending_events to 2.
>
> Now pending_events is 2 and event_condition is 3.
>
> _Event_Surrender() will then:
>
> - Disable the interrupt
> - Task is waiting for an event ? YES
> - Task's wait mask and mode satisfied by pending_events ? YES
> - Delete seized event from pending list ==> pending_events is now 0
> - Set the return_argument to the seized event ==> return_argument is now 2
> - Enable interrupt
>
> If at this moment the interrupt routine arrives and calls
> rtems_event_send( 1 ), it will re-enter _Event_Surrender():
>
> - Task is waiting for an event ? YES AND THIS IS A BUG !!!
> - Task's wait mask and mode satisfied by pending_events ? YES
> - Delete seized event from pending list ==> pending_events is now 0
> - Set the return_argument to the seized event ==> return_argument is
> now 1 WHICH IS WRONG !
>
> That is, since _Thread_Unblock() has not been called yet by task2(),
> task1() is *still* in the waiting-for-event state when the interrupt
> comes, even though the event sent by task2() has already been
> delivered and removed from the pending list. Therefore, the interrupt
> routine's event will simply overwrite task2()'s.
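
The sequence described above can be replayed in a single-threaded model, with no RTEMS involved. This is a hedged sketch: task_model, send_event() and race_without_fix() are illustrative names, not RTEMS code, and only the RTEMS_EVENT_ANY case is modelled. The interrupt is simulated by calling send_event() a second time before the task is marked as unblocked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of the waiting thread that matter. */
typedef struct {
    uint32_t pending_events;   /* events posted but not yet delivered */
    uint32_t wait_condition;   /* events being waited for (Wait.count) */
    uint32_t return_argument;  /* what rtems_event_receive() hands back */
    int      waiting;          /* still in the waiting-for-event state */
} task_model;

/* Post the event, then do what the interrupts-disabled part of
 * _Event_Surrender() does. The unblock is left to the caller, because
 * in the real code it happens after interrupts are enabled again. */
static void send_event(task_model *t, uint32_t event)
{
    assert(t != NULL);
    t->pending_events |= event;

    uint32_t seized = t->pending_events & t->wait_condition;
    if (seized != 0 && t->waiting) {
        t->pending_events &= ~seized;
        t->return_argument = seized;   /* overwrites any earlier delivery */
    }
}

/* Replay the race; returns what task1() receives and reports what is
 * left pending afterwards. */
uint32_t race_without_fix(uint32_t *pending_left)
{
    task_model task1 = { 0, 3, 0, 1 };  /* waiting for event 1 or 2 */

    send_event(&task1, 2);  /* task2(): event 2 is seized */
    send_event(&task1, 1);  /* interrupt arrives before the unblock */
    task1.waiting = 0;      /* _Thread_Unblock() finally happens */

    *pending_left = task1.pending_events;
    return task1.return_argument;
}
```

In this model task1() wakes up with event 1 and nothing is left pending, so event 2 has vanished, which matches the ?[*]1?*1 sequence in the log.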
>
> The fix
> =======
>
> The solution seems to be relatively simple:
>
> If an event was seized, then event condition should be cleared, that
> is, in event.c (from line 281):
>
> _ISR_Disable( level );
>   pending_events  = api->pending_events;
>   event_condition = (rtems_event_set) the_thread->Wait.count;
>
>   seized_events = _Event_sets_Get( pending_events, event_condition );
>
>   if ( !_Event_sets_Is_empty( seized_events ) ) {
>     if ( _States_Is_waiting_for_event( the_thread->current_state ) ) {
>       if ( seized_events == event_condition || _Options_Is_any( option_set ) ) {
>         api->pending_events =
>            _Event_sets_Clear( pending_events, seized_events );
>         *(rtems_event_set *)the_thread->Wait.return_argument =
>            seized_events;
>
>         _ISR_Flash( level );
>
> should be changed to:
>
> _ISR_Disable( level );
>   pending_events  = api->pending_events;
>   event_condition = (rtems_event_set) the_thread->Wait.count;
>
>   seized_events = _Event_sets_Get( pending_events, event_condition );
>
>   if ( !_Event_sets_Is_empty( seized_events ) ) {
>     if ( _States_Is_waiting_for_event( the_thread->current_state ) ) {
>       if ( seized_events == event_condition || _Options_Is_any( option_set ) ) {
>         api->pending_events =
>            _Event_sets_Clear( pending_events, seized_events );
>         *(rtems_event_set *)the_thread->Wait.return_argument =
>            seized_events;
>         the_thread->Wait.count = 0;  /* NEW CODE */
>
>         _ISR_Flash( level );
>
> This ensures that, until the task's state changes to something
> other than waiting for events, no more events will be delivered
> (subsequent calls will find seized_events 0); all new events will be
> left pending.
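
The effect of clearing the wait condition can be checked in a small single-threaded model. As before, this is only a sketch under stated assumptions: task_model, send_event_fixed() and race_with_fix() are made-up names, only the RTEMS_EVENT_ANY case is modelled, and the interrupt is simulated by a second send before the task is marked as unblocked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of the waiting thread that matter. */
typedef struct {
    uint32_t pending_events;   /* events posted but not yet delivered */
    uint32_t wait_condition;   /* events being waited for (Wait.count) */
    uint32_t return_argument;  /* what rtems_event_receive() hands back */
    int      waiting;          /* still in the waiting-for-event state */
} task_model;

/* Post the event, then run the interrupts-disabled part of the surrender
 * logic, with the proposed fix applied. */
static void send_event_fixed(task_model *t, uint32_t event)
{
    assert(t != NULL);
    t->pending_events |= event;

    uint32_t seized = t->pending_events & t->wait_condition;
    if (seized != 0 && t->waiting) {
        t->pending_events &= ~seized;
        t->return_argument = seized;
        t->wait_condition = 0;  /* the fix: nothing more can be seized */
    }
}

/* Replay the same race with the fix in place. */
uint32_t race_with_fix(uint32_t *pending_left)
{
    task_model task1 = { 0, 3, 0, 1 };  /* waiting for event 1 or 2 */

    send_event_fixed(&task1, 2);  /* task2(): event 2 is seized */
    send_event_fixed(&task1, 1);  /* interrupt: seized set is now empty */
    task1.waiting = 0;            /* _Thread_Unblock() finally happens */

    *pending_left = task1.pending_events;
    return task1.return_argument;
}
```

Here task1() wakes up with event 2 and event 1 stays pending, so the next receive would return at once. In the log this would be the ?[*]2?1 case from the analysis.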
>
> Regards,
>
> Zoltan
>
> ------------------------------------------------------------------------
> Name: eventpatch
> eventpatch Type: unspecified type (application/octet-stream)
> Encoding: quoted-printable
--
Joel Sherrill, Ph.D. Director of Research & Development
joel at OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985