Critical Section: rtems_timer_fire_after, rtems_timer_server_fire_after

Andrew Sinclair Andrew.Sinclair at elprotech.com
Mon Jun 13 23:26:15 UTC 2005


Thanks Joel. I didn't expect a quick reply on this one. It took me some
time to understand what was going on, and I appreciate everyone is busy.
I hope you are feeling better.

I agree with all your discussion points. I agree that the higher
interrupt is the one that should ultimately set the timer, and both
attempts by the low and high priority tasks to set the timer should be
classed as a success. 


-----Original Message-----
From: Joel Sherrill <joel at OARcorp.com>
[mailto:joel.sherrill at OARcorp.com] 
Sent: Saturday, 11 June 2005 4:15 AM
To: Andrew Sinclair
Cc: rtems-users at rtems.com
Subject: Re: Critical Section: rtems_timer_fire_after,
rtems_timer_server_fire_after

Andrew Sinclair wrote:
> Hello there,
> 
> I believe I have encountered a critical section problem with 
> rtems_timer_fire_after and rtems_timer_server_fire_after. According to

> previous discussions on the mailing list it is acceptable to call 
> these during interrupts. There is a case I have found where calling 
> these will cause issues. I am using release 4.6.2.
> 
> If a task or low priority interrupt modify the same timer as a high 
> priority interrupt, then corruption of the timer chain can occur.
> 
> For sake of simplicity, I have listed the general algorithm in RTEMS 
> below.
> 
> Consider a task/low priority interrupt calling rtems_timer_fire_after.
> Inside the marked critical section, a high priority interrupt may 
> occur and insert the same timer into the chain without the task/lower 
> priority interrupt having knowledge. When the task/lower priority 
> interrupt resumes, it blindly modifies watchdog->state back to 
> inactive, and reinserts the timer. This usually results in the timer 
> chain previous and next pointers forming a loop, and ultimately 
> locking up the device when another timer must be inserted.
> 
> Task
>  \-->rtems_timer_fire_after
>       \-->Watchdog_Remove
>            \--> disable interrrupts
>            \--> remove timer if WATCHDOG_ACTIVE or stop if being 
> inserted
>            \--> watchdog->state=WATCHDOG_INACTIVE
>            \--> enable interrupts
> [critical section start]
>       \-->Watchdog_Initialize
>            \--> watchdog->state=WATCHDOG_INACTIVE
>       \-->Watchdog_Insert_ticks
>            \-->Watchdog_Insert
>                 \-->watchdog->state=WATCHDOG_BEING_INSERTED
> [critical section end]
>                 \-->disable interrupts
>                 \-->while in timer chain 
>                      \-->if at insert position exit
>                      \-->flash interrupts
>                      \-->check if timer has been touched
>                 \-->insert timer
>                 \-->enable interrupts
> 
> My solution to this problem is listed below involves modification to 
> rems_timer_fire_after, rtems_timer_fire_after, and Watchdog_Insert.
> 
> Before watchdog->state is modified, I check disable interrupts and 
> check if it has been modified. If it has been modified, then abort.
> 
> I have listed relevant sections of code I hacked at below. Testing so 
> far appears to have solved this issue. I see now that I probably 
> should return RTEMS_INTERNAL_ERROR on abort.
> Can anyone see problems with the changes I have made? 

It took me some time to mentally digest this and I think your changes
are OK.  Once the discussion is done, you need to file a PR with a
patch.

The question is more of a desired behavior.

+ Should the interrupted operation be thrown away?  Or should
   the watchdog state be "initialzed atomically to being inserted"
   and the interrupting operation return imemdiately.  Now the
   interrupting operation trumps the interrupted one.  I think this
   is probably OK logically because it should be logically the same
   as if the 1st completed and the 2nd then ran to completion.

+ Should the interrupted operation return successful or something
   like invalid state?  Successful is OK if we take the "2nd operation
   trumps" view.

+ I don't like calling it abort just because it sounds like an error.
   It could be called something like insert_interrupted.

+ How about an early exit rather than a goto?

Sorry for the delay.  This really was hard to think about -- especially
with the allergies I had going on last week.

> Kind Regards
>  
> Andrew Sinclair
> 
> ------------------------------------------------------------
> 
> rtems_status_code rtems_timer_fire_after(
>   Objects_Id                         id,
>   rtems_interval                     ticks,
>   rtems_timer_service_routine_entry  routine,
>   void                              *user_data
> )
> {
>   Timer_Control      *the_timer;
>   Objects_Locations   location;
>   ISR_Level          	level; /*AS: line modified*/
> 
>   if ( ticks == 0 )
>     return RTEMS_INVALID_NUMBER;
> 
>   if ( !routine )
>     return RTEMS_INVALID_ADDRESS;
> 
>   the_timer = _Timer_Get( id, &location );
>   switch ( location ) {
>     case OBJECTS_REMOTE:            /* should never return this */
>       return RTEMS_INTERNAL_ERROR;
> 
>     case OBJECTS_ERROR:
>       return RTEMS_INVALID_ID;
> 
>     case OBJECTS_LOCAL:
>       (void) _Watchdog_Remove( &the_timer->Ticker );
> /*AS: start, perform inline initialisation with interrupts disabled, 
> abort if watchdog touched*/
>       _ISR_Disable( level );
> 	  if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
> 		goto timer_fire_after_abort_insert;
>       the_timer->the_class = TIMER_INTERVAL;
>       _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data

> );
>       _ISR_Enable( level );
> /*AS: end*/
> 
>       _Watchdog_Insert_ticks( &the_timer->Ticker, ticks );
>       _Thread_Enable_dispatch();
>       return RTEMS_SUCCESSFUL;
>   }
> 
>   return RTEMS_INTERNAL_ERROR;   /* unreached - only to remove
warnings
> */
> /*AS: start*/
> timer_fire_after_abort_insert:
>   _ISR_Enable( level );
>   _Thread_Enable_dispatch();
>   return RTEMS_SUCCESSFUL;
> /*AS: end*/
> }
> 
> ------------------------------------------------------------
> 
> rtems_status_code rtems_timer_server_fire_after(
>   Objects_Id                         id,
>   rtems_interval                     ticks,
>   rtems_timer_service_routine_entry  routine,
>   void                              *user_data
> )
> {
>   Timer_Control        *the_timer;
>   Objects_Locations     location;
>   extern Chain_Control  _Timer_Ticks_chain;
>   ISR_Level          	level; /*AS: line modified*/
> 
>   if ( !_Timer_Server )
>     return RTEMS_INCORRECT_STATE;
> 
>   if ( !routine )
>     return RTEMS_INVALID_ADDRESS;
> 
>   if ( ticks == 0 )
>     return RTEMS_INVALID_NUMBER;
> 
>   the_timer = _Timer_Get( id, &location );
>   switch ( location ) {
>     case OBJECTS_REMOTE:            /* should never return this */
>       return RTEMS_INTERNAL_ERROR;
> 
>     case OBJECTS_ERROR:
>       return RTEMS_INVALID_ID;
> 
>     case OBJECTS_LOCAL:
>       (void) _Watchdog_Remove( &the_timer->Ticker );
> 
> /*AS: start, perform inline initialisation with interrupts disabled, 
> abort if watchdog touched*/
>       _ISR_Disable( level );
> 	  if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
> 		goto timer_server_fire_after_abort_insert;
>       the_timer->the_class = TIMER_INTERVAL_ON_TASK;
>       _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data

> );
>       the_timer->Ticker.initial = ticks;
>       _ISR_Enable( level );
> /*AS: end*/
>       _Timer_Server_stop_ticks_timer();
>       _Timer_Server_process_ticks_chain();
>       _Watchdog_Insert( &_Timer_Ticks_chain, &the_timer->Ticker );
>        _Timer_Server_reset_ticks_timer();
>       _Thread_Enable_dispatch();
>       return RTEMS_SUCCESSFUL;
>   }
> 
>   return RTEMS_INTERNAL_ERROR;   /* unreached - only to remove
warnings
> */
> /*AS: start*/
> timer_server_fire_after_abort_insert:
>   _ISR_Enable( level );
>   _Thread_Enable_dispatch();
>   return RTEMS_SUCCESSFUL;
> /*AS: end*/
> }
> 
> ------------------------------------------------------------
> 
> void _Watchdog_Insert(
>   Chain_Control         *header,
>   Watchdog_Control      *the_watchdog
> )
> {
>   ISR_Level          level;
>   Watchdog_Control  *after;
>   unsigned32         insert_isr_nest_level;
>   Watchdog_Interval  delta_interval;
>   
> 
>   insert_isr_nest_level   = _ISR_Nest_level;
> /*AS: start , check if watchdog has been touched, and abort if 
> necessary*/
>   _ISR_Disable( level ); 
>   if (the_watchdog->state != WATCHDOG_INACTIVE)
> 	goto abort_insert;
> /*AS: end*/
> 
>   the_watchdog->state = WATCHDOG_BEING_INSERTED;
> 
> 
>   _Watchdog_Sync_count++;
> 
> restart:
>   delta_interval = the_watchdog->initial;
> 
>   /*
>    * We CANT use _Watchdog_First() here, because a TICK interrupt
>    * could modify the chain during the _ISR_Flash() below. Hence,
>    * the header is pointing to volatile data. The _Watchdog_First()
>    * INLINE routine (but not the macro - note the subtle difference)
>    * casts away the 'volatile'...
>    *
>    * Also, this is only necessary because we call no other routine
>    * from this piece of code, hence the compiler thinks it's safe to
>    * cache *header!!
>    *
>    *  Till Straumann, 7/2003 (gcc-3.2.2 -O4 on powerpc)
>    *
>    */
>   for ( after = (Watchdog_Control *) ((volatile Chain_Control 
> *)header)->first ;
>         ;
>         after = _Watchdog_Next( after ) ) {
> 
>      if ( delta_interval == 0 || !_Watchdog_Next( after ) )
>        break;
> 
>      if ( delta_interval < after->delta_interval ) {
>        after->delta_interval -= delta_interval;
>        break;
>      }
> 
>      delta_interval -= after->delta_interval;
> 
>      /*
>       *  If you experience problems comment out the _ISR_Flash line.  
>       *  3.2.0 was the first release with this critical section 
> redesigned.
>       *  Under certain circumstances, the PREVIOUS critical section 
> algorithm
>       *  used around this flash point allowed interrupts to execute
>       *  which violated the design assumptions.  The critical section 
>       *  mechanism used here WAS redesigned to address this.
>       */
> 
>      _ISR_Flash( level );
> 
>      if ( the_watchdog->state != WATCHDOG_BEING_INSERTED ) {
>        goto exit_insert;
>      }
> 
>      if ( _Watchdog_Sync_level > insert_isr_nest_level ) {
>        _Watchdog_Sync_level = insert_isr_nest_level;
>        goto restart;
>      }
>   }
> 
>   _Watchdog_Activate( the_watchdog );
> 
>   the_watchdog->delta_interval = delta_interval;
> 
>   _Chain_Insert_unprotected( after->Node.previous, &the_watchdog->Node

> );
> 
>   the_watchdog->start_time = _Watchdog_Ticks_since_boot;
> 
> exit_insert:
>   _Watchdog_Sync_level = insert_isr_nest_level;
>   _Watchdog_Sync_count--;
> abort_insert: /*AS: line modified*/
>   _ISR_Enable( level );
> }
> 
> ------------------------------------------------------------


-- 
Joel Sherrill, Ph.D.             Director of Research & Development
joel at OARcorp.com                 On-Line Applications Research
Ask me about RTEMS: a free RTOS  Huntsville AL 35805
    Support Available             (256) 722-9985








More information about the users mailing list