Critical Section: rtems_timer_fire_after, rtems_timer_server_fire_after

Andrew Sinclair Andrew.Sinclair at elprotech.com
Wed Jun 1 23:43:30 UTC 2005


Hello there,

I believe I have encountered a critical section problem with
rtems_timer_fire_after and rtems_timer_server_fire_after. According to
previous discussions on the mailing list it is acceptable to call these
during interrupts. There is a case I have found where calling these will
cause issues. I am using release 4.6.2.

If a task or low priority interrupt modify the same timer as a high
priority interrupt, then corruption of the timer chain can occur.

For sake of simplicity, I have listed the general algorithm in RTEMS
below. 

Consider a task/low priority interrupt calling rtems_timer_fire_after.
Inside the marked critical section, a high priority interrupt may occur
and insert the same timer into the chain without the task/lower priority
interrupt having knowledge. When the task/lower priority interrupt
resumes, it blindly modifies watchdog->state back to inactive, and
reinserts the timer. This usually results in the timer chain previous
and next pointers forming a loop, and ultimately locking up the device
when another timer must be inserted.

Task
 \-->rtems_timer_fire_after
      \-->Watchdog_Remove
           \--> disable interrrupts
           \--> remove timer if WATCHDOG_ACTIVE or stop if being
inserted
           \--> watchdog->state=WATCHDOG_INACTIVE
           \--> enable interrupts
[critical section start]
      \-->Watchdog_Initialize
           \--> watchdog->state=WATCHDOG_INACTIVE
      \-->Watchdog_Insert_ticks
           \-->Watchdog_Insert
                \-->watchdog->state=WATCHDOG_BEING_INSERTED
[critical section end]
                \-->disable interrupts
                \-->while in timer chain 
                     \-->if at insert position exit
                     \-->flash interrupts
                     \-->check if timer has been touched
                \-->insert timer
                \-->enable interrupts

My solution to this problem is listed below involves modification to
rems_timer_fire_after, rtems_timer_fire_after, and Watchdog_Insert. 

Before watchdog->state is modified, I check disable interrupts and check
if it has been modified. If it has been modified, then abort.

I have listed relevant sections of code I hacked at below. Testing so
far appears to have solved this issue. I see now that I probably should
return RTEMS_INTERNAL_ERROR on abort. 
Can anyone see problems with the changes I have made? 

Kind Regards
 
Andrew Sinclair

------------------------------------------------------------

rtems_status_code rtems_timer_fire_after(
  Objects_Id                         id,
  rtems_interval                     ticks,
  rtems_timer_service_routine_entry  routine,
  void                              *user_data
)
{
  Timer_Control      *the_timer;
  Objects_Locations   location;
  ISR_Level          	level; /*AS: line modified*/

  if ( ticks == 0 )
    return RTEMS_INVALID_NUMBER;

  if ( !routine )
    return RTEMS_INVALID_ADDRESS;

  the_timer = _Timer_Get( id, &location );
  switch ( location ) {
    case OBJECTS_REMOTE:            /* should never return this */
      return RTEMS_INTERNAL_ERROR;

    case OBJECTS_ERROR:
      return RTEMS_INVALID_ID;

    case OBJECTS_LOCAL:
      (void) _Watchdog_Remove( &the_timer->Ticker );
/*AS: start, perform inline initialisation with interrupts disabled,
abort if watchdog touched*/
      _ISR_Disable( level );
	  if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
		goto timer_fire_after_abort_insert;
      the_timer->the_class = TIMER_INTERVAL;
      _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data
);
      _ISR_Enable( level );
/*AS: end*/

      _Watchdog_Insert_ticks( &the_timer->Ticker, ticks );
      _Thread_Enable_dispatch();
      return RTEMS_SUCCESSFUL;
  }

  return RTEMS_INTERNAL_ERROR;   /* unreached - only to remove warnings
*/
/*AS: start*/
timer_fire_after_abort_insert:
  _ISR_Enable( level );
  _Thread_Enable_dispatch();
  return RTEMS_SUCCESSFUL;
/*AS: end*/
}

------------------------------------------------------------

rtems_status_code rtems_timer_server_fire_after(
  Objects_Id                         id,
  rtems_interval                     ticks,
  rtems_timer_service_routine_entry  routine,
  void                              *user_data
)
{
  Timer_Control        *the_timer;
  Objects_Locations     location;
  extern Chain_Control  _Timer_Ticks_chain;
  ISR_Level          	level; /*AS: line modified*/

  if ( !_Timer_Server )
    return RTEMS_INCORRECT_STATE;

  if ( !routine )
    return RTEMS_INVALID_ADDRESS;

  if ( ticks == 0 )
    return RTEMS_INVALID_NUMBER;

  the_timer = _Timer_Get( id, &location );
  switch ( location ) {
    case OBJECTS_REMOTE:            /* should never return this */
      return RTEMS_INTERNAL_ERROR;

    case OBJECTS_ERROR:
      return RTEMS_INVALID_ID;

    case OBJECTS_LOCAL:
      (void) _Watchdog_Remove( &the_timer->Ticker );

/*AS: start, perform inline initialisation with interrupts disabled,
abort if watchdog touched*/
      _ISR_Disable( level );
	  if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
		goto timer_server_fire_after_abort_insert;
      the_timer->the_class = TIMER_INTERVAL_ON_TASK;
      _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data
);
      the_timer->Ticker.initial = ticks;
      _ISR_Enable( level );
/*AS: end*/
      _Timer_Server_stop_ticks_timer();
      _Timer_Server_process_ticks_chain();
      _Watchdog_Insert( &_Timer_Ticks_chain, &the_timer->Ticker );
       _Timer_Server_reset_ticks_timer();
      _Thread_Enable_dispatch();
      return RTEMS_SUCCESSFUL;
  }

  return RTEMS_INTERNAL_ERROR;   /* unreached - only to remove warnings
*/
/*AS: start*/
timer_server_fire_after_abort_insert:
  _ISR_Enable( level );
  _Thread_Enable_dispatch();
  return RTEMS_SUCCESSFUL;
/*AS: end*/
}

------------------------------------------------------------

void _Watchdog_Insert(
  Chain_Control         *header,
  Watchdog_Control      *the_watchdog
)
{
  ISR_Level          level;
  Watchdog_Control  *after;
  unsigned32         insert_isr_nest_level;
  Watchdog_Interval  delta_interval;
  

  insert_isr_nest_level   = _ISR_Nest_level;
/*AS: start , check if watchdog has been touched, and abort if
necessary*/
  _ISR_Disable( level ); 
  if (the_watchdog->state != WATCHDOG_INACTIVE)
	goto abort_insert;
/*AS: end*/

  the_watchdog->state = WATCHDOG_BEING_INSERTED;


  _Watchdog_Sync_count++;

restart:
  delta_interval = the_watchdog->initial;

  /*
   * We CANT use _Watchdog_First() here, because a TICK interrupt
   * could modify the chain during the _ISR_Flash() below. Hence,
   * the header is pointing to volatile data. The _Watchdog_First()
   * INLINE routine (but not the macro - note the subtle difference)
   * casts away the 'volatile'...
   *
   * Also, this is only necessary because we call no other routine
   * from this piece of code, hence the compiler thinks it's safe to
   * cache *header!!
   *
   *  Till Straumann, 7/2003 (gcc-3.2.2 -O4 on powerpc)
   *
   */
  for ( after = (Watchdog_Control *) ((volatile Chain_Control
*)header)->first ;
        ;
        after = _Watchdog_Next( after ) ) {

     if ( delta_interval == 0 || !_Watchdog_Next( after ) )
       break;

     if ( delta_interval < after->delta_interval ) {
       after->delta_interval -= delta_interval;
       break;
     }

     delta_interval -= after->delta_interval;

     /*
      *  If you experience problems comment out the _ISR_Flash line.  
      *  3.2.0 was the first release with this critical section
redesigned.
      *  Under certain circumstances, the PREVIOUS critical section
algorithm
      *  used around this flash point allowed interrupts to execute
      *  which violated the design assumptions.  The critical section 
      *  mechanism used here WAS redesigned to address this.
      */

     _ISR_Flash( level );

     if ( the_watchdog->state != WATCHDOG_BEING_INSERTED ) {
       goto exit_insert;
     }

     if ( _Watchdog_Sync_level > insert_isr_nest_level ) {
       _Watchdog_Sync_level = insert_isr_nest_level;
       goto restart;
     }
  }

  _Watchdog_Activate( the_watchdog );

  the_watchdog->delta_interval = delta_interval;

  _Chain_Insert_unprotected( after->Node.previous, &the_watchdog->Node
);

  the_watchdog->start_time = _Watchdog_Ticks_since_boot;

exit_insert:
  _Watchdog_Sync_level = insert_isr_nest_level;
  _Watchdog_Sync_count--;
abort_insert: /*AS: line modified*/
  _ISR_Enable( level );
}

------------------------------------------------------------



More information about the users mailing list