Critical Section: rtems_timer_fire_after, rtems_timer_server_fire_after
Joel Sherrill <joel@OARcorp.com>
joel.sherrill at OARcorp.com
Fri Jun 10 18:09:11 UTC 2005
Andrew Sinclair wrote:
> Hello there,
>
> I believe I have encountered a critical section problem with
> rtems_timer_fire_after and rtems_timer_server_fire_after. According to
> previous discussions on the mailing list it is acceptable to call these
> during interrupts. There is a case I have found where calling these will
> cause issues. I am using release 4.6.2.
>
> If a task or low priority interrupt modify the same timer as a high
> priority interrupt, then corruption of the timer chain can occur.
>
> For sake of simplicity, I have listed the general algorithm in RTEMS
> below.
>
> Consider a task/low priority interrupt calling rtems_timer_fire_after.
> Inside the marked critical section, a high priority interrupt may occur
> and insert the same timer into the chain without the task/lower priority
> interrupt having knowledge. When the task/lower priority interrupt
> resumes, it blindly modifies watchdog->state back to inactive, and
> reinserts the timer. This usually results in the timer chain previous
> and next pointers forming a loop, and ultimately locking up the device
> when another timer must be inserted.
>
> Task
> \-->rtems_timer_fire_after
> \-->Watchdog_Remove
> \--> disable interrrupts
> \--> remove timer if WATCHDOG_ACTIVE or stop if being
> inserted
> \--> watchdog->state=WATCHDOG_INACTIVE
> \--> enable interrupts
> [critical section start]
> \-->Watchdog_Initialize
> \--> watchdog->state=WATCHDOG_INACTIVE
> \-->Watchdog_Insert_ticks
> \-->Watchdog_Insert
> \-->watchdog->state=WATCHDOG_BEING_INSERTED
> [critical section end]
> \-->disable interrupts
> \-->while in timer chain
> \-->if at insert position exit
> \-->flash interrupts
> \-->check if timer has been touched
> \-->insert timer
> \-->enable interrupts
>
> My solution to this problem is listed below involves modification to
> rems_timer_fire_after, rtems_timer_fire_after, and Watchdog_Insert.
>
> Before watchdog->state is modified, I check disable interrupts and check
> if it has been modified. If it has been modified, then abort.
>
> I have listed relevant sections of code I hacked at below. Testing so
> far appears to have solved this issue. I see now that I probably should
> return RTEMS_INTERNAL_ERROR on abort.
> Can anyone see problems with the changes I have made?
It took me some time to mentally digest this and I think your
changes are OK. Once the discussion is done, you need to file
a PR with a patch.
The question is more of a desired behavior.
+ Should the interrupted operation be thrown away? Or should
the watchdog state be "initialzed atomically to being inserted"
and the interrupting operation return imemdiately. Now the
interrupting operation trumps the interrupted one. I think this
is probably OK logically because it should be logically the same
as if the 1st completed and the 2nd then ran to completion.
+ Should the interrupted operation return successful or something
like invalid state? Successful is OK if we take the "2nd operation
trumps" view.
+ I don't like calling it abort just because it sounds like an error.
It could be called something like insert_interrupted.
+ How about an early exit rather than a goto?
Sorry for the delay. This really was hard to think about -- especially
with the allergies I had going on last week.
> Kind Regards
>
> Andrew Sinclair
>
> ------------------------------------------------------------
>
> rtems_status_code rtems_timer_fire_after(
> Objects_Id id,
> rtems_interval ticks,
> rtems_timer_service_routine_entry routine,
> void *user_data
> )
> {
> Timer_Control *the_timer;
> Objects_Locations location;
> ISR_Level level; /*AS: line modified*/
>
> if ( ticks == 0 )
> return RTEMS_INVALID_NUMBER;
>
> if ( !routine )
> return RTEMS_INVALID_ADDRESS;
>
> the_timer = _Timer_Get( id, &location );
> switch ( location ) {
> case OBJECTS_REMOTE: /* should never return this */
> return RTEMS_INTERNAL_ERROR;
>
> case OBJECTS_ERROR:
> return RTEMS_INVALID_ID;
>
> case OBJECTS_LOCAL:
> (void) _Watchdog_Remove( &the_timer->Ticker );
> /*AS: start, perform inline initialisation with interrupts disabled,
> abort if watchdog touched*/
> _ISR_Disable( level );
> if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
> goto timer_fire_after_abort_insert;
> the_timer->the_class = TIMER_INTERVAL;
> _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data
> );
> _ISR_Enable( level );
> /*AS: end*/
>
> _Watchdog_Insert_ticks( &the_timer->Ticker, ticks );
> _Thread_Enable_dispatch();
> return RTEMS_SUCCESSFUL;
> }
>
> return RTEMS_INTERNAL_ERROR; /* unreached - only to remove warnings
> */
> /*AS: start*/
> timer_fire_after_abort_insert:
> _ISR_Enable( level );
> _Thread_Enable_dispatch();
> return RTEMS_SUCCESSFUL;
> /*AS: end*/
> }
>
> ------------------------------------------------------------
>
> rtems_status_code rtems_timer_server_fire_after(
> Objects_Id id,
> rtems_interval ticks,
> rtems_timer_service_routine_entry routine,
> void *user_data
> )
> {
> Timer_Control *the_timer;
> Objects_Locations location;
> extern Chain_Control _Timer_Ticks_chain;
> ISR_Level level; /*AS: line modified*/
>
> if ( !_Timer_Server )
> return RTEMS_INCORRECT_STATE;
>
> if ( !routine )
> return RTEMS_INVALID_ADDRESS;
>
> if ( ticks == 0 )
> return RTEMS_INVALID_NUMBER;
>
> the_timer = _Timer_Get( id, &location );
> switch ( location ) {
> case OBJECTS_REMOTE: /* should never return this */
> return RTEMS_INTERNAL_ERROR;
>
> case OBJECTS_ERROR:
> return RTEMS_INVALID_ID;
>
> case OBJECTS_LOCAL:
> (void) _Watchdog_Remove( &the_timer->Ticker );
>
> /*AS: start, perform inline initialisation with interrupts disabled,
> abort if watchdog touched*/
> _ISR_Disable( level );
> if (the_timer->Ticker.state != WATCHDOG_INACTIVE)
> goto timer_server_fire_after_abort_insert;
> the_timer->the_class = TIMER_INTERVAL_ON_TASK;
> _Watchdog_Initialize( &the_timer->Ticker, routine, id, user_data
> );
> the_timer->Ticker.initial = ticks;
> _ISR_Enable( level );
> /*AS: end*/
> _Timer_Server_stop_ticks_timer();
> _Timer_Server_process_ticks_chain();
> _Watchdog_Insert( &_Timer_Ticks_chain, &the_timer->Ticker );
> _Timer_Server_reset_ticks_timer();
> _Thread_Enable_dispatch();
> return RTEMS_SUCCESSFUL;
> }
>
> return RTEMS_INTERNAL_ERROR; /* unreached - only to remove warnings
> */
> /*AS: start*/
> timer_server_fire_after_abort_insert:
> _ISR_Enable( level );
> _Thread_Enable_dispatch();
> return RTEMS_SUCCESSFUL;
> /*AS: end*/
> }
>
> ------------------------------------------------------------
>
> void _Watchdog_Insert(
> Chain_Control *header,
> Watchdog_Control *the_watchdog
> )
> {
> ISR_Level level;
> Watchdog_Control *after;
> unsigned32 insert_isr_nest_level;
> Watchdog_Interval delta_interval;
>
>
> insert_isr_nest_level = _ISR_Nest_level;
> /*AS: start , check if watchdog has been touched, and abort if
> necessary*/
> _ISR_Disable( level );
> if (the_watchdog->state != WATCHDOG_INACTIVE)
> goto abort_insert;
> /*AS: end*/
>
> the_watchdog->state = WATCHDOG_BEING_INSERTED;
>
>
> _Watchdog_Sync_count++;
>
> restart:
> delta_interval = the_watchdog->initial;
>
> /*
> * We CANT use _Watchdog_First() here, because a TICK interrupt
> * could modify the chain during the _ISR_Flash() below. Hence,
> * the header is pointing to volatile data. The _Watchdog_First()
> * INLINE routine (but not the macro - note the subtle difference)
> * casts away the 'volatile'...
> *
> * Also, this is only necessary because we call no other routine
> * from this piece of code, hence the compiler thinks it's safe to
> * cache *header!!
> *
> * Till Straumann, 7/2003 (gcc-3.2.2 -O4 on powerpc)
> *
> */
> for ( after = (Watchdog_Control *) ((volatile Chain_Control
> *)header)->first ;
> ;
> after = _Watchdog_Next( after ) ) {
>
> if ( delta_interval == 0 || !_Watchdog_Next( after ) )
> break;
>
> if ( delta_interval < after->delta_interval ) {
> after->delta_interval -= delta_interval;
> break;
> }
>
> delta_interval -= after->delta_interval;
>
> /*
> * If you experience problems comment out the _ISR_Flash line.
> * 3.2.0 was the first release with this critical section
> redesigned.
> * Under certain circumstances, the PREVIOUS critical section
> algorithm
> * used around this flash point allowed interrupts to execute
> * which violated the design assumptions. The critical section
> * mechanism used here WAS redesigned to address this.
> */
>
> _ISR_Flash( level );
>
> if ( the_watchdog->state != WATCHDOG_BEING_INSERTED ) {
> goto exit_insert;
> }
>
> if ( _Watchdog_Sync_level > insert_isr_nest_level ) {
> _Watchdog_Sync_level = insert_isr_nest_level;
> goto restart;
> }
> }
>
> _Watchdog_Activate( the_watchdog );
>
> the_watchdog->delta_interval = delta_interval;
>
> _Chain_Insert_unprotected( after->Node.previous, &the_watchdog->Node
> );
>
> the_watchdog->start_time = _Watchdog_Ticks_since_boot;
>
> exit_insert:
> _Watchdog_Sync_level = insert_isr_nest_level;
> _Watchdog_Sync_count--;
> abort_insert: /*AS: line modified*/
> _ISR_Enable( level );
> }
>
> ------------------------------------------------------------
--
Joel Sherrill, Ph.D. Director of Research & Development
joel at OARcorp.com On-Line Applications Research
Ask me about RTEMS: a free RTOS Huntsville AL 35805
Support Available (256) 722-9985
More information about the users
mailing list