[RTEMS Project] #2054: Race condition in FIFO/pipe implementation
RTEMS trac
trac at rtems.org
Thu Dec 18 11:13:05 UTC 2014
#2054: Race condition in FIFO/pipe implementation
-----------------------------+----------------------------
Reporter: sebastian.huber | Owner: joel.sherrill
Type: defect | Status: new
Priority: low | Milestone: 5.0
Component: cpukit | Version: 4.11
Severity: critical | Resolution:
Keywords: |
-----------------------------+----------------------------
Changes (by sebastian.huber):
* priority: normal => low
* severity: normal => critical
* milestone: 4.11 => 5.0
Old description:
> Does anyone know why the FIFO/pipe implementation uses barriers for
> synchronization? For me the usage of the barrier looks like a hand
> crafted condition variable.
>
> I think there is also a race condition in its use. The
> rtems_barrier_release() releases only the threads currently waiting at
> the barrier. Now we have several code sections like this:
>
> PIPE_UNLOCK(pipe);
> if (! PIPE_WRITEWAIT(pipe))
> ret = -EINTR;
>
> Thus the release may have no effect in case it happens after the
> PIPE_UNLOCK and before the PIPE_WRITEWAIT (which is a
> rtems_barrier_wait() with no time out).
>
> The bdbuf code uses a similar approach with semaphores. It disables the
> preemption to avoid this race condition:
>
> static void
> rtems_bdbuf_anonymous_wait (rtems_bdbuf_waiters *waiters)
> {
> rtems_status_code sc;
> rtems_mode prev_mode;
>
> /*
> * Indicate we are waiting.
> */
> ++waiters->count;
>
> /*
> * Disable preemption then unlock the cache and block. There is no
> POSIX
> * condition variable in the core API so this is a work around.
> *
> * The issue is a task could preempt after the cache is unlocked
> because it is
> * blocking or just hits that window, and before this task has blocked
> on the
> * semaphore. If the preempting task flushes the queue this task will
> not see
> * the flush and may block for ever or until another transaction
> flushes this
> * semaphore.
> */
> prev_mode = rtems_bdbuf_disable_preemption ();
>
> /*
> * Unlock the cache, wait, and lock the cache when we return.
> */
> rtems_bdbuf_unlock_cache ();
>
> sc = rtems_semaphore_obtain (waiters->sema, RTEMS_WAIT,
> RTEMS_BDBUF_WAIT_TIMEOUT);
>
> if (sc == RTEMS_TIMEOUT)
> rtems_fatal_error_occurred (RTEMS_BLKDEV_FATAL_BDBUF_CACHE_WAIT_TO);
>
> if (sc != RTEMS_UNSATISFIED)
> rtems_fatal_error_occurred (RTEMS_BLKDEV_FATAL_BDBUF_CACHE_WAIT_2);
>
> rtems_bdbuf_lock_cache ();
>
> rtems_bdbuf_restore_preemption (prev_mode);
>
> --waiters->count;
> }
>
> Please note that this approach is not SMP proof. I think we need
> condition variables in the RTEMS Classic API.
New description:
Does anyone know why the FIFO/pipe implementation uses barriers for
synchronization? For me the usage of the barrier looks like a hand
crafted condition variable.
I think there is also a race condition in its use. The
rtems_barrier_release() releases only the threads currently waiting at the
barrier. Now we have several code sections like this:
PIPE_UNLOCK(pipe);
if (! PIPE_WRITEWAIT(pipe))
ret = -EINTR;
Thus the release may have no effect in case it happens after the
PIPE_UNLOCK and before the PIPE_WRITEWAIT (which is a rtems_barrier_wait()
with no time out).
The bdbuf code uses a similar approach with semaphores. It disables the
preemption to avoid this race condition:
static void
rtems_bdbuf_anonymous_wait (rtems_bdbuf_waiters *waiters)
{
rtems_status_code sc;
rtems_mode prev_mode;
/*
* Indicate we are waiting.
*/
++waiters->count;
/*
* Disable preemption then unlock the cache and block. There is no
POSIX
* condition variable in the core API so this is a work around.
*
* The issue is a task could preempt after the cache is unlocked because
it is
* blocking or just hits that window, and before this task has blocked
on the
* semaphore. If the preempting task flushes the queue this task will
not see
* the flush and may block for ever or until another transaction flushes
this
* semaphore.
*/
prev_mode = rtems_bdbuf_disable_preemption ();
/*
* Unlock the cache, wait, and lock the cache when we return.
*/
rtems_bdbuf_unlock_cache ();
sc = rtems_semaphore_obtain (waiters->sema, RTEMS_WAIT,
RTEMS_BDBUF_WAIT_TIMEOUT);
if (sc == RTEMS_TIMEOUT)
rtems_fatal_error_occurred (RTEMS_BLKDEV_FATAL_BDBUF_CACHE_WAIT_TO);
if (sc != RTEMS_UNSATISFIED)
rtems_fatal_error_occurred (RTEMS_BLKDEV_FATAL_BDBUF_CACHE_WAIT_2);
rtems_bdbuf_lock_cache ();
rtems_bdbuf_restore_preemption (prev_mode);
--waiters->count;
}
Please note that this approach is not SMP proof. I think we need
condition variables in the RTEMS Classic API.
--
--
Ticket URL: <http://devel.rtems.org/ticket/2054#comment:3>
RTEMS Project <http://www.rtems.org/>
RTEMS Project
More information about the bugs
mailing list